Appearance
Change the Problem, Not the Algorithm
Quick Revisit
- REACH FOR THIS WHEN: the problem seems hard but might become easy if you rephrase it
- KEY DECISION: which transformation simplifies the problem -- compression, complement, dual, difference, contribution, or offline?
- QUICK RULE: don't change the algorithm, change the problem so a simple algorithm works
- SEE ALSO: Binary Search Reductions, Reduction Patterns
Sometimes the smartest move isn't finding a better algorithm. It's rephrasing the question so a simple algorithm works.
Coordinate Compression
The situation. Values range up to
The transformation. Replace each value with its rank among all distinct values. Sort the values, assign ranks 0 through
cpp
vector<int> vals = a; // copy
sort(vals.begin(), vals.end());
vals.erase(unique(vals.begin(), vals.end()), vals.end());
for (int& x : a)
x = lower_bound(vals.begin(), vals.end(), x) - vals.begin();
// Now all values are in [0, vals.size()-1]When to use it. Whenever a data structure needs values as indices and the value range is too large but the count of distinct values is manageable. Inversions, 2D sweep line, Fenwick-based counting -- coordinate compression is often the first step.
The Complement Trick
The situation. Counting "valid" configurations directly is complicated because the constraints interact.
The transformation. Count the total configurations (usually simple), then subtract the invalid ones.
Example: Count pairs
Example: Probability of at least one success.
When it backfires. The complement is only useful when one side is simpler than the other. If both "valid" and "invalid" are equally complex, you've gained nothing.
The Dual Problem
The situation. The problem as stated leads to a hard optimization, but there's an equivalent formulation that's easier.
König's theorem (the classic dual): In a bipartite graph, the size of the maximum matching equals the size of the minimum vertex cover. So "minimum vertex cover" reduces to "maximum bipartite matching," which we know how to solve with augmenting paths or Hopcroft-Karp.
Min-cut = Max-flow. You want the cheapest way to disconnect source from sink? Solve max-flow instead. Same answer, completely different algorithm.
Minimum path cover in a DAG =
These dualities aren't tricks you invent on the spot. They're theorems you know, and recognizing when a problem is secretly the dual of something familiar is the skill.
Difference Arrays and Prefix Sums
The situation. You need to perform many range additions, then query individual values.
The transformation. Instead of updating every element in
cpp
// Range add [l, r] by v
diff[l] += v;
diff[r + 1] -= v;
// Reconstruct after all operations
for (int i = 1; i < n; i++)
diff[i] += diff[i - 1];The deeper principle. Prefix sums and difference arrays are inverses. If your problem involves one, consider switching to the other. Range sum queries with point updates -> Fenwick tree on values. Range updates with point queries -> Fenwick tree on differences.
Contribution Technique
The situation. You need the sum of some quantity over all subsets, subarrays, pairs, or structures. Computing each structure individually is too slow.
The transformation. Instead of iterating over structures, iterate over elements and ask: "how many structures does this element contribute to?"
Example: Sum of all subarray minimums. Don't iterate over subarrays. For each element
Use a monotone stack to find
Example: Sum of GCDs over all pairs. For each value
The contribution technique turns an
Going Offline
Precondition. Going offline is allowed only when answering one query does not change the inputs of later queries. The classic blocker is "the next query's parameters are XORed with the previous answer" — that forces online processing. If queries depend on past answers but the dependency can be decoded safely (e.g., the dependency is on results of operations, not on judge-supplied query bodies), you can sometimes still reorder; check before committing.
The situation. Queries arrive online, and answering each one independently is expensive. But the problem doesn't actually require online processing.
The transformation. Read all queries first, reorder them so they're cheaper to answer together.
Sorting queries by right endpoint. Process elements left to right. When you reach position
Mo's algorithm. Sort queries by
CDQ divide-and-conquer. Split the timeline in half. Recursively solve each half. Then handle the cross-half interactions. Turns online 2D problems into offline 1D problems.
When you can't go offline. Some problems explicitly force online processing (e.g., each query depends on the previous answer via XOR). In those cases, you need persistent or link-cut data structures instead.
Recognition Cues
| If you see in the problem... | Transformation to try |
|---|---|
| Values up to 10^9 but few distinct values | Coordinate compression |
| "Count valid X" with complex constraints | Complement: total - invalid |
| "Minimum vertex cover" on bipartite graph | Dual: maximum matching (Konig's) |
| "Disconnect source from sink at minimum cost" | Dual: max-flow = min-cut |
| Many range additions, then query individual values | Difference array |
| "Sum of f(x) over all subarrays/pairs/subsets" | Contribution: iterate over elements, count appearances |
| Many range queries, order doesn't matter | Go offline: sort queries, sweep |
| "Probability of at least one" | Complement: 1 - P(none) |
The Transformation Toolbox (Summary)
| Transformation | You Have | You Get |
|---|---|---|
| Coordinate compression | Values up to | Indices up to |
| Complement | "Count valid" (hard) | "Total - invalid" (easier) |
| Dual problem | Hard optimization | Equivalent easier optimization |
| Difference array | Range updates | Endpoint updates + prefix reconstruction at the end |
| Contribution | Sum over structures | Sum over elements |
| Offline | Arbitrary query order | Sorted/batched queries |
None of these change the underlying algorithm. They change the problem so that a simple algorithm suffices.
Common Mistakes in Choosing
- Using the complement when both sides are equally hard. The complement trick only helps when one side is simpler. If "count valid" and "count invalid" are both complex, you've gained nothing.
- Forgetting to undo coordinate compression. After solving with compressed coordinates, the answer may need to be mapped back to original values. Don't output ranks when the problem asks for values.
- Applying difference arrays when updates are not uniform. Difference arrays handle "add v to range [l,r]." If the update varies by position (e.g., add i*v to position i), you need a different technique.
- Assuming you can go offline. Some problems force online processing (e.g., each query depends on the previous answer via XOR). Read the problem carefully for such constraints.
- Missing the contribution technique. When you see "sum over all subarrays" and think O(n^2), stop and ask: "how many subarrays does element i appear in?" This is the most commonly missed transformation.
Cross-References
- Prefix Sums -- the inverse of difference arrays
- Difference Arrays -- range updates via endpoint marking
- Coordinate Compression -- the full technique
- Mo's Algorithm -- offline query reordering
- Contribution Technique -- element-wise counting
- Graph Modeling -- another reduction approach
- Binary Search Reductions -- optimization to decision
Practice Problems
| # | Problem | Source | Rating | Transformation |
|---|---|---|---|---|
| 1 | Sum of Subarray Minimums | LeetCode | 1700 | Contribution + monotone stack |
| 2 | Range Update Queries | CSES | 1400 | Difference array |
| 3 | DQUERY | SPOJ | 1700 | Offline + BIT by right endpoint |
| 4 | Count Inversions | SPOJ | 1500 | Coordinate compression + BIT |
| 5 | Maximum Flow | CSES | 1800 | Min-cut duality |
| 6 | XOR and Favorite Number | CF 617E | 2100 | Mo's algorithm (offline reordering) |
Recap
- Six transformations: coordinate compression, complement, dual problem, difference array, contribution, offline processing.
- None change the algorithm -- they change the problem so a simple algorithm suffices.
- Coordinate compression: when values are huge but distinct count is small, replace with ranks.
- Complement trick: count total minus invalid when "valid" is complex but "invalid" is simple.
- Contribution technique: iterate over elements, not structures -- turns O(n^2) sums into O(n log n) or O(n).
- Going offline: sort or batch queries for cheaper processing. Mo's, sweep, CDQ divide-and-conquer.