Appearance
CDQ Divide and Conquer
Quick Revisit
- USE WHEN: multi-dimensional partial order (2D/3D inversions, updates-before-queries in time)
- INVARIANT: left-half contributions to right-half computed at each recursion level; cross-half only
- TIME: O(n log² n) | SPACE: O(n)
- CLASSIC BUG: not resetting the BIT/data structure after processing each recursive call
- PRACTICE: 08-ladder-mixed
Offline technique that solves multi-dimensional partial-order problems by recursively splitting one dimension and computing cross-contributions between halves. Replaces persistent or 2D data structures with a simple recursive framework.
ID: AL-42 Difficulty: [Advanced]Prerequisites: Divide and Conquer, Fenwick Tree, Offline QueriesCF Rating Range: 1900--2100
Contents
- 2. Intuition
- 3. How It Works
- 4. Implementation
- 5. Complexity Analysis
- 6. Problem Patterns & Tricks
- 7. Contest Cheat Sheet
- 8. Gotchas & Debugging
- 9. Self-Test
- 10. Practice Problems
- 11. Further Reading
Intuition
2.1 The Pain
Count 2D inversions: given
Brute force: check every pair.
Points: A(1,5) B(2,3) C(3,6) D(4,1) E(5,4) F(6,2)
15 pairs to check, 10 satisfy both conditions. O(n^2).For
2.2 The Key Insight
Split the index/time dimension in half. All contributions from the left half to the right half can be computed efficiently (e.g. with a BIT over the second dimension), then recurse on each half independently.
The trick is dimensional reduction. A 2D partial order has two conditions:
2.3 Visual Walkthrough
Six points sorted by
Index: 0 1 2 3 4 5
Point: (1,5) (2,3) (3,6) (4,1) (5,4) (6,2)
A B C D E FLevel 0 -- Split [0,5] at mid=2:
LEFT [0..2]: A(1,5) B(2,3) C(3,6)
RIGHT [3..5]: D(4,1) E(5,4) F(6,2)
Cross-contributions: pairs (i in LEFT, j in RIGHT) with y_i > y_j.
D(y=1): A(5>1) B(3>1) C(6>1) => 3
E(y=4): A(5>4) C(6>4) => 2
F(y=2): A(5>2) B(3>2) C(6>2) => 3
cross = 8Recurse on each half. The full recursion tree:
[0..5] cross=8
/ \
[0..2] cross=0 [3..5] cross=1
/ \ / \
[0..1] c=1 [2..2] [3..4] c=0 [5..5]
/ \ / \
[0] [1] [3] [4]Total = 8 + 0 + 1 + 1 + 0 = 10. Matches brute force.
Generalized CDQ recursion -- left half contributes to right half:
CDQ Recursion Tree -- left half contributes to right half:
[1..........8]
/ \
[1..4] ======> [5..8] left contributes to right
/ \ / \
[1,2] => [3,4] [5,6] => [7,8]
/ \ / \ / \ / \
1 2 3 4 5 6 7 8
======> means "process contributions from left to right"
Key: at every level, cross-contributions flow left --> right.
Contributions *within* each half are handled by deeper recursion.
Every pair (i < j) is counted exactly once at the LCA of their leaves.Worked Example: Counting Inversions via CDQ (Merge-Sort Style)
Array A = [5, 3, 6, 1, 4, 2] (count pairs i<j with A[i]>A[j])
── Level 2: split into pairs ──────────────────────────────
[5, 3] [6, 1] [4, 2]
cross:1 cross:1 cross:1 (5>3, 6>1, 4>2)
sorted->[3,5] sorted->[1,6] sorted->[2,4]
── Level 1: merge pairs into quads ────────────────────────
Merge [3, 5] with [1, 6]:
Compare 3 vs 1: take 1, inversions += 2 (both 3,5 > 1)
Compare 3 vs 6: take 3, inversions += 0
Compare 5 vs 6: take 5, inversions += 0
Take 6. cross = 2
sorted -> [1, 3, 5, 6]
Merge [2, 4] (only one pair, becomes right half at next level)
── Level 0: merge [1, 3, 5, 6] with [2, 4] ───────────────
ptr_L -> 1, ptr_R -> 2
| Take | From | remaining left | += inversions |
|------|-------|----------------|---------------|
| 1 | left | [3, 5, 6] | 0 |
| 2 | right | -- | += 3 (3,5,6) |
| 3 | left | [5, 6] | 0 |
| 4 | right | -- | += 2 (5,6) |
| 5 | left | [6] | 0 |
| 6 | left | [] | 0 |
cross = 5
Total inversions = 1 + 1 + 1 + 2 + 5 = 10
Brute force check: (5,3)(5,1)(5,4)(5,2)(3,1)(3,2)(6,1)(6,4)(6,2)(4,2) = 10 OK2.4 The Invariant
After processing interval
, all contributions from elements in to elements in have been counted. Recursion on the two halves counts contributions within each half.
Together, every pair
2.5 When to Reach for This
| Trigger | Example |
|---|---|
| 2D partial order / 2D inversion counting | "Count pairs with |
| 3D partial order (CDQ + BIT) | "Count points dominated in all three coordinates" |
| Offline point updates + range queries | "Process updates and queries in order, but offline" |
| DP optimization with prefix constraints | " |
| Replacing persistent/2D data structures | Any offline 2D problem where online is overkill |
Rule of thumb: If you see two or three ordering conditions on pairs and the problem is offline, think CDQ.
Dimension Role Assignment
In a k-dimensional problem (e.g., 3D partial order / CDQ on BIT):
| Dimension | Handled by | When |
|---|---|---|
| 1st | Global sort | Before anything (sort input by dim 1) |
| 2nd | CDQ merge | During divide-and-conquer (left-to-right cross contribution) |
| 3rd | BIT/segment tree | Inside the cross-contribution step |
Dimension reduction pipeline -- CDQ peels off one dimension at each stage:
3D partial order: (a_i <= a_j) AND (b_i <= b_j) AND (c_i <= c_j)
CDQ eliminates one dimension at each level:
+------------------+ +------------------+ +------------------+
| 3 dimensions | --> | 2 dimensions | --> | 1 dimension |
| sort by dim a | | CDQ on dim b | | BIT/merge on c |
| (preprocessing) | | (divide-conquer) | | (data structure) |
+------------------+ +------------------+ +------------------+
^ ^ ^
| | |
Global sort Merge-sort by b BIT update/query
eliminates a eliminates b handles cThis assignment is NOT arbitrary. The 1st dimension must be the one that, once sorted, doesn't need to be "unsorted." The 2nd dimension is the one CDQ's merge handles. The 3rd dimension needs a data structure because it's neither sorted nor merged.
What the Code Won't Teach You
When to use CDQ vs. alternatives. CDQ is the right tool when:
- You have a 3D partial order problem, or
- You have a DP optimization where the transition is
-- a 2D constraint on the transition.
The alternative to CDQ for (1) is a persistent segment tree or a merge sort tree, which are often simpler to implement but use more memory. The alternative for (2) is usually nothing -- CDQ D&C DP optimization fills a niche that few other techniques handle cleanly.
Why CDQ is
The merge is load-bearing, not cleanup. Reading the code, the merge-sort step at the end of cdq(l, r) looks like cleanup. It is not. If you remove it, the parent call's two-pointer sweep over the second dimension breaks because it assumes both halves are sorted. The algorithm is a merge sort that counts things along the way.
3. How It Works
Given
- Sort by
(or use the original index as first dimension). - Recurse with
cdq(l, r):- Base:
, return. . cdq(l, mid),cdq(mid+1, r).- Merge step: compute cross-contributions using
.
- Base:
- Merge-sort
by so parent calls can two-pointer sweep.
Complexity:
Implementation
4.1 CDQ for 2D Inversion Count (merge-sort style)
Count pairs
The 1D ancestor. Standard merge-sort inversion counting is the one-dimensional CDQ: split the index dimension at the midpoint, count pairs that cross (left
, right with ), recurse. The "second dimension" is the value , handled by the merge-sort sweep itself. Once you see this, 3D partial order is just two more dimensions stacked on top: the merge sort handles one of them, and a BIT handles the third. CDQ is merge-sort with a counting hook.
How the merge-sort cross-counting works:
CDQ merge-sort pass:
Left half (sorted): Right half (sorted):
+---+---+---+---+ +---+---+---+---+
| 1 | 3 | 5 | 7 | | 2 | 4 | 6 | 8 |
+---+---+---+---+ +---+---+---+---+
^ ^
i j
For each j, count how many i's in left half are GREATER than a[j].
Since both halves are sorted, use two-pointer:
When a[i] <= a[j]: advance i (no inversion with this j)
When a[i] > a[j]: ALL of a[i..mid] > a[j] --> add (mid - i + 1)
After counting, merge both halves into sorted order for parent level.cpp
#include <bits/stdc++.h>
using namespace std;
long long ans;
vector<int> tmp;
// After cdq(l,r), a[l..r] is sorted (merge-sort invariant).
void cdq(vector<int>& a, int l, int r) {
if (l >= r) return;
int mid = (l + r) / 2;
cdq(a, l, mid);
cdq(a, mid + 1, r);
// Count cross-contributions: for each j in [mid+1,r],
// count elements in [l,mid] that are greater than a[j].
int i = l, j = mid + 1, k = 0;
tmp.resize(r - l + 1);
while (i <= mid && j <= r) {
if (a[i] <= a[j]) {
tmp[k++] = a[i++];
} else {
// All a[i..mid] > a[j], contributing (mid - i + 1) inversions.
ans += mid - i + 1;
tmp[k++] = a[j++];
}
}
while (i <= mid) tmp[k++] = a[i++];
while (j <= r) tmp[k++] = a[j++];
copy(tmp.begin(), tmp.begin() + k, a.begin() + l);
}
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n;
cin >> n;
vector<int> a(n);
for (int& x : a) cin >> x;
ans = 0;
cdq(a, 0, n - 1);
cout << ans << "\n";
}4.2 CDQ for 2D Partial Order with BIT
Given
cpp
#include <bits/stdc++.h>
using namespace std;
struct Point {
int x, y, idx;
int result;
};
struct BIT {
int n;
vector<int> tree;
vector<int> modified; // track modified indices for cleanup
BIT(int n) : n(n), tree(n + 1, 0) {}
void update(int i, int delta) {
modified.push_back(i);
for (; i <= n; i += i & (-i))
tree[i] += delta;
}
int query(int i) {
int s = 0;
for (; i > 0; i -= i & (-i))
s += tree[i];
return s;
}
void clear() {
for (int i : modified)
for (; i <= n; i += i & (-i))
tree[i] = 0;
modified.clear();
}
};
void cdq(vector<Point>& pts, int l, int r, BIT& bit) {
if (l >= r) return;
int mid = (l + r) / 2;
cdq(pts, l, mid, bit);
cdq(pts, mid + 1, r, bit);
// Both halves sorted by y (merge-sort invariant from recursion).
// Two-pointer: insert left elements into BIT by y, query for right.
int i = l;
for (int j = mid + 1; j <= r; j++) {
while (i <= mid && pts[i].y <= pts[j].y) {
bit.update(pts[i].y, 1);
i++;
}
pts[j].result += bit.query(pts[j].y - 1);
}
bit.clear();
// Merge-sort by y for parent calls.
inplace_merge(pts.begin() + l, pts.begin() + mid + 1,
pts.begin() + r + 1,
[](const Point& a, const Point& b) {
return a.y < b.y;
});
}
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n; cin >> n;
vector<Point> pts(n);
for (int i = 0; i < n; i++) {
cin >> pts[i].x >> pts[i].y;
pts[i].idx = i; pts[i].result = 0;
}
// Coordinate compress y to [1..m].
vector<int> ys(n);
for (int i = 0; i < n; i++) ys[i] = pts[i].y;
sort(ys.begin(), ys.end());
ys.erase(unique(ys.begin(), ys.end()), ys.end());
for (auto& p : pts)
p.y = (int)(lower_bound(ys.begin(), ys.end(), p.y) - ys.begin()) + 1;
sort(pts.begin(), pts.end(), [](const Point& a, const Point& b) {
return a.x < b.x || (a.x == b.x && a.y < b.y);
});
int maxY = (int)ys.size();
BIT fenwick(maxY);
cdq(pts, 0, n - 1, fenwick);
vector<int> res(n);
for (auto& p : pts) res[p.idx] = p.result;
for (int i = 0; i < n; i++) cout << res[i] << "\n";
}4.3 CDQ for 3D Partial Order (CDQ + nested BIT)
The classic "3D partial order" problem: given
Pattern: track touched indices, undo at the end of each merge step. Every CDQ implementation that uses a BIT (or any auxiliary structure) needs to leave that structure clean for the next merge step at the same level. The pattern, used in the BIT below, is: every
updatepushes the touched index onto a stack; theundo()walks the stack and zeros only those indices. This makes cleanupinstead of . Forgetting this is the single most common CDQ bug. The naive memsetat the end of each call destroys thecomplexity and can mask off-by-one bugs in the merge pointer (see "the mistake that teaches" below).
Strategy:
- Sort by
. - CDQ on the array index eliminates the
dimension. - In the merge step, sweep by
(two-pointer), use a BIT on .
3D Partial Order Walkthrough
Concrete example with 5 elements, each having attributes
Raw input:
E0 = (3, 2, 5)
E1 = (1, 4, 3)
E2 = (2, 1, 2)
E3 = (1, 3, 1)
E4 = (2, 2, 4)Step 1 -- Sort globally by
After sort by (a, b, c):
idx a b c
---- --------
0 1 3 1 (was E3)
1 1 4 3 (was E1)
2 2 1 2 (was E2)
3 2 2 4 (was E4)
4 3 2 5 (was E0)
No duplicates here, so no deduplication needed.
All elements have cnt = 1.Step 2 -- CDQ(0, 4): split at mid = 2
LEFT [0..2]: (1,3,1) (1,4,3) (2,1,2)
RIGHT [3..4]: (2,2,4) (3,2,5)
Recurse: CDQ(0,2), then CDQ(3,4).Step 2a -- CDQ(0, 2): split at mid = 1
LEFT [0..1]: (1,3,1) (1,4,3)
RIGHT [2..2]: (2,1,2)
Recurse: CDQ(0,1), then CDQ(2,2) [base case].Step 2a-i -- CDQ(0, 1): split at mid = 0
LEFT [0..0]: (1,3,1) -- base case
RIGHT [1..1]: (1,4,3) -- base case
Cross-contribution (left -> right):
Two-pointer sweep by b:
ptr=0: e[0].b=3 <= e[1].b=4 --> BIT.update(c=1, cnt=1)
Query for e[1]: BIT.query(c=3) = 1
e[1].result += 1 --> result[1] = 1
BIT.undo()
Merge-sort by b: [(1,3,1), (1,4,3)] (already sorted by b)Back to CDQ(0, 2): cross-contribution
LEFT [0..1] sorted by b: (1,3,1) (1,4,3)
RIGHT [2..2]: (2,1,2)
Two-pointer sweep by b:
j=2: e[2].b=1. ptr=0: e[0].b=3 > 1, so no insertions.
Query for e[2]: BIT.query(c=2) = 0
e[2].result += 0 --> result[2] = 0
BIT.undo()
Merge-sort by b: [(2,1,2), (1,3,1), (1,4,3)]
State of [0..2] sorted by b: b=1, b=3, b=4Step 2b -- CDQ(3, 4): split at mid = 3
LEFT [3..3]: (2,2,4) -- base case
RIGHT [4..4]: (3,2,5) -- base case
Cross-contribution:
ptr=3: e[3].b=2 <= e[4].b=2 --> BIT.update(c=4, cnt=1)
Query for e[4]: BIT.query(c=5) = 1
e[4].result += 1 --> result[4] = 1
BIT.undo()
Merge-sort by b: [(2,2,4), (3,2,5)] (tie in b=2, stable)Back to CDQ(0, 4): cross-contribution
LEFT [0..2] sorted by b: (2,1,2) (1,3,1) (1,4,3)
RIGHT [3..4] sorted by b: (2,2,4) (3,2,5)
Two-pointer sweep by b:
j=3, e[3].b=2:
ptr=0: e[0].b=1 <= 2 --> BIT.update(c=2, cnt=1), ptr=1
ptr=1: e[1].b=3 > 2, stop.
Query: BIT.query(c=4) = 1
e[3].result += 1 --> result[3] = 0+1 = 1
j=4, e[4].b=2:
ptr=1: e[1].b=3 > 2, stop.
Query: BIT.query(c=5) = 1
e[4].result += 1 --> result[4] = 1+1 = 2
BIT.undo()
Merge-sort by b: [(2,1,2), (2,2,4), (3,2,5), (1,3,1), (1,4,3)]Step 3 -- Final results:
Element (a,b,c) result d = result+cnt-1 Meaning
------- ------- ------ ---------------- -------
E3 (1,3,1) 0 0 0 elements dominate E3
E1 (1,4,3) 1 1 1 element dominates E1
E2 (2,1,2) 0 0 0 elements dominate E2
E4 (2,2,4) 1 1 1 element dominates E4
E0 (3,2,5) 2 2 2 elements dominate E0
Verification (brute force):
E3(1,3,1): no element has all coords <= --> 0 *
E1(1,4,3): E3(1<=1, 3<=4, 1<=3) --> 1 *
E2(2,1,2): no element has all coords <= --> 0 *
E4(2,2,4): E2(2<=2, 1<=2, 2<=4) --> 1 *
E0(3,2,5): E2(2<=3, 1<=2, 2<=5)
E4(2<=3, 2<=2, 4<=5) --> 2 *
f[0]=2, f[1]=2, f[2]=1 (output for n=5)cpp
#include <bits/stdc++.h>
using namespace std;
struct Element {
int a, b, c, cnt, result;
bool operator==(const Element& o) const {
return a == o.a && b == o.b && c == o.c;
}
};
struct BIT {
int n;
vector<int> tree;
vector<int> stk;
BIT() : n(0) {}
BIT(int n) : n(n), tree(n + 1, 0) {}
void update(int i, int v) {
stk.push_back(i);
for (; i <= n; i += i & (-i))
tree[i] += v;
}
int query(int i) {
int s = 0;
for (; i > 0; i -= i & (-i))
s += tree[i];
return s;
}
void undo() {
for (int i : stk)
for (int j = i; j <= n; j += j & (-j))
tree[j] = 0;
stk.clear();
}
};
BIT bit;
vector<Element> tmp;
void cdq(vector<Element>& e, int l, int r) {
if (l >= r) return;
int mid = (l + r) / 2;
cdq(e, l, mid);
cdq(e, mid + 1, r);
// Cross-contribution: two-pointer sweep by b.
// Insert left elements' c into BIT, query for right elements.
int ptr = l;
for (int j = mid + 1; j <= r; j++) {
while (ptr <= mid && e[ptr].b <= e[j].b) {
bit.update(e[ptr].c, e[ptr].cnt);
ptr++;
}
e[j].result += bit.query(e[j].c);
}
bit.undo();
// Merge sort by b for parent calls.
tmp.clear();
int li = l, ri = mid + 1;
while (li <= mid && ri <= r) {
if (e[li].b <= e[ri].b) tmp.push_back(e[li++]);
else tmp.push_back(e[ri++]);
}
while (li <= mid) tmp.push_back(e[li++]);
while (ri <= r) tmp.push_back(e[ri++]);
for (int k = 0; k < (int)tmp.size(); k++)
e[l + k] = tmp[k];
}
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n, k;
cin >> n >> k;
vector<Element> raw(n);
for (int i = 0; i < n; i++) {
cin >> raw[i].a >> raw[i].b >> raw[i].c;
raw[i].cnt = 1;
raw[i].result = 0;
}
// Sort by (a, b, c).
sort(raw.begin(), raw.end(), [](const Element& x, const Element& y) {
if (x.a != y.a) return x.a < y.a;
if (x.b != y.b) return x.b < y.b;
return x.c < y.c;
});
// Deduplicate identical elements.
vector<Element> e;
for (int i = 0; i < n; ) {
int j = i;
while (j < n && raw[j] == raw[i]) j++;
Element cur = raw[i];
cur.cnt = j - i;
cur.result = 0;
e.push_back(cur);
i = j;
}
int m = (int)e.size();
bit = BIT(k);
cdq(e, 0, m - 1);
// f[d] = number of elements with exactly d elements dominating them.
// An element with count c contributes: result + (c-1) own-group pairs.
vector<int> f(n, 0);
for (auto& el : e) {
int d = el.result + el.cnt - 1;
f[d] += el.cnt;
}
for (int i = 0; i < n; i++)
cout << f[i] << "\n";
}Merge Trace ASCII -- 3D CDQ Cross-Contribution Step
Below is a step-by-step trace of the cross-contribution phase for the top-level CDQ call from the walkthrough above. The LEFT half [0..2] and RIGHT half [3..4] are already sorted by
LEFT (sorted by b): R0=(2,1,2) R1=(1,3,1) R2=(1,4,3)
RIGHT (sorted by b): R3=(2,2,4) R4=(3,2,5)
BIT range: [1..5] (all zeros initially)
Two-pointer: ptr starts at left[0], j sweeps right[0..1].
---------------------------------------------------------------
STEP 1: j -> R3 (b=2)
---------------------------------------------------------------
ptr -> R0 (b=1): 1 <= 2? YES
BIT.update(c=2, cnt=1)
BIT state:
index: 1 2 3 4 5
value: 0 1 0 1 0
+-------+---+
| c=2 inserted, propagated to index 4
ptr advances to R1
ptr -> R1 (b=3): 3 <= 2? NO, stop.
Query for R3: BIT.query(c=4)
Traverse: 4 -> 0 (picks up index 4: val=1)
Result = 1
R3.result += 1
+---+---+---+---+---+
BIT snapshot after step 1: | 0 | 1 | 0 | 1 | 0 |
+---+---+---+---+---+
1 2 3 4 5
---------------------------------------------------------------
STEP 2: j -> R4 (b=2)
---------------------------------------------------------------
ptr -> R1 (b=3): 3 <= 2? NO, stop.
(No new insertions.)
Query for R4: BIT.query(c=5)
Traverse: 5 -> 4 -> 0
index 5: val=0
index 4: val=1
Result = 1
R4.result += 1
+---+---+---+---+---+
BIT snapshot after step 2: | 0 | 1 | 0 | 1 | 0 |
+---+---+---+---+---+
1 2 3 4 5
---------------------------------------------------------------
CLEANUP: BIT.undo()
---------------------------------------------------------------
Undo stack: [2] (only index 2 was inserted)
Reset index 2 -> 0, propagate to index 4 -> 0
+---+---+---+---+---+
BIT after undo: | 0 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+
1 2 3 4 5
---------------------------------------------------------------
MERGE by b:
---------------------------------------------------------------
Left: R0(b=1) R1(b=3) R2(b=4)
Right: R3(b=2) R4(b=2)
Merge result: R0(b=1) R3(b=2) R4(b=2) R1(b=3) R2(b=4)
^ ^ ^ ^ ^
ptr ptr ptr ptr ptr
left right right left left
Cross-contribution summary:
R3.result = 1 (dominated by R0 via left half)
R4.result = 1 (dominated by R0 via left half)
(Combined with results from recursive calls:
R3 total = 1, R4 total = 1+1 = 2)4.4 CDQ for Offline Point Updates + Range Queries
Convert online "update point, query prefix sum" into an offline CDQ problem. Each operation gets a timestamp
cpp
struct Op { int t, type, x, val, qidx; };
// type: 0=update, 1=query. val: update delta or query sign (+1/-1).
void cdq(vector<Op>& ops, int l, int r) {
if (l >= r) return;
int mid = (l + r) / 2;
cdq(ops, l, mid);
cdq(ops, mid + 1, r);
// Collect left-updates and right-queries, both sorted by x.
vector<Op*> upd, qry;
for (int i = l; i <= mid; i++)
if (ops[i].type == 0) upd.push_back(&ops[i]);
for (int i = mid + 1; i <= r; i++)
if (ops[i].type == 1) qry.push_back(&ops[i]);
// Cross-contribution: updates in [l,mid] affect queries in [mid+1,r].
int ptr = 0;
for (auto* q : qry) {
while (ptr < (int)upd.size() && upd[ptr]->x <= q->x) {
bit.update(upd[ptr]->x, upd[ptr]->val);
ptr++;
}
ans[q->qidx] += (long long)q->val * bit.query(q->x);
}
bit.undo();
// Merge-sort by x for parent calls.
inplace_merge(ops.begin() + l, ops.begin() + mid + 1,
ops.begin() + r + 1,
[](const Op& a, const Op& b) { return a.x < b.x; });
}Complexity Analysis
Time
| Variant | Merge step | Recurrence | Total |
|---|---|---|---|
| 2D merge-sort CDQ | |||
| 2D CDQ + BIT | |||
| 3D CDQ + BIT | |||
| ... |
with : roughly ops, well under 2s. - BIT cleanup must be
, not .
Problem Patterns & Tricks
Pattern 1 -- 2D Inversion / Dominance Counting
Count pairs
Recognize it: "Count pairs satisfying two simultaneous ordering conditions."
CF 1042D -- Pair of Topics (count pairs a_i+a_j > b_i+b_j, rephrase as 1D)
Luogu P3810 -- 3D partial order (classic template problem)Pattern 2 -- 3D Partial Order
Each element has three attributes
Recognize it: "For each element, count how many have all three attributes
Luogu P3810 -- Three-dimensional partial order (the canonical problem)
CF 12D -- Ball (pairs with all three conditions, with care for ties)Pattern 3 -- Offline Dynamic 2D Prefix Sums
Point updates and rectangle queries arrive in a stream. Process offline: assign timestamps, CDQ on time, BIT on one spatial dimension, sweep on the other. A rectangle query decomposes into four prefix queries.
Recognize it: "Update a 2D grid point, query a rectangle sum, and the problem does not force online processing."
CF 848C -- Goodbye Souvenir (offline, CDQ on time + BIT)
CF 1093E -- Intersection of Permutations (offline approach)Pattern 4 -- DP Optimization with CDQ
The DP transition
Key difference: Must process left half first (to finalize its DP values) before computing cross-contributions to the right half.
cpp
void cdq(int l, int r) {
if (l == r) {
// Finalize dp[l] here (e.g., dp[l] += base_cost[l]).
return;
}
int mid = (l + r) / 2;
cdq(l, mid); // Compute dp values for left half.
// Cross-contribution: use dp[i] for i in [l,mid] to update
// dp[j] for j in [mid+1,r].
compute_cross(l, mid, mid + 1, r);
cdq(mid + 1, r); // Now compute dp values for right half.
}CF 1530F -- Bingo (CDQ-style DP optimization)
IOI 2016 -- Aliens (CDQ + convex hull trick variant)Pattern 5 -- Converting Online to Offline
Any problem with "update, then query" can be made offline by recording all operations with timestamps. CDQ on the timestamp dimension replaces a persistent data structure with a simple recursive framework.
When it fails: The problem forces online (each query answer affects the next query). Look for "decode the query using the last answer" -- that kills CDQ.
CF 848C -- Goodbye Souvenir (classic CDQ offline conversion)
CF 1527E -- Partition Game (DP + offline, CDQ-amenable)Contest Cheat Sheet
+--------------------------------------------------------------------+
| CDQ DIVIDE AND CONQUER -- QUICK REFERENCE |
+--------------------------------------------------------------------+
| WHEN: 2D/3D partial order, offline updates+queries, DP with |
| partial-order transitions. Problem must be OFFLINE. |
| TIME: O(n log^2 n) typical SPACE: O(n + V) |
+--------------------------------------------------------------------+
| TEMPLATE (3D partial order): |
| 1. Sort by a. |
| 2. cdq(l, r): |
| if l==r: return |
| mid = (l+r)/2 |
| cdq(l, mid); cdq(mid+1, r); |
| // cross: two-pointer by b, BIT on c |
| ptr = l |
| for j in [mid+1, r]: |
| while ptr<=mid and e[ptr].b <= e[j].b: |
| bit.update(e[ptr].c, e[ptr].cnt); ptr++ |
| e[j].result += bit.query(e[j].c) |
| bit.undo() |
| merge_sort [l,r] by b |
+--------------------------------------------------------------------+
| CHECKLIST: |
| [ ] Sorted by first dimension before CDQ |
| [ ] Merge-sort by second dimension inside CDQ |
| [ ] BIT cleanup uses stack (not memset!) |
| [ ] Coordinate compress all dimensions |
| [ ] Handle ties correctly (use <= not <, or count duplicates) |
| [ ] For DP variant: left-then-cross-then-right order |
+--------------------------------------------------------------------+Gotchas & Debugging
BIT must be reset between merge steps. After computing cross-contributions from the left half to the right half, undo all BIT updates (set back to 0). If you don't, the BIT accumulates stale values from previous merge steps, causing overcounting. Use a "cleanup" loop that reverses the updates, NOT memset(bit, 0, sizeof(bit)) (which is O(n) and destroys the complexity).
1. BIT cleanup must be O(elements touched), not O(maxval).memset after each merge turns
2. Tie-breaking in the first dimension. Elements with the same
3. Forgetting to merge-sort by the second dimension. The two-pointer sweep at the parent level requires sorted data. The merge-sort step is not optional.
4. DP variant: wrong recursion order. For CDQ-based DP, compute left, then cross-contribution, then right. Standard CDQ is order-independent; the DP variant is not.
5. Duplicate elements. Deduplicate and store a count field. An element with count
6. Off-by-one in BIT queries. Strict bit.query(c_j - 1). Non-strict bit.query(c_j). Mixing these shifts every answer.
7. Coordinate compression. Compress each dimension independently before using as BIT indices.
Mental Traps
Trap: "CDQ is just divide and conquer, I'll figure out the details." The word "divide and conquer" is accurate but underspecifies the key invariant: after CDQ processes a range
Trap: "CDQ and merge sort are the same thing." CDQ uses merge sort internally but they are not the same. Merge sort sorts by one key. CDQ sorts by one key while counting contributions that cross the midpoint in a second key. The counting step (using a BIT on the third dimension) is what makes CDQ CDQ.
Trap: "I can replace CDQ with a 2D BIT." Sometimes you can, but often you can't. A 2D BIT has
Trap: "CDQ is only for 3D partial order." CDQ is used for 3D partial order, but it's also used to optimize DP transitions with two-dimensional dependencies. The structure is the same -- divide on one dimension, merge-sort on another, BIT on the third -- but the semantic roles (what "dimension" means) differ. Conflating the two use cases leads to applying the wrong version.
The mistake that teaches
3D partial order cleanup with memset. I spent three hours on WA-large-test. My BIT was accumulating stale values. The fix was not memset(bit, 0, sizeof(bit)) after each cross-merge -- that's O(MAXVAL) per call, and on recursive calls it wipes out parent-level values. Worse, memset masked an off-by-one in my merge pointer that had accidentally inserted a right-half element into the BIT. Correct cleanup: undo only the inserts you actually made by walking the same left-half elements and subtracting. This is O(right - left) per call, totals O(n log n), and exposes the kind of bugs memset hides.
cpp
for (auto& e : left_half) bit.update(e.dim3, +1); // insert
for (auto& e : right_half) ans[e.id] += bit.query(e.dim3); // query
for (auto& e : left_half) bit.update(e.dim3, -1); // undoIn recursive algorithms with shared global state, always undo your modifications. Never rely on bulk resets.
Debug Drills
Drill 1: Cross-half merge counts wrong direction.
cpp
void solve(int l, int r) {
if (r - l <= 1) return;
int mid = (l + r) / 2;
solve(l, mid);
solve(mid, r);
// Cross-half: left contributes to right
sort(pts + l, pts + mid, by_y);
sort(pts + mid, pts + r, by_y);
int j = l;
for (int i = mid; i < r; i++) {
while (j < mid && pts[j].y <= pts[i].y)
bit_update(pts[j].z, 1), j++;
pts[i].ans += bit_query(pts[i].z);
}
// cleanup
for (int k = l; k < j; k++)
bit_update(pts[k].z, -1);
}What's wrong?
The two recursive calls solve(l, mid) and solve(mid, r) are both executed before the cross-half merge. But in DP-style CDQ, the right half depends on the left half's results. You must solve the left half first, then do the cross-half merge (so left-half answers propagate to the right half), and then solve the right half. Fix: move solve(mid, r) to after the cross-half step. For pure counting (non-DP) CDQ, the order doesn't matter -- but for DP optimization, it's critical.
Drill 2: Tie-breaking causes missed contributions.
cpp
// Sorting by a, then by b, then by c
sort(pts, pts + n, [](auto& a, auto& b) {
if (a.x != b.x) return a.x < b.x;
if (a.y != b.y) return a.y < b.y;
return a.z < b.z;
});
// Then CDQ processes indices [0, n)What's wrong?
When elements have equal x values, the CDQ split at mid may place element j in the right half and element i in the left half even though x[j] == x[i]. The cross-half step then counts j as being "dominated" by i in the x-dimension, but the problem requires strict or non-strict inequality -- and equal x-values on opposite sides of mid get treated as if left < right. Fix: after sorting, deduplicate elements with identical (x, y, z) by counting multiplicities, or ensure that elements with equal x-values are handled correctly by the cross-half step (e.g., by not contributing across halves when x-values are equal).
Drill 3: BIT cleanup misses some entries.
cpp
// After cross-half merge:
for (int k = l; k < mid; k++)
bit_update(pts[k].z, -1);What's wrong?
The cleanup loop iterates over all left-half elements (l to mid), but during the merge, the pointer j only advanced to some j <= mid -- meaning only elements from l to j-1 were actually inserted into the BIT. Removing elements that were never inserted causes the BIT to go negative, corrupting future queries. Fix: for (int k = l; k < j; k++) bit_update(pts[k].z, -1); -- only undo what was actually done.
Drill 4: Cross-contribution direction in comments.
cpp
void cdq(int l, int r) {
if (l >= r) return;
int mid = (l + r) / 2;
cdq(l, mid);
cdq(mid + 1, r);
// Cross: compute how RIGHT affects LEFT
sort(a + l, a + mid + 1, by_dim2);
sort(a + mid + 1, a + r + 1, by_dim2);
// merge and accumulate...
}What's wrong?
The cross-contribution should compute how the LEFT half affects the RIGHT half, not the other way around. In a partial order counting problem, element j in the left half (earlier in dim1) can dominate element i in the right half. The code comment says "RIGHT affects LEFT" -- if the merge logic follows this comment, it counts contributions backwards. Fix the direction: insert left-half elements into the BIT, query right-half elements against it.
Drill 5: BIT cleanup with memset.
cpp
void cdq(int l, int r) {
if (l >= r) return;
int mid = (l + r) / 2;
cdq(l, mid);
cdq(mid + 1, r);
// cross step: insert left into BIT, query right
for (int i = l; i <= mid; i++)
bit.update(a[i].z, 1);
for (int i = mid + 1; i <= r; i++)
ans[a[i].id] += bit.query(a[i].z);
memset(bit.tree, 0, sizeof(bit.tree)); // nuclear reset
}What's wrong?
memset resets the entire BIT array, which is O(MAX_VAL) per CDQ call. Across all O(n log n) recursive calls, this gives O(n * MAX_VAL * log n) total work -- potentially orders of magnitude slower than intended. Worse, if the BIT is shared and a parent call had pending values, they're destroyed. Fix: undo only the values you inserted:
cpp
for (int i = l; i <= mid; i++)
bit.update(a[i].z, -1);This is O(right - left) per call, totaling O(n log n) across all calls.
Drill 6: Forgetting to stable-sort by dim2 in the merge step.
cpp
void cdq(int l, int r) {
if (l >= r) return;
int mid = (l + r) / 2;
cdq(l, mid);
cdq(mid + 1, r);
// Just sort both halves by dim2 and two-pointer merge
sort(a + l, a + mid + 1, by_dim2);
sort(a + mid + 1, a + r + 1, by_dim2);
int i = l, j = mid + 1;
while (i <= mid && j <= r) {
if (a[i].y <= a[j].y) {
bit.update(a[i].z, 1);
i++;
} else {
ans[a[j].id] += bit.query(a[j].z);
j++;
}
}
while (j <= r) { ans[a[j].id] += bit.query(a[j].z); j++; }
// cleanup...
}What's wrong?
The sort by dim2 destroys the dim1 ordering within each half, which the recursive calls relied on. After returning from cdq(l, mid) and cdq(mid+1, r), the elements within each half should still be sorted by dim1 for parent-level processing. The fix: use a merge-sort approach where the CDQ function itself returns elements sorted by dim2 (like merge sort returns sorted arrays), or use a temporary buffer and restore the original order after the cross step. Alternatively, sort by dim1 at the start of each CDQ call instead of relying on the caller.
Debugging checklist:
- Test on
. Compare against brute force for . - Verify BIT is clean at each merge start:
assert(bit.query(maxval) == 0). - Print elements at each level -- are halves sorted by the right attribute?
Self-Test
Before moving on, verify you can answer these without looking at the notes:
- [ ] Dimension roles: Describe the role of each dimension in a 3D partial order CDQ -- which dimension is handled by global sort, which by CDQ's merge, and which by the BIT?
- [ ] CDQ skeleton: Write the skeleton of
cdq(l, r)for DP optimization -- the four steps (left recursion, cross contribution, right recursion, merge) in the correct order. - [ ] BIT reset reasoning: Explain why the BIT must be reset between merge steps. Give a concrete example of what goes wrong (overcounting) if you skip the reset.
- [ ] Complexity breakdown: State the time complexity --
-- and explain where each log factor comes from (D&C depth vs. BIT operations). - [ ] Applicability judgment: Name one problem where CDQ D&C applies and one where it looks applicable but isn't (e.g., forced-online queries).
- [ ] Merge is load-bearing: Explain why the merge-sort step after cross-contribution is mandatory, not optional cleanup. What breaks at the parent level if you skip it?
- [ ] CDQ vs. alternatives: When would you prefer a persistent segment tree or merge sort tree over CDQ, and vice versa?
Practice Problems
Rating Progression
| CF Rating | What to Master |
|---|---|
| 1200 | Understand merge sort inversion counting as the simplest 2D partial-order problem (a precursor to CDQ thinking). |
| 1500 | Formalize the 2D partial-order template: sort by dimension 1, divide-and-conquer, use a BIT for dimension 2 in the cross-half merge. Understand the divide-and-conquer framework: solve left, solve right, compute cross-contributions. |
| 1800 | Full 3D partial order (sort by dim 1, CDQ on dim 2, BIT on dim 3); recognize when a DP recurrence can be framed as a partial-order problem. Handle the BIT cleanup correctly. |
| 2100 | Nested CDQ (CDQ inside CDQ) for 4D+ problems; combining CDQ with other offline techniques (e.g., CDQ + parallel binary search); handling duplicates and equal elements correctly across all dimensions. CDQ for DP optimization, CDQ + sweep. |
| # | Problem | Source | Difficulty | Key Idea |
|---|---|---|---|---|
| 1 | Inversions | CF EDU | Easy | Classic inversion count; CDQ = merge sort |
| 2 | Ball | CF 12D | Medium | 2D dominance with tie-breaking |
| 3 | Three-dimensional Partial Order | Luogu P3810 | Medium | Canonical 3D CDQ template |
| 4 | Pair of Topics | CF 1042D | Medium | Rephrase as inversion count with CDQ |
| 5 | Goodbye Souvenir | CF 848C | Medium-Hard | CDQ on time + BIT for offline queries |
| 6 | Partition Game | CF 1527E | Medium-Hard | DP optimization with CDQ |
| 7 | Flowers | CF 840C | Hard | 2D problem reducible to CDQ |
| 8 | Intersecting Segments | CF 1093E | Hard | Offline 2D with CDQ approach |
| 9 | Guard Mark | Luogu P4169 | Hard | 4D CDQ (nested CDQ layers) |
| 10 | Mokia | Luogu P4390 | Hard | CDQ on time for 2D point update + rect query |
Further Reading
- CF Blog: CDQ tutorial -- detailed explanation with 3D partial order.
- OI Wiki: CDQ Divide and Conquer -- full proofs and examples.
See also: 01-offline-queries.md | 14-divide-and-conquer.md | 13-binary-indexed-tree-fenwick.md | 03-mo-algorithm.md | 05-parallel-binary-search.md