Skip to content

Divide and Conquer

Quick Revisit

  • USE WHEN: Problem decomposes into independent halves and the merge step solves cross-boundary cases (inversions, closest pair, order statistics)
  • INVARIANT: Each recursive call returns a correctly solved (and often sorted) subproblem; the merge step handles all cross-half interactions
  • TIME: O(n log n) typical (Master Theorem: T(n) = 2T(n/2) + O(n)) | SPACE: O(n)
  • CLASSIC BUG: Forgetting to copy the merged result back to the original array—parent merges then see unsorted data and produce wrong counts
  • PRACTICE: Practice Ladder

AL-DC | Split, recurse, combine—the general paradigm behind merge sort, inversion counting, closest pair, and many more O(nlogn) algorithms.

Difficulty: Beginner | Prerequisites: Sorting | CF Rating: 1200–1900


Contents


Intuition

You have this array of six values:

text
  Index:  0    1    2    3    4    5
        +----+----+----+----+----+----+
  a[]:  |  5 |  3 |  8 |  1 |  2 |  7 |
        +----+----+----+----+----+----+

Your task: count the number of inversions—pairs (i,j) with i<j but a[i]>a[j].

Your first instinct—check every pair:

text
  (5,3) yes   (5,8) no    (5,1) yes   (5,2) yes   (5,7) no
  (3,8) no    (3,1) yes   (3,2) yes   (3,7) no
  (8,1) yes   (8,2) yes   (8,7) yes
  (1,2) no    (1,7) no
  (2,7) no

  Inversions found: (5,3) (5,1) (5,2) (3,1) (3,2) (8,1) (8,2) (8,7)
  Total = 8
  Pairs checked = C(6,2) = 15

15 comparisons for 6 elements. For n=105, the brute force checks (n2)5×109 pairs—a guaranteed TLE under any time limit.

We are comparing every element against every other element, even though most pairs give us no useful information. There has to be a better way.

"Split the array in half, recursively count inversions in each half, then count cross-half inversions during the merge step—merge sort does exactly this, giving O(nlogn)."

Think of two sorted stacks of exams. You want to interleave them into one sorted pile. Each time you pick from the right stack instead of the left, every remaining paper in the left stack is "out of order" with respect to that paper—and you know exactly how many are left. The merge step counts cross-inversions for free as a side effect of merging.

The analogy fits well but breaks in one place: real exam stacks aren't pre-sorted. The recursion handles that—by the time you merge two halves, each half is already sorted by the recursive calls below it.

Visual Walkthrough

Same array: [5, 3, 8, 1, 2, 7]. We run merge sort with inversion counting.

Step 1: Split recursively until base cases

text
                 [5, 3, 8, 1, 2, 7]
                /                   \
          [5, 3, 8]              [1, 2, 7]
          /       \              /       \
       [5, 3]    [8]         [1, 2]    [7]
       /    \                /    \
     [5]   [3]            [1]   [2]

Step 2: Merge [5] and [3]—count cross-inversions

text
  Left:  [5]     Right: [3]
          ^               ^
          L               R

  Compare 5 vs 3: pick 3 from right.
    Elements remaining in left = 1  -->  cross_inv += 1
  Pick 5.

  Merged: [3, 5]     cross_inv = 1

Step 3: Merge [3, 5] and [8]

text
  Left:  [3, 5]     Right: [8]
          ^                  ^
          L                  R

  3 < 8: pick 3.   cross_inv += 0
  5 < 8: pick 5.   cross_inv += 0
  Pick 8.

  Merged: [3, 5, 8]     cross_inv = 0

Step 4: Merge [1] and [2]

text
  Left:  [1]     Right: [2]
          ^               ^

  1 < 2: pick 1.   cross_inv += 0
  Pick 2.

  Merged: [1, 2]     cross_inv = 0

Step 5: Merge [1, 2] and [7]

text
  Left:  [1, 2]     Right: [7]

  1 < 7: pick 1.   cross_inv += 0
  2 < 7: pick 2.   cross_inv += 0
  Pick 7.

  Merged: [1, 2, 7]     cross_inv = 0

Step 6: Merge [3, 5, 8] and [1, 2, 7]—the big merge

text
  Left:  [3, 5, 8]     Right: [1, 2, 7]
          ^                     ^
          L                     R

  3 vs 1: pick 1 from right.
    Remaining in left = 3  -->  cross_inv += 3    (1 < 3, 1 < 5, 1 < 8)

  Left:  [3, 5, 8]     Right: [2, 7]
          ^                     ^

  3 vs 2: pick 2 from right.
    Remaining in left = 3  -->  cross_inv += 3    (2 < 3, 2 < 5, 2 < 8)

  Left:  [3, 5, 8]     Right: [7]
          ^                     ^

  3 vs 7: pick 3.   cross_inv += 0
  5 vs 7: pick 5.   cross_inv += 0
  7 vs 8: pick 7 from right.
    Remaining in left = 1  -->  cross_inv += 1    (7 < 8)
  Pick 8.

  Merged: [1, 2, 3, 5, 7, 8]     cross_inv = 7

Final tally:

text
  +---------------------------------------------+
  | Merge step            cross_inv  running sum |
  +---------------------------------------------+
  | [5]+[3]                  1           1       |
  | [3,5]+[8]                0           1       |
  | [1]+[2]                  0           1       |
  | [1,2]+[7]                0           1       |
  | [3,5,8]+[1,2,7]          7           8       |
  +---------------------------------------------+
  | TOTAL inversions = 8                         |
  | Brute force: 15 comparisons                  |
  | Merge sort:  ~11 comparisons (O(n log n))    |
  +---------------------------------------------+

The gap widens dramatically at scale: brute force does O(n2) work; merge-sort-based counting does O(nlogn).

The Invariant

text
+-------------------------------------------------------------------+
| INVARIANT: After merging subarrays [lo..mid] and [mid+1..hi],     |
| both halves are sorted AND all cross-inversions between the two   |
| halves have been counted.                                         |
|                                                                   |
| total_inversions = left_inv + right_inv + cross_inv               |
|                                                                   |
| CROSS-INVERSION RULE: When we pick a[R] from the right half      |
| during the merge (because a[R] < a[L]), every remaining element   |
| in the left half forms an inversion with a[R].                    |
| So: cross_inv += (mid - L + 1)                                   |
+-------------------------------------------------------------------+

Why it works: Both halves are sorted (by the recursive invariant). If a[R] < a[L], then a[R] is also less than a[L+1], a[L+2], ..., a[mid]—all of which originally had smaller indices than a[R]. That is exactly mid - L + 1 inversions, counted in O(1).

Look at Step 6 above: when we pick 1 from the right, all 3 elements [3, 5, 8] remaining in the left half form inversions with 1. We add 3 in one shot instead of checking each pair individually.

What the Code Won't Teach You

The real power of D&C is handling cross-subproblem interactions. The classic case—"count pairs (i,j) where i<j and a[i]>a[j]"—is O(n2) by brute force. D&C handles pairs within each half recursively, and handles cross-pairs (where i is in the left half and j in the right) during the combine step. What makes this work is the sorted order the recursive calls leave behind. The code shows you the merge loop; it won't tell you that sorting the halves is the key move, not an incidental side effect.

text
  What the code does:          What you need to understand:

  if (a[j] < a[i])            Because left half is SORTED,
      inv += mid - i;          a[i] < a[i+1] < ... < a[mid-1]
                                ALL of them are > a[j].
                                One comparison gives you
                                (mid - i) inversions in O(1).

  +----+----+----+             +----+
  | 3  | 5  | 8  |      vs    | 2  |
  +----+----+----+             +----+
    ^                            ^
    i                            j
    a[j]=2 < a[i]=3
    --> 2 < 3, 2 < 5, 2 < 8
    --> inv += 3 (all at once!)

D&C optimization for DP is a separate advanced technique. When a DP has the form dp[i]=minj<i(dp[j]+cost(j,i)) and the optimal j is monotone (as i increases, the optimal j never decreases), you can use divide and conquer to locate transition points in O(nlogn) instead of O(n2). The name is confusing: "D&C optimization" borrows the D&C structure to speed up a DP—it has nothing to do with sorting or counting. Whether the technique applies depends entirely on the monotonicity of the cost function, which no amount of staring at merge sort will reveal.

Now that you understand the deeper mechanics, here is how to recognize D&C problems in the wild.

When to Reach for This

Trigger phrases—think divide and conquer when you see:

  • "Count inversions" or "count swaps to sort"—classic merge sort variant.
  • "Closest pair of points"—the canonical D&C geometry problem.
  • "Merge sort variant" or "modify merge sort"—often a direct hint.
  • "O(nlogn) from splitting an array in half"—the signature of D&C.
  • "Count pairs (i,j) with i<j satisfying some order-based condition"—if the condition relates to relative ordering, merge sort counting likely applies.

Counter-examples—problems that look like D&C but are not:

  • "Count pairs with a[i]+a[j]=k"—a two-sum/hash-map problem. The condition is value-based, not order-based; D&C gives no advantage. See: Two Pointers.
  • "Find the median"—a D&C approach exists (quickselect), but the STL nth_element handles it. The problem doesn't decompose into "count something across halves."
  • "Longest increasing subsequence"—despite involving ordering, LIS has optimal substructure that makes DP + binary search the right tool, not merge-sort-style D&C. See: 02-dp-intro-1d.md.

Worked Example: Tricky Case—Reverse-Sorted Array

The maximum inversion count occurs when the array is reverse-sorted. Trace [4, 3, 2, 1]:

text
  Initial: [4, 3, 2, 1]

  Split into [4, 3] and [2, 1]

  Merge [4] + [3]:
    4 > 3, pick 3, inv += 1 (only 4 remains in left)
    pick 4.
    Result: [3, 4]  inv = 1

  Merge [2] + [1]:
    2 > 1, pick 1, inv += 1 (only 2 remains in left)
    pick 2.
    Result: [1, 2]  inv = 1

  Merge [3, 4] + [1, 2]:
    3 vs 1: pick 1, inv += 2  (both 3,4 remain)
    3 vs 2: pick 2, inv += 2  (both 3,4 remain)
    pick 3, pick 4.
    Result: [1, 2, 3, 4]  cross_inv = 4

  +----------------------------------------+
  | Step             | inv | running total  |
  +----------------------------------------+
  | [4]+[3]          |  1  |      1         |
  | [2]+[1]          |  1  |      2         |
  | [3,4]+[1,2]      |  4  |      6         |
  +----------------------------------------+
  | TOTAL = 6 = C(4,2)                     |
  | Every pair is an inversion in a         |
  | reverse-sorted array!                   |
  +----------------------------------------+

This is the worst case: (n2) inversions. For n=105, that's 5×109—overflows int! Always use long long.


C++ STL Reference

Function / ClassHeaderWhat it doesTime Complexity
sort(first, last)<algorithm>Introsort (quicksort + heapsort + insertion sort)O(nlogn)
stable_sort(first, last)<algorithm>Merge sort (preserves equal-element order)O(nlogn)
inplace_merge(first, mid, last)<algorithm>Merge two consecutive sorted ranges in placeO(n) with buffer, O(nlogn) without
merge(f1, l1, f2, l2, dest)<algorithm>Merge two sorted ranges into destO(n+m)
nth_element(first, nth, last)<algorithm>Partial sort; element at nth is correct, left/right partitionedO(n) avg

Notes:

  • stable_sort is literally merge sort under the hood—useful when you need stability.
  • inplace_merge is handy for bottom-up merge sort or when you have two sorted halves to combine.
  • For inversion counting and closest pair, you write your own merge since the counting logic lives inside the merge step. STL merge cannot be hooked.

Implementation (Contest-Ready)

Version 1: Minimal Contest Template

cpp
#include <bits/stdc++.h>
using namespace std;

// --- Inversion count via merge sort ---
long long merge_count(vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    long long inv = merge_count(a, lo, mid) + merge_count(a, mid, hi);
    vector<int> tmp;
    int i = lo, j = mid;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) {
            tmp.push_back(a[i++]);
        } else {
            inv += mid - i;
            tmp.push_back(a[j++]);
        }
    }
    while (i < mid) tmp.push_back(a[i++]);
    while (j < hi) tmp.push_back(a[j++]);
    copy(tmp.begin(), tmp.end(), a.begin() + lo);
    return inv;
}

// --- Closest pair of points ---
using pt = pair<double, double>;
#define X first
#define Y second

double dist(pt a, pt b) {
    double dx = a.X - b.X, dy = a.Y - b.Y;
    return sqrt(dx * dx + dy * dy);
}

double closest_strip(vector<pt>& strip, double d) {
    sort(strip.begin(), strip.end(), [](const pt& a, const pt& b) {
        return a.Y < b.Y;
    });
    for (int i = 0; i < (int)strip.size(); i++)
        for (int j = i + 1; j < (int)strip.size() && strip[j].Y - strip[i].Y < d; j++)
            d = min(d, dist(strip[i], strip[j]));
    return d;
}

double closest_rec(vector<pt>& pts, int lo, int hi) {
    if (hi - lo <= 3) {
        double mn = 1e18;
        for (int i = lo; i < hi; i++)
            for (int j = i + 1; j < hi; j++)
                mn = min(mn, dist(pts[i], pts[j]));
        sort(pts.begin() + lo, pts.begin() + hi, [](const pt& a, const pt& b) {
            return a.Y < b.Y;
        });
        return mn;
    }
    int mid = (lo + hi) / 2;
    double midX = pts[mid].X;
    double d = min(closest_rec(pts, lo, mid), closest_rec(pts, mid, hi));
    inplace_merge(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
        [](const pt& a, const pt& b) { return a.Y < b.Y; });
    vector<pt> strip;
    for (int i = lo; i < hi; i++)
        if (abs(pts[i].X - midX) < d)
            strip.push_back(pts[i]);
    for (int i = 0; i < (int)strip.size(); i++)
        for (int j = i + 1; j < (int)strip.size() && strip[j].Y - strip[i].Y < d; j++)
            d = min(d, dist(strip[i], strip[j]));
    return d;
}

double closest_pair(vector<pt> pts) {
    sort(pts.begin(), pts.end());
    return closest_rec(pts, 0, pts.size());
}

int main() {
    int n;
    cin >> n;
    vector<int> a(n);
    for (auto& x : a) cin >> x;
    cout << merge_count(a, 0, n) << "\n";
}

Version 2: Explained Version

cpp
#include <bits/stdc++.h>
using namespace std;

// Count inversions using a modified merge sort.
// An inversion is a pair (i, j) with i < j and a[i] > a[j].
// Range: [lo, hi) -- half-open interval.
long long merge_count(vector<int>& a, int lo, int hi) {
    // Base case: 0 or 1 elements -> no inversions, already sorted.
    if (hi - lo <= 1) return 0;

    int mid = (lo + hi) / 2;

    // Recurse: count inversions within each half.
    // After these calls, a[lo..mid) and a[mid..hi) are each sorted.
    long long inv = merge_count(a, lo, mid) + merge_count(a, mid, hi);

    // Merge step: combine two sorted halves into one sorted range.
    // Key: when we pick from the right half (a[j] < a[i]),
    // every remaining element in the left half (indices i..mid-1)
    // forms an inversion with a[j]. There are (mid - i) such elements.
    vector<int> tmp;
    int i = lo, j = mid;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) {
            // a[i] is in order relative to a[j] -- no inversion.
            tmp.push_back(a[i++]);
        } else {
            // a[j] < a[i], and since left half is sorted,
            // a[j] < a[i], a[i+1], ..., a[mid-1].
            // That is (mid - i) cross-inversions.
            inv += mid - i;
            tmp.push_back(a[j++]);
        }
    }
    // Drain remaining elements from whichever half is not exhausted.
    while (i < mid) tmp.push_back(a[i++]);
    while (j < hi) tmp.push_back(a[j++]);

    // Write sorted result back to original array.
    copy(tmp.begin(), tmp.end(), a.begin() + lo);
    return inv;
}

// --- Merge sort (plain, no counting) ---
void merge_sort(vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return;
    int mid = (lo + hi) / 2;
    merge_sort(a, lo, mid);
    merge_sort(a, mid, hi);

    // Standard merge of two sorted halves.
    vector<int> tmp;
    int i = lo, j = mid;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) tmp.push_back(a[i++]);
        else               tmp.push_back(a[j++]);
    }
    while (i < mid) tmp.push_back(a[i++]);
    while (j < hi)  tmp.push_back(a[j++]);
    copy(tmp.begin(), tmp.end(), a.begin() + lo);
}

int main() {
    int n;
    cin >> n;
    vector<int> a(n);
    for (auto& x : a) cin >> x;

    // merge_count modifies the array (sorts it) as a side effect.
    // If you need the original order, copy first.
    vector<int> b = a;
    long long inversions = merge_count(b, 0, n);
    cout << "Inversions: " << inversions << "\n";

    // Plain merge sort:
    merge_sort(a, 0, n);
    for (int i = 0; i < n; i++) cout << a[i] << " \n"[i == n - 1];
}

Version 3: Standalone Inversion Count (Submit-Ready)

Complete, self-contained solution for CSES 1744: Inversions. Reads n and the array, outputs the inversion count. Copy-paste and submit.

cpp
#include <bits/stdc++.h>
using namespace std;

long long merge_count(vector<int>& a, vector<int>& tmp, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    long long inv = merge_count(a, tmp, lo, mid) + merge_count(a, tmp, mid, hi);
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) tmp[k++] = a[i++];
        else { inv += mid - i; tmp[k++] = a[j++]; }
    }
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    for (int x = lo; x < hi; x++) a[x] = tmp[x];
    return inv;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);
    int n;
    cin >> n;
    vector<int> a(n), tmp(n);
    for (auto& x : a) cin >> x;
    cout << merge_count(a, tmp, 0, n) << "\n";
}

Key differences from Version 1: Pre-allocated tmp array avoids repeated heap allocation inside recursion—faster for large n. Uses long long throughout (critical: max inversions (n2) overflows int for n>65536).


Operations Reference

Here are the core D&C operations and their complexities—use this to quickly estimate whether your approach fits the time limit.

OperationTimeSpace
Merge sortO(nlogn)O(n) auxiliary
Count inversions (merge sort)O(nlogn)O(n) auxiliary
Closest pair of points (2D)O(nlog2n) naive / O(nlogn) optimizedO(n)
General D&C with linear mergeO(nlogn) by Master TheoremO(n) per level
D&C with O(nlogn) mergeO(nlog2n)O(nlogn)

Master Theorem reminder: T(n)=2T(n/2)+O(n) solves to T(n)=O(nlogn).

The Merge Step, Pointer by Pointer

text
Left (sorted):  [1, 3, 5, 8]     Right (sorted): [2, 4, 6, 7]
                 ^                                  ^
                 L                                  R

Step 1: 1 < 2, pick 1 from left    Result: [1]
Step 2: 3 > 2, pick 2 from right   Result: [1, 2]
Step 3: 3 < 4, pick 3 from left    Result: [1, 2, 3]
Step 4: 5 > 4, pick 4 from right   Result: [1, 2, 3, 4]
Step 5: 5 < 6, pick 5 from left    Result: [1, 2, 3, 4, 5]
Step 6: 8 > 6, pick 6 from right   Result: [1, 2, 3, 4, 5, 6]
Step 7: 8 > 7, pick 7 from right   Result: [1, 2, 3, 4, 5, 6, 7]
Step 8: R exhausted, drain left     Result: [1, 2, 3, 4, 5, 6, 7, 8]

Total comparisons: 7 (linear in n, not n^2)
Each element is touched exactly once --> O(n) merge

Master Theorem Decision Tree

For any recurrence T(n)=aT(n/b)+O(nc):

text
              T(n) = a*T(n/b) + O(n^c)
                        |
               Compare log_b(a) vs c
               /        |        \
       log_b(a) > c    = c      < c
            |            |         |
     O(n^log_b(a))  O(n^c*logn)  O(n^c)
      (recursion     (balanced)   (merge
       dominates)                  dominates)

Concrete examples:

text
+-------------------+---+---+---+-----------+----------------+
| Algorithm         | a | b | c | log_b(a)  | Result         |
+-------------------+---+---+---+-----------+----------------+
| Merge sort        | 2 | 2 | 1 | 1 = c    | O(n log n)     |
| Binary search     | 1 | 2 | 0 | 0 = c    | O(log n)       |
| Karatsuba         | 3 | 2 | 1 | 1.58 > c | O(n^1.58)      |
| Closest pair      | 2 | 2 | 1 | 1 = c    | O(n log n)     |
| Naive matrix mult | 8 | 2 | 2 | 3 > c    | O(n^3)         |
| Strassen          | 7 | 2 | 2 | 2.81 > c | O(n^2.81)      |
+-------------------+---+---+---+-----------+----------------+

See also: Binary Search for the simplest D&C recurrence (a=1,b=2,c=0).


Problem Patterns & Tricks

Inversion Count

The classic D&C application. Count pairs (i,j) with i<j and a[i]>a[j]. This equals the minimum number of adjacent swaps to sort the array (bubble sort swap count).

cpp
// Inside the merge step, when picking from the right:
inv += mid - i;  // all remaining left elements form inversions

Problems: CSES: Inversion Count, CF 987C


Closest Pair of Points

Given n points in 2D, find the pair with minimum Euclidean distance. Sort by x-coordinate, split, recurse, then check a narrow vertical strip of width 2d around the split line. The strip check is O(n) because each point only compares against at most 6-7 neighbors in y-order.

text
  CLOSEST PAIR -- the strip is where cross-pair candidates live:

  Points sorted by x, split at midX:

       d         d
    <------>  <------>
         |         |
    *    |  *      |
      *  |    *    |
    *    |      *  |        * = points
  -------+---------+------  midX line
    *    | *       |
         |    *    |
    *    |         |
         |         |
   LEFT  |  STRIP  |  RIGHT
   half  | (width  |  half
         |   2d)   |

  Only points inside the strip can form a
  cross-pair closer than d. For each point
  in the strip, check at most 6-7 neighbors
  in y-order --> O(n) strip check.

Problems: CSES: Closest Pair, CF 429D


Count of Pairs Satisfying an Order Condition

Many problems reduce to: count pairs (i,j) with i<j where some function of a[i] and a[j] holds. If the condition can be checked during a merge of sorted halves, merge-sort-based counting works.

Example: count pairs with a[i]>2a[j]—modify the merge step to sweep with a separate pointer for the condition before doing the standard merge.

text
  SEPARATE POINTER SWEEP for a[i] > 2*a[j]:

  Left (sorted):  [3, 5, 8]     Right (sorted): [1, 2, 7]
                    ^                              ^
                    p                              q

  For q=0 (a[q]=1): advance p while a[p] <= 2*1=2
    p stops at index 0 (a[0]=3 > 2). inv += 3-0 = 3
  For q=1 (a[q]=2): advance p while a[p] <= 2*2=4
    p advances to index 1 (a[1]=5 > 4). inv += 3-1 = 2
  For q=2 (a[q]=7): advance p while a[p] <= 2*7=14
    p advances to index 3 (past end). inv += 3-3 = 0

  Total cross-pairs: 3 + 2 + 0 = 5
  Key: p only moves forward --> O(n) total for the sweep
cpp
// Before merging, count pairs with a[i] > 2*a[j]:
int p = lo;
for (int q = mid; q < hi; q++) {
    while (p < mid && a[p] <= 2LL * a[q]) p++;
    inv += mid - p;
}
// Then do the standard merge.

Problems: CF 1430E, CF 1691F


Merge Sort Tree

Build a segment-tree-like structure where each node stores the sorted subarray for its range. Answering "how many elements in [l,r] are k" becomes binary searches at O(log2n) per query. Built naturally by storing merge sort intermediate results.

Problems: CF 220B, CF 1042D


D&C on Geometry (Beyond Closest Pair)

The split-recurse-combine paradigm applies to many computational geometry problems: convex hull (split by x), closest pair, line intersection counting. The merge step usually involves a clever sweep of a boundary region.

Problems: CF 429D, CF 442B


D&C Optimization for DP

When a DP recurrence has the form dp[i][j]=minkopt(i,j)(dp[i1][k]+C(k,j)) and the optimal split point opt(i,j) is monotone in j, divide and conquer reduces the O(n2) transition to O(nlogn). This is an advanced technique that uses the D&C structure for optimization rather than direct array splitting.

Problems: CF 321E, CF 868F


Contest Cheat Sheet

text
+--------------------------------------------------------------+
|  DIVIDE AND CONQUER CHEAT SHEET                              |
+--------------------------------------------------------------+
|  WHEN TO USE:                                                 |
|  - Count inversions or order-based pair counting             |
|  - Closest pair of points in 2D                              |
|  - Any problem that splits into two halves + linear merge    |
|  - Problem hints at O(n log n) via "split in half"           |
+--------------------------------------------------------------+
|  INVERSION COUNT (merge step):                                |
|    if (a[j] < a[i]) inv += mid - i;  // pick from right     |
+--------------------------------------------------------------+
|  CLOSEST PAIR:                                                |
|    Sort by x. Recurse halves. Check strip of width 2d.       |
|    Strip: only compare points within d in y-distance.        |
+--------------------------------------------------------------+
|  MASTER THEOREM:                                              |
|    T(n) = 2T(n/2) + O(n)  -->  O(n log n)                   |
+--------------------------------------------------------------+
|  TIME:  O(n log n) for standard D&C with linear merge        |
|  SPACE: O(n) auxiliary                                        |
+--------------------------------------------------------------+

Gotchas & Debugging

Off-by-One in Half-Open vs Closed Ranges

The most common bug. Pick a convention and never mix it:

  • Half-open [lo, hi): mid = (lo + hi) / 2, left = [lo, mid), right = [mid, hi). This is the cleanest and matches STL iterators.
  • Closed [lo, hi]: mid = (lo + hi) / 2, left = [lo, mid], right = [mid+1, hi].

Mixing conventions in the same code is a recipe for infinite recursion or missed elements.

text
  Half-open [lo, hi):           Closed [lo, hi]:

  Array: [3, 5, 8, 1, 2]       Array: [3, 5, 8, 1, 2]
  lo=0, hi=5, mid=2             lo=0, hi=4, mid=2

  Left:  [0, 2) = {3, 5}       Left:  [0, 2] = {3, 5, 8}
  Right: [2, 5) = {8, 1, 2}    Right: [3, 4] = {1, 2}
         ^                             ^
         mid is START of right         mid+1 is START of right

  DANGER -- mixing conventions:
  Half-open split with closed mid:
    Left:  [0, 2] = {3, 5, 8}    <-- 3 elements
    Right: [2, 5) = {8, 1, 2}    <-- 3 elements, 8 counted TWICE!

Forgetting to Sort Both Halves During Recursion

The merge step only works if both halves are already sorted. If your base case does not sort (a single-element array is trivially sorted—that's fine), make sure every return path leaves the subarray sorted.

Integer Overflow in Inversion Count

For n=105 with a reverse-sorted array, the inversion count is (n2)5×109, which overflows int. Always use long long for the inversion counter.

Closest Pair: Not Sorting Strip by Y

The strip must be scanned in y-order. If you forget to sort (or maintain y-order via inplace_merge), the inner loop's early termination condition strip[j].Y - strip[i].Y < d is meaningless and the algorithm degrades to O(n2).

Closest Pair: Comparing Only 6-7 Neighbors

The strip check's inner loop runs at most ~7 iterations per point (a geometric packing argument). If your solution is TLE-ing, make sure the inner loop breaks when the y-distance exceeds d—without this, the strip check is O(n2).

Allocating Temporary Arrays Inside Recursion

Creating a vector<int> tmp at every recursive call is safe but adds allocation overhead. For maximum speed, allocate a single auxiliary array of size n once and pass it through. In most contest settings the per-call allocation is fine for n2×105.

Debugging Tip

For inversion counting, test with:

  • Sorted array [1,2,3,4,5] → 0 inversions.
  • Reverse-sorted [5,4,3,2,1](52)=10 inversions.
  • Single element → 0.
  • Two elements [2,1] → 1; [1,2] → 0.

If all four pass, your implementation is almost certainly correct.

Mental Traps

"If I can solve it recursively, it's divide and conquer." Not all recursion is D&C. D&C specifically splits the problem into independent subproblems, solves them separately, then combines. If the subproblems share state or depend on each other (like a DP recurrence), it's not D&C.

text
  WRONG thinking:                    RIGHT thinking:

  "I call the function               "I split into INDEPENDENT
   recursively, so it                  subproblems, solve each,
   must be D&C!"                       then COMBINE results."

  fib(n) = fib(n-1) + fib(n-2)      merge_count(lo, mid) +
       |                              merge_count(mid, hi) +
       v                              cross_inversions(lo, mid, hi)
  Subproblems OVERLAP                      |
  (fib(n-2) is used by both)              v
  --> This is DP, not D&C           Subproblems are INDEPENDENT
                                    --> This is D&C

"The recursion tree is just for analysis, not implementation." The recursion tree is your implementation guide. Each level corresponds to a merge pass. If your merge is O(n) and the tree has O(logn) levels, the total is O(nlogn). If your merge is O(nlogn), the total becomes O(nlog2n).

text
  Recursion tree = implementation blueprint:

  Level 0:  [----------- n -----------]     merge cost: O(n)
            /                          \
  Level 1: [--- n/2 ---]  [--- n/2 ---]    merge cost: O(n/2) * 2 = O(n)
            /     \          /     \
  Level 2: [n/4] [n/4]    [n/4] [n/4]      merge cost: O(n/4) * 4 = O(n)
              ...            ...
  Level k:  [1] [1] ... [1] [1]            merge cost: O(1) * n  = O(n)

  Total levels: log(n)
  Total work per level: O(n)
  TOTAL: O(n * log n)

  If your merge is O(n log n) instead of O(n):
  --> Each level costs O(n log n) --> Total O(n log^2 n)

"D&C is just merge sort and binary search." Merge sort and binary search are the most famous instances, but the same structure drives closest pair of points, inversion counting, fast multiplication (Karatsuba), D&C DP optimization, and CDQ divide and conquer for offline problems.

text
  D&C is a FAMILY of techniques:

  +------------------+----------------------------------+
  |  Problem         |  Combine step                    |
  +------------------+----------------------------------+
  |  Merge sort      |  Merge two sorted halves         |
  |  Inversions      |  Count cross-pairs during merge  |
  |  Closest pair    |  Check strip around split line   |
  |  Karatsuba       |  Combine partial products        |
  |  D&C DP opt      |  Binary search for opt split     |
  +------------------+----------------------------------+
       ^
       |
  Each has a DIFFERENT combine step --
  the "split + recurse" part is always the same

Self-Test

Before moving on, verify you can do these without looking at notes:

  • [ ] Implement merge sort from memory—both the divide/recurse phase and the merge (combine) phase.
  • [ ] Explain "counting inversions" as a D&C problem: what happens in the combine step, and why does it count cross-inversions correctly?
  • [ ] Apply the master theorem to T(n)=2T(n/2)+O(nlogn)—what is the total complexity?
  • [ ] State when "divide and conquer DP optimization" applies (what property must the DP's cost function have?) and what complexity it achieves.
  • [ ] Identify the combine step in "closest pair of points" divide and conquer—what do you need to check across the dividing line?
  • [ ] Given the array [4, 3, 2, 1], trace the merge sort inversion count by hand and verify you get 6 inversions.
  • [ ] Explain why int mid = (lo + hi) / 2 can overflow and how to fix it.

Practice Problems

#ProblemSourceDifficultyKey Idea
1InversionsCSESEasyDirect merge-sort inversion count
2CF 987C -- Three displaysCF *1400EasyInversion-style pair counting
3Closest PairCSESMediumD&C closest pair of points
4CF 429D -- Tricky FunctionCF *1800MediumReduce to closest pair via prefix sums
5CF 1430E -- String ReversalCF *1700MediumCount inversions in a permutation
6CF 1691F -- K-Set TreeCF *2100Medium-HardD&C pair counting on trees
7CF 220B -- Little Elephant and ArrayCF *1800MediumMerge sort tree for offline queries
8CF 321E -- Ciel and GondolasCF *2400HardD&C optimization on DP
9CF 868F -- Yet Another MinimizationCF *2500HardD&C DP optimization
10CF 442B -- Andrey and ProblemCF *1800MediumGreedy with D&C flavor

If I Had 5 Minutes

D&C isn't about splitting—it's about what you can compute during the merge that would be O(n2) otherwise. Every cross-pair (i,j) with i in the left half and j in the right is expensive to check naively. But if both halves are sorted, a single two-pointer sweep handles all cross-pairs in O(n). The recursion ensures the halves get sorted; the merge step is where the work happens.

  1. Core idea: Split in half, recurse, combine in O(n) → total O(nlogn).
  2. Inversions: When picking from right during merge, inv += mid - i.
  3. Closest pair: Sort by x, recurse halves, check strip of width 2d sorted by y.
  4. Master Theorem: T(n)=2T(n/2)+O(n)O(nlogn).
  5. Gotcha: Use long long for inversion count; stick to half-open [lo, hi).

Think of sorting a deck: split in half, sort each half, then interleave. Every time you reach for the "wrong" pile first, you're counting inversions for free.


Pattern Fingerprint

Constraint / SignalTechniqueReference
"Count inversions" or "min swaps to sort"Merge sort + count during mergeInversion Count
"Closest pair of points"D&C split by x, strip checkClosest Pair of Points
"Count pairs (i<j) with order condition"Modified merge sortCount of Pairs Satisfying an Order Condition
"O(nlogn) and split array in half"Generic D&CBinary Search for simpler variant
"DP with monotone optimal split"D&C DP optimizationD&C Optimization for DP / DP Framework
"Offline range queries on sorted data"Merge sort treeMerge Sort Tree

Rating Progression

text
+--------------------------------------------------------------------+
| CF ~1200: Implement merge sort. Count inversions on a basic array. |
+--------------------------------------------------------------------+
| CF ~1500: Reduce "minimum adjacent swaps to sort" to inversion     |
|           count. Closest pair of points (standard).                |
+--------------------------------------------------------------------+
| CF ~1800: Count pairs (i,j) with custom order condition using      |
|           modified merge. Merge sort tree for offline queries.      |
+--------------------------------------------------------------------+
| CF ~2100+: D&C optimization for DP. CDQ divide and conquer.        |
|            Multi-dimensional pair counting.                        |
+--------------------------------------------------------------------+

Before You Code Checklist

  • [ ] Subproblems independent? If they share state, consider DP instead.
  • [ ] Combine step is O(n)? If it's O(n2), D&C won't help—total stays O(n2).
  • [ ] Using long long? Inversion count can reach (n2)5×109.
  • [ ] Half-open or closed? Pick one convention and never mix them.
  • [ ] Base case correct? Single element = sorted, zero inversions.

What Would You Do If...?

Scenario 1: "Count pairs (i,j) where a[i]a[j]>k and i<j." Modified merge sort. After sorting both halves, use a two-pointer sweep: for each j in the right half, advance a pointer p in the left half while a[p]a[j]k. Cross-pairs = mid - p. Then do the standard merge. See Two Pointers for the sweep technique.

Scenario 2: "Find the k-th smallest element in an unsorted array in O(n)." Not merge-sort D&C—use quickselect (partition-based D&C). Partition around a pivot, recurse into the side containing index k. Average O(n), worst O(n2). In contests, prefer nth_element from <algorithm>.

Scenario 3: "Given n points, count pairs with distance d." D&C closest-pair variant. Split by x, recurse, check cross-pairs in the strip. Beware: the count itself can be O(n2), so D&C only helps if the count is bounded or you need the minimum.


Historical Note

John von Neumann invented merge sort in 1945 while working on the EDVAC computer—one of the earliest stored-program machines. It was the first O(nlogn) sorting algorithm, and its "split, recurse, merge" structure became the blueprint for every divide-and-conquer algorithm that followed. The inversion-counting trick embedded in merge sort's combine step is genuinely ancient, yet it still shows up unchanged in modern contest problems.


The Mistake That Teaches

The bug: You submit your inversion count and get Wrong Answer on test 3.

The merge looks correct, base case returns 0, counting adds mid - i. You test [5,4,3,2,1] → 10 OK. [1,2,3] → 0 OK. What's wrong?

The hunt: You add debug output and find the array is not fully sorted after merge_count returns. The merge produces the right count but never copies tmp back to the original array. Early merges appear correct (single elements are trivially sorted), but larger merges see unsorted data and the formula mid - i gives wrong cross-inversion counts.

The fix: One missing line: copy(tmp.begin(), tmp.end(), a.begin() + lo);


When NOT to Use This

The mnemonic is "Split, Sort, Sweep"—split the problem, sort the halves (via recursion), sweep cross-pairs during merge. When that pattern doesn't fit, don't force it:

  • Value-based pair conditions (a[i]+a[j]=k): Use hash maps or Two Pointers.
  • Longest increasing subsequence: DP + binary search, not D&C. See DP Intro.
  • Online updates needed: D&C is inherently offline. Use a segment tree or BIT. See Tree Basics.
  • Small n (5000): Brute force O(n2) is fine—don't over-engineer.
  • Overlapping subproblems: That's DP, not D&C. See Meet in the Middle for a different splitting strategy.

Debug This

Find the bug in each snippet. Click to reveal the answer.

Bug 1: Off-by-one in merge

cpp
long long merge_count(vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    long long inv = merge_count(a, lo, mid) + merge_count(a, mid, hi);
    vector<int> tmp;
    int i = lo, j = mid;
    while (i <= mid && j < hi) {           // <-- look here
        if (a[i] <= a[j]) tmp.push_back(a[i++]);
        else { inv += mid - i; tmp.push_back(a[j++]); }
    }
    while (i <= mid) tmp.push_back(a[i++]);
    while (j < hi) tmp.push_back(a[j++]);
    copy(tmp.begin(), tmp.end(), a.begin() + lo);
    return inv;
}
Answer With half-open [lo, hi), the left half is [lo, mid). Using i <= mid reads a[mid] which belongs to the right half -- double-counting it. Fix: i < mid.

Bug 2: Integer overflow

cpp
int merge_count(vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    int inv = merge_count(a, lo, mid) + merge_count(a, mid, hi);
    // ... merge logic with inv += mid - i ...
    return inv;
}
Answer Return type and inv are int. For n = 2*10^5 reverse-sorted, inversions reach ~2*10^10 -- overflows int. Fix: use long long.

Bug 3: Wrong base case

cpp
long long merge_count(vector<int>& a, int lo, int hi) {
    if (hi - lo <= 0) return 0;   // <-- look here
    int mid = (lo + hi) / 2;
    long long inv = merge_count(a, lo, mid) + merge_count(a, mid, hi);
    // ... merge logic ...
    return inv;
}
Answer When hi - lo == 1, mid = lo, so merge_count(a, lo, lo) returns 0 but merge_count(a, lo, hi) recurses with identical arguments -> infinite recursion and stack overflow. Fix: hi - lo <= 1.

Bug 4: Forgetting to copy back

cpp
long long merge_count(vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    long long inv = merge_count(a, lo, mid) + merge_count(a, mid, hi);
    vector<int> tmp;
    int i = lo, j = mid;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) tmp.push_back(a[i++]);
        else { inv += mid - i; tmp.push_back(a[j++]); }
    }
    while (i < mid) tmp.push_back(a[i++]);
    while (j < hi) tmp.push_back(a[j++]);
    // copy is missing here!
    return inv;
}
Answer The sorted result in tmp is never written back to a[]. Future merges see unsorted data and compute wrong cross-inversion counts. Fix: add copy(tmp.begin(), tmp.end(), a.begin() + lo); before the return.

Further Reading


What to Solve Next

Built for the climb from Codeforces 1100 to 2100.