DP -- Divide and Conquer Optimization

Quick Revisit
USE WHEN: Partition DP where $opt (i, j) \leq opt (i, j + 1)$ — optimal split point is monotone (cost satisfies quadrangle inequality)
INVARIANT: For the midpoint $j_{m}$ , find $opt (i, j_{m})$ ; left half searches $m \leq opt$ , right half searches $m \geq opt$
TIME: O(kn log n) | SPACE: O(n)
CLASSIC BUG: Passing wrong $m$ -search bounds into recursive calls (must be $[m_{l o}, opt]$ and $[opt, m_{h i}]$ , not the $j$ bounds)
PRACTICE: 04-ladder-dp

AL-32 | When the optimal split point is monotone, cut the DP transition from $O (k n^{2})$ to $O (k n \log n)$ -- the standard trick for partition-cost DPs on Codeforces Div 1 C/D.

Difficulty: [Advanced]Prerequisites: DP -- 1D Introduction, Binary SearchCF Rating: 2000-2300+

Intuition
- Visual Walkthrough * The Invariant * What the Code Won't Teach You * When to Reach for This
C++ STL Reference
Implementation (Contest-Ready)
Operations Reference
Problem Patterns & Tricks
Contest Cheat Sheet
Gotchas & Debugging * Mental Traps
Self-Test
Practice Problems
Further Reading

Intuition

You want to split a row of $n$ data points into exactly $k$ contiguous segments, minimizing a total cost. The natural DP tries every possible split point for every position -- and that inner loop blows up.

You have a recurrence that partitions $n$ elements into $k$ contiguous groups:

d p [i] [j] = min_{0 \leq m < j} (d p [i - 1] [m] + C (m + 1, j))

Layer $i$ represents "use exactly $i$ groups to cover positions $1. . j$ ." For each of the $k$ layers you try every split point $m$ for every position $j$ , giving $O (n)$ work per state and $O (k n^{2})$ total.

When $n = 10^{5}$ and $k = 20$ , that is $2 \times 10^{11}$ operations -- hopelessly slow. You need to cut the inner loop from $O (n)$ to $O (\log n)$ per state.

Let $opt (i, j)$ denote the smallest split point $m$ that achieves the minimum for state $(i, j)$ .

If the optimal split point $opt (i, j)$ is monotone non-decreasing in $j$ (i.e., $opt (i, j) \leq opt (i, j + 1)$ for all $j$ ), you can solve each row of the DP with divide and conquer, reducing the total cost from $O (k n^{2})$ to $O (k n \log n)$ .

Why does monotonicity help? Once you know $opt (i, j_{m})$ for the midpoint $j_{m}$ of some range $[j_{l o}, j_{h i}]$ , you know:

Every $j < j_{m}$ has $opt (i, j) \leq opt (i, j_{m})$ .
Every $j > j_{m}$ has $opt (i, j) \geq opt (i, j_{m})$ .

So you split the problem into two halves, each with a strictly smaller search range for $m$ . This is divide and conquer on the $j$ -axis, and each "level" of the recursion scans a total of $O (n)$ split points, across $O (\log n)$ levels.

The monotonicity condition holds whenever the cost function $C (l, r)$ satisfies the quadrangle inequality (QI):

C (a, c) + C (b, d) \leq C (a, d) + C (b, c) for a \leq b \leq c \leq d

Intuitively, "nested intervals cost no more than crossing intervals." Squared range sums, convex functions of prefix-sum differences, and many combinatorial costs satisfy QI.

Naive vs D&C: Scanning Pattern Comparison

  NAIVE O(kn²): For each j, scan ALL possible split points m.
  Every cell scans the full range independently.

  j=1:  [x]
  j=2:  [x x]
  j=3:  [x x x]
  j=4:  [x x x x]
  j=5:  [x x x x x]
  j=6:  [x x x x x x]
  j=7:  [x x x x x x x]
  j=8:  [x x x x x x x x]
        ─────────────────────
        m=0 1 2 3 4 5 6 7        Total x's ~= n²/2

  D&C OPTIMIZATION O(kn log n): Solve midpoint first, then
  constrain each half's search range using monotonicity.

  Level 0 -- solve j=4:    [. . . x x x x .]       scan m in [0..7]
                                                           ↓ opt(4)=2
  Level 1 -- solve j=2,6:  [. x x . . . . .]       m in [0..2]
                           [. . . . . x x x]       m in [2..7]
                                                           ↓
  Level 2 -- solve j=1,3,  [x . . . . . . .]       m in [0..1]
             5,7:          [. . x . . . . .]       m in [1..2]
                           [. . . . x x . .]       m in [2..5]
                           [. . . . . . x x]       m in [5..7]

  Each level's x's partition [0..n], so total per level = O(n).
  Levels = O(log n).  Grand total = O(n log n) per layer.

Visual Walkthrough

Setup: $n = 8$ , $k = 2$ (two groups). Layer $i = 2$ : compute $d p [2] [j]$ for $j = 2. .8$ using values from $d p [1] [*]$ . Suppose the optimal split points turn out to be:

  j:      2   3   4   5   6   7   8
  opt:    1   1   2   2   3   5   6
                          ^
          non-decreasing -- monotonicity holds

Step 1 -- Solve the midpoint. Range $j \in [2, 8]$ , search $m \in [1, 7]$ . Midpoint $j_{m} = 5$ . Scan $m = 1. .4$ and find $opt (2, 5) = 2$ .

  j:    [2 ---- 5 ---- 8]       m-range: [1 .. 7]
               ^
            solve j=5 -> opt=2

Step 2 -- Recurse left. Range $j \in [2, 4]$ , search $m \in [1, 2]$ (narrowed: $m \leq opt (2, 5) = 2$ ). Midpoint $j_{m} = 3$ . Scan $m = 1. .2$ and find $opt (2, 3) = 1$ .

  j:    [2 -- 3 -- 4]           m-range: [1 .. 2]
             ^
          solve j=3 -> opt=1

Then recurse into $j = 2$ with $m \in [1, 1]$ and $j = 4$ with $m \in [1, 2]$ .

Step 3 -- Recurse right. Range $j \in [6, 8]$ , search $m \in [2, 7]$ (narrowed: $m \geq opt (2, 5) = 2$ ). Midpoint $j_{m} = 7$ . Scan $m = 2. .6$ and find $opt (2, 7) = 5$ .

  j:    [6 -- 7 -- 8]           m-range: [2 .. 7]
             ^
          solve j=7 -> opt=5

Then recurse into $j = 6$ with $m \in [2, 5]$ and $j = 8$ with $m \in [5, 7]$ .

Step 4 -- Leaf calls. Each remaining sub-range has one element. Scan the narrowed $m$ -range to fill in $d p [2] [2]$ , $d p [2] [4]$ , $d p [2] [6]$ , $d p [2] [8]$ .

  Full recursion tree (level -> total m-scans = O(n)):

  Level 0:  j in [2..8],  m in [1..7]      scan 4 values for j=5
                /                \
  Level 1:  [2..4] m:[1..2]    [6..8] m:[2..7]
            scan 2 for j=3      scan 5 for j=7
            /      \            /      \
  Level 2: [2]    [4]        [6]     [8]
           m:[1,1] m:[1,2]   m:[2,5]  m:[5,7]
           scan 1  scan 2    scan 4   scan 3

  Total scans across all levels: O(n) per level * O(log n) levels

The Invariant

Invariant: When computing $d p [i] [j]$ for $j \in [j_{l o}, j_{h i}]$ , the optimal split point lies in $[m_{l o}, m_{h i}]$ . Solving the midpoint $j_{m i d}$ and finding $opt (i, j_{m i d})$ splits both the $j$ -range and the $m$ -range:
Left half: $j \in [j_{l o}, j_{m i d} - 1]$ , $m \in [m_{l o}, opt (i, j_{m i d})]$
Right half: $j \in [j_{m i d} + 1, j_{h i}]$ , $m \in [opt (i, j_{m i d}), m_{h i}]$
Each recursive level partitions $[m_{l o}, m_{h i}]$ into non-overlapping scans (plus shared endpoints), so total work per level is $O (n)$ . With $O (\log n)$ levels, one layer costs $O (n \log n)$ and the full DP costs $O (k n \log n)$ .

General Recursion Tree of solve(l, r, opt_lo, opt_hi)

  solve(l, r, opt_lo, opt_hi) splits the j-range [l..r] at its
  midpoint and narrows the opt-range for each half.

  solve(1, 8, 0, 7)                        j-range=[1..8], opt-range=[0..7]
       │  mid=4, scan m∈[0..7] -> opt(4)=2
       │
       ├── solve(1, 3, 0, 2)               j=[1..3], opt=[0..2]  <- narrowed!
       │       │  mid=2, scan m∈[0..2] -> opt(2)=1
       │       ├── solve(1, 1, 0, 1)       j=[1], opt=[0..1]
       │       │       scan m∈[0..1] -> done (leaf)
       │       └── solve(3, 3, 1, 2)       j=[3], opt=[1..2]
       │               scan m∈[1..2] -> done (leaf)
       │
       └── solve(5, 8, 2, 7)               j=[5..8], opt=[2..7]  <- narrowed!
               │  mid=6, scan m∈[2..7] -> opt(6)=4
               ├── solve(5, 5, 2, 4)       j=[5], opt=[2..4]
               │       scan m∈[2..4] -> done (leaf)
               └── solve(7, 8, 4, 7)       j=[7..8], opt=[4..7]
                       │  mid=7, scan m∈[4..7] -> opt(7)=5
                       ├── solve(8, 8, 5, 7)   (leaf)
                       └── (empty left)

  KEY: At each level, the opt-ranges are non-overlapping (up to shared
  endpoints), so total scan work per level = O(n). Depth = O(log n).

DP Table Filling: How D&C Narrows the Search

  Rows = layers (k), Columns = positions (j).
  Each row is filled by one D&C pass over the previous row.

         j=1   j=2   j=3   j=4   j=5   j=6   j=7   j=8
  k=1:  [base] [base] [base] [base] [base] [base] [base] [base]
         │      │      │      │      │      │      │      │
         ▼      ▼      ▼      ▼      ▼      ▼      ▼      ▼
  k=2:  [    ] [    ] [    ] [MID ] [    ] [    ] [    ] [    ]
                                ↑ solve mid first
                             opt=2
                        ┌───────┴───────┐
                   left half         right half
                   j∈[1..3]          j∈[5..8]
                   opt∈[0..2]        opt∈[2..7]
                        │                 │
                        ▼                 ▼
  k=2:  [    ] [MID ] [    ] [done] [    ] [MID ] [    ] [    ]
                  ↑                          ↑
               opt=1                      opt=4
           ┌────┴────┐               ┌────┴──────┐
          j=1       j=3             j=5     j∈[7..8]
         opt∈[0,1] opt∈[1,2]      opt∈[2,4] opt∈[4,7]

  k=3:  (same D&C process, using completed k=2 row as "prv")

  Arrow pattern: each layer reads the FULL previous layer (prv[]),
  but the D&C recursion constrains which m-values are scanned.

Worked Example: Cell-by-Cell Trace for n = 6, k = 2

  Computing dp[2][j] for j = 1..6.   opt-range starts as [0..5].
  Suppose cost(m+1, j) = (j - m)².   dp[1][j] = j² (base layer).

  Level 0:  solve(1, 6, 0, 5)
    mid j=3, scan m ∈ {0,1,2,3,4,5}:
      m=0: dp[1][0]+cost(1,3) = 0+9  = 9
      m=1: dp[1][1]+cost(2,3) = 1+4  = 5
      m=2: dp[1][2]+cost(3,3) = 4+1  = 5  <- tie, take m=2
    -> dp[2][3] = 5,  opt(2,3) = 2

  Level 1L: solve(1, 2, 0, 2)
    mid j=1, scan m ∈ {0}:
      m=0: 0+1 = 1
    -> dp[2][1] = 1,  opt(2,1) = 0
    Then j=2, scan m ∈ {0,1,2}:
      m=0: 0+4 = 4    m=1: 1+1 = 2    m=2: 4+0 = 4
    -> dp[2][2] = 2,  opt(2,2) = 1

  Level 1R: solve(4, 6, 2, 5)
    mid j=5, scan m ∈ {2,3,4}:
      m=2: 4+9 = 13   m=3: 9+4 = 13   m=4: 16+1 = 17
    -> dp[2][5] = 13,  opt(2,5) = 2
    Then j=4, scan m ∈ {2}:
      m=2: 4+4 = 8
    -> dp[2][4] = 8,  opt(2,4) = 2
    Then j=6, scan m ∈ {2,3,4,5}:
      m=2: 4+16 = 20  m=3: 9+9 = 18  m=4: 16+4 = 20  m=5: 25+1 = 26
    -> dp[2][6] = 18,  opt(2,6) = 3

  Summary:  j:   1   2   3   4   5   6
           dp:   1   2   5   8  13  18
          opt:   0   1   2   2   2   3    <- non-decreasing OK

  Total m-scans: 1 + 3 + 6 + 1 + 3 + 4 = 18  (vs 6x6 = 36 naive)
  Savings grow with n:  O(n log n) vs O(n²).

What the Code Won't Teach You

How to verify the monotonicity condition in practice. Before applying D&C optimization to a new problem, implement the naive $O (k n^{2})$ DP first. Add a line that records opt[i][j] -- the split point $m$ that achieves the minimum for each state $(i, j)$ . Print opt[i][j] for a fixed row $i$ across all $j$ . If the values are non-decreasing, the precondition holds. If they're not, D&C optimization will silently give wrong answers. This takes 5 minutes and saves hours of debugging.

cpp

// Verification snippet -- add to your naive O(kn²) DP:
for (int j = 1; j <= n; j++) {
    for (int m = 0; m < j; m++) {
        if (prv[m] + cost(m+1, j) < cur[j]) {
            cur[j] = prv[m] + cost(m+1, j);
            opt[j] = m;   // record optimal split
        }
    }
}
// After filling row i, print opt[j] values:
for (int j = 1; j <= n; j++) cerr << "opt[" << j << "]=" << opt[j] << " ";
cerr << "\n";
// Expected: non-decreasing sequence. If not, D&C optimization does NOT apply.

The relationship between D&C optimization and Knuth's optimization. Both exploit the same underlying phenomenon -- monotonicity of optimal split points. The difference is the DP shape they apply to:

Knuth's optimization: Interval DP of the form $d p [l] [r] = min_{l \leq m < r} (d p [l] [m] + d p [m + 1] [r] + C (l, r))$ . Reduces $O (n^{3}) \to O (n^{2})$ . The split point $m$ satisfies $opt (l, r - 1) \leq opt (l, r) \leq opt (l + 1, r)$ .
D&C optimization: Layered/partitioning DP of the form $d p [i] [j] = min_{m} (d p [i - 1] [m] + C (m + 1, j))$ . Reduces $O (k n^{2}) \to O (k n \log n)$ . The split point satisfies $opt (i, j) \leq opt (i, j + 1)$ .

Both require the quadrangle inequality on the cost function, but they apply it to different recurrence structures. Using D&C on an interval DP (or Knuth's on a partitioning DP) won't work.

Why the recursion depth is $O (\log n)$ . Each call to solve(jl, jr, kl, kr) picks the midpoint $j m = (j l + j r) / 2$ and recurses on $[j l, j m - 1]$ and $[j m + 1, j r]$ . The $j$ -range halves at each level, so the recursion has depth $⌈ \log_{2} n ⌉$ . At each level, the $m$ -ranges across all recursive calls at that level partition $[0, n]$ (up to shared endpoints), so total scan work per level is $O (n)$ . Multiply: $O (\log n)$ levels $\times$ $O (n)$ per level $= O (n \log n)$ per layer.

With those deeper insights in hand, here is how to recognize D&C optimization problems in practice.

When to Reach for This

Trigger phrases in problem statements:

"Partition array into exactly $k$ contiguous groups, minimize total cost."
You write a 1D/1D DP and notice (or can prove) that $opt$ is monotone.
Cost function is a concave or convex function of a prefix-sum difference (squared sums, absolute deviations, etc.).

Quick QI test: if $C (l, r)$ depends only on $r - l$ and is convex, QI usually holds. When in doubt, brute-force verify on small $n$ :

cpp

for (int a = 1; a <= n; a++)
  for (int b = a; b <= n; b++)
    for (int c = b; c <= n; c++)
      for (int d = c; d <= n; d++)
        assert(cost(a,c) + cost(b,d) <= cost(a,d) + cost(b,c));

Counter-examples -- when this does NOT apply:

$opt$ is not monotone (e.g., cost is the number of distinct elements in a range -- QI fails in general). Use a different technique.
Cost is linear in the split point (no curvature). Use the Convex Hull Trick (CHT) instead.
Cost has no structure at all. Fall back to $O (k n^{2})$ or consider Knuth's optimization if the Knuth conditions hold.

C++ STL Reference

The implementation uses only basic containers and algorithms:

Function / Class	Header	What it does	Time Complexity
`vector<long long>`	`<vector>`	DP rows `prv[]`, `cur[]`	$O (n)$ alloc
`fill(first, last, val)`	`<algorithm>`	Reset DP row to `LLONG_MAX`	$O (n)$
`partial_sum(first, last, dest)`	`<numeric>`	Build prefix sums for cost queries	$O (n)$
`min(a, b)`	`<algorithm>`	Pick best transition	$O (1)$
`LLONG_MAX`	`<climits>`	Infinity sentinel for min-DP	--
`function<void(...)>`	`<functional>`	Recursive lambda (if needed)	--

Implementation (Contest-Ready)

Version 1: Minimal Contest Template

⚠️ Monotonicity is a PRECONDITION, not a hope. The D&C optimization requires that the optimal split point opt(i, j) satisfies opt(i, j) <= opt(i, j+1) (or >=). If this doesn't hold, the optimization gives wrong answers silently.
How to verify: Before applying D&C optimization, implement the O(n²) DP and print opt(i, j) for each j in a fixed row i. If the values are non-decreasing, proceed. If not, D&C optimization does NOT apply.
Sufficient condition (Knuth/SMAWK): If the cost function C(l, r) satisfies the quadrangle inequality: C(a, c) + C(b, d) <= C(a, d) + C(b, c) for a <= b <= c <= d, then monotonicity holds.

Solves: partition array $a [1. . n]$ into exactly $K$ contiguous groups, minimize $\sum C (l, r)$ where $C (l, r) = (sum of a [l . . r])^{2}$ .

cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, K;
    cin >> n >> K;
    vector<long long> a(n + 1), pre(n + 1, 0);
    for (int i = 1; i <= n; i++) { cin >> a[i]; pre[i] = pre[i - 1] + a[i]; }

    auto cost = [&](int l, int r) -> long long {
        long long s = pre[r] - pre[l - 1];
        return s * s;
    };

    const long long INF = 1e18;
    vector<long long> prv(n + 1, INF), cur(n + 1, INF);

    // Base layer: 1 group covering [1..j]
    for (int j = 1; j <= n; j++) prv[j] = cost(1, j);

    for (int i = 2; i <= K; i++) {
        fill(cur.begin(), cur.end(), INF);
        // D&C: compute cur[jl..jr] with k in [kl..kr]
        auto solve = [&](auto& self, int jl, int jr, int kl, int kr) -> void {
            if (jl > jr) return;
            int jm = (jl + jr) / 2;
            int best_k = kl;
            for (int k = kl; k <= min(jm - 1, kr); k++) {
                long long val = prv[k] + cost(k + 1, jm);
                if (val < cur[jm]) { cur[jm] = val; best_k = k; }
            }
            self(self, jl, jm - 1, kl, best_k);
            self(self, jm + 1, jr, best_k, kr);
        };
        solve(solve, i, n, i - 1, n - 1);
        swap(prv, cur);
    }
    cout << prv[n] << "\n";
}

Version 2: Explained

cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, K;
    cin >> n >> K;
    vector<long long> a(n + 1), pre(n + 1, 0);
    for (int i = 1; i <= n; i++) {
        cin >> a[i];
        pre[i] = pre[i - 1] + a[i];
    }

    // Cost of one group covering [l..r].
    // Here: squared range sum.  Replace with your problem's cost.
    auto cost = [&](int l, int r) -> long long {
        long long s = pre[r] - pre[l - 1];
        return s * s;
    };

    const long long INF = 1e18;
    // prv[j] = dp[i-1][j], cur[j] = dp[i][j].
    // Only two rows needed -- previous layer and current layer.
    vector<long long> prv(n + 1, INF), cur(n + 1, INF);

    // Layer 1: exactly one group, so dp[1][j] = cost(1, j).
    for (int j = 1; j <= n; j++)
        prv[j] = cost(1, j);

    // Layers 2..K: use divide and conquer on j.
    for (int i = 2; i <= K; i++) {
        fill(cur.begin(), cur.end(), INF);

        // solve(jl, jr, kl, kr):
        //   Compute cur[j] for j in [jl..jr].
        //   The optimal k for these j-values lies in [kl..kr].
        //
        // Strategy:
        //   1. Pick midpoint jm = (jl+jr)/2.
        //   2. Brute-force scan k in [kl..min(jm-1, kr)] to find opt(jm).
        //      (k < jm because group i must be non-empty: [k+1..jm].)
        //   3. Recurse left  [jl..jm-1] with k in [kl..opt(jm)].
        //   4. Recurse right [jm+1..jr] with k in [opt(jm)..kr].
        //
        // Why correct: opt is monotone, so left half needs k <= opt(jm)
        // and right half needs k >= opt(jm).
        auto solve = [&](auto& self, int jl, int jr, int kl, int kr) -> void {
            if (jl > jr) return;
            int jm = (jl + jr) / 2;
            int best_k = kl;

            // Scan all feasible k for position jm.
            // k can be at most jm-1 (so that group [k+1..jm] is non-empty),
            // and at most kr (upper bound from monotonicity).
            for (int k = kl; k <= min(jm - 1, kr); k++) {
                long long val = prv[k] + cost(k + 1, jm);
                if (val < cur[jm]) {
                    cur[jm] = val;
                    best_k = k;
                }
            }

            // Left half: j < jm, so opt <= best_k.
            self(self, jl, jm - 1, kl, best_k);
            // Right half: j > jm, so opt >= best_k.
            self(self, jm + 1, jr, best_k, kr);
        };

        // j ranges from i (need at least i elements for i groups) to n.
        // k ranges from i-1 (previous layer needs at least i-1 elements) to n-1.
        solve(solve, i, n, i - 1, n - 1);
        swap(prv, cur);
    }

    // After K layers, answer is in prv[n] (swapped from cur).
    cout << prv[n] << "\n";
}

Operations Reference

Operation	Time	Space
Build prefix sums	$O (n)$	$O (n)$
Base layer (1 group)	$O (n)$	--
One D&C layer	$O (n \log n)$	$O (n)$
Full DP ( $K$ layers)	$O (K n \log n)$	$O (n)$
Naive DP (no optimization)	$O (K n^{2})$	$O (n)$
Cost function query (prefix sums)	$O (1)$	--

Problem Patterns & Tricks

Pattern 1: Partition into K Groups -- Squared Cost

Partition array into exactly $K$ contiguous groups, minimize $\sum (group sum)^{2}$ . The squared-sum cost satisfies QI. Direct application of the template.

cpp

auto cost = [&](int l, int r) -> long long {
    long long s = pre[r] - pre[l - 1];
    return s * s;
};

CF examples: CF 868F (Yet Another Minimization Problem)

Pattern 2: Partition into K Groups -- Count of Distinct Pairs

Cost = number of pairs $(x, y)$ with $x < y$ and $a [x] = a [y]$ in the group. Maintain a frequency array and update incrementally as $k$ moves (similar to Mo's algorithm). Satisfies QI.

CF examples: CF 868F

Pattern 3: Minimum Cost to Cut -- 1D Facility Location

Place $K$ facilities on a line; each client goes to the nearest one. Cost of a segment is $\sum | a [j] - median |$ or a similar convex function of the group. Monotone opt applies.

CF examples: CF 1527E (Partition Game)

Pattern 4: CSES-Style "Divide into K Segments"

Minimize the maximum segment sum, or minimize a convex function of segment sums. When the objective is convex in the partition, D&C optimization applies. Sometimes combined with binary search on the answer ("aliens trick" / lambda optimization).

CF examples: CSES - Dividing into Segments

Pattern 5: Building Bridges / Weighted Intervals

$d p [j] = min_{k < j} (d p [k] + C (k, j))$ -- a 1D DP where the cost function has monotone $opt$ . This is the single-layer variant ( $K = 1$ outer loop, just one D&C call). Often appears as "building bridges" or "placing checkpoints."

cpp

// Single-layer D&C optimization
auto solve = [&](auto& self, int jl, int jr, int kl, int kr) -> void {
    if (jl > jr) return;
    int jm = (jl + jr) / 2, best_k = kl;
    for (int k = kl; k <= min(jm - 1, kr); k++)
        if (dp[k] + cost(k + 1, jm) < dp[jm])
            { dp[jm] = dp[k] + cost(k + 1, jm); best_k = k; }
    self(self, jl, jm - 1, kl, best_k);
    self(self, jm + 1, jr, best_k, kr);
};

CF examples: CF 321E (Ciel and Gondola)

Pattern 6: Online Cost with Incremental Updates

When $C (l, r)$ cannot be computed in $O (1)$ but can be extended left/right in $O (1)$ (like a frequency counter), scan $k$ from $k l$ to $k r$ while maintaining a sliding data structure. Total work is still $O (n \log n)$ per layer because the $k$ -ranges at each recursion level form a partition of $[0, n]$ .

CF examples: CF 868F

Contest Cheat Sheet

+----------------------------------------------------------------+
|          DP DIVIDE AND CONQUER OPTIMIZATION                    |
+----------------------------------------------------------------+
| WHEN TO USE:                                                   |
|   - dp[i][j] = min over k { dp[i-1][k] + C(k+1,j) }         |
|   - opt(i,j) is non-decreasing in j (monotone split)          |
|   - C(l,r) satisfies quadrangle inequality                    |
|   - Reduces O(Kn^2) to O(Kn log n)                            |
+----------------------------------------------------------------+
| TEMPLATE CORE:                                                 |
|   solve(jl, jr, kl, kr):                                      |
|     jm = (jl+jr)/2                                             |
|     scan k in [kl..min(jm-1, kr)] -> find best_k              |
|     recurse left:  (jl, jm-1, kl, best_k)                     |
|     recurse right: (jm+1, jr, best_k, kr)                     |
+----------------------------------------------------------------+
| KEY BOUNDS:                                                    |
|   j in [i .. n],  k in [i-1 .. n-1]                           |
|   Two rows only: prv[] and cur[], swap after each layer        |
+----------------------------------------------------------------+
| COMPLEXITY:  Time O(Kn log n)   Space O(n)                     |
| VERIFY QI:   C(a,c)+C(b,d) <= C(a,d)+C(b,c)  for a<=b<=c<=d  |
+----------------------------------------------------------------+

Gotchas & Debugging

Two-Row Requirement

You must have the previous layer fully computed before starting D&C on the current layer. Do not try to compute in-place -- the recursion reads prv[k] values that must not be overwritten.

k-Range Bounds

The most common bug: passing wrong kl / kr to recursive calls.

Left call gets kl to best_k (not best_k - 1).
Right call gets best_k to kr (not best_k + 1).

best_k is included in both halves. This is correct because the left half constrains $opt \leq best_k$ and the right half constrains $opt \geq best_k$ .

k < jm Constraint

In the inner scan, k must satisfy k <= jm - 1 so that group $[k + 1, j m]$ is non-empty. Use min(jm - 1, kr) as the loop bound. Forgetting this causes index-out-of-range or empty groups.

Cost Function Must Be Correct

The D&C trick only helps with the transition scan -- if $C (l, r)$ is buggy, no amount of optimization saves you. Test $C$ independently on small cases.

Verifying QI

If unsure whether your cost function satisfies QI:

cpp

// Brute-force QI check on small n (debugging only)
for (int a = 1; a <= n; a++)
  for (int b = a; b <= n; b++)
    for (int c = b; c <= n; c++)
      for (int d = c; d <= n; d++)
        assert(cost(a,c) + cost(b,d) <= cost(a,d) + cost(b,c));

Overflow

If $C (l, r)$ involves squared sums and $a [i] \leq 10^{9}$ , the sum can reach $10^{14}$ and the square $10^{28}$ -- this overflows long long. Use __int128 or restructure the cost.

Debugging Tips

Print best_k for a small example and verify monotonicity by hand.
Compare output against the $O (K n^{2})$ naive solution on random inputs up to $n = 50$ .
If TLE, check that your cost function is truly $O (1)$ per query (or that the incremental version is amortized correctly).

Off-by-one in the opt-search bound

The single tricky line in the entire template is the inner-loop bound: for (int k = kl; k <= min(jm - 1, kr); k++). Writing min(jm, kr) instead of min(jm - 1, kr) lets k = jm, which reads an uncomputed value from the previous layer (or an empty group [jm+1, jm]) and silently corrupts dp[i][jm]. Every D&C optimization bug I've hit lived in this single line -- comment it with // k must be strictly less than jm and re-read it before submitting.

Mental Traps

"The monotonicity condition is obvious." It is not. You must prove that $opt (i, j) \leq opt (i, j + 1)$ for your specific cost function. The quadrangle inequality is a sufficient condition, but it is not always easy to verify by inspection. Some cost functions that "look" monotone are not (e.g., cost based on the number of distinct elements in a range). Always verify empirically on small inputs or prove it analytically before committing to D&C optimization.

  THE MONOTONICITY PROPERTY YOU MUST VERIFY:

  For a fixed layer i:

    opt(i,1) <= opt(i,2) <= opt(i,3) <= ... <= opt(i,n)
       │          │          │                    │
       ▼          ▼          ▼                    ▼
    best m     best m     best m              best m
    for j=1    for j=2    for j=3             for j=n

  OK VALID (monotone):    opt = [0, 0, 1, 2, 2, 3, 5, 6]
  X INVALID (not mono):  opt = [0, 2, 1, 3, 2, 5, 4, 6]
                                    ↑ ↑        ↑ ↑
                               2>1 violates    5>4 violates

  If INVALID: D&C optimization restricts the search range for some j
  to NOT include its true optimal m -> silent wrong answers.

"D&C optimization always applies when cost is concave/convex." Concavity or convexity of $C (l, r)$ as a function of the interval is sufficient for the quadrangle inequality (and therefore monotone split points), but it is not the only way monotonicity arises. The key condition is monotonicity of $opt (i, j)$ -- not concavity itself. There exist cost functions that are neither concave nor convex but still have monotone optimal splits. Conversely, there are cost functions with local concavity that still violate QI. Always check the actual condition (monotone opt), not a proxy.

"I can just use this whenever the naive DP is $O (k n^{2})$ ." The $O (k n^{2}) \to O (k n \log n)$ reduction only works when the QBSM (Quadrangle inequality / optimal Break-point Sorted Monotonicity) condition holds. Without it, the D&C recursion prunes the search range incorrectly. Other $O (k n^{2})$ DPs may need entirely different optimizations: the Convex Hull Trick (when the transition is linear in $m$ ), Knuth's optimization (for interval DPs), or the SMAWK algorithm (for totally monotone matrices).

Debug Drills

Drill 1: Scanning opt in wrong range

What's wrong?

cpp

void solve(int k, int lo, int hi, int opt_lo, int opt_hi) {
    if (lo > hi) return;
    int mid = (lo + hi) / 2;
    int best_j = opt_lo;
    for (int j = opt_lo; j <= opt_hi; j++) {  // BUG: j can exceed mid!
        long long val = dp[k-1][j] + cost(j + 1, mid);
        if (val < dp[k][mid]) {
            dp[k][mid] = val;
            best_j = j;
        }
    }
    solve(k, lo, mid - 1, opt_lo, best_j);
    solve(k, mid + 1, hi, best_j, opt_hi);
}

The loop bound j <= opt_hi doesn't enforce j < mid. When opt_hi >= mid, the loop considers j = mid, which means cost(mid + 1, mid) is evaluated (empty range) -- silent corruption. Fix: Change the loop bound to j <= min(opt_hi, mid - 1).

Drill 2: Forgot to initialize dp to infinity

What's wrong?

cpp

long long dp[K_MAX + 1][N_MAX + 1]; // uninitialized!

void solve_all() {
    for (int i = 1; i <= n; i++)
        dp[1][i] = cost(1, i); // base case
    for (int k = 2; k <= K; k++)
        solve(k, 1, n, 0, n - 1);
    cout << dp[K][n] << endl;
}

dp[k][i] for k >= 2 is never initialized to infinity. The solve routine compares val < dp[k][mid] to update, but dp[k][mid] starts as 0 (or garbage). Since val is positive, no update fires, and the answer is 0. Fix: Before each solve(k, ...) call, initialize dp[k][1..n] = INF (e.g., 1e18).

Drill 3: Cost function not O(1)

What's wrong?

cpp

// cost(l, r) = number of distinct elements in a[l..r]
int cost(int l, int r) {
    set<int> s;
    for (int i = l; i <= r; i++) s.insert(a[i]);
    return s.size();
    // BUG: O(n log n) per call -> total O(n^2 log^2 n)
}

The D&C optimization makes O(n log n) calls to cost(). An O(n) cost function blows the complexity back up past the naive O(Kn²). Fix: Use incremental cost with a global frequency array and maintain cost(l, r) by adding/removing one element at a time as pointers shift. Total pointer movement across the recursion is O(n log n), so amortized updates stay within budget (the CF 868F pattern).

Self-Test

Before using this in a contest, verify you can do all of the following without looking at the notes:

[ ] State the D&C optimization precondition: the optimal split point $opt (i, j)$ must be monotone non-decreasing in $j$ for each fixed layer $i$ .
[ ] Explain how the solve(l, r, opt_lo, opt_hi) recursion reduces $O (k n^{2})$ to $O (k n \log n)$ : solving the midpoint narrows the opt-range for each half, and the total scan work per recursion level is $O (n)$ across $O (\log n)$ levels.
[ ] Verify the monotonicity condition on a small example ( $n \leq 8$ ) by hand: compute $opt (i, j)$ for all $j$ in a fixed row and check they're non-decreasing.
[ ] Implement the D&C optimization for a partitioning problem from scratch in under 15 minutes (write solve, the layer loop, and the cost function).
[ ] Distinguish when to use D&C optimization vs Knuth's optimization vs Convex Hull Trick: D&C for layered partition DPs with monotone opt; Knuth's for interval DPs $d p [l] [r]$ with QI; CHT when the transition is a linear function of the split point.
[ ] Explain why this technique requires computing $C (l, r)$ efficiently: the inner scan evaluates $C (m + 1, j)$ for each candidate $m$ , so $O (1)$ cost queries are needed to keep the per-level work at $O (n)$ .

Practice Problems

CF Rating	What you should be comfortable with
1200	Understand the O(Kn²) partition DP; write the brute-force three-nested-loop version
1500	Know what "monotonicity of opt" means; verify the monotonicity condition on small cases
1800	Implement the D&C template cleanly; precompute prefix sums for O(1) cost queries; solve CF 321E
2100	Prove the quadrangle inequality for non-trivial cost functions; combine D&C with incremental cost updates (CF 868F); distinguish D&C from CHT and Knuth

#	Problem	Source	Difficulty	Key Idea
1	Ciel and Gondola	CF 321E	2000	Partition into $K$ groups, direct D&C opt template
2	Partition Game	CF 1527E	2100	Partition cost based on repeated elements
3	Yet Another Minimization Problem	CF 868F	2500	D&C opt with incremental cost (frequency pairs)
4	Tear Up the Floor	CF 1609F	2400	Partition rows, quadrangle inequality cost
5	Lances	CF 1707C	2400	D&C opt on geometric cost
6	APIO 2014 -- Sequence	APIO	Hard	Classic K-partition, squared cost
7	IOI 2014 -- Holiday	IOI	Hard	D&C optimization with segment tree cost
8	Guardians of the Lunatics	SPOJ	Medium	Partition into K groups, absolute cost

DP -- Divide and Conquer Optimization ​

Contents ​

Intuition ​

Naive vs D&C: Scanning Pattern Comparison ​

Visual Walkthrough ​

The Invariant ​

General Recursion Tree of solve(l, r, opt_lo, opt_hi) ​

DP Table Filling: How D&C Narrows the Search ​

Worked Example: Cell-by-Cell Trace for n = 6, k = 2 ​

What the Code Won't Teach You ​

When to Reach for This ​

C++ STL Reference ​

Implementation (Contest-Ready) ​

Version 1: Minimal Contest Template ​

Version 2: Explained ​

Operations Reference ​

Problem Patterns & Tricks ​

Pattern 1: Partition into K Groups -- Squared Cost ​

Pattern 2: Partition into K Groups -- Count of Distinct Pairs ​

Pattern 3: Minimum Cost to Cut -- 1D Facility Location ​

Pattern 4: CSES-Style "Divide into K Segments" ​

Pattern 5: Building Bridges / Weighted Intervals ​

Pattern 6: Online Cost with Incremental Updates ​

Contest Cheat Sheet ​

Gotchas & Debugging ​

Two-Row Requirement ​

k-Range Bounds ​

k < jm Constraint ​

Cost Function Must Be Correct ​

Verifying QI ​

Overflow ​

Debugging Tips ​

Off-by-one in the opt-search bound ​

Mental Traps ​

Debug Drills ​

Self-Test ​

Practice Problems ​

Further Reading ​

DP -- Divide and Conquer Optimization

Contents

Intuition

Naive vs D&C: Scanning Pattern Comparison

Visual Walkthrough

The Invariant

General Recursion Tree of solve(l, r, opt_lo, opt_hi)

DP Table Filling: How D&C Narrows the Search

Worked Example: Cell-by-Cell Trace for n = 6, k = 2

What the Code Won't Teach You

When to Reach for This

C++ STL Reference

Implementation (Contest-Ready)

Version 1: Minimal Contest Template

Version 2: Explained

Operations Reference

Problem Patterns & Tricks

Pattern 1: Partition into K Groups -- Squared Cost

Pattern 2: Partition into K Groups -- Count of Distinct Pairs

Pattern 3: Minimum Cost to Cut -- 1D Facility Location

Pattern 4: CSES-Style "Divide into K Segments"

Pattern 5: Building Bridges / Weighted Intervals

Pattern 6: Online Cost with Incremental Updates

Contest Cheat Sheet

Gotchas & Debugging

Two-Row Requirement

k-Range Bounds

k < jm Constraint

Cost Function Must Be Correct

Verifying QI

Overflow

Debugging Tips

Off-by-one in the opt-search bound

Mental Traps

Debug Drills

Self-Test

Practice Problems

Further Reading