Appearance
Knuth's Optimization
Quick Revisit
- USE WHEN: Interval DP where cost
satisfies the quadrangle inequality (e.g., optimal BST, stone merging) - INVARIANT:
— split-point search range is bounded by neighbors - TIME: O(n²) | SPACE: O(n²)
- CLASSIC BUG: Wrong fill order — must iterate by increasing interval length so that
and are already computed - PRACTICE: 04-ladder-dp
AL-41 | Reduces
| Field | Value |
|---|---|
| Difficulty | [Advanced] |
| CF Rating | 1900-2100 |
| Prerequisites | Interval DP |
Contents
- Intuition -- Why Does Interval DP Feel Slow?
- Where the Speedup Comes From
- The Quadrangle Inequality
- Using Monotonicity to Speed Up the DP
- Worked Example: 6 Piles Step by Step
- Visual Intuition
- C++ STL Reference
- Implementation (Contest-Ready)
- Operations Reference
- Problem Patterns & Tricks
- Contest Cheat Sheet
- Gotchas & Debugging
- Self-Test
- Practice Problems
- Further Reading
Intuition -- Why Does Interval DP Feel Slow?
Consider the stone-merging problem. You have
Suppose
Pile: 1 2 3 4 5 6
Weight: 4 7 3 5 2 8This is a textbook interval DP problem. Define
Base case:
Here
This recurrence is
Where the Speedup Comes From
Let
dp table (upper triangle):
j=1 j=2 j=3 j=4 j=5 j=6
i=1 [ 0 11 14 19 21 29 ]
i=2 [ - 0 10 15 17 32 ]
i=3 [ - - 0 8 10 25 ]
i=4 [ - - - 0 7 22 ]
i=5 [ - - - - 0 10 ]
i=6 [ - - - - - 0 ]
opt table (which k gave the minimum):
j=1 j=2 j=3 j=4 j=5 j=6
i=1 [ 1 1 2 2 2 3 ]
i=2 [ - 2 2 3 3 3 ]
i=3 [ - - 3 3 3 4 ]
i=4 [ - - - 4 4 5 ]
i=5 [ - - - - 5 5 ]
i=6 [ - - - - - 6 ]Look at any row of the
This is monotonicity of optimal split points. The best place to split
The Quadrangle Inequality
This monotonicity is not magic -- it comes from a property of the cost function. A function
Intuitively, this says "crossing is cheap": the cost of two overlapping intervals is no more than the cost of one big interval plus one small interval. You can think of it as a kind of subadditivity for how the cost grows as intervals widen.
For the stone-merging problem,
Since all weights are non-negative, this holds with equality. (The sum function is actually concave in the interval endpoints, which is even stronger.)
Knuth's theorem (1971): If
Using Monotonicity to Speed Up the DP
Instead of scanning all
The total work becomes:
For a fixed length
This is a telescoping sum! The
Worked Example: 6 Piles Step by Step
Let us trace the computation for our 6-pile example with weights
Length L=1 (base cases):
dp[1][1]=0, dp[2][2]=0, ..., dp[6][6]=0
opt[1][1]=1, opt[2][2]=2, ..., opt[6][6]=6
Length L=2:
dp[1][2]: k in [opt[1][1]..opt[2][2]] = [1..2], but k < j=2 so k=1
k=1: dp[1][1]+dp[2][2]+sum(1,2) = 0+0+11 = 11
opt[1][2] = 1
dp[2][3]: k in [opt[2][2]..opt[3][3]] = [2..3], k < 3 so k=2
k=2: 0+0+10 = 10
opt[2][3] = 2
dp[3][4]: k=3: 0+0+8 = 8 opt[3][4] = 3
dp[4][5]: k=4: 0+0+7 = 7 opt[4][5] = 4
dp[5][6]: k=5: 0+0+10 = 10 opt[5][6] = 5
Length L=3:
dp[1][3]: k in [opt[1][2]..opt[2][3]] = [1..2]
k=1: dp[1][1]+dp[2][3]+sum(1,3) = 0+10+14 = 24
k=2: dp[1][2]+dp[3][3]+sum(1,3) = 11+0+14 = 25
Best: 24 at k=1 ... wait, but our table says dp[1][3]=14?Let me recompute more carefully. The stone-merge cost for merging piles
Piles (0-indexed): a[0..5] = [4, 7, 3, 5, 2, 8]
prefix[0..6] = [0, 4, 11, 14, 19, 21, 29]
cost(i, j) = prefix[j+1] - prefix[i]
Length L=1 (base):
dp[i][i] = 0 for all i, opt[i][i] = i
Length L=2 (j = i+1):
dp[0][1]: k=0: 0+0+11 = 11 opt[0][1]=0
dp[1][2]: k=1: 0+0+10 = 10 opt[1][2]=1
dp[2][3]: k=2: 0+0+8 = 8 opt[2][3]=2
dp[3][4]: k=3: 0+0+7 = 7 opt[3][4]=3
dp[4][5]: k=4: 0+0+10 = 10 opt[4][5]=4
Length L=3 (j = i+2):
dp[0][2]: k in [opt[0][1]..opt[1][2]] = [0..1]
k=0: dp[0][0]+dp[1][2]+cost(0,2) = 0+10+14 = 24
k=1: dp[0][1]+dp[2][2]+cost(0,2) = 11+0+14 = 25
min=24, opt[0][2]=0
dp[1][3]: k in [opt[1][2]..opt[2][3]] = [1..2]
k=1: 0+8+14 = 22
k=2: 10+0+14 = 24
min=22, opt[1][3]=1
dp[2][4]: k in [opt[2][3]..opt[3][4]] = [2..3]
k=2: 0+7+10 = 17
k=3: 8+0+10 = 18
min=17, opt[2][4]=2
dp[3][5]: k in [opt[3][4]..opt[4][5]] = [3..4]
k=3: 0+10+15 = 25
k=4: 7+0+15 = 22
min=22, opt[3][5]=4
Length L=4 (j = i+3):
dp[0][3]: k in [opt[0][2]..opt[1][3]] = [0..1]
k=0: 0+22+19 = 41
k=1: 11+8+19 = 38
min=38, opt[0][3]=1
dp[1][4]: k in [opt[1][3]..opt[2][4]] = [1..2]
k=1: 0+17+17 = 34
k=2: 10+7+17 = 34
min=34, opt[1][4]=1
dp[2][5]: k in [opt[2][4]..opt[3][5]] = [2..4]
k=2: 0+22+18 = 40
k=3: 8+10+18 = 36
k=4: 17+0+18 = 35
min=35, opt[2][5]=4
Length L=5 (j = i+4):
dp[0][4]: k in [opt[0][3]..opt[1][4]] = [1..1]
k=1: 11+17+21 = 49
min=49, opt[0][4]=1
dp[1][5]: k in [opt[1][4]..opt[2][5]] = [1..4]
k=1: 0+35+25 = 60
k=2: 10+22+25 = 57
k=3: 22+10+25 = 57
k=4: 34+0+25 = 59
min=57, opt[1][5]=2
Length L=6 (j = i+5):
dp[0][5]: k in [opt[0][4]..opt[1][5]] = [1..2]
k=1: 11+35+29 = 75
k=2: 24+22+29 = 75
min=75, opt[0][5]=1The minimum total cost to merge all 6 piles is
Visual Intuition
Here is the completed
opt table (0-indexed):
j=0 j=1 j=2 j=3 j=4 j=5
i=0 [ 0 0 0 1 1 1 ]
i=1 [ . 1 1 1 1 2 ]
i=2 [ . . 2 2 2 4 ]
i=3 [ . . . 3 3 4 ]
i=4 [ . . . . 4 4 ]
i=5 [ . . . . . 5 ]
(. = not applicable, below the diagonal)The monotonicity property constrains each entry from below (left neighbor) and above (lower neighbor):
For dp[i][j], we need opt[i][j-1] and opt[i+1][j]:
opt[i][j-1] -----> opt[i][j] <----- opt[i+1][j]
(left) <= (here) <= (below)
Search only k in [opt[i][j-1] .. opt[i+1][j]]Now look at how this makes the total work telescope. For length
i=0: search [opt[0][2]..opt[1][3]] = [0..1] --> 2 values of k
i=1: search [opt[1][3]..opt[2][4]] = [1..2] --> 2 values of k
i=2: search [opt[2][4]..opt[3][5]] = [2..4] --> 3 values of k
--------
7 total
Without Knuth: 3 intervals * 4 splits each = 12 total
With Knuth: 7 total (telescoping saves ~40%)For larger
Length L, intervals (0, L), (1, L+1), ..., (n-L-1, n-1):
|<-- opt[0][L-1] .... opt[1][L] -->|
|<-- opt[1][L] .... opt[2][L+1] -->|
|<-- opt[2][L+1] .... opt[3][L+2] -->|
The ranges overlap only at boundaries --> total = O(n) per length
n lengths --> O(n^2) totalDiagram: How opt bounds narrow the search for opt[i][j]
opt table (0-indexed, filled by diagonals of increasing length):
j=0 j=1 j=2 j=3 j=4 j=5
i=0 [ 0 0 0 1 1 ? ]
i=1 [ . 1 1 1 1 ? ]
i=2 [ . . 2 2 2 ? ]
i=3 [ . . . 3 3 ? ]
i=4 [ . . . . 4 4 ]
i=5 [ . . . . . 5 ]
The "?" entries on diagonal L=5 are not yet computed.
To fill opt[0][5]:
opt[0][4] = 1 ─── lower bound ───┐
▼
opt[0][5] = ?
▲
opt[1][5] = 2 ─── upper bound ───┘
⇒ Search k ∈ {1, 2} only, instead of k ∈ {0, 1, 2, 3, 4}
To fill opt[1][5]: opt[1][4] = 1, opt[2][5] = ?
To fill opt[2][5]: opt[2][4] = 2, opt[3][5] = ?
Fill order for diagonal L=5: first (2,5), then (1,5), then (0,5)?
No -- all entries on the SAME diagonal L=5 depend on diagonal L=4
(already done) and on each other only through neighboring entries
on the same diagonal, which are all filled left-to-right (increasing i).
This works because opt[i+1][j] is filled before opt[i][j] within the
same diagonal when we iterate i from high to low... but actually we
iterate i from low to high, so we use opt[i+1][j] from the PREVIOUS
diagonal (length L-1). Both bounds come from length L-1. OKDiagram: Naive O(n³) vs Knuth O(n²) -- scanning ranges compared
NAIVE O(n³): for each cell (i,j), scan ALL k from i to j-1
Diagonal L=5 (n=6):
dp[0][5]: scan k ∈ {0,1,2,3,4} ███████████████ 5 values
dp[1][5]: scan k ∈ {1,2,3,4} ████████████ 4 values... (but only 1 interval here)
──────────────────────────────────────────────
Total: every cell scans ~O(n) values
Sum across all diagonals: O(n) cells x O(n) scan = O(n²) per diagonal
O(n) diagonals -> O(n³)
KNUTH O(n²): for each cell, scan k in [opt[i][j-1], opt[i+1][j]]
Diagonal L=4 (n=6):
dp[0][4]: k ∈ [opt[0][3], opt[1][4]] = [1, 1] █ 1 value
dp[1][5]: k ∈ [opt[1][4], opt[2][5]] = [1, 4] ████ 4 values
─────────
5 total
Ranges CHAIN: upper bound of one ~= lower bound of next
┌────────────────────────────────────────────────┐
│ Naive: ██████████████████████████████████ │ <- full scan
│ ████████████████████████████ │ per cell
│ ██████████████████████ │
│ ████████████████ │
│ │
│ Knuth: ████ │ <- narrow,
│ ███ │ chained
│ ██ │ ranges
│ █ │
└────────────────────────────────────────────────┘
Chained ranges telescope: total per diagonal = O(n)
O(n) diagonals -> O(n²)What the Code Won't Teach You
The quadrangle inequality, intuitively: "mixing is no worse than nesting."
The formal QI condition
Why the optimization works: telescoping search ranges.
The key insight is that
Amortized cost for one diagonal (fixed length L, n=8):
Intervals of length L=4:
i=0: search k in [opt[0][3], opt[1][4]] width = opt[1][4] - opt[0][3] + 1
i=1: search k in [opt[1][4], opt[2][5]] width = opt[2][5] - opt[1][4] + 1
i=2: search k in [opt[2][5], opt[3][6]] width = opt[3][6] - opt[2][5] + 1
i=3: search k in [opt[3][6], opt[4][7]] width = opt[4][7] - opt[3][6] + 1
Total = (opt[4][7] - opt[0][3]) + n_intervals <= n + n = O(n)
\___________ ________/
V
telescoping sum!
The "chaining" visually on the k-axis:
i=0: [===]
i=1: [======]
i=2: [===]
i=3: [===]
──────────────────────── k-axis
0 1 2 3 4 5 6
Each range starts roughly where the previous one ended.
Total coverage = O(n).
O(n) per diagonal x O(n) diagonals = O(n²) total.The amortized argument in detail. For a fixed diagonal (all intervals of length
Relationship to the SMAWK algorithm. SMAWK solves the "row minima of a totally monotone matrix" problem in
C++ STL Reference
Knuth's optimization does not require exotic data structures. The main tools:
| Function / Class | Header | What it does | Time |
|---|---|---|---|
vector<vector<int>> dp(n, vector<int>(n)) | <vector> | 2D DP table | |
vector<vector<int>> opt(n, vector<int>(n)) | <vector> | Optimal split table | |
partial_sum(a.begin(), a.end(), p.begin()+1) | <numeric> | Prefix sums | |
int dp[N][N] | -- | Stack-allocated 2D array (contest style) |
For contest use, fixed-size arrays dp[N][N] and opt[N][N] are simpler and faster than nested vectors. Just make sure
Implementation (Contest-Ready)
Problem: Given
Minimal Template
cpp
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 5005;
long long dp[MAXN][MAXN];
int opt[MAXN][MAXN];
long long pre[MAXN];
int a[MAXN];
int main() {
int n;
scanf("%d", &n);
for (int i = 0; i < n; i++) scanf("%d", &a[i]);
pre[0] = 0;
for (int i = 0; i < n; i++) pre[i + 1] = pre[i] + a[i];
auto cost = [&](int i, int j) -> long long {
return pre[j + 1] - pre[i];
};
for (int i = 0; i < n; i++) {
dp[i][i] = 0;
opt[i][i] = i;
}
for (int len = 2; len <= n; len++) {
for (int i = 0; i + len - 1 < n; i++) {
int j = i + len - 1;
dp[i][j] = LLONG_MAX;
int lo = opt[i][j - 1];
int hi = (j + 1 < n) ? opt[i + 1][j] : j - 1;
hi = min(hi, j - 1);
for (int k = lo; k <= hi; k++) {
long long val = dp[i][k] + dp[k + 1][j] + cost(i, j);
if (val < dp[i][j]) {
dp[i][j] = val;
opt[i][j] = k;
}
}
}
}
printf("%lld\n", dp[0][n - 1]);
return 0;
}Explained Version
cpp
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 5005;
// dp[i][j] = minimum cost to merge piles i..j into one pile
long long dp[MAXN][MAXN];
// opt[i][j] = the split point k that achieved dp[i][j]
int opt[MAXN][MAXN];
// prefix sums for O(1) range-sum queries
long long pre[MAXN];
int a[MAXN];
int main() {
int n;
scanf("%d", &n);
for (int i = 0; i < n; i++) scanf("%d", &a[i]);
// Build prefix sums: pre[i] = a[0] + a[1] + ... + a[i-1]
pre[0] = 0;
for (int i = 0; i < n; i++) pre[i + 1] = pre[i] + a[i];
// cost(i, j) = total weight of piles i through j
auto cost = [&](int i, int j) -> long long {
return pre[j + 1] - pre[i];
};
// Base case: merging a single pile costs nothing.
// The "optimal split" for a single pile is itself.
for (int i = 0; i < n; i++) {
dp[i][i] = 0;
opt[i][i] = i;
}
// Fill by increasing interval length.
// This ensures opt[i][j-1] and opt[i+1][j] are available when we need them.
for (int len = 2; len <= n; len++) {
for (int i = 0; i + len - 1 < n; i++) {
int j = i + len - 1;
dp[i][j] = LLONG_MAX;
// Knuth's bounds: only search k in [opt[i][j-1], opt[i+1][j]]
int lo = opt[i][j - 1];
// For the last row (i+1 > j is impossible here, but j+1 might be out of range)
int hi = (j + 1 < n) ? opt[i + 1][j] : j - 1;
hi = min(hi, j - 1); // k must be < j
for (int k = lo; k <= hi; k++) {
long long val = dp[i][k] + dp[k + 1][j] + cost(i, j);
if (val < dp[i][j]) {
dp[i][j] = val;
opt[i][j] = k;
}
}
}
}
printf("%lld\n", dp[0][n - 1]);
return 0;
}Operations Reference
| Variant | Time | Space |
|---|---|---|
| Standard interval DP | ||
| With Knuth's optimization |
Both variants use
Problem Patterns & Tricks
Pattern 1: Merging Consecutive Piles
The classic application. You have
cpp
auto cost = [&](int i, int j) -> long long {
return pre[j + 1] - pre[i];
};Problems:
Pattern 2: Optimal Binary Search Tree
Given keys
The sum-of-frequencies cost satisfies QI. This is the original problem Knuth solved in 1971.
Problems:
Pattern 3: Breaking a Segment Optimally
You have a stick of length
cpp
// marks[] are sorted cut positions, with marks[0]=0, marks[m+1]=L
// dp[i][j] = min cost to make all cuts between marks[i] and marks[j]
auto cost = [&](int i, int j) -> long long {
return marks[j] - marks[i];
};Problems:
Pattern 4: File Merge / Tape Merge
Merge
Problems:
Pattern 5: Detecting When Knuth's Applies
Not every interval DP benefits from Knuth's optimization. Here is a checklist:
- Is it an interval DP? Recurrence has the form
. - Does the cost depend only on the endpoints?
should not depend on . - Does
satisfy the quadrangle inequality? Check . - Is
monotone on intervals? for .
Common cost functions that satisfy QI:
- Prefix sums:
OK - Weighted prefix sums:
for non-negative weights OK - Squared range:
OK - Max/min of range: depends on specifics -- verify carefully
Cost functions that do not satisfy QI:
- Arbitrary per-split costs that depend on
- Cost functions involving max/min in non-monotone ways
Pattern 6: 1D DP with Knuth-like Structure
Sometimes the recurrence is not a 2D interval DP but a 1D DP of the form:
If
Contest Cheat Sheet
+--------------------------------------------------------------+
| KNUTH'S OPTIMIZATION -- QUICK REFERENCE |
+--------------------------------------------------------------+
| WHEN: |
| - Interval DP: dp[i][j] = min_k{dp[i][k]+dp[k+1][j]+C} |
| - C(i,j) depends only on i,j (not on k) |
| - C satisfies quadrangle inequality |
| - C is monotone: C(b,c) <= C(a,d) for a<=b<=c<=d |
| |
| CORE IDEA: |
| opt[i][j-1] <= opt[i][j] <= opt[i+1][j] |
| Search k only in [opt[i][j-1] .. opt[i+1][j]] |
| |
| TEMPLATE: |
| for (len = 2..n) |
| for (i = 0; i+len-1 < n; i++) |
| j = i+len-1 |
| lo = opt[i][j-1], hi = opt[i+1][j] |
| for (k = lo..min(hi, j-1)) |
| update dp[i][j] and opt[i][j] |
| |
| COMPLEXITY: O(n^2) time, O(n^2) space |
| INIT: dp[i][i]=0, opt[i][i]=i |
+--------------------------------------------------------------+Gotchas & Debugging
Wrong filling order
The DP must be filled by increasing length for (len = 2; len <= n; len++).
Off-by-one in the opt bounds
The upper bound
cpp
int hi = (i + 1 <= j && j < n) ? opt[i + 1][j] : j - 1;
hi = min(hi, j - 1); // k must be strictly less than jForgetting to initialize opt for base cases
Every
Dependency ordering of opt
When computing
: this is at length , already computed. OK : this is also at length (interval has length ), already computed. OK
So filling by increasing length guarantees all dependencies are met.
Integer overflow
For long long for
0-indexed vs 1-indexed confusion
Pick one and stick with it. The template above is 0-indexed (
Forgetting to initialize opt for length-2 intervals
The first time I shipped Knuth's on SPOJ BRKSTRNG, my dp[i][i] = 0 and opt[i][i] = i for the base case, but intervals of length 2 were reading opt[i][j-1] = opt[i][i] just fine -- so I assumed the DP initialisation was complete. The actual bug: my first non-trivial length was opt[i+1][j] = opt[i+1][i+1] must also be i+1, not the 0 that sat in the statically-allocated array. Once a single length-2 upper bound was 0, its k-loop read dp[i][0] (garbage or stale) and the wrong answer cascaded up the diagonals. Always initialize opt[i][i] = i for all
Mental Traps
"Knuth's optimization applies to any interval DP."
No. It requires two specific conditions: (1) the cost function
"I need to prove QI formally every time."
In a contest, you don't have time for a formal proof. Instead, verify computationally on small inputs: run the
cpp
// After computing opt[i][j] with brute-force O(n^3):
for (int len = 2; len <= n; len++)
for (int i = 0; i + len - 1 < n; i++) {
int j = i + len - 1;
assert(opt[i][j - 1] <= opt[i][j]); // row monotonicity
if (i + 1 <= j)
assert(opt[i][j] <= opt[i + 1][j]); // column monotonicity
}If the assertion holds for random small inputs (
"Knuth's optimization and D&C optimization are interchangeable."
They are not. They apply to fundamentally different DP shapes:
Knuth's optimization: D&C optimization:
───────────────────── ─────────────────
Interval DP Layered/partitioning DP
dp[i][j] = min over k of dp[layer][j] = min over i of
dp[i][k] + dp[k+1][j] + C(i,j) dp[layer-1][i] + C(i,j)
2D table indexed by (i, j) 2D table indexed by (layer, j)
opt[i][j] bounded by neighbors opt[layer][j] monotone in j
O(n³) -> O(n²) O(kn²) -> O(kn log n)
Fill order: diagonals Fill order: layer by layer,
(increasing interval length) D&C on positions within each layerThe monotonicity structure is different: Knuth uses a 2D sandwich bound (left and below neighbors), while D&C uses a 1D monotonicity (opt non-decreasing within each layer).
Knuth: opt[i][j] is sandwiched in 2D
opt[i][j-1] <= opt[i][j] <= opt[i+1][j]
↑ ↑ ↑
(left) (here) (below)
j-1 j j
i [ <= ] -> [ ? ] i [ ? ]
i+1[ <= ]
↑
(column bound)
D&C: opt[layer][j] is monotone in 1D
opt[L][0] <= opt[L][1] <= opt[L][2] <= ... <= opt[L][n-1]
monotone within each layerDebug Drills
Drill 1: Wrong iteration order
What's wrong?
cpp
// Knuth-optimized interval DP
for (int i = 0; i < n; i++) {
for (int j = i + 2; j < n; j++) { // BUG: iterating i then j, not by length!
for (int k = opt[i][j-1]; k <= opt[i+1][j]; k++) {
long long val = dp[i][k] + dp[k+1][j] + cost(i, j);
if (val < dp[i][j]) {
dp[i][j] = val;
opt[i][j] = k;
}
}
}
}This loop iterates by i first, then j. But opt[i+1][j] requires the interval [i+1, j] already be solved -- which it won't be until i+1 is processed. Fix: Restructure by increasing length: for (int len = 2; ...; len++) for (int i = 0; i + len - 1 < n; i++) { int j = i + len - 1; ... }. Then both opt[i][j-1] (shorter interval) and opt[i+1][j] (same length, later start) are available.
Drill 2: Uninitialized opt base case
What's wrong?
cpp
int dp[505][505], opt[505][505];
memset(dp, 0x3f, sizeof(dp));
// BUG: opt is not initialized!
for (int i = 0; i < n; i++) dp[i][i] = 0;
// forgot: opt[i][i] = i;
for (int len = 2; len <= n; len++) {
for (int i = 0; i + len - 1 < n; i++) {
int j = i + len - 1;
for (int k = opt[i][j-1]; k <= opt[i+1][j]; k++) { ... }
}
}opt[i][i] is never set to i. When the len = 2 loop reads opt[i][j-1] = opt[i][i], it gets 0 (or garbage), making k start from a wrong position. Fix: Add for (int i = 0; i < n; i++) opt[i][i] = i; right after dp[i][i] = 0;.
Drill 3: Cost function doesn't satisfy QI (silent wrong answer)
What's wrong?
cpp
// cost(i, j) = max(a[i..j]) - min(a[i..j]) (range of values)
int cost(int i, int j) {
return range_max(i, j) - range_min(i, j);
}
// ... Knuth-optimized DP using this cost ...
// Compiles and runs, but gives WRONG answers.The range cost max - min does not satisfy the quadrangle inequality in general. Applying Knuth's optimization to a non-QI cost gives wrong answers silently -- the code runs without errors. Fix: Either verify QI holds (brute-force check opt[i][j-1] <= opt[i][j] <= opt[i+1][j] across random small inputs), or fall back to the standard
Self-Test
Before attempting practice problems, verify you can do the following without looking at notes:
- [ ] State the quadrangle inequality condition for a cost function
: for all , - [ ] Explain why
reduces the total work from to (telescoping argument across each diagonal) - [ ] Verify QI on a small example (
) by computing all values with brute force and checking monotonicity - [ ] Implement Knuth-optimized interval DP from scratch, including the
table, correct bounds, and tie-breaking with strict < - [ ] Distinguish Knuth's optimization from D&C optimization: Knuth applies to interval DP
(range-based), D&C applies to layered DP (partition-based) - [ ] State the fill order requirement: diagonals of increasing length
, so that and are available when computing
Practice Problems
| CF Rating | What you should be comfortable with |
|---|---|
| 1200 | Write the O(n³) interval DP for "minimum cost to merge segments"; understand what a split point is and how to iterate over all of them |
| 1500 | Know the quadrangle inequality condition; understand why it implies monotonicity of opt[i][j]; verify QI on small examples |
| 1800 | Implement Knuth's optimization: fill opt alongside dp, use opt[i][j-1] and opt[i+1][j] as bounds for the k-loop; solve SPOJ BRKSTRNG and UVa 10304 |
| 2100 | Apply Knuth's to non-obvious cost functions; combine with other techniques (e.g., D&C for the layered version); prove QI for custom costs |
| # | Problem | Source | Difficulty | Key Idea |
|---|---|---|---|---|
| 1 | BRKSTRNG - Breaking String | SPOJ | 1800 | Classic breaking-segment problem, direct Knuth's |
| 2 | Optimal BST | UVa 10304 | 1800 | Knuth's original problem -- frequency-weighted BST |
| 3 | Cutting | CSES | 1900 | Breaking a stick at given positions |
| 4 | Zuma | CF 41D | 1900 | Interval DP variant -- check if QI holds |
| 5 | Partition Game | CF 1527E | 2100 | 1D DP with QI cost -- related optimization |
| 6 | An old Stone Game | POJ 1738 | 2000 | Stone merging with large |
| 7 | Guardians of the Lunatics | HackerRank | 2100 | Knuth-style optimization on partition DP |
| 8 | Gary's Herd | CF Gym | 2200 | Interval merge with non-trivial cost |
Further Reading
- cp-algorithms: Knuth's Optimization -- clean explanation with proof sketch
- Codeforces Blog: Knuth's Optimization -- community discussion with problem links
- Jeffrey Shallit's Notes on Knuth's Result -- academic perspective
- D. E. Knuth, Optimum Binary Search Trees, Acta Informatica 1 (1971), pp. 14-25 -- the original paper
- Yao's Speedup Lemma (1980) -- generalization of Knuth's monotonicity result to broader class of recurrences
- Interval DP -- prerequisite: standard interval DP without optimization