Appearance
The Art of Choosing What to Remember
Quick Revisit
- REACH FOR THIS WHEN: you know the problem is DP but can't figure out what the state should be
- KEY DECISION: what is the minimum information from the past needed to make optimal future decisions?
- QUICK RULE: if your state space is too large, compress it -- bitmask, last-k, rank, or modular residue
- SEE ALSO: DP vs Greedy Guide, DS Augmentation
DP isn't about filling tables. It's about deciding what information from the past you need to make decisions about the future.
Prerequisites: DP Thinking Framework
Why State Design Is the Hardest Part
Writing the recurrence is mechanical. Implementing the transitions is straightforward. But choosing the state -- that's where problems are won or lost.
A DP state encodes exactly the minimum information needed to make all future decisions optimally, without knowing the specific history that led there. Too much information and your state space explodes. Too little and you can't distinguish situations that need different handling.
The mantra: A state captures enough past to determine the future, and nothing more.
Four Techniques for Finding the Right State
1. Profile DP -- When the Grid Is Too Wide
Situation. Tiling a grid, placing non-attacking pieces, or any problem where decisions in one column depend on the shape of the previous column.
The state. Instead of tracking every cell, track only the profile -- the pattern along the boundary between "processed" and "unprocessed."
Classic example: Domino tiling of a
Column i-1 Column i
██░ ░██ Profile = 100 (top cell pokes into column i)
░░░ -> ░░░ Only 3 rows -> only 2³ = 8 possible profiles
░░░ ░░░For an
2. Bitmask DP -- When Tracking a Subset
Situation. You're assigning items to slots, visiting cities, or partitioning into groups, and you need to know which items have been used.
The state. A bitmask of
Classic example: Traveling Salesman (TSP). State: dp[mask][v] = minimum cost to visit exactly the cities in mask, ending at city dp[mask | (1<<u)][u] = min(dp[mask][v] + cost[v][u]).
The key insight for reduction: if a problem says "choose a subset" or "assign
cpp
// TSP core: O(2^n * n^2)
for (int mask = 0; mask < (1 << n); mask++)
for (int v = 0; v < n; v++) if (dp[mask][v] < INF)
for (int u = 0; u < n; u++) if (!(mask & (1 << u)))
dp[mask | (1 << u)][u] = min(dp[mask | (1 << u)][u],
dp[mask][v] + cost[v][u]);3. "Last Few Elements" -- When Only Recent History Matters
Situation. The constraint involves consecutive elements, runs, or patterns of bounded length.
The state. Track only the last
Example: No three consecutive identical elements. State: dp[i][last_value][run_length] where run_length ∈ {1, 2}. You don't need to know what happened at position 0 when you're at position 1000 -- you only need the current run.
Example: Longest alternating subsequence. State: dp[i][last_direction] where direction ∈ {up, down}. The entire history compresses to a single bit.
4. Coordinate Compression in the State
Situation. The values in the problem are huge (
The state. Replace raw values with their rank among all distinct values. A state over ranks has manageable size.
Example: Count rectangles with sum <= K in a grid. If the prefix sums range up to
Choosing Your State Design
| Technique | When to use | State size | Typical constraint |
|---|---|---|---|
| Profile DP | Grid tiling, boundary-dependent decisions | O(2^m * n) | m small, often 10-20 (transitions can tighten this) |
| Bitmask DP | Assign/visit/partition n items | O(2^n * n) | n <= 20 |
| Last-k elements | Constraints involve consecutive elements or bounded runs | O(n * k * alphabet) | k small (1-3) |
| Coordinate compression | Huge value range, few distinct values | O(n) or O(n log n) | values up to 10^9, distinct count <= 10^5 |
Decision cues:
- "Assign n things to n slots" (n <= 20) --> bitmask DP
- "Tiling an m x n grid" (m small, often <= 10-20) --> profile DP
- "No three consecutive same elements" --> last-k with run length
- "Values up to 10^9 but only 10^5 distinct" --> coordinate compression
Worked Example: State Discovery in Action
Problem. Given a string of digits, insert + signs between some digits to minimize the total sum. You must use exactly
Step 1: What are the decisions? Where to place each + sign.
Step 2: What changes after each decision? The position in the string and how many + signs remain.
Step 3: What's the minimal state? dp[i][j] = minimum sum achievable using the first dp[i][j] = min over p of (dp[p][j-1] + number(p+1..i)).
Step 4: Sanity check. State space:
Step 5: Implementation watch. The transition relies on fast number(p+1..i) evaluation. Two practical issues: (a) the substring's numeric value can overflow 64-bit integers for long blocks, so use big integers, modular arithmetic, or work with sums of digits if the recurrence allows it; (b) if you naively recompute the value inside the inner loop you double the constant factor — precompute it incrementally or via prefix tables. State design forces these issues to surface; they don't go away by ignoring them.
Notice the process: we didn't start with the recurrence. We started by asking what we need to remember.
Common Mistakes in Choosing
Too much state. "Let me track the entire configuration." If your state is the full grid or the complete sequence, you've just rewritten brute force with memoization. Ask: which parts of the configuration actually matter for future decisions?
Forgetting a dimension. Your DP gives wrong answers because states that should be distinct are merged. Common culprit: forgetting to track parity, remaining budget, or which "mode" you're in (e.g., currently holding vs. not holding a stock).
Wrong direction. Sometimes dp[i] = "best answer for the first dp[i] = "best answer for items
Inventing unnecessary dimensions. Tracking information that doesn't affect future decisions bloats the state space. Ask: "if I remove this dimension, can two states that are currently distinct actually make different optimal choices?" If not, drop the dimension.
Not sanity-checking the state space size. Before coding, compute |states| x |transitions per state|. If the product exceeds ~10^8, the DP will TLE. Either compress the state or find a different approach.
Bad State vs Better State
State design is easiest to learn by contrast. A few examples where the obvious state is wrong, and a tightening fixes it:
| Problem | Bad state | Why it fails | Better state |
|---|---|---|---|
| Buy/sell stock with cooldown | dp[i] = best profit using first i days | Loses whether you currently hold a share or are in cooldown | dp[i][hold] with hold ∈ {free, holding, cooldown} |
| LIS by ending value | dp[i] = LIS ending at index i | dp[len] = smallest ending value of any LIS of length len, then binary search ( | |
| No 3 consecutive identical chars | dp[i][full history] | State explodes in i | dp[i][last_char][run_length ∈ {1,2}] |
| Knapsack with item-count cap | dp[i][weight] only | Forgets items used; over-counts | dp[i][weight][k] or 0/1 iteration over items |
| Path count on grid with k obstacles passable | dp[i][j] | Can't tell whether obstacle was already used | dp[i][j][used] |
The pattern: when two situations that should diverge in the future are merged in the present, the state is too coarse. When the state contains information no future transition reads, it is too fine.
The State Design Checklist
When you're stuck designing a DP state, walk through these questions:
- What decision am I making at each step?
- After making this decision, what information from the past do I need to evaluate future decisions?
- Can I compress that information? (Bitmask? Last-k? Rank? Modular residue?)
- Is the state space small enough? Compute
and check. - If the state space is too large, what information can I drop or approximate?
Cross-References
- DP Thinking Framework -- the prerequisite for state design
- Bitmask DP -- detailed bitmask DP techniques
- DP vs Greedy Guide -- when to use DP at all
- Problem Transformations -- coordinate compression and other reductions
- Reduction Patterns -- the full "if X, think Y" catalog
Practice Problems
| # | Problem | Source | Rating | State Design Insight |
|---|---|---|---|---|
| 1 | Hamiltonian Flights | CSES | 1800 | Bitmask + endpoint |
| 2 | Broken Profile | CF Blog | 2000 | Profile DP on grid boundary |
| 3 | Buy and Sell Stock with Cooldown | LeetCode | 1600 | State = |
| 4 | Digit Sum | AtCoder | 1700 | Position + remainder + tight flag |
| 5 | Partition into Subsets | AtCoder | 2100 | Bitmask over subset of elements |
Recap
- A state captures enough past to determine the future, and nothing more.
- Four compression techniques: profile (boundary shape), bitmask (subset), last-k (recent history), coordinate compression (rank among distinct values).
- Discovery process: what decisions am I making? What past info do future decisions need? Can I compress it? Is the state space small enough?
- Too much state = brute force with memoization. Too little state = wrong answers from merged distinct situations.
- Always sanity-check: |states| x |transitions| must fit in the time limit before you start coding.