Skip to content

The Art of Choosing What to Remember

Quick Revisit

  • REACH FOR THIS WHEN: you know the problem is DP but can't figure out what the state should be
  • KEY DECISION: what is the minimum information from the past needed to make optimal future decisions?
  • QUICK RULE: if your state space is too large, compress it -- bitmask, last-k, rank, or modular residue
  • SEE ALSO: DP vs Greedy Guide, DS Augmentation

DP isn't about filling tables. It's about deciding what information from the past you need to make decisions about the future.

Prerequisites: DP Thinking Framework

Why State Design Is the Hardest Part

Writing the recurrence is mechanical. Implementing the transitions is straightforward. But choosing the state -- that's where problems are won or lost.

A DP state encodes exactly the minimum information needed to make all future decisions optimally, without knowing the specific history that led there. Too much information and your state space explodes. Too little and you can't distinguish situations that need different handling.

The mantra: A state captures enough past to determine the future, and nothing more.

Four Techniques for Finding the Right State

1. Profile DP -- When the Grid Is Too Wide

Situation. Tiling a grid, placing non-attacking pieces, or any problem where decisions in one column depend on the shape of the previous column.

The state. Instead of tracking every cell, track only the profile -- the pattern along the boundary between "processed" and "unprocessed."

Classic example: Domino tiling of a 3×n grid. The profile is which cells in the current column are already covered by a horizontal domino started in the previous column. For a grid with 3 rows, the profile is a 3-bit mask -- only 8 states. Process column by column: for each profile, enumerate which dominos you can place and what profile they leave for the next column.

Column i-1    Column i
  ██░         ░██       Profile = 100 (top cell pokes into column i)
  ░░░    ->    ░░░       Only 3 rows -> only 2³ = 8 possible profiles
  ░░░         ░░░

For an m×n grid, the profile has 2m states, so this works when m is small — commonly m10--20, depending on how many transitions per profile you need to enumerate. A bare 2mn pass is fine at m=20; if each profile spawns another exponential of subset transitions, you may be capped closer to m=12--15.

2. Bitmask DP -- When Tracking a Subset

Situation. You're assigning items to slots, visiting cities, or partitioning into groups, and you need to know which items have been used.

The state. A bitmask of n bits, where bit i = 1 means item i has been used/visited.

Classic example: Traveling Salesman (TSP). State: dp[mask][v] = minimum cost to visit exactly the cities in mask, ending at city v. Transition: for each unvisited city u, dp[mask | (1<<u)][u] = min(dp[mask][v] + cost[v][u]).

The key insight for reduction: if a problem says "choose a subset" or "assign n things (n20)," the bitmask is the state.

cpp
// TSP core: O(2^n * n^2)
for (int mask = 0; mask < (1 << n); mask++)
    for (int v = 0; v < n; v++) if (dp[mask][v] < INF)
        for (int u = 0; u < n; u++) if (!(mask & (1 << u)))
            dp[mask | (1 << u)][u] = min(dp[mask | (1 << u)][u],
                                          dp[mask][v] + cost[v][u]);

3. "Last Few Elements" -- When Only Recent History Matters

Situation. The constraint involves consecutive elements, runs, or patterns of bounded length.

The state. Track only the last k decisions (or some compressed summary of them), not the entire history.

Example: No three consecutive identical elements. State: dp[i][last_value][run_length] where run_length ∈ {1, 2}. You don't need to know what happened at position 0 when you're at position 1000 -- you only need the current run.

Example: Longest alternating subsequence. State: dp[i][last_direction] where direction ∈ {up, down}. The entire history compresses to a single bit.

4. Coordinate Compression in the State

Situation. The values in the problem are huge (109), but there are few distinct values (105).

The state. Replace raw values with their rank among all distinct values. A state over ranks has manageable size.

Example: Count rectangles with sum <= K in a grid. If the prefix sums range up to 1018, you can't index by value. But if you compress the O(n) distinct prefix sums to ranks 0n1, a Fenwick tree over ranks solves it.

Choosing Your State Design

TechniqueWhen to useState sizeTypical constraint
Profile DPGrid tiling, boundary-dependent decisionsO(2^m * n)m small, often 10-20 (transitions can tighten this)
Bitmask DPAssign/visit/partition n itemsO(2^n * n)n <= 20
Last-k elementsConstraints involve consecutive elements or bounded runsO(n * k * alphabet)k small (1-3)
Coordinate compressionHuge value range, few distinct valuesO(n) or O(n log n)values up to 10^9, distinct count <= 10^5

Decision cues:

  • "Assign n things to n slots" (n <= 20) --> bitmask DP
  • "Tiling an m x n grid" (m small, often <= 10-20) --> profile DP
  • "No three consecutive same elements" --> last-k with run length
  • "Values up to 10^9 but only 10^5 distinct" --> coordinate compression

Worked Example: State Discovery in Action

Problem. Given a string of digits, insert + signs between some digits to minimize the total sum. You must use exactly k plus signs.

Step 1: What are the decisions? Where to place each + sign.

Step 2: What changes after each decision? The position in the string and how many + signs remain.

Step 3: What's the minimal state? dp[i][j] = minimum sum achievable using the first i digits with exactly j plus signs. Transition: the last number starts at some position p, so dp[i][j] = min over p of (dp[p][j-1] + number(p+1..i)).

Step 4: Sanity check. State space: O(nk). Transition cost: O(n) per state. Total: O(n2k). For n,k1000, that's fast enough.

Step 5: Implementation watch. The transition relies on fast number(p+1..i) evaluation. Two practical issues: (a) the substring's numeric value can overflow 64-bit integers for long blocks, so use big integers, modular arithmetic, or work with sums of digits if the recurrence allows it; (b) if you naively recompute the value inside the inner loop you double the constant factor — precompute it incrementally or via prefix tables. State design forces these issues to surface; they don't go away by ignoring them.

Notice the process: we didn't start with the recurrence. We started by asking what we need to remember.

Common Mistakes in Choosing

Too much state. "Let me track the entire configuration." If your state is the full grid or the complete sequence, you've just rewritten brute force with memoization. Ask: which parts of the configuration actually matter for future decisions?

Forgetting a dimension. Your DP gives wrong answers because states that should be distinct are merged. Common culprit: forgetting to track parity, remaining budget, or which "mode" you're in (e.g., currently holding vs. not holding a stock).

Wrong direction. Sometimes dp[i] = "best answer for the first i items" doesn't work, but dp[i] = "best answer for items in" does (or vice versa). If transitions feel unnatural, try reversing.

Inventing unnecessary dimensions. Tracking information that doesn't affect future decisions bloats the state space. Ask: "if I remove this dimension, can two states that are currently distinct actually make different optimal choices?" If not, drop the dimension.

Not sanity-checking the state space size. Before coding, compute |states| x |transitions per state|. If the product exceeds ~10^8, the DP will TLE. Either compress the state or find a different approach.

Bad State vs Better State

State design is easiest to learn by contrast. A few examples where the obvious state is wrong, and a tightening fixes it:

ProblemBad stateWhy it failsBetter state
Buy/sell stock with cooldowndp[i] = best profit using first i daysLoses whether you currently hold a share or are in cooldowndp[i][hold] with hold ∈ {free, holding, cooldown}
LIS by ending valuedp[i] = LIS ending at index iO(n2) scan unavoidabledp[len] = smallest ending value of any LIS of length len, then binary search (O(nlogn))
No 3 consecutive identical charsdp[i][full history]State explodes in idp[i][last_char][run_length ∈ {1,2}]
Knapsack with item-count capdp[i][weight] onlyForgets items used; over-countsdp[i][weight][k] or 0/1 iteration over items
Path count on grid with k obstacles passabledp[i][j]Can't tell whether obstacle was already useddp[i][j][used]

The pattern: when two situations that should diverge in the future are merged in the present, the state is too coarse. When the state contains information no future transition reads, it is too fine.

The State Design Checklist

When you're stuck designing a DP state, walk through these questions:

  1. What decision am I making at each step?
  2. After making this decision, what information from the past do I need to evaluate future decisions?
  3. Can I compress that information? (Bitmask? Last-k? Rank? Modular residue?)
  4. Is the state space small enough? Compute |states|×|transitions per state| and check.
  5. If the state space is too large, what information can I drop or approximate?

Cross-References

Practice Problems

#ProblemSourceRatingState Design Insight
1Hamiltonian FlightsCSES1800Bitmask + endpoint
2Broken ProfileCF Blog2000Profile DP on grid boundary
3Buy and Sell Stock with CooldownLeetCode1600State =
4Digit SumAtCoder1700Position + remainder + tight flag
5Partition into SubsetsAtCoder2100Bitmask over subset of elements

Recap

  • A state captures enough past to determine the future, and nothing more.
  • Four compression techniques: profile (boundary shape), bitmask (subset), last-k (recent history), coordinate compression (rank among distinct values).
  • Discovery process: what decisions am I making? What past info do future decisions need? Can I compress it? Is the state space small enough?
  • Too much state = brute force with memoization. Too little state = wrong answers from merged distinct situations.
  • Always sanity-check: |states| x |transitions| must fit in the time limit before you start coding.

Built for the climb from Codeforces 1100 to 2100.