Appearance
Centroid Decomposition
Quick Revisit
- USE WHEN: Counting/querying over all paths in a tree (paths of given length, paths with given property)
- INVARIANT: Centroid's removal leaves all subtrees of size ≤ n/2; centroid tree has depth O(log n)
- TIME: O(n log n) total across all levels | SPACE: O(n log n)
- CLASSIC BUG: Not recomputing subtree sizes after removing the centroid — stale sizes cause wrong centroid selection and O(n) depth instead of O(log n)
- PRACTICE: 05-ladder-graphs
Recursively find the centroid of a tree (the node whose removal leaves no subtree larger than
Difficulty: [Advanced]ID: AL-22 Prerequisites: DFS & Tree DFS, Euler Tour & LCACF Rating: 1900-2200+
Contest Frequency: * Specialized -- appears in tree path query problems
Intuition
You have a tree with
1
/ \
2 3
/ \ \
4 5 6
|
7Task: Count the number of pairs of nodes
The brute-force approach: root the tree, run a DFS/BFS from every node, count pairs at distance
For each u in {1..N}:
BFS from u, count nodes v > u with dist(u,v) <= K
Pairs found (K=3):
(1,2)=1 (1,3)=1 (1,4)=2 (1,5)=2 (1,6)=2 (1,7)=3
(2,3)=2 (2,4)=1 (2,5)=1 (2,6)=3 (2,7)=2
(3,4)=3 (3,5)=3 (3,6)=1
(4,5)=2 (4,7)=1
(5,7)=2
Total pairs with dist <= 3: 17With
We need a divide-and-conquer on the tree that, like merge sort on arrays, halves the problem at each level and processes each node only
Find the centroid (the node whose removal splits the tree into subtrees each of size
), solve for all paths that pass through it, then recurse on each subtree -- this gives because each node appears in levels of the decomposition.
Why the centroid? Any node can split the tree into components, but the centroid guarantees every component has size
The key structural observation needs to be stated carefully. It is not true that every path "passes through some centroid at every level"; that would be vacuous. The right statement is:
For each path
- in the original tree, there is exactly one centroid in the centroid decomposition tree (CDT) such that is the lowest common ancestor of and in the CDT. Equivalently, is the centroid that first separates from as we descend the CDT. We "charge" the path to that one centroid -- it is the unique level where and end up in different child components for the first time, or where one of them is itself.
This is what makes the decomposition partition all
Visual Walkthrough
Step 1 -- Compute subtree sizes and find the centroid. For a tree of
Tree (sizes in parentheses, rooted at 1):
1(7)
/ \
2(4) 3(2)
/ \ \
4(2) 5(1) 6(1)
|
7(1)
Try removing each node, find max component:
Remove 1: components {2,4,5,7}(4), {3,6}(2) -> max = 4 > 3 NO
Remove 2: components {1,3,6}(3), {4,7}(2), {5}(1) -> max = 3 YES
Remove 3: components {1,2,4,5,7}(5), {6}(1) -> max = 5 > 3 NO
Centroid = node 2.Step 2 -- Process all paths through centroid 2. Run a DFS from node 2 into each subtree, collecting distances. Paths through 2 combine a node in one subtree with a node in another (or with 2 itself).
Remove centroid 2. Remaining subtrees rooted at neighbors:
Subtree A (via 1): 1 -- 3 -- 6
Subtree B (via 4): 4 -- 7
Subtree C (via 5): 5
DFS from centroid 2, recording (node, distance):
Subtree A: (1, 1), (3, 2), (6, 3)
Subtree B: (4, 1), (7, 2)
Subtree C: (5, 1)
To count pairs with dist <= K=3 passing through 2:
Sort all distances: [1, 1, 1, 2, 2, 3]
Use two pointers or merge step to count pairs
summing to <= K. Also add pairs (2, v) for each v.Step 3 -- Recurse on each subtree. Each subtree gets its own centroid, and we repeat.
Level 0: centroid = 2, covers whole tree (N=7)
Level 1: Subtree {1,3,6} -> centroid = 3
Subtree {4,7} -> centroid = 4 (or 7)
Subtree {5} -> centroid = 5 (leaf, base case)
Level 2: Subtree {1} -> centroid = 1
Subtree {6} -> centroid = 6
Subtree {7} -> centroid = 7Step 4 -- The centroid tree. Connect each centroid to its parent centroid (the centroid of the level above that contained it). This forms a new tree of depth
Centroid tree:
2
/ | \
3 4 5
/ \ \
1 6 7
Depth = 2 (log2(7) ~ 2.8)
Every path in the original tree is "owned" by
exactly one centroid in this tree.The Invariant
+----------------------------------------------------------------------+
| 1. Every subtree processed at recursion level L has size <= N/2^L. |
| Therefore the recursion depth is O(log N). |
| |
| 2. Every path u-v in the original tree has a unique CDT-LCA -- |
| the deepest centroid in the centroid decomposition tree that |
| is an ancestor of both u and v. We charge the path to that |
| centroid only, by subtracting the pairs that lie entirely |
| inside one child subtree (they belong to that child's level). |
| |
| 3. Each node appears in exactly O(log N) levels. Total work across |
| all levels is O(N log N) if we do O(1) work per node per level. |
+----------------------------------------------------------------------+When to Reach for This
Trigger phrases in problem statements:
- "Count paths in a tree with property
" (distance, sum, XOR, modular condition). - "Tree path queries" where paths are not rooted -- need to consider all
paths. - "Distances in tree" -- count or enumerate pairs at specific distances.
- "Update a node, query all nodes within distance
" -- centroid tree enables ancestors to check.
Counter-examples (do NOT use centroid decomposition):
- Path queries with updates on a rooted tree where paths go through ancestors -- HLD is simpler and online.
- Subtree queries only -- Euler tour suffices.
- Single-source distances -- BFS/DFS from that source in
. - Need to maintain the tree structure after operations -- centroid decomposition is a static decomposition.
What the Code Won't Teach You
The centroid decomposition tree is NOT the original tree. The CDT is a new tree where the parent of node
Original tree: Centroid decomposition tree (CDT):
1 2
/ \ / | \
2 3 3 4 5
/ \ \ / \ \
4 5 6 1 6 7
|
7 Depth of CDT: 2 (log₂ 7 ~= 2.8)
Key difference: node 2 is the ROOT of the CDT
(it was the first centroid found), and node 1
is a LEAF (it was captured last).The "subtract overcounting" step is where most bugs hide. When you count pairs through centroid
Why inclusion-exclusion is needed:
Centroid c with children A, B, C:
Pairs counted by "all distances from c":
AxB + AxC + BxC + Ax{c} + Bx{c} + Cx{c}
^^^ ^^^ ^^^ <- genuine cross-subtree pairs OK
ALSO counted (wrongly):
AxA + BxB + CxC <- intra-subtree pairs X
Fix: subtract count_pairs(A) + count_pairs(B) + count_pairs(C)Centroid decomposition is divide-and-conquer on trees -- the merge-sort analogy helps. In merge sort, you split the array at the midpoint, handle cross-boundary pairs, then recurse. In centroid decomposition, you split the tree at the centroid, handle cross-centroid paths, then recurse. The centroid guarantees each component has size
C++ STL Reference
| Function / Class | Header | What it does | Time Complexity |
|---|---|---|---|
vector<int> | <vector> | Adjacency list, distance collection | -- |
sort(begin, end) | <algorithm> | Sort distances for two-pointer counting | |
upper_bound(b, e, v) | <algorithm> | Binary search in sorted distances | |
accumulate(b, e, init) | <numeric> | Sum over collected values | |
fill(b, e, v) | <algorithm> | Reset removed/visited flags between levels |
Implementation (Contest-Ready)
+------------------------------------------------------------+
| CRITICAL: Mark the centroid as removed BEFORE recursing |
| into its subtrees. If you recurse first and mark after, |
| the subtree size calculations will include the centroid, |
| breaking the "each subtree has size <= n/2" guarantee. |
| This causes infinite recursion or wrong decomposition. |
+------------------------------------------------------------+cpp
void build(int v) {
int c = find_centroid(v);
removed[c] = true; // Mark FIRST
// Process centroid c...
for (int u : adj[c]) {
if (!removed[u]) {
build(u); // Recurse AFTER marking
}
}
}Version 1: Minimal Contest Template
Solves: count pairs of nodes with distance
cpp
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 200'005;
int n, K;
vector<pair<int,int>> adj[MAXN]; // {neighbor, weight}
int sz[MAXN];
bool removed[MAXN];
long long ans = 0;
int get_sz(int v, int p) {
sz[v] = 1;
for (auto [u, w] : adj[v])
if (u != p && !removed[u])
sz[v] += get_sz(u, v);
return sz[v];
}
int get_centroid(int v, int p, int tree_sz) {
for (auto [u, w] : adj[v])
if (u != p && !removed[u] && sz[u] > tree_sz / 2)
return get_centroid(u, v, tree_sz);
return v;
}
vector<int> dists;
void get_dists(int v, int p, int d) {
dists.push_back(d);
for (auto [u, w] : adj[v])
if (u != p && !removed[u])
get_dists(u, v, d + w);
}
long long count_pairs(vector<int>& a, int limit) {
sort(a.begin(), a.end());
long long cnt = 0;
int lo = 0, hi = (int)a.size() - 1;
while (lo < hi) {
if (a[lo] + a[hi] <= limit) {
cnt += hi - lo;
lo++;
} else {
hi--;
}
}
return cnt;
}
void solve(int v) {
int tree_sz = get_sz(v, -1);
int c = get_centroid(v, -1, tree_sz);
removed[c] = true;
// Count all pairs through c with dist <= K
vector<int> all_dists = {0}; // include centroid itself (dist 0)
for (auto [u, w] : adj[c]) {
if (removed[u]) continue;
dists.clear();
get_dists(u, c, w);
// Subtract pairs within same subtree (inclusion-exclusion)
ans -= count_pairs(dists, K);
for (int d : dists) all_dists.push_back(d);
}
ans += count_pairs(all_dists, K);
for (auto [u, w] : adj[c])
if (!removed[u])
solve(u);
}
int main() {
scanf("%d %d", &n, &K);
for (int i = 0; i < n - 1; i++) {
int u, v, w;
scanf("%d %d %d", &u, &v, &w);
adj[u].push_back({v, w});
adj[v].push_back({u, w});
}
solve(1);
printf("%lld\n", ans);
}Version 2: Explained Version with Centroid Tree Construction
Builds the centroid tree explicitly. Supports distance-based update/query: "update node
cpp
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 200'005;
const int LOG = 18; // log2(200000) ~ 17.6
int n, q;
vector<pair<int,int>> adj[MAXN]; // {neighbor, weight}
// --- Centroid decomposition arrays ---
int sz[MAXN];
bool removed[MAXN];
int cpar[MAXN]; // parent in the centroid tree
int cdepth[MAXN]; // depth in the centroid tree
int dist_to_canc[MAXN][LOG];
// dist_to_canc[v][k] = distance from v to its k-th ancestor
// in the centroid tree (0-th ancestor = v's centroid at its level)
// --- Find subtree size (ignoring removed nodes) ---
int get_sz(int v, int p) {
sz[v] = 1;
for (auto [u, w] : adj[v])
if (u != p && !removed[u])
sz[v] += get_sz(u, v);
return sz[v];
}
// --- Find centroid of the component containing v ---
// Walk toward the child with sz > tree_sz / 2 until no such
// child exists. That node is the centroid.
int get_centroid(int v, int p, int tree_sz) {
for (auto [u, w] : adj[v])
if (u != p && !removed[u] && sz[u] > tree_sz / 2)
return get_centroid(u, v, tree_sz);
return v;
}
// --- BFS/DFS to compute distances from centroid c to all
// nodes in its component, storing at the appropriate
// centroid-tree level ---
void compute_dists(int v, int p, int c, int d, int level) {
dist_to_canc[v][level] = d;
for (auto [u, w] : adj[v])
if (u != p && !removed[u])
compute_dists(u, v, c, d + w, level);
}
// --- Build centroid decomposition ---
// 1. Find centroid c of current component.
// 2. Record distances from c to all nodes in this component.
// 3. Mark c as removed.
// 4. Recurse on each remaining subtree.
void build(int v, int parent_centroid, int level) {
int tree_sz = get_sz(v, -1);
int c = get_centroid(v, -1, tree_sz);
cpar[c] = parent_centroid;
cdepth[c] = level;
removed[c] = true;
// Record distances from c to every node in its component
compute_dists(c, -1, c, 0, level);
for (auto [u, w] : adj[c])
if (!removed[u])
build(u, c, level + 1);
}
// --- Query structure: for each centroid, store a sorted list
// of distances to all "active" nodes in its component.
// Here we use a simpler approach: BIT/Fenwick per centroid
// is overkill; instead, for each centroid, maintain a count
// of active nodes and sum of distances. ---
// Simple approach: walk up centroid tree ancestors of query node.
// For "count marked nodes within distance K of u":
long long cnt[MAXN]; // cnt[c] = number of marked nodes in c's component
long long sum[MAXN]; // sum[c] = sum of dist_to_canc for marked nodes
// Mark node v as active
void update(int v) {
int cur = v;
while (cur != -1) {
int lvl = cdepth[cur];
cnt[cur]++;
sum[cur] += dist_to_canc[v][lvl];
cur = cpar[cur];
}
}
// Query: count of marked nodes within distance K of u
// Walk up centroid tree. At each ancestor c (level L):
// dist(u, marked_node) = dist_to_canc[u][L] + dist_to_canc[marked_node][L]
// We want dist_to_canc[u][L] + dist_to_canc[marked_node][L] <= K.
// Simple version (returns total distance sum, not count):
long long query_count(int u, int K_limit) {
long long result = 0;
int cur = u;
int prev = -1;
while (cur != -1) {
int lvl = cdepth[cur];
int du = dist_to_canc[u][lvl];
if (du <= K_limit) {
// All marked nodes in cur's component at distance
// <= K_limit - du from cur contribute.
// Simplified: count all (exact filtering needs more
// structure like a sorted list + binary search).
result += cnt[cur];
}
// Subtract double-counted nodes from child centroid
// (inclusion-exclusion handled by the caller or
// by maintaining per-child-subtree counters)
prev = cur;
cur = cpar[cur];
}
return result;
}
int main() {
scanf("%d %d", &n, &q);
for (int i = 0; i < n - 1; i++) {
int u, v, w;
scanf("%d %d %d", &u, &v, &w);
adj[u].push_back({v, w});
adj[v].push_back({u, w});
}
build(1, -1, 0);
while (q--) {
int type; scanf("%d", &type);
if (type == 1) {
int v; scanf("%d", &v);
update(v);
} else {
int u, k; scanf("%d %d", &u, &k);
printf("%lld\n", query_count(u, k));
}
}
}Note on inclusion-exclusion in Version 2: The simplified query_count above illustrates the centroid-tree walk. For exact distance-filtered counting, maintain a sorted array or BIT of distances at each centroid and binary search. See Pattern 3 below for the precise technique.
ASCII Trace: Centroid Decomposition of a 7-Node Tree
Input tree: Adjacency list:
0 0: [1, 2]
/ \ 1: [0, 3, 4]
1 2 2: [0, 5, 6]
/ \ / \ 3: [1]
3 4 5 6 4: [1]
5: [2]
6: [2]
Step 1: Compute subtree sizes (rooted at 0)
sz[] = [7, 3, 3, 1, 1, 1, 1]
n=7, centroid must have all parts <= 7/2=3
Node 0: remove -> parts {3, 3} <= 3 OK -> centroid = 0
Step 2: Remove centroid 0, recurse on components {1,3,4} and {2,5,6}
Component {1,3,4}: Component {2,5,6}:
1 2
/ \ / \
3 4 5 6
sz=[3,1,1] sz=[3,1,1]
centroid=1 centroid=2
Step 3: Remove 1 -> components {3},{4} Remove 2 -> components {5},{6}
centroid=3, centroid=4 centroid=5, centroid=6
Centroid tree (parent -> child):
0
/ \
1 2
/ \ / \
3 4 5 6
Depths in centroid tree: 0->0, 1->1, 2->1, 3->2, 4->2, 5->2, 6->2
Max depth = O(log n) = 2 (guaranteed for any tree)Operations Reference
| Operation | Time | Space | Notes |
|---|---|---|---|
| Build centroid decomposition | |||
| Count pairs with distance | Sorting per centroid adds a | ||
| Centroid tree walk (update/query) | -- | Walk up | |
| Centroid tree walk + binary search | Sorted distance list at each centroid level | ||
| Total space (with distance arrays) | -- | Each node stores dist to |
Problem Patterns & Tricks
Pattern 1: Count Paths with Distance
The classic centroid decomposition problem. At each centroid
cpp
void solve(int v) {
int c = get_centroid(v, -1, get_sz(v, -1));
removed[c] = true;
vector<int> all = {0};
for (auto [u, w] : adj[c]) {
if (removed[u]) continue;
dists.clear();
get_dists(u, c, w);
ans -= count_pairs(dists, K); // subtract intra-subtree
for (int d : dists) all.push_back(d);
}
ans += count_pairs(all, K); // add all through c
for (auto [u, w] : adj[c])
if (!removed[u]) solve(u);
}Problems: CF 161D, IOI 2011 Race
Pattern 2: Path with Exact/Modular Property
Instead of distance
cpp
// Count paths with distance exactly K through centroid c
int freq[MAXK]; // freq[d] = count of nodes at distance d
void solve(int c) {
// ...find centroid, mark removed...
freq[0] = 1; // centroid itself
for (auto [u, w] : adj[c]) {
if (removed[u]) continue;
dists.clear();
get_dists(u, c, w);
// Query: for each dist d, ans += freq[K - d]
for (int d : dists)
if (d <= K) ans += freq[K - d];
// Update: add this subtree's distances to freq
for (int d : dists)
if (d <= K) freq[d]++;
}
// Clean up freq (don't memset -- only clear used entries)
// ...
}Problems: IOI 2011 Race (exact distance, minimize edges), CF 914E (XOR on path)
Pattern 3: Online Update/Query at Distance (Centroid Tree Walk)
Build the centroid tree explicitly. Store dist_to_canc[v][level] -- the distance from
Maintain two data structures per centroid: one for all nodes in the component, one for nodes in the "parent direction" subtree, enabling clean subtraction.
cpp
// At each centroid c, maintain:
// all[c] = sorted distances of activated nodes in c's component
// sub[c] = sorted distances of activated nodes, measured from cpar[c]
// Query at u with limit K:
long long query(int u, int K) {
long long res = 0;
int cur = u;
while (cur != -1) {
int d = dist_to_canc[u][cdepth[cur]];
if (d <= K)
res += count_within(all[cur], K - d);
if (cpar[cur] != -1) {
d = dist_to_canc[u][cdepth[cpar[cur]]];
if (d <= K)
res -= count_within(sub[cur], K - d);
}
cur = cpar[cur];
}
return res;
}Problems: CF 342E (closest marked node), CF 1303G
Pattern 4: XOR / Bitwise Path Queries
When the path "distance" uses XOR (or other bitwise operations), centroid decomposition pairs naturally with a trie. At each centroid, insert XOR-prefix distances into a binary trie, then for each node in a new subtree, query the trie for the maximum XOR pairing.
Problems: CF 914E (max XOR path in tree)
Pattern 5: Counting Paths by Number of Edges
When edge weights are all 1 (unweighted tree), "distance" = number of edges. The distance array at each centroid is bounded by the component size. Use a simple frequency array of size
cpp
int freq[MAXN];
// At centroid c: freq[d]++ for each node at depth d
// Query: for target T, answer += freq[T - d] for each new node at depth dProblems: CF 161D (pairs at exact distance
Pattern 6: Minimize/Maximize Over All Paths Through Centroid
Instead of counting, find the path through each centroid that minimizes or maximizes some objective (e.g., minimum number of edges among paths with weight sum
Problems: IOI 2011 Race (min edges for exact weight sum
Contest Cheat Sheet
+------------------------------------------------------------------+
| CENTROID DECOMPOSITION CHEAT SHEET |
+------------------------------------------------------------------+
| WHEN TO USE: |
| - Count/find paths with a distance/weight property |
| - Pairs of nodes at distance <= K, == K, or XOR == K |
| - Update node, query all nodes within distance K |
| - Divide-and-conquer on trees (the "merge sort" of trees) |
+------------------------------------------------------------------+
| FIND CENTROID: |
| get_sz(v, -1) then get_centroid(v, -1, sz[v]) |
| Centroid c: max component after removal <= N/2 |
+------------------------------------------------------------------+
| CORE LOOP (counting pattern): |
| solve(v): |
| c = get_centroid(v) |
| removed[c] = true |
| for each child subtree of c: |
| collect distances |
| subtract intra-subtree pairs (inclusion-exclusion) |
| add all-through-c pairs |
| recurse on each child subtree |
+------------------------------------------------------------------+
| CENTROID TREE (online queries): |
| build(v, parent_centroid, level): |
| c = get_centroid(v) |
| cpar[c] = parent_centroid |
| store dist from every node to c at this level |
| recurse on subtrees |
| update(v): walk up cpar[], O(log N) ancestors |
| query(u): walk up cpar[], query + subtract (incl-excl) |
+------------------------------------------------------------------+
| COMPLEXITY: |
| Build: O(N log N) Depth: O(log N) |
| Count pairs: O(N log^2 N) (sort per level) |
| Online update/query: O(log^2 N) per op (with BIT/sorted) |
| Space: O(N log N) for distance arrays |
+------------------------------------------------------------------+
| GOTCHA: Re-root get_sz before get_centroid at each level. |
| GOTCHA: Inclusion-exclusion -- subtract same-subtree pairs. |
| GOTCHA: Clean freq[] by undoing, NOT memset (TLE risk). |
+------------------------------------------------------------------+Gotchas & Debugging
1. Forgetting to Recompute Subtree Sizes
get_sz must be called before get_centroid at every recursive level. After removing previous centroids, subtree sizes change. Using stale sizes finds the wrong centroid and breaks the
2. Missing Inclusion-Exclusion
When counting pairs through centroid
3. Cleaning Up Frequency Arrays with memset
After processing a centroid, you need to reset the frequency array. Using memset(freq, 0, sizeof freq) is
cpp
// GOOD: undo only entries you set
for (int d : all_dists) freq[d] = 0;
// BAD: memset over a huge array
memset(freq, 0, sizeof freq); // TLE if array is 10^94. Stack Overflow on Deep Trees
Like HLD, the recursive solve, get_sz, and get_centroid functions can overflow the stack on degenerate trees (
- Increase stack size:
ulimit -s unlimitedon Linux. - Convert
get_szandget_diststo iterative using an explicit stack. - The centroid-finding recursion itself is bounded by tree depth, not
, so it is usually safe.
5. Off-by-One in Two-Pointer Counting
When using two pointers on a sorted distance array to count pairs summing to
cpp
// CORRECT: lo < hi ensures we count unordered pairs, not self-pairs
while (lo < hi) {
if (a[lo] + a[hi] <= K) { cnt += hi - lo; lo++; }
else hi--;
}Forgetting lo < hi (using lo <= hi) counts a node paired with itself.
6. Edge Weights vs. Hop Count
Confirm whether the problem uses weighted edges or counts hops. If unweighted, every edge has weight 1, and get_dists(u, c, 1) is correct. Accidentally using w from an adjacency list that was never initialized yields garbage.
7. Centroid Tree Parent of Root
The root centroid (level 0) has no parent. Set cpar[root_centroid] = -1 and guard all ancestor walks with while (cur != -1). Forgetting this causes an infinite loop or segfault.
8. Debugging Tips
- Verify
removed[c]is set before recursing, not after. Setting it after causes infinite recursion (the centroid is found again in its own subtrees). - Print the centroid at each level and the component size. Sizes should halve. If not,
get_centroidis broken. - Test on a line graph (1-2-3-...-N): centroid should be
, depth should be . - Test on a star graph (node 1 connected to all others): centroid = node 1, all subtrees are leaves, depth = 2.
Mental Traps
"Since each centroid is processed once, the total work is
."
Each centroid is processed once, yes. But each node appears in
How each node participates in multiple levels:
Level 0: centroid = 2 all 7 nodes processed at this level
Level 1: centroids = 3,4,5 6 remaining nodes split among them
Level 2: centroids = 1,6,7 3 leaf nodes
Node 1 appears at: level 0 (in centroid 2's component)
level 1 (in centroid 3's component)
level 2 (as its own centroid)
= 3 levels ~= log₂(7)
Total node-visits: 7 + 6 + 3 = 16 ~= 7 x log₂(7)
Each level sums to at most n -> O(n log n) total"Centroid decomposition is only for distance counting problems."
The canonical example counts pairs at distance
Applications beyond distance counting:
+-----------------------------+--------------------------------+
| Path property | Data structure at centroid |
+-----------------------------+--------------------------------+
| dist(u,v) <= K | Sorted array + two pointers |
| dist(u,v) == K | Frequency array: freq[d] |
| XOR(u,v) maximized | Binary trie of XOR-prefixes |
| weight sum ≡ 0 (mod M) | Freq array indexed by residue |
| Min edges for exact weight | freq[w] = min edges seen |
+-----------------------------+--------------------------------+Common Mistakes
- Using stale subtree sizes when finding the next centroid. Centroid decomposition TLEs on a bamboo (chain) graph because subtree sizes weren't recalculated after finding the parent centroid. With stale sizes from the previous DFS, the "centroid" found is not the true centroid and the recursion depth becomes
instead of . After identifying the centroid and removing it, re-run the size DFS in each remaining subtree before finding their centroids.
Debug Drills
Drill 1 -- Centroid finding: using stale subtree sizes
cpp
int centroid(int v, int parent, int tree_size) {
for (int to : adj[v]) {
if (to != parent && !removed[to] && sz[to] > tree_size / 2)
return centroid(to, v, tree_size);
}
return v;
}What's wrong?
sz[to] might be stale (computed for a larger tree before centroids were removed). You must recompute subtree sizes for the current component before finding the centroid:
cpp
void calc_sz(int v, int parent) {
sz[v] = 1;
for (int to : adj[v])
if (to != parent && !removed[to])
calc_sz(to, v), sz[v] += sz[to];
}Call calc_sz at the start of each decompose call.
Drill 2 -- Double-counting paths within the same subtree
cpp
// For centroid c, count paths of length k:
int total = 0;
vector<int> all_dists;
for (int child : adj[c]) {
if (removed[child]) continue;
vector<int> child_dists = get_distances(child, c);
for (int d : child_dists)
total += count(all_dists, k - d); // count pairs
for (int d : child_dists)
all_dists.push_back(d);
}What's wrong?
This approach is actually correct -- it processes subtrees one by one, counting pairs between the current subtree and all previously processed subtrees, avoiding double-counting. But if you instead collected ALL distances first and counted all pairs, you'd double-count paths within the same subtree. The "process one subtree at a time" approach shown here is the fix. A common bug is to forget this ordering and count all pairs globally.
Drill 3 -- Forgetting to mark centroid as removed
cpp
void decompose(int v) {
int c = find_centroid(v);
// Missing: removed[c] = true;
for (int to : adj[c]) {
if (!removed[to])
decompose(to);
}
}What's wrong?
Without marking c as removed, the recursive calls will see c as part of their subtrees, causing infinite recursion or incorrect subtree sizes:
cpp
removed[c] = true;Self-Test
Before moving to practice problems, verify you can answer these without looking at the notes:
- [ ] Write the centroid-finding function from memory --
get_szthenget_centroid, skippingremoved[]nodes, walking toward the child withsz > tree_sz / 2. - [ ] Explain why each node appears in
levels of the decomposition -- the centroid guarantees each component has size , so sizes halve at each level. - [ ] Implement the inclusion-exclusion for counting pairs at distance
: count all pairs through centroid , subtract pairs within each child subtree. - [ ] State why
removed[c] = truemust be set BEFORE recursing into subtrees -- otherwiseget_szincludes the centroid, breaking the size-halving guarantee. - [ ] Describe the centroid decomposition tree (CDT) and how to use it for online queries: walk up
ancestors, query/update at each level, apply inclusion-exclusion.
Practice Problems
| CF Rating | What You Should Be Able to Do |
|---|---|
| 1200 | Solve tree-path problems with brute-force DFS for small trees. |
| 1500 | Understand the centroid property and why recursion depth is |
| 1800 | Implement centroid decomposition; solve "count paths of length |
| 2100 | Combine centroid decomp with FFT/NTT for convolution-based path counting; handle path queries with updates via the centroid tree. |
| # | Problem | Source | Difficulty | Key Idea |
|---|---|---|---|---|
| 1 | Distance in Tree -- CF 161D | CF | CF 1800 | Count pairs at exact distance |
| 2 | Race -- IOI 2011 | IOI | Medium | Min edges for path with weight sum |
| 3 | Xenia and Tree -- CF 342E | CF | CF 2100 | Online closest marked node via centroid tree walk |
| 4 | Ciel and Gondola -- CF 321C | CF | CF 2100 | Centroid decomposition used to build balanced separator tree |
| 5 | Lucky Tree -- CF 914E | CF | CF 2100 | Max XOR on path; centroid decomp + trie |
| 6 | Tree Queries -- CF 1303G | CF | CF 2600 | Online update/query at distance; centroid tree + BIT |
| 7 | Digit Tree -- CF 716E | CF | CF 2400 | Count paths forming number divisible by |
| 8 | Tree and Queries -- CF 375D | CF | CF 2000 | Count colors appearing |
| 9 | Sherlock's Array -- CF 776F | CF | CF 2400 | Complex path property; centroid decomp + data structure |
| 10 | CSES Fixed-Length Paths I | CSES | Hard | Count paths of exactly |
Further Reading
- cp-algorithms: Centroid Decomposition -- detailed tutorial with proofs and code.
- CF Blog: Centroid Decomposition of a Tree (Tanuj Khattar) -- widely-referenced tutorial with examples.
- CF Blog: Easiest Centroid Decomposition Implementation -- minimal implementation guide.
- For the prerequisite DFS and tree traversal techniques, see: DFS & Tree DFS
- For Euler tour and LCA (useful for combining with centroid decomposition), see: Euler Tour & LCA
- For an alternative tree decomposition focused on path aggregate queries, see: Heavy-Light Decomposition