Appearance
Small-to-Large Merging
Quick Revisit
- USE WHEN: merging child data structures into parent during tree DFS (distinct colors, set union on tree)
- INVARIANT: always merge smaller set into larger — each element moves O(log n) times total
- TIME: O(n log n) total | SPACE: O(n)
- CLASSIC BUG: copying the larger set instead of swapping references/pointers — kills the amortization
- PRACTICE: 08-ladder-mixed
Always merge the smaller structure into the larger one. A single rule that turns
Difficulty: [Intermediate] · CF rating: 1600–1900 Prerequisites: Union-Find DSU, DFS and Tree DFS
Why This Exists
Suppose you have a rooted tree on
The naive plan: run a DFS, and at each node maintain a set<int> of colors seen in its subtree. When you return from a child, merge the child's set into the parent's set.
For a balanced tree this is fine -- each merge is small.
But what about a chain? 1 (red)
|
2 (blue)
|
3 (red)
|
4 (green)
|
5 (blue)Node 5's set has 1 element. Merge into node 4's set: 1 copy. Node 4's set now has 2 elements -- merge into node 3: 2 copies. Node 3 has 3 elements -- merge into node 2: 3 copies. And so on. Total copies:
The fix is embarrassingly simple: always merge the smaller set into the larger one. If the child's set is bigger than the parent's set, swap them first, then pour the smaller into the bigger.
Why does this work? Every time an element gets moved, the set it lands in is at least twice the size of the set it came from (because we only move elements from the smaller side). So each element can be moved at most
This is the exact same argument behind union by size in DSU. When you do union(u, v), you attach the smaller tree under the root of the larger tree. Each node's depth increases by 1 only when its component at least doubles, giving
Small-to-Large vs DSU-on-Tree
Both solve "aggregate data over subtrees" in
Small-to-large (set merging): Explicitly maintain a container per node. Merge smaller into larger during DFS. Works for any mergeable structure (set, map, multiset, even segment trees). Straightforward to implement.
DSU-on-tree (Euler tour trick): Don't maintain per-node containers at all. Instead, process the "heavy child" without clearing, then process light children with add/remove. Uses the heavy-light structure implicitly. Constant factor is often better because you avoid pointer-heavy containers, but the logic is trickier.
Pick set merging when the container semantics matter (e.g., you need a sorted set or a frequency map). Pick DSU-on-tree when you just need aggregate counts and want speed.
The Doubling Argument, Visually
Consider this tree with 8 nodes, each with a color label:
1(R)
/ \
2(B) 3(G)
/ \ \
4(R) 5(B) 6(R)
| |
7(G) 8(B)Colors: R=red, B=blue, G=green. We want distinct color count per subtree.
DFS processes leaves first. Here is the merge sequence with small-to-large:
Step 1: Leaf nodes get singleton sets.
node 7: {G} size 1
node 5: {B} size 1
node 8: {B} size 1
Step 2: Process node 4. Has child 7.
node 4 set: {R} size 1
node 7 set: {G} size 1
Same size -- pick either direction. Merge 7 into 4.
node 4: {R, G} size 2 moves: 1
Step 3: Process node 6. Has child 8.
node 6 set: {R} size 1
node 8 set: {B} size 1
Merge 8 into 6.
node 6: {R, B} size 2 moves: 1
Step 4: Process node 2. Children: 4, 5.
node 2 set: {B} size 1
node 4 set: {R, G} size 2
node 5 set: {B} size 1
Child 4 is larger than node 2 -- SWAP. node 2 takes {R, G}.
Merge old node 2 ({B}) into new node 2: {R, G, B} moves: 1
Merge node 5 ({B}) into node 2: B already present. moves: 1
node 2: {R, G, B} size 3 total moves this step: 2
Step 5: Process node 3. Has child 6.
node 3 set: {G} size 1
node 6 set: {R, B} size 2
Child 6 is larger -- SWAP. node 3 takes {R, B}.
Merge old node 3 ({G}) into new: {R, B, G} moves: 1
node 3: {R, B, G} size 3
Step 6: Process node 1. Children: 2, 3.
node 1 set: {R} size 1
node 2 set: {R, G, B} size 3
node 3 set: {R, B, G} size 3
Child 2 is largest -- SWAP. node 1 takes {R, G, B}.
Merge old node 1 ({R}): already present. moves: 1
Merge node 3 ({R, B, G}): all present. moves: 3
node 1: {R, G, B} size 3 total moves: 4Total element moves across all steps:
Naive merge (always child into parent) on this tree would do similar work here, but on a worst-case chain of
The Doubling Picture
Track how many times each element moves:
Element journey (tracking how many times each item is copied):
G at node 7: created
-> moved into node 4 (set size 1 -> 2) .... move #1
-> stays at node 2 via swap .... free
-> stays at node 1 via swap .... free
Total moves for this G: 1
R at node 4: created
-> stays at node 2 via swap .... free
-> stays at node 1 via swap .... free
Total moves for this R: 0
B at node 5: created
-> moved into node 2 (set size 1 -> 3) .... move #1
-> stays at node 1 via swap .... free
Total moves for this B: 1
R at node 6: created
-> stays at node 3 via swap .... free
-> moved into node 1 (set size 3 -> 3) .... move #1
Total moves for this R: 1 (duplicate, doesn't increase set)
B at node 8: created
-> moved into node 6 (set size 1 -> 2) .... move #1
-> stays at node 3 via swap .... free
-> moved into node 1 (set size 3 -> 3) .... move #2
Total moves for B(8): 2The key: every time an element moves, the receiving set is at least as large as the source set. After the move, the element is in a set at least twice its previous size. So: max moves per element
C++ STL Tools for Merging
The main containers you'll merge small-to-large:
| Container | Insert one element | Merge |
|---|---|---|
set<T> | ||
map<K,V> | ||
unordered_set<T> | ||
multiset<T> |
Critical operations:
swap(a, b)--for all STL containers. This is how you "take" the larger child's set without copying. Never skip this. a.insert(b.begin(), b.end())-- range insert. For sets/maps.a.merge(b)-- C++17. Moves nodes frombintoawithout reallocation. Faster than insert for node-based containers (set, map). After the call,bcontains only elements that had duplicates ina.a.size()--. Use to decide which is smaller.
std::set::merge is preferred over manual iteration + insert when available:
cpp
set<int> a = {1, 3, 5};
set<int> b = {2, 3, 4};
a.merge(b);
// a = {1, 2, 3, 4, 5}, b = {3} (duplicate stayed in b)For small-to-large, you almost always want the duplicate-free version, so merge works perfectly -- leftover duplicates in the smaller set don't matter since you discard it.
Data structure choice matters. set insertion is unordered_set is set; for pure membership, unordered_set is faster.
Implementation: Set Merging
The Problem
Given a rooted tree with
Contest-Ready Version
cpp
#include <bits/stdc++.h>
using namespace std;
int main() {
int n;
scanf("%d", &n);
vector<int> color(n + 1);
for (int i = 1; i <= n; i++) scanf("%d", &color[i]);
vector<vector<int>> adj(n + 1);
for (int i = 0; i < n - 1; i++) {
int u, v;
scanf("%d%d", &u, &v);
adj[u].push_back(v);
adj[v].push_back(u);
}
vector<int> ans(n + 1);
vector<set<int>*> node_set(n + 1);
// Iterative DFS with post-order processing
vector<int> order;
vector<int> par(n + 1, 0);
vector<bool> visited(n + 1, false);
stack<int> stk;
stk.push(1);
visited[1] = true;
while (!stk.empty()) {
int v = stk.top(); stk.pop();
order.push_back(v);
for (int u : adj[v]) {
if (!visited[u]) {
visited[u] = true;
par[u] = v;
stk.push(u);
}
}
}
// Process in reverse BFS order (leaves first)
reverse(order.begin(), order.end());
for (int v : order) {
node_set[v] = new set<int>();
node_set[v]->insert(color[v]);
for (int u : adj[v]) {
if (u == par[v]) continue;
// Merge smaller into larger
if (node_set[u]->size() > node_set[v]->size())
swap(node_set[v], node_set[u]);
node_set[v]->merge(*node_set[u]);
delete node_set[u];
node_set[u] = nullptr;
}
ans[v] = node_set[v]->size();
}
for (int i = 1; i <= n; i++)
printf("%d%c", ans[i], " \n"[i == n]);
delete node_set[1];
return 0;
}Walkthrough
The core loop does three things per node:
- Create a set for the node, insert its own color.
- For each child, compare set sizes. If the child's set is larger, swap pointers (this is
-- we swap set*pointers, not the sets themselves). Then merge the smaller set into the larger usingset::merge. - Record the answer as the set's size.
The swap-pointers trick is essential. Without it, you'd always copy into the parent's set regardless of size, losing the
Memory: we delete child sets after merging to avoid holding
Cleaner Version Using Recursive DFS
If stack depth isn't a concern (tree isn't a chain of
cpp
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 200005;
vector<int> adj[MAXN];
int color[MAXN], ans[MAXN];
set<int>* dfs_set[MAXN];
void dfs(int v, int p) {
dfs_set[v] = new set<int>();
dfs_set[v]->insert(color[v]);
for (int u : adj[v]) {
if (u == p) continue;
dfs(u, v);
if (dfs_set[u]->size() > dfs_set[v]->size())
swap(dfs_set[v], dfs_set[u]);
dfs_set[v]->merge(*dfs_set[u]);
delete dfs_set[u];
}
ans[v] = dfs_set[v]->size();
}
int main() {
int n;
scanf("%d", &n);
for (int i = 1; i <= n; i++) scanf("%d", &color[i]);
for (int i = 0; i < n - 1; i++) {
int u, v;
scanf("%d%d", &u, &v);
adj[u].push_back(v);
adj[v].push_back(u);
}
dfs(1, 0);
for (int i = 1; i <= n; i++)
printf("%d%c", ans[i], " \n"[i == n]);
delete dfs_set[1];
}Implementation: DSU-on-Tree (Heavy-Child Variant)
DSU-on-tree (also called "small-to-large on Euler tour" or the "heavy child trick") drops explicit set containers and uses a global array of counters instead. The same amortization argument applies — each node is touched at most
The shape of the algorithm:
- Find the heavy child (largest subtree) of each node.
- DFS into all light children first; undo their contributions.
- DFS into the heavy child; keep its contributions.
- Add the current node + all light subtrees on top.
- Record the answer.
- If we ourselves are a light child of our parent, undo everything.
Why this works: the heavy child's data is reused for free — we never copy or clear it. Light children's contributions cost their subtree size, but a node is only ever inside a light subtree on
For the full contest-ready implementation, complexity proof, and standard problem variants, see DSU on Tree (Ch 6). The two files split deliberately: this file owns the principle; that file owns the implementation.
Set Merging vs DSU-on-Tree at a glance
| Aspect | Set merging | DSU-on-tree |
|---|---|---|
| Container per node | Yes (set<int>*) | No (global cnt[]) |
| Memory | ||
| Constant factor | Higher (alloc, ptr-heavy) | Lower (array ops) |
| Flexibility | Any mergeable structure | Needs add/remove ops |
| Difficulty to code | Easy | Medium |
Complexity Reference
| Operation | Time | Notes |
|---|---|---|
| Small-to-large set merge (total) | ||
Small-to-large with unordered_set | ||
Small-to-large with set::merge | Node transfer avoids realloc | |
| DSU-on-tree | Each node added/removed | |
| DSU union by size | Same analysis | |
| Naive merge (always small into big) | Chain tree |
The extra set::insert. If you replace set with a hash set, you get add/remove are
Problem Patterns and Tricks
Pattern 1: Distinct Values in Subtree
The classic application. Maintain set<int> per node, merge small-to-large. Answer is set.size() after merging all children.
Variant: "number of distinct values occurring at least map<int,int> (color -> count) instead. Merge by iterating the smaller map and updating counts in the larger.
Pattern 2: Frequency Maps for Subtree Aggregates
For "sum of colors that appear exactly once" or "most frequent color in subtree," maintain a map<int,int> mapping color to frequency, plus auxiliary variables tracking the answer.
cpp
// When merging entry (color c, count k) from smaller into larger:
// 1. Remove old contribution of c from answer
// 2. Update frequency: larger[c] += k
// 3. Add new contribution of c to answerThis pattern handles any query that decomposes over color frequencies.
Pattern 3: DSU-on-Tree for Counting Queries
DSU-on-tree shines when the "aggregate" is a simple counter or sum. Examples:
- Count of distinct colors (maintain
distinctcounter) - Sum of values at nodes with a specific color
- Maximum frequency color in subtree
You define add(v) and remove(v) to update global state when node
Pattern 4: Choosing Between Techniques
Need sorted order or set operations? --> set merge
Need just counts / sums? --> DSU-on-tree
Need to merge segment trees? --> small-to-large on seg trees
Offline, can sort queries? --> consider Mo on trees
Need persistent results across subtrees? --> set merge (keeps data)Pattern 5: Merging Sorted Structures Beyond Sets
Small-to-large works for any data structure where "merge" means inserting elements one by one. Examples:
- Vectors (sorted): Merge smaller sorted vector into larger using binary search insertion.
total. - Segment trees: Merge two segment trees by walking both simultaneously. With dynamic nodes, merging a tree of size
into a larger one costs where is the value range. Total: . - Fenwick trees / BITs: Less natural to merge -- consider DSU-on-tree with a global BIT instead.
Pattern 6: Weighted DSU (Edge-Weight Aggregation)
When edges have weights and you need subtree queries involving both structure and values, combine DSU union-by-size with weight tracking. The small-to-large principle ensures each element's weight is updated at most
Gotchas and Debugging
Swap Pointers, Not Contents
The most common bug: copying the larger set into a new set instead of swapping.
cpp
// WRONG: O(n^2) -- copies the big set every time
for (int x : *child_set) parent_set->insert(x);
// RIGHT: O(1) swap, then merge small into big
if (child_set->size() > parent_set->size())
swap(parent_set, child_set);
parent_set->merge(*child_set);If you're using set<int> node_set[MAXN] (by value, not pointer), then swap does swap contents in
Forgetting to Merge Into the Correct Side
With pointer-based sets, after swap(node_set[v], node_set[u]), the parent v now holds what was u's set. Make sure subsequent children merge into node_set[v], not some stale reference.
Root Handling
The root's answer is just its final set size. But if you're tracking something beyond size (like sum of frequencies), make sure the root's own contribution is included before reading the answer. A common mistake: inserting the node's own color after merging children, then forgetting to do it for the root.
Fix: always insert the node's own color first, before processing children.
Off-by-One in Subtree Sizes (DSU-on-Tree)
In DSU-on-tree, the Euler tour range for node
cpp
// Correct: half-open [tin[v], tout[v])
for (int t = tin[u]; t < tout[u]; t++) add(euler[t]);
// WRONG: closed interval includes one node too many
for (int t = tin[u]; t <= tout[u]; t++) add(euler[t]);Memory Leaks in Contest
Allocating new set<int> per node and forgetting to delete after merge won't cause WA, but can cause MLE on tight problems. Either delete immediately after merge or use an arena/pool allocator.
Confusing Heavy Child with Largest Set
In set merging, "heavy" means the child with the largest set after its subtree is processed -- not the child with the largest subtree size. These differ when colors repeat. For small-to-large set merge, compare set.size(), not subtree_size. For DSU-on-tree, compare subtree_size (since we don't have explicit sets).
Mental traps
"I'll merge in any order — it's the same idea." No. The direction is the entire basis of the complexity guarantee. Repeatedly merging a set of size
"I don't need the proof, just the rule." Knowing the proof tells you exactly when the technique applies. The invariant is "after each move, the containing set is at least twice as large." If you break the rule, the doubling fails and the bound fails.
"This is only a tree-DFS technique." The DFS is the most common application, but the principle applies anywhere you repeatedly merge two groups: offline graph problems, set partitioning, weighted union-find with set contents.
"DSU-on-tree is the same as small-to-large." Related but not identical. DSU-on-tree inherits the heavy child's data without copying (an optimization); explicit small-to-large merges elements one by one. DSU-on-tree uses the heavy-light structure implicitly.
Destroy small buckets after merging
A subtle TLE pattern: in Lomsat gelral (CF 600E) with std::map<int,int>, after merging a child's map into the parent, forgetting to clear() the child's map leaves it alive. Later, when the parent is merged upward, the ancestor sees a "small" bucket that is actually still full. The amortization shatters because the size-comparison lies. Rule: pour the small bucket into the big one, then destroy the small bucket.
Debug Drills
Drill 1: Merging sets on a tree -- but the answer is wrong for the root.
cpp
void dfs(int v, int p) {
sets[v].insert(color[v]);
for (int u : adj[v]) {
if (u == p) continue;
dfs(u, v);
if (sets[v].size() < sets[u].size())
swap(sets[v], sets[u]);
for (int x : sets[v]) // <-- look here
sets[v].insert(x);
}
ans[v] = sets[v].size();
}What's wrong?
The merge loop iterates over sets[v] (the larger set) and inserts into sets[v] -- it should iterate over sets[u] (the smaller set) and insert into sets[v]. Iterating and inserting into the same container is also undefined behavior for std::set. Fix: for (int x : sets[u]) sets[v].insert(x);
Drill 2: DSU union-by-size -- but find is slow on some inputs.
cpp
int par[MAXN], sz[MAXN];
int find(int x) {
while (par[x] != x) x = par[x];
return x;
}
void unite(int a, int b) {
a = find(a); b = find(b);
if (a == b) return;
if (sz[a] < sz[b]) swap(a, b);
par[b] = a;
// sz[a] += sz[b]; <-- missing line
}What's wrong?
After attaching b under a, the code never updates sz[a]. So sz[a] stays at 1, and future unions always think every component is size 1 -- effectively producing a random tree shape instead of a balanced one. Worst-case find becomes O(n). Fix: add sz[a] += sz[b]; after par[b] = a;.
Drill 3: Small-to-large merging of std::map for frequency counts -- correct logic but TLE.
cpp
void merge_into(map<int,long long>& big, map<int,long long>& small) {
for (auto& [key, val] : small)
big[key] += val;
small.clear();
}
void dfs(int v, int p) {
freq[v][color[v]] = 1;
for (int u : adj[v]) {
if (u == p) continue;
dfs(u, v);
merge_into(freq[v], freq[u]); // always merge u into v
}
}What's wrong?
The code always merges the child into the parent without checking sizes. If the child's map is larger than the parent's, you should swap them first. Without the swap, a chain-shaped tree degrades to O(n²). Fix: before merge_into, add if (freq[v].size() < freq[u].size()) swap(freq[v], freq[u]); to ensure you always iterate over the smaller map.
Self-Test
Drawn from reflection notes
Without opening any reference:
[ ] Prove the
complexity of small-to-large merging in 3-4 sentences. If you can't, you don't own this technique yet. [ ] Write the small-to-large DFS template from memory -- inherit the largest child's set, merge smaller children into it. Include the
swapstep.[ ] Explain why
std::swapon twostd::setobjects is-- what does swap actually do internally? Why is this critical for the technique? [ ] Identify a problem that looks like small-to-large but isn't -- where the set operation is not a union, or where the "small into large" direction doesn't apply. What goes wrong?
[ ] State the time complexity for small-to-large using
std::set(insert) vs std::unordered_set(average insert). What is the total in each case? When would you prefer one over the other?
Practice Problems
| Problem | Source | Difficulty | Technique | Link |
|---|---|---|---|---|
| Distinct Colors | CSES | 1600 | Set merge / DSU-on-tree | cses.fi/problemset/task/1139 |
| Lomsat gelral | CF 600E | 1900 | DSU-on-tree (most frequent color) | codeforces.com/contest/600/problem/E |
| Arpa's letter-marked tree | CF 741D | 2100 | DSU-on-tree + bitmask | codeforces.com/contest/741/problem/D |
| Blood Cousins Return | CF 246E | 1800 | Set merge for distinct names at depth | codeforces.com/contest/246/problem/E |
| Tree Queries | AtCoder ABC 202 E | 1700 | Euler tour + counting (DSU-on-tree variant) | atcoder.jp/contests/abc202/tasks/abc202_e |
| Propagating tree | CF 383C | 1700 | Euler tour with range updates | codeforces.com/contest/383/problem/C |
| Count on a tree II | SPOJ COT2 | 2000 | Mo on trees (alternative technique to compare) | spoj.com/problems/COT2 |
Start with CSES Distinct Colors (pure set merge). Then CF 600E for DSU-on-tree. CF 741D is a great "next step" that combines the technique with bitmask tricks.
Rating progression:
| CF Rating | What to Master |
|---|---|
| 1200 | Understand why naïve merging of sets is O(n²). Implement basic DSU with union by size. Know that swap is O(1) for STL containers. |
| 1500 | Implement DSU-on-tree (Euler tour + heavy child reuse). Solve "count distinct colors in subtree" type problems. Understand the log n amortized bound. |
| 1800 | Combine small-to-large with additional data structures (merging std::map or std::set with order statistics). Handle queries that require rollback or persistent structures. |
| 2100 | Apply to non-tree structures (virtual trees, auxiliary graphs). Merge segment trees or Fenwick trees with the small-to-large principle. Prove tight bounds for specific problem constraints. |
Further Reading
- CF Blog: DSU on tree -- the original tutorial by Arpa, covers DSU-on-tree in depth.
- CF Blog: Small to Large -- clean explanation of the set merge variant with examples.
- cp-algorithms: DSU on tree -- reference implementation and complexity proof.
- See: Union-Find DSU -- the DSU size heuristic uses the same analysis.
- See: DFS and Tree DFS -- Euler tour and subtree flattening are prerequisites for DSU-on-tree.