Small-to-Large Merging

Quick Revisit
USE WHEN: merging child data structures into parent during tree DFS (distinct colors, set union on tree)
INVARIANT: always merge smaller set into larger — each element moves O(log n) times total
TIME: O(n log n) total | SPACE: O(n)
CLASSIC BUG: copying the larger set instead of swapping references/pointers — kills the amortization
PRACTICE: 08-ladder-mixed

Always merge the smaller structure into the larger one. A single rule that turns $O (n^{2})$ brute-force into $O (n \log n)$ across trees, sets, and DSU alike. This file is the principle and proof; for the contest-tuned implementation of the heavy-child variant, see DSU on Tree in Chapter 6.

Difficulty: [Intermediate] · CF rating: 1600–1900 Prerequisites: Union-Find DSU, DFS and Tree DFS

Why This Exists

Suppose you have a rooted tree on $n$ nodes. Each node has a color. For every node $v$ , you want to know how many distinct colors appear in the subtree rooted at $v$ .

The naive plan: run a DFS, and at each node maintain a set<int> of colors seen in its subtree. When you return from a child, merge the child's set into the parent's set.

For a balanced tree this is fine -- each merge is small.
But what about a chain?

        1 (red)
        |
        2 (blue)
        |
        3 (red)
        |
        4 (green)
        |
        5 (blue)

Node 5's set has 1 element. Merge into node 4's set: 1 copy. Node 4's set now has 2 elements -- merge into node 3: 2 copies. Node 3 has 3 elements -- merge into node 2: 3 copies. And so on. Total copies: $1 + 2 + 3 + \dots + (n - 1) = O (n^{2})$ .

The fix is embarrassingly simple: always merge the smaller set into the larger one. If the child's set is bigger than the parent's set, swap them first, then pour the smaller into the bigger.

Why does this work? Every time an element gets moved, the set it lands in is at least twice the size of the set it came from (because we only move elements from the smaller side). So each element can be moved at most $O (\log n)$ times before it sits in a set of size $n$ . With $n$ elements each moving $O (\log n)$ times, the total work is $O (n \log n)$ .

This is the exact same argument behind union by size in DSU. When you do union(u, v), you attach the smaller tree under the root of the larger tree. Each node's depth increases by 1 only when its component at least doubles, giving $O (\log n)$ depth. Small-to-large merging IS the DSU size heuristic, generalized to arbitrary data structures.

Small-to-Large vs DSU-on-Tree

Both solve "aggregate data over subtrees" in $O (n \log n)$ . The difference:

Small-to-large (set merging): Explicitly maintain a container per node. Merge smaller into larger during DFS. Works for any mergeable structure (set, map, multiset, even segment trees). Straightforward to implement.
DSU-on-tree (Euler tour trick): Don't maintain per-node containers at all. Instead, process the "heavy child" without clearing, then process light children with add/remove. Uses the heavy-light structure implicitly. Constant factor is often better because you avoid pointer-heavy containers, but the logic is trickier.

Pick set merging when the container semantics matter (e.g., you need a sorted set or a frequency map). Pick DSU-on-tree when you just need aggregate counts and want speed.

The Doubling Argument, Visually

Consider this tree with 8 nodes, each with a color label:

              1(R)
             / \
           2(B)  3(G)
          / \      \
        4(R) 5(B)  6(R)
        |          |
       7(G)       8(B)

Colors: R=red, B=blue, G=green. We want distinct color count per subtree.

DFS processes leaves first. Here is the merge sequence with small-to-large:

Step 1: Leaf nodes get singleton sets.
  node 7: {G}        size 1
  node 5: {B}        size 1
  node 8: {B}        size 1

Step 2: Process node 4. Has child 7.
  node 4 set: {R}    size 1
  node 7 set: {G}    size 1
  Same size -- pick either direction. Merge 7 into 4.
  node 4: {R, G}     size 2     moves: 1

Step 3: Process node 6. Has child 8.
  node 6 set: {R}    size 1
  node 8 set: {B}    size 1
  Merge 8 into 6.
  node 6: {R, B}     size 2     moves: 1

Step 4: Process node 2. Children: 4, 5.
  node 2 set: {B}           size 1
  node 4 set: {R, G}        size 2
  node 5 set: {B}           size 1
  Child 4 is larger than node 2 -- SWAP. node 2 takes {R, G}.
  Merge old node 2 ({B}) into new node 2: {R, G, B}   moves: 1
  Merge node 5 ({B}) into node 2: B already present.   moves: 1
  node 2: {R, G, B}  size 3    total moves this step: 2

Step 5: Process node 3. Has child 6.
  node 3 set: {G}           size 1
  node 6 set: {R, B}        size 2
  Child 6 is larger -- SWAP. node 3 takes {R, B}.
  Merge old node 3 ({G}) into new: {R, B, G}   moves: 1
  node 3: {R, B, G}  size 3

Step 6: Process node 1. Children: 2, 3.
  node 1 set: {R}              size 1
  node 2 set: {R, G, B}        size 3
  node 3 set: {R, B, G}        size 3
  Child 2 is largest -- SWAP. node 1 takes {R, G, B}.
  Merge old node 1 ({R}): already present.   moves: 1
  Merge node 3 ({R, B, G}): all present.     moves: 3
  node 1: {R, G, B}  size 3    total moves: 4

Total element moves across all steps: $1 + 1 + 2 + 1 + 4 = 9$ .

Naive merge (always child into parent) on this tree would do similar work here, but on a worst-case chain of $n$ nodes it does $O (n^{2})$ . Small-to-large guarantees $O (n \log n)$ regardless of tree shape.

The Doubling Picture

Track how many times each element moves:

Element journey (tracking how many times each item is copied):

  G at node 7: created
    -> moved into node 4  (set size 1 -> 2)  .... move #1
    -> stays at node 2 via swap               .... free
    -> stays at node 1 via swap               .... free
  Total moves for this G: 1

  R at node 4: created
    -> stays at node 2 via swap               .... free
    -> stays at node 1 via swap               .... free
  Total moves for this R: 0

  B at node 5: created
    -> moved into node 2  (set size 1 -> 3)  .... move #1
    -> stays at node 1 via swap               .... free
  Total moves for this B: 1

  R at node 6: created
    -> stays at node 3 via swap               .... free
    -> moved into node 1  (set size 3 -> 3)  .... move #1
  Total moves for this R: 1  (duplicate, doesn't increase set)

  B at node 8: created
    -> moved into node 6  (set size 1 -> 2)  .... move #1
    -> stays at node 3 via swap               .... free
    -> moved into node 1  (set size 3 -> 3)  .... move #2
  Total moves for B(8): 2

The key: every time an element moves, the receiving set is at least as large as the source set. After the move, the element is in a set at least twice its previous size. So: max moves per element $= ⌊ \log_{2} n ⌋$ .

Total work = \sum_{elements} O (\log n) = O (n \log n)

C++ STL Tools for Merging

The main containers you'll merge small-to-large:

Container	Insert one element	Merge $k$ elements from smaller
`set<T>`	$O (\log n)$	$O (k \log n)$
`map<K,V>`	$O (\log n)$	$O (k \log n)$
`unordered_set<T>`	$O (1)$ amortized	$O (k)$ amortized
`multiset<T>`	$O (\log n)$	$O (k \log n)$

Critical operations:

swap(a, b) -- $O (1)$ for all STL containers. This is how you "take" the larger child's set without copying. Never skip this.
a.insert(b.begin(), b.end()) -- range insert. For sets/maps.
a.merge(b) -- C++17. Moves nodes from b into a without reallocation. Faster than insert for node-based containers (set, map). After the call, b contains only elements that had duplicates in a.
a.size() -- $O (1)$ . Use to decide which is smaller.

std::set::merge is preferred over manual iteration + insert when available:

cpp

set<int> a = {1, 3, 5};
set<int> b = {2, 3, 4};
a.merge(b);
// a = {1, 2, 3, 4, 5}, b = {3} (duplicate stayed in b)

For small-to-large, you almost always want the duplicate-free version, so merge works perfectly -- leftover duplicates in the smaller set don't matter since you discard it.

Data structure choice matters. set insertion is $O (\log n)$ , giving $O (n \log^{2} n)$ total. unordered_set is $O (1)$ average, giving true $O (n \log n)$ . For sorted access, use set; for pure membership, unordered_set is faster.

Implementation: Set Merging

The Problem

Given a rooted tree with $n$ nodes (rooted at 1), each node has a color $c_{v} \in {1, \dots, n}$ . For each node, output the number of distinct colors in its subtree.

Contest-Ready Version

cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    scanf("%d", &n);
    vector<int> color(n + 1);
    for (int i = 1; i <= n; i++) scanf("%d", &color[i]);

    vector<vector<int>> adj(n + 1);
    for (int i = 0; i < n - 1; i++) {
        int u, v;
        scanf("%d%d", &u, &v);
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    vector<int> ans(n + 1);
    vector<set<int>*> node_set(n + 1);

    // Iterative DFS with post-order processing
    vector<int> order;
    vector<int> par(n + 1, 0);
    vector<bool> visited(n + 1, false);
    stack<int> stk;
    stk.push(1);
    visited[1] = true;
    while (!stk.empty()) {
        int v = stk.top(); stk.pop();
        order.push_back(v);
        for (int u : adj[v]) {
            if (!visited[u]) {
                visited[u] = true;
                par[u] = v;
                stk.push(u);
            }
        }
    }

    // Process in reverse BFS order (leaves first)
    reverse(order.begin(), order.end());
    for (int v : order) {
        node_set[v] = new set<int>();
        node_set[v]->insert(color[v]);
        for (int u : adj[v]) {
            if (u == par[v]) continue;
            // Merge smaller into larger
            if (node_set[u]->size() > node_set[v]->size())
                swap(node_set[v], node_set[u]);
            node_set[v]->merge(*node_set[u]);
            delete node_set[u];
            node_set[u] = nullptr;
        }
        ans[v] = node_set[v]->size();
    }

    for (int i = 1; i <= n; i++)
        printf("%d%c", ans[i], " \n"[i == n]);

    delete node_set[1];
    return 0;
}

Walkthrough

The core loop does three things per node:

Create a set for the node, insert its own color.
For each child, compare set sizes. If the child's set is larger, swap pointers (this is $O (1)$ -- we swap set* pointers, not the sets themselves). Then merge the smaller set into the larger using set::merge.
Record the answer as the set's size.

The swap-pointers trick is essential. Without it, you'd always copy into the parent's set regardless of size, losing the $O (n \log n)$ guarantee.

Memory: we delete child sets after merging to avoid holding $O (n)$ sets simultaneously.

Cleaner Version Using Recursive DFS

If stack depth isn't a concern (tree isn't a chain of $10^{5} +$ nodes):

cpp

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 200005;
vector<int> adj[MAXN];
int color[MAXN], ans[MAXN];
set<int>* dfs_set[MAXN];

void dfs(int v, int p) {
    dfs_set[v] = new set<int>();
    dfs_set[v]->insert(color[v]);
    for (int u : adj[v]) {
        if (u == p) continue;
        dfs(u, v);
        if (dfs_set[u]->size() > dfs_set[v]->size())
            swap(dfs_set[v], dfs_set[u]);
        dfs_set[v]->merge(*dfs_set[u]);
        delete dfs_set[u];
    }
    ans[v] = dfs_set[v]->size();
}

int main() {
    int n;
    scanf("%d", &n);
    for (int i = 1; i <= n; i++) scanf("%d", &color[i]);
    for (int i = 0; i < n - 1; i++) {
        int u, v;
        scanf("%d%d", &u, &v);
        adj[u].push_back(v);
        adj[v].push_back(u);
    }
    dfs(1, 0);
    for (int i = 1; i <= n; i++)
        printf("%d%c", ans[i], " \n"[i == n]);
    delete dfs_set[1];
}

Implementation: DSU-on-Tree (Heavy-Child Variant)

DSU-on-tree (also called "small-to-large on Euler tour" or the "heavy child trick") drops explicit set containers and uses a global array of counters instead. The same amortization argument applies — each node is touched at most $O (\log n)$ times, because every node-to-root path has at most $\log n$ light edges.

The shape of the algorithm:

Find the heavy child (largest subtree) of each node.
DFS into all light children first; undo their contributions.
DFS into the heavy child; keep its contributions.
Add the current node + all light subtrees on top.
Record the answer.
If we ourselves are a light child of our parent, undo everything.

Why this works: the heavy child's data is reused for free — we never copy or clear it. Light children's contributions cost their subtree size, but a node is only ever inside a light subtree on $O (\log n)$ ancestors, so total add/remove work is $O (n \log n)$ .

For the full contest-ready implementation, complexity proof, and standard problem variants, see DSU on Tree (Ch 6). The two files split deliberately: this file owns the principle; that file owns the implementation.

Set Merging vs DSU-on-Tree at a glance

Aspect	Set merging	DSU-on-tree
Container per node	Yes (`set<int>*`)	No (global `cnt[]`)
Memory	$O (n \log n)$ worst case	$O (n)$
Constant factor	Higher (alloc, ptr-heavy)	Lower (array ops)
Flexibility	Any mergeable structure	Needs add/remove ops
Difficulty to code	Easy	Medium

Complexity Reference

Operation	Time	Notes
Small-to-large set merge (total)	$O (n \log^{2} n)$	$O (n \log n)$ merges, each $O (\log n)$ insert
Small-to-large with `unordered_set`	$O (n \log n)$ expected	$O (1)$ insert amortized
Small-to-large with `set::merge`	$O (n \log^{2} n)$ worst	Node transfer avoids realloc
DSU-on-tree	$O (n \log n)$	Each node added/removed $O (\log n)$ times
DSU union by size	$O (n \log n)$ total	Same analysis
Naive merge (always small into big)	$O (n^{2})$ worst case	Chain tree

The extra $\log n$ factor in set merging comes from the $O (\log n)$ cost per set::insert. If you replace set with a hash set, you get $O (n \log n)$ expected. DSU-on-tree avoids this because add/remove are $O (1)$ array operations.

Problem Patterns and Tricks

Pattern 1: Distinct Values in Subtree

The classic application. Maintain set<int> per node, merge small-to-large. Answer is set.size() after merging all children.

Variant: "number of distinct values occurring at least $k$ times." Use map<int,int> (color -> count) instead. Merge by iterating the smaller map and updating counts in the larger.

Pattern 2: Frequency Maps for Subtree Aggregates

For "sum of colors that appear exactly once" or "most frequent color in subtree," maintain a map<int,int> mapping color to frequency, plus auxiliary variables tracking the answer.

cpp

// When merging entry (color c, count k) from smaller into larger:
// 1. Remove old contribution of c from answer
// 2. Update frequency: larger[c] += k
// 3. Add new contribution of c to answer

This pattern handles any query that decomposes over color frequencies.

Pattern 3: DSU-on-Tree for Counting Queries

DSU-on-tree shines when the "aggregate" is a simple counter or sum. Examples:

Count of distinct colors (maintain distinct counter)
Sum of values at nodes with a specific color
Maximum frequency color in subtree

You define add(v) and remove(v) to update global state when node $v$ enters or leaves the current window. The framework handles the rest.

Pattern 4: Choosing Between Techniques

Need sorted order or set operations?         --> set merge
Need just counts / sums?                     --> DSU-on-tree
Need to merge segment trees?                 --> small-to-large on seg trees
Offline, can sort queries?                   --> consider Mo on trees
Need persistent results across subtrees?     --> set merge (keeps data)

Pattern 5: Merging Sorted Structures Beyond Sets

Small-to-large works for any data structure where "merge" means inserting elements one by one. Examples:

Vectors (sorted): Merge smaller sorted vector into larger using binary search insertion. $O (n \log^{2} n)$ total.
Segment trees: Merge two segment trees by walking both simultaneously. With dynamic nodes, merging a tree of size $k$ into a larger one costs $O (k \log C)$ where $C$ is the value range. Total: $O (n \log n \log C)$ .
Fenwick trees / BITs: Less natural to merge -- consider DSU-on-tree with a global BIT instead.

Pattern 6: Weighted DSU (Edge-Weight Aggregation)

When edges have weights and you need subtree queries involving both structure and values, combine DSU union-by-size with weight tracking. The small-to-large principle ensures each element's weight is updated at most $O (\log n)$ times.

Gotchas and Debugging

Swap Pointers, Not Contents

The most common bug: copying the larger set into a new set instead of swapping.

cpp

// WRONG: O(n^2) -- copies the big set every time
for (int x : *child_set) parent_set->insert(x);

// RIGHT: O(1) swap, then merge small into big
if (child_set->size() > parent_set->size())
    swap(parent_set, child_set);
parent_set->merge(*child_set);

If you're using set<int> node_set[MAXN] (by value, not pointer), then swap does swap contents in $O (1)$ for STL containers -- but only if you swap the actual container objects, not copies.

Forgetting to Merge Into the Correct Side

With pointer-based sets, after swap(node_set[v], node_set[u]), the parent v now holds what was u's set. Make sure subsequent children merge into node_set[v], not some stale reference.

Root Handling

The root's answer is just its final set size. But if you're tracking something beyond size (like sum of frequencies), make sure the root's own contribution is included before reading the answer. A common mistake: inserting the node's own color after merging children, then forgetting to do it for the root.

Fix: always insert the node's own color first, before processing children.

Off-by-One in Subtree Sizes (DSU-on-Tree)

In DSU-on-tree, the Euler tour range for node $v$ 's subtree is $[tin [v], tout [v])$ -- a half-open interval. If you use a closed interval, you'll either miss one node or process an extra node from a sibling's subtree.

cpp

// Correct: half-open [tin[v], tout[v])
for (int t = tin[u]; t < tout[u]; t++) add(euler[t]);

// WRONG: closed interval includes one node too many
for (int t = tin[u]; t <= tout[u]; t++) add(euler[t]);

Memory Leaks in Contest

Allocating new set<int> per node and forgetting to delete after merge won't cause WA, but can cause MLE on tight problems. Either delete immediately after merge or use an arena/pool allocator.

Confusing Heavy Child with Largest Set

In set merging, "heavy" means the child with the largest set after its subtree is processed -- not the child with the largest subtree size. These differ when colors repeat. For small-to-large set merge, compare set.size(), not subtree_size. For DSU-on-tree, compare subtree_size (since we don't have explicit sets).

Mental traps

"I'll merge in any order — it's the same idea." No. The direction is the entire basis of the complexity guarantee. Repeatedly merging a set of size $n / 2$ into a set of size 1 gives $O (n^{2})$ .

"I don't need the proof, just the rule." Knowing the proof tells you exactly when the technique applies. The invariant is "after each move, the containing set is at least twice as large." If you break the rule, the doubling fails and the bound fails.

"This is only a tree-DFS technique." The DFS is the most common application, but the principle applies anywhere you repeatedly merge two groups: offline graph problems, set partitioning, weighted union-find with set contents.

"DSU-on-tree is the same as small-to-large." Related but not identical. DSU-on-tree inherits the heavy child's data without copying (an optimization); explicit small-to-large merges elements one by one. DSU-on-tree uses the heavy-light structure implicitly.

Destroy small buckets after merging

A subtle TLE pattern: in Lomsat gelral (CF 600E) with std::map<int,int>, after merging a child's map into the parent, forgetting to clear() the child's map leaves it alive. Later, when the parent is merged upward, the ancestor sees a "small" bucket that is actually still full. The amortization shatters because the size-comparison lies. Rule: pour the small bucket into the big one, then destroy the small bucket.

Debug Drills

Drill 1: Merging sets on a tree -- but the answer is wrong for the root.

cpp

void dfs(int v, int p) {
    sets[v].insert(color[v]);
    for (int u : adj[v]) {
        if (u == p) continue;
        dfs(u, v);
        if (sets[v].size() < sets[u].size())
            swap(sets[v], sets[u]);
        for (int x : sets[v])   // <-- look here
            sets[v].insert(x);
    }
    ans[v] = sets[v].size();
}

What's wrong?

The merge loop iterates over sets[v] (the larger set) and inserts into sets[v] -- it should iterate over sets[u] (the smaller set) and insert into sets[v]. Iterating and inserting into the same container is also undefined behavior for std::set. Fix: for (int x : sets[u]) sets[v].insert(x);

Drill 2: DSU union-by-size -- but find is slow on some inputs.

cpp

int par[MAXN], sz[MAXN];
int find(int x) {
    while (par[x] != x) x = par[x];
    return x;
}
void unite(int a, int b) {
    a = find(a); b = find(b);
    if (a == b) return;
    if (sz[a] < sz[b]) swap(a, b);
    par[b] = a;
    // sz[a] += sz[b];   <-- missing line
}

What's wrong?

After attaching b under a, the code never updates sz[a]. So sz[a] stays at 1, and future unions always think every component is size 1 -- effectively producing a random tree shape instead of a balanced one. Worst-case find becomes O(n). Fix: add sz[a] += sz[b]; after par[b] = a;.

Drill 3: Small-to-large merging of std::map for frequency counts -- correct logic but TLE.

cpp

void merge_into(map<int,long long>& big, map<int,long long>& small) {
    for (auto& [key, val] : small)
        big[key] += val;
    small.clear();
}

void dfs(int v, int p) {
    freq[v][color[v]] = 1;
    for (int u : adj[v]) {
        if (u == p) continue;
        dfs(u, v);
        merge_into(freq[v], freq[u]);  // always merge u into v
    }
}

What's wrong?

The code always merges the child into the parent without checking sizes. If the child's map is larger than the parent's, you should swap them first. Without the swap, a chain-shaped tree degrades to O(n²). Fix: before merge_into, add if (freq[v].size() < freq[u].size()) swap(freq[v], freq[u]); to ensure you always iterate over the smaller map.

Self-Test

Drawn from reflection notes

Without opening any reference:

[ ] Prove the $O (n \log n)$ complexity of small-to-large merging in 3-4 sentences. If you can't, you don't own this technique yet.
[ ] Write the small-to-large DFS template from memory -- inherit the largest child's set, merge smaller children into it. Include the swap step.
[ ] Explain why std::swap on two std::set objects is $O (1)$ -- what does swap actually do internally? Why is this critical for the technique?
[ ] Identify a problem that looks like small-to-large but isn't -- where the set operation is not a union, or where the "small into large" direction doesn't apply. What goes wrong?
[ ] State the time complexity for small-to-large using std::set ( $O (\log n)$ insert) vs std::unordered_set ( $O (1)$ average insert). What is the total in each case? When would you prefer one over the other?

Practice Problems

Problem	Source	Difficulty	Technique	Link
Distinct Colors	CSES	1600	Set merge / DSU-on-tree	cses.fi/problemset/task/1139
Lomsat gelral	CF 600E	1900	DSU-on-tree (most frequent color)	codeforces.com/contest/600/problem/E
Arpa's letter-marked tree	CF 741D	2100	DSU-on-tree + bitmask	codeforces.com/contest/741/problem/D
Blood Cousins Return	CF 246E	1800	Set merge for distinct names at depth	codeforces.com/contest/246/problem/E
Tree Queries	AtCoder ABC 202 E	1700	Euler tour + counting (DSU-on-tree variant)	atcoder.jp/contests/abc202/tasks/abc202_e
Propagating tree	CF 383C	1700	Euler tour with range updates	codeforces.com/contest/383/problem/C
Count on a tree II	SPOJ COT2	2000	Mo on trees (alternative technique to compare)	spoj.com/problems/COT2

Start with CSES Distinct Colors (pure set merge). Then CF 600E for DSU-on-tree. CF 741D is a great "next step" that combines the technique with bitmask tricks.

Rating progression:

CF Rating	What to Master
1200	Understand why naïve merging of sets is O(n²). Implement basic DSU with union by size. Know that `swap` is O(1) for STL containers.
1500	Implement DSU-on-tree (Euler tour + heavy child reuse). Solve "count distinct colors in subtree" type problems. Understand the log n amortized bound.
1800	Combine small-to-large with additional data structures (merging `std::map` or `std::set` with order statistics). Handle queries that require rollback or persistent structures.
2100	Apply to non-tree structures (virtual trees, auxiliary graphs). Merge segment trees or Fenwick trees with the small-to-large principle. Prove tight bounds for specific problem constraints.

Small-to-Large Merging ​

Why This Exists ​

Small-to-Large vs DSU-on-Tree ​

The Doubling Argument, Visually ​

The Doubling Picture ​

C++ STL Tools for Merging ​

Implementation: Set Merging ​

The Problem ​

Contest-Ready Version ​

Walkthrough ​

Cleaner Version Using Recursive DFS ​

Implementation: DSU-on-Tree (Heavy-Child Variant) ​

Set Merging vs DSU-on-Tree at a glance ​

Complexity Reference ​

Problem Patterns and Tricks ​

Pattern 1: Distinct Values in Subtree ​

Pattern 2: Frequency Maps for Subtree Aggregates ​

Pattern 3: DSU-on-Tree for Counting Queries ​

Pattern 4: Choosing Between Techniques ​

Pattern 5: Merging Sorted Structures Beyond Sets ​

Pattern 6: Weighted DSU (Edge-Weight Aggregation) ​

Gotchas and Debugging ​

Swap Pointers, Not Contents ​

Forgetting to Merge Into the Correct Side ​

Root Handling ​

Off-by-One in Subtree Sizes (DSU-on-Tree) ​

Memory Leaks in Contest ​

Confusing Heavy Child with Largest Set ​

Mental traps ​

Destroy small buckets after merging ​

Debug Drills ​

Self-Test ​

Practice Problems ​

Further Reading ​

Small-to-Large Merging

Why This Exists

Small-to-Large vs DSU-on-Tree

The Doubling Argument, Visually

The Doubling Picture

C++ STL Tools for Merging

Implementation: Set Merging

The Problem

Contest-Ready Version

Walkthrough

Cleaner Version Using Recursive DFS

Implementation: DSU-on-Tree (Heavy-Child Variant)

Set Merging vs DSU-on-Tree at a glance

Complexity Reference

Problem Patterns and Tricks

Pattern 1: Distinct Values in Subtree

Pattern 2: Frequency Maps for Subtree Aggregates

Pattern 3: DSU-on-Tree for Counting Queries

Pattern 4: Choosing Between Techniques

Pattern 5: Merging Sorted Structures Beyond Sets

Pattern 6: Weighted DSU (Edge-Weight Aggregation)

Gotchas and Debugging

Swap Pointers, Not Contents

Forgetting to Merge Into the Correct Side

Root Handling

Off-by-One in Subtree Sizes (DSU-on-Tree)

Memory Leaks in Contest

Confusing Heavy Child with Largest Set

Mental traps

Destroy small buckets after merging

Debug Drills

Self-Test

Practice Problems

Further Reading