Skip to content

Segment Tree Merging

Merge two dynamic (pointer-based) segment trees into one in time proportional to their overlap. That sounds niche until you attach it to DFS on a tree: each vertex keeps a segment tree for its subtree, and merging children into the parent accumulates information without ever rebuilding from scratch. The total work across all merges is O(nlogn)—far less than the naive O(n2).

Difficulty: [Advanced]Prerequisites: Segment Tree, DFS & Tree DFSCF Rating Range: 2000–2400+

Contest Frequency: Specialized—appears in advanced tree problems


The Core Idea

A dynamic segment tree doesn't pre-allocate all 4n nodes—it creates nodes only when needed, using pointers or an index-based pool. A tree covering range [0,105) might hold just 20 nodes if only one value has been inserted.

Merging two such trees means walking them simultaneously. At each position:

  • If only one tree has a node, take it immediately (pointer steal, O(1)).
  • If both have nodes, combine their values and recurse into children.

The cost is O(number of shared nodes). Since each node is destroyed when merged—absorbed into the result tree—every node participates in at most one merge. Across all merges in a DFS, each of the n original insertions contributes O(logV) nodes, giving O(nlogV) total work.

text
  Tree A (values at positions 1, 4):     Tree B (values at positions 3, 4):

       [0,8)                                  [0,8)
      /     \                                /     \
   [0,4)   [4,8)                          [0,4)   [4,8)
   /          \                              \
  [0,2)      [4,6)                          [2,4)
     \          \                              \
    [1,2)*    [4,5)*                          [3,4)*   (and [4,5)* merged)

  Merge A into B:
  - At [0,8): both exist -> recurse
  - At [0,4): both exist -> recurse
    - At [0,2): only A has it -> steal pointer, done
    - At [2,4): only B has it -> keep, done
  - At [4,8): both exist -> recurse
    - At [4,6): both have [4,5) -> combine values at leaf

  Result: single tree with values at {1, 3, 4}
  Work: 7 node visits (not 16 for a full tree)

Dynamic Segment Tree with Pool Allocation

Contest-friendly: no new/delete—just array indices.

cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXNODES = 20'000'000;  // tune to problem
int lc[MAXNODES], rc[MAXNODES], cnt[MAXNODES];
int pool = 0;

int newNode() {
    int id = ++pool;
    lc[id] = rc[id] = cnt[id] = 0;
    return id;
}

void update(int& v, int lo, int hi, int pos, int val) {
    if (!v) v = newNode();
    if (hi - lo == 1) { cnt[v] += val; return; }
    int mid = (lo + hi) / 2;
    if (pos < mid) update(lc[v], lo, mid, pos, val);
    else           update(rc[v], mid, hi, pos, val);
    cnt[v] = cnt[lc[v]] + cnt[rc[v]];
}

int query(int v, int lo, int hi, int ql, int qr) {
    if (!v || ql >= hi || qr <= lo) return 0;
    if (ql <= lo && hi <= qr) return cnt[v];
    int mid = (lo + hi) / 2;
    return query(lc[v], lo, mid, ql, qr)
         + query(rc[v], mid, hi, ql, qr);
}

int merge(int a, int b, int lo, int hi) {
    if (!a) return b;
    if (!b) return a;
    if (hi - lo == 1) {
        cnt[a] += cnt[b];
        return a;
    }
    int mid = (lo + hi) / 2;
    lc[a] = merge(lc[a], lc[b], lo, mid);
    rc[a] = merge(rc[a], rc[b], mid, hi);
    cnt[a] = cnt[lc[a]] + cnt[rc[a]];
    return a;
}

Typical Usage: Subtree Queries via DFS

Give each vertex its own segment tree and insert the vertex's value. After recursing into all children, merge their trees up into the current vertex. When DFS returns, the merged tree holds exactly the values in that vertex's subtree—available for any query you need: count distinct, kth smallest, range sum, and so on.

cpp
int root[MAXN];  // root[v] = seg tree root for vertex v

void dfs(int v, int par) {
    root[v] = 0;
    update(root[v], 0, MAXV, value[v], 1);
    for (int u : adj[v]) {
        if (u == par) continue;
        dfs(u, v);
        root[v] = merge(root[v], root[u], 0, MAXV);
    }
    // Now root[v] contains all values in subtree of v
    // Answer queries about v here (e.g., count distinct, kth element)
}

Why the Total Cost Is O(nlogn)

Each element appears in at most one initial tree, contributing O(logn) nodes—one per level. The total node count across all initial trees is therefore O(nlogn).

When two trees merge, the work is proportional to the number of positions where both trees have a node. Crucially, a merge destroys nodes: two overlapping nodes collapse into one. Since each node can be destroyed at most once, the total work across all merge operations is bounded by that initial node count—O(nlogn).

Key insight: merging never creates new nodes; it only consumes them. The cost is fully pre-paid by the insertions that built the trees.

Complexity

OperationTime
Insert one valueO(logV)
Merge two treesO(shared nodes)
All merges in DFSO(nlogV) amortized
Query on merged treeO(logV)

Space: O(nlogV) nodes total. Set MAXNODES accordingly—typically n20 to n40.

Segment Tree Merge vs. DSU on Tree

Both techniques aggregate information over subtrees, but the right choice depends on what the aggregation needs to do:

Seg tree mergeDSU on tree
FlexibilityCan answer any query on the merged structure (kth element, range count, etc.)Limited to operations that support add/remove
ImplementationMore complex, dynamic allocationSimpler, uses arrays
Constant factorHeavier (pointer chasing)Lighter
Best for"Kth smallest in subtree," range queries"Count distinct," frequency histograms

Pitfalls

  • Memory pool too small. Each insertion creates O(logV) nodes, so n insertions need roughly nlogV total. A pool of 20n is usually safe; bump to 40n if V is large.
  • Merging destroys tree B. After merge(a, b), tree b is invalid—its nodes have been absorbed into a. Don't query b afterward.
  • Not resetting pool between test cases. If the problem has multiple test cases, reset pool = 0 and re-zero the arrays or they'll carry over stale data.
  • Stack overflow on deep recursion. Both the DFS and the segment tree recurse. For n=2×105 with a chain tree, you need a large stack (ulimit -s unlimited or an iterative DFS).

Practice Problems

#ProblemDifficultyNotes
1CF 600E -- Lomsat gelral2300Dominant color in subtree (also solvable with DSU on tree)
2CF 208E -- Blood Cousins2100Count nodes at depth k in subtree
3CF 1009F -- Dominant Indices2300DSU-on-tree alternative via merge
4CF 1903F -- Baby Ehab Plays with Ranges2400Merge-based offline approach

Further Reading

Built for the climb from Codeforces 1100 to 2100.