Persistent Segment Tree

Quick Revisit
USE WHEN: Querying historical versions of an array, or finding k-th smallest element in any subarray
INVARIANT: Each update allocates O(log n) NEW nodes along the affected root-to-leaf path; all other nodes are SHARED with the previous version via pointer (copy-on-write)
TIME: O(log n) per update/query | SPACE: O(n log n) total across all versions
CLASSIC BUG: Reusing node storage (array-based pool) without allocating new nodes on update — silently corrupts previous versions
PRACTICE: 05-ladder-graphs

Maintain multiple versions of a segment tree where each update allocates $O (\log n)$ new nodes along one root-to-leaf path and shares every other subtree with the previous version (copy-on-write) -- answer historical queries and find the $k$ -th smallest in any subarray in $O (\log n)$ .

Difficulty: [Advanced] -- CF 1900-2100
Prerequisites: Segment Tree, Coordinate Compression
ID: DS-14

Why is this in the graph chapter? Persistent segment trees are a data structure, not a graph algorithm, but they show up here because the most common contest applications layer them on top of tree decompositions: persistent segment trees keyed by Euler-tour position give online $k$ -th-smallest-on-path queries, persistent BITs combined with HLD give versioned path aggregates, and persistent arrays are the standard tool for offline-to-online conversions on tree problems. The technique deserves its own lesson alongside the tree-query decompositions because almost every advanced application of it is on a tree.

Intuition

You have an array and receive queries of the form: "What is the $k$ -th smallest element in the subarray $a [l . . r]$ ?"

a = [3, 1, 4, 1, 5, 9]    (indices 1..6)

Query: 2nd smallest in a[2..5] = {1, 4, 1, 5} --> answer is 1
Query: 3rd smallest in a[1..4] = {3, 1, 4, 1} --> answer is 3

Naive approach: For each query, extract the subarray, sort it, return the $k$ -th element. That is $O (n \log n)$ per query.

Approach	Per query	$Q$ queries on $n = 2 \times 10^{5}$
Sort subarray	$O (n \log n)$	$\approx 3.4 \times 10^{10}$ -- TLE
Merge sort tree + binary search	$O (\log^{3} n)$	$\approx 7.8 \times 10^{7}$ -- OK but ugly
Persistent segment tree	$O (\log n)$	$\approx 3.4 \times 10^{6}$ -- fast and clean

The merge sort tree approach works but requires binary search over the answer plus a count query per candidate, giving an extra $\log^{2}$ factor. We want $O (\log n)$ per query, and we want to handle it online (each query can depend on the previous answer).

We need a way to "subtract" the frequency information of prefix $[1. . l - 1]$ from prefix $[1. . r]$ -- like prefix sums, but over entire segment trees.

"Build a segment tree over values for each prefix of the array; instead of storing $n$ independent trees ( $O (n^{2})$ memory), share unchanged subtrees between consecutive versions -- each new version costs only $O (\log n)$ new nodes."

Think of it like version control for files. When you commit a change to one file in a repository, Git does not copy every file -- it reuses the unchanged blobs and only stores the diff. A persistent segment tree does the same: when one path from root to leaf changes, all other subtrees are shared via pointers.

The analogy is nearly exact: each "commit" (version) is a new root pointer, the "files" are leaf nodes, and "directories" are internal nodes. The only place it breaks down is that Git uses content-addressable hashing while we use direct pointers.

Visual Walkthrough

We build a segment tree over a value range $[1. .6]$ (the distinct sorted values). We insert elements of a one at a time, creating a new version after each insertion.

Compressed values (for simplicity, values happen to already be in $[1. .6]$ ):

a = [3, 1, 4, 1, 5, 9]  -->  compressed: [2, 1, 3, 1, 4, 5]
sorted distinct: {1, 3, 4, 5, 9}  -->  mapped to {1, 2, 3, 4, 5}
  original:  1 -> 1,  3 -> 2,  4 -> 3,  5 -> 4,  9 -> 5

Value range segment tree has nodes covering $[1. .5]$ . Each node stores a count of how many inserted elements fall in that value range.

Version 0 (empty tree):

              root0
             [1..5]=0
            /        \
       [1..3]=0    [4..5]=0
       /    \       /    \
   [1..2]=0 [3]=0 [4]=0 [5]=0
    /   \
  [1]=0 [2]=0

Step 1: Insert compressed value 2 (original 3). Create version 1.

Only the path containing value 2 changes: root -> [1..3] -> [1..2] -> [2]. We create 4 new nodes (marked *). All other subtrees are shared.

    root0                       root1*
   [1..5]=0                    [1..5]=1*
   /      \                    /        \
[1..3]=0  [4..5]=0        [1..3]=1*   [4..5]=0  <-- shared!
 /    \                    /      \
[1..2]=0 [3]=0         [1..2]=1*  [3]=0  <-- shared!
 /   \                  /     \
[1]=0 [2]=0          [1]=0   [2]=1*
        ^-- shared!    ^-- shared!

Step 2: Insert compressed value 1 (original 1). Create version 2.

    root1                       root2*
   [1..5]=1                    [1..5]=2*
   /      \                    /        \
[1..3]=1  [4..5]=0        [1..3]=2*   [4..5]=0  <-- shared!
 /    \                    /      \
[1..2]=1 [3]=0         [1..2]=2*  [3]=0  <-- shared!
 /   \                  /     \
[1]=0 [2]=1          [1]=1*  [2]=1
                               ^-- shared from v1!

Memory growth after all 6 insertions:

  Version:    v0    v1    v2    v3    v4    v5    v6
  New nodes:   8     4     4     4     4     4     4
  Total:       8    12    16    20    24    28    32

  +--+--+--+--+--+--+--+
  |  v0 base tree (8)   |  = O(m) initial nodes
  +--+--+--+--+--+--+--+
  |v1|v2|v3|v4|v5|v6|   |  = 6 x O(log m) new nodes
  +--+--+--+--+--+--+   |
  |  shared subtrees     |  = most of the pool
  +----------------------+
  
  Total pool: O(m + n log m) ~= 8 + 6x4 = 32 nodes
  (vs 8x7 = 56 if we copied whole tree each time)

NODE POOL LAYOUT -- persistent segment tree memory:

  Pool index:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 ...
               ├──── v0 base tree ────┤├─ v1 ─┤├─ v2 ─┤├─ v3 ─┤
               │  8 nodes (full tree) │ 4 new  │ 4 new │ 4 new │
               └──────────────────────┘└───────┘└──────┘└──────┘

  roots[0] = 1     (root of base tree)
  roots[1] = 9     (root of version 1, shares nodes 2-8)
  roots[2] = 13    (root of version 2, shares nodes from v0 and v1)
  roots[3] = 17    (root of version 3)

  Each new version adds O(log m) nodes to the pool.
  Total after n updates: 8 + 4n nodes (for m=8 values).
  
  ┌──────────────────────────────────────────────────┐
  │ NEVER modify pool[1..8] -- they belong to v0.    │
  │ NEVER modify pool[9..12] -- they belong to v1.   │
  │ Old versions are IMMUTABLE.                      │
  └──────────────────────────────────────────────────┘

After inserting all 6 elements we have versions $v_{0}, v_{1}, \dots, v_{6}$ .

Answering: "2nd smallest in $a [2. .5]$ "

Prefix $[1. .5]$ is version 5. Prefix $[1. .1]$ is version 1. The "subarray tree" is $v_{5} - v_{1}$ (count differences).

Walk both trees in parallel from root:

Step 1: At root [1..5]
  left_count = v5.left.count - v1.left.count
             = 3 - 1 = 2
  k = 2.  Since k <= left_count (2 <= 2), go LEFT.

Step 2: At [1..3]
  left_count = v5.[1..2].count - v1.[1..2].count
             = 2 - 1 = 1
  k = 2.  Since k > left_count (2 > 1), go RIGHT. k = k - 1 = 1.

Step 3: At [3..3] -- leaf!
  Value 3 in compressed space = original value 4.
  Answer: 4.

Verify: $a [2. .5] = {1, 4, 1, 5}$ , sorted = ${1, 1, 4, 5}$ , 2nd smallest $= 1$ .

Wait -- let me recheck the compression. $a [2. .5]$ with 1-indexed is elements $a [2] = 1, a [3] = 4, a [4] = 1, a [5] = 5$ , sorted: ${1, 1, 4, 5}$ , $k = 2$ gives $1$ .

Let me redo the walk carefully. Versions: $v_{1}$ has element $a [1] = 3$ inserted, $v_{5}$ has elements $a [1. .5] = {3, 1, 4, 1, 5}$ inserted. The difference $v_{5} - v_{1}$ represents elements $a [2. .5] = {1, 4, 1, 5}$ in compressed space ${1, 1, 3, 4}$ .

Step 1: root [1..5]
  left_count = v5.[1..3].cnt - v1.[1..3].cnt = 3 - 1 = 2
  k = 2 <= 2, go LEFT into [1..3].

Step 2: node [1..3]
  left_count = v5.[1..2].cnt - v1.[1..2].cnt = 2 - 1 = 1
  k = 2 > 1, go RIGHT into [3..3]. k = 2 - 1 = 1.

Step 3: leaf [3..3]
  Compressed value 3 maps to original value 4.

Sorted subarray ${1, 1, 4, 5}$ , $k = 2$ is $1$ , not $4$ . The discrepancy is because $v_{1}$ already contains one occurrence of compressed value 1 (the element $a [1] = 3$ maps to compressed 2, not 1). Let me recount.

After inserting $a [1] = 3$ (compressed 2): $v_{1}$ .[1..2].cnt $= 1$ , specifically [2].cnt $= 1$ .

After inserting $a [1. .5] = {3, 1, 4, 1, 5}$ : compressed ${2, 1, 3, 1, 4}$ . $v_{5}$ .[1..2].cnt $= 3$ (two 1s and one 2). So left_count $= 3 - 1 = 2$ . At [1..3]: $v_{5}$ .[1..2].cnt $- v_{1}$ .[1..2].cnt $= 3 - 1 = 2$ . $k = 2 \leq 2$ , go left. At [1..2]: left_count = $v_{5}$ .[1].cnt $- v_{1}$ .[1].cnt $= 2 - 0 = 2$ . $k = 2 \leq 2$ , go left to leaf [1]. Compressed 1 = original 1. Answer: 1. OK

VERSION COMPARISON -- "subarray tree" via subtraction:

  Array a = [3, 1, 4, 1, 5],  values compressed to [1..5]

  Version 2 (prefix [1..2]):    Version 5 (prefix [1..5]):
       [1..5]=2                      [1..5]=5
       /      \                      /      \
    [1..3]=2  [4..5]=0          [1..3]=3  [4..5]=2
     /    \                      /    \     /    \
  [1..2]=2 [3]=0             [1..2]=2 [3]=1 [4]=1 [5]=1

  Difference (subarray [3..5] = {4, 1, 5}):
       [1..5]: 5-2 = 3
       /            \
    [1..3]: 3-2=1  [4..5]: 2-0=2
     /    \          /    \
  [1..2]:0  [3]:1  [4]:1  [5]:1

  k=2 query on [3..5]: left_count=1, k=2>1 -> go right
    at [4..5]: left_count([4])=1, k=1<=1 -> go left
    leaf [4] -> original value 5.  Sorted {1,4,5}, 2nd = 4... 
    
  (Verify by hand: subarray [3..5] = {4,1,5}, sorted = {1,4,5}, k=2 -> 4 OK)

Brute force: sort subarray ( $O (n \log n)$ per query). Persistent tree: $O (\log n)$ per query. For $n = 2 \times 10^{5}$ , that is $\sim 17$ steps vs $\sim 3.4 \times 10^{6}$ .

The Invariant

+---------------------------------------------------------------+
| INVARIANT: Version i of the persistent segment tree equals     |
| the frequency-count tree built from {a[1], a[2], ..., a[i]}.  |
| Versions i and i-1 share all nodes EXCEPT the O(log n) nodes  |
| on the root-to-leaf path for the inserted value.               |
| Therefore: cnt_in_range(l, r, val_range) =                    |
|   version[r].query(val_range) - version[l-1].query(val_range) |
+---------------------------------------------------------------+

This is the "prefix sum over segment trees" idea. Just as prefix[r] - prefix[l-1] gives the sum of a subarray, differencing two versions gives the frequency counts for any contiguous subarray. The k-th smallest walk exploits this by choosing left/right at each level based on the difference count.

Why is the invariant maintained? Each insertion adds exactly one occurrence of a value, updating only the root-to-leaf path for that value's position. All sibling subtrees are shared by pointer, so they carry identical counts. No node from a previous version is ever mutated.

When to Reach for This

Trigger phrases in problem statements:

" $k$ -th smallest in subarray $[l, r]$ " -> persistent segment tree
"answer queries about version $i$ of the data structure" -> persistence
"undo the last update" or "time-travel queries" -> persistence
" $n, q \leq 2 \times 10^{5}$ , queries online" + "order statistics on subarrays"
"number of distinct values in range less than $x$ " -> persistence on values

Counter-examples (looks like persistence but isn't):

"k-th smallest in the whole array with updates" -- that is an order-statistics tree or a BIT with binary search. No need for persistence.
" $k$ -th smallest in subarrays, but queries are offline and $n \leq 10^{5}$ " -- Mo's algorithm with an order-statistics structure may suffice and is simpler to implement.
"Range update + range query" -- that is a lazy segment tree, not persistence. Persistence and lazy propagation interact badly (see Gotchas).

What the Code Won't Teach You

Persistence is a mindset, not a data structure. The code creates new nodes and shares old subtrees -- that's the mechanism. The insight is deeper: any tree-shaped data structure can be made persistent by copying only the O(log n) nodes on the root-to-update path. This same principle applies to persistent tries, persistent balanced BSTs, and persistent heaps. Once you see the pattern -- "copy the path, share the rest" -- you'll recognize persistence opportunities everywhere.

  THE "COPY THE PATH" PRINCIPLE:

  Version i:          Version i+1:
      R                   R'  <- new root
     / \                 / \
    A   B               A'  B  <- B shared!
   / \                 / \
  C   D               C   D' <- C shared!
                            ^
                      only D changed -> new D'

  New nodes: R', A', D'  (O(log n) = 3 nodes)
  Shared:    B, C         (O(n) nodes reused)
  Old version i still works through root R

The prefix-sum analogy is the entire algorithm. Just as prefix[r] - prefix[l-1] gives you a subarray sum, version[r] - version[l-1] gives you frequency counts for any subarray. If you understand prefix sums, you understand persistent segment trees -- the only new idea is that each "prefix" is an entire segment tree rather than a single number.

Memory is the real enemy, not time. Each update creates $O (\log n)$ new nodes. After $n$ updates, you have $O (n \log n)$ total nodes. At 12 bytes per node (left, right, count), that's ~43 MB for $n = 2 \times 10^{5}$ . If the ML is 256 MB, you're fine. If it's 64 MB, you're in trouble. Always compute your memory budget before coding.

Persistence on trees opens up path queries. When your data lives on a tree (not an array), build persistent segment trees along the DFS: the version at node $v$ is the parent's version updated with $v$ 's value. Then the segment tree at any node contains the frequency counts for the root-to-node path. Combined with LCA, you can answer "k-th smallest on the path $u \to v$ " using four root pointers: ${cnt}_{u} + {cnt}_{v} - {cnt}_{lca} - {cnt}_{parent(lca)}$ .

PERSISTENCE ON TREES -- path query structure:

  Tree:            Persistent seg trees:
       1               v1 = insert(empty, val[1])
      / \              v2 = insert(v1, val[2])
     2   3             v3 = insert(v1, val[3])
    / \                v4 = insert(v2, val[4])
   4   5               v5 = insert(v2, val[5])

  Query: k-th smallest on path 4->3
    LCA(4,3) = 1,  parent(LCA) = none (root)

    Count in [val_range] on path 4->3:
      = cnt(v4) + cnt(v3) - cnt(v1) - cnt(v_parent(1))
      = cnt(v4) + cnt(v3) - cnt(v1) - 0

  Same prefix-subtraction idea, but on a tree!

C++ STL Reference

No STL container directly implements persistence. We build from scratch.

Component	Usage	Notes
`new` / memory pool	Allocate nodes dynamically	Pool allocation is 3-5x faster than `new`
`vector<int>`	Store root pointers per version	`roots[i]` = root of version $i$
`sort` + `unique` + `lower_bound`	Coordinate compression	Map values to $[0, m)$
`<algorithm>`	`lower_bound` for decompression	Recover original value from compressed

Implementation

Minimal Contest Template

cpp

#include <bits/stdc++.h>
using namespace std;

const int MAXN = 200'005;
const int MAXNODES = MAXN * 20;  // n * ~20 nodes (n log n)

int lc[MAXNODES], rc[MAXNODES], cnt[MAXNODES];
int tot = 0;
int roots[MAXN];

int build(int l, int r) {
    int p = ++tot;
    cnt[p] = 0;
    if (l == r) return p;
    int m = (l + r) / 2;
    lc[p] = build(l, m);
    rc[p] = build(m + 1, r);
    return p;
}

int update(int prev, int l, int r, int pos) {
    int p = ++tot;
    lc[p] = lc[prev]; rc[p] = rc[prev]; cnt[p] = cnt[prev] + 1;
    if (l == r) return p;
    int m = (l + r) / 2;
    if (pos <= m) lc[p] = update(lc[prev], l, m, pos);
    else          rc[p] = update(rc[prev], m + 1, r, pos);
    return p;
}

int query(int u, int v, int l, int r, int k) {
    if (l == r) return l;
    int m = (l + r) / 2;
    int d = cnt[lc[v]] - cnt[lc[u]];
    if (k <= d) return query(lc[u], lc[v], l, m, k);
    else        return query(rc[u], rc[v], m + 1, r, k - d);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, q;
    cin >> n >> q;
    vector<int> a(n), sorted_a;
    for (int i = 0; i < n; i++) { cin >> a[i]; sorted_a.push_back(a[i]); }

    sort(sorted_a.begin(), sorted_a.end());
    sorted_a.erase(unique(sorted_a.begin(), sorted_a.end()), sorted_a.end());
    int m = sorted_a.size();

    auto compress = [&](int x) {
        return int(lower_bound(sorted_a.begin(), sorted_a.end(), x) - sorted_a.begin());
    };

    roots[0] = build(0, m - 1);
    for (int i = 0; i < n; i++)
        roots[i + 1] = update(roots[i], 0, m - 1, compress(a[i]));

    while (q--) {
        int l, r, k;
        cin >> l >> r >> k;  // 1-indexed
        int idx = query(roots[l - 1], roots[r], 0, m - 1, k);
        cout << sorted_a[idx] << '\n';
    }
    return 0;
}

Explained Version

cpp

#include <bits/stdc++.h>
using namespace std;

// --- Memory pool for persistent nodes ---
// Each version creates O(log m) new nodes. Total nodes: O(n log m).
// With n = 2*10^5 and m = 2*10^5, log m ~ 18, so ~3.6*10^6 nodes.
// We allocate 20 * MAXN to be safe.
const int MAXN = 200'005;
const int MAXNODES = MAXN * 20;

int lc[MAXNODES];   // left child index
int rc[MAXNODES];   // right child index
int cnt[MAXNODES];  // count of values in this subtree
int tot = 0;         // next free node index

// roots[i] = root node index of version i.
// Version 0 = empty tree.
// Version i = tree after inserting a[0], a[1], ..., a[i-1].
int roots[MAXN];

// Build the initial empty tree over value range [l, r].
// Every node has count 0. Returns the root of this subtree.
int build(int l, int r) {
    int p = ++tot;
    cnt[p] = 0;
    if (l == r) return p;
    int m = (l + r) / 2;
    lc[p] = build(l, m);
    rc[p] = build(m + 1, r);
    return p;
}

// Insert one occurrence of value at position `pos` in the value range.
// `prev` is the root of the previous version's subtree.
// Returns the root of the new version's subtree.
// Only O(log m) new nodes are created; all other children are shared.
int update(int prev, int l, int r, int pos) {
    int p = ++tot;  // allocate new node

    // Start by copying everything from prev
    lc[p] = lc[prev];
    rc[p] = rc[prev];
    cnt[p] = cnt[prev] + 1;  // one more element in this range

    if (l == r) return p;  // leaf reached

    int m = (l + r) / 2;
    // Recurse only into the side containing `pos`.
    // The other side is shared with `prev` (already copied above).
    if (pos <= m)
        lc[p] = update(lc[prev], l, m, pos);
    else
        rc[p] = update(rc[prev], m + 1, r, pos);

    return p;
}

// Find the k-th smallest value in the subarray a[ql..qr] (1-indexed).
// We walk versions u = roots[ql-1] and v = roots[qr] in parallel.
// At each node, the "subarray count" in the left child is:
//   cnt[lc[v]] - cnt[lc[u]]
// If k <= that count, the answer is in the left half; otherwise right.
int query(int u, int v, int l, int r, int k) {
    if (l == r) return l;  // found the value (compressed index)

    int m = (l + r) / 2;
    int left_count = cnt[lc[v]] - cnt[lc[u]];

    if (k <= left_count)
        return query(lc[u], lc[v], l, m, k);
    else
        return query(rc[u], rc[v], m + 1, r, k - left_count);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, q;
    cin >> n >> q;
    vector<int> a(n);
    vector<int> sorted_a;

    for (int i = 0; i < n; i++) {
        cin >> a[i];
        sorted_a.push_back(a[i]);
    }

    // --- Coordinate compression ---
    // Map values to [0, m-1] so the segment tree has O(m) leaves.
    sort(sorted_a.begin(), sorted_a.end());
    sorted_a.erase(unique(sorted_a.begin(), sorted_a.end()), sorted_a.end());
    int m = sorted_a.size();

    auto compress = [&](int x) -> int {
        return int(lower_bound(sorted_a.begin(), sorted_a.end(), x)
                   - sorted_a.begin());
    };

    // --- Build version 0 (empty) and versions 1..n ---
    roots[0] = build(0, m - 1);
    for (int i = 0; i < n; i++)
        roots[i + 1] = update(roots[i], 0, m - 1, compress(a[i]));

    // --- Answer queries ---
    while (q--) {
        int l, r, k;
        cin >> l >> r >> k;  // 1-indexed, find k-th smallest in a[l..r]
        int compressed_idx = query(roots[l - 1], roots[r], 0, m - 1, k);
        cout << sorted_a[compressed_idx] << '\n';  // decompress
    }

    return 0;
}

Operations Reference

Operation	Time	Space (new nodes)
Build empty tree (version 0)	$O (m)$	$O (m)$
Insert one value (new version)	$O (\log m)$	$O (\log m)$
Build all $n$ versions	$O (n \log m)$	$O (n \log m)$
$k$ -th smallest in $[l, r]$	$O (\log m)$	$O (1)$
Count of values $\leq x$ in $[l, r]$	$O (\log m)$	$O (1)$

Where $m$ = number of distinct values (after coordinate compression, $m \leq n$ ).

Total memory: $O (m + n \log m)$ nodes. For $n = 2 \times 10^{5}$ , this is roughly $3.6 \times 10^{6}$ nodes -- each node stores 3 integers (12 bytes), so $\sim 43$ MB. Keep MAXNODES $\geq 20 n$ to be safe.

  QUERY WALKTHROUGH -- k-th smallest in a[2..5]:

  roots[1] (prefix [1..1])     roots[5] (prefix [1..5])
  
       R₁                           R₅
      / \                           / \
   [1..3]=1  [4..5]=0         [1..3]=3  [4..5]=2
    /    \                     /    \
  [1..2]=1 [3]=0            [1..2]=3 [3]=0
  
  Difference tree (subarray [2..5]):
       R₅ - R₁
      / \
   [1..3]=2  [4..5]=2
    /    \
  [1..2]=2 [3]=0

  k=2: left_count=2, k<=2 -> go LEFT
  k=2: left_count([1])=2, k<=2 -> go LEFT
  Leaf [1] -> compressed value 1 -> original value 1  OK

Problem Patterns & Tricks

Pattern 1: K-th Smallest in Subarray

The canonical application. Build persistent tree on value-coordinate-compressed array. Answer each query by walking two versions.

cpp

// query(roots[l-1], roots[r], 0, m-1, k)
// returns compressed index -> decompress with sorted_a[idx]

Problems: MKTHNUM (SPOJ), CF 1042D

Pattern 2: Count of Values in Range $[a, b]$ within Subarray

Instead of walking to find the $k$ -th, query the sum in a value range. The difference of two versions gives the count.

cpp

int count(int u, int v, int l, int r, int ql, int qr) {
    if (ql > r || qr < l) return 0;
    if (ql <= l && r <= qr) return cnt[v] - cnt[u];
    int m = (l + r) / 2;
    return count(lc[u], lc[v], l, m, ql, qr)
         + count(rc[u], rc[v], m + 1, r, ql, qr);
}

Problems: CF 620E

Pattern 3: Online Queries (Forced Online)

Some problems force online processing by encrypting queries with the previous answer. Merge sort tree + binary search handles offline only. Persistent segment tree handles online naturally since each query is $O (\log n)$ with no preprocessing between queries.

Problems: CF 1535F

Pattern 4: Persistence on Trees (Path Queries)

Given a weighted tree, find the $k$ -th smallest edge/node weight on the path $u \to v$ . Build persistent trees along an Euler tour or DFS order: version of node $v$ = version of parent with $v$ 's value inserted.

query on path u..v:
  answer = walk(roots[u], roots[v], roots[lca], roots[parent[lca]], ...)

The count for path $u \to v$ is:

c n t_{u} + c n t_{v} - c n t_{l c a} - c n t_{p a r e n t (l c a)}

Problems: COT (SPOJ), CF 893F

Pattern 5: Version Rollback / Undo

Maintain a stack of root pointers. "Undo" = pop the latest root. "Query version $t$ " = use roots[t]. This is trivial with persistence since old roots are never invalidated.

cpp

// undo last update
version--;
// query version t
int ans = query(roots[t - 1], roots[t], 0, m - 1, k);

Problems: CF 707D

Pattern 6: Persistent + Difference for Interval Frequency

Need the number of elements equal to $x$ in $a [l . . r]$ ? Build the persistent tree, then:

cpp

int freq = count(roots[l-1], roots[r], 0, m-1, cx, cx);
// where cx = compressed index of x

This generalizes to "how many elements in $[l, r]$ have value in $[a, b]$ ?"

Problems: CF 1422F

Contest Cheat Sheet

+---------------------------------------------------------------+
|  PERSISTENT SEGMENT TREE CHEAT SHEET                          |
+---------------------------------------------------------------+
|  When to use:                                                 |
|    - k-th smallest in subarray [l,r], online                  |
|    - queries on historical versions of a data structure        |
|    - prefix-subtraction trick on segment trees                 |
|    - count values in [a,b] within subarray [l,r]              |
+---------------------------------------------------------------+
|  Memory:                                                      |
|    const int MAXNODES = MAXN * 20;                            |
|    int lc[MAXNODES], rc[MAXNODES], cnt[MAXNODES];            |
|    int roots[MAXN];                                           |
+---------------------------------------------------------------+
|  Core calls:                                                  |
|    roots[0] = build(0, m - 1);                                |
|    roots[i+1] = update(roots[i], 0, m-1, compressed_val);    |
|    kth = query(roots[l-1], roots[r], 0, m-1, k);             |
+---------------------------------------------------------------+
|  Complexity:                                                  |
|    Build:  O(n log m) time, O(n log m) space                  |
|    Query:  O(log m) time,  O(1) space                         |
|    m = distinct values (<= n after compression)               |
+---------------------------------------------------------------+

Gotchas & Debugging

Memory: The #1 Killer

Persistent trees use far more memory than regular segment trees. Each update creates $O (\log m)$ new nodes.

Rule of thumb: allocate $20 n$ nodes. For $n = 2 \times 10^{5}$ , that is $4 \times 10^{6}$ nodes $\times$ 12 bytes $= 48$ MB.
If the memory limit is 256 MB, you are fine. If it is 64 MB, consider using short or restructuring.
Never use new/malloc per node. Use the flat array pool shown in the template. Dynamic allocation has massive overhead and will TLE or MLE.

Off-by-One in Versions

roots[0] = empty tree
roots[i] = tree after inserting a[0], a[1], ..., a[i-1]

Query a[l..r] (1-indexed):
  use roots[l-1] and roots[r]    <-- NOT roots[l] and roots[r]

Getting this wrong gives WA on almost every test case. Double-check your indexing convention (0-indexed vs 1-indexed) before submitting.

Coordinate Compression Mistakes

Forgetting to sort + unique before compressing.
Using lower_bound on the unsorted array.
Off-by-one when the compressed range is $[0, m - 1]$ but you build the tree on $[1, m]$ . Pick one convention and stick to it.

Persistence + Lazy Propagation = Pain

Combining persistence with lazy propagation is possible but tricky. When you push down a lazy tag, you create new children -- but the old version's children must remain untouched. This means pushdown itself must create new nodes. Memory usage roughly doubles. Avoid this unless the problem specifically requires it.

Node Counter Overflow

The tot counter must not exceed MAXNODES. If it does, you get silent memory corruption. Add a debug assert:

cpp

int p = ++tot;
assert(tot < MAXNODES);  // remove before submission or use #ifdef

Persistence Invariant

NEVER modify nodes from old versions. When updating, always create NEW nodes. If you modify an existing node, you corrupt all versions that share it. This is the fundamental invariant of persistence.

Kth smallest with two versions: The answer uses query(roots[r]) - query(roots[l-1]). Specifically, at each node: left_count = versions[r].left.cnt - versions[l-1].left.cnt. If left_count >= k, recurse left; else recurse right with k -= left_count.

Common Debugging Checklist

Print tot after building all versions -- is it within bounds?
Verify coordinate compression: print compressed values and check they are in $[0, m - 1]$ .
For a single query, manually trace the walk and compare with brute force.
Check that build initializes all counts to 0 (not garbage).
If MLE: reduce MAXNODES, use int not long long for counts.

Mental Traps

Trap 1: "I need to copy the entire tree for each version."

This is the $O (n)$ -per-update misconception. You do NOT copy the whole tree. You create $O (\log n)$ new nodes (one per level on the root-to-leaf path) and point everything else to the old version's nodes. If you're allocating $O (n)$ nodes per update, your implementation is wrong -- and you'll MLE before you WA.

  WRONG: Copy whole tree         RIGHT: Copy only the path
  
  n=8, update value at pos 3:   Same update:
  
  Create 15 new nodes (all!)     Create 4 new nodes (path only)
  
       [new]                          [new]
      /      \                       /      \
   [new]    [new]                [new]    [shared]
   /   \    /   \                /   \
 [new][new][new][new]         [new][shared]
 / \ / \ / \ / \              / \
 8 leaves, all new          [shared][new]
 
  Memory: O(n) per update        Memory: O(log n) per update
  Total: O(n²) -> MLE             Total: O(n log n) -> OK

Trap 2: "Persistence and lazy propagation work together naturally."

They don't. Lazy propagation pushes tags down to children -- but in a persistent tree, those children might be shared with other versions. Pushing to a shared child corrupts all versions sharing that child. To combine persistence with lazy propagation, every pushdown must create new child nodes (doubling memory usage). Avoid this unless the problem absolutely requires it.

  THE LAZY + PERSISTENCE CONFLICT:

  Version 1 and Version 2 share child node X:

  v1_root ──-> ... ──-> [X]  <-── ... <-── v2_root

  Lazy pushdown on v2 modifies X in-place:

  v1_root ──-> ... ──-> [X'] <-── ... <-── v2_root
                        ^
                   CORRUPTED! v1 now sees v2's update.

  Fix: pushdown creates NEW children
  v1_root ──-> ... ──-> [X]      [X'] <-── ... <-── v2_root
                        ^         ^
                    untouched   new copy with pushed tags
  Cost: 2x memory

Trap 3: "I can use coordinate compression with 1-indexed AND 0-indexed interchangeably."

Pick one convention and stick with it. If your segment tree covers values $[0, m - 1]$ but you build on $[1, m]$ , the root-to-leaf path targets the wrong leaf. Worse, mixing conventions between build, update, and query produces subtle off-by-one errors that pass most test cases but fail on edge values (the minimum or maximum in the array).

COORDINATE COMPRESSION -- index convention trap:

  Values: [3, 1, 4, 1, 5]
  Sorted unique: [1, 3, 4, 5]  ->  m = 4

  Convention A (0-indexed):      Convention B (1-indexed):
    1 -> 0                         1 -> 1
    3 -> 1                         3 -> 2
    4 -> 2                         4 -> 3
    5 -> 3                         5 -> 4
    build(0, m-1) = build(0,3)    build(1, m) = build(1,4)

  X MIXING: compress to 0-indexed, build(1, m)
    -> leaf for value "1" is at position 0
    -> but tree covers [1..4], so position 0 doesn't exist!
    -> silent wrong answer on queries involving minimum value

Trap 4: "Persistent segment tree always beats merge sort tree."

Not always. Merge sort tree uses $O (n \log n)$ space (same as persistent seg tree) and answers queries in $O (\log^{3} n)$ vs $O (\log n)$ . But merge sort tree is simpler to implement, handles offline queries naturally, and doesn't require node pools. For offline problems with generous time limits, merge sort tree may be the pragmatic choice. Persistence shines for online queries or when you need historical version access.

PERSISTENT SEG TREE vs MERGE SORT TREE:

  ┌─────────────────────┬─────────────────┬──────────────────┐
  │                     │ Persistent Seg  │ Merge Sort Tree  │
  ├─────────────────────┼─────────────────┼──────────────────┤
  │ Query time          │ O(log n)        │ O(log³ n)        │
  │ Space               │ O(n log n)      │ O(n log n)       │
  │ Online queries?     │ OK YES           │ X harder         │
  │ Implementation      │ Complex (pools) │ Simpler          │
  │ Debugging           │ Harder          │ Easier           │
  │ Lazy propagation    │ Painful + 2xmem │ N/A              │
  └─────────────────────┴─────────────────┴──────────────────┘

  Rule of thumb:
    Online + tight TL -> persistent seg tree
    Offline + lenient TL -> merge sort tree may suffice

Common Mistakes

Underestimating the node pool size. Persistent segment tree RE on the judge even though small tests pass and the pool seems "large enough." The trap: build for the initial empty tree allocates $O (2 n)$ nodes (a full empty tree), and then each of the $n$ insertions allocates $O (\log n)$ more -- so a pool of 20 * n is not enough. Size the pool as MAXN * 2 + MAXN * 20 (or just 40 * MAXN to be safe), and always calculate the exact bound: $2 \cdot range + (n + q) \cdot ⌈ \log_{2} (range) ⌉$ .

Debug Drills

Drill 1 -- Persistent update forgets to refresh cnt

cpp

int update(int prev, int lo, int hi, int idx) {
    int cur = new_node();
    tree[cur] = tree[prev];  // copy previous node
    if (lo == hi) {
        tree[cur].cnt++;
        return cur;
    }
    int mid = (lo + hi) / 2;
    if (idx <= mid)
        tree[cur].left = update(tree[cur].left, lo, mid, idx);
    else
        tree[cur].right = update(tree[cur].right, mid + 1, hi, idx);
    // BUG: forgot to update cnt
    return cur;
}

What's wrong?

After recursively updating the left or right child, the cnt of the current node is not recalculated. It still holds the old count from prev.

Fix: Add before return cur;:

cpp

tree[cur].cnt = tree[tree[cur].left].cnt + tree[tree[cur].right].cnt;

Drill 2 -- k-th smallest query with wrong left root

cpp

int kth(int l_root, int r_root, int lo, int hi, int k) {
    if (lo == hi) return lo;
    int mid = (lo + hi) / 2;
    int left_count = tree[tree[r_root].left].cnt - tree[tree[l_root].left].cnt;
    if (k <= left_count)
        return kth(tree[l_root].left, tree[r_root].left, lo, mid, k);
    else
        return kth(tree[l_root].right, tree[r_root].right, mid + 1, hi, k - left_count);
}

// Called as: kth(root[l], root[r], 0, MAX_VAL, k)

What's wrong?

The left root should be root[l-1], not root[l]. The persistent tree version root[i] contains elements $a_{1} \dots a_{i}$ . To get the count for subarray $[l, r]$ , you need root[r] - root[l-1].

Fix: Call as kth(root[l-1], root[r], 0, MAX_VAL, k).

Drill 3 -- Node pool sized too small

cpp

const int MAXN = 200005;
const int MAXLOG = 18;
struct Node { int left, right, cnt; } tree[MAXN * MAXLOG];
int node_cnt = 0;
int new_node() { return node_cnt++; }

// build empty tree, then n insertions
int root[MAXN];
root[0] = build(0, MAX_VAL);       // allocates ~2*MAX_VAL nodes
for (int i = 1; i <= n; i++)
    root[i] = update(root[i-1], 0, MAX_VAL, a[i]);  // allocates ~MAXLOG nodes each

What's wrong?

The pool size MAXN * MAXLOG = $200005 \times 18 \approx 3.6 \times 10^{6}$ might not be enough. The build function alone allocates $\sim 2 \times MAX_VAL$ nodes. If MAX_VAL is $10^{6}$ after compression, that's $2 \times 10^{6}$ nodes just for the empty tree, plus $2 \times 10^{5} \times 18$ for updates = $5.6 \times 10^{6}$ total.

Fix: Size the pool as 2 * MAX_VAL + MAXN * MAXLOG + extra, or skip building an explicit empty tree and let null pointer (index 0) represent empty nodes with cnt = 0.

Self-Test

Before moving on, verify you can answer these without looking at the notes:

[ ] Explain "copy the path, share the rest" -- draw a persistent segment tree after one update, marking which nodes are new and which are shared from the previous version.
[ ] Write the update function from memory -- allocating a new node, copying left/right children from prev, incrementing cnt, and recursing into only the side containing the target position.
[ ] Solve "k-th smallest in subarray $[l, r]$ " using two root pointers -- walk roots[l-1] and roots[r] in parallel, using cnt[lc[v]] - cnt[lc[u]] to decide left vs right at each level.
[ ] Calculate the memory budget: for $n = 2 \times 10^{5}$ elements and $n$ updates, how many nodes total, how many bytes, and what MAXNODES should be.
[ ] State when persistence is NOT needed -- offline queries (use Mo's algorithm), whole-array order statistics (use BIT + binary search), or range update + range query (use lazy segment tree).
[ ] Given a persistent segment tree on the value range $[0, 7]$ with 5 versions, draw the node-sharing diagram after inserting values 3, 1, 4, 1, 5 -- mark shared vs. new nodes for version 3.
[ ] Compute the exact node pool size needed for $n = 10^{5}$ elements: initial tree has $2 m - 1$ nodes (where $m$ = next power of 2 >= distinct values), plus $n \cdot ⌈ \log_{2} m ⌉$ new nodes from updates. Verify this fits in 256 MB.
[ ] Trace the k-th smallest query for $k = 3$ on subarray $[2, 5]$ through versions $v_{1}$ and $v_{5}$ , showing the left_count comparison at each tree level.
[ ] Explain why persistence + lazy propagation doubles memory: each pushdown creates 2 new child nodes instead of modifying shared children in-place.
[ ] Describe how to build persistent segment trees on a rooted tree for path queries -- what is the version ordering, and how does the LCA formula generalize the prefix subtraction?

Practice Problems

CF Rating	What You Should Be Able To Do
1200	Understand segment trees; persistence is not expected
1500	Know that persistent data structures exist; solve k-th smallest offline with merge-sort tree
1800	Implement persistent seg tree for k-th smallest in subarray; understand path copying
2100	Combine persistence with LCA for tree path queries; handle coordinate compression; manage memory pools to avoid MLE

#	Problem	Source	Difficulty	Key Idea
1	Static Range Kth Smallest	Library Checker	Easy	Direct persistent segtree
2	MKTHNUM	SPOJ	Easy	k-th smallest in subarray
3	COT	SPOJ	Medium	k-th on tree path via persistence
4	D - Pairs	CF 1042D	Medium	Count pairs with sum in range
5	D - Copying Data	CF 292D	Medium	Array version control
6	D - Persistent Bookcase	CF 707D	Medium	Version rollback on DFS tree
7	E - The Classic Problem	CF 464E	Hard	Persistent segtree for big numbers
8	F - Strange Covering	CF 893F	Hard	Tree path + persistence
9	Range Queries and Copies	CSES	1900	Persistent seg tree: create copies, query/update any version
10	Salary Queries	CSES	1700	Dynamic order statistics (can solve with persistent or BIT+compression)

Persistent Segment Tree ​

Intuition ​

Visual Walkthrough ​

The Invariant ​

When to Reach for This ​

What the Code Won't Teach You ​

C++ STL Reference ​

Implementation ​

Minimal Contest Template ​

Explained Version ​

Operations Reference ​

Problem Patterns & Tricks ​

Pattern 1: K-th Smallest in Subarray ​

Pattern 2: Count of Values in Range [a,b] within Subarray ​

Pattern 3: Online Queries (Forced Online) ​

Pattern 4: Persistence on Trees (Path Queries) ​

Pattern 5: Version Rollback / Undo ​

Pattern 6: Persistent + Difference for Interval Frequency ​

Contest Cheat Sheet ​

Gotchas & Debugging ​

Memory: The #1 Killer ​

Off-by-One in Versions ​

Coordinate Compression Mistakes ​

Persistence + Lazy Propagation = Pain ​

Node Counter Overflow ​

Persistence Invariant ​

Common Debugging Checklist ​

Mental Traps ​

Common Mistakes ​

Debug Drills ​

Self-Test ​

Practice Problems ​

Further Reading ​

Persistent Segment Tree

Intuition

Visual Walkthrough

The Invariant

When to Reach for This

What the Code Won't Teach You

C++ STL Reference

Implementation

Minimal Contest Template

Explained Version

Operations Reference

Problem Patterns & Tricks

Pattern 1: K-th Smallest in Subarray

Pattern 2: Count of Values in Range $[a, b]$ within Subarray

Pattern 3: Online Queries (Forced Online)

Pattern 4: Persistence on Trees (Path Queries)

Pattern 5: Version Rollback / Undo

Pattern 6: Persistent + Difference for Interval Frequency

Contest Cheat Sheet

Gotchas & Debugging

Memory: The #1 Killer

Off-by-One in Versions

Coordinate Compression Mistakes

Persistence + Lazy Propagation = Pain

Node Counter Overflow

Persistence Invariant

Common Debugging Checklist

Mental Traps

Common Mistakes

Debug Drills

Self-Test

Practice Problems

Further Reading