Skip to content

Segment Tree

Divide-and-conquer over array indices: answer any associative range query and handle point updates in O(logn) time, with O(n) build.

Difficulty: [Intermediate]Prerequisites: Binary Indexed Tree (Fenwick Tree)Contest Frequency: Common—essential for Div 2 D+ problems


Intuition

You have an array of 8 integers and a stream of requests:

text
a = [3, 1, 4, 1, 5, 9, 2, 6]         n = 8

The requests come in two flavors:

  1. Range sum query—"What is a[2]+a[3]+a[4]+a[5]?"
  2. Point update—"Set a[4]=7."

There are q=105 such requests, interleaved unpredictably.

Attempt 1—Brute force. Walk the range every time. One query costs O(n). Over q queries that is O(nq)=105×105=1010 operations. Way too slow (needs <108).

Attempt 2—Prefix sums. Build pre[i]=a[0]++a[i]. A range sum is pre[r]pre[l1] in O(1). Wonderful—until a point update hits. One update forces you to rebuild every prefix from the changed index onward: O(n) per update. Back to O(nq) in the worst case.

MethodQuery costUpdate costTotal for q mixed ops
Brute forceO(n)O(1)O(nq)
Prefix sumsO(1)O(n)O(nq)

Both hit the same wall. We need both operations fast.

Split in Half, Then Keep Splitting

Split the array in half. Split each half. Keep going. Now each node stores the answer for its range, and any query touches at most O(logn) nodes.

Think of a company org chart. The CEO does not ask every employee for a sales report. Regional managers know their own totals; the CEO combines a handful of regional numbers. If one salesperson's figure changes, only the managers above them need to update—O(logn) of them, not the whole company.

A segment tree is that idea applied to array indices. Every leaf is one array element. Every internal node stores the aggregate of its children's ranges: sum, min, max, GCD, or any other associative operation. You can also see it as a tournament bracket: each internal node is the winner of a match between its two children. A range query walks the bracket collecting results, and a point update replays the affected matches from the leaf back to the root—exactly O(logn) games each time.

The analogy breaks the moment the operation is not associative. Median and mode do not compose from child summaries, so a segment tree cannot recover them from local information alone.

In algebraic terms, the tree works over any monoid—an associative binary operation with an identity element. Sum has identity 0, min has identity +, max has identity , GCD has identity 0, bitwise OR has identity 0. Once you identify the operation and its identity, the segment tree mechanics handle the rest.

Walking Through the Tree

We build on the concrete array:

text
a = [3, 1, 4, 1, 5, 9, 2, 6]    (indices 0..7, sum = 31)

Step 1—Build the tree (bottom-up)

text
                          [0..7]=31
                        /            \
                 [0..3]=9            [4..7]=22
                /        \          /          \
          [0..1]=4    [2..3]=5   [4..5]=14   [6..7]=8
          /    \       /    \     /    \       /    \
       [0]=3 [1]=1 [2]=4 [3]=1 [4]=5 [5]=9 [6]=2 [7]=6

8 leaves, 7 internal nodes = 15 nodes total. Build cost: visit each node once = O(n).

Step 2—Range query: sum of a[2..5]

We want 4+1+5+9=19. Start at the root and recurse:

text
                          [0..7]=31
                        /            \
                 [0..3]=9            [4..7]=22
                /        \          /          \
          [0..1]=4   *[2..3]=5  *[4..5]=14   [6..7]=8
          /    \       /    \     /    \       /    \
       [0]=3 [1]=1 [2]=4 [3]=1 [4]=5 [5]=9 [6]=2 [7]=6

  * = node whose segment is fully inside [2..5]

  Answer = [2..3] + [4..5] = 5 + 14 = 19    (2 nodes combined)

Instead of touching 4 elements, we combined 2 pre-computed nodes. Brute force: 4 additions. Segment tree: 2 additions plus some recursion overhead that stays O(logn).

Step 3—Point update: set a[4]=7 (delta = +2)

Walk from leaf [4] to the root, adding +2 at each ancestor:

text
  Before                              After
  ------                              -----
  [4]=5                               [4]=7      (+2)
  [4..5]=14                           [4..5]=16  (+2)
  [4..7]=22                           [4..7]=24  (+2)
  [0..7]=31                           [0..7]=33  (+2)

Only 4 nodes touched (leaf + 3 ancestors) = O(log28)=O(3). The rest of the tree is untouched.

Step 4—Verify: query a[2..5] again after the update

The array is now [3,1,4,1,7,9,2,6]. Expected sum for [2..5]: 4+1+7+9=21.

text
                          [0..7]=33
                        /            \
                 [0..3]=9            [4..7]=24
                /        \          /          \
          [0..1]=4   *[2..3]=5  *[4..5]=16   [6..7]=8

  Answer = 5 + 16 = 21   (correct)

Step 5—Operation count comparison

ScenarioBrute force opsSegment tree ops
Build (n=8)015 (one-time)
Query [2..5]42 node lookups
Update a[4]14 node writes
Query [2..5] again42 node lookups
Total923

For a single round-trip the segment tree does more work. The payoff comes when q is large: O(qlogn) vs O(qn). At n=q=105 that is 1.7×106 vs 1010—a factor of almost 6,000.

The walkthrough shows the mechanics. The invariant explains why every query and update stays O(logn).

The Invariant

text
+-----------------------------------------------------------------+
| Segment-tree invariant.  For every internal node v owning      |
| the range [l, r] with midpoint m = floor((l+r)/2):             |
|                                                                 |
|   t[v] = combine( t[left(v)], t[right(v)] )                    |
|                                                                 |
| where left(v) owns [l, m] and right(v) owns [m+1, r].          |
| Every point update restores this invariant bottom-up.           |
+-----------------------------------------------------------------+

Consider any single level of the tree. A query [l,r] can interact with a node in three ways:

  1. The node's segment is fully inside [l,r]—take its value, stop recursing.
  2. The node's segment partially overlaps [l,r]—recurse into both children.
  3. The node's segment is fully outside [l,r]—return the identity, stop.

At any given level there are at most 2 partial nodes: one touching the left boundary l and one touching the right boundary r. Every node between them is fully covered and resolved by its parent one level up—it will never appear as a partial node at this level.

Since the tree has log2n levels and each level contributes at most 2 partial visits, the total nodes visited is at most:

2log2n+O(1)=O(logn)

(The O(1) accounts for the root itself.) This holds for any associative operation, not just sum.

Checkpoint: A segment tree query on range [l, r] visits at most O(log n) nodes. Why can't it ever visit more than 2 "partial overlap" nodes at any single level of the tree?

Think first, then click At any level, only the leftmost and rightmost nodes of the query range can partially overlap. Every node between them is fully contained in `[l, r]` and gets resolved by its parent one level up. So at most 2 nodes per level need to recurse into children, giving `2 log n` total visits.

What the Code Won't Teach You

You need to feel the 3-case query split, not just code it.

text
    Query [l..r] at node covering [tl..tr]:

    Case 1: [tl..tr] fully inside [l..r]
            --> return t[v] immediately (no recursion)

    Case 2: [tl..tr] fully outside [l..r]
            --> return IDENTITY (no recursion)

    Case 3: partial overlap
            --> recurse into BOTH children, combine results

    At each level, at most 2 nodes are "partial"
    (straddling a boundary). Everything between them
    is fully covered one level up -- never recursed into.

    Example: query [2..5] on tree with 8 elements

              [0..7]         partial -> recurse
             /      \
        [0..3]     [4..7]    both partial -> recurse
        /    \     /    \
    [0..1] [2..3] [4..5] [6..7]
      OUT   FULL   FULL   OUT    <-- 2 lookups, done

The combine() function IS the problem—the tree is just infrastructure. When you see a segment tree problem, most of the work is deciding what each node should store and how two children merge into a parent. The traversal code is the same every time.

Coordinate compression is a separate skill you must chain. Problems with values up to 109 but only n2×105 distinct values need compression before building the tree. That preprocessing step is often more error-prone than the tree itself.

Segment Tree vs BIT vs Sparse Table

When to use each:

  • BIT—you only need sum/XOR (invertible ops), point updates, and want the shortest, fastest code. Default choice for simple "prefix sum + update."
  • Sparse Table—purely static: no updates at all, and the operation is overlap-friendly (min, max, GCD). Gives O(1) queries after O(nlogn) build.
  • Segment Tree—everything else. Non-invertible ops with updates, lazy propagation for range updates, walking the tree, custom struct merges. The Swiss-army knife.
FeatureSegment TreeBIT (Fenwick)Sparse Table
Build timeO(n)O(n)O(nlogn)
Point updateO(logn)O(logn)Not supported
Range queryO(logn)O(logn)O(1) (overlap-friendly ops)
Range updateO(logn) with lazyO(logn) with tricksNot supported
Supports non-invertible ops (min, max, GCD)YesNoYes
Supports tree walking (k-th element, first ≥ x)YesNo*No
Code complexity~40 lines~15 lines~20 lines
Constant factorMediumSmall (2–3× faster)Smallest
MemoryO(4n)O(n)O(nlogn)

*BIT can find the k-th element via binary lifting, but only for sum-like operations.

Bridge to lazy propagation. A basic segment tree handles point updates and range queries. But what if a problem says "add 5 to every element in [l,r]"? Updating each of the rl+1 leaves individually costs O(nlogn) per query—no better than brute force. Lazy propagation fixes this: instead of pushing an update all the way to the leaves, you store a pending tag at the highest nodes whose intervals lie entirely inside [l,r]. The tag is pushed to children only when a query or update needs to descend into them, bringing range updates back down to O(logn). If your problem has range updates + range queries or range updates + point queries, move on to Segment Tree with Lazy Propagation.

When to Reach for It

Trigger phrases in problem statements:

  1. "Answer queries of type sum/min/max/GCD over a range and update individual elements"—textbook segment tree.
  2. "Find the first/last position in a range satisfying some condition"—segment tree walk.
  3. "Process queries online (no offline tricks allowed)" with range aggregation—often rules out simpler structures.
  4. "The operation is non-invertible (min, max, GCD) and there are updates"—BIT is out, segment tree is in.
  5. "Range update + range query"—segment tree with lazy propagation.

When NOT to reach for it:

  • Static range-min with no updates—Sparse Table gives O(1) per query; a segment tree's O(logn) is overkill and slower.
  • Only prefix sums, no updates—a plain prefix-sum array is O(1) per query and trivially simpler.
  • Sum queries + point updates, nothing fancier—a BIT is 2–3× faster with half the code. Reach for the segment tree only when BIT can't handle the operation.

C++ Reference

There is no STL segment tree. You build it manually. The standard approach uses a flat array of size 4n with 1-based indexing (node 1 = root, children of node v are 2v and 2v+1).

ComponentWhat it doesNotes
tree[4*MAXN]Flat array storing node values1-indexed; tree[1] = root
build(v, tl, tr)Recursively builds tree from arrayO(n)
update(v, tl, tr, pos, val)Point update at posO(logn)
query(v, tl, tr, l, r)Range query over [l, r]O(logn)
combine(a, b)Merge two children—the only thing you change per problemMust be associative

Checkpoint: Why must the combine operation be associative for a segment tree to work? What goes wrong if you try to build a segment tree for the median of a range?

Think first, then click The segment tree combines children's answers to compute a parent's answer: `t[v] = combine(t[left], t[right])`. This only works if `combine(combine(a, b), c) == combine(a, combine(b, c))`—i.e., it doesn't matter how you group sub-ranges. Median is not associative: knowing the median of two halves doesn't tell you the median of the whole range. You'd need to store all elements (e.g., merge sort tree), not just an aggregate.

Implementation

text
+------------------------------------------------------------+
| WARNING: For a recursive segment tree over n elements,     |
| the array must be sized 4*n, NOT 2*n. A complete binary    |
| tree of depth ceil(log2(n)) can have up to 4n nodes.       |
| Using 2*n causes out-of-bounds writes -- silent corruption. |
+------------------------------------------------------------+

Version A—Minimal Template (Range Sum, Point Update)

cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 200'005;
long long t[4 * MAXN];
int a[MAXN];
int n, q;

void build(int v, int tl, int tr) {
    if (tl == tr) { t[v] = a[tl]; return; }
    int tm = (tl + tr) / 2;
    build(2*v, tl, tm);
    build(2*v+1, tm+1, tr);
    t[v] = t[2*v] + t[2*v+1];
}

void update(int v, int tl, int tr, int pos, long long val) {
    if (tl == tr) { t[v] = val; return; }
    int tm = (tl + tr) / 2;
    if (pos <= tm) update(2*v, tl, tm, pos, val);
    else           update(2*v+1, tm+1, tr, pos, val);
    t[v] = t[2*v] + t[2*v+1];
}

long long query(int v, int tl, int tr, int l, int r) {
    if (l > r) return 0;
    if (l == tl && r == tr) return t[v];
    int tm = (tl + tr) / 2;
    return query(2*v, tl, tm, l, min(r, tm))
         + query(2*v+1, tm+1, tr, max(l, tm+1), r);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    cin >> n >> q;
    for (int i = 0; i < n; i++) cin >> a[i];
    build(1, 0, n - 1);

    while (q--) {
        int type; cin >> type;
        if (type == 1) {
            int pos; long long val; cin >> pos >> val;
            update(1, 0, n - 1, pos, val);
        } else {
            int l, r; cin >> l >> r;
            cout << query(1, 0, n - 1, l, r) << '\n';
        }
    }
}

Version B—Explained, Generic Operation

cpp
#include <bits/stdc++.h>
using namespace std;

// ------------------------------------------------------------------
//  Generic segment tree (point update, range query)
//  Change IDENTITY and combine() for different operations:
//    Sum:  IDENTITY = 0,        combine = a + b
//    Min:  IDENTITY = LLONG_MAX, combine = min(a,b)
//    Max:  IDENTITY = LLONG_MIN, combine = max(a,b)
//    GCD:  IDENTITY = 0,        combine = gcd(a,b)
//    XOR:  IDENTITY = 0,        combine = a ^ b
// ------------------------------------------------------------------

const int MAXN = 200'005;
const long long IDENTITY = 0;  // neutral element for the operation

long long combine(long long a, long long b) {
    return a + b;  // <-- change this per problem
}

long long t[4 * MAXN];  // tree nodes; 4*n is safe upper bound
int a[MAXN];             // original array (0-indexed)
int n, q;

// Build the tree from a[tl..tr].
// v     = current node index (start with 1)
// tl,tr = segment this node is responsible for
void build(int v, int tl, int tr) {
    if (tl == tr) {
        // Leaf: directly copy array value
        t[v] = a[tl];
        return;
    }
    int tm = (tl + tr) / 2;
    build(2 * v, tl, tm);          // build left child
    build(2 * v + 1, tm + 1, tr);  // build right child
    t[v] = combine(t[2 * v], t[2 * v + 1]);
}

// Set a[pos] = val and propagate up.
void update(int v, int tl, int tr, int pos, long long val) {
    if (tl == tr) {
        // Reached the leaf for position pos
        t[v] = val;
        return;
    }
    int tm = (tl + tr) / 2;
    if (pos <= tm)
        update(2 * v, tl, tm, pos, val);
    else
        update(2 * v + 1, tm + 1, tr, pos, val);
    // Recalculate this node after child changed
    t[v] = combine(t[2 * v], t[2 * v + 1]);
}

// Query the aggregate over a[l..r].
long long query(int v, int tl, int tr, int l, int r) {
    if (l > r)
        return IDENTITY;  // empty range
    if (l == tl && r == tr)
        return t[v];       // segment matches exactly
    int tm = (tl + tr) / 2;
    // Split query into left and right parts
    long long left_ans  = query(2 * v, tl, tm, l, min(r, tm));
    long long right_ans = query(2 * v + 1, tm + 1, tr, max(l, tm + 1), r);
    return combine(left_ans, right_ans);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    cin >> n >> q;
    for (int i = 0; i < n; i++) cin >> a[i];
    build(1, 0, n - 1);

    while (q--) {
        int type;
        cin >> type;
        if (type == 1) {
            // Point update: set a[pos] = val
            int pos;
            long long val;
            cin >> pos >> val;
            a[pos] = val;  // keep array in sync (useful for debugging)
            update(1, 0, n - 1, pos, val);
        } else {
            // Range query: aggregate over [l, r]
            int l, r;
            cin >> l >> r;
            cout << query(1, 0, n - 1, l, r) << '\n';
        }
    }
    return 0;
}

Iterative (Bottom-Up) Segment Tree

Faster constant factor, preferred by some competitive programmers. Works for commutative operations. Array is 0-indexed; tree is stored in t[0..2n).

cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 200'005;
long long t[2 * MAXN];
int n;

void build() {
    for (int i = n - 1; i >= 1; i--)
        t[i] = t[i << 1] + t[i << 1 | 1];
}

void update(int p, long long val) {  // set a[p] = val
    for (t[p += n] = val; p > 1; p >>= 1)
        t[p >> 1] = t[p] + t[p ^ 1];
}

long long query(int l, int r) {  // sum over [l, r]
    long long res = 0;
    for (l += n, r += n + 1; l < r; l >>= 1, r >>= 1) {
        if (l & 1) res += t[l++];
        if (r & 1) res += t[--r];
    }
    return res;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int q;
    cin >> n >> q;
    for (int i = 0; i < n; i++) cin >> t[n + i];
    build();

    while (q--) {
        int type; cin >> type;
        if (type == 1) {
            int p; long long v; cin >> p >> v;
            update(p, v);
        } else {
            int l, r; cin >> l >> r;
            cout << query(l, r) << '\n';
        }
    }
}

Iterative vs. recursive—when to use which. The iterative version uses 2n memory and has a smaller constant factor, but only works cleanly for commutative operations and does not extend to lazy propagation or tree walking. Use iterative for simple point update + range query. Use recursive when you need lazy propagation, persistent trees, or complex merge operations.

Segment Tree Diagram: Array [3, 1, 4, 1, 5, 9, 2, 6]

text
Array indices:      0   1   2   3   4   5   6   7
Array values:       3   1   4   1   5   9   2   6

Segment tree (sum):  node [range] = value

                          [0,7]=31
                        /          \
                 [0,3]=9              [4,7]=22
                /       \            /        \
           [0,1]=4    [2,3]=5   [4,5]=14    [6,7]=8
            / \        / \        / \        / \
         [0]=3 [1]=1 [2]=4 [3]=1 [4]=5 [5]=9 [6]=2 [7]=6

Point update: set a[2] = 7  (was 4, diff = +3)

  Update path (leaf to root):
    [2]=7  ->  [2,3]=8  ->  [0,3]=12  ->  [0,7]=34
                ^+3          ^+3           ^+3

  After update:
                          [0,7]=34
                        /          \
                [0,3]=12              [4,7]=22
                /       \            /        \
           [0,1]=4    [2,3]=8   [4,5]=14    [6,7]=8
            / \        / \        / \        / \
         [0]=3 [1]=1 [2]=7 [3]=1 [4]=5 [5]=9 [6]=2 [7]=6

Range query: sum(1, 5)  -> visits shaded nodes:
  Split: [1] + [2,3] + [4,5]  = 1 + 8 + 14 = 23
  Only O(log n) = O(3) nodes visited.

Operations Reference

OperationTimeSpaceNotes
BuildO(n)O(n)Single bottom-up pass
Point updateO(logn)O(1)Walk leaf to root
Range queryO(logn)O(logn) stackAt most 2logn nodes visited
Range update (lazy)O(logn)O(n)See Segment Tree with Lazy Propagation
Total memoryO(4n)Recursive version; iterative uses 2n

Why build is O(n), not O(nlogn). A common misconception is that building costs O(nlogn) because there are O(logn) levels. But the build visits each of the 2n nodes exactly once, doing O(1) work per node. A complete binary tree with n leaves has n1 internal nodes, so total nodes 2n—one merge per node, giving Θ(n). You're not building n root-to-leaf paths; you're building bottom-up, touching each node exactly once.

Problem Patterns

Pattern 1—Range Sum + Point Update (Bread and Butter)

The most common pattern. Given n values, handle "set a[i]=x" and "what is i=lra[i]?"

cpp
// combine = a + b, IDENTITY = 0
// Straightforward build/update/query as shown above.

CF problems:

Range Min/Max Query + Point Update

Change the combine function and identity. Often appears as "find the minimum in a subarray after modifications."

cpp
const long long IDENTITY = LLONG_MAX;  // for min-query
long long combine(long long a, long long b) { return min(a, b); }

CF problems:

Coordinate Compression + Segment Tree

When values are up to 109 but there are only n2×105 of them, compress values to [0,n) and build a segment tree on the compressed range. Classic in inversion counting and "how many previous elements are smaller" queries.

cpp
// Compress
vector<int> vals(a, a + n);
sort(vals.begin(), vals.end());
vals.erase(unique(vals.begin(), vals.end()), vals.end());
auto compress = [&](int x) {
    return int(lower_bound(vals.begin(), vals.end(), x) - vals.begin());
};

// Process elements left-to-right, query prefix, then update
for (int i = 0; i < n; i++) {
    int c = compress(a[i]);
    long long cnt = query(1, 0, sz - 1, 0, c - 1);  // # elements < a[i]
    update(1, 0, sz - 1, c, query(1, 0, sz - 1, c, c) + 1);
}

CF problems:

Walk on Segment Tree (Find k-th Element, First x)

Instead of returning a value, descend the tree based on children's aggregates. Example: find the k-th smallest element in a multiset maintained by a frequency segment tree.

cpp
int kth(int v, int tl, int tr, int k) {
    if (tl == tr) return tl;
    int tm = (tl + tr) / 2;
    if (t[2*v] >= k) return kth(2*v, tl, tm, k);
    else             return kth(2*v+1, tm+1, tr, k - t[2*v]);
}

CF problems:

Merge Sort Tree (Offline / Persistent Queries)

Store a sorted vector at each node instead of a single value. Answers "how many elements in [l,r] are x?" in O(log2n).

cpp
vector<int> mt[4 * MAXN];

void build(int v, int tl, int tr) {
    if (tl == tr) { mt[v] = {a[tl]}; return; }
    int tm = (tl + tr) / 2;
    build(2*v, tl, tm);
    build(2*v+1, tm+1, tr);
    merge(mt[2*v].begin(), mt[2*v].end(),
          mt[2*v+1].begin(), mt[2*v+1].end(),
          back_inserter(mt[v]));
}

int query(int v, int tl, int tr, int l, int r, int x) {
    if (l > r) return 0;
    if (l == tl && r == tr)
        return int(upper_bound(mt[v].begin(), mt[v].end(), x) - mt[v].begin());
    int tm = (tl + tr) / 2;
    return query(2*v, tl, tm, l, min(r, tm), x)
         + query(2*v+1, tm+1, tr, max(l, tm+1), r, x);
}

CF problems:

Segment Tree on Pairs / Custom Structs

Store a struct per node. For example, "maximum subarray sum" (Kadane on segments): each node stores {prefix_max, suffix_max, total, answer}.

cpp
struct Node {
    long long pref, suf, tot, ans;
};
Node combine(Node l, Node r) {
    return {
        max(l.pref, l.tot + r.pref),
        max(r.suf, r.tot + l.suf),
        l.tot + r.tot,
        max({l.ans, r.ans, l.suf + r.pref})
    };
}

CF problems:

Contest Cheat Sheet

text
+----------------------------------------------------------+
|  SEGMENT TREE CHEAT SHEET                                |
+----------------------------------------------------------+
|  When to use:                                            |
|    - Range query + point update, O(log n) each           |
|    - Non-invertible ops (min, max, gcd) -> can't use BIT |
|    - Need to "walk" the tree (k-th element, first >= x)  |
+----------------------------------------------------------+
|  Setup:                                                  |
|    long long t[4*MAXN];    // ALWAYS 4*n                 |
|    build(1, 0, n-1);       // call once                  |
|    update(1, 0, n-1, pos, val);                          |
|    query(1, 0, n-1, l, r);                               |
+----------------------------------------------------------+
|  Operation swap:                                         |
|    Sum: IDENTITY=0,         combine = a+b                |
|    Min: IDENTITY=LLONG_MAX, combine = min(a,b)           |
|    Max: IDENTITY=LLONG_MIN, combine = max(a,b)           |
|    GCD: IDENTITY=0,         combine = gcd(a,b)           |
|    XOR: IDENTITY=0,         combine = a^b                |
+----------------------------------------------------------+
|  Complexity:  Build O(n) | Update O(log n)               |
|               Query O(log n) | Space O(4n)               |
+----------------------------------------------------------+
|  Common bugs:                                            |
|    - Array too small (use 4*n, not 2*n)                  |
|    - Wrong identity element                              |
|    - l > r not returning IDENTITY                        |
|    - Forgetting to recalculate parent after update       |
+----------------------------------------------------------+

Gotchas, Common Mistakes, and Debugging

Array Size: Always 4n

Why not 2n? The recursive tree on n elements can use up to 4n5 nodes when n is not a power of 2. Using t[2 * MAXN] will cause out-of-bounds writes and mysterious WA or RE. A concrete example: with n=5, the tree needs indices up to 19 (i.e., 4×51). The iterative (Codeforces-style) segment tree is the exception—it uses exactly 2n.

Rule of thumb: declare t[4 * MAXN]. Wastes a little memory, never fails.

0-Indexed vs. 1-Indexed Arrays

Pick one and be consistent. The implementations above use 0-indexed arrays with 1-indexed tree nodes. If the problem uses 1-indexed input, either subtract 1 on read or shift your build to build(1, 1, n).

Identity Element Must Be Correct

OperationCorrect identityWrong choice → bug
Sum0
MinLLONG_MAX (or INT_MAX)0 gives wrong min when array has negatives
MaxLLONG_MIN (or INT_MIN)0 gives wrong max when array has negatives
GCD01 would make every GCD equal 1
XOR0

gcd(0, x) = x for all x, so 0 is the correct identity for GCD.

The l > r base case. In the recursive query, after splitting into [l, min(r, tm)] and [max(l, tm+1), r], one side can become an empty range. Return IDENTITY, not 0—they differ for min, max, and GCD.

Updating the parent after a child update. If you forget the line t[v] = combine(t[2*v], t[2*v+1]) at the end of update(), the tree becomes inconsistent. This is the most common bug when coding under pressure.

Integer overflow. If n=2×105 and values reach 109, the sum can hit 2×1014. Use long long for the tree. This is a common cause of WA on first submission.

Debugging tip: print the tree.

cpp
void dump(int v, int tl, int tr, int depth = 0) {
    cerr << string(depth * 2, ' ') << "[" << tl << ".." << tr
         << "] = " << t[v] << '\n';
    if (tl == tr) return;
    int tm = (tl + tr) / 2;
    dump(2*v, tl, tm, depth + 1);
    dump(2*v+1, tm+1, tr, depth + 1);
}

Call dump(1, 0, n-1) after build and after each update to visually inspect the tree.

Mental Traps

Trap 1: "I just need to change the combine function."

True for simple swaps (sum → min → max → GCD). False when the stored type is not a single scalar. Maximum subarray sum, bracket matching, inversion counting—the node type changes entirely and combine() becomes a nontrivial struct merge.

text
    Simple combine (just swap the operator):

    Sum: t[v] = t[2v] + t[2v+1]
    Min: t[v] = min(t[2v], t[2v+1])
    Max: t[v] = max(t[2v], t[2v+1])

    Complex combine (need a struct):

    Max subarray sum:
    +--------+     +--------+     +-------------------+
    | left   |  +  | right  |  =  | combined           |
    | sum=5  |     | sum=3  |     | sum=8              |
    | pre=5  |     | pre=3  |     | pre=max(5, 5+3)=8  |
    | suf=2  |     | suf=3  |     | suf=max(3, 3+2)=5  |
    | ans=5  |     | ans=3  |     | ans=max(5,3,2+3)=5 |
    +--------+     +--------+     +-------------------+

Trap 2: "BIT is just a simpler segment tree." They solve overlapping but different problems. BIT requires an invertible operation (sum, XOR—not min, max, GCD) and can't do tree walking, lazy range updates, or custom struct merges.

Trap 3: "The iterative version is strictly better."

The iterative bottom-up seg tree is faster and uses 2n memory. But it only works cleanly for commutative operations and doesn't extend to lazy propagation or tree walking. If you only know iterative, you'll struggle when you need to walk the tree for the k-th element.

text
    Recursive vs Iterative:

    Recursive:  4*N memory | lazy: YES | walk: YES | cache: meh
    Iterative:  2*N memory | lazy: NO* | walk: NO* | cache: good
                                                  * = possible but ugly

Verdict diagnosis.

  • WA (Wrong Answer):
    • Wrong merge function (e.g., max instead of +, or not handling identity elements)
    • Off-by-one in query range (0-indexed vs. 1-indexed mismatch)
    • Forgetting to handle the identity/neutral element for empty ranges
  • TLE (Time Limit Exceeded):
    • Forgot early return when range is completely outside query—visits entire tree
    • Building tree with O(nlogn) instead of O(n) bottom-up
  • RE (Runtime Error):
    • Array too small—use 4 * n (not 2 * n) for 1-indexed recursive trees
    • Stack overflow in recursive build for very large n

Self-Test

Before moving on, verify you can do each of these without looking at the notes:

  • [ ] Write build(), update(), and query() from memory for range sum—including the l > r base case and parent recomputation
  • [ ] Name the correct identity element for sum, min, max, GCD, and XOR without hesitation
  • [ ] Explain why 4*MAXN and not 2*MAXN in one sentence
  • [ ] Describe what changes when you switch from sum to max—which line(s) and why
  • [ ] Identify two situations where a BIT or sparse table is strictly better than a segment tree
  • [ ] Design a combine() for a non-trivial merge (e.g., maximum subarray sum)—what struct do you store?
  • [ ] Trace a query for [2..5] on an 8-element tree and identify which nodes are visited

Quick-Fire Questions

Q1: What's the minimum array size for a segment tree on n=105 elements?

Answer4 × 10^5. The implicit 1-indexed binary tree can index up to 4n when n is not a power of 2, so we allocate an array of size 4n to avoid out-of-bounds access.

Q2: You change a segment tree from range-sum to range-max. Which part of the code changes and what's the new identity element?

AnswerReplace + with max() in the merge/combine operation, and change the identity element from 0 to -INF (or INT_MIN). The build, update, and query structure stay identical.

Q3: Why can't you use a BIT (Fenwick tree) for range minimum queries, but a segment tree handles them fine?

AnswerBIT relies on the operation being invertible (e.g., prefix[r] - prefix[l-1] for sums). Minimum has no inverse—knowing min(1..r) and min(1..l-1) doesn't yield min(l..r). A segment tree decomposes the query into O(log n) non-overlapping segments and merges them directly, requiring only associativity, not invertibility.

Q4: In a lazy segment tree, what happens if you forget to push down lazy tags before splitting into left and right children during a query?

AnswerThe children still hold stale values, so the query returns incorrect results. Lazy propagation requires pushing pending updates to children before any operation that accesses child nodes—both queries and updates. Forgetting this is the #1 lazy propagation bug.

Q5: A segment tree with 4 long long values per node at n=106 uses how much memory? Does it fit in 256 MB?

Answer4n nodes × 4 values × 8 bytes = 128n bytes. At n = 10^6 that's ~128 MB—borderline for a 256 MB limit. You'd need to reduce per-node storage or use a more memory-efficient structure like a BIT if possible.

Practice Problems

#ProblemSourceDifficultyKey idea
1Dynamic Range Sum QueriesCSESEasyDirect sum segtree
2Dynamic Range Minimum QueriesCSESEasyMin segtree
3Xenia and Bit OperationsCF 339DEasyAlternating OR/XOR by level
4Enemy is WeakCF 61EMediumInversions via coord compression + segtree
5Petya and ArrayCF 1042DMediumPrefix sums + segtree counting
6Pashmak and Parmida's problemCF 459DMediumFrequency + merge sort tree
7Sereja and BracketsCF 380CMedium-HardCustom struct merge for bracket sequences
8Ant colonyCF 474FHardGCD + count segtree
9Distinct Characters QueriesCF 1234DEasyBitmask OR segtree (26 bits)
10Lucky ArrayCF 121EHardSegtree + digit DP

Designing Test Cases

Segment tree bugs are subtle—wrong identity element, off-by-one in ranges, broken lazy propagation. These tests smoke them out.

Minimal cases:

  • n = 1: Single element. Build, query the only position, update it, query again. If this fails, your tree indexing is broken.
  • Update and query same position: Update index i, then query range [i, i]. The returned value must equal the updated value exactly.

Edge cases:

  • Query entire range: query(0, n-1). This should return the root of the tree. If it doesn't, your base-case boundary check is wrong.
  • Single-element range queries: query(i, i) for every i. Each should match the leaf value.
  • Adjacent updates: Update positions i and i+1, then query [i, i+1]. Catches merge bugs.
  • All identical values: Build with all elements equal. Any range query should return the same merged result. Good for catching identity-element mistakes.
  • Negative values (for max/min trees): If your identity is 0 instead of -INF, a tree full of negatives will silently return wrong answers.

Stress test (run locally before submitting):

cpp
// Brute force: plain array with O(n) query and O(1) update.
mt19937 rng(42);
int n = 500;
vector<int> brute(n);
// ... build your segment tree on `brute` ...
for (int iter = 0; iter < 100000; iter++) {
    if (rng() % 2 == 0) {
        // random point update
        int pos = rng() % n, val = rng() % 201 - 100;
        brute[pos] = val;
        seg_update(pos, val);
    } else {
        // random range query
        int l = rng() % n, r = rng() % n;
        if (l > r) swap(l, r);
        int bf_ans = brute_query(brute, l, r);  // loop l..r
        int seg_ans = seg_query(l, r);
        assert(bf_ans == seg_ans);
    }
}

Use n <= 500 so brute force is fast, but run 100k+ iterations. This catches identity-element errors, off-by-one splits, and lazy propagation bugs reliably.


Level-Up Chain

Progress through four levels of segment tree mastery. Each level layers on a harder technique.

LevelPatternProblem / DescriptionKey Idea
1Range sum + point updateDynamic Range Sum Queries (CSES)Build a basic segment tree, handle point updates and range sum queries.
2Range min with lazy propagationDynamic Range Minimum Queries (CSES) + range update variantAdd lazy propagation so range updates run in O(logn) instead of O(n).
3Merge sort treePashmak and Parmida's problem (CF 459D)Each node stores a sorted list of its subarray—enables "count elements ≤ x in range" queries via binary search inside nodes.
4Persistent segment tree for k-th queriesK-th smallest in a range via persistent segment treeBuild a version for each prefix; answer range k-th queries by subtracting two versions. Classic offline technique for static arrays.

How to Practice This

Speed drill—range-sum segment tree from scratch. Set a timer and implement a complete range-sum segment tree (build, point update, range query) on a blank file. No references.

AttemptTargetNotes
1st< 10 minFocus on correctness—get build, update, query right.
2nd< 7 minEliminate pauses—know the recursion structure cold.
3rd+< 5 minCompetition-ready. You should be typing, not thinking.

Once range-sum is automatic, swap the merge operation and identity:

VariationMerge opIdentityPractice problem
Range-minmin(a, b)INT_MAXCSES Range Minimum Queries I
Range-maxmax(a, b)INT_MINCSES Range Maximum Queries
Range-GCD__gcd(a, b)0CF 474F—Sereja and Brackets

For each: implement, test on the linked problem, then time yourself.

Weekly plan:

  1. Week 1: Drill range-sum daily until you hit 5 min consistently.
  2. Week 2: Rotate through min/max/GCD—one variation per day.
  3. Week 3: Add lazy propagation (range update + range query) to your sum tree.
  4. Ongoing: When you see a new seg tree problem, first identify the monoid (merge + identity), then code it.

Solve With Me—CF 339D "Xenia and Bit Operations"

The problem gives 2n numbers and says: on the bottom level, OR adjacent pairs. Next level up, XOR adjacent pairs. Alternate OR/XOR all the way to the root. Then handle point updates and print the root after each one.

First thought: recompute everything from scratch after each update. But n up to 17 means 217=131072 leaves, and m up to 105 updates—rebuilding the whole tree each time is O(m2n), about 1010. Way too slow.

This is literally a segment tree. Each node stores the result of combining its children. The only twist is that the "combine" operation alternates between OR and XOR depending on the depth. A node at depth d from the root uses OR if (nd) is odd, XOR if (nd) is even. A quick trace on the small sample confirms it.

The cleanest implementation stores whether each node is an OR-node or XOR-node directly in the tree. Build it once; on updates, walk up from the changed leaf to the root re-merging each node with the correct operation. Standard point update, O(n) per query—about 105×172×106 operations total. Easily fast enough.

The full implementation is maybe 40 lines of code: just a segment tree where merge(a, b) checks the level to decide OR vs. XOR. Got AC.

What to recognize next time: when a problem defines a recursive combining structure over an array with point updates, it is begging to be a segment tree. "Alternating operations" is just a parameterized merge—don't overthink it.


Further Reading

Built for the climb from Codeforces 1100 to 2100.