Skip to content

Euler Tour and Lowest Common Ancestor (LCA)

Quick Revisit

  • USE WHEN: Converting subtree/path queries to range queries, or finding the lowest common ancestor of two nodes
  • INVARIANT: Subtree of v maps to contiguous interval [tin[v], tout[v]] in the Euler tour array
  • TIME: O(n log n) preprocessing, O(1) LCA query (sparse table) or O(log n) (binary lifting) | SPACE: O(n log n)
  • CLASSIC BUG: Using tin/tout from a 2n-length Euler tour for LCA but indexing into an n-length flat array (confusing the two tour variants)
  • PRACTICE: 05-ladder-graphs

Flatten a rooted tree into an array so that subtree and path queries reduce to range queries -- then answer LCA in O(1) or O(logn).

Difficulty: [Advanced]Prerequisites: DFS and Tree DFS, Sparse Table (RMQ)CF Rating: 1700-2100

Contest Frequency: ** Regular -- key technique for tree problems


Intuition

A tree with n nodes has n1 edges and no cycles. That structure is great for humans but awkward for range-based data structures. An Euler tour fixes this: run a DFS, recording each node when you enter and when you exit. Two flavors of "tour" matter, and confusing them is the classic bug:

  1. Entry/exit timestamps (tin[v], tout[v]). Each node gets two integers from a single counter. Subtrees correspond to contiguous index ranges. Used for subtree queries.
  2. Full Euler tour (length 2n1). Each node is appended every time DFS enters it or returns through it. Used for O(1) LCA via RMQ on depths.

A note on conventions for tout. There are two common variants in the wild, and they give different formulas for subtree ranges:

  • Exit-timestamp convention (this lesson's implementation, length-2n counter): both tin[v] = timer++ and tout[v] = timer++ consume timer ticks. Subtree of v is the set of nodes with tin in [tin[v]+1,tout[v]1]. This convention makes is_ancestor(u, v) = tin[u] <= tin[v] && tout[v] <= tout[u] clean.
  • Subtree-right-bound convention (the visual walkthrough below): only tin increments; tout[v] is the maximum tin value over the subtree of v. Subtree of v is then tin[u] in [tin[v], tout[v]] directly, and the flat array has length exactly n.

Both work. Pick one, write it down at the top of your template, and never mix them.

Why care? Subtree sum, subtree update, path queries -- all become range operations on a flat array, solvable with BIT or segment tree.

LCA (Lowest Common Ancestor): Given two nodes u and v in a rooted tree, lca(u,v) is the deepest node that is an ancestor of both. Two standard approaches:

  1. Binary lifting -- precompute up[v][k] = the 2k-th ancestor of v. To find LCA, lift both nodes to the same depth, then lift both in sync until they meet. Preprocessing O(nlogn), query O(logn).

  2. Euler tour + Sparse Table (RMQ) -- record depths during a full Euler tour of length 2n1. lca(u,v) is the node with minimum depth between the first occurrences of u and v. Preprocessing O(nlogn), query O(1).

Codeforces frequency: LCA appears in div2 D/E, div1 B/C. Tags: trees, dfs and similar, data structures. Any problem mentioning "distance between two nodes on a tree" or "path query" likely needs LCA.

Common trap at ~1100: confusing subtree queries (use entry/exit times) with path queries (need LCA to split the path into two vertical chains).

A worked example

Consider a tree with 7 nodes, rooted at node 1:

           1
         / | \
        2  3  4
       / \    |
      5   6   7

Suppose each node stores a value. We need to answer many queries of two kinds:

  1. LCA(u, v): find the lowest common ancestor of nodes u and v.
  2. Subtree sum(u): compute the sum of all values in the subtree rooted at u.

Naive LCA: walk node u up to the root, walk node v up to the root, find where the paths first meet. Worst case (a chain tree): O(n) per query.

Naive subtree sum: run a DFS from u through its subtree, accumulating values. Again O(n) per query.

With q queries, total cost is O(nq) -- far too slow when n,q2×105.

We need a way to answer both query types in O(logn) or better. The obstacle: a tree is not an array, so we cannot directly apply segment trees, BITs, or sparse tables.

Flatten the tree into an array using an Euler tour -- now every subtree becomes a contiguous range, and LCA reduces to a range minimum query on depths.

Think of it like touring a museum. You enter a room (record entry time), visit every sub-room recursively, then exit (record exit time). The time interval [tin,tout] for any room contains exactly the intervals of all rooms inside it -- no overlaps, no gaps. A sub-room's interval is strictly nested inside its parent's interval.

A tree works the same way. Run a DFS and stamp each node with an entry time tin[v] and an exit time tout[v]. Now:

  • Subtree = contiguous range. Every descendant of v has its tin value between tin[v] and tout[v]. Lay out node values in tin-order and any subtree query becomes a range query on a flat array -- solvable with a BIT or segment tree.
  • LCA = range minimum query. Build a full Euler tour that records a node every time DFS enters or backtracks through it (length 2n1). The LCA of u and v is the shallowest node visited between the first occurrence of u and the first occurrence of v. A sparse table answers this in O(1).

One DFS transforms a tree into array territory.

Visual Walkthrough

We build the Euler tour on the same 7-node tree, step by step.

Step 1 -- Run DFS, record tin/tout. (Subtree-right-bound convention: only tin increments; tout[v] is the maximum tin over subtree(v).)

DFS from root 1, assigning entry times in visit order:

  Visit order:  1 -> 2 -> 5 -> (back) -> 6 -> (back) -> (back) -> 3 -> (back) -> 4 -> 7

  Node:   1   2   5   6   3   4   7
  tin:    0   1   2   3   4   5   6
  tout:   6   3   2   3   4   6   6

Step 2 -- Lay out node values in tin-order (flat array).

  Index:  0     1     2     3     4     5     6
        +-----+-----+-----+-----+-----+-----+-----+
  Node: |  1  |  2  |  5  |  6  |  3  |  4  |  7  |
        +-----+-----+-----+-----+-----+-----+-----+

Step 3 -- Verify subtree = contiguous range.

  subtree(1) = indices [0 .. 6] = {1,2,5,6,3,4,7}  (entire tree)
  subtree(2) = indices [1 .. 3] = {2,5,6}           (correct)
  subtree(4) = indices [5 .. 6] = {4,7}             (correct)
  subtree(5) = indices [2 .. 2] = {5}               (leaf)

Any subtree sum is now a range sum on this array. Use a BIT or segment tree.

Step 4 -- Build full Euler tour for LCA (length 2n1=13).

Record the current node every time DFS enters or returns through it:

  pos:    0   1   2   3   4   5   6   7   8   9  10  11  12
  node:   1   2   5   2   6   2   1   3   1   4   7   4   1
  depth:  0   1   2   1   2   1   0   1   0   1   2   1   0

Record first occurrence of each node:

  first[1]=0  first[2]=1  first[3]=7
  first[4]=9  first[5]=2  first[6]=4  first[7]=10

Step 5 -- Answer LCA queries via RMQ on depth array.

Query: LCA(5, 6)

  Range [first[5] .. first[6]] = [2 .. 4]
  pos:    2    3    4
  depth:  2    1    2
  node:   5    2    6
               ^--- minimum depth = 1 --> node 2
  LCA(5, 6) = 2.  Correct.

Query: LCA(5, 7)

  Range [first[5] .. first[7]] = [2 .. 10]
  pos:    2   3   4   5   6   7   8   9  10
  depth:  2   1   2   1   0   1   0   1   2
  node:   5   2   6   2   1   3   1   4   7
                          ^--- minimum depth = 0 --> node 1
  LCA(5, 7) = 1.  Correct.

Query: LCA(6, 3)

  Range [first[6] .. first[3]] = [4 .. 7]
  pos:    4   5   6   7
  depth:  2   1   0   1
  node:   6   2   1   3
                  ^--- minimum depth = 0 --> node 1
  LCA(6, 3) = 1.  Correct.

The Invariant

Euler-tour invariant (entry/exit times):

For all v in subtree(u): tin[u]tin[v] and tout[v]tout[u].

Equivalently, v is a descendant of u if and only if [tin[v],tout[v]][tin[u],tout[u]].

Subtree range: subtree(u) corresponds to flat-array positions [tin[u],tout[u]].

LCA via RMQ: lca(u,v) is the node with minimum depth in the full Euler tour between positions first[u] and first[v].

These two properties are what make every downstream algorithm work. If the invariant holds, subtree queries reduce to range queries and LCA reduces to RMQ -- both solved in O(logn) or O(1) with standard data structures.

When to Reach for This

Trigger phrases in problem statements:

  • "sum / max / count of values in a subtree" -- Euler tour + BIT/segment tree.
  • "lowest common ancestor" or "LCA" -- binary lifting or Euler tour + sparse table.
  • "distance between two nodes on a tree" -- LCA + depth formula.
  • "flatten a tree into an array" -- Euler tour, directly.
  • "path query on a tree" -- LCA to split the path, then two vertical chains.

Counter-examples -- when Euler tour alone is not enough:

  • Heavy path queries (update/query arbitrary paths): You need Heavy-Light Decomposition (HLD), which chains the tree so each path hits O(logn) chains. Euler tour alone only gives subtree ranges, not arbitrary path ranges.
  • Centroid-based distance queries: Centroid decomposition, not Euler tour.
  • Dynamic trees (link/cut): Link-Cut Trees or Euler Tour Trees (a different, more advanced structure).
  DECISION TREE: Which tree query technique?

  Is it a subtree query only?
    |
    YES --> Euler tour + BIT/segtree
    |
    NO --> Is it a path query?
            |
            YES --> Do you need updates?
            |         |
            |         YES --> HLD (O(log²n))
            |         |
            |         NO --> LCA + depth formula
            |
            NO --> Is it about distances?
                    |
                    YES --> Centroid decomposition
                    |
                    NO --> Link-Cut Tree (dynamic)

What the Code Won't Teach You

The Euler tour is a lens, not just a technique. Once you internalize "subtrees = contiguous ranges," you'll see this pattern everywhere: HLD chains, centroid decomposition subtrees, DSU on tree merges. The code gives you tin/tout arrays; the real skill is recognizing when a tree problem is secretly a range problem in disguise.

LCA is infrastructure, not the solution. In contest problems, LCA is almost never the answer itself -- it's a subroutine for computing distances, splitting paths, or anchoring virtual trees. When you read "path from u to v," your reflex should be: "find LCA, split into two vertical chains, handle each with prefix sums or climbing." The code teaches mechanics; the contest teaches when to deploy.

          problem: "sum on path u->v"
                    |
           +--------+--------+
           |                 |
    find LCA(u,v)     split into two chains
           |                 |
    u -> lca(u,v)      lca(u,v) -> v
           |                 |
    prefix_sum[u]     prefix_sum[v]
    - prefix_sum[lca] - prefix_sum[lca]
           |                 |
           +--------+--------+
                    |
              answer = sum of both

Off-by-one in ancestor checks will haunt you. The is_ancestor(u, v) check using tin[u] <= tin[v] && tout[v] <= tout[u] is elegant but fragile -- one wrong inequality direction and every downstream algorithm silently breaks. You won't notice until a tricky test case. Build the habit of testing on a 3-node chain: root -> middle -> leaf.


Implementation

Version 1: Binary Lifting LCA (Minimal Contest Template)

cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 200'005;
const int LOG = 18; // log2(2e5) < 18

vector<int> adj[MAXN];
int up[MAXN][LOG], depth[MAXN], tin[MAXN], tout[MAXN];
int timer_val = 0;

void dfs(int v, int p) {
    tin[v] = timer_val++;
    up[v][0] = p;
    for (int k = 1; k < LOG; k++)
        up[v][k] = up[v][k - 1] == -1 ? -1 : up[up[v][k - 1]][k - 1];
    for (int u : adj[v]) if (u != p) {
        depth[u] = depth[v] + 1;
        dfs(u, v);
    }
    tout[v] = timer_val++;
}

bool is_ancestor(int u, int v) {
    return tin[u] <= tin[v] && tout[v] <= tout[u];
}

int lca(int u, int v) {
    if (is_ancestor(u, v)) return u;
    if (is_ancestor(v, u)) return v;
    for (int k = LOG - 1; k >= 0; k--)
        if (up[u][k] != -1 && !is_ancestor(up[u][k], v))
            u = up[u][k];
    return up[u][0];
}

int dist(int u, int v) {
    return depth[u] + depth[v] - 2 * depth[lca(u, v)];
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, q;
    cin >> n >> q;
    for (int i = 0; i < n - 1; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    memset(up, -1, sizeof(up));
    depth[1] = 0;
    dfs(1, -1);

    while (q--) {
        int u, v;
        cin >> u >> v;
        cout << lca(u, v) << "\n";
    }
}

Version 2: Binary Lifting LCA (Explained)

cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 200'005;
const int LOG = 18; // ceil(log2(MAXN)), covers up to 2^17 = 131072 ancestors

vector<int> adj[MAXN];
int up[MAXN][LOG];  // up[v][k] = 2^k-th ancestor of v, or -1
int depth[MAXN];    // depth from root
int tin[MAXN];      // DFS entry time (for is_ancestor check)
int tout[MAXN];     // DFS exit time
int timer_val = 0;

void dfs(int v, int p) {
    tin[v] = timer_val++;
    up[v][0] = p; // direct parent

    // Build lifting table: jump 2^k = two jumps of 2^(k-1)
    for (int k = 1; k < LOG; k++) {
        if (up[v][k - 1] == -1)
            up[v][k] = -1; // already past the root
        else
            up[v][k] = up[up[v][k - 1]][k - 1];
    }

    for (int u : adj[v]) {
        if (u != p) {
            depth[u] = depth[v] + 1;
            dfs(u, v);
        }
    }
    tout[v] = timer_val++;
}

// u is ancestor of v iff u's DFS interval contains v's
bool is_ancestor(int u, int v) {
    return tin[u] <= tin[v] && tout[v] <= tout[u];
}

int lca(int u, int v) {
    // Edge case: one is already ancestor of the other
    if (is_ancestor(u, v)) return u;
    if (is_ancestor(v, u)) return v;

    // Lift u as high as possible while staying below LCA
    for (int k = LOG - 1; k >= 0; k--) {
        if (up[u][k] != -1 && !is_ancestor(up[u][k], v))
            u = up[u][k];
    }
    // Now u's parent is the LCA
    return up[u][0];
}

// Tree distance = depth[u] + depth[v] - 2*depth[lca(u,v)]
int dist(int u, int v) {
    return depth[u] + depth[v] - 2 * depth[lca(u, v)];
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, q;
    cin >> n >> q;
    for (int i = 0; i < n - 1; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    memset(up, -1, sizeof(up));
    depth[1] = 0;
    dfs(1, -1); // root at node 1

    while (q--) {
        int u, v;
        cin >> u >> v;
        cout << lca(u, v) << "\n";
    }
}

Version 3: Euler Tour + Sparse Table for O(1) LCA

cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 200'005;
const int MAXM = 2 * MAXN; // Euler tour length <= 2n - 1
const int LOG = 20;

vector<int> adj[MAXN];
int euler[MAXM], dep[MAXM]; // euler tour: node and depth at each position
int first_occ[MAXN];        // first occurrence of node in euler tour
int depth[MAXN];
int idx = 0;

void dfs(int v, int p, int d) {
    depth[v] = d;
    first_occ[v] = idx;
    euler[idx] = v;
    dep[idx] = d;
    idx++;
    for (int u : adj[v]) {
        if (u != p) {
            dfs(u, v, d + 1);
            euler[idx] = v; // record parent when returning
            dep[idx] = d;
            idx++;
        }
    }
}

// Sparse table over dep[] array, storing position of minimum
int sparse[LOG][MAXM];

void build_sparse(int n) {
    for (int i = 0; i < n; i++) sparse[0][i] = i;
    for (int k = 1; (1 << k) <= n; k++)
        for (int i = 0; i + (1 << k) <= n; i++) {
            int l = sparse[k - 1][i];
            int r = sparse[k - 1][i + (1 << (k - 1))];
            sparse[k][i] = (dep[l] <= dep[r]) ? l : r;
        }
}

int rmq(int l, int r) { // returns position of min depth in [l, r]
    int k = __lg(r - l + 1);
    int pl = sparse[k][l];
    int pr = sparse[k][r - (1 << k) + 1];
    return (dep[pl] <= dep[pr]) ? pl : pr;
}

int lca(int u, int v) {
    int l = first_occ[u], r = first_occ[v];
    if (l > r) swap(l, r);
    return euler[rmq(l, r)];
}

int dist(int u, int v) {
    return depth[u] + depth[v] - 2 * depth[lca(u, v)];
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, q;
    cin >> n >> q;
    for (int i = 0; i < n - 1; i++) {
        int u, v;
        cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    dfs(1, -1, 0);
    build_sparse(idx);

    while (q--) {
        int u, v;
        cin >> u >> v;
        cout << lca(u, v) << "\n";
    }
}

ASCII Trace: Euler Tour Construction on a 5-Node Tree

Tree:               Adjacency list (rooted at 0):
       0              0: [1, 2]
      / \             1: [3, 4]
     1   2            2: []
    / \               3: []
   3   4              4: []

DFS walk (record node on ENTER and on EXIT of each child):

Step   Action        euler[]              tin[]       tout[]
───────────────────────────────────────────────────────────────
 1     enter 0       [0]                  tin[0]=0     -
 2     enter 1       [0,1]                tin[1]=1     -
 3     enter 3       [0,1,3]              tin[3]=2     -
 4     leave 3       [0,1,3,1]            -            tout[3]=3
 5     enter 4       [0,1,3,1,4]          tin[4]=4     -
 6     leave 4       [0,1,3,1,4,1]        -            tout[4]=5
 7     leave 1       [0,1,3,1,4,1,0]      -            tout[1]=6
 8     enter 2       [0,1,3,1,4,1,0,2]    tin[2]=7     -
 9     leave 2       [0,1,3,1,4,1,0,2,0]  -            tout[2]=8
10     leave 0       [0,1,3,1,4,1,0,2,0]  -            tout[0]=8

Final arrays:
  euler[] = [0, 1, 3, 1, 4, 1, 0, 2, 0]   (length = 2n - 1 = 9)
  tin[]   = [0, 1, 2, -, 4, -, -, 7, -]    -> [0, 1, 2, 4, 7] for nodes 0-4
  tout[]  = [8, 6, 8, 3, 5]

LCA(3,4) = node at min-depth in euler[tin[3]..tin[4]] = euler[2..4] -> min depth is node 1
LCA(3,2) = node at min-depth in euler[tin[3]..tin[2]] = euler[2..7] -> min depth is node 0

Operations Reference

OperationBinary LiftingEuler Tour + RMQ
PreprocessingO(nlogn)O(nlogn)
LCA queryO(logn)O(1)
k-th ancestorO(logn)N/A
is_ancestor checkO(1)O(1)
Tree distanceO(logn)O(1)
SpaceO(nlogn)O(nlogn)
  COMPARISON: Binary lifting vs Euler tour + RMQ

  Binary Lifting:              Euler Tour + Sparse Table:
  +------------------------+   +---------------------------+
  | up[v][k] table         |   | euler[], dep[], first[]   |
  | O(n log n) build       |   | O(n log n) build          |
  | O(log n) per LCA       |   | O(1) per LCA              |
  | + gives k-th ancestor  |   | + gives subtree ranges    |
  | - no subtree queries   |   | - no k-th ancestor        |
  +------------------------+   +---------------------------+

When to pick which:

  • Binary lifting when you also need k-th ancestor or need to walk up the tree.
  • Euler tour + RMQ when you need O(1) LCA and/or subtree range queries.
  • Both approaches handle n2×105 easily within 2 seconds.


🔍 Why This Works -- DFS Timestamps Encode Ancestry

When DFS visits a node u, it records tin[u] (entry) and tout[u] (exit). The key insight: u is an ancestor of v if and only if tin[u] <= tin[v] and tout[v] <= tout[u].

Why? DFS explores u's entire subtree before leaving u. So every descendant v gets its entry time after u enters and its exit time before u exits. The interval [tin[u],tout[u]] completely contains [tin[v],tout[v]].

This also means the subtree of u corresponds to a contiguous range in the Euler tour array -- enabling range queries (sums, mins, etc.) over subtrees using segment trees or BITs.

Problem Patterns

Pattern 1: Tree Distance Queries

Compute distance between two nodes: dist(u,v)=depth[u]+depth[v]2depth[lca(u,v)]. Appears in almost every LCA problem.

cpp
int dist(int u, int v) {
    return depth[u] + depth[v] - 2 * depth[lca(u, v)];
}

Problems: CSES 1135 -- Distance Queries, CF 519E

Pattern 2: Path Aggregation (Sum / Max on Path)

Split path uv into ulca and vlca. Use Euler tour + BIT for prefix sums, or binary lifting to walk up collecting values.

cpp
// With binary lifting, walk up collecting edge weights:
long long path_sum(int u, int anc) {
    long long s = 0;
    for (int k = LOG - 1; k >= 0; k--)
        if (depth[u] - (1 << k) >= depth[anc]) {
            s += w[u][k]; // w[u][k] = sum of 2^k edges above u
            u = up[u][k];
        }
    return s;
}

Problems: CF 191C, CSES 1138 -- Path Queries

Pattern 3: Subtree Queries via Euler Tour + BIT/Segtree

Flatten tree with entry/exit times. Subtree of v maps to interval [tin[v],tout[v]). Point update on node point update on flat array. Subtree sum range sum.

cpp
// Update node v's value, query subtree sum of v
void update(int v, int delta) { bit_update(tin[v], delta); }
long long subtree_sum(int v) {
    return bit_query(tout[v]) - bit_query(tin[v] - 1);
}

Problems: CSES 1137 -- Subtree Queries, CF 620E

Pattern 4: Path Update via Euler Tour Difference Trick

To add x to all nodes on path uv: add x at tin[u] and tin[v], subtract x at tin[lca] and tin[parent(lca)]. Then subtree sum at any node gives its accumulated value.

cpp
void path_add(int u, int v, int x) {
    int l = lca(u, v);
    bit_update(tin[u], x);
    bit_update(tin[v], x);
    bit_update(tin[l], -x);
    if (up[l][0] != -1) bit_update(tin[up[l][0]], -x);
}
// Value at node w = subtree_sum(w) using BIT on Euler-tour array

Problems: CSES 1138 -- Path Queries, CF 191C

Pattern 5: LCA for Merging Paths / Checking Connectivity

Given a set of nodes, their "virtual tree" (auxiliary tree) is built by sorting by Euler tour order and taking consecutive LCAs. Used in problems asking about a subset of nodes on a tree.

Problems: CF 613D, CF 1702G2

Pattern 6: Diameter and Farthest Node

The farthest node from any node u is an endpoint of the diameter. Find diameter via two BFS/DFS, or use LCA to compute distances between candidate endpoints.

cpp
// After finding diameter endpoints d1, d2:
// farthest from u = max(dist(u, d1), dist(u, d2))

Problems: CSES 1131 -- Tree Diameter, CF 1294F

Pattern 7: LCA on Weighted Trees (Edge Weights)

Store edge weight in the child node. Distance query is the same formula. For max-edge on path, store max in the lifting table alongside the ancestor.

cpp
int max_edge[MAXN][LOG]; // max edge weight in 2^k steps above v
// build: max_edge[v][k] = max(max_edge[v][k-1], max_edge[up[v][k-1]][k-1])

Problems: CF 609E, CSES 1136 -- Tree Distances II


Contest Cheat Sheet

+---------------------------------------------------------------+
|              EULER TOUR & LCA CHEAT SHEET                     |
+---------------------------------------------------------------+
| WHEN TO USE:                                                  |
|   o "Distance between nodes on tree"                          |
|   o "Sum/max along path u->v"                                 |
|   o "Update/query all nodes in subtree"                       |
+---------------------------------------------------------------+
| APPROACH:                                                     |
|   o k-th ancestor needed?  --> binary lifting O(log n)        |
|   o O(1) LCA needed?      --> euler tour + sparse table       |
|   o Subtree queries?      --> euler tour + BIT/segtree        |
|   o Path update?          --> euler tour + difference BIT     |
+---------------------------------------------------------------+
| KEY FORMULAS:                                                 |
|   dist(u,v) = dep[u] + dep[v] - 2*dep[lca(u,v)]             |
|   subtree(v) = flat[tin[v] .. tout[v]]                        |
|   up[v][k] = up[up[v][k-1]][k-1]                             |
|   LOG = 18 (n<=2e5), LOG = 20 (n<=1e6)                       |
+---------------------------------------------------------------+
| COMPLEXITY: Preprocess O(n log n), Space O(n log n)           |
|             LCA query O(log n) lifting / O(1) RMQ            |
+---------------------------------------------------------------+

Gotchas & Debugging

  1. Root choice matters. If the problem doesn't specify a root, root at node 1. Some problems give a forest -- add a virtual root connecting all components.

  2. 1-indexed vs 0-indexed. The code above uses 1-indexed nodes. If you mix indexing, up[0][k] reads garbage. Initialize memset(up, -1, sizeof(up)) and always check for -1.

  3. Stack overflow on deep trees. n=2×105 with a chain tree = recursion depth 2×105. Default stack is 1-8 MB. Either increase stack size (ulimit -s unlimited on Linux, or pragma on Windows) or convert DFS to iterative.

    cpp
    // Iterative DFS skeleton:
    stack<pair<int,int>> stk;
    stk.push({root, -1});
    while (!stk.empty()) {
        auto [v, p] = stk.top(); stk.pop();
        // process v ...
        for (int u : adj[v]) if (u != p) stk.push({u, v});
    }
  4. Euler tour length. For LCA via RMQ, the tour has 2n1 entries, not 2n. Size your arrays accordingly: MAXM = 2 * MAXN is safe.

  5. LOG value. LOG = 18 covers n2×105. For n106, use LOG = 20. Too small = wrong answers on deep trees. Too large = wasted memory (rarely an issue).

  6. Forgetting is_ancestor check in lca(). If u is already an ancestor of v, the lifting loop won't find the LCA -- it will overshoot. Always handle this case first.

  7. Sparse table off-by-one. rmq(l, r) must be inclusive on both ends. Double-check that r - (1 << k) + 1 doesn't underflow.

  8. Multi-test reset. If the problem has t test cases, clear adjacency lists and reset timer_val = 0, idx = 0 between tests. Forgetting this is the #1 source of WA on multi-test tree problems.

  9. Directed vs undirected input. Some problems give parent arrays instead of edge lists. Build the adjacency list accordingly -- don't add both directions if you already have a rooted tree.

Mental Traps

Trap 1: "I'll just walk up the tree -- it's only one query."

Every LCA query starts as "just one query." Then the problem adds 105 more. The naive O(n) walk is the gateway drug to TLE. Even if the problem says "answer one query," check whether that query sits inside a loop you haven't noticed yet.

  The "just walk up" trap:

  Expected:     Actual (inside a loop you missed):

  LCA(u,v)      for each edge (u,v) in MST:
    |               LCA(u, v)        <-- O(n) x O(n) = O(n²)
    O(n)              |
    |               total: TLE
    done

Trap 2: "Euler tour = DFS preorder, same thing."

They share the entry timestamps, but preorder gives you ONE number per node. Euler tour gives you TWO (tin and tout), and the full LCA tour gives you a sequence of length 2n1. Confusing them means you'll build a sparse table on the wrong array or query subtree ranges without tout.

  Preorder:     tin only      [1, 2, 5, 6, 3, 4, 7]
                              (7 entries for 7 nodes)

  Euler tour:   tin + tout    each node appears TWICE
                              (14 timestamps for 7 nodes)

  Full LCA tour: every visit  [1,2,5,2,6,2,1,3,1,4,7,4,1]
                              (13 = 2n-1 entries)

  +-----------+---------------------+-------------------+
  | Variant   | Use case            | Length             |
  +-----------+---------------------+-------------------+
  | Preorder  | "which nodes come   | n                  |
  |           |  before/after?"     |                    |
  +-----------+---------------------+-------------------+
  | tin/tout  | subtree ranges      | 2 values per node  |
  |           | ancestor checks     |                    |
  +-----------+---------------------+-------------------+
  | Full tour | RMQ-based O(1) LCA  | 2n - 1             |
  +-----------+---------------------+-------------------+

Common Mistakes

  1. Inconsistent tin/tout conventions. This lesson uses two conventions in different sections by design (see the intuition section above): the visual walkthrough uses the subtree-right-bound convention, and the implementation uses the exit-timestamp convention. Both are valid in isolation, but mixing them in the same solution silently breaks subtree ranges. Pick one and stick with it: either tin[v] = timer++ at entry and tout[v] = timer++ at exit (subtree is half-open (tin[v], tout[v]) of timer values), or only tin increments and tout[v] = max(tin in subtree) (subtree is [tin[v], tout[v]] directly).

Debug Drills

Drill 1 -- Binary lifting: k-th ancestor out of bounds

cpp
int kth_ancestor(int v, int k) {
    for (int i = 0; i < LOG; i++)
        if (k & (1 << i))
            v = up[v][i];  // BUG: v might become -1
    return v;
}
What's wrong?

If k exceeds the depth of v, up[v][i] eventually becomes -1 (or 0 for root, depending on convention), and subsequent lookups access invalid state. Add a check:

cpp
if (v == -1) return -1;  // node doesn't exist

Also ensure up[root][0] = root (or -1) is set correctly during preprocessing.

Drill 2 -- Euler tour: using tout as inclusive but BIT as exclusive

cpp
// Subtree sum query for node v
int sum = bit.query(tout[v]) - bit.query(tin[v] - 1);
What's wrong?

If tout[v] is the last timestamp in v's subtree (inclusive), this is correct. But if your DFS sets tout[v] = timer after incrementing past the subtree (exclusive), then you need bit.query(tout[v] - 1). Mismatching inclusive/exclusive conventions between the Euler tour and the data structure is a classic bug.

Drill 3 -- Sparse table LCA: comparing wrong values

cpp
// RMQ returns index of minimum in euler_tour_depth
int lca(int u, int v) {
    int l = first[u], r = first[v];
    if (l > r) swap(l, r);
    return rmq_query(l, r);  // returns depth, not the node!
}
What's wrong?

The RMQ should return the node with minimum depth, not the depth value itself. Store (depth, node) pairs in the sparse table, and the minimum is taken by comparing depths but returning the node:

cpp
auto [d, node] = rmq_query(l, r);  // returns (min_depth, corresponding_node)
return node;

Self-Test

Before moving on, verify you can answer these without looking at the notes:

  • [ ] Write a DFS that computes tin[v] and tout[v] for every node -- then use it to check if u is an ancestor of v in O(1).
  • [ ] Build the binary lifting table and answer LCA queries in O(logn) -- including the depth-equalization step and the final up[u][0] return.
  • [ ] Given a tree with node values, flatten it with an Euler tour and answer "sum of values in subtree(v)" using a BIT on the flat array.
  • [ ] State the difference between the tin/tout Euler tour (length 2n) and the full Euler tour for RMQ-based LCA (length 2n1) -- and when to use each.
  • [ ] Compute dist(u, v) = depth[u] + depth[v] - 2 * depth[lca(u,v)] from memory, and explain why it works.

Practice Problems

CF RatingWhat You Should Be Able to Do
1200Compute LCA by walking up from both nodes (brute force O(n)).
1500Binary lifting for O(logn) LCA; Euler tour for subtree sum queries with BIT.
1800Sparse table for O(1) LCA; path queries by reducing to f(root,u) + f(root,v) - 2*f(root,lca).
2100Virtual tree (auxiliary tree) construction from a subset of key nodes; combine Euler tour with HLD or centroid decomp.
#ProblemSourceDifficultyKey Idea
1Distance QueriesCSESEasyDirect LCA + distance formula
2Company Queries ICSESEasyk-th ancestor via binary lifting
3Company Queries IICSESEasyDirect LCA query
4Subtree QueriesCSESMediumEuler tour + BIT for subtree sums
5Path QueriesCSESMediumEuler tour + BIT difference trick
6CF 519E -- A and B and Lecture RoomsCFMediumLCA + distance + midpoint on path
7CF 609E -- Minimum spanning tree for each edgeCFMediumMST + LCA with max-edge on path
8CF 191C -- Fools and RoadsCFMedium-HardPath increment via LCA + difference array
9CF 613D -- Kingdom and its CitiesCFHardVirtual tree (auxiliary tree) construction
10CF 1702G2 -- Passable Paths (hard)CFHardSort by Euler tour, consecutive LCAs

Further Reading

Built for the climb from Codeforces 1100 to 2100.