Appearance
Virtual Tree (Auxiliary Tree)
Quick Revisit
- USE WHEN: Multiple queries each involving a small subset of k important nodes on a large tree (Steiner tree, subtree DP on marked nodes)
- INVARIANT: Virtual tree has ≤ 2k−1 nodes (important nodes + pairwise LCAs) and preserves ancestor-descendant structure of the original tree
- TIME: O(k log k) construction per query (sorting + LCA) | SPACE: O(k) per query
- CLASSIC BUG: Not adding LCAs of consecutive nodes in the sorted-by-tin order — the virtual tree loses structural edges and disconnects
- PRACTICE: 05-ladder-graphs
Given a tree on
Difficulty: [Advanced]Prerequisites: Euler Tour & LCA, DFS & Tree DFSCF Rating Range: 2100-2500+
The Problem It Solves
You have a tree with
Solving on the full tree costs
The virtual tree lets you solve each query in
Construction Algorithm
Input: The original tree (with LCA preprocessing) and a list of
Output: A small tree of at most
Steps:
- Sort important nodes by Euler tour entry time (
). - Add pairwise LCAs: for each consecutive pair
in sorted order, compute and add it to the set. - Deduplicate and re-sort by
. - Build the virtual tree using a stack: process nodes in
order, maintaining a stack representing the current root-to-leaf path. For each new node, pop stack entries that are not ancestors, adding edges as you go.
Original tree: Important nodes: {3, 7, 11}
1
/ \
2 3 LCA(3,7) = 1, LCA(7,11) = 5
/ \ \
4 5 6 Virtual tree nodes: {1, 3, 5, 7, 11}
/ \
7 8 Virtual tree:
/ 1
9 / \
/ \ 3 5
10 11 / \
7 11
(edges carry original distances)The virtual tree has 5 nodes instead of 11. Edge weights equal the path length in the original tree.
Why consecutive-tin LCAs are enough
The most surprising step of the algorithm is step 2: we only add the LCAs of consecutive pairs in the tin-sorted list, not the LCAs of all
Claim. If you sort the marked nodes by Euler-tour entry time
, the set of LCAs of all pairs equals the set of LCAs of consecutive pairs.
The intuition: in tin order, the marked nodes appear in the order DFS first visits them. Two marked nodes
This is why the construction is
The tin order is critical. Sort by any other order (DFS post-order, label, depth) and the consecutive-pair LCA argument breaks. The DFS-order adjacency of marked nodes is what exposes all the branch LCAs.
Stack-Based Construction
The stack maintains the path from the virtual root to the current working node:
Processing nodes in tin order: [1, 3, 5, 7, 11]
Process 1: stack = [1]
Process 3: 3 is child of 1 -> add edge (1,3), stack = [1, 3]
Process 5: LCA(3,5)=1. Pop 3 (edge 1-3 done). 5 is child of 1.
stack = [1, 5]
Process 7: 7 is child of 5 -> add edge (5,7), stack = [1, 5, 7]
Process 11: LCA(7,11)=5. Pop 7 (edge 5-7 done). 11 is child of 5.
stack = [1, 5, 11]
Done: pop remaining -> edges (5,11), (1,5)Implementation
cpp
#include <bits/stdc++.h>
using namespace std;
// Assume LCA is precomputed: lca(u,v), tin[u] (Euler tour entry time)
// adj_vt: adjacency list of the virtual tree (cleared each query)
vector<int> adj_vt[MAXN];
int stk[MAXN], top_stk;
bool isAnc(int u, int v) { // is u an ancestor of v?
return tin[u] <= tin[v] && tout[v] <= tout[u];
}
void addEdge(int u, int v) {
adj_vt[u].push_back(v);
adj_vt[v].push_back(u);
}
void buildVirtualTree(vector<int>& nodes) {
sort(nodes.begin(), nodes.end(), [](int a, int b) {
return tin[a] < tin[b];
});
// add LCAs of consecutive pairs
int sz = nodes.size();
for (int i = 0; i + 1 < sz; i++) {
nodes.push_back(lca(nodes[i], nodes[i + 1]));
}
sort(nodes.begin(), nodes.end(), [](int a, int b) {
return tin[a] < tin[b];
});
nodes.erase(unique(nodes.begin(), nodes.end()), nodes.end());
// Stack-based construction. The stack holds the current
// root-to-leaf path of the virtual tree under construction.
top_stk = 0;
for (int v : nodes) {
if (top_stk == 0) { stk[top_stk++] = v; continue; }
// Pop stack entries that are NOT ancestors of v -- those
// subtrees are finalised; record their parent edge.
while (top_stk > 1 && !isAnc(stk[top_stk - 1], v)) {
addEdge(stk[top_stk - 2], stk[top_stk - 1]);
top_stk--;
}
int l = lca(v, stk[top_stk - 1]);
if (stk[top_stk - 1] != l) {
// The LCA of v and the current stack top sits strictly
// between them. It must already be in `nodes` (we added
// it during the LCA-of-consecutive-pairs step), so we
// attach the unfinished top to l and replace the top.
addEdge(l, stk[top_stk - 1]);
stk[top_stk - 1] = l;
}
stk[top_stk++] = v;
}
while (top_stk > 1) {
addEdge(stk[top_stk - 2], stk[top_stk - 1]);
top_stk--;
}
}Clean up after each query: clear adj_vt[v] for every node in the virtual tree (don't memset the whole array).
Complexity
| Operation | Time |
|---|---|
| Build virtual tree | |
| Solve on virtual tree | depends on algorithm, but tree has |
| LCA query (binary lifting) | |
| Total per query |
When to Reach for Virtual Tree
- "
special nodes" in a tree query where . - Steiner tree on a tree (minimum subtree connecting marked nodes).
- DP on tree with only a few relevant leaves/nodes per query.
- Any problem where you'd run full tree DP but most nodes are irrelevant.
Pitfalls
- Forgetting to add the root. If the tree is rooted and your algorithm needs the root, include it in the important set.
- Not cleaning up adjacency lists. Only clear nodes that appeared in the virtual tree -- clearing the full array is
per query. - Off-by-one in LCA pairs. You need LCAs of consecutive pairs in sorted tin order, not all pairs.
- Edge weights. If edges carry weights, the virtual tree edge weight between two nodes equals the distance in the original tree (precompute with depth arrays).
Practice Problems
| # | Problem | Difficulty | Notes |
|---|---|---|---|
| 1 | CF 613D -- Kingdom and its Cities | 2500 | Classic virtual tree + greedy |
| 2 | CF 1111E -- Tree | 2500 | Virtual tree + DP |
| 3 | CF 587F -- Duff is Mad | 2800 | Advanced virtual tree application |
| 4 | Luogu P2495 -- 消耗战 | -- | Steiner tree on virtual tree |
Further Reading
- cp-algorithms: Virtual Tree
- Euler Tour & LCA -- prerequisite for LCA preprocessing