Appearance
DFS and Tree DFS
AL-13 | Depth-first search on general graphs and rooted trees -- the backbone of nearly every graph problem on Codeforces.
Difficulty: [Beginner]Prerequisites: Graph Representation, Stack, Queue, and DequeCF Rating: 1100-1500
Contest Frequency: *** Common -- appears in most Div 2 contests
Quick Revisit
- USE WHEN: reachability, cycle detection, connected components, tree traversal, back-edge classification
- INVARIANT: each node visited exactly once; recursion stack reveals ancestry
- TIME: O(V + E) | SPACE: O(V) stack depth
- CLASSIC BUG: stack overflow on large graphs (use iterative DFS or increase stack size)
- PRACTICE: ../12-practice-ladders/05-ladder-graphs.md
Contents
- Intuition
- C++ STL Reference
- Implementation (Contest-Ready)
- Operations Reference
- Problem Patterns & Tricks
- Contest Cheat Sheet
- Gotchas & Debugging
- Common Mistakes
- Self-Test
- Practice Problems
- Iterative DFS and BFS
- Code Reading Exercise
- Further Reading
- Recap
Intuition
This file teaches DFS in two passes. The first treats DFS as a way to visit every reachable node — the bare traversal. The second specialises to DFS on a rooted tree, where the recursion stack becomes a path from the root and timestamps become the subtree structure. Edge classification (tree/back/forward/cross) is a consequence of the second view; it appears later, after you have a use for it.
Part 1 — Plain DFS: visit every reachable node
Suppose you have 6 nodes and some edges, and you need to visit every node reachable from node 1.
Graph:
1 -- 2 2 -- 4
1 -- 3 2 -- 5
3 -- 6
1
/ \
2 3
/ \ \
4 5 6Naive attempt: pick a random neighbor, go there, pick another random neighbor... You might visit node 2, then jump back to 1, visit 2 again, wander to 3, revisit 1 a third time. With
The key insight:
Go as deep as possible before backtracking -- this naturally explores every reachable node exactly once using a stack (or recursion).
Analogy: you are in a maze. At every fork, always take the first unexplored corridor and walk until you hit a dead end. Then retrace your steps to the last fork that still has an unexplored corridor and repeat. You never re-enter a corridor you have already explored.
The "break" in this strategy is when you reach a dead end (a node all of whose neighbors are visited). At that moment the stack pops and you backtrack to the most recent node that still has unexplored neighbors. The stack guarantees you return to exactly the right place.
This first view of DFS — visit every reachable node, exactly once, backed by a stack — is sufficient for connected components, flood fill, reachability tests, and undirected cycle detection. Everything in Part 1 below works at this level.
Visual Walkthrough
Same 6-node graph, DFS from node 1, visiting smaller-numbered neighbors first.
adj[1] = {2, 3} adj[2] = {1, 4, 5} adj[3] = {1, 6}
adj[4] = {2} adj[5] = {2} adj[6] = {3}
Step Action Stack (top->right) Visited tin tout
---- ----------------- ------------------ --------------- --- ----
1 visit 1 [1] {1} 0 -
2 visit 2 (from 1) [1, 2] {1,2} 1 -
3 visit 4 (from 2) [1, 2, 4] {1,2,4} 2 -
4 backtrack from 4 [1, 2] {1,2,4} - 3
5 visit 5 (from 2) [1, 2, 5] {1,2,4,5} 4 -
6 backtrack from 5 [1, 2] {1,2,4,5} - 5
7 backtrack from 2 [1] {1,2,4,5} - 6
8 visit 3 (from 1) [1, 3] {1,2,3,4,5} 7 -
9 visit 6 (from 3) [1, 3, 6] {1,2,3,4,5,6} 8 -
10 backtrack from 6 [1, 3] {1,2,3,4,5,6} - 9
11 backtrack from 3 [1] {1,2,3,4,5,6} - 10
12 backtrack from 1 [] {1,2,3,4,5,6} - 11
DFS order: 1 -> 2 -> 4 -> 5 -> 3 -> 6
Total node visits: 6 (each node entered exactly once)
Total edge inspections: 2 * 5 = 10 (each undirected edge checked from both ends)
Overall work: O(n + m) = O(6 + 5) = O(11)Discovery times (tin) and finish times (tout) are shown above. They define nested intervals: node
The Invariant
+------------------------------------------------------------------+
| INVARIANT: Every node is visited exactly once. The recursion |
| stack contains the current path from the source to the node |
| being processed. |
+------------------------------------------------------------------+Informal induction proof that DFS visits every reachable node exactly once:
Base case (
Inductive step. Assume that after
- If
is already visited, we skip it -- no double visit. - If
is unvisited, we mark it visited before recursing (or pushing). So is entered exactly once.
Because the graph is finite, the process terminates. Any reachable node
BFS vs DFS -- when to pick which:
| Criterion | DFS | BFS |
|---|---|---|
| Memory | ||
| Shortest path (unweighted) | No | Yes -- BFS gives shortest paths |
| Connectivity / reachability | Yes | Yes |
| Cycle detection | Yes -- back edge to gray node | Possible but less natural |
| Topological sort | Yes -- reverse of finish order | Yes -- Kahn's algorithm |
| Tree properties (subtree size, diameter, tin/tout) | Natural fit | Awkward |
| Subtree queries / Euler tour | DFS only | Not applicable |
Rule of thumb. If you need shortest distance in an unweighted graph or level-order information, use BFS. For almost everything else -- connectivity, cycle detection, topological sort, tree structure -- DFS is simpler and often more powerful. DFS uses
Part 2 — DFS on rooted trees: ancestry, time, and structure
The plain-DFS view above is enough for traversal. But on a rooted tree (or a DFS tree of any graph), DFS does more than visit nodes — the order in which it enters and exits them is itself the data structure. Three insights separate "I can code DFS" from "I understand DFS":
1. The recursion stack IS the current source-to-node path.
When DFS is processing node v, every node on the call stack forms the unique tree path from the root to v. This is why a back edge (to a GRAY node on the stack) immediately gives you a cycle -- the stack path from that ancestor down to v, plus the back edge, IS the cycle.
DFS tree (--- = tree edge, ... = back edge):
1 ---> 2 ---> 3 Stack at node 3: [1, 2, 3]
:
:..> 1 back edge: 3 sees GRAY node 1
cycle = 1 -> 2 -> 3 -> 1
(stack path 1->2->3 + back edge 3->1)2. tin/tout encode the entire subtree structure in O(1) queries.
The code increments a timer on entry and exit. The insight the code won't teach you: these times form nested intervals. [tin[u], tout[u]] fully contains [tin[v], tout[v]] if and only if v is in the subtree of u. This turns "is v a descendant of u?" into a simple range check -- no tree traversal needed.
DFS tree with tin/tout (same 6-node graph):
1 [0,11] [1 [2 [4]4 [5]5 ]2 [3 [6]6 ]3 ]1
/ \ 0 1 2 3 4 5 6 7 8 9 10 11
2 3
/ \ \ Is 6 in subtree(1)?
4 5 6 tin[1]=0 <= tin[6]=8 AND
[2,3] [4,5] [8,9] tout[6]=9 <= tout[1]=11 => YES
[1,6] [7,10]
Is 5 in subtree(3)?
tin[3]=7 <= tin[5]=4? NO => NOT in subtree3. Once you have a DFS tree, every edge has a name.
Now that ancestry is well-defined, the edges DFS sees on a directed graph fall into four categories:
- Tree edge: to a WHITE (undiscovered) node — explored next.
- Back edge: to a GRAY (on-stack) ancestor — closes a cycle.
- Forward edge: to a BLACK descendant already in the subtree.
- Cross edge: to a BLACK node in a different subtree.
In undirected graphs only tree and back edges exist. This classification is what turns DFS into directed-cycle detection, topological sort, SCC decomposition, and bridge finding — each of those algorithms is just "DFS plus interpret the edges."
If this is your first read, you do not need to memorise the four categories. The one to keep is back edge ⇒ cycle. The others appear in advanced graph algorithms once they have a job to do.
When to Reach for This
DFS is the go-to for: cycle detection, topological ordering (via post-order), connected components, articulation points/bridges, tree diameter, and subtree queries.
Trigger phrases in the problem statement:
- "connected components" -> DFS (or BFS)
- "cycle detection" -> DFS with colors (WHITE/GRAY/BLACK)
- "tree traversal", "subtree queries" -> tree DFS with tin/tout
- "topological order" -> DFS finish-time reverse
- "bipartite check" -> DFS 2-coloring
- "reachability" -> DFS or BFS
- "bridges", "articulation points" -> DFS with
low[]array (Tarjan) - "ancestor check", "back edge" -> DFS timestamps
- "tree diameter" -> two-pass DFS (farthest node twice)
- "subtree sum/update" on a tree -> DFS Euler tour + range query
Counter-examples (do NOT use DFS here):
- "shortest path in an unweighted graph" -> use BFS
- "shortest path with weights" -> use Dijkstra / Bellman-Ford
- "minimum spanning tree" -> use Kruskal / Prim
DFS State Snapshots
Using the same 6-node graph from the walkthrough, here are snapshots of DFS state at key moments. Nodes are colored WHITE (undiscovered), GRAY (on the recursion stack), or BLACK (finished -- all descendants processed).
Snapshot 1: Mid-traversal (step 5 -- visiting node 5 from node 2)
1[G] Recursion stack
/ \ (= current path from root):
2[G] 3[W] +---+
/ \ \ | 5 | <-- top (currently processing)
4[B] 5[G] 6[W] | 2 |
| 1 |
W = WHITE (undiscovered) +---+
G = GRAY (on stack)
B = BLACK (finished)
Nodes 1, 2, 5 are GRAY -- they form the path 1 -> 2 -> 5.
Node 4 is BLACK -- fully processed, already backtracked.
Nodes 3, 6 are WHITE -- not yet reached.Snapshot 2: Back edge discovered -- edge to a GRAY node means cycle
Suppose an extra directed edge (5 -> 1) exists.
When DFS at node 5 examines neighbor 1:
node 1 is GRAY (on stack) => back edge => CYCLE detected!
1[G] <.............
/ \ . back edge (5 -> 1)
2[G] 3[W] . cycle: 1 -> 2 -> 5 -> 1
/ \ \ .
4[B] 5[G] ..' 6[W]
Rule: edge to GRAY node = back edge = cycle exists
edge to BLACK node = forward/cross edge = no cycle
edge to WHITE node = tree edge (explore it)Snapshot 3: Nested tin/tout intervals as brackets
tin/tout values from the walkthrough:
Node: 1 2 4 5 3 6
tin: 0 1 2 4 7 8
tout: 11 6 3 5 10 9
Visualized as nested brackets on a number line:
0 1 2 3 4 5 6 7 8 9 10 11
| | | | | | | | | | | |
[1 [2 [4 ]4 [5 ]5 ]2 [3 [6 ]6 ]3 ]1
Reading the nesting:
- [4]=[2,3] inside [2]=[1,6] inside [1]=[0,11]
=> 4 in subtree(2), 2 in subtree(1) YES
- [6]=[8,9] inside [3]=[7,10]
=> 6 in subtree(3) YES
- [3]=[7,10] inside [2]=[1,6]? tin[3]=7 > tout[2]=6
=> 3 NOT in subtree(2) NO
O(1) subtree test:
v in subtree(u) <=> tin[u] <= tin[v] AND tout[v] <= tout[u]C++ STL Reference
| Function / Class | Header | What it does | Time Complexity |
|---|---|---|---|
vector<vector<int>> adj(n) | <vector> | Adjacency list representation | |
vector<bool> vis(n, false) | <vector> | Visited array | |
stack<int> | <stack> | Explicit stack for iterative DFS | |
function<void(int,int)> | <functional> | Wrapper for recursive lambda | |
ranges::fill(vis, false) | <algorithm> | Reset visited array between test cases |
Implementation (Contest-Ready)
Version 1: Graph DFS -- Recursive + Iterative
cpp
#include <bits/stdc++.h>
using namespace std;
int main(){
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n, m;
cin >> n >> m;
vector<vector<int>> adj(n + 1);
for(int i = 0; i < m; i++){
int u, v; cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
// --- Recursive DFS ---
vector<bool> vis(n + 1, false);
int comp = 0;
auto dfs = [&](auto& self, int u) -> void {
vis[u] = true;
for(int v : adj[u])
if(!vis[v]) self(self, v);
};
for(int i = 1; i <= n; i++){
if(!vis[i]){
dfs(dfs, i);
comp++;
}
}
cout << "Connected components (recursive): " << comp << "\n";
// --- Iterative DFS (safe for large n) ---
fill(vis.begin(), vis.end(), false);
comp = 0;
for(int i = 1; i <= n; i++){
if(vis[i]) continue;
comp++;
stack<int> st;
st.push(i);
vis[i] = true;
while(!st.empty()){
int u = st.top(); st.pop();
for(int v : adj[u]){
if(!vis[v]){
vis[v] = true;
st.push(v);
}
}
}
}
cout << "Connected components (iterative): " << comp << "\n";
}Version 2: Tree DFS with tin/tout, Depth, Subtree Size, Parent
cpp
#include <bits/stdc++.h>
using namespace std;
int main(){
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n;
cin >> n;
vector<vector<int>> adj(n + 1);
for(int i = 0; i < n - 1; i++){
int u, v; cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
vector<int> tin(n + 1), tout(n + 1), depth(n + 1, 0);
vector<int> par(n + 1, 0), sz(n + 1, 1);
int timer = 0;
// Recursive tree DFS from root = 1
auto dfs = [&](auto& self, int u, int p) -> void {
par[u] = p;
tin[u] = timer++;
for(int v : adj[u]){
if(v == p) continue;
depth[v] = depth[u] + 1;
self(self, v, u);
sz[u] += sz[v];
}
tout[u] = timer++;
};
dfs(dfs, 1, 0);
// Check if u is ancestor of v in O(1)
auto is_ancestor = [&](int u, int v) -> bool {
return tin[u] <= tin[v] && tout[v] <= tout[u];
};
// Answer queries
int q;
cin >> q;
while(q--){
int u, v; cin >> u >> v;
if(is_ancestor(u, v))
cout << u << " is ancestor of " << v << "\n";
else
cout << u << " is NOT ancestor of " << v << "\n";
}
}Explained Version: Tree DFS with Inline Comments
cpp
#include <bits/stdc++.h>
using namespace std;
int main(){
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n;
cin >> n;
vector<vector<int>> adj(n + 1);
for(int i = 0; i < n - 1; i++){
int u, v; cin >> u >> v;
adj[u].push_back(v);
adj[v].push_back(u);
}
// tin[u]: time when we first enter node u
// tout[u]: time when we finish processing u and all its subtree
// Together, [tin[u], tout[u]] defines the "interval" of u's subtree.
vector<int> tin(n + 1), tout(n + 1), depth(n + 1, 0);
vector<int> par(n + 1, 0), sz(n + 1, 1);
int timer = 0;
// We use auto& self pattern for recursive lambdas.
// p = parent, prevents going back up the edge we came from.
auto dfs = [&](auto& self, int u, int p) -> void {
par[u] = p;
tin[u] = timer++; // record entry time
for(int v : adj[u]){
if(v == p) continue; // skip parent edge (tree has no other cycles)
depth[v] = depth[u] + 1;
self(self, v, u); // recurse into child
sz[u] += sz[v]; // after child returns, accumulate subtree size
}
tout[u] = timer++; // record exit time
};
dfs(dfs, 1, 0); // root at 1, parent of root is 0 (sentinel)
// Ancestor check: u is ancestor of v iff v's interval is inside u's interval.
// This works because DFS timestamps form nested intervals on a tree.
auto is_ancestor = [&](int u, int v) -> bool {
return tin[u] <= tin[v] && tout[v] <= tout[u];
};
int q;
cin >> q;
while(q--){
int u, v; cin >> u >> v;
cout << (is_ancestor(u, v) ? "YES" : "NO") << "\n";
}
}ASCII Trace: DFS on a 7-Node Graph
Adjacency list:
0: [1, 2]
1: [0, 3, 4]
2: [0, 5]
3: [1]
4: [1, 6]
5: [2]
6: [4]
Step Action Stack (top->) Visited disc/fin
---------------------------------------------------------------------
1 push 0 [0] {} -
2 pop 0, visit [] {0} 0: 1/-
3 push 2,1 [2,1] {0} -
4 pop 1, visit [2] {0,1} 1: 2/-
5 push 4,3 [2,4,3] {0,1} -
6 pop 3, visit [2,4] {0,1,3} 3: 3/4
7 pop 4, visit [2] {0,1,3,4} 4: 5/-
8 push 6 [2,6] {0,1,3,4} -
9 pop 6, visit [2] {0,1,3,4,6} 6: 6/7
10 (4 finished) [2] {0,1,3,4,6} 4: 5/8
11 (1 finished) [2] {0,1,3,4,6} 1: 2/9
12 pop 2, visit [] {0,1,2,3,4,6} 2: 10/-
13 push 5 [5] {0,1,2,3,4,6} -
14 pop 5, visit [] {0,1,2,3,4,5,6} 5: 11/12
15 (2 finished) [] {0,1,2,3,4,5,6} 2: 10/13
16 (0 finished) [] {0,1,2,3,4,5,6} 0: 1/14
DFS tree edges: 0->1, 1->3, 1->4, 4->6, 0->2, 2->5Operations Reference
DFS itself is always
| Operation | Time | Space |
|---|---|---|
| DFS traversal (graph) | ||
| DFS traversal (tree) | ||
| Compute tin/tout | ||
| Compute subtree sizes | ||
| Compute depths | ||
| Ancestor check (with tin/tout) | ||
| Cycle detection (directed) | ||
| Cycle detection (undirected) | ||
| Connected components |
Recursive vs Iterative -- When to Pick Which
Both versions appear in Version 1 above. Key points:
- Iterative DFS processes neighbors in reverse order compared to recursive (the stack reverses adjacency-list iteration).
- For tree DFS with entry/exit times, iterative requires tracking whether a node is being entered or exited -- more bookkeeping than recursive.
- For most contest problems with
, recursive DFS is fine. Increase stack size with ulimit -s unlimitedor a thread with larger stack. - For
on strict stack limits, use iterative DFS (see Gotcha 1).
Problem Patterns & Tricks
Pattern 1: Connected Components
Count or label connected components in an undirected graph. Run DFS from every unvisited node -- each new DFS call is a new component.
cpp
int comp = 0;
for(int i = 1; i <= n; i++){
if(!vis[i]){
dfs(dfs, i);
comp++;
}
}CF examples: CF 977E, CF 1176E
Pattern 2: Cycle Detection (Directed Graph)
Color-based DFS: white/gray/black. An edge to a gray node is a back edge (cycle). To reconstruct the cycle, record parents and trace back from the gray node.
cpp
vector<int> color(n + 1, 0); // 0=white, 1=gray, 2=black
bool has_cycle = false;
auto dfs = [&](auto& self, int u) -> void {
color[u] = 1;
for(int v : adj[u]){
if(color[v] == 1) has_cycle = true; // back edge
if(color[v] == 0) self(self, v);
}
color[u] = 2;
};CF examples: CF 977E, CF 1385E
Pattern 3: Bipartite Check (2-Coloring)
DFS while coloring nodes with alternating colors. If an edge connects two same-colored nodes, the graph is not bipartite.
cpp
vector<int> col(n + 1, -1);
bool bipartite = true;
auto dfs = [&](auto& self, int u, int c) -> void {
col[u] = c;
for(int v : adj[u]){
if(col[v] == -1) self(self, v, c ^ 1);
else if(col[v] == c) bipartite = false;
}
};CF examples: CF 687A, CF 1144F
Pattern 4: Subtree Queries via tin/tout (Euler Tour)
Flatten the tree into an array using tin. Node
cpp
// After tree DFS with tin and sz computed:
// Subtree of u maps to range [tin[u], tin[u] + sz[u] - 1]
// To add val to node u: update position tin[u]
// To query subtree sum of u: range query [tin[u], tin[u]+sz[u]-1]Pattern 5: Tree Diameter
The diameter (longest path) of a tree can be found with two DFS calls: (1) DFS from any node, find the farthest node
cpp
auto farthest = [&](int start) -> pair<int,int> {
vector<int> dist(n + 1, -1);
dist[start] = 0;
auto go = [&](auto& self, int u, int p) -> void {
for(int v : adj[u]) if(v != p){
dist[v] = dist[u] + 1;
self(self, v, u);
}
};
go(go, start, 0);
int far = max_element(dist.begin() + 1, dist.begin() + n + 1) - dist.begin();
return {far, dist[far]};
};
auto [u, d1] = farthest(1);
auto [v, diameter] = farthest(u);CF examples: CF 1324F, CSES Tree Diameter
Pattern 6: Finding Bridges and Articulation Points
Extend DFS with tin and low arrays. low[u] = minimum tin reachable from the subtree of
CF examples: CF 1000E, CF 118E
See: 02-bridges-and-articulation-points.md for full implementation.
Pattern 7: DFS on Implicit Graphs (Grid DFS)
Treat a 2D grid as a graph. Each cell has up to 4 neighbors. DFS to flood-fill components, count islands, etc.
cpp
int dx[] = {0,0,1,-1}, dy[] = {1,-1,0,0};
auto dfs = [&](auto& self, int x, int y) -> void {
vis[x][y] = true;
for(int d = 0; d < 4; d++){
int nx = x + dx[d], ny = y + dy[d];
if(nx >= 0 && nx < n && ny >= 0 && ny < m && !vis[nx][ny] && grid[nx][ny] == '.')
self(self, nx, ny);
}
};CF examples: CF 1365D, CF 1594D
Contest Cheat Sheet
+--------------------------------------------------------------+
| DFS / TREE DFS CHEAT SHEET |
+--------------------------------------------------------------+
| WHEN TO USE: |
| - Reachability, connected components, flood fill |
| - Cycle detection (directed: colors, undirected: parent) |
| - Bipartite check (2-color DFS) |
| - Tree: subtree sizes, depths, tin/tout, diameter |
+--------------------------------------------------------------+
| GRAPH DFS: |
| vis[u] = true; |
| for(int v : adj[u]) if(!vis[v]) dfs(v); |
+--------------------------------------------------------------+
| TREE DFS (root r, parent p): |
| tin[u] = timer++; sz[u] = 1; |
| for(int v : adj[u]) if(v != p) |
| depth[v] = depth[u]+1, dfs(v,u), sz[u] += sz[v]; |
| tout[u] = timer++; |
+--------------------------------------------------------------+
| ANCESTOR CHECK: tin[u] <= tin[v] && tout[v] <= tout[u] |
+--------------------------------------------------------------+
| DIAMETER: dfs from any node -> farthest u; dfs from u -> v |
+--------------------------------------------------------------+
| CYCLE (directed): edge to GRAY node = back edge = cycle |
+--------------------------------------------------------------+
| COMPLEXITY: O(n + m) time, O(n + m) space |
+--------------------------------------------------------------+
| DANGER: |
| - Stack overflow for n >= 1e5: use iterative or ulimit -s |
| - Mark visited BEFORE pushing (iterative), not after pop |
| - Reset vis[] between test cases |
+--------------------------------------------------------------+Gotchas & Debugging
Gotcha 1: Stack Overflow on Deep Recursion
Default stack size is typically 1-8 MB. A recursive DFS on a chain of
cpp
// Option A: Increase stack size (Linux, compile-time)
// g++ -O2 -Wl,-z,stacksize=67108864 sol.cpp (64 MB stack)
// Option B: Set stack via ulimit before running
// ulimit -s unlimited && ./sol
// Option C: Use iterative DFS (always safe)
stack<int> st;
st.push(root);
vis[root] = true;
while(!st.empty()){
int u = st.top(); st.pop();
for(int v : adj[u]) if(!vis[v]){
vis[v] = true;
st.push(v);
}
}On Codeforces, the stack limit is usually 256 MB, but not always. Check the problem constraints. If
Gotcha 2: Iterative DFS -- Mark Visited Before Push, Not After Pop
cpp
// WRONG -- pushes same node multiple times
while(!st.empty()){
int u = st.top(); st.pop();
if(vis[u]) continue; // wasteful, can TLE
vis[u] = true;
for(int v : adj[u]) if(!vis[v]) st.push(v);
}
// RIGHT -- mark before push
st.push(start);
vis[start] = true;
while(!st.empty()){
int u = st.top(); st.pop();
for(int v : adj[u]) if(!vis[v]){
vis[v] = true;
st.push(v);
}
}Gotcha 3: Forgetting to Reset Between Test Cases
With multiple test cases, forgetting to clear vis, tin, tout, etc. gives wrong answers. Use fill or reconstruct:
cpp
int t; cin >> t;
while(t--){
int n; cin >> n;
vector<vector<int>> adj(n + 1);
// ... read input ...
vector<bool> vis(n + 1, false); // fresh each test case
// ... dfs ...
}Gotcha 4: 1-indexed vs 0-indexed
Adjacency list of size n with nodes 1..n needs adj(n+1), not adj(n). Off-by-one here causes out-of-bounds or missing nodes.
Gotcha 5: Undirected Cycle Detection -- Parent Check
In undirected graphs, every edge appears twice. You must skip the edge back to the parent, not just check visited:
cpp
// WRONG for multigraphs -- two edges between u and v look like a cycle
auto dfs = [&](auto& self, int u, int p) -> void {
vis[u] = true;
for(int v : adj[u]){
if(v == p) continue; // skip parent
if(vis[v]){ /* cycle found */ }
else self(self, v, u);
}
};For multigraphs (multiple edges between same pair), track the edge index instead of the parent node.
Gotcha 6: Tree DFS -- Root Choice
For unrooted trees, the root choice doesn't affect subtree sizes or depths relative to the root, but it changes which nodes are ancestors. Some problems require rerooting -- see 08-dp-on-trees.md.
Mental Traps
Trap 1: "DFS finds the shortest path"
DFS finds a path, not the shortest. The path it finds depends entirely on adjacency list order. Use BFS for shortest paths in unweighted graphs.
Graph: 1 ------- 4 adj[1] = {2, 4}
| |
2 --- 3 --+ DFS from 1 (visits 2 first):
1 -> 2 -> 3 -> 4 (length 3)
DFS path to 4: 1 -> 2 -> 3 -> 4 (length 3)
Shortest path: 1 -> 4 (length 1)
^^^^^^^^^^^^
BFS would find this!Trap 2: "The visited array and the recursion stack are the same thing"
They are NOT. visited[v] = true means v has been discovered (WHITE -> GRAY or BLACK). The recursion stack holds only the currently active nodes (GRAY). This distinction is critical for cycle detection in directed graphs.
After backtracking from node 4 (step 4 in walkthrough):
1[G] visited[] = {1, 2, 4} (3 nodes)
/ \ stack = [1, 2] (2 nodes)
2[G] 3[W]
/ \ Node 4: visited = YES, on stack = NO (BLACK)
4[B] 5[W] Node 2: visited = YES, on stack = YES (GRAY)
An edge to node 4 (BLACK) is NOT a cycle -- it's a forward/cross edge.
An edge to node 2 (GRAY) IS a cycle -- it's a back edge.
Using only visited[] for directed cycle detection is WRONG.
You need the color/on-stack distinction (GRAY vs BLACK).Common Mistakes
| Mistake | Consequence | Fix |
|---|---|---|
| Recursive DFS on chain of | Stack overflow | Use iterative DFS or ulimit -s unlimited |
| Not marking visited before recursing/pushing | Infinite loop or TLE | Set vis[v] = true before the recursive call or push |
Using visited[] instead of colors for directed cycle detection | False positives -- cross/forward edges look like cycles | Use WHITE/GRAY/BLACK; only edge to GRAY = back edge = cycle |
| Not handling disconnected components | Missing nodes entirely | Wrap DFS in for(i=1..n) if(!vis[i]) dfs(i) outer loop |
See Gotchas & Debugging for detailed examples of each.
Self-Test
- [ ] I can write tree DFS with parent tracking --
dfs(u, par),if (v == par) continue, post-order subtree size - [ ] I can compute in-time and out-time correctly with a single timer incrementing on both entry and exit
- [ ] I can state the subtree query: "is v in subtree of u?" using
tin[u] <= tin[v] && tout[v] <= tout[u] - [ ] I can explain the stack overflow risk for large n and name one mitigation (iterative DFS or ulimit)
- [ ] I can classify edges in DFS: back edge to gray node indicates a cycle
Practice Problems
| # | Problem | Source | Difficulty | Key Idea |
|---|---|---|---|---|
| 1 | Building Roads | CSES | Easy (1200) | Count and connect components via DFS |
| 2 | Round Trip | CSES | Easy (1400) | Cycle detection in undirected graph, reconstruct cycle |
| 3 | Graph Coloring | CSES | Easy (1400) | Bipartite check via 2-color DFS |
| 4 | Labyrinth | CSES | Easy (1400) | Grid DFS/BFS, path reconstruction |
| 5 | Tree Diameter | CSES | Medium (1500) | Two-pass DFS for diameter |
| 6 | Subordinates | CSES | Medium (1400) | Subtree sizes via tree DFS |
| 7 | Tree Distances I | CSES | Medium (1700) | Farthest node for each vertex (two DFS from diameter endpoints) |
| 8 | CF 1324F | CF | Medium (1700) | Maximum white-black balance on tree path |
| 9 | CF 1144F | CF | Medium (1800) | Bipartite check + greedy coloring |
| 10 | CF 343D | CF | Hard (2000) | Subtree updates via Euler tour + BIT |
10. Iterative DFS and BFS Using Explicit Data Structures
Why Iterative?
Recursive DFS uses the call stack, which is typically limited to ~10^4-10^5 frames (1-8 MB default stack on most systems). A chain graph with stack<int> uses heap memory instead, avoiding this limit entirely. For large graphs (
BFS is naturally iterative (queue-based) and doesn't have this problem.
Side-by-Side Comparison
DFS with Stack: BFS with Queue:
+---------------------------+ +---------------------------+
| push(start) | | enqueue(start) |
| mark start visited | | mark start visited |
| while stack not empty: | | while queue not empty: |
| u = stack.top(); pop() | | u = queue.front(); pop()|
| process u | | process u |
| for each neighbor v: | | for each neighbor v: |
| if not visited[v]: | | if not visited[v]: |
| mark visited[v] | | mark visited[v] |
| push(v) | | enqueue(v) |
+---------------------------+ +---------------------------+
Key difference: stack (LIFO) vs queue (FIFO).
DFS dives deep first; BFS explores level by level.
Both visit every node exactly once: O(n + m).Trace on Example Graph
1 --- 2
| |
3 --- 4 --- 5
adj[1] = {2, 3} adj[2] = {1, 4} adj[3] = {1, 4}
adj[4] = {2, 3, 5} adj[5] = {4}DFS Trace (starting from node 1, stack = LIFO):
Step Action Stack (top->right) Visited
---- ------------------ ----------------- ---------
1 push(1), mark 1 [1] {1}
2 pop 1, push 2,3 [3, 2] {1,2,3}
(neighbors of 1: 2->mark+push, 3->mark+push)
3 pop 2, push 4 [3, 4] {1,2,3,4}
(neighbors of 2: 1->visited, 4->mark+push)
4 pop 4, push 5 [3, 5] {1,2,3,4,5}
(neighbors of 4: 2->visited, 3->visited, 5->mark+push)
5 pop 5 [3] {1,2,3,4,5}
(neighbors of 5: 4->visited)
6 pop 3 [] {1,2,3,4,5}
(neighbors of 3: 1->visited, 4->visited)
DFS visit order: 1 -> 2 -> 4 -> 5 -> 3BFS Trace (starting from node 1, queue = FIFO):
Step Action Queue (front->right) Visited
---- ------------------ ------------------ ---------
1 enqueue(1), mark 1 [1] {1}
2 dequeue 1, enq 2,3 [2, 3] {1,2,3}
(neighbors of 1: 2->mark+enqueue, 3->mark+enqueue)
3 dequeue 2, enq 4 [3, 4] {1,2,3,4}
(neighbors of 2: 1->visited, 4->mark+enqueue)
4 dequeue 3 [4] {1,2,3,4}
(neighbors of 3: 1->visited, 4->visited)
5 dequeue 4, enq 5 [5] {1,2,3,4,5}
(neighbors of 4: 2->visited, 3->visited, 5->mark+enqueue)
6 dequeue 5 [] {1,2,3,4,5}
(neighbors of 5: 4->visited)
BFS visit order: 1 -> 2 -> 3 -> 4 -> 5
BFS levels: L0:{1} L1:{2,3} L2:{4} L3:{5}Key observations from the trace:
- DFS dives deep: 1->2->4->5 before backtracking to 3.
- BFS explores level by level: all distance-1 nodes (2,3) before distance-2 nodes (4).
- Both visit every node exactly once and inspect each edge twice (undirected).
- BFS naturally gives shortest-path distances in unweighted graphs; DFS does not.
The Aha Moment
DFS is really about time. Every node gets a discovery time (when you first visit it) and a finish time (when you backtrack from it). These timestamps encode the entire structure of the search: if
Rating Progression
| CF Rating | What You Should Be Comfortable With |
|---|---|
| 1200 | Basic DFS, connected components, tree traversal, cycle detection |
| 1500 | DFS timestamps, subtree sizes, parent tracking, back/forward/cross edge classification |
| 1800 | Euler tour flattening for range queries, bridges/articulation points, SCC (Tarjan/Kosaraju) |
| 2100 | DFS tree properties for hard graph problems, centroid decomposition (DFS-based), virtual tree construction |
Before You Code Checklist
- [ ] Visited/color array initialized? Unvisited nodes must be clearly distinguishable.
- [ ] Recursion depth: For
nodes, recursive DFS will stack overflow. Use iterative DFS or increase stack size. - [ ] Parent tracking: In tree DFS, are you passing
parentto avoid revisiting the parent as a "back edge"? - [ ] Timestamps consistent: Are
tinandtoutbeing incremented with the same global timer? - [ ] Edge classification: For directed graphs, did you distinguish tree/back/forward/cross edges using entry and exit times?
What Would You Do If...?
...the tree has
nodes and recursive DFS stack overflows? Use iterative DFS with an explicit stack, or set ulimit -s unlimited(Linux) / increase stack size in your compiler settings. In contests, iterative DFS is safer....you need to answer "is
an ancestor of ?" in ? Use DFS timestamps: is an ancestor of if and only if and . ...you need to detect ALL cycles in a directed graph, not just whether one exists? DFS with back-edge detection tells you a cycle exists, but extracting all cycles is #P-hard in general. Instead, find SCCs (Tarjan) -- every SCC with more than one node contains cycles.
The Mistake That Teaches
The bug: You implement cycle detection in a directed graph using DFS. But your code reports false cycles:
cpp
vector<bool> visited(n + 1, false);
bool has_cycle = false;
void dfs(int u) {
visited[u] = true;
for (int v : adj[u]) {
if (visited[v]) { has_cycle = true; return; }
dfs(v);
}
}The lesson: In a directed graph, reaching a visited node doesn't mean there's a cycle -- it might be a cross edge or forward edge. You need three colors: white (unvisited), gray (in current DFS path), black (fully processed). A cycle exists only when you reach a gray node (back edge). The code above works for undirected graphs (with parent check) but is wrong for directed ones.
Debug This
Bug 1: Tree DFS computes wrong subtree sizes:
cpp
int sz[MAXN];
void dfs(int u, int parent) {
sz[u] = 1;
for (int v : adj[u]) {
if (v == parent) continue;
dfs(v, u);
}
// where's the subtree sum?
}Answer
After the recursive call dfs(v, u), you never add the child's subtree size to the parent. Fix: add sz[u] += sz[v]; after the recursive call.
Bug 2: DFS timestamps are wrong -- many nodes have the same tin:
cpp
int timer = 0;
void dfs(int u, int p) {
tin[u] = timer; // BUG: not incrementing
for (int v : adj[u]) {
if (v != p) dfs(v, u);
}
tout[u] = timer; // BUG: not incrementing
}Answer
The timer is never incremented. Fix: use tin[u] = timer++; and tout[u] = timer++;. Without incrementing, all nodes get the same timestamp and ancestor queries become meaningless.
Bug 3: Iterative DFS visits nodes in wrong order compared to recursive:
cpp
stack<int> st;
st.push(root);
while (!st.empty()) {
int u = st.top(); st.pop();
if (vis[u]) continue;
vis[u] = true;
for (int v : adj[u]) {
if (!vis[v]) st.push(v);
}
}Answer
This visits neighbors in reverse order compared to recursive DFS because the stack reverses the adjacency list iteration. The last neighbor pushed is the first popped. If order matters (e.g., for consistent timestamps), iterate adj[u] in reverse: for (int i = adj[u].size()-1; i >= 0; i--). For most problems, the order doesn't affect correctness, but it can affect output matching.
Historical Origin
Depth-first search was first formally analyzed by Robert Tarjan in 1972, who showed how DFS timestamps and back edges could solve problems like finding strongly connected components, biconnected components, and bridges -- all in linear time. His work earned him the Turing Award in 1986.
Mnemonic Anchor
"Go deep, stamp time, backtrack with knowledge." DFS is a time machine: entry time, exit time, and everything in between is your subtree.
Further Reading
- cp-algorithms: Depth First Search -- comprehensive DFS tutorial with all variations.
- cp-algorithms: Finding Bridges -- bridge/articulation point algorithms.
- cp-algorithms: Checking Bipartiteness -- 2-coloring proof and implementation.
- CF Blog: DFS ordering and Euler tour -- flattening trees for range queries.
- CSES Problem Set -- Graph Algorithms -- excellent practice set with editorial hints.
- See:
01-euler-tour-and-lca.mdfor LCA and full Euler tour techniques. - See:
02-bridges-and-articulation-points.mdfor bridge/AP implementation. - See:
08-dp-on-trees.mdfor DP patterns built on tree DFS.
Code Reading Exercise
What does this code compute? Read it carefully before checking the answer.
cpp
int ans;
int dfs(int u, int p, vector<vector<int>>& adj) {
int mx1 = 0, mx2 = 0;
for (int v : adj[u]) {
if (v == p) continue;
int d = dfs(v, u, adj) + 1;
if (d >= mx1) { mx2 = mx1; mx1 = d; }
else if (d > mx2) { mx2 = d; }
}
ans = max(ans, mx1 + mx2);
return mx1;
}Answer
This computes the diameter of a tree -- the length of the longest path between any two nodes. At each node u, it tracks the two longest downward paths (mx1 and mx2). The longest path passing through u has length mx1 + mx2, which updates the global answer. The function returns the single longest downward path for the parent's computation. Called once from the root with ans = 0.
What does this code compute? Assume adj represents a tree with n nodes.
cpp
int result;
void dfs(int u, int p, int n, vector<vector<int>>& adj, vector<int>& sz) {
sz[u] = 1;
int heaviest = 0;
for (int v : adj[u]) {
if (v == p) continue;
dfs(v, u, n, adj, sz);
sz[u] += sz[v];
heaviest = max(heaviest, sz[v]);
}
heaviest = max(heaviest, n - sz[u]);
if (heaviest <= n / 2) result = u;
}Answer
This finds the centroid of a tree -- the node whose removal minimizes the size of the largest remaining connected component. For each node u, it computes the heaviest component: the largest child subtree or the "upward" part (n - sz[u]). If the heaviest part is <= n/2, that node is a centroid. Every tree has at least one centroid, and this single DFS finds it in O(n).
Related Techniques
- BFS -- the complementary traversal; use BFS for shortest paths, DFS for structure discovery
- Topological Sort -- DFS-based topological ordering for directed acyclic graphs
- Bipartite Check -- 2-coloring via DFS, a direct application of graph traversal
- Bridges and Articulation Points -- Tarjan's bridge-finding algorithm builds directly on DFS discovery/low values
- LCA / Binary Lifting -- builds on DFS tin/tout and parent arrays
- Practice Ladder: Graphs -- curated problem set
Recap
- DFS explores as deep as possible before backtracking. Time:
. - The recursion stack IS the current root-to-node path -- back edges to nodes on this path reveal cycles.
tin/touttimestamps encode subtree structure:in subtree( ) iff and -- O(1) per query. - For trees: DFS naturally computes depth, subtree size, parent, and Euler tour (flattening for range queries).
- Edge classification (tree/back/forward/cross) drives cycle detection, topological sort, SCC, and bridge-finding.
- Watch for stack overflow on large
(use iterative DFS), and always handle disconnected components with an outer loop.