DFS and Tree DFS

AL-13 | Depth-first search on general graphs and rooted trees -- the backbone of nearly every graph problem on Codeforces.

Difficulty: [Beginner]Prerequisites: Graph Representation, Stack, Queue, and DequeCF Rating: 1100-1500

Contest Frequency: *** Common -- appears in most Div 2 contests

Quick Revisit
USE WHEN: reachability, cycle detection, connected components, tree traversal, back-edge classification
INVARIANT: each node visited exactly once; recursion stack reveals ancestry
TIME: O(V + E) | SPACE: O(V) stack depth
CLASSIC BUG: stack overflow on large graphs (use iterative DFS or increase stack size)
PRACTICE: ../12-practice-ladders/05-ladder-graphs.md

Intuition
C++ STL Reference
Implementation (Contest-Ready)
Operations Reference
Problem Patterns & Tricks
Contest Cheat Sheet
Gotchas & Debugging
Common Mistakes
Self-Test
Practice Problems
Iterative DFS and BFS
Code Reading Exercise
Further Reading
Recap

Intuition

This file teaches DFS in two passes. The first treats DFS as a way to visit every reachable node — the bare traversal. The second specialises to DFS on a rooted tree, where the recursion stack becomes a path from the root and timestamps become the subtree structure. Edge classification (tree/back/forward/cross) is a consequence of the second view; it appears later, after you have a use for it.

Part 1 — Plain DFS: visit every reachable node

Suppose you have 6 nodes and some edges, and you need to visit every node reachable from node 1.

  Graph:
  1 -- 2    2 -- 4
  1 -- 3    2 -- 5
  3 -- 6

        1
       / \
      2   3
     / \   \
    4   5   6

Naive attempt: pick a random neighbor, go there, pick another random neighbor... You might visit node 2, then jump back to 1, visit 2 again, wander to 3, revisit 1 a third time. With $n = 6$ and $m = 5$ edges you could easily perform 15+ node visits before covering all 6 nodes -- triple the necessary work. Every revisit is a wasted operation. The core problem: without a strategy, you revisit nodes and miss others.

The key insight:

Go as deep as possible before backtracking -- this naturally explores every reachable node exactly once using a stack (or recursion).

Analogy: you are in a maze. At every fork, always take the first unexplored corridor and walk until you hit a dead end. Then retrace your steps to the last fork that still has an unexplored corridor and repeat. You never re-enter a corridor you have already explored.

The "break" in this strategy is when you reach a dead end (a node all of whose neighbors are visited). At that moment the stack pops and you backtrack to the most recent node that still has unexplored neighbors. The stack guarantees you return to exactly the right place.

This first view of DFS — visit every reachable node, exactly once, backed by a stack — is sufficient for connected components, flood fill, reachability tests, and undirected cycle detection. Everything in Part 1 below works at this level.

Visual Walkthrough

Same 6-node graph, DFS from node 1, visiting smaller-numbered neighbors first.

  adj[1] = {2, 3}   adj[2] = {1, 4, 5}   adj[3] = {1, 6}
  adj[4] = {2}       adj[5] = {2}          adj[6] = {3}

  Step  Action             Stack (top->right)  Visited          tin  tout
  ----  -----------------  ------------------  ---------------  ---  ----
   1    visit 1            [1]                 {1}               0    -
   2    visit 2 (from 1)   [1, 2]              {1,2}             1    -
   3    visit 4 (from 2)   [1, 2, 4]           {1,2,4}           2    -
   4    backtrack from 4   [1, 2]              {1,2,4}           -    3
   5    visit 5 (from 2)   [1, 2, 5]           {1,2,4,5}         4    -
   6    backtrack from 5   [1, 2]              {1,2,4,5}         -    5
   7    backtrack from 2   [1]                 {1,2,4,5}         -    6
   8    visit 3 (from 1)   [1, 3]              {1,2,3,4,5}       7    -
   9    visit 6 (from 3)   [1, 3, 6]           {1,2,3,4,5,6}     8    -
  10    backtrack from 6   [1, 3]              {1,2,3,4,5,6}     -    9
  11    backtrack from 3   [1]                 {1,2,3,4,5,6}     -   10
  12    backtrack from 1   []                  {1,2,3,4,5,6}     -   11

  DFS order:  1 -> 2 -> 4 -> 5 -> 3 -> 6
  Total node visits: 6  (each node entered exactly once)
  Total edge inspections: 2 * 5 = 10  (each undirected edge checked from both ends)
  Overall work: O(n + m) = O(6 + 5) = O(11)

Discovery times (tin) and finish times (tout) are shown above. They define nested intervals: node $u$ is an ancestor of node $v$ in the DFS tree iff $t i n [u] \leq t i n [v]$ and $t o u t [v] \leq t o u t [u]$ . We come back to how these timestamps power subtree queries and cycle detection in Part 2 below.

The Invariant

+------------------------------------------------------------------+
| INVARIANT: Every node is visited exactly once. The recursion      |
| stack contains the current path from the source to the node       |
| being processed.                                                  |
+------------------------------------------------------------------+

Informal induction proof that DFS visits every reachable node exactly once:

Base case ( $k = 0$ ). The source node $s$ is marked visited and placed on the stack. It is visited exactly once.

Inductive step. Assume that after $k$ iterations every node that has been marked visited was visited exactly once, and every node on the stack is on the current path from $s$ . Consider the next iteration: we pop (or are at) node $u$ . For each neighbor $v$ of $u$ :

If $v$ is already visited, we skip it -- no double visit.
If $v$ is unvisited, we mark it visited before recursing (or pushing). So $v$ is entered exactly once.

Because the graph is finite, the process terminates. Any reachable node $w$ must lie on some path $s = v_{0}, v_{1}, \dots, v_{k} = w$ . By induction, $v_{0}$ is visited; when $v_{i}$ is visited, $v_{i + 1}$ is either already visited or will be visited when we inspect the edge $(v_{i}, v_{i + 1})$ . Hence $w$ is eventually visited. $◻$

BFS vs DFS -- when to pick which:

Criterion	DFS	BFS
Memory	$O (depth)$ stack frames	$O (width)$ queue entries
Shortest path (unweighted)	No	Yes -- BFS gives shortest paths
Connectivity / reachability	Yes	Yes
Cycle detection	Yes -- back edge to gray node	Possible but less natural
Topological sort	Yes -- reverse of finish order	Yes -- Kahn's algorithm
Tree properties (subtree size, diameter, tin/tout)	Natural fit	Awkward
Subtree queries / Euler tour	DFS only	Not applicable

Rule of thumb. If you need shortest distance in an unweighted graph or level-order information, use BFS. For almost everything else -- connectivity, cycle detection, topological sort, tree structure -- DFS is simpler and often more powerful. DFS uses $O (depth)$ memory which can be $O (n)$ worst case (chain graph); BFS uses $O (width)$ which can also be $O (n)$ worst case (star graph).

Part 2 — DFS on rooted trees: ancestry, time, and structure

The plain-DFS view above is enough for traversal. But on a rooted tree (or a DFS tree of any graph), DFS does more than visit nodes — the order in which it enters and exits them is itself the data structure. Three insights separate "I can code DFS" from "I understand DFS":

1. The recursion stack IS the current source-to-node path.

When DFS is processing node v, every node on the call stack forms the unique tree path from the root to v. This is why a back edge (to a GRAY node on the stack) immediately gives you a cycle -- the stack path from that ancestor down to v, plus the back edge, IS the cycle.

  DFS tree (--- = tree edge,  ... = back edge):

      1 ---> 2 ---> 3          Stack at node 3: [1, 2, 3]
                     :
                     :..> 1    back edge: 3 sees GRAY node 1
                               cycle = 1 -> 2 -> 3 -> 1
                               (stack path 1->2->3 + back edge 3->1)

2. tin/tout encode the entire subtree structure in O(1) queries.

The code increments a timer on entry and exit. The insight the code won't teach you: these times form nested intervals. [tin[u], tout[u]] fully contains [tin[v], tout[v]] if and only if v is in the subtree of u. This turns "is v a descendant of u?" into a simple range check -- no tree traversal needed.

  DFS tree with tin/tout (same 6-node graph):

        1 [0,11]                [1 [2 [4]4 [5]5 ]2 [3 [6]6 ]3 ]1
       / \                       0  1  2  3  4  5  6  7  8  9 10 11
      2   3
     / \    \                   Is 6 in subtree(1)?
    4   5    6                    tin[1]=0 <= tin[6]=8  AND
  [2,3] [4,5] [8,9]              tout[6]=9 <= tout[1]=11  => YES
       [1,6]  [7,10]
                                Is 5 in subtree(3)?
                                  tin[3]=7 <= tin[5]=4?  NO => NOT in subtree

3. Once you have a DFS tree, every edge has a name.

Now that ancestry is well-defined, the edges DFS sees on a directed graph fall into four categories:

Tree edge: to a WHITE (undiscovered) node — explored next.
Back edge: to a GRAY (on-stack) ancestor — closes a cycle.
Forward edge: to a BLACK descendant already in the subtree.
Cross edge: to a BLACK node in a different subtree.

In undirected graphs only tree and back edges exist. This classification is what turns DFS into directed-cycle detection, topological sort, SCC decomposition, and bridge finding — each of those algorithms is just "DFS plus interpret the edges."

If this is your first read, you do not need to memorise the four categories. The one to keep is back edge ⇒ cycle. The others appear in advanced graph algorithms once they have a job to do.

When to Reach for This

DFS is the go-to for: cycle detection, topological ordering (via post-order), connected components, articulation points/bridges, tree diameter, and subtree queries.

Trigger phrases in the problem statement:

"connected components" -> DFS (or BFS)
"cycle detection" -> DFS with colors (WHITE/GRAY/BLACK)
"tree traversal", "subtree queries" -> tree DFS with tin/tout
"topological order" -> DFS finish-time reverse
"bipartite check" -> DFS 2-coloring
"reachability" -> DFS or BFS
"bridges", "articulation points" -> DFS with low[] array (Tarjan)
"ancestor check", "back edge" -> DFS timestamps
"tree diameter" -> two-pass DFS (farthest node twice)
"subtree sum/update" on a tree -> DFS Euler tour + range query

Counter-examples (do NOT use DFS here):

"shortest path in an unweighted graph" -> use BFS
"shortest path with weights" -> use Dijkstra / Bellman-Ford
"minimum spanning tree" -> use Kruskal / Prim

DFS State Snapshots

Using the same 6-node graph from the walkthrough, here are snapshots of DFS state at key moments. Nodes are colored WHITE (undiscovered), GRAY (on the recursion stack), or BLACK (finished -- all descendants processed).

Snapshot 1: Mid-traversal (step 5 -- visiting node 5 from node 2)

        1[G]                     Recursion stack
       / \                       (= current path from root):
     2[G]  3[W]                  +---+
    / \      \                   | 5 | <-- top (currently processing)
  4[B] 5[G]  6[W]               | 2 |
                                 | 1 |
  W = WHITE (undiscovered)       +---+
  G = GRAY  (on stack)
  B = BLACK (finished)

  Nodes 1, 2, 5 are GRAY -- they form the path 1 -> 2 -> 5.
  Node 4 is BLACK -- fully processed, already backtracked.
  Nodes 3, 6 are WHITE -- not yet reached.

Snapshot 2: Back edge discovered -- edge to a GRAY node means cycle

  Suppose an extra directed edge (5 -> 1) exists.
  When DFS at node 5 examines neighbor 1:
    node 1 is GRAY (on stack) => back edge => CYCLE detected!

        1[G] <.............
       / \                 .  back edge (5 -> 1)
     2[G]  3[W]            .  cycle: 1 -> 2 -> 5 -> 1
    / \      \             .
  4[B] 5[G] ..'   6[W]

  Rule: edge to GRAY node  = back edge  = cycle exists
        edge to BLACK node = forward/cross edge = no cycle
        edge to WHITE node = tree edge (explore it)

Snapshot 3: Nested tin/tout intervals as brackets

  tin/tout values from the walkthrough:

    Node:  1      2     4     5      3     6
    tin:   0      1     2     4      7     8
    tout: 11      6     3     5     10     9

  Visualized as nested brackets on a number line:

  0   1   2   3   4   5   6   7   8   9  10  11
  |   |   |   |   |   |   |   |   |   |   |   |
  [1  [2  [4  ]4  [5  ]5  ]2  [3  [6  ]6  ]3  ]1

  Reading the nesting:
  - [4]=[2,3]  inside [2]=[1,6]  inside [1]=[0,11]
       => 4 in subtree(2), 2 in subtree(1)            YES
  - [6]=[8,9]  inside [3]=[7,10]
       => 6 in subtree(3)                              YES
  - [3]=[7,10] inside [2]=[1,6]?  tin[3]=7 > tout[2]=6
       => 3 NOT in subtree(2)                          NO

  O(1) subtree test:
    v in subtree(u)  <=>  tin[u] <= tin[v] AND tout[v] <= tout[u]

C++ STL Reference

Function / Class	Header	What it does	Time Complexity
`vector<vector<int>> adj(n)`	`<vector>`	Adjacency list representation	$O (n + m)$ space
`vector<bool> vis(n, false)`	`<vector>`	Visited array	$O (n)$ space
`stack<int>`	`<stack>`	Explicit stack for iterative DFS	$O (1)$ push/pop
`function<void(int,int)>`	`<functional>`	Wrapper for recursive lambda	$O (1)$ call overhead
`ranges::fill(vis, false)`	`<algorithm>`	Reset visited array between test cases	$O (n)$

Implementation (Contest-Ready)

Version 1: Graph DFS -- Recursive + Iterative

cpp

#include <bits/stdc++.h>
using namespace std;

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;
    vector<vector<int>> adj(n + 1);
    for(int i = 0; i < m; i++){
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    // --- Recursive DFS ---
    vector<bool> vis(n + 1, false);
    int comp = 0;
    auto dfs = [&](auto& self, int u) -> void {
        vis[u] = true;
        for(int v : adj[u])
            if(!vis[v]) self(self, v);
    };
    for(int i = 1; i <= n; i++){
        if(!vis[i]){
            dfs(dfs, i);
            comp++;
        }
    }
    cout << "Connected components (recursive): " << comp << "\n";

    // --- Iterative DFS (safe for large n) ---
    fill(vis.begin(), vis.end(), false);
    comp = 0;
    for(int i = 1; i <= n; i++){
        if(vis[i]) continue;
        comp++;
        stack<int> st;
        st.push(i);
        vis[i] = true;
        while(!st.empty()){
            int u = st.top(); st.pop();
            for(int v : adj[u]){
                if(!vis[v]){
                    vis[v] = true;
                    st.push(v);
                }
            }
        }
    }
    cout << "Connected components (iterative): " << comp << "\n";
}

Version 2: Tree DFS with tin/tout, Depth, Subtree Size, Parent

cpp

#include <bits/stdc++.h>
using namespace std;

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    cin >> n;
    vector<vector<int>> adj(n + 1);
    for(int i = 0; i < n - 1; i++){
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    vector<int> tin(n + 1), tout(n + 1), depth(n + 1, 0);
    vector<int> par(n + 1, 0), sz(n + 1, 1);
    int timer = 0;

    // Recursive tree DFS from root = 1
    auto dfs = [&](auto& self, int u, int p) -> void {
        par[u] = p;
        tin[u] = timer++;
        for(int v : adj[u]){
            if(v == p) continue;
            depth[v] = depth[u] + 1;
            self(self, v, u);
            sz[u] += sz[v];
        }
        tout[u] = timer++;
    };
    dfs(dfs, 1, 0);

    // Check if u is ancestor of v in O(1)
    auto is_ancestor = [&](int u, int v) -> bool {
        return tin[u] <= tin[v] && tout[v] <= tout[u];
    };

    // Answer queries
    int q;
    cin >> q;
    while(q--){
        int u, v; cin >> u >> v;
        if(is_ancestor(u, v))
            cout << u << " is ancestor of " << v << "\n";
        else
            cout << u << " is NOT ancestor of " << v << "\n";
    }
}

Explained Version: Tree DFS with Inline Comments

cpp

#include <bits/stdc++.h>
using namespace std;

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    cin >> n;
    vector<vector<int>> adj(n + 1);
    for(int i = 0; i < n - 1; i++){
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    // tin[u]: time when we first enter node u
    // tout[u]: time when we finish processing u and all its subtree
    // Together, [tin[u], tout[u]] defines the "interval" of u's subtree.
    vector<int> tin(n + 1), tout(n + 1), depth(n + 1, 0);
    vector<int> par(n + 1, 0), sz(n + 1, 1);
    int timer = 0;

    // We use auto& self pattern for recursive lambdas.
    // p = parent, prevents going back up the edge we came from.
    auto dfs = [&](auto& self, int u, int p) -> void {
        par[u] = p;
        tin[u] = timer++;   // record entry time

        for(int v : adj[u]){
            if(v == p) continue;   // skip parent edge (tree has no other cycles)
            depth[v] = depth[u] + 1;
            self(self, v, u);      // recurse into child
            sz[u] += sz[v];        // after child returns, accumulate subtree size
        }

        tout[u] = timer++;  // record exit time
    };
    dfs(dfs, 1, 0);  // root at 1, parent of root is 0 (sentinel)

    // Ancestor check: u is ancestor of v iff v's interval is inside u's interval.
    // This works because DFS timestamps form nested intervals on a tree.
    auto is_ancestor = [&](int u, int v) -> bool {
        return tin[u] <= tin[v] && tout[v] <= tout[u];
    };

    int q;
    cin >> q;
    while(q--){
        int u, v; cin >> u >> v;
        cout << (is_ancestor(u, v) ? "YES" : "NO") << "\n";
    }
}

ASCII Trace: DFS on a 7-Node Graph

Adjacency list:
  0: [1, 2]
  1: [0, 3, 4]
  2: [0, 5]
  3: [1]
  4: [1, 6]
  5: [2]
  6: [4]

Step  Action              Stack (top->)    Visited           disc/fin
---------------------------------------------------------------------
 1    push 0              [0]             {}                 -
 2    pop 0, visit        []              {0}                0: 1/-
 3    push 2,1            [2,1]           {0}                -
 4    pop 1, visit        [2]             {0,1}              1: 2/-
 5    push 4,3            [2,4,3]         {0,1}              -
 6    pop 3, visit        [2,4]           {0,1,3}            3: 3/4
 7    pop 4, visit        [2]             {0,1,3,4}          4: 5/-
 8    push 6              [2,6]           {0,1,3,4}          -
 9    pop 6, visit        [2]             {0,1,3,4,6}        6: 6/7
10    (4 finished)        [2]             {0,1,3,4,6}        4: 5/8
11    (1 finished)        [2]             {0,1,3,4,6}        1: 2/9
12    pop 2, visit        []              {0,1,2,3,4,6}      2: 10/-
13    push 5              [5]             {0,1,2,3,4,6}      -
14    pop 5, visit        []              {0,1,2,3,4,5,6}    5: 11/12
15    (2 finished)        []              {0,1,2,3,4,5,6}    2: 10/13
16    (0 finished)        []              {0,1,2,3,4,5,6}    0: 1/14

DFS tree edges: 0->1, 1->3, 1->4, 4->6, 0->2, 2->5

Operations Reference

DFS itself is always $O (n + m)$ , but the useful quantities you compute during the traversal -- timestamps, subtree sizes, depths -- each come at their own cost.

Operation	Time	Space
DFS traversal (graph)	$O (n + m)$	$O (n + m)$ adj list + $O (n)$ visited
DFS traversal (tree)	$O (n)$	$O (n)$ adj list + arrays
Compute tin/tout	$O (n)$	$O (n)$ for arrays
Compute subtree sizes	$O (n)$	$O (n)$
Compute depths	$O (n)$	$O (n)$
Ancestor check (with tin/tout)	$O (1)$ per query	$O (n)$ precomputation
Cycle detection (directed)	$O (n + m)$	$O (n)$ color array
Cycle detection (undirected)	$O (n + m)$	$O (n)$ visited + parent
Connected components	$O (n + m)$	$O (n)$ visited

Recursive vs Iterative -- When to Pick Which

Both versions appear in Version 1 above. Key points:

Iterative DFS processes neighbors in reverse order compared to recursive (the stack reverses adjacency-list iteration).
For tree DFS with entry/exit times, iterative requires tracking whether a node is being entered or exited -- more bookkeeping than recursive.
For most contest problems with $n \leq 2 \times 10^{5}$ , recursive DFS is fine. Increase stack size with ulimit -s unlimited or a thread with larger stack.
For $n > 10^{5}$ on strict stack limits, use iterative DFS (see Gotcha 1).

Problem Patterns & Tricks

Pattern 1: Connected Components

Count or label connected components in an undirected graph. Run DFS from every unvisited node -- each new DFS call is a new component.

cpp

int comp = 0;
for(int i = 1; i <= n; i++){
    if(!vis[i]){
        dfs(dfs, i);
        comp++;
    }
}

CF examples: CF 977E, CF 1176E

Pattern 2: Cycle Detection (Directed Graph)

Color-based DFS: white/gray/black. An edge to a gray node is a back edge (cycle). To reconstruct the cycle, record parents and trace back from the gray node.

cpp

vector<int> color(n + 1, 0); // 0=white, 1=gray, 2=black
bool has_cycle = false;
auto dfs = [&](auto& self, int u) -> void {
    color[u] = 1;
    for(int v : adj[u]){
        if(color[v] == 1) has_cycle = true; // back edge
        if(color[v] == 0) self(self, v);
    }
    color[u] = 2;
};

CF examples: CF 977E, CF 1385E

Pattern 3: Bipartite Check (2-Coloring)

DFS while coloring nodes with alternating colors. If an edge connects two same-colored nodes, the graph is not bipartite.

cpp

vector<int> col(n + 1, -1);
bool bipartite = true;
auto dfs = [&](auto& self, int u, int c) -> void {
    col[u] = c;
    for(int v : adj[u]){
        if(col[v] == -1) self(self, v, c ^ 1);
        else if(col[v] == c) bipartite = false;
    }
};

CF examples: CF 687A, CF 1144F

Pattern 4: Subtree Queries via tin/tout (Euler Tour)

Flatten the tree into an array using tin. Node $u$ 's subtree occupies indices $[t i n [u], t i n [u] + s z [u] - 1]$ in the flattened array. Now subtree sum/min/max becomes a range query -- apply BIT or segment tree.

cpp

// After tree DFS with tin and sz computed:
// Subtree of u maps to range [tin[u], tin[u] + sz[u] - 1]
// To add val to node u: update position tin[u]
// To query subtree sum of u: range query [tin[u], tin[u]+sz[u]-1]

CF examples: CF 620E, CF 343D

Pattern 5: Tree Diameter

The diameter (longest path) of a tree can be found with two DFS calls: (1) DFS from any node, find the farthest node $u$ . (2) DFS from $u$ , find the farthest node $v$ . Distance $u \to v$ is the diameter.

cpp

auto farthest = [&](int start) -> pair<int,int> {
    vector<int> dist(n + 1, -1);
    dist[start] = 0;
    auto go = [&](auto& self, int u, int p) -> void {
        for(int v : adj[u]) if(v != p){
            dist[v] = dist[u] + 1;
            self(self, v, u);
        }
    };
    go(go, start, 0);
    int far = max_element(dist.begin() + 1, dist.begin() + n + 1) - dist.begin();
    return {far, dist[far]};
};
auto [u, d1] = farthest(1);
auto [v, diameter] = farthest(u);

CF examples: CF 1324F, CSES Tree Diameter

Pattern 6: Finding Bridges and Articulation Points

Extend DFS with tin and low arrays. low[u] = minimum tin reachable from the subtree of $u$ via one back edge. Edge $(u, v)$ is a bridge if $l o w [v] > t i n [u]$ . Node $u$ is an articulation point if $l o w [v] \geq t i n [u]$ for some child $v$ (with special handling for root).

CF examples: CF 1000E, CF 118E

See: 02-bridges-and-articulation-points.md for full implementation.

Pattern 7: DFS on Implicit Graphs (Grid DFS)

Treat a 2D grid as a graph. Each cell has up to 4 neighbors. DFS to flood-fill components, count islands, etc.

cpp

int dx[] = {0,0,1,-1}, dy[] = {1,-1,0,0};
auto dfs = [&](auto& self, int x, int y) -> void {
    vis[x][y] = true;
    for(int d = 0; d < 4; d++){
        int nx = x + dx[d], ny = y + dy[d];
        if(nx >= 0 && nx < n && ny >= 0 && ny < m && !vis[nx][ny] && grid[nx][ny] == '.')
            self(self, nx, ny);
    }
};

CF examples: CF 1365D, CF 1594D

Contest Cheat Sheet

+--------------------------------------------------------------+
|                 DFS / TREE DFS CHEAT SHEET                   |
+--------------------------------------------------------------+
| WHEN TO USE:                                                 |
|   - Reachability, connected components, flood fill           |
|   - Cycle detection (directed: colors, undirected: parent)   |
|   - Bipartite check (2-color DFS)                            |
|   - Tree: subtree sizes, depths, tin/tout, diameter          |
+--------------------------------------------------------------+
| GRAPH DFS:                                                   |
|   vis[u] = true;                                             |
|   for(int v : adj[u]) if(!vis[v]) dfs(v);                   |
+--------------------------------------------------------------+
| TREE DFS (root r, parent p):                                 |
|   tin[u] = timer++; sz[u] = 1;                               |
|   for(int v : adj[u]) if(v != p)                             |
|       depth[v] = depth[u]+1, dfs(v,u), sz[u] += sz[v];      |
|   tout[u] = timer++;                                         |
+--------------------------------------------------------------+
| ANCESTOR CHECK:  tin[u] <= tin[v] && tout[v] <= tout[u]      |
+--------------------------------------------------------------+
| DIAMETER: dfs from any node -> farthest u; dfs from u -> v   |
+--------------------------------------------------------------+
| CYCLE (directed): edge to GRAY node = back edge = cycle      |
+--------------------------------------------------------------+
| COMPLEXITY:  O(n + m) time,  O(n + m) space                  |
+--------------------------------------------------------------+
| DANGER:                                                      |
|   - Stack overflow for n >= 1e5: use iterative or ulimit -s  |
|   - Mark visited BEFORE pushing (iterative), not after pop   |
|   - Reset vis[] between test cases                           |
+--------------------------------------------------------------+

Gotchas & Debugging

Gotcha 1: Stack Overflow on Deep Recursion

Default stack size is typically 1-8 MB. A recursive DFS on a chain of $n = 10^{6}$ nodes overflows. Solutions:

cpp

// Option A: Increase stack size (Linux, compile-time)
// g++ -O2 -Wl,-z,stacksize=67108864 sol.cpp  (64 MB stack)

// Option B: Set stack via ulimit before running
// ulimit -s unlimited && ./sol

// Option C: Use iterative DFS (always safe)
stack<int> st;
st.push(root);
vis[root] = true;
while(!st.empty()){
    int u = st.top(); st.pop();
    for(int v : adj[u]) if(!vis[v]){
        vis[v] = true;
        st.push(v);
    }
}

On Codeforces, the stack limit is usually 256 MB, but not always. Check the problem constraints. If $n \leq 2 \times 10^{5}$ , recursive DFS is usually fine.

Gotcha 2: Iterative DFS -- Mark Visited Before Push, Not After Pop

cpp

// WRONG -- pushes same node multiple times
while(!st.empty()){
    int u = st.top(); st.pop();
    if(vis[u]) continue;  // wasteful, can TLE
    vis[u] = true;
    for(int v : adj[u]) if(!vis[v]) st.push(v);
}

// RIGHT -- mark before push
st.push(start);
vis[start] = true;
while(!st.empty()){
    int u = st.top(); st.pop();
    for(int v : adj[u]) if(!vis[v]){
        vis[v] = true;
        st.push(v);
    }
}

Gotcha 3: Forgetting to Reset Between Test Cases

With multiple test cases, forgetting to clear vis, tin, tout, etc. gives wrong answers. Use fill or reconstruct:

cpp

int t; cin >> t;
while(t--){
    int n; cin >> n;
    vector<vector<int>> adj(n + 1);
    // ... read input ...
    vector<bool> vis(n + 1, false);  // fresh each test case
    // ... dfs ...
}

Gotcha 4: 1-indexed vs 0-indexed

Adjacency list of size n with nodes 1..n needs adj(n+1), not adj(n). Off-by-one here causes out-of-bounds or missing nodes.

Gotcha 5: Undirected Cycle Detection -- Parent Check

In undirected graphs, every edge appears twice. You must skip the edge back to the parent, not just check visited:

cpp

// WRONG for multigraphs -- two edges between u and v look like a cycle
auto dfs = [&](auto& self, int u, int p) -> void {
    vis[u] = true;
    for(int v : adj[u]){
        if(v == p) continue;  // skip parent
        if(vis[v]){ /* cycle found */ }
        else self(self, v, u);
    }
};

For multigraphs (multiple edges between same pair), track the edge index instead of the parent node.

Gotcha 6: Tree DFS -- Root Choice

For unrooted trees, the root choice doesn't affect subtree sizes or depths relative to the root, but it changes which nodes are ancestors. Some problems require rerooting -- see 08-dp-on-trees.md.

Mental Traps

Trap 1: "DFS finds the shortest path"

DFS finds a path, not the shortest. The path it finds depends entirely on adjacency list order. Use BFS for shortest paths in unweighted graphs.

  Graph:  1 ------- 4          adj[1] = {2, 4}
          |         |
          2 --- 3 --+          DFS from 1 (visits 2 first):
                                 1 -> 2 -> 3 -> 4   (length 3)
  DFS path to 4:  1 -> 2 -> 3 -> 4  (length 3)
  Shortest path:  1 -> 4             (length 1)
                  ^^^^^^^^^^^^
                  BFS would find this!

Trap 2: "The visited array and the recursion stack are the same thing"

They are NOT. visited[v] = true means v has been discovered (WHITE -> GRAY or BLACK). The recursion stack holds only the currently active nodes (GRAY). This distinction is critical for cycle detection in directed graphs.

  After backtracking from node 4 (step 4 in walkthrough):

        1[G]            visited[] = {1, 2, 4}     (3 nodes)
       / \              stack     = [1, 2]         (2 nodes)
     2[G]  3[W]
    / \                 Node 4: visited = YES, on stack = NO  (BLACK)
  4[B] 5[W]            Node 2: visited = YES, on stack = YES (GRAY)

  An edge to node 4 (BLACK) is NOT a cycle -- it's a forward/cross edge.
  An edge to node 2 (GRAY)  IS a cycle -- it's a back edge.

  Using only visited[] for directed cycle detection is WRONG.
  You need the color/on-stack distinction (GRAY vs BLACK).

Common Mistakes

Mistake	Consequence	Fix
Recursive DFS on chain of $10^{5}$ + nodes	Stack overflow	Use iterative DFS or `ulimit -s unlimited`
Not marking visited before recursing/pushing	Infinite loop or TLE	Set `vis[v] = true` before the recursive call or push
Using `visited[]` instead of colors for directed cycle detection	False positives -- cross/forward edges look like cycles	Use WHITE/GRAY/BLACK; only edge to GRAY = back edge = cycle
Not handling disconnected components	Missing nodes entirely	Wrap DFS in `for(i=1..n) if(!vis[i]) dfs(i)` outer loop

See Gotchas & Debugging for detailed examples of each.

Self-Test

[ ] I can write tree DFS with parent tracking -- dfs(u, par), if (v == par) continue, post-order subtree size
[ ] I can compute in-time and out-time correctly with a single timer incrementing on both entry and exit
[ ] I can state the subtree query: "is v in subtree of u?" using tin[u] <= tin[v] && tout[v] <= tout[u]
[ ] I can explain the stack overflow risk for large n and name one mitigation (iterative DFS or ulimit)
[ ] I can classify edges in DFS: back edge to gray node indicates a cycle

Practice Problems

#	Problem	Source	Difficulty	Key Idea
1	Building Roads	CSES	Easy (1200)	Count and connect components via DFS
2	Round Trip	CSES	Easy (1400)	Cycle detection in undirected graph, reconstruct cycle
3	Graph Coloring	CSES	Easy (1400)	Bipartite check via 2-color DFS
4	Labyrinth	CSES	Easy (1400)	Grid DFS/BFS, path reconstruction
5	Tree Diameter	CSES	Medium (1500)	Two-pass DFS for diameter
6	Subordinates	CSES	Medium (1400)	Subtree sizes via tree DFS
7	Tree Distances I	CSES	Medium (1700)	Farthest node for each vertex (two DFS from diameter endpoints)
8	CF 1324F	CF	Medium (1700)	Maximum white-black balance on tree path
9	CF 1144F	CF	Medium (1800)	Bipartite check + greedy coloring
10	CF 343D	CF	Hard (2000)	Subtree updates via Euler tour + BIT

10. Iterative DFS and BFS Using Explicit Data Structures

Why Iterative?

Recursive DFS uses the call stack, which is typically limited to ~10^4-10^5 frames (1-8 MB default stack on most systems). A chain graph with $n = 10^{6}$ nodes causes a stack overflow. Iterative DFS with an explicit stack<int> uses heap memory instead, avoiding this limit entirely. For large graphs ( $n \geq 10^{5}$ ), always prefer iterative DFS unless you've verified the stack size.

BFS is naturally iterative (queue-based) and doesn't have this problem.

Side-by-Side Comparison

DFS with Stack:                BFS with Queue:
+---------------------------+  +---------------------------+
| push(start)               |  | enqueue(start)            |
| mark start visited        |  | mark start visited        |
| while stack not empty:    |  | while queue not empty:    |
|   u = stack.top(); pop()  |  |   u = queue.front(); pop()|
|   process u               |  |   process u               |
|   for each neighbor v:    |  |   for each neighbor v:    |
|     if not visited[v]:    |  |     if not visited[v]:    |
|       mark visited[v]     |  |       mark visited[v]     |
|       push(v)             |  |       enqueue(v)          |
+---------------------------+  +---------------------------+

Key difference: stack (LIFO) vs queue (FIFO).
DFS dives deep first; BFS explores level by level.
Both visit every node exactly once: O(n + m).

Trace on Example Graph

    1 --- 2
    |     |
    3 --- 4 --- 5

adj[1] = {2, 3}    adj[2] = {1, 4}    adj[3] = {1, 4}
adj[4] = {2, 3, 5} adj[5] = {4}

DFS Trace (starting from node 1, stack = LIFO):

Step  Action              Stack (top->right)  Visited
----  ------------------  -----------------  ---------
 1    push(1), mark 1     [1]                {1}
 2    pop 1, push 2,3     [3, 2]             {1,2,3}
      (neighbors of 1: 2->mark+push, 3->mark+push)
 3    pop 2, push 4       [3, 4]             {1,2,3,4}
      (neighbors of 2: 1->visited, 4->mark+push)
 4    pop 4, push 5       [3, 5]             {1,2,3,4,5}
      (neighbors of 4: 2->visited, 3->visited, 5->mark+push)
 5    pop 5               [3]                {1,2,3,4,5}
      (neighbors of 5: 4->visited)
 6    pop 3               []                 {1,2,3,4,5}
      (neighbors of 3: 1->visited, 4->visited)

DFS visit order: 1 -> 2 -> 4 -> 5 -> 3

BFS Trace (starting from node 1, queue = FIFO):

Step  Action              Queue (front->right) Visited
----  ------------------  ------------------  ---------
 1    enqueue(1), mark 1  [1]                 {1}
 2    dequeue 1, enq 2,3  [2, 3]              {1,2,3}
      (neighbors of 1: 2->mark+enqueue, 3->mark+enqueue)
 3    dequeue 2, enq 4    [3, 4]              {1,2,3,4}
      (neighbors of 2: 1->visited, 4->mark+enqueue)
 4    dequeue 3           [4]                 {1,2,3,4}
      (neighbors of 3: 1->visited, 4->visited)
 5    dequeue 4, enq 5    [5]                 {1,2,3,4,5}
      (neighbors of 4: 2->visited, 3->visited, 5->mark+enqueue)
 6    dequeue 5           []                  {1,2,3,4,5}
      (neighbors of 5: 4->visited)

BFS visit order: 1 -> 2 -> 3 -> 4 -> 5
BFS levels:      L0:{1}  L1:{2,3}  L2:{4}  L3:{5}

Key observations from the trace:

DFS dives deep: 1->2->4->5 before backtracking to 3.
BFS explores level by level: all distance-1 nodes (2,3) before distance-2 nodes (4).
Both visit every node exactly once and inspect each edge twice (undirected).
BFS naturally gives shortest-path distances in unweighted graphs; DFS does not.

The Aha Moment

DFS is really about time. Every node gets a discovery time (when you first visit it) and a finish time (when you backtrack from it). These timestamps encode the entire structure of the search: if $[i n_{u}, o u t_{u}] \subset [i n_{v}, o u t_{v}]$ , then $u$ is a descendant of $v$ in the DFS tree. Once you see DFS as a time-stamping machine, everything -- back edges, subtree queries, Euler tours -- clicks into place.

Rating Progression

CF Rating	What You Should Be Comfortable With
1200	Basic DFS, connected components, tree traversal, cycle detection
1500	DFS timestamps, subtree sizes, parent tracking, back/forward/cross edge classification
1800	Euler tour flattening for range queries, bridges/articulation points, SCC (Tarjan/Kosaraju)
2100	DFS tree properties for hard graph problems, centroid decomposition (DFS-based), virtual tree construction

Before You Code Checklist

[ ] Visited/color array initialized? Unvisited nodes must be clearly distinguishable.
[ ] Recursion depth: For $n = 10^{6}$ nodes, recursive DFS will stack overflow. Use iterative DFS or increase stack size.
[ ] Parent tracking: In tree DFS, are you passing parent to avoid revisiting the parent as a "back edge"?
[ ] Timestamps consistent: Are tin and tout being incremented with the same global timer?
[ ] Edge classification: For directed graphs, did you distinguish tree/back/forward/cross edges using entry and exit times?

What Would You Do If...?

...the tree has $10^{6}$ nodes and recursive DFS stack overflows? Use iterative DFS with an explicit stack, or set ulimit -s unlimited (Linux) / increase stack size in your compiler settings. In contests, iterative DFS is safer.
...you need to answer "is $u$ an ancestor of $v$ ?" in $O (1)$ ? Use DFS timestamps: $u$ is an ancestor of $v$ if and only if $t i n [u] \leq t i n [v]$ and $t o u t [u] \geq t o u t [v]$ .
...you need to detect ALL cycles in a directed graph, not just whether one exists? DFS with back-edge detection tells you a cycle exists, but extracting all cycles is #P-hard in general. Instead, find SCCs (Tarjan) -- every SCC with more than one node contains cycles.

The Mistake That Teaches

The bug: You implement cycle detection in a directed graph using DFS. But your code reports false cycles:

cpp

vector<bool> visited(n + 1, false);
bool has_cycle = false;

void dfs(int u) {
    visited[u] = true;
    for (int v : adj[u]) {
        if (visited[v]) { has_cycle = true; return; }
        dfs(v);
    }
}

The lesson: In a directed graph, reaching a visited node doesn't mean there's a cycle -- it might be a cross edge or forward edge. You need three colors: white (unvisited), gray (in current DFS path), black (fully processed). A cycle exists only when you reach a gray node (back edge). The code above works for undirected graphs (with parent check) but is wrong for directed ones.

Debug This

Bug 1: Tree DFS computes wrong subtree sizes:

cpp

int sz[MAXN];
void dfs(int u, int parent) {
    sz[u] = 1;
    for (int v : adj[u]) {
        if (v == parent) continue;
        dfs(v, u);
    }
    // where's the subtree sum?
}

Answer

After the recursive call dfs(v, u), you never add the child's subtree size to the parent. Fix: add sz[u] += sz[v]; after the recursive call.

Bug 2: DFS timestamps are wrong -- many nodes have the same tin:

cpp

int timer = 0;
void dfs(int u, int p) {
    tin[u] = timer;  // BUG: not incrementing
    for (int v : adj[u]) {
        if (v != p) dfs(v, u);
    }
    tout[u] = timer;  // BUG: not incrementing
}

Answer

The timer is never incremented. Fix: use tin[u] = timer++; and tout[u] = timer++;. Without incrementing, all nodes get the same timestamp and ancestor queries become meaningless.

Bug 3: Iterative DFS visits nodes in wrong order compared to recursive:

cpp

stack<int> st;
st.push(root);
while (!st.empty()) {
    int u = st.top(); st.pop();
    if (vis[u]) continue;
    vis[u] = true;
    for (int v : adj[u]) {
        if (!vis[v]) st.push(v);
    }
}

Answer

This visits neighbors in reverse order compared to recursive DFS because the stack reverses the adjacency list iteration. The last neighbor pushed is the first popped. If order matters (e.g., for consistent timestamps), iterate adj[u] in reverse: for (int i = adj[u].size()-1; i >= 0; i--). For most problems, the order doesn't affect correctness, but it can affect output matching.

Historical Origin

Depth-first search was first formally analyzed by Robert Tarjan in 1972, who showed how DFS timestamps and back edges could solve problems like finding strongly connected components, biconnected components, and bridges -- all in linear time. His work earned him the Turing Award in 1986.

Mnemonic Anchor

"Go deep, stamp time, backtrack with knowledge." DFS is a time machine: entry time, exit time, and everything in between is your subtree.

Code Reading Exercise

What does this code compute? Read it carefully before checking the answer.

cpp

int ans;

int dfs(int u, int p, vector<vector<int>>& adj) {
    int mx1 = 0, mx2 = 0;
    for (int v : adj[u]) {
        if (v == p) continue;
        int d = dfs(v, u, adj) + 1;
        if (d >= mx1) { mx2 = mx1; mx1 = d; }
        else if (d > mx2) { mx2 = d; }
    }
    ans = max(ans, mx1 + mx2);
    return mx1;
}

Answer
This computes the diameter of a tree -- the length of the longest path between any two nodes. At each node u, it tracks the two longest downward paths (mx1 and mx2). The longest path passing through u has length mx1 + mx2, which updates the global answer. The function returns the single longest downward path for the parent's computation. Called once from the root with ans = 0.

What does this code compute? Assume adj represents a tree with n nodes.

cpp

int result;

void dfs(int u, int p, int n, vector<vector<int>>& adj, vector<int>& sz) {
    sz[u] = 1;
    int heaviest = 0;
    for (int v : adj[u]) {
        if (v == p) continue;
        dfs(v, u, n, adj, sz);
        sz[u] += sz[v];
        heaviest = max(heaviest, sz[v]);
    }
    heaviest = max(heaviest, n - sz[u]);
    if (heaviest <= n / 2) result = u;
}

Answer
This finds the centroid of a tree -- the node whose removal minimizes the size of the largest remaining connected component. For each node u, it computes the heaviest component: the largest child subtree or the "upward" part (n - sz[u]). If the heaviest part is <= n/2, that node is a centroid. Every tree has at least one centroid, and this single DFS finds it in O(n).

BFS -- the complementary traversal; use BFS for shortest paths, DFS for structure discovery
Topological Sort -- DFS-based topological ordering for directed acyclic graphs
Bipartite Check -- 2-coloring via DFS, a direct application of graph traversal
Bridges and Articulation Points -- Tarjan's bridge-finding algorithm builds directly on DFS discovery/low values
LCA / Binary Lifting -- builds on DFS tin/tout and parent arrays
Practice Ladder: Graphs -- curated problem set

Recap

DFS explores as deep as possible before backtracking. Time: $O (V + E)$ .
The recursion stack IS the current root-to-node path -- back edges to nodes on this path reveal cycles.
tin/tout timestamps encode subtree structure: $v$ in subtree( $u$ ) iff $t i n [u] \leq t i n [v]$ and $t o u t [v] \leq t o u t [u]$ -- O(1) per query.
For trees: DFS naturally computes depth, subtree size, parent, and Euler tour (flattening for range queries).
Edge classification (tree/back/forward/cross) drives cycle detection, topological sort, SCC, and bridge-finding.
Watch for stack overflow on large $n$ (use iterative DFS), and always handle disconnected components with an outer loop.

DFS and Tree DFS ​

Contents ​

Intuition ​

Part 1 — Plain DFS: visit every reachable node ​

Visual Walkthrough ​

The Invariant ​

Part 2 — DFS on rooted trees: ancestry, time, and structure ​

When to Reach for This ​

DFS State Snapshots ​

C++ STL Reference ​

Implementation (Contest-Ready) ​

Version 1: Graph DFS -- Recursive + Iterative ​

Version 2: Tree DFS with tin/tout, Depth, Subtree Size, Parent ​

Explained Version: Tree DFS with Inline Comments ​

ASCII Trace: DFS on a 7-Node Graph ​

Operations Reference ​

Recursive vs Iterative -- When to Pick Which ​

Problem Patterns & Tricks ​

Pattern 1: Connected Components ​

Pattern 2: Cycle Detection (Directed Graph) ​

Pattern 3: Bipartite Check (2-Coloring) ​

Pattern 4: Subtree Queries via tin/tout (Euler Tour) ​

Pattern 5: Tree Diameter ​

Pattern 6: Finding Bridges and Articulation Points ​

Pattern 7: DFS on Implicit Graphs (Grid DFS) ​

Contest Cheat Sheet ​

Gotchas & Debugging ​

Gotcha 1: Stack Overflow on Deep Recursion ​

Gotcha 2: Iterative DFS -- Mark Visited Before Push, Not After Pop ​

Gotcha 3: Forgetting to Reset Between Test Cases ​

Gotcha 4: 1-indexed vs 0-indexed ​

Gotcha 5: Undirected Cycle Detection -- Parent Check ​

Gotcha 6: Tree DFS -- Root Choice ​

Mental Traps ​

Common Mistakes ​

Self-Test ​

Practice Problems ​

10. Iterative DFS and BFS Using Explicit Data Structures ​

Why Iterative? ​

Side-by-Side Comparison ​

Trace on Example Graph ​

The Aha Moment ​

Rating Progression ​

Before You Code Checklist ​

What Would You Do If...? ​

The Mistake That Teaches ​

Debug This ​

Historical Origin ​

Mnemonic Anchor ​

Further Reading ​

Code Reading Exercise ​

Related Techniques ​

Recap ​

DFS and Tree DFS

Contents

Intuition

Part 1 — Plain DFS: visit every reachable node

Visual Walkthrough

The Invariant

Part 2 — DFS on rooted trees: ancestry, time, and structure

When to Reach for This

DFS State Snapshots

C++ STL Reference

Implementation (Contest-Ready)

Version 1: Graph DFS -- Recursive + Iterative

Version 2: Tree DFS with tin/tout, Depth, Subtree Size, Parent

Explained Version: Tree DFS with Inline Comments

ASCII Trace: DFS on a 7-Node Graph

Operations Reference

Recursive vs Iterative -- When to Pick Which

Problem Patterns & Tricks

Pattern 1: Connected Components

Pattern 2: Cycle Detection (Directed Graph)

Pattern 3: Bipartite Check (2-Coloring)

Pattern 4: Subtree Queries via tin/tout (Euler Tour)

Pattern 5: Tree Diameter

Pattern 6: Finding Bridges and Articulation Points

Pattern 7: DFS on Implicit Graphs (Grid DFS)

Contest Cheat Sheet

Gotchas & Debugging

Gotcha 1: Stack Overflow on Deep Recursion

Gotcha 2: Iterative DFS -- Mark Visited Before Push, Not After Pop

Gotcha 3: Forgetting to Reset Between Test Cases

Gotcha 4: 1-indexed vs 0-indexed

Gotcha 5: Undirected Cycle Detection -- Parent Check

Gotcha 6: Tree DFS -- Root Choice

Mental Traps

Common Mistakes

Self-Test

Practice Problems

10. Iterative DFS and BFS Using Explicit Data Structures

Why Iterative?

Side-by-Side Comparison

Trace on Example Graph

The Aha Moment

Rating Progression

Before You Code Checklist

What Would You Do If...?

The Mistake That Teaches

Debug This

Historical Origin

Mnemonic Anchor

Further Reading

Code Reading Exercise

Related Techniques

Recap