Skip to content

Breadth-First Search (BFS)

Quick Revisit

  • USE WHEN: unweighted shortest path, level-order traversal, minimum-step reachability
  • INVARIANT: every node dequeued has its final (shortest) distance; queue is monotone in distance
  • TIME: O(V + E) | SPACE: O(V)
  • CLASSIC BUG: not marking visited before pushing (causes duplicates and TLE)
  • PRACTICE: ../12-practice-ladders/05-ladder-graphs.md

AL-12 | Level-by-level graph exploration -- the canonical algorithm for shortest paths in unweighted (and 0-1 weighted) graphs.

Difficulty: [Beginner]Prerequisites: Graph Representation, Stack, Queue, and DequeCF Rating Range: 1100-1500

Contest Frequency: *** Common -- appears in most Div 2 contests


Contents

Intuition

2.1 The Pain

Consider this unweighted graph with 6 nodes -- find the shortest path from A to F:

    A --- B --- D
    |     |     |
    C --- E --- F

Adjacency: A: [B, C] B: [A, D, E] C: [A, E] D: [B, F] E: [B, C, F] F: [D, E]

Naive approach: enumerate every possible path from A to F.

  • ABDF (length 3)
  • ABEF (length 3)
  • ACEF (length 3)
  • ACEBDF (length 5)
  • ABECA (cycles make it worse)

Even in this tiny graph there are 4+ simple paths. In a graph with n nodes and m edges the number of simple paths can be exponential -- up to O(n!) in a complete graph. Checking all of them is hopeless.

We need a strategy that visits each node at most once and still guarantees the shortest path. That strategy is BFS.

2.2 The Key Insight

Explore nodes layer by layer -- all nodes at distance k before any node at distance k+1 -- and the first time you reach a node IS the shortest path.

Analogy -- ripples in a pond. Drop a stone at A. A circular wave expands outward. Ring 1 hits every node one hop away (B, C). Ring 2 hits every node two hops away (D, E). Ring 3 hits F. The wave cannot skip ahead; it fills distance k completely before any water reaches distance k+1.

A FIFO queue enforces exactly this order: nodes discovered at distance k sit in front of nodes discovered at distance k+1, so every layer drains before the next one begins.

Where the insight breaks. If edges have unequal positive weights, a two-hop path can be shorter than a one-hop path. The layer-by-layer guarantee collapses because "one edge" no longer equals "one unit of distance." For weighted graphs, use Dijkstra (non-negative weights) or Bellman-Ford (negative weights). The special case of 0/1 weights is rescued by the deque trick (0-1 BFS) covered below — the trick works precisely because zero-cost edges must go to the front of the deque so they stay in the current layer.

2.3 Visual Walkthrough

Same graph, BFS from A. Track the queue contents, the distance array, and which node is marked visited at each step.

  Graph:
      A --- B --- D
      |     |     |
      C --- E --- F

  dist = { A:inf, B:inf, C:inf, D:inf, E:inf, F:inf }
  visited = {}

  -- Initialise --
  Step 0  Push A (dist A=0, mark A visited)
          Queue: [A]
          dist:    { A:0, B:inf, C:inf, D:inf, E:inf, F:inf }
          visited: { A }

  -- Layer 0 (distance 0) --
  Step 1  Pop A.  Neighbours: B, C.
          B unvisited -> dist B = 0+1 = 1, mark B, push B.
          C unvisited -> dist C = 0+1 = 1, mark C, push C.
          Queue: [B, C]
          dist:    { A:0, B:1, C:1, D:inf, E:inf, F:inf }
          visited: { A, B, C }

  -- Layer 1 (distance 1) --
  Step 2  Pop B.  Neighbours: A, D, E.
          A visited -> skip.
          D unvisited -> dist D = 1+1 = 2, mark D, push D.
          E unvisited -> dist E = 1+1 = 2, mark E, push E.
          Queue: [C, D, E]
          dist:    { A:0, B:1, C:1, D:2, E:2, F:inf }
          visited: { A, B, C, D, E }

  Step 3  Pop C.  Neighbours: A, E.
          A visited -> skip.
          E visited -> skip.
          Queue: [D, E]
          dist:    { A:0, B:1, C:1, D:2, E:2, F:inf }    (unchanged)

  -- Layer 2 (distance 2) --
  Step 4  Pop D.  Neighbours: B, F.
          B visited -> skip.
          F unvisited -> dist F = 2+1 = 3, mark F, push F.
          Queue: [E, F]
          dist:    { A:0, B:1, C:1, D:2, E:2, F:3 }
          visited: { A, B, C, D, E, F }

  Step 5  Pop E.  Neighbours: B, C, F.
          B visited -> skip.
          C visited -> skip.
          F visited -> skip.
          Queue: [F]

  -- Layer 3 (distance 3) --
  Step 6  Pop F.  Neighbours: D, E.  All visited -> skip.
          Queue: []   (empty -- done)

  Final dist: { A:0, B:1, C:1, D:2, E:2, F:3 }

Operation count. Each of the 6 nodes is pushed and popped exactly once (6 pushes + 6 pops = 12 queue operations). Each edge is examined twice (once from each endpoint) for 2×7=14 adjacency checks. Total work: O(V+E).

2.4 The Invariant

+---------------------------------------------------------------+
|  INVARIANT: When a node u is dequeued, dist[u] is final and   |
|  equals the length of a shortest path from the source to u.   |
+---------------------------------------------------------------+

Proof sketch (layer-by-layer argument).

  1. Ordering property. The queue is always a sequence of nodes whose distances form at most two consecutive values d and d+1, with all d-nodes in front of all (d+1)-nodes. This holds at initialisation (queue = [source] with distance 0) and is preserved by the BFS rule: popping a node at distance d can only enqueue new nodes at distance d+1, which go to the back.

  2. Non-decreasing dequeue order. From (1), nodes leave the queue in non-decreasing order of their assigned distance: all nodes at distance 0, then all at distance 1, then 2, etc.

  3. First-visit optimality. Suppose for contradiction that when BFS first reaches node v via some path P of length , there exists a shorter path P of length <. Every intermediate node w on P satisfies dist(s,w)1<1. Because all edge weights are 1, every such w belongs to a layer strictly before layer , so w was already dequeued and its edge to the next node on P was already explored. This means v would have been discovered at distance , contradicting the assumption that BFS first assigned distance .

  4. Therefore: the distance recorded when a node is first visited (and enqueued) is the true shortest-path distance. Since we mark nodes visited on push, no later path can overwrite it.


BFS vs DFS -- trade-offs.

AspectBFSDFS
Traversal orderLayer by layer (all dist k before dist k+1)As deep as possible before backtracking
Finds shortest path?Yes (unweighted graphs)No -- may find a longer path first
Space complexityO(W) where W = max frontier width (can be O(V))O(D) where D = max depth (can be O(V))
Data structureQueue (FIFO)Stack / recursion (LIFO)
Best forShortest paths, level-order queries, minimum stepsConnectivity, cycle detection, topological sort, back-edge classification

Rule of thumb. If the problem says "minimum number of moves/steps" in an unweighted setting, reach for BFS. If it says "find any path,""detect a cycle," or "topological order," reach for DFS.

DFS can accidentally look like it works on small cases but produce wrong answers on larger inputs because it does not explore layers in order. Never substitute DFS for BFS when shortest paths are required.

What the Code Won't Teach You

1. The FIFO queue IS the layer enforcer.

The queue is not incidental to BFS -- it is the mechanism that creates the layer-by-layer guarantee. When processing a node at distance d, newly discovered neighbours go in at distance d+1. Because FIFO means first-in-first-out, all d-nodes drain before any (d+1)-node:

  Queue during layer transition (d=1 -> d=2):

  +-----+-----+-----+-----+-----+
  |  C  |  D  |  E  |     |     |
  +-----+-----+-----+-----+-----+
  ^            ^                 ^
  front        |                 back
  (d=1)        (d=2 starts here)

  C will be popped and processed BEFORE D or E.
  Any new nodes C discovers go behind E (still d=2).
  The boundary only moves forward -- never backward.

2. 0-1 BFS = same idea, deque instead of queue.

With edge weights of 0 or 1:

  • Cost-0 edge: the neighbour is at the same distance -- push to front (same layer).
  • Cost-1 edge: the neighbour is at distance d+1 -- push to back (next layer).

The deque maintains the same two-layer invariant as a regular BFS queue. There is no new algorithmic idea -- just a different container that handles two "speeds" of edges.

3. The dist[v] = -1 idiom.

Instead of maintaining a separate visited[] array, initialise all distances to 1. Then dist[v] == -1 simultaneously means "unvisited" and "unknown distance." When you set dist[v] = dist[u] + 1, the node is both marked visited and assigned its correct distance in one operation. Memorise this pattern -- it saves a line of code and eliminates a common source of bugs (forgetting to update one of two arrays).

2.5 When to Reach for This

Trigger phrases in problem statements:

  • "Find the minimum number of moves / steps / operations"
  • "Shortest path in an unweighted graph / grid"
  • "Nearest X from every cell"
  • "Can you reach ... in at most k steps?"
  • "BFS/shortest path" tag on Codeforces

Counter-examples (BFS is NOT the right tool):

  • Weighted edges with values other than 0/1 -> Dijkstra or Bellman-Ford.
  • "Find any path" / "detect a cycle" -> DFS is simpler and uses less memory.
  • "Longest path in a DAG" -> topological sort + DP, not BFS.
  • "Minimum spanning tree" -> Kruskal / Prim, completely different problem.
  • "All-pairs shortest paths" on dense graphs -> Floyd-Warshall is more practical.

The Nodes May Be Implicit

BFS is often taught as "shortest path in an unweighted graph," but the deeper claim is minimum number of state transitions. The graph rarely arrives as an adjacency list — most contest BFS problems hide the graph inside something else. If you can describe (a) what counts as a state and (b) what counts as a single move, you have a BFS.

Mini example — string transformation:

Start with the word "cat". In one move you may change a single letter to any other letter. What is the minimum number of moves to reach "dog"?

There is no adjacency list. The states are strings; the neighbours of a string are all strings differing in exactly one letter. The transitions all cost 1, so BFS over the implicit graph finds the answer:

cpp
queue<string> q;
unordered_map<string,int> dist;
dist["cat"] = 0; q.push("cat");
while (!q.empty()) {
    string s = q.front(); q.pop();
    if (s == "dog") { cout << dist[s]; return 0; }
    for (int i = 0; i < (int)s.size(); i++) {
        char old = s[i];
        for (char c = 'a'; c <= 'z'; c++) {
            if (c == old) continue;
            s[i] = c;
            if (!dist.count(s)) { dist[s] = dist[old_s_dist] + 1; q.push(s); }
        }
        s[i] = old;
    }
}

The same shape works for grid moves (state = (r, c)), bitmask states (state = subset of items collected), integer operations (state = current number, neighbours = number after one allowed operation), and puzzle configurations. Whenever a problem asks for the minimum number of unit moves between configurations, BFS is the default tool — even if the word "graph" never appears.

2.6 BFS Wavefront Snapshots

The diagrams below freeze BFS mid-execution so you can see the layer invariant in the queue.

Diagram 1 -- Standard BFS wavefront (source = A)

  Graph:   A --- B --- D            Queue snapshots
           |     |     |
           C --- E --- F
                                    After init (layer 0 in queue):
  Wavefront rings:                    front -> [ A ] <- back
                                               d=0
   d=0  *A*
   d=1  *B* *C*                     After expanding A (layer 1 in queue):
   d=2  *D* *E*                       front -> [ B  C ] <- back
   d=3  *F*                                    d=1 d=1

                                    After expanding B (layers 1-2 in queue):
                                      front -> [ C | D  E ] <- back
                                               d=1 | d=2 d=2
                                                   ^
                                             layer boundary

                                    After expanding C (layer 2 in queue):
                                      front -> [ D  E ] <- back
                                               d=2 d=2

                                    After expanding D (layers 2-3):
                                      front -> [ E | F ] <- back
                                               d=2 | d=3

Key: the queue always holds at most two consecutive distances. Everything at distance d is ahead of everything at distance d+1.

Diagram 2 -- 0-1 BFS deque state

  Graph with 0/1 edge weights:
      A --0-- B --1-- D
      |       |       |
     (1)     (0)     (1)
      |       |       |
      C --1-- E --0-- F

  Start: Deque = [ A ]   dist[A] = 0

  Pop A (dist 0):
    A-B cost 0 -> push_front B (dist 0)
    A-C cost 1 -> push_back  C (dist 1)
    Deque: front -> [ B | C ] <- back
                    d=0 | d=1

  Pop B (dist 0):
    B-E cost 0 -> push_front E (dist 0)
    B-D cost 1 -> push_back  D (dist 1)
    Deque: front -> [ E | C  D ] <- back
                    d=0 | d=1 d=1

  Pop E (dist 0):
    E-F cost 0 -> push_front F (dist 0)
    Deque: front -> [ F | C  D ] <- back
                    d=0 | d=1 d=1

  Cost-0 edges stay in the SAME layer (push front).
  Cost-1 edges go to the NEXT layer (push back).
  The deque maintains the layer invariant -- same idea,
  different container.

Diagram 3 -- Multi-source BFS (sources = A and F)

  Graph:   A --- B --- D
           |     |     |
           C --- E --- F

  Init: push A (dist 0), push F (dist 0)
    Queue: front -> [ A  F ] <- back
                    d=0 d=0

  Pop A -> push B(1), C(1):
    Queue: front -> [ F  B  C ] <- back
                    d=0 d=1 d=1

  Pop F -> D already unvisited, push D(1); E unvisited, push E(1):
    Queue: front -> [ B  C  D  E ] <- back
                    d=1 d=1 d=1 d=1

  Result: dist from nearest source
    A=0  B=1  C=1  D=1  E=1  F=0

  Think: a virtual super-source S* with 0-cost edges to
  every real source.  Remove S* and you have multi-source BFS.

C++ STL Reference

Function / ClassHeaderWhat it doesTime Complexity
queue<T><queue>FIFO queue for standard BFSpush/pop/front: O(1)
deque<T><deque>Double-ended queue for 0-1 BFSpush_front/push_back/pop_front: O(1)
queue::push(x)<queue>Enqueue element at backO(1)
queue::pop()<queue>Dequeue element from frontO(1)
queue::front()<queue>Access front elementO(1)
queue::empty()<queue>Check if queue is emptyO(1)
deque::push_front(x)<deque>Insert at front (weight-0 edge in 0-1 BFS)O(1) amortized
deque::push_back(x)<deque>Insert at back (weight-1 edge in 0-1 BFS)O(1) amortized
deque::pop_front()<deque>Remove from frontO(1) amortized
deque::front()<deque>Access front elementO(1)
vector<vector<int>><vector>Adjacency list representationAccess: O(1)

Implementation (Contest-Ready)

+------------------------------------------------------------+
| CRITICAL: Mark vertices as visited when you PUSH them onto |
| the queue, NOT when you pop them. Marking on pop allows    |
| the same vertex to be pushed multiple times, causing O(V^2)|
| time and O(V^2) memory -- TLE and MLE on large graphs.     |
+------------------------------------------------------------+
cpp
// WRONG -- marks on pop, vertex gets pushed multiple times
while (!q.empty()) {
    int v = q.front(); q.pop();
    visited[v] = true;  // TOO LATE -- v was already in queue multiple times
    for (int u : adj[v])
        if (!visited[u]) q.push(u);
}

// CORRECT -- marks on push, each vertex enters queue exactly once
while (!q.empty()) {
    int v = q.front(); q.pop();
    for (int u : adj[v]) {
        if (!visited[u]) {
            visited[u] = true;  // Mark NOW, before push
            q.push(u);
        }
    }
}

Version 1a: Standard BFS -- Minimal Contest Template

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, m, s;
    scanf("%d%d%d", &n, &m, &s);
    vector<vector<int>> adj(n);
    for (int i = 0; i < m; i++) {
        int u, v;
        scanf("%d%d", &u, &v);
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    vector<int> dist(n, -1);
    queue<int> q;
    dist[s] = 0;
    q.push(s);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v : adj[u]) {
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }

    for (int i = 0; i < n; i++)
        printf("%d%c", dist[i], " \n"[i == n - 1]);
}

Version 1b: Standard BFS -- Explained

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, m, s;
    scanf("%d%d%d", &n, &m, &s);

    // Build adjacency list.  adj[u] = list of neighbors of u.
    vector<vector<int>> adj(n);
    for (int i = 0; i < m; i++) {
        int u, v;
        scanf("%d%d", &u, &v);
        adj[u].push_back(v);
        adj[v].push_back(u); // remove for directed graph
    }

    // dist[v] = shortest distance from s to v.  -1 = not yet visited.
    // Using -1 as "unvisited" doubles as the visited array -- no separate bool needed.
    vector<int> dist(n, -1);
    queue<int> q;

    // Seed the BFS with the source node.
    dist[s] = 0;
    q.push(s);

    while (!q.empty()) {
        int u = q.front(); q.pop();

        for (int v : adj[u]) {
            // Only process v if we haven't visited it yet.
            // The first time we reach v is guaranteed to be the shortest path
            // because BFS explores layer by layer.
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }

    // Print shortest distances.  Unreachable nodes remain -1.
    for (int i = 0; i < n; i++)
        printf("%d%c", dist[i], " \n"[i == n - 1]);
}

Version 2a: 0-1 BFS -- Minimal Contest Template

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, m, s;
    scanf("%d%d%d", &n, &m, &s);
    // adj[u] = {(v, w)} where w is 0 or 1
    vector<vector<pair<int,int>>> adj(n);
    for (int i = 0; i < m; i++) {
        int u, v, w;
        scanf("%d%d%d", &u, &v, &w);
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    }

    vector<int> dist(n, INT_MAX);
    deque<int> dq;
    dist[s] = 0;
    dq.push_back(s);
    while (!dq.empty()) {
        int u = dq.front(); dq.pop_front();
        for (auto [v, w] : adj[u]) {
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                if (w == 0) dq.push_front(v);
                else        dq.push_back(v);
            }
        }
    }

    for (int i = 0; i < n; i++)
        printf("%d%c", dist[i], " \n"[i == n - 1]);
}

Version 2b: 0-1 BFS -- Explained

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, m, s;
    scanf("%d%d%d", &n, &m, &s);

    // Adjacency list with edge weights.  Each edge weight is 0 or 1.
    vector<vector<pair<int,int>>> adj(n);
    for (int i = 0; i < m; i++) {
        int u, v, w;
        scanf("%d%d%d", &u, &v, &w);
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    }

    // dist[v] = shortest distance from s.  Start with infinity.
    // Unlike standard BFS, a node can be relaxed multiple times,
    // so we use INT_MAX instead of -1 as sentinel.
    vector<int> dist(n, INT_MAX);
    deque<int> dq;
    dist[s] = 0;
    dq.push_back(s);

    while (!dq.empty()) {
        int u = dq.front(); dq.pop_front();

        for (auto [v, w] : adj[u]) {
            // Relax edge (u, v) with weight w.
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                // Weight-0 edge: v is same "distance layer" as u.
                //   Push to FRONT so it's processed before weight-1 neighbors.
                // Weight-1 edge: v is next layer.
                //   Push to BACK like normal BFS.
                if (w == 0) dq.push_front(v);
                else        dq.push_back(v);
            }
        }
    }

    for (int i = 0; i < n; i++)
        printf("%d%c", dist[i], " \n"[i == n - 1]);
}

Bonus: Multi-Source BFS

cpp
// Push ALL sources into the queue before the loop.
// dist[src] = 0 for every source.  Everything else is standard BFS.
vector<int> dist(n, -1);
queue<int> q;
for (int src : sources) {
    dist[src] = 0;
    q.push(src);
}
while (!q.empty()) {
    int u = q.front(); q.pop();
    for (int v : adj[u]) {
        if (dist[v] == -1) {
            dist[v] = dist[u] + 1;
            q.push(v);
        }
    }
}
// dist[v] = shortest distance from v to the nearest source.

Bonus: Grid BFS Snippet

cpp
int R, C;
vector<string> grid(R);
int dr[] = {-1, 1, 0, 0};
int dc[] = {0, 0, -1, 1};

vector<vector<int>> dist(R, vector<int>(C, -1));
queue<pair<int,int>> q;
dist[sr][sc] = 0;
q.push({sr, sc});
while (!q.empty()) {
    auto [r, c] = q.front(); q.pop();
    for (int d = 0; d < 4; d++) {
        int nr = r + dr[d], nc = c + dc[d];
        if (nr >= 0 && nr < R && nc >= 0 && nc < C
            && grid[nr][nc] != '#' && dist[nr][nc] == -1) {
            dist[nr][nc] = dist[r][c] + 1;
            q.push({nr, nc});
        }
    }
}

Operations Reference

OperationTimeSpace
Standard BFS (adjacency list)O(V+E)O(V) queue + O(V+E) graph
0-1 BFS (adjacency list)O(V+E)O(V) deque + O(V+E) graph
Multi-source BFSO(V+E)O(V) queue + O(V+E) graph
Grid BFS (R×C)O(RC)O(RC)
BFS path reconstruction (via parent array)O(V) extraO(V)
Bidirectional BFS (worst case)O(V+E), typically O(bd/2)O(bd/2)

Where V = vertices, E = edges, b = branching factor, d = shortest path length.


Problem Patterns & Tricks

Pattern 1: Shortest Path in Unweighted Graph

The bread-and-butter BFS application. Given an unweighted graph, find the shortest distance from source to all other nodes.

Trick: Use dist[v] = -1 as both "unvisited" flag and distance storage. No separate visited[] needed.

Examples: CF 20C -- Dijkstra?, CF 1063B -- Labyrinth

Pattern 2: Grid BFS / Flood Fill

Model a 2D grid as an implicit graph. Each cell is a node; edges connect to 4 (or 8) neighbors. Find shortest path, count connected components, or flood-fill regions.

Trick: Use int dr[]={-1,1,0,0}, dc[]={0,0,-1,1} for 4-directional moves. Always check bounds before accessing grid[nr][nc].

cpp
for (int d = 0; d < 4; d++) {
    int nr = r + dr[d], nc = c + dc[d];
    if (nr >= 0 && nr < R && nc >= 0 && nc < C && dist[nr][nc] == -1)
        // ...
}

Examples: CF 1037D -- Valid BFS?, CF 1582E -- Pchelyonok and Segments

Pattern 3: Multi-Source BFS

Instead of running BFS from each source separately (O(k(V+E))), push all k sources into the queue at distance 0 before the main loop. Single pass: O(V+E).

Use cases: Find shortest distance from every cell to the nearest fire/monster/exit.

Examples: CF 1293D -- Aroma's Search, CF 1294F -- Three Paths on a Tree

Pattern 4: 0-1 BFS

When edge weights are 0 or 1, use a deque instead of a priority queue. Weight-0 edges push to front, weight-1 edges push to back. Same complexity as BFS: O(V+E), much faster than Dijkstra.

Common scenario: Grid with some free moves and some costing 1. E.g., "breaking walls" costs 1, moving through open space costs 0.

Examples: CF 1063B -- Labyrinth, CF 877D -- Olya and Energy Drinks

Pattern 5: BFS on State Space / Implicit Graph

The "graph" isn't given explicitly. States are nodes; transitions are edges. Encode state as an integer or tuple, BFS over it.

Examples:

  • Rubik's cube / puzzle: state = board configuration.
  • Bitmask BFS: state = (node, visited_mask) for visiting all nodes, O(2nn).

Examples: CF 1272E -- Nearest Opposite Parity, CF 1572A -- Book

Pattern 6: Bidirectional BFS

Run BFS from both source and target simultaneously. Stop when the two frontiers meet. Reduces search space from O(bd) to O(bd/2) where b is branching factor.

When to use: Large implicit state spaces (e.g., puzzle problems) where bd is too big but bd/2 fits.

Trick: Alternate expanding the smaller frontier. When a node is in both visited sets, you've found a shortest path.

Examples: CF 1702F -- Equate Multisets

Pattern 7: BFS to Find Connected Components

Run BFS (or DFS) from every unvisited node to partition the graph into connected components. Useful as a preprocessing step.

cpp
int comp = 0;
vector<int> comp_id(n, -1);
for (int i = 0; i < n; i++) {
    if (comp_id[i] != -1) continue;
    queue<int> q;
    q.push(i);
    comp_id[i] = comp;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v : adj[u]) {
            if (comp_id[v] == -1) {
                comp_id[v] = comp;
                q.push(v);
            }
        }
    }
    comp++;
}

Examples: CF 977E -- Cyclic Components, CF 1176E -- Cover it!


Contest Cheat Sheet

+--------------------------------------------------------------+
|                        BFS CHEAT SHEET                       |
+--------------------------------------------------------------+
| WHEN TO USE:                                                 |
|   - Shortest path in unweighted graph                        |
|   - Shortest path with 0/1 weights (use deque)              |
|   - Nearest source in multi-source problems                  |
|   - Flood fill / connected components on grids               |
|   - State-space search (puzzle, bitmask)                     |
+--------------------------------------------------------------+
| TEMPLATE:                                                    |
|   vector<int> dist(n, -1);                                   |
|   queue<int> q;                                              |
|   dist[s] = 0; q.push(s);                                   |
|   while (!q.empty()) {                                       |
|       int u = q.front(); q.pop();                            |
|       for (int v : adj[u])                                   |
|           if (dist[v]==-1) { dist[v]=dist[u]+1; q.push(v);} |
|   }                                                          |
+--------------------------------------------------------------+
| 0-1 BFS: deque, push_front for w=0, push_back for w=1       |
| MULTI-SOURCE: push all sources at dist=0 before loop         |
+--------------------------------------------------------------+
| TIME:  O(V + E)        SPACE: O(V + E)                      |
| GRID:  O(R * C)        4-dir: dr={-1,1,0,0} dc={0,0,-1,1}  |
+--------------------------------------------------------------+
| MARK VISITED ON PUSH, NOT ON POP!                            |
+--------------------------------------------------------------+

Common Mistakes & Debugging

Gotcha 1: Marking Visited on Pop Instead of Push

+------------------------------------------------------------+
| CRITICAL: Mark vertices as visited when you PUSH them onto |
| the queue, NOT when you pop them. Marking on pop allows    |
| the same vertex to be pushed multiple times, causing O(V^2)|
| time and O(V^2) memory -- TLE and MLE on large graphs.     |
+------------------------------------------------------------+

Wrong:

cpp
while (!q.empty()) {
    int u = q.front(); q.pop();
    visited[u] = true;          // TOO LATE!
    for (int v : adj[u])
        if (!visited[v]) q.push(v);
}

Multiple copies of the same node get pushed into the queue before any of them is popped. This causes O(E) queue size instead of O(V), leading to TLE or MLE.

Right: Set dist[v] (or visited[v]) immediately when you push, not when you pop.

cpp
// CORRECT -- marks on push, each vertex enters queue exactly once
while (!q.empty()) {
    int v = q.front(); q.pop();
    for (int u : adj[v]) {
        if (!visited[u]) {
            visited[u] = true;  // Mark NOW, before push
            q.push(u);
        }
    }
}

Gotcha 2: Integer Overflow in Grid BFS

For large grids (R,C105), the total number of cells R×C can overflow int if you compute it carelessly. Use long long for cell counts or when encoding (r, c) as a single integer r * C + c.

Gotcha 3: 0-1 BFS -- Not Checking for Improvement

In 0-1 BFS, a node can be pushed multiple times. Always check if (dist[u] + w < dist[v]) before relaxing. Using a simple visited flag (like standard BFS) does not work.

Gotcha 4: Forgetting to Handle Disconnected Components

If the graph isn't connected, some nodes remain at dist = -1. Check whether the problem expects -1, INF, or "impossible" for unreachable nodes.

Gotcha 5: Grid BFS Bounds Check

Always validate nr >= 0 && nr < R && nc >= 0 && nc < C before accessing grid[nr][nc]. Out-of-bounds access causes undefined behavior (often silent corruption, not a crash).

Gotcha 6: Using BFS When Weights Are Not 0/1

BFS gives shortest path only for unweighted graphs. 0-1 BFS extends this to 0/1 weights. For arbitrary non-negative weights, you need Dijkstra. For negative weights, you need Bellman-Ford.

Debugging Tips

  • Print the dist array after BFS. If some values look wrong, check whether the graph was built correctly (missing edges, wrong direction).
  • Check queue size during BFS. If it grows much larger than V, you're likely marking visited on pop instead of push.
  • For grid BFS: print the dist grid as a 2D matrix to visually verify the wavefront.

Mental Traps

Trap 1: "DFS also finds paths -- I can use either."

DFS finds a path, not the shortest path. On small examples it might accidentally match BFS, but it is not guaranteed.

  Graph:   A ------- D        Shortest A->D: length 1 (edge A-D)
           |         |
           B --- C --+

  BFS from A:                 DFS from A (one possible order):
    Layer 0: A                  A -> B -> C -> D
    Layer 1: B, D               Path found: A-B-C-D (length 3)
    -> dist[D] = 1  CORRECT     -> WRONG! Missed the direct edge.

Rule: If the problem requires shortest path in an unweighted graph, DFS is never a valid substitute for BFS. If the problem requires any path or connectivity, DFS is fine (and often simpler).

Trap 2: "Multi-source BFS is just running BFS from each source."

Running BFS k times from k sources costs O(k(V+E)). Multi-source BFS pushes all k sources into the queue at distance 0 and runs a single BFS -- total cost O(V+E) regardless of k.

  3 sources: S1, S2, S3        Naive approach (3 separate BFS runs):
                                  Total work = 3 * O(V+E)
  O --- O --- O --- O
  |                 |           Multi-source BFS:
  S1    S2    O    S3             Queue init: [S1, S2, S3] all at dist 0
  |     |     |     |             Total work = 1 * O(V+E)
  O --- O --- O --- O
                                  Think: virtual super-node S* with
  Naive: 3x full traversal         0-cost edges to S1, S2, S3.
  Multi: 1x full traversal         Remove S* => multi-source BFS.

Self-Test

  • [ ] I can write BFS from memory with dist[v] = -1 init, queue, and marking on push
  • [ ] I can explain the marking-on-push vs marking-on-pop bug and why wrong version pushes same node O(degree) times
  • [ ] I can write multi-source BFS -- what initialization changes vs single-source
  • [ ] I can write 0-1 BFS with deque, push_front/push_back logic, and why visited flag doesn't work
  • [ ] I can state when BFS is strictly better than Dijkstra and when Dijkstra is needed

Quick-Fire Questions

Q1: Why does BFS guarantee shortest paths in unweighted graphs?

AnswerBFS processes nodes in order of distance from the source. It explores all nodes at distance k before any node at distance k+1 (FIFO guarantee). So the first time a node is reached, the path used has the minimum number of edges -- no later path can be shorter.

Q2: What's the bug if you mark nodes as visited when popping from the queue instead of when pushing?

AnswerA node can be pushed into the queue multiple times (once per neighbor that discovers it) before it's first popped. This wastes memory (queue grows to O(E) instead of O(V)) and time. Mark on push ensures each node enters the queue exactly once.

Q3: In multi-source BFS, how does the initialization differ from single-source BFS?

AnswerInstead of pushing one source with dist=0, push ALL source nodes into the queue with dist=0. The BFS then expands from every source simultaneously in a single O(V+E) pass. The first time any cell is reached, it's from the nearest source.

Q4: When should you use 0-1 BFS instead of Dijkstra?

AnswerWhen all edge weights are 0 or 1. 0-1 BFS runs in O(V+E) using a deque (push weight-0 to front, weight-1 to back), while Dijkstra takes O((V+E) log V) with a heap. Same correctness, better constant and asymptotic complexity for the 0/1 case.

Q5: A graph has 10^6 nodes and you use recursive DFS. What's likely to happen, and what's the fix?

AnswerStack overflow. The default call stack is typically 1-8 MB, and 10^6 recursive calls exceed it. Fix: use iterative DFS with an explicit stack (std::stack or vector), which uses heap memory and can handle arbitrarily deep graphs.

Practice Problems

#ProblemSourceDifficultyKey Idea
1LabyrinthCSESEasyBasic grid BFS, path reconstruction
2MonstersCSESEasy-MedMulti-source BFS (monsters) + single-source BFS (player)
3Valid BFS?CF 1037D1600Verify if a BFS order is valid by sorting adjacency lists
4Nearest Opposite ParityCF 1272E1500Reverse graph + multi-source BFS
5LabyrinthCF 1063B19000-1 BFS on grid with constrained left/right moves
6Olya and Energy DrinksCF 877D1900BFS with jump range (deque optimization)
7Three Paths on a TreeCF 1294F1900Two BFS calls to find diameter, then multi-source
8Cover it!CF 1176E1700BFS to 2-color tree, pick the smaller color class
9Cyclic ComponentsCF 977E1300BFS/DFS connected components + degree check
10Shortest Path with ObstacleCF 1547D1300Simple BFS / Manhattan distance with blocked cell

Level-Up Chain

Four levels of BFS mastery -- each introduces a harder twist on the basic algorithm.

LevelPatternProblem / DescriptionKey Idea
1Shortest path in a gridLabyrinth (CSES)Standard BFS on a grid with dx/dy arrays. First visit = shortest path.
2Multi-source BFSMonsters (CSES)Enqueue all sources at time 0 -- the BFS wavefront expands from every source simultaneously.
30-1 BFSLabyrinth (CF 1063B, 1900)Edge weights are 0 or 1. Use a deque: push weight-0 edges to front, weight-1 edges to back. Runs in O(V+E).
4BFS on implicit state graph"Minimum operations to transform X into Y" patternStates are not grid cells but abstract configurations (e.g., (position, keys_bitmask)). BFS explores the implicit state graph. See CF 877D.

Rating Progression

CF RatingWhat You Should Be Comfortable With
1200Grid BFS, shortest path in unweighted graph, connected components
1500Multi-source BFS, 0-1 BFS, BFS on implicit graphs (state-space)
1800BFS with bitmask states, BFS on complement graph in O(n+m), bidirectional BFS
2100BFS on layered/virtual graphs, BFS + binary search combos, BFS to establish distance arrays for DP

Before-You-Code Checklist

  • [ ] Visited array initialized? Every node must start unvisited. Did you use memset or fill correctly?
  • [ ] Source(s) enqueued and marked visited BEFORE the loop? Forgetting to mark the source as visited causes it to be re-enqueued.
  • [ ] Undirected edges added both ways? adj[u].push_back(v) AND adj[v].push_back(u).
  • [ ] Distance array type: Will distances exceed INT_MAX? (Usually no for BFS, but check.)
  • [ ] Grid bounds checked? For grid BFS: 0 <= nx < rows && 0 <= ny < cols before accessing the cell.

Debug Exercises

Exercise 1: Multi-source BFS gives wrong results:

cpp
queue<int> q;
vector<int> dist(n + 1, INF);
for (int s : sources) {
    q.push(s);
    // forgot dist[s] = 0;
}
Answer

Source nodes are enqueued but their distances are left as INF. When processing them, dist[u] + 1 = INF + 1 overflows or gives a huge number. Fix: set dist[s] = 0 for every source before starting BFS.

Exercise 2: 0-1 BFS on a grid gives TLE:

cpp
deque<pair<int,int>> dq;
dq.push_back({0, 0});
dist[0][0] = 0;
while (!dq.empty()) {
    auto [r, c] = dq.front(); dq.pop_front();
    for (auto [dr, dc, w] : moves) {
        int nr = r + dr, nc = c + dc;
        if (dist[nr][nc] > dist[r][c] + w) {
            dist[nr][nc] = dist[r][c] + w;
            if (w == 0) dq.push_front({nr, nc});
            else dq.push_back({nr, nc});
        }
    }
}
Answer

No bounds check on nr, nc before accessing dist[nr][nc]. This causes undefined behavior (out-of-bounds access). Also, unlike Dijkstra, 0-1 BFS can re-process nodes, so without careful handling this can be slow. Fix: add bounds checking and optionally a visited flag to avoid redundant processing.

Historical Note

BFS was first described by Konrad Zuse in his (rejected) 1945 PhD thesis on graph exploration. It was independently reinvented by Edward F. Moore in 1959 for finding shortest paths in mazes, and later formalized by C.Y. Lee in 1961 for wire routing in circuit boards.

Mnemonic

"First in, first out -- first found, shortest route." The queue's FIFO property is what guarantees shortest paths. No priority needed when all edges cost the same.


Designing Test Cases

BFS is "simple" -- until it isn't. Wrong visited checks, off-by-one on grid bounds, or forgetting to handle disconnected components will cost you. Test these.

Minimal cases:

  • Single node: n = 1, source = 0. Distance to itself is 0. Queue should start and immediately finish.
  • Start equals end: source = destination. Should return distance 0 without processing any edges.

Edge cases:

  • Disconnected components: Source can reach some nodes but not others. Unreachable nodes must have distance -1 or INF, not 0. This is the most common BFS bug -- forgetting to initialize distances.
  • Line graph: 0--1--2--...--(n-1). Tests that BFS correctly handles maximum depth. Distance from 0 to n-1 should be n-1.
  • Complete graph: Every node connects to every other node. All distances from source are 1. Catches issues with duplicate processing (pushing the same node multiple times without marking visited).
  • Cycle: 0--1--2--0. BFS should not loop forever. Verifies your visited array works.
  • Grid BFS -- boundary cells: Start or end at corner (0,0) or (n-1,m-1). Out-of-bounds access on neighbors is a classic crash.
  • Grid BFS -- no path: Fully walled off. BFS must terminate and report unreachable.

Stress test (run locally before submitting):

cpp
// Compare BFS shortest path against brute force (try all paths via DFS).
mt19937 rng(42);
for (int iter = 0; iter < 10000; iter++) {
    int n = rng() % 10 + 2;
    // random undirected graph: each edge exists with ~40% probability
    vector<vector<int>> adj(n);
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (rng() % 5 < 2) {
                adj[i].push_back(j);
                adj[j].push_back(i);
            }
    int src = 0, dst = n - 1;
    int bfs_ans = bfs_dist(adj, src, dst);
    int bf_ans = brute_shortest(adj, src, dst, n);  // DFS/enumerate all paths
    assert(bfs_ans == bf_ans);
}

Small graphs (n <= 10) make brute-force feasible. Run 10k+ iterations to cover many random topologies including disconnected cases.


Further Reading

See Also


How to Practice This

Speed Drill -- BFS on Adjacency List

Set a timer and implement BFS from a source node (compute shortest distances in an unweighted graph using an adjacency list) from scratch. No references.

AttemptTargetNotes
1st< 6 minFocus on correctness -- visited array, queue processing, distance tracking.
2nd< 4 minEliminate pauses -- the queue loop and neighbor iteration should be automatic.
3rd+< 3 minCompetition-ready. BFS is a fundamental primitive -- it must be instant.

Variation Drills

Once basic BFS is automatic, drill these essential variants:

VariationKey changePractice problem
0-1 BFSUse deque; push weight-0 edges to front, weight-1 to backCF 1063B - Labyrinth
Multi-source BFSInitialize queue with all sources at distance 0CSES Monsters
Grid BFS2D array, 4-directional neighbors, bounds checkingCSES Labyrinth

For each: implement, test on the linked problem, then time yourself.

Practice Strategy

  1. Week 1: Drill adjacency-list BFS daily until you hit 3 min consistently.
  2. Week 2: Alternate between grid BFS and multi-source BFS -- one per day.
  3. Week 3: Practice 0-1 BFS and BFS with state (e.g., BFS on (node, key-mask) for problems with keys and doors).
  4. Ongoing: For any shortest-path problem, first check: are weights uniform? If yes, BFS. If 0/1, deque BFS. Otherwise, Dijkstra.

Recap

  • BFS explores nodes layer by layer using a FIFO queue -- all distance-k nodes before any distance-(k+1) node.
  • The first time BFS reaches a node is the shortest path (unweighted graphs). No relaxation needed.
  • Mark visited on push, not on pop -- the single most common BFS bug.
  • Use dist[v] = -1 as both "unvisited" flag and distance storage; no separate visited[] array needed.
  • 0-1 BFS: swap the queue for a deque; weight-0 edges push front, weight-1 edges push back. Still O(V+E).
  • Multi-source BFS: push all sources at distance 0 before the loop -- single O(V+E) pass.
  • For weighted graphs with arbitrary non-negative weights, reach for Dijkstra instead.
  • Time: O(V+E) | Space: O(V) queue + O(V+E) graph.

Built for the climb from Codeforces 1100 to 2100.