Skip to content

Strongly Connected Components and 2-SAT

Quick Revisit

  • USE WHEN: Collapsing cycles in a directed graph (condensation) or solving boolean satisfiability with ≤2 literals per clause
  • INVARIANT: x and ¬x in the same SCC ⟹ 2-SAT is unsatisfiable; otherwise assign x = TRUE iff comp[x] > comp[¬x] (Kosaraju numbering is reverse-topological)
  • TIME: O(V + E) | SPACE: O(V + E)
  • CLASSIC BUG: Reversing the comp[] comparison direction in 2-SAT assignment — flips every variable's truth value
  • PRACTICE: 05-ladder-graphs

If I Had 5 Minutes:

  1. SCC = maximal set of vertices where every vertex can reach every other via directed paths.
  2. Kosaraju in 3 lines: DFS on G for finish order -> reverse all edges -> DFS on G^R in reverse finish order; each tree = one SCC.
  3. Condensation trick: collapse each SCC to a single super-node -> you get a DAG -> now do DP, topo-sort, or reachability.
  4. 2-SAT connection: model boolean variables as nodes (x and ¬x), clauses as implication edges, run SCC; unsatisfiable iff x and ¬x share an SCC.
  5. Biggest gotcha: Kosaraju's component numbering is reverse-topological -- comp[x] > comp[¬x] means TRUE. Get the comparison direction wrong and every assignment is flipped.

An SCC is a maximal set of vertices where every vertex is reachable from every other. 2-SAT uses SCCs on an implication graph to solve boolean satisfiability in linear time. AL-23

Difficulty: [Advanced] | CF Rating: 1800-2100 Prerequisites: DFS and Tree DFS, Topological Sort


Intuition

You have a directed graph on 6 nodes:

  0 --> 1 --> 2 --> 4
  ^     |     ^     |
  |     v     |     v
  +---- 3 ----+     5

Edges: 0->1, 1->3, 3->0, 3->2, 2->1, 2->4, 4->5.

You need to find every group of nodes that are mutually reachable -- if u can reach v and v can reach u through directed paths, they belong to the same group.

The brute-force approach: run BFS (or DFS) from every node, recording which nodes each one can reach. Then for each pair (u,v), check if u reaches v and v reaches u. That costs O(n(n+m)) time for the n BFS calls alone, and O(n2) space for the reachability matrix. On a contest graph with n=2×105 and m=5×105, that is roughly 1.4×1011 operations -- far too slow.

We need a way to discover all mutually-reachable groups in a single O(n+m) pass.

Run DFS twice -- once to record finish order, once on the reversed graph in reverse finish order -- each DFS tree in the second pass is exactly one SCC.

This is Kosaraju's algorithm, and the core idea rests on one observation: if you sort vertices by decreasing finish time from a DFS on the original graph G, and then explore the reversed graph GR in that order, each fresh DFS in GR stays trapped inside a single strongly connected component.

Think of it like water flowing through a system of one-way pipes. In the first pass you figure out which tanks drain last (latest finish time = "most upstream"). In the second pass you reverse every pipe and pour water from those upstream tanks. Because the pipes are reversed, water cannot escape the SCC it started in -- it only reaches nodes that could originally reach the starting tank.

Two DFS passes, both O(n+m). Total: O(n+m) time and space. No reachability matrix, no BFS from every node.

Visual Walkthrough

We trace Kosaraju on the 6-node graph above.

Step 1 -- Build the adjacency list and its reverse.

  G (forward):                G^R (reversed):
  0: [1]                      0: [3]
  1: [3]                      1: [0, 2]
  2: [1, 4]                   2: [3]
  3: [0, 2]                   3: [1]
  4: [5]                      4: [2]
  5: []                       5: [4]

Step 2 -- First DFS on G. Record finish order.

Start at node 0 (unvisited).

  call dfs(0)
    call dfs(1)
      call dfs(3)
        call dfs(0)  -- already visited, skip
        call dfs(2)
          call dfs(1) -- already visited, skip
          call dfs(4)
            call dfs(5)
            finish 5   order: [5]
          finish 4     order: [5, 4]
        finish 2       order: [5, 4, 2]
      finish 3         order: [5, 4, 2, 3]
    finish 1           order: [5, 4, 2, 3, 1]
  finish 0             order: [5, 4, 2, 3, 1, 0]

Step 3 -- Reverse the finish order to get processing order.

Processing order (latest finish first): [0, 1, 3, 2, 4, 5].

Step 4 -- Second DFS on GR in processing order.

  Pop 0: dfs on G^R from 0
    0 -> 3 -> 1 -> (0 visited, skip)
                -> 2 -> (3 visited, skip)
    Reached: {0, 3, 1, 2}  -->  SCC #0 = {0, 1, 2, 3}
  Pop 4: (1, 3, 2 already assigned, skip to 4)
    4 -> 2 (visited, skip)
    Reached: {4}            -->  SCC #1 = {4}
  Pop 5:
    5 -> 4 (visited, skip)
    Reached: {5}            -->  SCC #2 = {5}

Step 5 -- Result.

  SCC #0: {0, 1, 2, 3}   (all four nodes are mutually reachable)
  SCC #1: {4}             (singleton -- no cycle involving 4)
  SCC #2: {5}             (singleton -- 5 is a dead end)

  Condensation DAG:  SCC#0 --> SCC#1 --> SCC#2
  THE CONDENSATION DAG -- SCCs as super-nodes:

  Original graph:                 Condensation:
  +------------------+           
  | 0 -> 1 -> 2       |           +-------+    +-----+    +-----+
  | ^   ↓   ↑       |    ==>    | SCC#0 | -> | SCC#1| -> | SCC#2|
  | +-- 3 --+   4->5 |           |{0,1,  |   |  {4} |   |  {5} |
  +------------------+           | 2,3}  |   +-----+   +-----+
                                 +-------+
  Edges 2->4 and 4->5 become       (a DAG -- no cycles!)
  edges between SCC super-nodes

Verify: in the original graph, 0130 is a cycle, and 321 closes another cycle pulling 2 in. Nodes 4 and 5 have no return path, so they are singletons. Matches the algorithm output.

The Invariant

+------------------------------------------------------------------+
| In the second DFS (on G^R), each tree rooted at the unvisited    |
| node with the latest finish time from the first DFS forms        |
| exactly one strongly connected component.                        |
+------------------------------------------------------------------+

Why this holds: Let C be the SCC containing the node v with the latest finish time among all unvisited nodes. In the first DFS, v finished after every node reachable from v, so every node in C finished before v. In GR, edges inside C are reversed but still form a strongly connected subgraph, so DFS from v in GR reaches every node in C. It cannot escape C: any edge leaving C in GR leads to an SCC whose representative finished later than v in the first DFS and is therefore already assigned. So the tree rooted at v is exactly C.

When to Reach for This

Trigger phrases in problem statements:

  • "strongly connected components" or "mutually reachable" -- direct SCC.
  • "2-SAT", "boolean satisfiability with 2-literal clauses" -- build implication graph, run SCC.
  • "implication graph", "if A then B" constraints -- 2-SAT formulation.
  • "condensation DAG", "collapse cycles" -- SCC then DAG DP.
  • "at most one of" / "exactly one of" two choices per element -- classic 2-SAT encoding.
  • "minimum edges to make graph strongly connected" -- condensation, count sources and sinks.

Counter-examples (when this is NOT the right tool):

  • Undirected connectivity -- use Union-Find or simple BFS/DFS connected components, not SCC.
  • Biconnected components (undirected, edge/vertex) -- use Tarjan's bridge/articulation-point algorithm, not SCC.
  • Clauses with 3 or more literals -- that is 3-SAT, which is NP-complete. SCC-based 2-SAT does not apply.
  • "Shortest path" or "minimum cost" in a directed graph -- wrong category entirely; use Dijkstra/Bellman-Ford.
  • Constraints of the form "at most k of these n things" for k>1 -- not directly 2-SAT (might need network flow or ILP).

What the Code Won't Teach You

Kosaraju's is memorizable; the why is not. The code says "DFS on G, reverse, DFS on G^R in reverse finish order." But why does reverse finish order + reversed edges isolate exactly one SCC per tree? The key: the last node to finish in pass 1 belongs to a "source" SCC in the condensation DAG. Reversing all edges turns this source into a sink -- DFS from it can't escape. This structural argument is what makes the algorithm correct, not the code itself.

  WHY KOSARAJU'S WORKS -- the source-sink flip:

  Condensation DAG:     SCC_A -> SCC_B -> SCC_C

  Pass 1 (on G):
    SCC_A nodes finish LAST (they can reach everything)

  Pass 2 (on G^R, starting from SCC_A):
    G^R reverses all edges:  SCC_A <- SCC_B <- SCC_C
    SCC_A is now a SINK -- DFS from it stays inside SCC_A
    
    +--------+     +--------+     +--------+
    | SCC_A  | <-── | SCC_B  | <-── | SCC_C  |
    | (start |     | (next) |     | (last) |
    |  here) |     |        |     |        |
    +--------+     +--------+     +--------+
    DFS trapped    DFS trapped    DFS trapped
    inside A       inside B       inside C

2-SAT modeling is where contests are won or lost. The SCC solver is a black box -- you call build() and check the result. The real skill is translating English constraints into implications: "if sensor A is on, sensor B must be off" becomes (A -> ¬B) and (B -> ¬A), which is the clause (¬A ∨ ¬B). Getting the negation wrong or forgetting one implication direction is the #1 source of WA in 2-SAT problems.

  2-SAT MODELING CHEAT SHEET:

  English constraint        | Clause        | Implications
  --------------------------+---------------+---------------------
  "A must be true"          | (A ∨ A)       | ¬A -> A
  "A must be false"         | (¬A ∨ ¬A)    | A -> ¬A
  "Not both A and B"        | (¬A ∨ ¬B)    | A -> ¬B, B -> ¬A
  "At least one of A, B"    | (A ∨ B)       | ¬A -> B, ¬B -> A
  "Exactly one of A, B"     | (A∨B)∧(¬A∨¬B) | 4 implications
  "If A then B"             | (¬A ∨ B)      | A -> B, ¬B -> ¬A

The condensation DAG unlocks DP on directed graphs. After finding SCCs, collapsing each into a single node gives a DAG. Any problem that says "maximize/minimize something over all reachable nodes" or "count paths" on a directed graph with cycles becomes a DAG DP problem after condensation. The SCC algorithm is the preprocessing step; the DP is the actual solution.


From SCC to 2-SAT: when the graph is not given

So far the input has been a directed graph and the question has been about its SCCs. 2-SAT is the trick of going the other direction: you are given a boolean formula and asked whether some assignment makes it true, and you turn it into a directed graph -- the implication graph -- whose SCCs answer the question.

Set up the conversion. Each variable xi becomes two nodes: one for the literal xi (the variable being TRUE), one for ¬xi (the variable being FALSE). A 2-SAT clause (ab) -- where a and b are literals -- is logically equivalent to the pair of implications ¬ab and ¬ba. Add those two edges to the graph. Repeat for every clause.

The miracle is the following theorem:

2-SAT theorem. The formula is satisfiable iff for every variable xi, the literal xi and the literal ¬xi lie in different SCCs.

If xi and ¬xi are in the same SCC, then by following implications you can derive xi¬xi and ¬xixi -- a contradiction, so no assignment works. Otherwise, an assignment exists, and we can read it off the SCC labels.

This is why SCCs come first in the lesson and 2-SAT second: 2-SAT is not a separate algorithm, it is a modelling technique that delegates the work to SCC.


C++ STL Reference

FeatureUsageNotes
vector<vector<int>> adj(n)Adjacency listForward graph
vector<vector<int>> radj(n)Reverse adjacency listFor Kosaraju pass 2
vector<int> orderFinish order stackpush_back in post-order
vector<int> comp(n, -1)SCC component ID per vertexFilled in pass 2
vector<bool> vis(n)Visited flagsReset between passes
ranges::reverseReverse finish orderC++20 ranges

Implementation

Version 1: Kosaraju SCC (minimal)

cpp
#include <bits/stdc++.h>
using namespace std;

struct SCC {
    int n, num_scc = 0;
    vector<vector<int>> adj, radj;
    vector<int> order, comp;
    vector<bool> vis;

    SCC(int n) : n(n), adj(n), radj(n), comp(n, -1), vis(n) {}

    void add_edge(int u, int v) {
        adj[u].push_back(v);
        radj[v].push_back(u);
    }

    void dfs1(int v) {
        vis[v] = true;
        for (int u : adj[v])
            if (!vis[u]) dfs1(u);
        order.push_back(v);
    }

    void dfs2(int v, int c) {
        comp[v] = c;
        for (int u : radj[v])
            if (comp[u] == -1) dfs2(u, c);
    }

    void build() {
        for (int i = 0; i < n; i++)
            if (!vis[i]) dfs1(i);
        ranges::reverse(order);
        for (int v : order)
            if (comp[v] == -1) dfs2(v, num_scc++);
    }
};

int main() {
    int n, m;
    scanf("%d%d", &n, &m);
    SCC scc(n);
    for (int i = 0; i < m; i++) {
        int u, v;
        scanf("%d%d", &u, &v);
        scc.add_edge(u - 1, v - 1);
    }
    scc.build();
    printf("%d\n", scc.num_scc);
    for (int i = 0; i < n; i++)
        printf("%d%c", scc.comp[i] + 1, " \n"[i == n - 1]);
}
  KOSARAJU'S TWO-PASS STRUCTURE:

  Pass 1 (DFS on G):                  Pass 2 (DFS on G^R):
  +---------------------------+        +---------------------------+
  | Visit all nodes via DFS   |        | Process in reverse finish |
  | Record finish order:      |        | order on reversed graph:  |
  |   [5, 4, 2, 3, 1, 0]     |  ==>   |   Start at 0: reaches    |
  | (last finished = 0)       |        |   {0,3,1,2} -> SCC #0     |
  +---------------------------+        |   Start at 4: {4} -> SCC#1 |
                                       |   Start at 5: {5} -> SCC#2 |
  Reverse: [0, 1, 3, 2, 4, 5]         +---------------------------+

Version 2: 2-SAT solver (complete)

Variable xi maps to node 2i, ¬xi maps to node 2i+1.

cpp
#include <bits/stdc++.h>
using namespace std;

struct TwoSAT {
    int n;
    SCC scc; // uses SCC struct from above
    vector<bool> assignment;

    TwoSAT(int n) : n(n), scc(2 * n), assignment(n) {}

    // node index for literal: var i, positive => 2*i, negative => 2*i+1
    static int pos(int i) { return 2 * i; }
    static int neg(int i) { return 2 * i + 1; }
    static int negate(int lit) { return lit ^ 1; }

    // add clause (a OR b) where a,b are literal indices from pos()/neg()
    void add_clause(int a, int b) {
        scc.add_edge(negate(a), b);
        scc.add_edge(negate(b), a);
    }

    // force literal a to be true
    void force_true(int a) {
        add_clause(a, a);
    }

    // at most one of a, b is true
    void at_most_one(int a, int b) {
        add_clause(negate(a), negate(b));
    }

    // exactly one of a, b is true (XOR)
    void exactly_one(int a, int b) {
        add_clause(a, b);
        add_clause(negate(a), negate(b));
    }

    bool solve() {
        scc.build();
        for (int i = 0; i < n; i++) {
            if (scc.comp[pos(i)] == scc.comp[neg(i)])
                return false; // x and ~x in same SCC
            // With THIS file's Kosaraju (dfs2 calls num_scc++ as it
            // discovers components, starting from the latest-finishing
            // node of pass 1 -- the source of the condensation DAG):
            //   comp = 0  -> source SCC (no incoming edges from others)
            //   comp = K-1 -> sink   SCC (no outgoing edges to others)
            //
            // A satisfying 2-SAT assignment picks the literal whose SCC
            // is closer to a sink (later in topological order). With the
            // numbering above, "later in topo order" = larger comp value.
            // So x_i is TRUE iff comp[pos(i)] > comp[neg(i)].
            //
            // If you swap the SCC implementation (e.g., to Tarjan, which
            // numbers SCCs in topological order), invert this comparison.
            assignment[i] = scc.comp[pos(i)] > scc.comp[neg(i)];
        }
        return true;
    }
};

int main() {
    int n, m;
    scanf("%d%d", &n, &m);
    TwoSAT sat(n);
    for (int i = 0; i < m; i++) {
        // input: each clause is two literals
        // literal j means x_{|j|}, positive if j>0, negative if j<0
        int a, b;
        scanf("%d%d", &a, &b);
        int la = a > 0 ? TwoSAT::pos(a - 1) : TwoSAT::neg(-a - 1);
        int lb = b > 0 ? TwoSAT::pos(b - 1) : TwoSAT::neg(-b - 1);
        sat.add_clause(la, lb);
    }
    if (sat.solve()) {
        puts("YES");
        for (int i = 0; i < n; i++)
            printf("%d%c", (int)sat.assignment[i], " \n"[i == n - 1]);
    } else {
        puts("NO");
    }
}

Explained: key details

cpp
// Why negate(lit) = lit ^ 1?
// pos(i) = 2*i (even), neg(i) = 2*i+1 (odd)
// XOR with 1 flips the last bit: even <-> odd, i.e., x <-> ~x

// Why comp[pos(i)] > comp[neg(i)] means x_i = true?
// In *this* Kosaraju template, dfs2 increments num_scc as components
// are discovered in pass 2, starting from the latest-finishing node
// of pass 1. That first-discovered component (comp = 0) is a SOURCE
// of the condensation DAG; the last-discovered (comp = num_scc - 1)
// is a SINK. So comp values increase along topological order.
//
// The 2-SAT correctness rule says: pick the literal whose SCC comes
// later in topological order. "Later in topo order" = "closer to a
// sink" = "larger comp here". Hence comp[pos] > comp[neg] -> TRUE.
// (Tarjan's SCC labels in the opposite direction; if you swap the
// SCC backend, invert the comparison.)

// Clause (a OR b) => edges (~a -> b) and (~b -> a)
// Reading: "if a is false then b must be true" and vice versa.

Operations Reference

OperationTimeSpaceNotes
Build adjacency listO(n+m)O(n+m)Forward + reverse
Kosaraju SCCO(n+m)O(n+m)Two DFS passes
Tarjan SCCO(n+m)O(n+m)Single pass, slightly faster in practice
Condensation DAGO(n+m)O(nscc+mscc)Iterate edges, map to SCC IDs
2-SAT build graphO(m)O(n+m)2n nodes, 2m edges
2-SAT solveO(n+m)O(n+m)SCC + one linear scan
2-SAT single clauseO(1)O(1)Two edge insertions

Where n = vertices (or 2n for 2-SAT), m = edges (or 2m for 2-SAT).


Problem Patterns

Pattern 1: Find and count SCCs

Direct application. Count SCCs, find the largest, or check if the whole graph is strongly connected.

cpp
scc.build();
// number of SCCs
printf("%d\n", scc.num_scc);
// check if strongly connected
printf("%s\n", scc.num_scc == 1 ? "YES" : "NO");

Problems: CF 427C - Checkposts, CF 999E - Reachability from the Capital

Pattern 2: Condensation + DAG DP

Condense SCCs, then run DP on the DAG (longest path, min cost to reach all nodes, etc.).

cpp
scc.build();
vector<vector<int>> dag(scc.num_scc);
vector<int> scc_weight(scc.num_scc);
for (int v = 0; v < n; v++) {
    scc_weight[scc.comp[v]] += weight[v];
    for (int u : adj[v])
        if (scc.comp[v] != scc.comp[u])
            dag[scc.comp[v]].push_back(scc.comp[u]);
}
// now DP on dag in topological order (SCC IDs are reverse topo)
  CONDENSATION + DAG DP PIPELINE:

  Step 1: Find SCCs       Step 2: Build DAG       Step 3: DP on DAG
  +------------------+    +-----------------+     +-----------------+
  | Kosaraju/Tarjan  |    | For each edge   |     | Topo order on   |
  | comp[v] = SCC ID | ->  | if comp[u]!=    | ->   | condensation    |
  | for every node   |    |    comp[v]:     |     | dp[scc] = max   |
  +------------------+    |  add DAG edge   |     | reachable value |
                          +-----------------+     +-----------------+

Problems: CF 894E - Ralph and Mushrooms, CSES - Coin Collector

Pattern 3: Classic 2-SAT (direct boolean clauses)

Given explicit boolean constraints, build implication graph and solve.

Problems: CF 228E - The Road to Berland is Paved with Good Intentions, CF 776D - The Door Problem

Pattern 4: 2-SAT with "choose one of two" constraints

Each element has two choices. Conflicts between choices translate to 2-SAT clauses. Common in coloring, placement, and scheduling.

cpp
// element i has options A_i (pos(i)) and B_i (neg(i))
// if choosing A_i conflicts with A_j: at_most_one(pos(i), pos(j))
// means "not both A_i and A_j": (~A_i OR ~A_j)
sat.add_clause(TwoSAT::neg(i), TwoSAT::neg(j));

Problems: CF 1903F - Baby's First 2-SAT, CF 587D - Duff in Mafia

Pattern 5: 2-SAT + binary search on answer

Binary search on a parameter k. For each candidate k, build a 2-SAT instance and check satisfiability.

Problems: CF 1971H - +-Equalization, CF 1475F - Unusual Matrix

Pattern 6: Reachability and minimum edges to make strongly connected

Find the number of edges to add so the condensation DAG becomes strongly connected: max(sources,sinks) where sources/sinks are in the condensation.

cpp
scc.build();
if (scc.num_scc == 1) { printf("0\n"); return; }
vector<int> in_deg(scc.num_scc), out_deg(scc.num_scc);
for (int v = 0; v < n; v++)
    for (int u : adj[v])
        if (scc.comp[v] != scc.comp[u])
            out_deg[scc.comp[v]]++, in_deg[scc.comp[u]]++;
int sources = 0, sinks = 0;
for (int i = 0; i < scc.num_scc; i++) {
    sources += (in_deg[i] == 0);
    sinks += (out_deg[i] == 0);
}
printf("%d\n", max(sources, sinks));

Problems: CF 999E - Reachability from the Capital, CSES - Flight Routes Check

Pattern 7: 2-SAT with graph/geometry constraints

Place n objects where each has two possible positions. Geometric conflicts (overlap, distance) between positions create 2-SAT clauses.

Problems: CF 1215F - Radio Stations


Contest Cheat Sheet

+--------------------------------------------------------------+
|                  SCC / 2-SAT CHEAT SHEET                     |
+--------------------------------------------------------------+
| KOSARAJU SCC:                                                |
|   1. DFS on G, push to stack in post-order                   |
|   2. Reverse edges -> G^R                                    |
|   3. Pop stack, DFS on G^R -> each tree = 1 SCC              |
|   Time: O(n+m)   Space: O(n+m)                              |
+--------------------------------------------------------------+
| 2-SAT SETUP:                                                 |
|   - n vars -> 2n nodes: x_i = 2i, ~x_i = 2i+1              |
|   - clause (a OR b) -> edges (~a->b) and (~b->a)            |
|   - negate literal: lit ^ 1                                  |
+--------------------------------------------------------------+
| 2-SAT SOLVE:                                                 |
|   - Run SCC on implication graph                             |
|   - UNSAT iff comp[x] == comp[~x] for any x                 |
|   - Assignment: x=true iff comp[2i] > comp[2i+1]            |
+--------------------------------------------------------------+
| COMMON CLAUSES:                                              |
|   force true(a):       add_clause(a, a)                      |
|   at_most_one(a,b):    add_clause(~a, ~b)                    |
|   exactly_one(a,b):    add_clause(a,b) + add_clause(~a,~b)   |
|   implication(a->b):   add_clause(~a, b)                     |
+--------------------------------------------------------------+
| CONDENSATION: edges between diff SCCs -> DAG                 |
|   sources + sinks in DAG -> useful for reachability           |
+--------------------------------------------------------------+

Gotchas & Debugging

1. Recursion depth. Kosaraju and Tarjan both use DFS. For n105, recursive DFS may stack overflow. Use iterative DFS or increase stack size (ulimit -s unlimited on Linux, or #pragma comment(linker, "/STACK:1000000000") on Windows).

2. Forgetting the reverse graph. Kosaraju needs both adj and radj. If you add (u, v) to adj but forget (v, u) to radj, pass 2 produces garbage.

3. 2-SAT literal indexing off-by-one. Variable i maps to nodes 2i and 2i+1. If your input is 1-indexed, subtract 1 before calling pos(i-1) / neg(i-1).

4. Wrong comparison direction for assignment. With Kosaraju, comp[pos(i)] > comp[neg(i)] means true. With Tarjan (which labels in reverse discovery order), the comparison may flip. Know which convention your template uses.

5. Rebuilding 2-SAT in binary search. When binary searching on a parameter, you must rebuild the entire implication graph for each candidate. Clear the adjacency lists and comp array. A common bug is accumulating edges from prior iterations.

6. Missing implied edges. The clause (ab) gives two edges. Forgetting one makes the implication graph incomplete and can produce wrong satisfying assignments.

7. Self-loops in condensation. When building the condensation DAG, skip edges where comp[u] == comp[v]. Otherwise your "DAG" has self-loops and topological sort breaks.

8. Multi-test cleanup. In problems with multiple test cases, reset vis, order, comp, and num_scc. Using a struct with a constructor helps but you still need to clear order.

Mental Traps

Trap 1: "Two nodes in the same SCC means there's an edge between them."

No. Same SCC means there exist directed paths uv and vu. These paths can be arbitrarily long with many intermediate nodes. A cycle of length 100 puts all 100 nodes in one SCC -- but most pairs have no direct edge. Confusing "same SCC" with "adjacent" leads to wrong reasoning about condensation properties.

  SAME SCC != DIRECT EDGE:

  0 -> 1 -> 2 -> 3 -> 4 -> 0    (cycle of length 5)
  
  All 5 nodes are in the SAME SCC.
  But node 0 and node 3 have NO direct edge.
  
  Path 0->3:  0 -> 1 -> 2 -> 3  (length 3)
  Path 3->0:  3 -> 4 -> 0       (length 2)
  
  "Same SCC" = mutual reachability via PATHS
  NOT via direct edges.

Trap 2: "I'll just check if the graph has cycles to find SCCs."

Finding cycles is not the same as finding SCCs. A graph can have overlapping cycles that share nodes -- the SCC is the maximal set of mutually reachable nodes. DFS cycle detection finds a cycle, not the maximal component. You need Kosaraju's or Tarjan's to correctly identify all SCCs.

  CYCLE != SCC:

  Graph: 0->1, 1->2, 2->0, 2->3, 3->4, 4->2

  Cycles found: {0,1,2} and {2,3,4}
  
  But the SCC is {0,1,2,3,4} -- ALL five nodes!
  Because 0->1->2->3->4->2->0 connects everything.

  +---- one big SCC ----+
  | 0 -> 1 -> 2 -> 3 -> 4  |
  | ^         ↑       |  |
  | +-------- +  <-----+  |
  +----------------------+

  Cycle detection finds pieces.
  SCC algorithms find the whole.

Common Mistakes

  1. Indexing mismatch between variables and implication-graph nodes. A 2-SAT solution claims "unsatisfiable" on a clearly satisfiable formula when you number variables starting from 1 and use 2i for true, 2i+1 for false -- but variable 0's true node is node 0, and arrays get initialized assuming 1-indexed variables. Node 0 is never visited, causing an SCC to be missed. Fix: consistently use 0-indexed variables (x0,x1,,xn1) with nodes 2i and 2i+1, or 1-indexed with nodes 2i and 2i+1 starting from i=1.

Debug Drills

Drill 1 -- Kosaraju: forgetting to reverse the graph

cpp
// Second DFS: iterate in reverse finish order on the ORIGINAL graph
while (!order.empty()) {
    int v = order.top(); order.pop();
    if (!visited[v])
        dfs2(v, comp_id++);  // BUG: dfs2 should run on G^R, not G
}
What's wrong?

The second pass of Kosaraju must run on the reversed graph GR (all edges flipped). Running on G again just re-discovers the same DFS tree and doesn't correctly separate SCCs.

Drill 2 -- 2-SAT: implication edges reversed

cpp
// Clause: (x_i OR x_j)
// Should add: NOT x_i -> x_j AND NOT x_j -> x_i
add_edge(2*i, 2*j);      // BUG: this is x_i -> x_j, not NOT x_i -> x_j
add_edge(2*j, 2*i);
What's wrong?

The implication ¬xixj means "if xi is false, then xj must be true." The false-node of xi is 2*i + 1, and the true-node of xj is 2*j. Fix:

cpp
add_edge(2*i + 1, 2*j);  // NOT x_i -> x_j
add_edge(2*j + 1, 2*i);  // NOT x_j -> x_i

Drill 3 -- SCC assignment extraction: wrong comparison

cpp
// Extract 2-SAT assignment
for (int i = 0; i < n; i++)
    assignment[i] = (comp[2*i] < comp[2*i+1]);  // BUG
What's wrong?

With Kosaraju (or Tarjan where component IDs are in reverse topological order), xi is true when comp[2i]>comp[2i+1] (the true-literal's SCC is later in topological order). The < should be >:

cpp
assignment[i] = (comp[2*i] > comp[2*i+1]);

The exact direction depends on your SCC numbering convention -- check which direction is "later in topological order."

Self-Test

Before moving on, verify you can answer these without looking at the notes:

  • [ ] Write Kosaraju's algorithm from memory -- two DFS passes, first on G recording finish order, second on GR in reverse finish order, each tree in pass 2 = one SCC.
  • [ ] Encode the 2-SAT clause (AB) as implication edges -- ¬AB and ¬BA -- and explain why both directions are needed.
  • [ ] State the 2-SAT satisfiability condition: unsatisfiable iff any variable x and ¬x are in the same SCC.
  • [ ] Extract a satisfying assignment from SCCs: variable x is TRUE iff comp[2*i] > comp[2*i+1] (Kosaraju convention) -- and explain why this works.
  • [ ] Build the condensation DAG from SCC results and compute the number of edges needed to make a directed graph strongly connected: max(sources,sinks) in the condensation.

Practice Problems

CF RatingWhat You Should Be Able to Do
1200Know what an SCC is; detect cycles in a directed graph.
1500Implement Kosaraju's or Tarjan's algorithm; build the condensation DAG.
1800Model 2-SAT problems by constructing the implication graph; check satisfiability and extract assignments.
2100Solve advanced 2-SAT variants (with range constraints or binary search on parameters); combine SCC with DP on the condensation DAG.
#ProblemSourceDifficultyNotes
1CheckpostsCF 427C1700Count SCCs, sum minimums
2Reachability from the CapitalCF 999E2000Condensation + sources
3Ralph and MushroomsCF 894E2100SCC + DAG DP
4The Door ProblemCF 776D2000Classic 2-SAT
5Coin CollectorCSES1900SCC condensation + DP
6Flight Routes CheckCSES1600Check strong connectivity
7Radio StationsCF 1215F22002-SAT with intervals
8Duff in MafiaCF 587D25002-SAT + binary search
9Baby's First 2-SATCF 1903F23002-SAT with range constraints
10+-EqualizationCF 1971H20002-SAT from matrix

Further Reading


Solve With Me -- 2-SAT: Selecting a Valid Configuration

Problem (composite from several contests): n boolean variables. Given m constraints of the form "at least one of these two literals must be true" (i.e., clauses xixj where each literal can be negated). Find a satisfying assignment or report impossible.

OK this is textbook 2-SAT. But let me actually think through the implication graph construction because I mess this up every single time.

Each clause ab means: if ¬a then b, and if ¬b then a. So I need two directed edges per clause. Each variable xi gets two graph nodes: one for xi (true) and one for x¯i (false). I'll use nodes 2i and 2i+1 for variable i.

Now for the fun part: which node is "true" and which is "false"? I always get this backwards. Let me fix a convention: node 2i means "xi is true," node 2i+1 means "xi is false." Then the clause xixj becomes edges 2i+12j (if xi is false, xj must be true) and 2j+12i (if xj is false, xi must be true). For negated literals, swap: x¯i corresponds to node 2i... wait no. x¯i true means xi is false, which is node 2i+1. Ugh.

Let me just define lit(i, neg): if neg is false, return 2i; if true, return 2i+1. And negate(v) is v1. Then for clause (ab): add edge negate(a) -> b and negate(b) -> a. Much cleaner.

Build the implication graph, run Kosaraju (or Tarjan) to find SCCs. For each variable i: if 2i and 2i+1 are in the same SCC, that means xix¯ixi, a contradiction -> output impossible. Otherwise, xi is true iff comp[2i]>comp[2i+1] (using Kosaraju's reverse-topological SCC numbering).

The part I always forget: the direction of the comparison. Kosaraju numbers SCCs in reverse topological order -- the first SCC found in the second pass has the highest number but comes last topologically. So xi is true when its "true" node is in a later SCC (higher number in Kosaraju). I think. Let me just test on a tiny example... x0x1: edges 12 and 30. If these are the only constraints, both variables can be true. Does my comparison give that? Usually yes, because isolated nodes get their own SCCs and the ordering is arbitrary. Good enough -- the key invariant is "pick the literal whose SCC comes later in topological order."

What to recognize next time: 2-SAT setup is mechanical -- the only creative part is modeling real-world constraints as 2-literal clauses. Once you have clauses, it's negate(a) -> b and negate(b) -> a for each clause, then SCC. If you're getting WA, check the SCC comparison direction.

Built for the climb from Codeforces 1100 to 2100.