MST -- Kruskal's Algorithm

Quick Revisit
USE WHEN: minimum spanning tree, especially when edges are given as a list (edge-centric)
INVARIANT: greedily add cheapest edge that does not form a cycle (cut property)
TIME: O(E log E) (dominated by sort) | SPACE: O(V) for DSU
CLASSIC BUG: forgetting to sort edges, or incorrect union-find (no path compression / union by rank)
PRACTICE: ../12-practice-ladders/05-ladder-graphs.md

Find the minimum spanning tree of a weighted undirected graph by sorting edges and greedily adding the cheapest one that doesn't create a cycle. The go-to MST algorithm in contests thanks to its simplicity and natural pairing with DSU.

ID: AL-18 Difficulty: [Intermediate]Prerequisites: Graph Representation, Union-Find (DSU)CF Rating: 1400-1700

Contest Frequency: Regular -- appears in graph optimization problems

Intuition
C++ STL Reference
Implementation (Contest-Ready)
Operations Reference
Kruskal vs Prim
When to Reach for This
Problem Patterns & Tricks
Contest Cheat Sheet
Common Mistakes
Self-Test
Practice Problems
Recap
Further Reading

Intuition

You have 5 nodes and a bunch of weighted edges:

         4         7
   (A)--------(B)--------(C)
    |  \        |        / |
    |   \2      |5      /  |
   8|    \      |      /6  |9
    |     \     |     /    |
    |      \    |    /     |
   (D)-------(E)--------(F)
         1         3

Goal: pick a subset of edges that connects every node with the smallest possible total weight. That subset must form a tree (connected, no cycles) with exactly $V - 1 = 4$ edges.

Naive approach: enumerate all spanning trees, compute each total weight, keep the minimum. A complete graph on $V$ nodes has $V^{V - 2}$ spanning trees (Cayley's formula). For $V = 5$ that is $5^{3} = 125$ trees. For $V = 20$ it is $20^{18} \approx 2.6 \times 10^{23}$ . Completely hopeless beyond toy inputs.

We need a strategy that avoids brute-force enumeration.

Sort edges by weight. Add the cheapest edge that does not create a cycle. Repeat until you have $V - 1$ edges.

Analogy: imagine you are a government planner building roads between villages. You have a list of possible roads with their construction costs. You sort the list from cheapest to most expensive. You go down the list: if a road connects two villages (or village groups) that cannot yet reach each other, you build it. If they are already connected by previously built roads, you skip it --- building it would waste money on a redundant loop. By the time you have built $V - 1$ roads every village is reachable and the total cost is minimized.

The cycle check is exactly what a Disjoint Set Union (DSU) does in near-constant time: find(u) != find(v) means $u$ and $v$ are in different components, so the edge is safe.

Visual Walkthrough

Edges sorted by weight:

  Edge    Weight
  D-E       1
  A-E       2
  E-F       3
  A-B       4
  B-E       5
  C-F       6
  B-C       7
  A-D       8
  C-F       9   (parallel edge, heavier)

Step-by-step trace (DSU state shown as component sets):

Step 1: Edge D-E (w=1).  find(D)!=find(E) -> ADD.
  Forest: D--E
  Components: {A} {B} {C} {D,E} {F}          MST edges: 1

Step 2: Edge A-E (w=2).  find(A)!=find(E) -> ADD.
  Forest: A--E--D
  Components: {A,D,E} {B} {C} {F}            MST edges: 2

Step 3: Edge E-F (w=3).  find(E)!=find(F) -> ADD.
  Forest: A--E--D
              |
              F
  Components: {A,D,E,F} {B} {C}              MST edges: 3

Step 4: Edge A-B (w=4).  find(A)!=find(B) -> ADD.
  Forest: B--A--E--D
                |
                F
  Components: {A,B,D,E,F} {C}                MST edges: 4

Step 5: Edge B-E (w=5).  find(B)==find(E) -> SKIP (cycle).

Step 6: Edge C-F (w=6).  find(C)!=find(F) -> ADD.
  Forest: B--A--E--D
                |
                F--C
  Components: {A,B,C,D,E,F}                  MST edges: 5 = V-1. DONE.

Total MST weight: $1 + 2 + 3 + 4 + 6 = 16$ .

Resulting MST:

             4
  (A)--------(B)
    \
     \2
      \
  (D)--(E)--(F)
     1     3  \
               \6
               (C)

The Invariant

+------------------------------------------------------------------+
| INVARIANT                                                        |
|                                                                  |
| 1. The selected edge set always forms a FOREST (acyclic).        |
|    Every edge added connects two distinct components, so no      |
|    cycle is ever introduced.                                     |
|                                                                  |
| 2. CUT PROPERTY: for any cut (S, V\S) of the graph, the         |
|    minimum-weight edge crossing the cut belongs to some MST.     |
|    Because we process edges in sorted order, each accepted edge  |
|    is the cheapest edge crossing the cut between the two         |
|    components it connects --- so it is safe to include.          |
+------------------------------------------------------------------+

Plain-language exchange argument. Imagine some MST $T$ that does not include the cheapest edge $e = (u, v)$ crossing a particular cut between two components. $T$ is a spanning tree, so it already connects $u$ to $v$ through some path. Walk that path and locate the moment it leaves one side of the cut to reach the other — call that crossing edge $e^{'}$ . By the cut hypothesis, $w (e) \leq w (e^{'})$ .

Now swap them: drop $e^{'}$ from $T$ , add $e$ . The result is still connected (you removed and added one edge crossing the same cut) and still acyclic (you cannot have created a cycle: the cycle that briefly existed when both $e$ and $e^{'}$ were present is broken by removing $e^{'}$ ). The new tree has total weight $\leq$ the old one. So some MST contains $e$ , which is what "safe to add" means.

Kruskal exploits this by always considering the cheapest remaining edge: the two components it connects form a cut, and $e$ is by construction the lightest edge crossing it.

Why Kruskal exploits this. Processing edges lightest-first, each accepted edge is the lightest crossing the cut between its two components. The cut property guarantees it belongs to the MST.

Complexity. Sorting takes $O (E \log E)$ . Each edge triggers one find and at most one unite on the DSU, each $O (α (V))$ . Total: $O (E \log E)$ , which is $O (E \log V)$ since $E \leq V^{2}$ so $\log E \leq 2 \log V$ .

Disconnected Input → Spanning Forest

Kruskal does not assume the input graph is connected. If the graph has multiple components, the algorithm still terminates correctly: it greedily adds the cheapest edge between any two distinct components and stops when no more edges remain. The result is a minimum spanning forest — a minimum spanning tree in each component.

Concretely, after the loop:

If cnt == V - 1 (or mst_edges.size() == V - 1), the graph was connected and you have an MST.
If cnt < V - 1, the graph was disconnected; you have a spanning forest with cnt edges spread across V - cnt components.

Many problems hinge on this distinction. "Connect every node" with a disconnected graph is impossible, and you must report so. "Minimum cost to keep what is connected, connected" is the spanning-forest sum. Always check cnt == V - 1 before declaring victory.

What the Code Won't Teach You

Kruskal's is DSU + sorting -- nothing more.

The entire algorithm is: sort edges, iterate, use DSU to reject cycles. There is no graph traversal, no relaxation, no priority queue. If you understand DSU's find and unite, you already understand the cycle-detection half. The other half is std::sort. The elegance is that these two simple pieces together solve a deep optimization problem.

KRUSKAL STEP-BY-STEP: EDGE SELECTION + DSU STATE
=================================================
Use [X]--[Y] for MST edges, (X)--(Y) for remaining.

Graph: 4 nodes, 5 edges
       3           5
  (A)------(B)------(C)
    \      /          |
     4   2            1
      \ /             |
      (D)-------------+

Sorted edges: C-D(1)  B-D(2)  A-B(3)  A-D(4)  B-C(5)

Step 1: C-D (w=1)  find(C)!=find(D) --> ADD
  DSU: {A} {B} {C,D}         MST edges: 1
  Tree so far:
        (A)    (B)    [C]--1--[D]

Step 2: B-D (w=2)  find(B)!=find(D) --> ADD
  DSU: {A} {B,C,D}           MST edges: 2
  Tree so far:
        (A)    [B]--2--[D]--1--[C]

Step 3: A-B (w=3)  find(A)!=find(B) --> ADD
  DSU: {A,B,C,D}             MST edges: 3 = n-1 --> DONE!
  Final MST:
        [A]--3--[B]--2--[D]--1--[C]

Step 4: A-D (w=4)  SKIP  (find(A)==find(D), would create cycle)
Step 5: B-C (w=5)  SKIP  (find(B)==find(C), would create cycle)

MST total weight: 1 + 2 + 3 = 6

Non-obvious MST problems require graph construction.

The hard part is never the algorithm -- it's building the graph. When a problem smells like MST but doesn't hand you an explicit edge list, ask: "What are the nodes? What are the edges? Can I reduce the edge count?"

NON-OBVIOUS EDGE REDUCTION: SORTED ARRAY -> ADJACENT-ONLY EDGES
================================================================

Problem: "Connect n cities. Cost(i,j) = |a[i] - a[j]|. Find MST."

Naive: all n*(n-1)/2 edges.  n=10^5 --> ~5*10^9 edges. Too slow.

Key insight: sort by a[i]. Only adjacent pairs can be MST edges.

  Unsorted:  a = [8, 2, 5, 1, 9]

  All-pairs edges (10 edges):
  +---+   +---+   +---+   +---+   +---+
  | 8 |   | 2 |   | 5 |   | 1 |   | 9 |
  +---+   +---+   +---+   +---+   +---+
    *-------*       |       |       |
    *---------------*       |       |
    *-----------------------*       |
    *-------------------------------*
            *-------*       |       |
            *---------------*       |
            *-----------------------*
                    *-------*       |
                    *---------------*
                            *-------*

  Sorted:    a = [1, 2, 5, 8, 9]

  Adjacent-only edges (4 = n-1 edges):
  +---+   +---+   +---+   +---+   +---+
  | 1 |---| 2 |---| 5 |---| 8 |---| 9 |
  +---+ 1 +---+ 3 +---+ 3 +---+ 1 +---+

  Why? For non-adjacent i < j < k in sorted order:
    |a[i] - a[k]| = |a[i] - a[j]| + |a[j] - a[k]|
  Edge (i,k) is NEVER cheaper than BOTH (i,j) and (j,k).
  So (i,k) is never needed in the MST.

  Result: O(n log n) sort + Kruskal on n-1 edges. Done.

C++ STL Reference

Function / Class	Header	What it does	Time
`sort(first, last)`	`<algorithm>`	Sorts edges by weight (default `<` on tuples/pairs)	$O (E \log E)$
`sort(first, last, comp)`	`<algorithm>`	Custom comparator (e.g., descending for max spanning tree)	$O (E \log E)$
`tuple<int,int,int>`	`<tuple>`	Store `{weight, u, v}` --- sorts by weight first	$O (1)$ compare
`array<int,3>`	`<array>`	Alternative to tuple; same lexicographic sort	$O (1)$ compare
`vector<tuple<int,int,int>>`	`<vector>`	Edge list container	$O (1)$ amortized push
`iota(first, last, val)`	`<numeric>`	Initialize DSU parent: `iota(par.begin(), par.end(), 0)`	$O (n)$
`accumulate(first, last, init)`	`<numeric>`	Sum MST edge weights (if stored)	$O (n)$

Key point: Store edges as {weight, u, v} (weight first), so default sort orders by weight. No custom comparator needed.

Implementation (Contest-Ready)

Version 1 --- Minimal contest template

cpp

#include <bits/stdc++.h>
using namespace std;

struct DSU {
    vector<int> p, rk;
    DSU(int n) : p(n), rk(n, 0) { iota(p.begin(), p.end(), 0); }
    int find(int x) { return p[x] == x ? x : p[x] = find(p[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        if (rk[a] < rk[b]) swap(a, b);
        p[b] = a;
        if (rk[a] == rk[b]) rk[a]++;
        return true;
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;
    vector<array<long long, 3>> edges(m); // {w, u, v}
    for (int i = 0; i < m; i++) {
        int u, v; long long w;
        cin >> u >> v >> w;
        edges[i] = {w, u - 1, v - 1};
    }
    sort(edges.begin(), edges.end());

    DSU dsu(n);
    long long mst = 0;
    int cnt = 0;
    for (auto [w, u, v] : edges) {
        if (dsu.unite(u, v)) {
            mst += w;
            if (++cnt == n - 1) break;
        }
    }

    if (cnt == n - 1)
        cout << mst << "\n";
    else
        cout << "IMPOSSIBLE\n";
}

Version 2 --- Explained version with MST edge tracking

cpp

#include <bits/stdc++.h>
using namespace std;

struct DSU {
    vector<int> parent, rank_;
    DSU(int n) : parent(n), rank_(n, 0) {
        iota(parent.begin(), parent.end(), 0);
    }
    // Path compression: flatten the tree on every find.
    int find(int x) {
        if (parent[x] != x)
            parent[x] = find(parent[x]);
        return parent[x];
    }
    // Union by rank: attach shorter tree under taller.
    // Returns true if a merge happened (u, v were in different sets).
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false; // already connected -> skip (cycle)
        if (rank_[a] < rank_[b]) swap(a, b);
        parent[b] = a;
        if (rank_[a] == rank_[b]) rank_[a]++;
        return true;
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;

    // Store edges as {weight, u, v}. Weight first so sort() orders by weight.
    // Use long long for weight to avoid overflow when summing.
    struct Edge { long long w; int u, v; };
    vector<Edge> edges(m);
    for (int i = 0; i < m; i++) {
        cin >> edges[i].u >> edges[i].v >> edges[i].w;
        edges[i].u--; edges[i].v--; // convert to 0-indexed
    }

    // Sort by weight ascending. This is the core of Kruskal:
    // process cheapest edges first.
    sort(edges.begin(), edges.end(), [](const Edge& a, const Edge& b) {
        return a.w < b.w;
    });

    DSU dsu(n);
    long long mst_weight = 0;
    vector<pair<int,int>> mst_edges; // store which edges are in the MST

    for (auto& [w, u, v] : edges) {
        // If u and v are in different components, this edge is safe to add.
        // It's the cheapest edge crossing this particular cut (sorted order).
        if (dsu.unite(u, v)) {
            mst_weight += w;
            mst_edges.push_back({u, v});
            // MST has exactly n-1 edges; stop early once we have them all.
            if ((int)mst_edges.size() == n - 1) break;
        }
    }

    if ((int)mst_edges.size() < n - 1) {
        // Graph is disconnected. No spanning tree exists.
        cout << "IMPOSSIBLE\n";
    } else {
        cout << mst_weight << "\n";
        // Optional: print the MST edges
        for (auto [u, v] : mst_edges)
            cout << u + 1 << " " << v + 1 << "\n";
    }
}

Operations Reference

Operation	Time	Space
Full Kruskal (sort + DSU)	$O (E \log E)$	$O (V + E)$
Sort edges	$O (E \log E)$	$O (E)$
Single DSU find	$O (α (V))$	-
Single DSU unite	$O (α (V))$	-
All DSU operations (over $E$ edges)	$O (E \cdot α (V))$	$O (V)$
Check connectivity after MST	$O (1)$	-

Notes:

Bottleneck is sorting. The DSU part is effectively linear.
For $V, E \leq 2 \times 10^{5}$ , runs in under 100ms. For $E \leq 10^{6}$ , still under 500ms.
Space is $O (V)$ for DSU + $O (E)$ for edge list. If the problem gives edges as input, you're already storing them.
If edges arrive pre-sorted (rare), the algorithm is $O (E \cdot α (V)) \approx O (E)$ .

Kruskal vs Prim

	Kruskal	Prim
Best for	Sparse graphs, edge list input	Dense graphs, adjacency list/matrix
Complexity	$O (E \log E)$	$O ((V + E) \log V)$ with binary heap
Data structure	DSU	Priority queue
Implementation	Simpler	More code
Edge list input	Natural (sort directly)	Must build adjacency list first

Rule of thumb: if the input is an edge list (which it almost always is on CF), use Kruskal. Use Prim's only when you have a dense graph ( $E \approx V^{2}$ ) and want $O (V^{2})$ without a heap.

Kruskal also has the advantage that the DSU built during the algorithm is reusable for follow-up connectivity queries.

When to Reach for This

Trigger phrases:

"minimum spanning tree"
"connect all nodes with minimum cost"
"minimum cost network / wiring / road"
"maximum spanning tree" (negate weights or sort descending)
"offline connectivity queries sorted by weight"

Kruskal is preferred when:

Edges are given as a list (the most common CF input format)
The graph is sparse ( $E ≪ V^{2}$ )
You need to process edges in sorted weight order for other reasons (e.g., offline queries)
You need the DSU afterward for connectivity checks

Prim is preferred when:

The graph is dense ( $E \approx V^{2}$ ) and you want $O (V^{2})$ with an adjacency matrix
The input is already an adjacency list and building an edge list would be wasteful

CF frequency. MST appears as a CF tag on ~5% of Div 2 C/D problems. Common scenarios: minimum cost to connect all nodes, maximum spanning tree (negate weights or sort descending), MST with constrained edges, second-best MST, MST queries.

Counter-examples --- when Kruskal does NOT apply directly:

Directed graphs. MST is defined on undirected graphs. For directed graphs you need a minimum spanning arborescence (Edmonds/Chu-Liu).
Steiner tree. "Connect a subset of nodes with minimum cost" is NP-hard in general, not MST.
Shortest paths. MST minimizes total edge weight, not the distance between any particular pair of nodes. Don't confuse MST with Dijkstra.

Problem Patterns & Tricks

Pattern 1 --- Classic MST

Read an edge list, find minimum total weight to connect all vertices. Direct application of Kruskal.

Recognize it: "Connect all cities with minimum cable cost," "minimum cost spanning tree."

CSES -- Road Reparation
CF 472D -- Design Tutorial: Inverse the Problem

Pattern 2 --- Maximum spanning tree

Sort edges in descending order (or negate weights). Everything else stays the same. Used in problems where you want to maximize the minimum edge on a path, or maximize total weight.

cpp

sort(edges.begin(), edges.end(), [](auto& a, auto& b) {
    return a[0] > b[0]; // descending by weight
});

Recognize it: "Maximize the minimum bandwidth," "maximum bottleneck path."

CF 609E -- Minimum Spanning Tree For Each Edge
CF 1133F2 -- Spanning Tree with One Fixed Degree

Pattern 3 --- MST with constrained edges (must-include / must-exclude)

Some edges are forced into or out of the MST. For must-include: add those edges first (unite their endpoints), then run Kruskal on the remaining edges. For must-exclude: skip them during Kruskal.

A common variant: find MST that includes a specific edge $e$ . Add $e$ first, then run Kruskal on the rest.

CF 1095F -- Make It Connected (virtual node + forced edges)
CF 1245D -- Shichikuji and Power Grid

Pattern 4 --- Second-best MST

Find the MST weight, then for each non-MST edge $(u, v, w)$ , find the heaviest edge on the MST path from $u$ to $v$ . Swapping gives a spanning tree of weight $MST - max_on_path + w$ . The minimum such value over all non-MST edges is the second-best MST.

Finding the max edge on a path requires LCA with edge-weight tracking, or brute force on small inputs.

CSES -- New Roads Queries (related: connectivity over time)
CF 160D -- Edges in MST

Pattern 5 --- Kruskal + offline connectivity queries

Process edges in weight order while answering queries like "when do $u$ and $v$ first become connected?" or "what is the maximum edge weight on the MST path between $u$ and $v$ ?"

Sort edges and queries together. As you unite components, check if the current edge resolves any pending query.

cpp

// For "minimum bottleneck path" queries:
// Sort edges by weight. Process edges via Kruskal.
// Query (u,v): answer is the weight of the edge that first connects u and v.

CF 1213G -- Path Queries
CF 1851G -- Vlad and the Mountains

Pattern 6 --- MST on complete/implicit graphs

When the graph is complete ( $E = O (V^{2})$ ) or defined implicitly (e.g., edge weight = $| x_{i} - x_{j} | + | y_{i} - y_{j} |$ ), you can't enumerate all edges. Reduce edges via sorting by coordinates, nearest-neighbor tricks, or problem-specific observations to get $O (V \log V)$ candidate edges, then run Kruskal on those.

Manhattan MST: Sort points by $x + y$ , $x - y$ , etc. Only $O (V)$ candidate edges needed.

CF 1550E -- Boring Segments
CF 1095F -- Make It Connected

Pattern 7 --- Kruskal for building Kruskal's reconstruction tree

Build a "Kruskal tree" where each DSU merge creates a new internal node with weight equal to the merge edge. The resulting tree supports queries like "minimum bottleneck path" via LCA. Advanced but powerful.

CF 1706E -- Qpwoeirut and Vertices
CF 1648D -- Serious Business

Contest Cheat Sheet

+--------------------------------------------------------------+
|  KRUSKAL'S MST -- QUICK REFERENCE                            |
+--------------------------------------------------------------+
|  WHEN: MST on edge list, sparse graphs, offline connectivity |
|  TIME: O(E log E)          SPACE: O(V + E)                   |
+--------------------------------------------------------------+
|  TEMPLATE:                                                   |
|    sort edges by weight                                      |
|    DSU dsu(n);                                               |
|    long long mst = 0; int cnt = 0;                           |
|    for (auto [w, u, v] : edges)                              |
|      if (dsu.unite(u, v)) {                                  |
|        mst += w;                                             |
|        if (++cnt == n - 1) break;                            |
|      }                                                       |
+--------------------------------------------------------------+
|  EDGES: store as {weight, u, v} so sort is by weight         |
|  MAX SPANNING TREE: sort descending (or negate weights)      |
+--------------------------------------------------------------+
|  WATCH OUT:                                                  |
|  - long long for total weight (sum can overflow int)         |
|  - Check cnt == n-1 (graph might be disconnected)            |
|  - 0-indexed vs 1-indexed nodes in DSU                       |
|  - DSU must use path compression + union by rank             |
|  - Don't forget: MST has exactly V-1 edges                   |
+--------------------------------------------------------------+

Common Mistakes

Mistake	Symptom	Fix
Forgetting to sort edges	Valid spanning tree but not minimum weight	Always sort edges by weight before the main loop
Incorrect DSU (no path compression / union by rank)	TLE on large inputs or stack overflow	Use both path compression and union by rank
Not checking connectivity	Report MST weight when graph is disconnected	Verify `cnt == n - 1` after the loop
Integer overflow on total weight	Wrong answer on large inputs	Use `long long` for the sum (individual weights * V-1 can exceed 2^31)
Off-by-one: MST has V-1 edges	Wrong edge count check	MST has exactly V-1 edges, not V

Detailed Gotchas

1. Integer overflow on MST weight. Individual weights fit in int ( $\leq 10^{9}$ ), but summing $V - 1$ edges can reach $\sim 2 \times 10^{14}$ . Always use long long for the total weight. The edge weights themselves can stay int if they fit.

2. Disconnected graphs. If $cnt < V - 1$ after processing all edges, the graph has no spanning tree. Some problems ask for a spanning forest (sum of MSTs of each component); others want "IMPOSSIBLE". Read the problem statement.

3. 1-indexed nodes + 0-indexed DSU. Codeforces usually gives 1-indexed nodes. If your DSU is 0-indexed, subtract 1 from node IDs when building the edge list. If 1-indexed, allocate DSU(n + 1). Mixing causes silent wrong answers.

4. Parallel edges. Kruskal handles parallel edges naturally --- they're just extra entries in the sorted edge list. Don't deduplicate.

5. Self-loops. A self-loop $(u, u, w)$ has find(u) == find(u), so unite returns false. It's automatically skipped. No special handling needed.

6. Forgetting the early break. The if (++cnt == n - 1) break; is an optimization, not correctness. Without it, you still get the right MST weight --- you just iterate through extra edges pointlessly. On tight TL with $E = 10^{6}$ , the break matters.

7. Maximum spanning tree direction. For max spanning tree, either sort descending or negate all weights and find min spanning tree. If you negate, remember to negate the final answer back.

8. Edge weight ties. When multiple edges have the same weight, the MST may not be unique, but Kruskal still finds a valid MST. If the problem asks "is this edge in every MST?" --- that's a harder question (see Pattern 4).

Debugging checklist:

Print sorted edge list. Are weights in the right order?
Print each accepted edge: cerr << "add " << u << "-" << v << " w=" << w << "\n";
Verify cnt == n - 1 at the end.
Test on a small graph (4-5 nodes) by hand.
Check for 0-index vs 1-index mismatches in the DSU.

Mental Traps

Trap: "The MST path between two nodes is the shortest path."

MST minimizes total tree weight, not any specific pairwise distance. A direct edge may be excluded from the MST because its endpoints were already connected through cheaper edges -- yet that direct edge IS the shortest path.

MST PATH vs SHORTEST PATH -- SAME GRAPH
========================================

Graph:
         1           1           1
    [A]-------[B]-------[C]-------[D]
     |                              |
     +-------------2----------------+

Sorted edges: A-B(1) B-C(1) C-D(1) A-D(2)

Kruskal adds: A-B(1), B-C(1), C-D(1) --> 3 = n-1, DONE.
Edge A-D(2) skipped: find(A)==find(D).

MST (total weight = 3):
    [A]---1---[B]---1---[C]---1---[D]

MST path A -> D:  A -> B -> C -> D     cost = 3
Shortest   A -> D:  A -----------------> D  cost = 2
                                         ^--- DIFFERENT!

The MST is optimal for TOTAL weight (3 < any other
spanning tree). But the PATH through the MST between
specific nodes can be longer than the direct route.

Trap: "Forgetting to sort produces a subtly wrong answer."

Without sorting, Kruskal processes edges in input order. DSU still rejects cycles, so the result is a valid spanning tree -- but NOT minimum. The bug is completely silent: no crash, no assertion failure, just the wrong weight.

UNSORTED EDGES --> WRONG TREE
=============================

Graph:
         3           1
    [A]-------[B]-------[C]
     |                   |
     +---------2---------+

Edges in input order: (A,B,3) (B,C,1) (A,C,2)

WITHOUT sorting (process in input order):
  (A,B,3): find(A)!=find(B) -> ADD    DSU: {A,B} {C}
  (B,C,1): find(B)!=find(C) -> ADD    DSU: {A,B,C}
  (A,C,2): find(A)==find(C) -> SKIP
  Result: weight = 3 + 1 = 4          <-- WRONG

  Tree built:    [A]--3--[B]--1--[C]

WITH sorting: (B,C,1) (A,C,2) (A,B,3)
  (B,C,1): find(B)!=find(C) -> ADD    DSU: {B,C} {A}
  (A,C,2): find(A)!=find(C) -> ADD    DSU: {A,B,C}
  (A,B,3): find(A)==find(B) -> SKIP
  Result: weight = 1 + 2 = 3          <-- CORRECT

  Tree built:    [A]--2--[C]--1--[B]

Both are valid spanning trees. Only the sorted run is MINIMUM.

Trap: "The MST is unique."

Only true when all edge weights are distinct. With ties, multiple valid MSTs can exist. Problems that ask "is edge X in THE MST?" are ill-posed unless they mean "in every MST" or "in some MST."

MULTIPLE MSTs WITH EQUAL WEIGHTS
=================================

         2           2
    [A]-------[B]-------[C]
     |                   |
     +---------2---------+

All three edges have weight 2. Any two edges form a valid
MST with total weight 4:

  MST #1:                MST #2:                MST #3:
  [A]--2--[B]--2--[C]    [A]--2--[B]            [B]--2--[C]
                          |                              |
                          +--2--[C]              [A]--2--+

If a problem asks "is edge A-C in the MST?" the answer
depends on WHICH MST. Ask: "in ALL MSTs? in SOME MST?"

Self-Test

Without looking at the sections above, verify you can:

Write Kruskal's from memory -- edge sort, DSU init, cycle check via find, unite, weight accumulation, and the cnt == n-1 termination.
Explain the cut property -- why is the minimum-weight edge crossing any cut always safe to include in the MST?
State when Kruskal's beats Prim's -- and vice versa. What is the complexity crossover point?
Identify a non-obvious edge reduction -- given a problem where the naive graph has $O (n^{2})$ edges, describe how sorting reduces it to $O (n)$ edges before running Kruskal.
Predict the output of unsorted Kruskal -- given a small graph, trace what happens if you skip the sort. Explain why the result is a spanning tree but not minimum.
Distinguish MST from shortest path -- draw a 4-node graph where the MST path between two nodes is strictly longer than the shortest path.
State the MST edge count -- exactly $V - 1$ edges on $V$ nodes, and why this must be so (tree = connected acyclic graph).

Practice Problems

#	Problem	Source	Difficulty	Key Idea
1	Road Reparation	CSES	Easy	Textbook Kruskal, detect disconnected graph
2	Building Roads	CSES	Easy	Find connected components, add edges to connect them
3	Inverse the Problem	CF 472D	Medium	Build MST, check if distance matrix matches MST distances
4	Path Queries	CF 1213G	Medium	Offline: sort edges + queries, Kruskal with component sizes
5	Minimum Spanning Tree For Each Edge	CF 609E	Medium	MST + max edge on path via LCA
6	Make It Connected	CF 1095F	Medium	Virtual node with special edges + Kruskal
7	Shichikuji and Power Grid	CF 1245D	Medium	Model as MST with virtual source node
8	Vlad and the Mountains	CF 1851G	Medium-Hard	Offline queries, Kruskal-style edge processing
9	Edges in MST	CF 160D	Hard	Classify each edge: in all MSTs / some / none
10	Qpwoeirut and Vertices	CF 1706E	Hard	Kruskal reconstruction tree + range queries

Rating Progression

CF Rating	What You Should Be Comfortable With
1200	Template Kruskal, build MST, compute MST weight
1500	MST uniqueness check, handle equal-weight edges, MST on modified edge sets
1800	Kruskal reconstruction tree, second-best MST, MST with constraints (forced edges)
2100	Online MST maintenance, Kruskal + segment tree tricks, Matroid intersection basics

Debug This

Bug 1: Kruskal's adds fewer than $n - 1$ edges but the graph is connected:

cpp

vector<tuple<int,int,int>> edges;  // {w, u, v}
for (int i = 0; i < m; i++) {
    int u, v, w; cin >> u >> v >> w;
    edges.push_back({w, u, v});
}
sort(edges.begin(), edges.end());
DSU dsu(n);  // initializes for nodes 0..n-1
int count = 0;
for (auto [w, u, v] : edges) {
    if (dsu.find(u) != dsu.find(v)) {
        dsu.unite(u, v);
        count++;
    }
}
// count < n-1 ... but graph is connected!

Answer

The nodes are likely 1-indexed in the input, but the DSU is initialized for 0..n-1. Node n is out of range, so edges involving node n don't correctly unite. Fix: initialize DSU for n+1 nodes (or convert input to 0-indexed).

Bug 2: MST weight overflows:

cpp

int mst_weight = 0;  // BUG: should be long long
for (auto [w, u, v] : edges) {
    if (dsu.find(u) != dsu.find(v)) {
        dsu.unite(u, v);
        mst_weight += w;
    }
}

Answer

If edge weights are up to $10^{9}$ and there are up to $2 \times 10^{5}$ edges in the MST, the total weight can reach $2 \times 10^{14}$ , which overflows a 32-bit int. Fix: use long long mst_weight = 0.

Bug 3: DSU's find function causes stack overflow:

cpp

int find(int x) {
    if (parent[x] != x) parent[x] = find(parent[x]);
    return parent[x];
}
// Stack overflow on n = 500000 with chain graph

Answer

Without union-by-rank/size, the DSU tree can degenerate into a chain of depth $n$ . Recursive path compression on a chain of depth $5 \times 10^{5}$ overflows the stack. Fix: use union-by-rank to keep trees shallow, or implement iterative path compression.

Historical Note

Kruskal's algorithm was published by Joseph Kruskal in 1956. Interestingly, the algorithm was already known to Otakar Boruvka in 1926 for connecting Moravian electrical networks -- making MST one of the oldest problems in algorithm design. The efficient DSU-based implementation came later with Tarjan's union-find analysis.

Recap

Kruskal = sort + DSU. Sort edges by weight, greedily add each edge that connects two different components.
Cut property guarantees correctness: the cheapest edge crossing any cut belongs to the MST.
O(E log E) time, dominated by sorting. DSU operations are nearly O(1) amortized.
Kruskal vs Prim: Kruskal is simpler and natural for edge-list input (most CF problems). Prim is better for dense graphs.
MST has exactly V-1 edges. If you add fewer, the graph is disconnected.
Mnemonic: "Sort, scan, skip cycles -- that's Kruskal."

MST -- Kruskal's Algorithm

Contents

Intuition

Visual Walkthrough

The Invariant

Disconnected Input → Spanning Forest

What the Code Won't Teach You

C++ STL Reference

Implementation (Contest-Ready)

Version 1 --- Minimal contest template

Version 2 --- Explained version with MST edge tracking

Operations Reference

Kruskal vs Prim

When to Reach for This

Problem Patterns & Tricks

Pattern 1 --- Classic MST

Pattern 2 --- Maximum spanning tree

Pattern 3 --- MST with constrained edges (must-include / must-exclude)

Pattern 4 --- Second-best MST

Pattern 5 --- Kruskal + offline connectivity queries

Pattern 6 --- MST on complete/implicit graphs

Pattern 7 --- Kruskal for building Kruskal's reconstruction tree

Contest Cheat Sheet

Common Mistakes

Detailed Gotchas

Mental Traps

Self-Test

Practice Problems

Rating Progression

Debug This

Historical Note

Recap

Further Reading

Cross-links

MST -- Kruskal's Algorithm ​

Contents ​

Intuition ​

Visual Walkthrough ​

The Invariant ​

Disconnected Input → Spanning Forest ​

What the Code Won't Teach You ​

C++ STL Reference ​

Implementation (Contest-Ready) ​

Version 1 --- Minimal contest template ​

Version 2 --- Explained version with MST edge tracking ​

Operations Reference ​

Kruskal vs Prim ​

When to Reach for This ​

Problem Patterns & Tricks ​

Pattern 1 --- Classic MST ​

Pattern 2 --- Maximum spanning tree ​

Pattern 3 --- MST with constrained edges (must-include / must-exclude) ​

Pattern 4 --- Second-best MST ​

Pattern 5 --- Kruskal + offline connectivity queries ​

Pattern 6 --- MST on complete/implicit graphs ​

Pattern 7 --- Kruskal for building Kruskal's reconstruction tree ​

Contest Cheat Sheet ​

Common Mistakes ​

Detailed Gotchas ​

Mental Traps ​

Self-Test ​

Practice Problems ​

Rating Progression ​

Debug This ​

Historical Note ​

Recap ​

Further Reading ​

Cross-links ​

MST -- Kruskal's Algorithm

Contents

Intuition

Visual Walkthrough

The Invariant

Disconnected Input → Spanning Forest

What the Code Won't Teach You

C++ STL Reference

Implementation (Contest-Ready)

Version 1 --- Minimal contest template

Version 2 --- Explained version with MST edge tracking

Operations Reference

Kruskal vs Prim

When to Reach for This

Problem Patterns & Tricks

Pattern 1 --- Classic MST

Pattern 2 --- Maximum spanning tree

Pattern 3 --- MST with constrained edges (must-include / must-exclude)

Pattern 4 --- Second-best MST

Pattern 5 --- Kruskal + offline connectivity queries

Pattern 6 --- MST on complete/implicit graphs

Pattern 7 --- Kruskal for building Kruskal's reconstruction tree

Contest Cheat Sheet

Common Mistakes

Detailed Gotchas

Mental Traps

Self-Test

Practice Problems

Rating Progression

Debug This

Historical Note

Recap

Further Reading

Cross-links