Dijkstra's Algorithm

Quick Revisit
USE WHEN: single-source shortest path with non-negative edge weights
INVARIANT: once a node is popped from the priority queue, its distance is final
TIME: O((V + E) log V) with binary heap | SPACE: O(V + E)
CLASSIC BUG: using Dijkstra with negative edge weights (greedy assumption breaks)
PRACTICE: ../12-practice-ladders/05-ladder-graphs.md

Greedy single-source shortest path for graphs with non-negative edge weights. The bread-and-butter graph algorithm on Codeforces from Div 2 C onward.

ID: AL-15 Difficulty: [Intermediate]Prerequisites: Graph Representation, Priority Queue and HeapsCF Rating: 1500-1800

Contest Frequency: Regular -- appears every few Div 2 contests

Intuition
C++ STL Reference
Implementation (Contest-Ready)
Operations Reference
Problem Patterns & Tricks
Contest Cheat Sheet
Common Mistakes & Debugging
Self-Test
Practice Problems
Further Reading
Recap

Intuition

You have a weighted graph with 5 nodes and need the shortest path from $A$ to $E$ :

        4       1
  (A)------>(B)------>(E)
   |                   ^
   |2                  |3
   v                   |
  (C)------>(D)--------+
        5

Brute force: enumerate every possible path from $A$ to $E$ . Paths can revisit different orderings of intermediate nodes, so this blows up exponentially -- $O (V!)$ in the worst case. For $V = 20$ that is already over $10^{18}$ paths.

Naive greedy (pick the lightest outgoing edge): start at $A$ , always move along the cheapest edge leaving the current node.

Greedy walk: A --4--> B --1--> E    total = 5

Looks fine here -- but try a slightly different graph:

        1      10
  (A)------>(B)------>(E)
   |                   ^
   |2                  |1
   v                   |
  (C)------>(D)--------+
        1

Greedy walk: A --1--> B --10--> E   total = 11
Actual best: A --2--> C --1--> D --1--> E   total = 4

The naive greedy fails because it commits to one edge without considering the full cost of reaching the destination through that edge. We need a smarter greedy strategy.

Always finalize the unvisited node with the smallest tentative distance -- it can't get any shorter because all remaining paths go through nodes with equal or larger distance.

Analogy: imagine dropping a stone into a pond at the source node. Ripples expand outward in circles of increasing radius. The first time a ripple reaches a node, that is the shortest distance -- later ripples (larger radius) can never beat it. Dijkstra simulates this expanding wavefront, processing nodes in order of their distance from the source.

Where this breaks: if an edge has negative weight, a "later ripple" can actually arrive sooner than an earlier one. The expanding-circle model collapses, and Dijkstra's greedy choice is no longer safe. Use Bellman-Ford instead.

Checkpoint: Dijkstra always finalizes the unvisited node with the smallest tentative distance. Why does this greedy choice fail when edges can have negative weights?
Think first, then click
When all edges are non-negative, any path through unvisited nodes (which have distance >= the current node) can only get longer. So the current minimum can't be beaten. But with a negative edge, a path through a higher-distance node could decrease along the way and arrive with a shorter total -- the "already finalized" distance turns out to be wrong.

Visual Walkthrough

Graph (same as the Pain section, first example):

        4       1
  (A)------>(B)------>(E)
   |                   ^
   |2                  |3
   v                   |
  (C)------>(D)--------+
        5

We track a distance array $d []$ and a min-heap priority queue (PQ). Finalized nodes are marked with *.

Initial state:
  d[A]=0   d[B]=inf   d[C]=inf   d[D]=inf   d[E]=inf
  PQ: {(0, A)}

Step 1: Extract (0, A). Relax A->B (w=4), A->C (w=2).
  d[A]=0*  d[B]=4     d[C]=2     d[D]=inf   d[E]=inf
  PQ: {(2, C), (4, B)}

Step 2: Extract (2, C). Relax C->D (w=5).
  d[A]=0*  d[B]=4     d[C]=2*    d[D]=7     d[E]=inf
  PQ: {(4, B), (7, D)}

Step 3: Extract (4, B). Relax B->E (w=1).
  d[A]=0*  d[B]=4*    d[C]=2*    d[D]=7     d[E]=5
  PQ: {(5, E), (7, D)}

Step 4: Extract (5, E). No outgoing edges to relax.
  d[A]=0*  d[B]=4*    d[C]=2*    d[D]=7     d[E]=5*
  PQ: {(7, D)}

Step 5: Extract (7, D). Relax D->E (w=3): d[D]+3=10 > d[E]=5. No update.
  d[A]=0*  d[B]=4*    d[C]=2*    d[D]=7*    d[E]=5*
  PQ: {}  --> DONE

Final shortest distances from $A$ :

  +------+---+---+---+---+---+
  | Node | A | B | C | D | E |
  +------+---+---+---+---+---+
  | d[]  | 0 | 4 | 2 | 7 | 5 |
  +------+---+---+---+---+---+

The Invariant

+------------------------------------------------------------------+
| GREEDY INVARIANT: When node u is extracted from the priority     |
| queue, d[u] equals the true shortest-path distance from s to u. |
+------------------------------------------------------------------+

Proof by contradiction. Suppose node $u$ is extracted from the PQ but $d [u]$ is NOT the true shortest-path distance. Then there exists a shorter path $P$ from $s$ to $u$ with length $ℓ (P) < d [u]$ .

Path $P$ starts at $s$ (already finalized) and ends at $u$ (just extracted). So $P$ must, at some point, leave the set of finalized nodes. Let $v$ be the first unfinalized node along $P$ , and let $x$ be the finalized node just before $v$ on $P$ .

Since $x$ was already finalized, $d [x]$ is the true shortest distance to $x$ (by induction -- $x$ was extracted earlier).
When $x$ was finalized, the edge $(x, v)$ was relaxed, so $d [v] \leq d [x] + w (x, v)$ .
The sub-path of $P$ from $s$ to $v$ has length at least $d [x] + w (x, v)$ , so $d [v] \leq$ (length of $P$ up to $v$ ).
All edges are non-negative, so the remaining portion of $P$ from $v$ to $u$ has non-negative length. Therefore $ℓ (P) \geq$ (length of $P$ up to $v$ ) $\geq d [v]$ .
But $u$ was extracted before $v$ , meaning $d [u] \leq d [v]$ (the PQ extracts the minimum).
Combining: $ℓ (P) \geq d [v] \geq d [u]$ .

This contradicts $ℓ (P) < d [u]$ . Therefore $d [u]$ must be the true shortest-path distance when $u$ is extracted. $◼$

When to Reach for This

Trigger phrases in problem statements:

"shortest path" / "minimum cost path" with non-negative weights
"single source, all destinations" (or single source-target)
"minimum cost to reach a state" (state-space Dijkstra)
"grid shortest path with varying cell costs"

Counter-examples -- when Dijkstra is NOT the right tool:

Situation	Use instead	Why
Negative edge weights	Bellman-Ford / SPFA	Greedy invariant breaks
All weights = 1	BFS	$O (V + E)$ simpler and faster
All weights 0 or 1	0-1 BFS (deque)	$O (V + E)$ , no log factor
All-pairs shortest paths, $V \leq 400$	Floyd-Warshall	$O (V^{3})$ , simpler code
Negative cycles detection	Bellman-Ford	Dijkstra cannot detect them
DAG shortest paths	DP on topological order	$O (V + E)$ , no heap needed (06-dp-on-dag.md)

Dijkstra State Snapshots

Snapshot 1 -- Expanding wavefront at Step 2:

After extracting (2, C), nodes A and C are finalized. B and D are tentative. The PQ holds the frontier of the expanding "ripple."

         4       1
   (A*)----->(B)------>(E)          * = finalized
    |0         4         ?
    |2                  |3
    v                   |
   (C*)----->(D)--------+
    2    5    7

  d[]:  A=0*   B=4    C=2*   D=7    E=inf

  PQ contents (min-heap):
  +-------+-------+
  | dist  | node  |
  +-------+-------+
  |   4   |   B   |  <-- next to extract
  |   7   |   D   |
  +-------+-------+

  Wavefront on the number line:
  --[A*]------[C*]------[B]----------------[D]-------->
    0          2         4                  7

Snapshot 2 -- Negative edge breaks the greedy invariant:

Add a negative edge D->B (weight -6) to the same graph. Dijkstra finalizes B=4 at Step 3, but the true shortest is A->C->D->B = 2+5+(-6) = 1. The cheaper path arrives too late.

         4       1
   (A)------>(B)------>(E)
    |          ^         ^
    |2     -6 |         |3
    v         |         |
   (C)------>(D)--------+
         5

  Step 1: Extract A(0).  Relax A->B(4), A->C(2).
          PQ: [(2,C), (4,B)]

  Step 2: Extract C(2).  Relax C->D(7).
          PQ: [(4,B), (7,D)]

  Step 3: Extract B(4).  <- FINALIZED B = 4  X WRONG
          Relax B->E(5).
          PQ: [(5,E), (7,D)]

  Step 4: Extract E(5).  PQ: [(7,D)]

  Step 5: Extract D(7).  Relax D->B: 7+(-6)=1 < 4
          But B is already finalized!
          +--------------------------------------------+
          | TRUE shortest to B: A->C->D->B = 2+5-6 = 1   |
          | Dijkstra reported:  A->B     = 4       X     |
          +--------------------------------------------+
          The "later ripple" (dist 1) beat the "earlier"
          one (dist 4), violating the greedy invariant.

Snapshot 3 -- Stale PQ entry and the staleness check:

Add edge A->D (weight 9) to the original graph. D gets pushed twice: first as {9,D}, then as {7,D} when C->D improves it.

         4       1
   (A)------>(B)------>(E)
    |  \                ^
    |2  \9             |3
    v    v             |
   (C)------>(D)-------+
         5

  After Step 1 (extract A):
    Relax A->B(4), A->C(2), A->D(9).
    PQ: [(2,C), (4,B), (9,D)]

  After Step 2 (extract C):
    Relax C->D: d[D] improved 9->7.  Push {7,D}.
    PQ: [(4,B), (7,D), (9,D)]
                        ^^^^^
                        STALE -- old distance, still in PQ

  After Step 4 (extract D at dist 7):
    D finalized with d[D]=7.
    PQ: [(9,D)]   <- stale entry remains

  Step 5: Extract (9, D).
    Check: du=9 > d[D]=7  ->  SKIP  (staleness check)
    +--------------------------------------------------+
    |  if (du > d[u]) continue;  // skips stale entry  |
    +--------------------------------------------------+
    Without this check, we'd re-process D and waste time.
    On dense graphs, this turns O((V+E)log V) into TLE.

What the Code Won't Teach You

1. The staleness check is fundamental, not a micro-optimization.

std::priority_queue has no decrease-key operation. When we find a shorter path to node v, we cannot update its existing PQ entry -- we push a new entry {new_dist, v} and leave the old one behind. This means the PQ accumulates stale (outdated) entries. The single line

cpp

if (du > d[u]) continue;   // stale entry -- skip

is what makes lazy deletion work. Without it, every stale entry triggers a full round of useless relaxations. This is not an optimization you "can add later" -- it is part of the algorithm.

  PQ internals during execution:
  +-------+------+-------------------+
  | dist  | node | status            |
  +-------+------+-------------------+
  |   3   |  v   | CURRENT (d[v]=3)  |  <- process this
  |   7   |  v   | STALE   (d[v]=3)  |  <- du=7 > d[v]=3 -> skip
  |   9   |  w   | CURRENT (d[w]=9)  |  <- process this
  +-------+------+-------------------+

2. Dijkstra = BFS + priority queue.

BFS assumes every edge costs 1, so a FIFO queue processes nodes in order of hop count. Dijkstra generalises: edges have varying costs, so we swap the FIFO queue for a min-heap that always extracts the closest unfinalized node. Both algorithms share the same skeleton -- "process the closest node next, relax its neighbours."

  BFS:       queue   (FIFO)     all edges cost 1     O(V + E)
  Dijkstra:  min-heap           edges cost >= 0       O((V+E) log V)
  -------------------------------------------------------------
  Same core loop:
    while (not empty):
      u = extract closest unfinalized node
      for each neighbour v of u:
        if d[u] + w(u,v) < d[v]:
          d[v] = d[u] + w(u,v)
          push v

3. The greedy invariant is WHY negative edges break Dijkstra.

When node u is extracted from the PQ, d[u] is final because every remaining path to u must go through nodes still in the PQ, all of which have distance >= d[u]. No remaining path can beat d[u] -- so the greedy choice is safe. A negative-weight edge breaks this: a path through a higher-distance node can decrease on the way back to u, arriving with total cost < d[u] after u was already finalized.

How This Evolved

Shortest-path algorithms evolved along a clear axis: handling increasingly general edge weights. BFS came first -- it finds shortest paths in unweighted graphs by exploring nodes layer by layer, one hop at a time. Every edge has implicit weight 1, and the FIFO queue guarantees nodes are finalized in order of distance. BFS is $O (V + E)$ and remains the best choice when all edges are equal.

Dijkstra's algorithm generalizes BFS to non-negative weights by replacing the FIFO queue with a priority queue. Instead of processing nodes by hop count, it processes them by tentative distance -- smallest first. The greedy invariant ("the node with the smallest tentative distance is already optimal") holds precisely because no edge can decrease a path's total cost. With a binary heap this runs in $O ((V + E) \log V)$ ; with a Fibonacci heap, $O (V \log V + E)$ .

But what if edges can be negative? Dijkstra's greedy choice fails -- a longer path through a negative edge might beat a shorter one. Bellman-Ford handles this by relaxing every edge $V - 1$ times, running in $O (V E)$ . It also detects negative-weight cycles (if any distance decreases on a $V$ -th pass, a negative cycle exists). For the all-pairs problem on graphs with negative edges but no negative cycles, Johnson's algorithm combines Bellman-Ford and Dijkstra: run Bellman-Ford once from a virtual source to compute a potential function that reweights all edges to be non-negative, then run Dijkstra from every vertex on the reweighted graph. This gives $O (V^{2} \log V + V E)$ -- far better than running Bellman-Ford $V$ times.

C++ STL Reference

Function / Class	Header	What it does	Time
`priority_queue<T>`	`<queue>`	Max-heap by default	push/pop $O (\log n)$
`priority_queue<T, vector<T>, greater<T>>`	`<queue>`	Min-heap (what we need)	push/pop $O (\log n)$
`pair<long long, int>`	`<utility>`	`{dist, node}` -- pairs compare lexicographically, so min-heap on `first` (distance)	$O (1)$ compare
`vector<vector<pair<int,int>>>`	`<vector>`	Adjacency list: `adj[u] = {{v, w}, ...}`	$O (1)$ amortized push
`numeric_limits<long long>::max()`	`<limits>`	Use as $\infty$ . Or just `1e18`.	-

Key point: Store pairs as {distance, node} in the PQ, not {node, distance}. The min-heap compares by the first element, which must be the distance.

Checkpoint: C++'s priority_queue has no decrease_key operation. How does Dijkstra handle finding a shorter path to a node that's already in the priority queue?
Think first, then click
It uses "lazy deletion": push a *new* entry `{shorter_dist, v}` into the PQ alongside the old stale entry. When a stale entry is later extracted, check `if (du > d[u]) continue;` -- the distance in the PQ entry is worse than the already-recorded shortest distance, so skip it. This adds at most `O(E)` entries to the PQ but maintains correctness.

Implementation (Contest-Ready)

Read this before the code: lazy heap deletion

Every contest Dijkstra implementation in this book uses lazy deletion, and the pattern is so important it deserves its own beat before any code appears.

std::priority_queue has no "decrease-key" operation. When you discover a shorter path to node v, you cannot update its existing entry — you push a new entry {shorter_dist, v} and leave the old one behind. The PQ now contains both. Each time you pop, you must distinguish:

Current entry: du == d[u]. Process it: relax outgoing edges.
Stale entry: du > d[u]. Skip it: a better entry already finalised u (or will).

The two states differ by one line:

cpp

auto [du, u] = pq.top(); pq.pop();
if (du > d[u]) continue;       // stale -- already superseded
// otherwise du == d[u], this is the canonical entry for u

This is not an optimisation you "add later." Without the staleness check, every superseded entry triggers another full round of relaxations, and the algorithm can degenerate badly on dense graphs with many edge improvements. Read the contest implementations below with this in mind: every Dijkstra in CP code allows duplicate PQ entries on purpose, and the staleness check is what keeps it linear in PQ-extractions.

Priority queue pair ordering matters. Use {dist, node}, NOT {node, dist}. priority_queue compares pairs lexicographically -- if the node ID comes first, you're sorting by node ID, not by distance. This is a silent bug that produces wrong shortest paths.

cpp

// WRONG -- sorts by node ID
priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;
pq.push({node, dist});  // BAD: compares node first

// CORRECT -- sorts by distance
pq.push({dist, node});  // GOOD: smallest distance popped first

Stale entries in the priority queue. When you pop {d, v}, check if (d > dist[v]) continue;. Without this, you process outdated entries and may relax edges from non-optimal distances. The algorithm is still correct without the check (just slower), but it can cause TLE on graphs with many edges.

Version 1 -- Minimal contest template

cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;
    vector<vector<pair<int, long long>>> adj(n + 1);
    for (int i = 0; i < m; i++) {
        int u, v;
        long long w;
        cin >> u >> v >> w;
        adj[u].push_back({v, w});
        adj[v].push_back({u, w}); // remove for directed
    }

    int s = 1;
    const long long INF = 1e18;
    vector<long long> d(n + 1, INF);
    d[s] = 0;
    // min-heap of {dist, node}
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>,
                   greater<>> pq;
    pq.push({0, s});

    while (!pq.empty()) {
        auto [du, u] = pq.top();
        pq.pop();
        if (du > d[u]) continue; // stale entry
        for (auto [v, w] : adj[u]) {
            if (d[u] + w < d[v]) {
                d[v] = d[u] + w;
                pq.push({d[v], v});
            }
        }
    }

    for (int i = 1; i <= n; i++)
        cout << (d[i] == INF ? -1 : d[i]) << " \n"[i == n];
}

Version 2 -- Explained version with path reconstruction

cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;

    // Adjacency list: adj[u] stores {neighbour, weight} pairs.
    // 1-indexed so we allocate n+1 slots.
    vector<vector<pair<int, long long>>> adj(n + 1);
    for (int i = 0; i < m; i++) {
        int u, v;
        long long w;
        cin >> u >> v >> w;
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    }

    int source = 1, target = n;
    const long long INF = 1e18;

    // d[v] = shortest known distance from source to v.
    vector<long long> d(n + 1, INF);
    // par[v] = predecessor of v on the shortest path (for reconstruction).
    vector<int> par(n + 1, -1);

    d[source] = 0;

    // Min-heap ordered by distance. We store {distance, node}.
    // greater<> makes it a min-heap (smallest distance on top).
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>,
                   greater<>> pq;
    pq.push({0, source});

    while (!pq.empty()) {
        auto [du, u] = pq.top();
        pq.pop();

        // If we already found a better path to u, this entry is stale.
        // This is the "lazy deletion" approach -- cheaper than decrease-key.
        if (du > d[u]) continue;

        for (auto [v, w] : adj[u]) {
            // Relaxation: can we improve d[v] by going through u?
            if (d[u] + w < d[v]) {
                d[v] = d[u] + w;
                par[v] = u;
                pq.push({d[v], v});
            }
        }
    }

    if (d[target] == INF) {
        cout << -1 << "\n";
        return 0;
    }

    cout << d[target] << "\n";

    // Reconstruct path by walking parent pointers backwards.
    vector<int> path;
    for (int v = target; v != -1; v = par[v])
        path.push_back(v);
    reverse(path.begin(), path.end());

    for (int i = 0; i < (int)path.size(); i++)
        cout << path[i] << " \n"[i + 1 == (int)path.size()];
}

Operations Reference

Why a priority queue? Dijkstra's core loop extracts the unvisited node with smallest tentative distance. Without a heap, finding that node requires scanning all $V$ entries -- $O (V)$ per extraction, $O (V^{2})$ total. A min-heap brings extraction down to $O (\log V)$ , giving $O ((V + E) \log V)$ overall. For sparse graphs ( $E \approx V$ , e.g. $V = 10^{6}$ ) this is the difference between $10^{12}$ and $2 \times 10^{7}$ operations -- easily the difference between TLE and AC. For dense graphs ( $E \approx V^{2}$ ), the $O (V^{2})$ no-heap version is competitive because the $\log V$ factor from the heap can hurt; but in CP, most graphs are sparse and the priority_queue version is the default.

Operation	Time	Space
Full Dijkstra (binary heap)	$O ((V + E) \log V)$	$O (V + E)$
Full Dijkstra (Fibonacci heap)	$O (V \log V + E)$	$O (V + E)$
Full Dijkstra (dense, no heap)	$O (V^{2})$	$O (V + E)$
Single extraction from PQ	$O (\log V)$	-
Single relaxation	$O (\log V)$	-
Path reconstruction	$O (V)$	$O (V)$

Notes:

The binary heap version is best for sparse graphs ( $E \approx V$ ). For dense graphs ( $E \approx V^{2}$ ), the $O (V^{2})$ version without a heap can be faster due to lower constant factors.
Fibonacci heap is theoretically better but never used in CP -- the constant factor is huge.
In practice, the STL priority_queue version handles $V, E \leq 10^{6}$ comfortably within 1-2 seconds.
Space is dominated by the adjacency list ( $O (V + E)$ ) and the distance array ( $O (V)$ ).

Problem Patterns & Tricks

Pattern 1 -- Classic SSSP

Straightforward single-source shortest path. Read graph, run Dijkstra, output distances.

Recognize it: "Find the shortest path from $s$ to $t$ " with non-negative weights.

CF 20C -- Dijkstra?
CSES -- Shortest Routes I

Pattern 2 -- State-space Dijkstra (modified states)

The "vertex" in Dijkstra is not just a node but a tuple (node, extra_state). This handles problems where the cost depends on additional information.

Examples: shortest path with at most $k$ toll roads, shortest path where you can halve one edge, shortest path with fuel constraints.

cpp

// State: {node, num_edges_halved}
// dist[v][k] = shortest path to v having halved k edges
vector<vector<long long>> d(n + 1, vector<long long>(2, INF));
d[s][0] = 0;
// push {dist, node, halved_count}
pq.push({0, s, 0});

CF 1473E -- Minimum Path (use one free +/- edge)
CF 1816D -- Grid with abilities (state = position + power used)

Pattern 3 -- Multi-source Dijkstra

Push multiple sources into the PQ with distance 0. This finds the shortest distance from any source to every vertex -- equivalent to adding a virtual super-source with zero-weight edges to all real sources.

cpp

for (int s : sources) {
    d[s] = 0;
    pq.push({0, s});
}

CF 1528A -- Parsa's Humongous Tree (multi-source on tree)
CF 1272E -- Nearest opposite-color node

Pattern 4 -- Dijkstra on a grid

The graph is implicit: cells are nodes, adjacent cells share edges. Weight = cost to enter the next cell. Build the adjacency on-the-fly instead of storing an explicit graph.

cpp

int dx[] = {0, 0, 1, -1};
int dy[] = {1, -1, 0, 0};
// In the main loop:
for (int dir = 0; dir < 4; dir++) {
    int nx = x + dx[dir], ny = y + dy[dir];
    if (nx < 0 || nx >= R || ny < 0 || ny >= C) continue;
    long long nw = d[x][y] + cost[nx][ny];
    if (nw < d[nx][ny]) {
        d[nx][ny] = nw;
        pq.push({nw, nx * C + ny}); // encode (row,col) as single int
    }
}

CF 1063B -- Labyrinth (0-1 BFS variant, but Dijkstra also works)
CF 1579E2 -- Grid paths with costs

Pattern 5 -- Shortest path with reconstruction

Run Dijkstra while maintaining a par[] array. Reconstruct by walking backwards from target to source. See Version 2 implementation above.

CF 20C -- Dijkstra?
CSES -- Shortest Routes I (with path output)

Pattern 6 -- K-shortest paths (Yen's algorithm wrapper)

Find the $k$ -th shortest path. Run Dijkstra $k$ times, each time forbidding one edge from the previously found paths. Heavy but works for small $k$ .

Simpler variant: allow revisits. Use a counter -- the $k$ -th time a node is popped from the PQ, that is its $k$ -th shortest distance.

cpp

vector<int> cnt(n + 1, 0);
while (!pq.empty()) {
    auto [du, u] = pq.top(); pq.pop();
    cnt[u]++;
    if (cnt[u] == k && u == target) { cout << du; return 0; }
    if (cnt[u] > k) continue;
    for (auto [v, w] : adj[u])
        pq.push({du + w, v});
}

CF 1196F -- K-th Path
CSES -- Flight Routes

Pattern 7 -- Meet-in-the-middle / bidirectional Dijkstra

Run Dijkstra from both source and target. Answer = $min_{(u, v, w) \in E} (d_{s} [u] + w + d_{t} [v])$ . Useful for single source-target queries on large graphs or when combined with other techniques.

CF 449B -- Jzzhu and Cities (reverse Dijkstra for pruning)

Contest Cheat Sheet

+--------------------------------------------------------------+
|  DIJKSTRA'S ALGORITHM -- QUICK REFERENCE                     |
+--------------------------------------------------------------+
|  WHEN: Non-negative weights, single-source shortest path     |
|  TIME: O((V+E) log V)    SPACE: O(V+E)                      |
+--------------------------------------------------------------+
|  TEMPLATE:                                                   |
|    vector<long long> d(n+1, 1e18);                           |
|    priority_queue<pair<long long,int>,                        |
|      vector<pair<long long,int>>, greater<>> pq;             |
|    d[s]=0; pq.push({0,s});                                   |
|    while(!pq.empty()){                                       |
|      auto [du,u]=pq.top(); pq.pop();                        |
|      if(du>d[u]) continue;                                   |
|      for(auto [v,w]:adj[u])                                  |
|        if(d[u]+w<d[v]) d[v]=d[u]+w, pq.push({d[v],v});      |
|    }                                                         |
+--------------------------------------------------------------+
|  WATCH OUT:                                                  |
|  - {dist,node} NOT {node,dist} in PQ                        |
|  - Use long long if sum of weights > 2e9                     |
|  - if(du>d[u]) continue; is NOT optional (TLE without it)   |
|  - Negative weights? Use Bellman-Ford/SPFA instead           |
|  - Unweighted? Use plain BFS (faster and simpler)            |
+--------------------------------------------------------------+

Common Mistakes & Debugging

The most frequent Dijkstra bugs, roughly ordered by how often they appear in contest submissions:

#	Mistake	Symptom
1	Using Dijkstra with negative edge weights	WA -- greedy invariant breaks silently
2	Not using visited/finalized check (`if (du > d[u]) continue`)	TLE -- processes same node many times
3	Using adjacency matrix instead of adjacency list	TLE -- $O (V^{2})$ per relaxation step
4	Forgetting to initialize distances to infinity	WA -- nodes appear reachable at dist 0

Below is the full list with explanations.

1. Negative edge weights break Dijkstra. The greedy invariant ("extracted distance is final") fails. Dijkstra may produce wrong answers silently -- no runtime error, just wrong output. If the problem has negative weights, use Bellman-Ford or SPFA.

2. {dist, node} vs {node, dist} in the PQ.priority_queue with greater<> compares pairs lexicographically. If you store {node, dist}, the heap orders by node ID, not distance. Your code compiles fine but gives wrong answers. Always {dist, node}.

3. The if (du > d[u]) continue; check. Without this, you process stale entries -- vertices whose distance was already improved. On dense graphs this turns $O ((V + E) \log V)$ into $O (E \log E)$ or worse, causing TLE. This single line is the difference between AC and TLE on tight limits.

4. Integer overflow. If weights are up to $10^{9}$ and the path has up to $10^{5}$ edges, the total distance can reach $10^{14}$ . Use long long for distances. Set INF = 1e18 (not INT_MAX -- adding to INT_MAX overflows).

5. 1-indexed vs 0-indexed. If vertices are 1-indexed (Codeforces standard), allocate n + 1 entries. Off-by-one here causes out-of-bounds writes that corrupt data silently.

6. Forgetting greater<> for min-heap. Default priority_queue is a max-heap. You extract the largest distance first, which is the exact opposite of what you want. Results: wrong answers or infinite loops.

7. Parallel edges and self-loops. The algorithm handles these correctly without special cases. Parallel edges: the relaxation naturally picks the best. Self-loops: $d [u] + w \geq d [u]$ so they never improve anything. Don't add special-case code.

8. Disconnected components. If target is unreachable, $d [target]$ stays $\infty$ . Check for this before outputting. Print $- 1$ or whatever the problem specifies.

9. Multi-edge insertion. When the problem says "there may be multiple edges between the same pair of vertices," do NOT deduplicate. Just add all edges to the adjacency list. Dijkstra handles it.

Debugging checklist:

Print the distance array after Dijkstra. Are disconnected nodes still $\infty$ ?
Print each extraction: cerr << "pop " << u << " d=" << du << "\n";
Check that the adjacency list has the right number of edges (count them).
For wrong answer: trace the relaxation on paper with a 4-5 node example.

Mental Traps

Trap 1: "I can handle negative weights by adding a large constant to all edges."

This is wrong. Adding a constant $K$ to every edge changes the relative cost of paths with different edge counts. A path with fewer hops gets penalised less than a path with more hops, destroying the original shortest-path ordering.

  Original graph:                 After adding K=10 to every edge:

       1        -3                     11        7
  (A)---->(B)------>(C)           (A)---->(B)------>(C)
   |                 ^             |                 ^
   |  4              |             | 14              |
   +-------->(D)-----+            +-------->(D)-----+
          2                              12

  Path via B,C: 1 + (-3) = -2         11 + 7  = 18   (2 edges, +2K)
  Path via D,C: 4 +   2  =  6         14 + 12 = 26   (2 edges, +2K)
  Direct A->D->C: 4 + 2    =  6         14 + 12 = 26

  Now add a direct edge A->C with weight 5 (original):
  Path A->C:     5                      15             (1 edge,  +1K)
  Path A->B->C:   1 + (-3) = -2         11 + 7  = 18   (2 edges, +2K)

  Original winner:  A->B->C  (cost -2)
  Shifted winner:   A->C    (cost 15)   <- DIFFERENT ANSWER
                                          +K per edge changes ranking

Trap 2: "State-space Dijkstra is just Dijkstra with more nodes -- I only need to check d[u]."

In state-space Dijkstra (e.g., d[node][extra_state]), the staleness check must compare all state dimensions, not just the node. A PQ entry {dist, node, state} is stale if dist > d[node][state], not if dist > d[node] (which ignores the extra dimension).

  State-space example: d[node][used_coupon]
  PQ entry: {dist=5, node=B, coupon=0}

  WRONG staleness check:             CORRECT staleness check:
  if (du > d[B]) continue;           if (du > d[B][0]) continue;
        ^                                       ^^^^
        ignores coupon dimension                checks full state

  Why it matters:
  +------+------------+------------+
  | node | d[v][0]    | d[v][1]    |
  +------+------------+------------+
  |  B   |   5        |   3        |
  +------+------------+------------+

  PQ: {5, B, coupon=0}  <- current for d[B][0]=5, should process
  Checking d[B] alone might see d[B][1]=3 < 5 and wrongly skip!

Self-Test

Can you do these without looking anything up?

[ ] Write Dijkstra from memory -- min-heap declaration, {dist, node} pairs, staleness check, relaxation
[ ] Explain why {dist, node} must be in this order with a concrete counter-example
[ ] State why Dijkstra fails on negative edges using the greedy invariant argument
[ ] Modify Dijkstra for state-space problems -- describe how d[node][extra] and PQ entries change
[ ] Identify the integer overflow and max-heap bugs and describe their symptoms

Quick-fire questions:

Q1: What is the time complexity of Dijkstra with a binary heap?
Answer
O((V+E) log V). Each of the V nodes is extracted from the heap once (O(log V) each), and each of the E edges may trigger a push to the heap (O(log V) each). Total: O((V+E) log V).

Q2: Why does Dijkstra fail on graphs with negative edge weights? Give a concrete 3-node example.
Answer
Dijkstra's greedy invariant assumes that once a node is popped, its distance is final. Negative edges violate this. Example: A->B weight 2, A->C weight 3, C->B weight -2. Dijkstra pops B with dist=2, but the path A->C->B has cost 1. Since B is already finalized, the shorter path is missed.

Q3: In the priority queue implementation, why do we store pairs as {dist, node} and not {node, dist}?
Answer
C++ `priority_queue` (min-heap via `greater<>`) compares by the first element. Storing `{dist, node}` ensures the node with smallest distance is popped first. If we stored `{node, dist}`, nodes would be ordered by node ID, not by distance.

Q4: What is the "staleness check" in Dijkstra, and what happens if you omit it?
Answer
When popping `{d, u}`, we check `if (d > dist[u]) continue;`. This skips stale entries -- earlier pushes with longer distances. Without it, we'd re-relax neighbors from outdated distances, potentially causing TLE from exponential re-processing.

Q5: You need shortest paths where each edge has weight 0 or 1. Dijkstra works but what is faster and why?
Answer
0-1 BFS using a deque: push weight-0 neighbors to the front, weight-1 neighbors to the back. Runs in O(V+E) vs Dijkstra's O((V+E) log V), because the deque maintains sorted-by-distance order without heap operations.

Practice Problems

#	Problem	Source	Difficulty	Key Idea
1	Shortest Routes I	CSES	Easy	Textbook Dijkstra, just implement it
2	Dijkstra?	CF 20C	Easy	SSSP with path reconstruction
3	Flight Discount	CSES	Easy-Med	State-space Dijkstra: `d[v][0/1]` (used discount or not)
4	Minimum Path	CF 1473E	Medium	Modified Dijkstra with extra state for free +/- edge
5	Shortest Routes II	CSES	Medium	All-pairs (Floyd-Warshall), know when NOT to use Dijkstra
6	Flight Routes	CSES	Medium	K-shortest paths via repeated extraction
7	Jzzhu and Cities	CF 449B	Medium-Hard	Dijkstra + edge pruning, count shortest-path edges
8	Civilization	CF 455C	Medium-Hard	DSU + tree diameter + BFS/Dijkstra on trees
9	K-th Path	CF 1196F	Hard	Prune graph to relevant edges, then k-shortest paths
10	Road Optimization	CF 1625C	Hard	DP on shortest path with at most $k$ removals

Pattern Fingerprint

Constraint / Signal	-> Use
"Shortest path" + non-negative weights ->	Dijkstra
"Shortest path" + weights in {0, 1} ->	0-1 BFS (faster)
"Shortest path" + negative weights ->	Bellman-Ford (not Dijkstra!)
"Shortest path from one source to all nodes" ->	Dijkstra (SSSP)
"Shortest path with state (fuel, tickets, etc.)" ->	Dijkstra on expanded state graph
$n \leq 10^{5}$ , $m \leq 2 \times 10^{5}$ ->	Dijkstra with binary heap: $O ((V + E) \log V)$

Rating Progression

CF Rating	What You Should Be Comfortable With
1200	Template Dijkstra, reading weighted graphs, shortest path from $s$ to $t$
1500	Multi-source Dijkstra, Dijkstra on grids, reconstructing the actual shortest path
1800	Dijkstra with extended states (layers, bitmasks), $k$ -shortest paths, Dijkstra + binary search
2100	Dijkstra on virtual/segment-tree graphs, potential functions (Johnson's), Dijkstra with online edge modifications

What Would You Do If...?

...the graph has negative edge weights? You CANNOT use Dijkstra. Use Bellman-Ford ( $O (V E)$ ) or, if no negative cycles, use Johnson's algorithm (reweight with potentials, then run Dijkstra from each source).
...you need the shortest path using at most $k$ "free" edges (weight set to 0)? Build a layered graph: $k + 1$ copies of the original graph. A "free" edge moves you from layer $i$ to layer $i + 1$ . Run Dijkstra on this expanded graph with $O (k n)$ nodes.
...there are $10^{5}$ sources and you need shortest paths from all of them? If to a single target, run Dijkstra from the target on the reversed graph. If all-pairs, consider Floyd-Warshall (if $n \leq 500$ ) or run Dijkstra $n$ times ( $O (n (n + m) \log n)$ ).

Debug This

Bug 1: Dijkstra gives wrong answers on some test cases:

cpp

priority_queue<pair<int,int>> pq;  // max-heap!
pq.push({0, src});

Answer

priority_queue<pair<int,int>> in C++ is a max-heap. It pops the node with the largest distance first, which violates Dijkstra's greedy property. Fix: use priority_queue<pair<int,int>, vector<pair<int,int>>, greater<pair<int,int>>> for a min-heap, or push negated distances with a max-heap.

Bug 2: Distance to some reachable nodes stays at INF:

cpp

vector<long long> dist(n + 1, 1e18);
dist[src] = 0;
pq.push({0, src});
while (!pq.empty()) {
    auto [d, u] = pq.top(); pq.pop();
    if (d > dist[u]) continue;
    for (auto& [v, w] : adj[u]) {
        if (dist[u] + w < dist[v]) {
            dist[v] = dist[u] + w;
            pq.push({dist[v], v});
        }
    }
}
// dist[target] is still 1e18...

Answer

The code logic is correct. If dist[target] is still 1e18, the target is genuinely unreachable from src. Check: (1) is the graph directed when it should be undirected? (2) Did you add edges in both directions for undirected graphs? (3) Are you using 1-indexed nodes with adj of size n+1?

Bug 3: Dijkstra with state gives TLE:

cpp

// Dijkstra on (node, fuel_remaining) state
// fuel can be 0..K where K = 10000, n = 100000
dist[src][K] = 0;
pq.push({0, src, K});
while (!pq.empty()) { ... }

Answer

The state space is $n \times K = 10^{5} \times 10^{4} = 10^{9}$ , which is way too large. The distance array alone would need ~8 GB. Fix: reduce the state space. Often, fuel/ticket states can be bounded by a small constant. Alternatively, binary search on the answer and use plain Dijkstra inside.

Historical Origin

Dijkstra's algorithm was conceived by Edsger W. Dijkstra in 1956 (published 1959) while sitting at a cafe in Amsterdam. He designed it in about 20 minutes to demonstrate the capabilities of the ARMAC computer. It remains one of the most widely used algorithms in computer science, from GPS navigation to internet routing.

Designing Test Cases

Dijkstra bugs usually show up on non-trivial graph structure -- not on the friendly sample inputs. Test these before submitting.

Minimal cases:

Single node: n = 1, source = 0. Distance to itself should be 0, no edges needed.
Two nodes: One edge 0 -> 1 with weight w. Distance is w. Also test the reverse query 1 -> 0 (should be unreachable in directed graph, w in undirected).

Edge cases:

Disconnected graph: Source can't reach some nodes. Verify those distances stay at INF, not 0 or garbage.
Self-loop: Edge u -> u with weight w. Should not affect dist[u] (already 0). Catches bugs where you relax the source node.
Greedy trap -- direct path isn't shortest: Classic triangle: 0->1 weight 10, 0->2 weight 3, 2->1 weight 2. Shortest to node 1 is 5 (via 2), not 10. If your code returns 10, you're not relaxing properly.
Multiple edges between same pair: Two edges u->v with different weights. Dijkstra should find the better one naturally.
Large weights near overflow: Weights up to 10^9 on a path of length 10^5. Total distance is 10^{14} -- make sure you're using long long.

Stress test (run locally before submitting):

cpp

// Compare Dijkstra against Floyd-Warshall on small random graphs.
mt19937 rng(42);
for (int iter = 0; iter < 10000; iter++) {
    int n = rng() % 8 + 2;  // 2..9 nodes
    int m = rng() % (n * n);
    // build random graph, weights in [1, 100]
    // run Dijkstra from node 0
    // run Floyd-Warshall on same graph
    // assert all distances match
}

Keep n <= 9 so Floyd-Warshall's O(n^3) is instant. This catches relaxation order bugs, visited-set mistakes, and priority queue misuse.

Attempt	Target	Notes
1st	< 8 min	Focus on correctness -- proper relaxation, PQ with `{dist, node}`, skip stale entries.
2nd	< 6 min	Eliminate pauses -- the PQ loop and relaxation should be automatic.
3rd+	< 4 min	Competition-ready. Dijkstra is the workhorse of weighted shortest paths.

Variation	Key change	Practice problem
With path reconstruction	Store `parent[v]` during relaxation; trace back from target to source	CSES Shortest Routes I + print path

Recap

Dijkstra's algorithm is greedy BFS with a price tag. Instead of exploring the nearest node by hop count (BFS), it explores the nearest node by total cost. The priority queue replaces BFS's plain queue, and the greedy choice -- "the closest unfinalized node will never get a shorter path" -- is valid because all edge weights are non-negative. Break that assumption (negative edges), and the greedy choice fails.

Key takeaways:

Core idea: always finalize the unvisited node with smallest tentative distance.
Invariant: once popped from the PQ, a node's distance is optimal.
Complexity: O((V + E) log V) with a binary heap.
Classic bugs: wrong PQ pair order, missing staleness check, negative weights, int overflow.
Alternatives: BFS (unweighted), 0-1 BFS (weights in {0,1}), DP on DAG, Bellman-Ford (negative weights).

"Greedy by distance: closest unvisited node can't get cheaper." That is the invariant. Non-negative weights make it true. Negative weights break it.

Dijkstra's Algorithm

Contents

Intuition

Visual Walkthrough

The Invariant

When to Reach for This

Dijkstra State Snapshots

What the Code Won't Teach You

How This Evolved

C++ STL Reference

Implementation (Contest-Ready)

Read this before the code: lazy heap deletion

Version 1 -- Minimal contest template

Version 2 -- Explained version with path reconstruction

Operations Reference

Problem Patterns & Tricks

Pattern 1 -- Classic SSSP

Pattern 2 -- State-space Dijkstra (modified states)

Pattern 3 -- Multi-source Dijkstra

Pattern 4 -- Dijkstra on a grid

Pattern 5 -- Shortest path with reconstruction

Pattern 6 -- K-shortest paths (Yen's algorithm wrapper)

Pattern 7 -- Meet-in-the-middle / bidirectional Dijkstra

Contest Cheat Sheet

Common Mistakes & Debugging

Mental Traps

Self-Test

Practice Problems

Pattern Fingerprint

Rating Progression

What Would You Do If...?

Debug This

Historical Origin

Designing Test Cases

Further Reading

How to Practice This

Reconstruct It

Solve With Me -- CF 20C "Dijkstra?"

Recap

Dijkstra's Algorithm ​

Contents ​

Intuition ​

Visual Walkthrough ​

The Invariant ​

When to Reach for This ​

Dijkstra State Snapshots ​

What the Code Won't Teach You ​

How This Evolved ​

C++ STL Reference ​

Implementation (Contest-Ready) ​

Read this before the code: lazy heap deletion ​

Version 1 -- Minimal contest template ​

Version 2 -- Explained version with path reconstruction ​

Operations Reference ​

Problem Patterns & Tricks ​

Pattern 1 -- Classic SSSP ​

Pattern 2 -- State-space Dijkstra (modified states) ​

Pattern 3 -- Multi-source Dijkstra ​

Pattern 4 -- Dijkstra on a grid ​

Pattern 5 -- Shortest path with reconstruction ​

Pattern 6 -- K-shortest paths (Yen's algorithm wrapper) ​

Pattern 7 -- Meet-in-the-middle / bidirectional Dijkstra ​

Contest Cheat Sheet ​

Common Mistakes & Debugging ​

Mental Traps ​

Self-Test ​

Practice Problems ​

Pattern Fingerprint ​

Rating Progression ​

What Would You Do If...? ​

Debug This ​

Historical Origin ​

Designing Test Cases ​

Further Reading ​

How to Practice This ​

Reconstruct It ​

Solve With Me -- CF 20C "Dijkstra?" ​

Related Techniques ​

Recap ​

Dijkstra's Algorithm

Contents

Intuition

Visual Walkthrough

The Invariant

When to Reach for This

Dijkstra State Snapshots

What the Code Won't Teach You

How This Evolved

C++ STL Reference

Implementation (Contest-Ready)

Read this before the code: lazy heap deletion

Version 1 -- Minimal contest template

Version 2 -- Explained version with path reconstruction

Operations Reference

Problem Patterns & Tricks

Pattern 1 -- Classic SSSP

Pattern 2 -- State-space Dijkstra (modified states)

Pattern 3 -- Multi-source Dijkstra

Pattern 4 -- Dijkstra on a grid

Pattern 5 -- Shortest path with reconstruction

Pattern 6 -- K-shortest paths (Yen's algorithm wrapper)

Pattern 7 -- Meet-in-the-middle / bidirectional Dijkstra

Contest Cheat Sheet

Common Mistakes & Debugging

Mental Traps

Self-Test

Practice Problems

Pattern Fingerprint

Rating Progression

What Would You Do If...?

Debug This

Historical Origin

Designing Test Cases

Further Reading

How to Practice This

Reconstruct It

Solve With Me -- CF 20C "Dijkstra?"

Related Techniques

Recap