Skip to content

Dijkstra's Algorithm

Quick Revisit

  • USE WHEN: single-source shortest path with non-negative edge weights
  • INVARIANT: once a node is popped from the priority queue, its distance is final
  • TIME: O((V + E) log V) with binary heap | SPACE: O(V + E)
  • CLASSIC BUG: using Dijkstra with negative edge weights (greedy assumption breaks)
  • PRACTICE: ../12-practice-ladders/05-ladder-graphs.md

Greedy single-source shortest path for graphs with non-negative edge weights. The bread-and-butter graph algorithm on Codeforces from Div 2 C onward.

ID: AL-15 Difficulty: [Intermediate]Prerequisites: Graph Representation, Priority Queue and HeapsCF Rating: 1500-1800

Contest Frequency: Regular -- appears every few Div 2 contests


Contents


Intuition

You have a weighted graph with 5 nodes and need the shortest path from A to E:

        4       1
  (A)------>(B)------>(E)
   |                   ^
   |2                  |3
   v                   |
  (C)------>(D)--------+
        5

Brute force: enumerate every possible path from A to E. Paths can revisit different orderings of intermediate nodes, so this blows up exponentially -- O(V!) in the worst case. For V=20 that is already over 1018 paths.

Naive greedy (pick the lightest outgoing edge): start at A, always move along the cheapest edge leaving the current node.

Greedy walk: A --4--> B --1--> E    total = 5

Looks fine here -- but try a slightly different graph:

        1      10
  (A)------>(B)------>(E)
   |                   ^
   |2                  |1
   v                   |
  (C)------>(D)--------+
        1
Greedy walk: A --1--> B --10--> E   total = 11
Actual best: A --2--> C --1--> D --1--> E   total = 4

The naive greedy fails because it commits to one edge without considering the full cost of reaching the destination through that edge. We need a smarter greedy strategy.

Always finalize the unvisited node with the smallest tentative distance -- it can't get any shorter because all remaining paths go through nodes with equal or larger distance.

Analogy: imagine dropping a stone into a pond at the source node. Ripples expand outward in circles of increasing radius. The first time a ripple reaches a node, that is the shortest distance -- later ripples (larger radius) can never beat it. Dijkstra simulates this expanding wavefront, processing nodes in order of their distance from the source.

Where this breaks: if an edge has negative weight, a "later ripple" can actually arrive sooner than an earlier one. The expanding-circle model collapses, and Dijkstra's greedy choice is no longer safe. Use Bellman-Ford instead.

Checkpoint: Dijkstra always finalizes the unvisited node with the smallest tentative distance. Why does this greedy choice fail when edges can have negative weights?

Think first, then click When all edges are non-negative, any path through unvisited nodes (which have distance >= the current node) can only get longer. So the current minimum can't be beaten. But with a negative edge, a path through a higher-distance node could decrease along the way and arrive with a shorter total -- the "already finalized" distance turns out to be wrong.

Visual Walkthrough

Graph (same as the Pain section, first example):

        4       1
  (A)------>(B)------>(E)
   |                   ^
   |2                  |3
   v                   |
  (C)------>(D)--------+
        5

We track a distance array d[] and a min-heap priority queue (PQ). Finalized nodes are marked with *.

Initial state:
  d[A]=0   d[B]=inf   d[C]=inf   d[D]=inf   d[E]=inf
  PQ: {(0, A)}

Step 1: Extract (0, A). Relax A->B (w=4), A->C (w=2).
  d[A]=0*  d[B]=4     d[C]=2     d[D]=inf   d[E]=inf
  PQ: {(2, C), (4, B)}

Step 2: Extract (2, C). Relax C->D (w=5).
  d[A]=0*  d[B]=4     d[C]=2*    d[D]=7     d[E]=inf
  PQ: {(4, B), (7, D)}

Step 3: Extract (4, B). Relax B->E (w=1).
  d[A]=0*  d[B]=4*    d[C]=2*    d[D]=7     d[E]=5
  PQ: {(5, E), (7, D)}

Step 4: Extract (5, E). No outgoing edges to relax.
  d[A]=0*  d[B]=4*    d[C]=2*    d[D]=7     d[E]=5*
  PQ: {(7, D)}

Step 5: Extract (7, D). Relax D->E (w=3): d[D]+3=10 > d[E]=5. No update.
  d[A]=0*  d[B]=4*    d[C]=2*    d[D]=7*    d[E]=5*
  PQ: {}  --> DONE

Final shortest distances from A:

  +------+---+---+---+---+---+
  | Node | A | B | C | D | E |
  +------+---+---+---+---+---+
  | d[]  | 0 | 4 | 2 | 7 | 5 |
  +------+---+---+---+---+---+

The Invariant

+------------------------------------------------------------------+
| GREEDY INVARIANT: When node u is extracted from the priority     |
| queue, d[u] equals the true shortest-path distance from s to u. |
+------------------------------------------------------------------+

Proof by contradiction. Suppose node u is extracted from the PQ but d[u] is NOT the true shortest-path distance. Then there exists a shorter path P from s to u with length (P)<d[u].

Path P starts at s (already finalized) and ends at u (just extracted). So P must, at some point, leave the set of finalized nodes. Let v be the first unfinalized node along P, and let x be the finalized node just before v on P.

  1. Since x was already finalized, d[x] is the true shortest distance to x (by induction -- x was extracted earlier).
  2. When x was finalized, the edge (x,v) was relaxed, so d[v]d[x]+w(x,v).
  3. The sub-path of P from s to v has length at least d[x]+w(x,v), so d[v] (length of P up to v).
  4. All edges are non-negative, so the remaining portion of P from v to u has non-negative length. Therefore (P) (length of P up to v) d[v].
  5. But u was extracted before v, meaning d[u]d[v] (the PQ extracts the minimum).
  6. Combining: (P)d[v]d[u].

This contradicts (P)<d[u]. Therefore d[u] must be the true shortest-path distance when u is extracted.

When to Reach for This

Trigger phrases in problem statements:

  • "shortest path" / "minimum cost path" with non-negative weights
  • "single source, all destinations" (or single source-target)
  • "minimum cost to reach a state" (state-space Dijkstra)
  • "grid shortest path with varying cell costs"

Counter-examples -- when Dijkstra is NOT the right tool:

SituationUse insteadWhy
Negative edge weightsBellman-Ford / SPFAGreedy invariant breaks
All weights = 1BFSO(V+E) simpler and faster
All weights 0 or 10-1 BFS (deque)O(V+E), no log factor
All-pairs shortest paths, V400Floyd-WarshallO(V3), simpler code
Negative cycles detectionBellman-FordDijkstra cannot detect them
DAG shortest pathsDP on topological orderO(V+E), no heap needed (06-dp-on-dag.md)

Dijkstra State Snapshots

Snapshot 1 -- Expanding wavefront at Step 2:

After extracting (2, C), nodes A and C are finalized. B and D are tentative. The PQ holds the frontier of the expanding "ripple."

         4       1
   (A*)----->(B)------>(E)          * = finalized
    |0         4         ?
    |2                  |3
    v                   |
   (C*)----->(D)--------+
    2    5    7

  d[]:  A=0*   B=4    C=2*   D=7    E=inf

  PQ contents (min-heap):
  +-------+-------+
  | dist  | node  |
  +-------+-------+
  |   4   |   B   |  <-- next to extract
  |   7   |   D   |
  +-------+-------+

  Wavefront on the number line:
  --[A*]------[C*]------[B]----------------[D]-------->
    0          2         4                  7

Snapshot 2 -- Negative edge breaks the greedy invariant:

Add a negative edge D->B (weight -6) to the same graph. Dijkstra finalizes B=4 at Step 3, but the true shortest is A->C->D->B = 2+5+(-6) = 1. The cheaper path arrives too late.

         4       1
   (A)------>(B)------>(E)
    |          ^         ^
    |2     -6 |         |3
    v         |         |
   (C)------>(D)--------+
         5

  Step 1: Extract A(0).  Relax A->B(4), A->C(2).
          PQ: [(2,C), (4,B)]

  Step 2: Extract C(2).  Relax C->D(7).
          PQ: [(4,B), (7,D)]

  Step 3: Extract B(4).  <- FINALIZED B = 4  X WRONG
          Relax B->E(5).
          PQ: [(5,E), (7,D)]

  Step 4: Extract E(5).  PQ: [(7,D)]

  Step 5: Extract D(7).  Relax D->B: 7+(-6)=1 < 4
          But B is already finalized!
          +--------------------------------------------+
          | TRUE shortest to B: A->C->D->B = 2+5-6 = 1   |
          | Dijkstra reported:  A->B     = 4       X     |
          +--------------------------------------------+
          The "later ripple" (dist 1) beat the "earlier"
          one (dist 4), violating the greedy invariant.

Snapshot 3 -- Stale PQ entry and the staleness check:

Add edge A->D (weight 9) to the original graph. D gets pushed twice: first as {9,D}, then as {7,D} when C->D improves it.

         4       1
   (A)------>(B)------>(E)
    |  \                ^
    |2  \9             |3
    v    v             |
   (C)------>(D)-------+
         5

  After Step 1 (extract A):
    Relax A->B(4), A->C(2), A->D(9).
    PQ: [(2,C), (4,B), (9,D)]

  After Step 2 (extract C):
    Relax C->D: d[D] improved 9->7.  Push {7,D}.
    PQ: [(4,B), (7,D), (9,D)]
                        ^^^^^
                        STALE -- old distance, still in PQ

  After Step 4 (extract D at dist 7):
    D finalized with d[D]=7.
    PQ: [(9,D)]   <- stale entry remains

  Step 5: Extract (9, D).
    Check: du=9 > d[D]=7  ->  SKIP  (staleness check)
    +--------------------------------------------------+
    |  if (du > d[u]) continue;  // skips stale entry  |
    +--------------------------------------------------+
    Without this check, we'd re-process D and waste time.
    On dense graphs, this turns O((V+E)log V) into TLE.

What the Code Won't Teach You

1. The staleness check is fundamental, not a micro-optimization.

std::priority_queue has no decrease-key operation. When we find a shorter path to node v, we cannot update its existing PQ entry -- we push a new entry {new_dist, v} and leave the old one behind. This means the PQ accumulates stale (outdated) entries. The single line

cpp
if (du > d[u]) continue;   // stale entry -- skip

is what makes lazy deletion work. Without it, every stale entry triggers a full round of useless relaxations. This is not an optimization you "can add later" -- it is part of the algorithm.

  PQ internals during execution:
  +-------+------+-------------------+
  | dist  | node | status            |
  +-------+------+-------------------+
  |   3   |  v   | CURRENT (d[v]=3)  |  <- process this
  |   7   |  v   | STALE   (d[v]=3)  |  <- du=7 > d[v]=3 -> skip
  |   9   |  w   | CURRENT (d[w]=9)  |  <- process this
  +-------+------+-------------------+

2. Dijkstra = BFS + priority queue.

BFS assumes every edge costs 1, so a FIFO queue processes nodes in order of hop count. Dijkstra generalises: edges have varying costs, so we swap the FIFO queue for a min-heap that always extracts the closest unfinalized node. Both algorithms share the same skeleton -- "process the closest node next, relax its neighbours."

  BFS:       queue   (FIFO)     all edges cost 1     O(V + E)
  Dijkstra:  min-heap           edges cost >= 0       O((V+E) log V)
  -------------------------------------------------------------
  Same core loop:
    while (not empty):
      u = extract closest unfinalized node
      for each neighbour v of u:
        if d[u] + w(u,v) < d[v]:
          d[v] = d[u] + w(u,v)
          push v

3. The greedy invariant is WHY negative edges break Dijkstra.

When node u is extracted from the PQ, d[u] is final because every remaining path to u must go through nodes still in the PQ, all of which have distance >= d[u]. No remaining path can beat d[u] -- so the greedy choice is safe. A negative-weight edge breaks this: a path through a higher-distance node can decrease on the way back to u, arriving with total cost < d[u] after u was already finalized.

How This Evolved

Shortest-path algorithms evolved along a clear axis: handling increasingly general edge weights. BFS came first -- it finds shortest paths in unweighted graphs by exploring nodes layer by layer, one hop at a time. Every edge has implicit weight 1, and the FIFO queue guarantees nodes are finalized in order of distance. BFS is O(V+E) and remains the best choice when all edges are equal.

Dijkstra's algorithm generalizes BFS to non-negative weights by replacing the FIFO queue with a priority queue. Instead of processing nodes by hop count, it processes them by tentative distance -- smallest first. The greedy invariant ("the node with the smallest tentative distance is already optimal") holds precisely because no edge can decrease a path's total cost. With a binary heap this runs in O((V+E)logV); with a Fibonacci heap, O(VlogV+E).

But what if edges can be negative? Dijkstra's greedy choice fails -- a longer path through a negative edge might beat a shorter one. Bellman-Ford handles this by relaxing every edge V1 times, running in O(VE). It also detects negative-weight cycles (if any distance decreases on a V-th pass, a negative cycle exists). For the all-pairs problem on graphs with negative edges but no negative cycles, Johnson's algorithm combines Bellman-Ford and Dijkstra: run Bellman-Ford once from a virtual source to compute a potential function that reweights all edges to be non-negative, then run Dijkstra from every vertex on the reweighted graph. This gives O(V2logV+VE) -- far better than running Bellman-Ford V times.


C++ STL Reference

Function / ClassHeaderWhat it doesTime
priority_queue<T><queue>Max-heap by defaultpush/pop O(logn)
priority_queue<T, vector<T>, greater<T>><queue>Min-heap (what we need)push/pop O(logn)
pair<long long, int><utility>{dist, node} -- pairs compare lexicographically, so min-heap on first (distance)O(1) compare
vector<vector<pair<int,int>>><vector>Adjacency list: adj[u] = {{v, w}, ...}O(1) amortized push
numeric_limits<long long>::max()<limits>Use as . Or just 1e18.-

Key point: Store pairs as {distance, node} in the PQ, not {node, distance}. The min-heap compares by the first element, which must be the distance.

Checkpoint: C++'s priority_queue has no decrease_key operation. How does Dijkstra handle finding a shorter path to a node that's already in the priority queue?

Think first, then click It uses "lazy deletion": push a *new* entry `{shorter_dist, v}` into the PQ alongside the old stale entry. When a stale entry is later extracted, check `if (du > d[u]) continue;` -- the distance in the PQ entry is worse than the already-recorded shortest distance, so skip it. This adds at most `O(E)` entries to the PQ but maintains correctness.

Implementation (Contest-Ready)

Read this before the code: lazy heap deletion

Every contest Dijkstra implementation in this book uses lazy deletion, and the pattern is so important it deserves its own beat before any code appears.

std::priority_queue has no "decrease-key" operation. When you discover a shorter path to node v, you cannot update its existing entry — you push a new entry {shorter_dist, v} and leave the old one behind. The PQ now contains both. Each time you pop, you must distinguish:

  • Current entry: du == d[u]. Process it: relax outgoing edges.
  • Stale entry: du > d[u]. Skip it: a better entry already finalised u (or will).

The two states differ by one line:

cpp
auto [du, u] = pq.top(); pq.pop();
if (du > d[u]) continue;       // stale -- already superseded
// otherwise du == d[u], this is the canonical entry for u

This is not an optimisation you "add later." Without the staleness check, every superseded entry triggers another full round of relaxations, and the algorithm can degenerate badly on dense graphs with many edge improvements. Read the contest implementations below with this in mind: every Dijkstra in CP code allows duplicate PQ entries on purpose, and the staleness check is what keeps it linear in PQ-extractions.

Priority queue pair ordering matters. Use {dist, node}, NOT {node, dist}. priority_queue compares pairs lexicographically -- if the node ID comes first, you're sorting by node ID, not by distance. This is a silent bug that produces wrong shortest paths.

cpp
// WRONG -- sorts by node ID
priority_queue<pair<int,int>, vector<pair<int,int>>, greater<>> pq;
pq.push({node, dist});  // BAD: compares node first

// CORRECT -- sorts by distance
pq.push({dist, node});  // GOOD: smallest distance popped first

Stale entries in the priority queue. When you pop {d, v}, check if (d > dist[v]) continue;. Without this, you process outdated entries and may relax edges from non-optimal distances. The algorithm is still correct without the check (just slower), but it can cause TLE on graphs with many edges.

Version 1 -- Minimal contest template

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;
    vector<vector<pair<int, long long>>> adj(n + 1);
    for (int i = 0; i < m; i++) {
        int u, v;
        long long w;
        cin >> u >> v >> w;
        adj[u].push_back({v, w});
        adj[v].push_back({u, w}); // remove for directed
    }

    int s = 1;
    const long long INF = 1e18;
    vector<long long> d(n + 1, INF);
    d[s] = 0;
    // min-heap of {dist, node}
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>,
                   greater<>> pq;
    pq.push({0, s});

    while (!pq.empty()) {
        auto [du, u] = pq.top();
        pq.pop();
        if (du > d[u]) continue; // stale entry
        for (auto [v, w] : adj[u]) {
            if (d[u] + w < d[v]) {
                d[v] = d[u] + w;
                pq.push({d[v], v});
            }
        }
    }

    for (int i = 1; i <= n; i++)
        cout << (d[i] == INF ? -1 : d[i]) << " \n"[i == n];
}

Version 2 -- Explained version with path reconstruction

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    cin >> n >> m;

    // Adjacency list: adj[u] stores {neighbour, weight} pairs.
    // 1-indexed so we allocate n+1 slots.
    vector<vector<pair<int, long long>>> adj(n + 1);
    for (int i = 0; i < m; i++) {
        int u, v;
        long long w;
        cin >> u >> v >> w;
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    }

    int source = 1, target = n;
    const long long INF = 1e18;

    // d[v] = shortest known distance from source to v.
    vector<long long> d(n + 1, INF);
    // par[v] = predecessor of v on the shortest path (for reconstruction).
    vector<int> par(n + 1, -1);

    d[source] = 0;

    // Min-heap ordered by distance. We store {distance, node}.
    // greater<> makes it a min-heap (smallest distance on top).
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>,
                   greater<>> pq;
    pq.push({0, source});

    while (!pq.empty()) {
        auto [du, u] = pq.top();
        pq.pop();

        // If we already found a better path to u, this entry is stale.
        // This is the "lazy deletion" approach -- cheaper than decrease-key.
        if (du > d[u]) continue;

        for (auto [v, w] : adj[u]) {
            // Relaxation: can we improve d[v] by going through u?
            if (d[u] + w < d[v]) {
                d[v] = d[u] + w;
                par[v] = u;
                pq.push({d[v], v});
            }
        }
    }

    if (d[target] == INF) {
        cout << -1 << "\n";
        return 0;
    }

    cout << d[target] << "\n";

    // Reconstruct path by walking parent pointers backwards.
    vector<int> path;
    for (int v = target; v != -1; v = par[v])
        path.push_back(v);
    reverse(path.begin(), path.end());

    for (int i = 0; i < (int)path.size(); i++)
        cout << path[i] << " \n"[i + 1 == (int)path.size()];
}

Operations Reference

Why a priority queue? Dijkstra's core loop extracts the unvisited node with smallest tentative distance. Without a heap, finding that node requires scanning all V entries -- O(V) per extraction, O(V2) total. A min-heap brings extraction down to O(logV), giving O((V+E)logV) overall. For sparse graphs (EV, e.g. V=106) this is the difference between 1012 and 2×107 operations -- easily the difference between TLE and AC. For dense graphs (EV2), the O(V2) no-heap version is competitive because the logV factor from the heap can hurt; but in CP, most graphs are sparse and the priority_queue version is the default.

OperationTimeSpace
Full Dijkstra (binary heap)O((V+E)logV)O(V+E)
Full Dijkstra (Fibonacci heap)O(VlogV+E)O(V+E)
Full Dijkstra (dense, no heap)O(V2)O(V+E)
Single extraction from PQO(logV)-
Single relaxationO(logV)-
Path reconstructionO(V)O(V)

Notes:

  • The binary heap version is best for sparse graphs (EV). For dense graphs (EV2), the O(V2) version without a heap can be faster due to lower constant factors.
  • Fibonacci heap is theoretically better but never used in CP -- the constant factor is huge.
  • In practice, the STL priority_queue version handles V,E106 comfortably within 1-2 seconds.
  • Space is dominated by the adjacency list (O(V+E)) and the distance array (O(V)).

Problem Patterns & Tricks

Pattern 1 -- Classic SSSP

Straightforward single-source shortest path. Read graph, run Dijkstra, output distances.

Recognize it: "Find the shortest path from s to t" with non-negative weights.

CF 20C -- Dijkstra?
CSES -- Shortest Routes I

Pattern 2 -- State-space Dijkstra (modified states)

The "vertex" in Dijkstra is not just a node but a tuple (node, extra_state). This handles problems where the cost depends on additional information.

Examples: shortest path with at most k toll roads, shortest path where you can halve one edge, shortest path with fuel constraints.

cpp
// State: {node, num_edges_halved}
// dist[v][k] = shortest path to v having halved k edges
vector<vector<long long>> d(n + 1, vector<long long>(2, INF));
d[s][0] = 0;
// push {dist, node, halved_count}
pq.push({0, s, 0});
CF 1473E -- Minimum Path (use one free +/- edge)
CF 1816D -- Grid with abilities (state = position + power used)

Pattern 3 -- Multi-source Dijkstra

Push multiple sources into the PQ with distance 0. This finds the shortest distance from any source to every vertex -- equivalent to adding a virtual super-source with zero-weight edges to all real sources.

cpp
for (int s : sources) {
    d[s] = 0;
    pq.push({0, s});
}
CF 1528A -- Parsa's Humongous Tree (multi-source on tree)
CF 1272E -- Nearest opposite-color node

Pattern 4 -- Dijkstra on a grid

The graph is implicit: cells are nodes, adjacent cells share edges. Weight = cost to enter the next cell. Build the adjacency on-the-fly instead of storing an explicit graph.

cpp
int dx[] = {0, 0, 1, -1};
int dy[] = {1, -1, 0, 0};
// In the main loop:
for (int dir = 0; dir < 4; dir++) {
    int nx = x + dx[dir], ny = y + dy[dir];
    if (nx < 0 || nx >= R || ny < 0 || ny >= C) continue;
    long long nw = d[x][y] + cost[nx][ny];
    if (nw < d[nx][ny]) {
        d[nx][ny] = nw;
        pq.push({nw, nx * C + ny}); // encode (row,col) as single int
    }
}
CF 1063B -- Labyrinth (0-1 BFS variant, but Dijkstra also works)
CF 1579E2 -- Grid paths with costs

Pattern 5 -- Shortest path with reconstruction

Run Dijkstra while maintaining a par[] array. Reconstruct by walking backwards from target to source. See Version 2 implementation above.

CF 20C -- Dijkstra?
CSES -- Shortest Routes I (with path output)

Pattern 6 -- K-shortest paths (Yen's algorithm wrapper)

Find the k-th shortest path. Run Dijkstra k times, each time forbidding one edge from the previously found paths. Heavy but works for small k.

Simpler variant: allow revisits. Use a counter -- the k-th time a node is popped from the PQ, that is its k-th shortest distance.

cpp
vector<int> cnt(n + 1, 0);
while (!pq.empty()) {
    auto [du, u] = pq.top(); pq.pop();
    cnt[u]++;
    if (cnt[u] == k && u == target) { cout << du; return 0; }
    if (cnt[u] > k) continue;
    for (auto [v, w] : adj[u])
        pq.push({du + w, v});
}
CF 1196F -- K-th Path
CSES -- Flight Routes

Pattern 7 -- Meet-in-the-middle / bidirectional Dijkstra

Run Dijkstra from both source and target. Answer = min(u,v,w)E(ds[u]+w+dt[v]). Useful for single source-target queries on large graphs or when combined with other techniques.

CF 449B -- Jzzhu and Cities (reverse Dijkstra for pruning)

Contest Cheat Sheet

+--------------------------------------------------------------+
|  DIJKSTRA'S ALGORITHM -- QUICK REFERENCE                     |
+--------------------------------------------------------------+
|  WHEN: Non-negative weights, single-source shortest path     |
|  TIME: O((V+E) log V)    SPACE: O(V+E)                      |
+--------------------------------------------------------------+
|  TEMPLATE:                                                   |
|    vector<long long> d(n+1, 1e18);                           |
|    priority_queue<pair<long long,int>,                        |
|      vector<pair<long long,int>>, greater<>> pq;             |
|    d[s]=0; pq.push({0,s});                                   |
|    while(!pq.empty()){                                       |
|      auto [du,u]=pq.top(); pq.pop();                        |
|      if(du>d[u]) continue;                                   |
|      for(auto [v,w]:adj[u])                                  |
|        if(d[u]+w<d[v]) d[v]=d[u]+w, pq.push({d[v],v});      |
|    }                                                         |
+--------------------------------------------------------------+
|  WATCH OUT:                                                  |
|  - {dist,node} NOT {node,dist} in PQ                        |
|  - Use long long if sum of weights > 2e9                     |
|  - if(du>d[u]) continue; is NOT optional (TLE without it)   |
|  - Negative weights? Use Bellman-Ford/SPFA instead           |
|  - Unweighted? Use plain BFS (faster and simpler)            |
+--------------------------------------------------------------+

Common Mistakes & Debugging

The most frequent Dijkstra bugs, roughly ordered by how often they appear in contest submissions:

#MistakeSymptom
1Using Dijkstra with negative edge weightsWA -- greedy invariant breaks silently
2Not using visited/finalized check (if (du > d[u]) continue)TLE -- processes same node many times
3Using adjacency matrix instead of adjacency listTLE -- O(V2) per relaxation step
4Forgetting to initialize distances to infinityWA -- nodes appear reachable at dist 0

Below is the full list with explanations.

1. Negative edge weights break Dijkstra. The greedy invariant ("extracted distance is final") fails. Dijkstra may produce wrong answers silently -- no runtime error, just wrong output. If the problem has negative weights, use Bellman-Ford or SPFA.

2. {dist, node} vs {node, dist} in the PQ.priority_queue with greater<> compares pairs lexicographically. If you store {node, dist}, the heap orders by node ID, not distance. Your code compiles fine but gives wrong answers. Always {dist, node}.

3. The if (du > d[u]) continue; check. Without this, you process stale entries -- vertices whose distance was already improved. On dense graphs this turns O((V+E)logV) into O(ElogE) or worse, causing TLE. This single line is the difference between AC and TLE on tight limits.

4. Integer overflow. If weights are up to 109 and the path has up to 105 edges, the total distance can reach 1014. Use long long for distances. Set INF = 1e18 (not INT_MAX -- adding to INT_MAX overflows).

5. 1-indexed vs 0-indexed. If vertices are 1-indexed (Codeforces standard), allocate n + 1 entries. Off-by-one here causes out-of-bounds writes that corrupt data silently.

6. Forgetting greater<> for min-heap. Default priority_queue is a max-heap. You extract the largest distance first, which is the exact opposite of what you want. Results: wrong answers or infinite loops.

7. Parallel edges and self-loops. The algorithm handles these correctly without special cases. Parallel edges: the relaxation naturally picks the best. Self-loops: d[u]+wd[u] so they never improve anything. Don't add special-case code.

8. Disconnected components. If target is unreachable, d[target] stays . Check for this before outputting. Print 1 or whatever the problem specifies.

9. Multi-edge insertion. When the problem says "there may be multiple edges between the same pair of vertices," do NOT deduplicate. Just add all edges to the adjacency list. Dijkstra handles it.

Debugging checklist:

  • Print the distance array after Dijkstra. Are disconnected nodes still ?
  • Print each extraction: cerr << "pop " << u << " d=" << du << "\n";
  • Check that the adjacency list has the right number of edges (count them).
  • For wrong answer: trace the relaxation on paper with a 4-5 node example.

Mental Traps

Trap 1: "I can handle negative weights by adding a large constant to all edges."

This is wrong. Adding a constant K to every edge changes the relative cost of paths with different edge counts. A path with fewer hops gets penalised less than a path with more hops, destroying the original shortest-path ordering.

  Original graph:                 After adding K=10 to every edge:

       1        -3                     11        7
  (A)---->(B)------>(C)           (A)---->(B)------>(C)
   |                 ^             |                 ^
   |  4              |             | 14              |
   +-------->(D)-----+            +-------->(D)-----+
          2                              12

  Path via B,C: 1 + (-3) = -2         11 + 7  = 18   (2 edges, +2K)
  Path via D,C: 4 +   2  =  6         14 + 12 = 26   (2 edges, +2K)
  Direct A->D->C: 4 + 2    =  6         14 + 12 = 26

  Now add a direct edge A->C with weight 5 (original):
  Path A->C:     5                      15             (1 edge,  +1K)
  Path A->B->C:   1 + (-3) = -2         11 + 7  = 18   (2 edges, +2K)

  Original winner:  A->B->C  (cost -2)
  Shifted winner:   A->C    (cost 15)   <- DIFFERENT ANSWER
                                          +K per edge changes ranking

Trap 2: "State-space Dijkstra is just Dijkstra with more nodes -- I only need to check d[u]."

In state-space Dijkstra (e.g., d[node][extra_state]), the staleness check must compare all state dimensions, not just the node. A PQ entry {dist, node, state} is stale if dist > d[node][state], not if dist > d[node] (which ignores the extra dimension).

  State-space example: d[node][used_coupon]
  PQ entry: {dist=5, node=B, coupon=0}

  WRONG staleness check:             CORRECT staleness check:
  if (du > d[B]) continue;           if (du > d[B][0]) continue;
        ^                                       ^^^^
        ignores coupon dimension                checks full state

  Why it matters:
  +------+------------+------------+
  | node | d[v][0]    | d[v][1]    |
  +------+------------+------------+
  |  B   |   5        |   3        |
  +------+------------+------------+

  PQ: {5, B, coupon=0}  <- current for d[B][0]=5, should process
  Checking d[B] alone might see d[B][1]=3 < 5 and wrongly skip!

Self-Test

Can you do these without looking anything up?

  • [ ] Write Dijkstra from memory -- min-heap declaration, {dist, node} pairs, staleness check, relaxation
  • [ ] Explain why {dist, node} must be in this order with a concrete counter-example
  • [ ] State why Dijkstra fails on negative edges using the greedy invariant argument
  • [ ] Modify Dijkstra for state-space problems -- describe how d[node][extra] and PQ entries change
  • [ ] Identify the integer overflow and max-heap bugs and describe their symptoms

Quick-fire questions:

Q1: What is the time complexity of Dijkstra with a binary heap?

AnswerO((V+E) log V). Each of the V nodes is extracted from the heap once (O(log V) each), and each of the E edges may trigger a push to the heap (O(log V) each). Total: O((V+E) log V).

Q2: Why does Dijkstra fail on graphs with negative edge weights? Give a concrete 3-node example.

AnswerDijkstra's greedy invariant assumes that once a node is popped, its distance is final. Negative edges violate this. Example: A->B weight 2, A->C weight 3, C->B weight -2. Dijkstra pops B with dist=2, but the path A->C->B has cost 1. Since B is already finalized, the shorter path is missed.

Q3: In the priority queue implementation, why do we store pairs as {dist, node} and not {node, dist}?

AnswerC++ `priority_queue` (min-heap via `greater<>`) compares by the first element. Storing `{dist, node}` ensures the node with smallest distance is popped first. If we stored `{node, dist}`, nodes would be ordered by node ID, not by distance.

Q4: What is the "staleness check" in Dijkstra, and what happens if you omit it?

AnswerWhen popping `{d, u}`, we check `if (d > dist[u]) continue;`. This skips stale entries -- earlier pushes with longer distances. Without it, we'd re-relax neighbors from outdated distances, potentially causing TLE from exponential re-processing.

Q5: You need shortest paths where each edge has weight 0 or 1. Dijkstra works but what is faster and why?

Answer0-1 BFS using a deque: push weight-0 neighbors to the front, weight-1 neighbors to the back. Runs in O(V+E) vs Dijkstra's O((V+E) log V), because the deque maintains sorted-by-distance order without heap operations.

Practice Problems

#ProblemSourceDifficultyKey Idea
1Shortest Routes ICSESEasyTextbook Dijkstra, just implement it
2Dijkstra?CF 20CEasySSSP with path reconstruction
3Flight DiscountCSESEasy-MedState-space Dijkstra: d[v][0/1] (used discount or not)
4Minimum PathCF 1473EMediumModified Dijkstra with extra state for free +/- edge
5Shortest Routes IICSESMediumAll-pairs (Floyd-Warshall), know when NOT to use Dijkstra
6Flight RoutesCSESMediumK-shortest paths via repeated extraction
7Jzzhu and CitiesCF 449BMedium-HardDijkstra + edge pruning, count shortest-path edges
8CivilizationCF 455CMedium-HardDSU + tree diameter + BFS/Dijkstra on trees
9K-th PathCF 1196FHardPrune graph to relevant edges, then k-shortest paths
10Road OptimizationCF 1625CHardDP on shortest path with at most k removals

Pattern Fingerprint

Constraint / Signal-> Use
"Shortest path" + non-negative weights ->Dijkstra
"Shortest path" + weights in {0, 1} ->0-1 BFS (faster)
"Shortest path" + negative weights ->Bellman-Ford (not Dijkstra!)
"Shortest path from one source to all nodes" ->Dijkstra (SSSP)
"Shortest path with state (fuel, tickets, etc.)" ->Dijkstra on expanded state graph
n105, m2×105 ->Dijkstra with binary heap: O((V+E)logV)

Rating Progression

CF RatingWhat You Should Be Comfortable With
1200Template Dijkstra, reading weighted graphs, shortest path from s to t
1500Multi-source Dijkstra, Dijkstra on grids, reconstructing the actual shortest path
1800Dijkstra with extended states (layers, bitmasks), k-shortest paths, Dijkstra + binary search
2100Dijkstra on virtual/segment-tree graphs, potential functions (Johnson's), Dijkstra with online edge modifications

What Would You Do If...?

  1. ...the graph has negative edge weights? You CANNOT use Dijkstra. Use Bellman-Ford (O(VE)) or, if no negative cycles, use Johnson's algorithm (reweight with potentials, then run Dijkstra from each source).

  2. ...you need the shortest path using at most k "free" edges (weight set to 0)? Build a layered graph: k+1 copies of the original graph. A "free" edge moves you from layer i to layer i+1. Run Dijkstra on this expanded graph with O(kn) nodes.

  3. ...there are 105 sources and you need shortest paths from all of them? If to a single target, run Dijkstra from the target on the reversed graph. If all-pairs, consider Floyd-Warshall (if n500) or run Dijkstra n times (O(n(n+m)logn)).

Debug This

Bug 1: Dijkstra gives wrong answers on some test cases:

cpp
priority_queue<pair<int,int>> pq;  // max-heap!
pq.push({0, src});
Answer

priority_queue<pair<int,int>> in C++ is a max-heap. It pops the node with the largest distance first, which violates Dijkstra's greedy property. Fix: use priority_queue<pair<int,int>, vector<pair<int,int>>, greater<pair<int,int>>> for a min-heap, or push negated distances with a max-heap.

Bug 2: Distance to some reachable nodes stays at INF:

cpp
vector<long long> dist(n + 1, 1e18);
dist[src] = 0;
pq.push({0, src});
while (!pq.empty()) {
    auto [d, u] = pq.top(); pq.pop();
    if (d > dist[u]) continue;
    for (auto& [v, w] : adj[u]) {
        if (dist[u] + w < dist[v]) {
            dist[v] = dist[u] + w;
            pq.push({dist[v], v});
        }
    }
}
// dist[target] is still 1e18...
Answer

The code logic is correct. If dist[target] is still 1e18, the target is genuinely unreachable from src. Check: (1) is the graph directed when it should be undirected? (2) Did you add edges in both directions for undirected graphs? (3) Are you using 1-indexed nodes with adj of size n+1?

Bug 3: Dijkstra with state gives TLE:

cpp
// Dijkstra on (node, fuel_remaining) state
// fuel can be 0..K where K = 10000, n = 100000
dist[src][K] = 0;
pq.push({0, src, K});
while (!pq.empty()) { ... }
Answer

The state space is n×K=105×104=109, which is way too large. The distance array alone would need ~8 GB. Fix: reduce the state space. Often, fuel/ticket states can be bounded by a small constant. Alternatively, binary search on the answer and use plain Dijkstra inside.

Historical Origin

Dijkstra's algorithm was conceived by Edsger W. Dijkstra in 1956 (published 1959) while sitting at a cafe in Amsterdam. He designed it in about 20 minutes to demonstrate the capabilities of the ARMAC computer. It remains one of the most widely used algorithms in computer science, from GPS navigation to internet routing.


Designing Test Cases

Dijkstra bugs usually show up on non-trivial graph structure -- not on the friendly sample inputs. Test these before submitting.

Minimal cases:

  • Single node: n = 1, source = 0. Distance to itself should be 0, no edges needed.
  • Two nodes: One edge 0 -> 1 with weight w. Distance is w. Also test the reverse query 1 -> 0 (should be unreachable in directed graph, w in undirected).

Edge cases:

  • Disconnected graph: Source can't reach some nodes. Verify those distances stay at INF, not 0 or garbage.
  • Self-loop: Edge u -> u with weight w. Should not affect dist[u] (already 0). Catches bugs where you relax the source node.
  • Greedy trap -- direct path isn't shortest: Classic triangle: 0->1 weight 10, 0->2 weight 3, 2->1 weight 2. Shortest to node 1 is 5 (via 2), not 10. If your code returns 10, you're not relaxing properly.
  • Multiple edges between same pair: Two edges u->v with different weights. Dijkstra should find the better one naturally.
  • Large weights near overflow: Weights up to 10^9 on a path of length 10^5. Total distance is 10^{14} -- make sure you're using long long.

Stress test (run locally before submitting):

cpp
// Compare Dijkstra against Floyd-Warshall on small random graphs.
mt19937 rng(42);
for (int iter = 0; iter < 10000; iter++) {
    int n = rng() % 8 + 2;  // 2..9 nodes
    int m = rng() % (n * n);
    // build random graph, weights in [1, 100]
    // run Dijkstra from node 0
    // run Floyd-Warshall on same graph
    // assert all distances match
}

Keep n <= 9 so Floyd-Warshall's O(n^3) is instant. This catches relaxation order bugs, visited-set mistakes, and priority queue misuse.


Further Reading

See also:

  • 01-graph-representation.md -- adjacency list construction
  • 09-priority-queue-and-heaps.md -- heap internals and STL usage
  • 08-bellman-ford.md -- when you need negative weights
  • bfs-and-dfs.md -- BFS for unweighted shortest paths

How to Practice This

Speed Drill -- Dijkstra with Priority Queue

Set a timer and implement Dijkstra's algorithm (adjacency list + min-heap priority queue, compute shortest distances from a source) from scratch. No references.

AttemptTargetNotes
1st< 8 minFocus on correctness -- proper relaxation, PQ with {dist, node}, skip stale entries.
2nd< 6 minEliminate pauses -- the PQ loop and relaxation should be automatic.
3rd+< 4 minCompetition-ready. Dijkstra is the workhorse of weighted shortest paths.

Variation Drills

Once basic Dijkstra is automatic, drill this essential extension:

VariationKey changePractice problem
With path reconstructionStore parent[v] during relaxation; trace back from target to sourceCSES Shortest Routes I + print path

Implement, verify correctness, then time yourself.

Practice Strategy

  1. Week 1: Drill basic Dijkstra daily until you hit 4 min consistently.
  2. Week 2: Add path reconstruction -- practice printing the actual shortest path, not just the distance.
  3. Week 3: Practice Dijkstra on state-space graphs (e.g., (node, fuel) or (node, bitmask)).
  4. Ongoing: For weighted shortest paths with non-negative weights, Dijkstra is the default. If weights can be negative, switch to Bellman-Ford.

Reconstruct It

Setup: You know BFS finds shortest paths in unweighted graphs in O(V+E). Now the edges have positive weights. BFS no longer works because one hop != one unit of distance.

How do you modify BFS to handle weighted edges?

Constraint hint: BFS processes nodes in order of number of edges from the source. You need to process them in order of total weight from the source.

Step 1 -- What breaks in BFS? BFS uses a FIFO queue, which guarantees nodes are visited in non-decreasing order of hops. But with weights, a node reached in fewer hops may have a longer total distance than one reached in more hops. You need a data structure that always gives you the closest unfinished node.

Step 2 -- Replace the queue. A min-heap (priority queue) ordered by current distance gives you the unprocessed node with smallest distance. Pop it, and relax its neighbors: if dist[u] + w(u,v) < dist[v], update dist[v] and push (dist[v], v).

Step 3 -- Why is a popped node final? When you pop node u, every node still in the PQ has distance dist[u] (because it's a min-heap and all weights are non-negative). So no future relaxation can improve u -- its distance is final. This greedy property is the heart of Dijkstra's correctness.

Full derivation

Initialize dist[source] = 0, all others INF. Push (0, source) into a min-priority queue.

Main loop: Pop (d, u) from the PQ. If d > dist[u], this is a stale entry -- skip it. Otherwise, for each neighbor v with edge weight w:

  • If dist[u] + w < dist[v], set dist[v] = dist[u] + w and push (dist[v], v).

Why it works: The PQ ensures we always process the globally closest unfinished node. Since all edge weights are non-negative, once a node is popped, no shorter path to it can exist -- every future path through other nodes would be at least as long. This is a greedy algorithm: "always extend the shortest known path."

Complexity: Each node is popped at most once (stale entries are skipped). Each edge causes at most one push. With a binary heap: O((V+E)logV). This is BFS generalized to weighted graphs -- replace the FIFO queue with a priority queue, and "hop count" with "total distance."


Solve With Me -- CF 20C "Dijkstra?"

The problem title is literally "Dijkstra?" -- OK, no guessing what algorithm to use. Given a weighted undirected graph, find the shortest path from vertex 1 to vertex n, and print the actual path (not just the distance).

Straightforward plan: Dijkstra from source 1, track predecessors, reconstruct path at the end. I've written this a hundred times. Let me code it up... priority queue of (distance, node), dist[] array initialized to infinity, prev[] array for path reconstruction.

Submitted. Wrong answer on test 3.

Huh. Let me re-read the constraints. n,m105, weights up to 106. So max distance could be 105×106=1011 which overflows int. Classic trap. Changed dist[] to long long. Still WA.

OK let me re-examine my path reconstruction. I'm doing prev[v] = u when I relax edge (u,v). Then walking backwards from n to 1. But wait -- I'm printing the path in reverse. I need to reverse it. Did I forget to reverse? ...Yes. Yes I did. That's embarrassing. Fix: push nodes onto a vector, then reverse.

Actually there's another subtlety: if vertex n is unreachable, I should print 1. My code was printing an empty path instead. Added the check.

Now AC. Total debugging time: 15 minutes for two trivial bugs. The algorithm itself was never the issue -- it's always the implementation details that get you.

One thing I noticed in other people's solutions: some use visited[] to skip already-processed nodes (the "lazy deletion" approach), others check if dist > d[node] at the top of the loop. Both work. I prefer the distance check because it's one fewer array.

What to recognize next time: On "find shortest path" problems, the path reconstruction is where bugs hide. Always: (1) use long long when max distance can exceed 2×109, (2) remember to reverse the backtracked path, (3) handle the unreachable case. The algorithm is the easy part.

  • DP on DAG -- faster than Dijkstra when the graph is a DAG, runs in O(V+E)
  • Bellman-Ford -- handles negative edge weights where Dijkstra cannot
  • BFS -- equivalent to Dijkstra on unweighted graphs, with simpler O(V+E) implementation
  • Priority Queue and Heaps -- the priority queue is the engine that makes Dijkstra efficient

Recap

Dijkstra's algorithm is greedy BFS with a price tag. Instead of exploring the nearest node by hop count (BFS), it explores the nearest node by total cost. The priority queue replaces BFS's plain queue, and the greedy choice -- "the closest unfinalized node will never get a shorter path" -- is valid because all edge weights are non-negative. Break that assumption (negative edges), and the greedy choice fails.

Key takeaways:

  • Core idea: always finalize the unvisited node with smallest tentative distance.
  • Invariant: once popped from the PQ, a node's distance is optimal.
  • Complexity: O((V + E) log V) with a binary heap.
  • Classic bugs: wrong PQ pair order, missing staleness check, negative weights, int overflow.
  • Alternatives: BFS (unweighted), 0-1 BFS (weights in {0,1}), DP on DAG, Bellman-Ford (negative weights).

"Greedy by distance: closest unvisited node can't get cheaper." That is the invariant. Non-negative weights make it true. Negative weights break it.

Built for the climb from Codeforces 1100 to 2100.