Skip to content

Persistent DSU (Union-Find with Rollback)

Quick Revisit: Rollback-friendly union-find. Drop path compression, keep union-by-rank, push every structural change onto a history stack. find costs O(logn) instead of near-O(1), but every union becomes undoable in O(1). The killer application is offline dynamic connectivity: segment-tree D&C over a query timeline, adding edges on the way down and rolling them back on the way up.

Difficulty: [Advanced]Prerequisites: Union-Find (DSU)CF Rating Range: 2000–2400+


Why You'd Want to Undo a Union

Consider offline dynamic connectivity: edges are added and removed. Regular DSU handles "add edge" but has no "remove edge." If you process queries with divide and conquer on time, each edge is active during a contiguous interval [l,r]. You add edges when entering a time range and undo them when leaving.

text
  Queries timeline:  [-----edge A-----]
                          [---edge B---]
                                [--edge C--]

  D&C on time:
  solve([0, 7]):
    solve([0, 3]):     add A, solve, UNDO A
    solve([4, 7]):     add A, add B
      solve([4, 5]):   solve, UNDO B
      solve([6, 7]):   add C, solve, UNDO C, UNDO B
                       UNDO A

Each undo reverses the most recent union—a stack.

Union-by-Rank Without Path Compression

Path compression flattens the tree, which is destructive. Without it, find() walks to the root in O(logn) (guaranteed by union-by-rank). The parent array stays a real tree, so undoing a union is just a matter of restoring the saved parent and rank.

text
  Before union(2, 5):           After union(2, 5):
                                (root of 5's tree becomes child of 2's root)
    1           4                     1
   / \         / \                   / \
  2   3       5   6                 2   3
                                   /
                                  4
                                 / \
                                5   6

  Saved on stack: (4, old_parent=4, old_rank=1)
  To undo: restore parent[4]=4, rank[4]=1

Implementation

cpp
#include <bits/stdc++.h>
using namespace std;

struct PersistentDSU {
    vector<int> par, rnk;
    struct Snap { int v, old_par, old_rnk; };
    vector<Snap> history;

    PersistentDSU(int n) : par(n), rnk(n, 0) {
        iota(par.begin(), par.end(), 0);
    }

    int find(int x) {
        while (par[x] != x) x = par[x];  // no path compression
        return x;
    }

    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        if (rnk[a] < rnk[b]) swap(a, b);
        history.push_back({b, par[b], rnk[a]});
        par[b] = a;
        if (rnk[a] == rnk[b]) rnk[a]++;
        return true;
    }

    int save() { return history.size(); }

    void rollback(int checkpoint) {
        while ((int)history.size() > checkpoint) {
            auto [v, op, ork] = history.back();
            history.pop_back();
            par[v] = op;
            rnk[find(v)] = ork;  // restore rank of the root
        }
    }
};

save() returns a checkpoint. rollback(checkpoint) undoes all unions after that point.

Offline Dynamic Connectivity

The standard application. You have q queries: "add edge (u,v)", "remove edge (u,v)", "is u connected to v?" Process offline:

  1. For each edge, find its active interval [tadd,tremove).
  2. Build a segment tree over the timeline [0,q). Add each edge to all O(logq) nodes covering its interval.
  3. DFS over the segment tree. When entering a node, unite all its edges. When leaving, roll back. At leaves, answer connectivity queries.
cpp
void solve(int node, int lo, int hi, PersistentDSU& dsu) {
    int checkpoint = dsu.save();
    for (auto [u, v] : edges_at[node])
        dsu.unite(u, v);

    if (lo == hi) {
        // answer query at time lo
        if (query_at[lo].type == ASK) {
            int a = query_at[lo].u, b = query_at[lo].v;
            answers[lo] = (dsu.find(a) == dsu.find(b));
        }
    } else {
        int mid = (lo + hi) / 2;
        solve(2*node, lo, mid, dsu);
        solve(2*node+1, mid+1, hi, dsu);
    }

    dsu.rollback(checkpoint);
}

Total complexity: O((n+q)log2n)—each edge is united and rolled back O(logq) times, and each union or find costs O(logn).

Complexity Comparison

DSU variantFindUniteUndoSpace
Standard (rank + compression)O(α(n))O(α(n))NoO(n)
Persistent (rank only)O(logn)O(logn)O(1)O(n+ops)

The logn slowdown is the price of undo. In practice the constant is small—rollback DSU is fast enough that the overhead rarely shows up in contest profiles.

Pitfalls

  • Don't use path compression. Even partial compression (halving) breaks rollback. Stick to pure union-by-rank.
  • Rollback order matters. You must undo in reverse chronological order (LIFO). The stack handles this naturally, but don't try to undo arbitrary operations out of order.
  • Save the rank correctly. When undoing a union, you must restore the rank of the new root—the one whose rank may have increased. Save that rank before incrementing it.
  • History memory. Each union adds one entry. For O(qlogq) unions in the D&C approach, the stack can grow to O(qlogq) entries.

Practice Problems

#ProblemDifficultyNotes
1CF 1681F—Unique Occurrences2400D&C on tree edges + rollback DSU
2CF 813F—Bipartite Checking2400Online bipartiteness with rollback DSU
3CF 891C—Envy2300MST verification with rollback
4CSES—Dynamic ConnectivityTextbook offline dynamic connectivity

Further Reading

Built for the climb from Codeforces 1100 to 2100.