Functional Graphs

Quick Revisit
USE WHEN: Each node has exactly one successor and you need $k$ -th successor queries or cycle detection (permutation orbits, iterated maps)
INVARIANT: Binary lifting table: $\text{up}[v][j] = $ node reached from $v$ after $2^{j}$ steps; any $k$ -step query decomposes into $O (\log k)$ hops
TIME: O(n log k) build, O(log k) per query | SPACE: O(n log k)
CLASSIC BUG: Off-by-one at tail-cycle boundary in Floyd's algorithm, or forgetting that cycle entry ≠ first repeated node
PRACTICE: 05-ladder-graphs

AL-41 | A graph where every node has out-degree 1 -- each node points to exactly one successor. The structure decomposes into tails feeding into disjoint cycles (rho shapes). Key technique: binary lifting to answer " $k$ -th successor" queries in $O (\log k)$ .

Difficulty: [Intermediate]Prerequisites: DFS, Binary Lifting / LCACF Rating Range: 1500 -- 1900+

Intuition
C++ STL Reference
Implementation (Contest-Ready)
Operations Reference
Problem Patterns & Tricks
Contest Cheat Sheet
Gotchas & Debugging
Self-Test
Practice Problems
Mastery Corner
Further Reading

Intuition

The Pain

You have 8 nodes ( $0$ through $7$ ). Each points to exactly one successor:

Node $v$	0	1	2	3	4	5	6	7
$n x t [v]$	1	2	3	4	5	3	1	6

Question: Where does node $0$ end up after $k = 10^{9}$ steps?

Naive approach -- simulate one step at a time:

v = 0
for i in 1..k:
    v = nxt[v]     // 10^9 iterations!

For small $k$ , trace by hand:

0 \to 1 \to 2 \to 3 \to 4 \to 5 \to 3 \to 4 \to 5 \to 3 \to 4 \to 5 \to \dots

Notice: once we hit node 3, we cycle through $3 \to 4 \to 5 \to 3$ forever. But the naive code doesn't exploit this -- it repeats the same cycle a billion times.

For $k = 10^{9}$ : $10^{9}$ operations. Way too slow.

We're doing the same jumps over and over. There has to be a better way.

The Key Insight

"Every functional graph decomposes into tails leading into cycles -- and binary lifting (precomputing jumps of size $1, 2, 4, 8, \dots, 2^{29}$ ) answers 'where am I after $k$ steps?' in $O (\log k)$ ."

Think of express trains. Instead of stopping at every station (1 step at a time), you have express routes: a 1-stop, 2-stop, 4-stop, 8-stop express. To travel 11 stations, take the 8-express, then the 2-express, then 1 regular stop -- three hops instead of eleven.

Binary lifting precomputes these "express routes" from every station. Any integer $k$ decomposes into at most $⌈ \log_{2} k ⌉$ powers of two, so the query uses at most 30 hops for $k \leq 10^{9}$ .

Visual Walkthrough

The graph (rho shape):

     7 --> 6 --+
               v
     0 ------> 1 --> 2 --> 3 --> 4
                            ^     |
                            |     v
                            +<--- 5

     |---- tail ----|---- cycle ----|
       (0, 1, 2)      (3, 4, 5)
                      length = 3

Every node eventually reaches the cycle $3 \to 4 \to 5 \to 3$ . The "rho" name comes from the shape: a tail with a loop at the end, like the Greek letter $ρ$ .

Multi-component rho shapes: A functional graph on $n$ nodes can have several independent components, each with its own tail(s) and cycle. Here is an example with 14 nodes split across three components:

  Component A                Component B           Component C
  (tail=3, cycle=3)         (tail=0, cycle=2)     (tail=2, cycle=1)

  0 --> 1 --> 2 --> 3        7 <---+               11 --> 12 --> 13
                    |        |     |                           |
                    v        v     |                           v
                    4        8 ----+                          13
                    |                                   (self-loop)
                    v
                    5
                    |
                    +---> 3  (back to cycle entry)
                   ^
                   |
                   6  (another tail feeding in)

  tail nodes: {0,1,2,6}     tail nodes: {}         tail nodes: {11,12}
  cycle: 3->4->5->3         cycle: 7->8->7         cycle: 13->13
  cycle length: 3            cycle length: 2        cycle length: 1

  Labels:
    "tail"        = nodes with in-path leading TO the cycle
    "cycle entry" = first node where the tail meets the cycle
    "cycle length"= number of distinct nodes on the cycle

Key observations:

Component B is a pure cycle (no tail) -- this happens in permutation graphs.
Component C has a self-loop -- a cycle of length 1.
Component A has two tails (starting at 0 and at 6) feeding into the same cycle.
Total edges = 14 = $n$ (always, since every node has exactly one outgoing edge).

Step 1 -- Detect the cycle (Floyd's tortoise and hare, from node 0):

Two pointers start at node 0. Slow moves 1 step; fast moves 2 steps.

           slow  fast
  Init:      0     0
  Move 1:    1     2    (slow +1, fast +2)
  Move 2:    2     4
  Move 3:    3     3    <-- meet!

They meet at node 3. Now find the cycle entry -- reset slow to start, advance both by 1:

           slow  fast
  Reset:     0     3
  Move 1:    1     4    (both +1)
  Move 2:    2     5
  Move 3:    3     3    <-- cycle entry = node 3

Cycle length: walk from entry until return. $3 \to 4 \to 5 \to 3$ . Length $= 3$ .

Step 2 -- Build the binary lifting table:

$l i f t [v] [j]$ = node reached from $v$ after $2^{j}$ steps.

Base row: $l i f t [v] [0] = n x t [v]$ . Each next row: $l i f t [v] [j] = l i f t [l i f t [v] [j - 1]] [j - 1]$ .

$v$	0	1	2	3	4	5	6	7
$l i f t [v] [0]$ ( $2^{0} = 1$ step)	1	2	3	4	5	3	1	6
$l i f t [v] [1]$ ( $2^{1} = 2$ steps)	2	3	4	5	3	4	2	1
$l i f t [v] [2]$ ( $2^{2} = 4$ steps)	4	5	3	4	5	3	4	3
$l i f t [v] [3]$ ( $2^{3} = 8$ steps)	5	3	4	5	3	4	5	4

Example check: $l i f t [0] [2] = 4$ . Verify: $0 \to 1 \to 2 \to 3 \to 4$ . Four steps from 0 lands on 4. Correct.

Built as: $l i f t [0] [2] = l i f t [l i f t [0] [1]] [1] = l i f t [2] [1] = 4$ . Jump 2 steps from 0 (reach 2), then 2 more from 2 (reach 4).

Step 3 -- Answer the query: node 0 after $k = 11$ steps

$11 = 1011_{2} = 2^{3} + 2^{1} + 2^{0}$

  Start: v = 0

  Bit 0 set (2^0 = 1):  v = lift[0][0] = 1
  Bit 1 set (2^1 = 2):  v = lift[1][1] = 3
  Bit 2 not set          (skip)
  Bit 3 set (2^3 = 8):  v = lift[3][3] = 5

  Answer: node 5

Verify: $0 \to 1 \to 2 \to 3 \to 4 \to 5 \to 3 \to 4 \to 5 \to 3 \to 4 \to 5$ . After 11 steps: node 5. Correct.

Brute force: 11 steps (or $10^{9}$ for the original query). Binary lifting: 3 table lookups. For $k = 10^{9}$ : at most 30 lookups.

The Invariant

+------------------------------------------------------------------+
| INVARIANT: lift[v][j] = the node reached from v after 2^j steps. |
|                                                                   |
| Built bottom-up:                                                  |
|   lift[v][0] = nxt[v]                                            |
|   lift[v][j] = lift[ lift[v][j-1] ][j-1]                        |
|                                                                   |
| "To jump 2^j steps: jump 2^(j-1), then jump 2^(j-1) again."    |
+------------------------------------------------------------------+

Look at the table: $l i f t [7] [3] = 4$ . Node 7 after 8 steps reaches node 4. Computed as $l i f t [l i f t [7] [2]] [2] = l i f t [3] [2] = 4$ . First jump 4 steps from 7 (reach 3), then 4 more from 3 (reach 4).

The invariant holds because $2^{j} = 2^{j - 1} + 2^{j - 1}$ exactly. At Step 3 in the walkthrough above, the query decomposes $k = 11$ into $8 + 2 + 1$ and chains three table lookups -- each one relies on the invariant holding for the current node.

What the Code Won't Teach You

The rho shape is universal -- every functional graph component looks like ρ.

Once you internalize that every component is a tail feeding into a cycle, all functional graph algorithms become obvious. The tail is a simple path (no branching forward). The cycle is the attractor. From any node, follow edges and you must reach the cycle (finite graph, one outgoing edge). This is why cycle detection is $O (n)$ and not harder.

  Every component of a functional graph:

      tail nodes              cycle nodes
    a -> b -> c -> [d -> e -> f -> d]
    |              |                |
    one-way path   one-way loop     back to start
    toward cycle   (length L)

  Key consequences:
    1. Number of edges = n  (one per node, always)
    2. Number of cycles per component = exactly 1
    3. All nodes eventually reach the cycle
    4. k-th step query: O(log k) via binary lifting,
       or O(1) if already on cycle (mod L)

Binary lifting is "express trains" for pointer chasing.

The code builds a table, but the insight is about doubling. Instead of following one edge at a time ( $O (k)$ ), you precompute "where do I land after $2^{0}, 2^{1}, 2^{2}, \dots$ steps?" Since any integer $k$ is a sum of distinct powers of two, you chain at most $⌈ \log_{2} k ⌉$ table lookups. For $k = 10^{18}$ , that is 60 lookups instead of $10^{18}$ steps.

  Naive:  0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 3 -> 4 -> ...
          |    |    |    |    |    |    |    |
          step 1    2    3    4    5    6    7  ... (10^9 steps!)

  Binary lifting:
          0 --[2^0]--> 1 --[2^1]--> 3 --[2^3]--> 5
          |            |            |
          1 step       2 steps      8 steps
          Total: 1 + 2 + 8 = 11 steps in just 3 lookups

Floyd's tortoise-and-hare has three phases, not one.

Many people remember "slow pointer, fast pointer, they meet." That is only Phase 1. Phase 2 (find cycle entry) and Phase 3 (measure cycle length) are equally important and easy to forget.

Cycle entry ≠ first repeated node. This is the single most common mistake in Floyd's algorithm. Run on nxt = [1, 2, 3, 4, 5, 3, 1, 6] starting at node 0:

Phase	Action	tortoise	hare	Notes
1	start the chase	1	2	one and two steps from `start = 0`
1	step	2	4
1	step	3	3	meet — but this is inside the cycle, not the entry
2	reset tortoise to `start`, advance both by 1	tortoise=1, hare=4
2	step	2	5
2	step	3	3	meet again — this is the cycle entry
3	walk from entry until return	4, 5, 3		length = 3

Notice: Phase 1's meeting node (3 in this trace) happens to equal the cycle entry only because the tail is short. In general the meeting node is some arbitrary point inside the cycle, and Phase 2 is what finds the actual entry. The mathematical reason: if the tail has length $t$ and the meeting point is $x$ steps past the entry inside a cycle of length $L$ , then $t + x \equiv 0 (\mod L)$ , so walking $t$ more steps from either start (the new tortoise position) or the meeting point lands on the entry. Skipping Phase 2 returns the meeting node, not the entry — silently wrong on roughly half of inputs.

When to Reach for This

Trigger phrases:

"each node has exactly one outgoing edge" $\to$ functional graph
"follow the function $k$ times" / " $k$ -th successor" $\to$ binary lifting
" $k$ up to $10^{9}$ or $10^{18}$ " with successor queries $\to$ must be $O (\log k)$
"permutation cycles" $\to$ permutations are bijective functional graphs
"eventually periodic sequence" / "rho-shaped structure" $\to$ cycle detection

Counter-examples:

Looks like...	But actually...	Use instead
One outgoing edge, but need shortest distance between two nodes	Not $k$ -th successor -- it is a path problem	Decompose into tail + cycle coordinates
"Follow function $k$ times" but the function changes between queries	Dynamic successor, lifting table invalidated	Rebuild per update or use link-cut trees
"Permutation" but need to compose two different permutations	Composition, not single-successor jumping	Multiply permutation arrays directly

C++ STL Reference

Function / Class	Header	What it does	Time
`vector<array<int,LOG>>`	`<vector>`, `<array>`	2D lifting table; `array` for fixed second dim	$O (1)$ access
`__lg(k)`	built-in	$⌊ \log_{2} (k) ⌋$ ; highest set bit position	$O (1)$
`bit_width(unsigned)`	`<bit>`	C++20: number of bits to represent value	$O (1)$
`vector<int>`	`<vector>`	Successor array, color flags, cycle membership	$O (1)$ access
`tuple<int,int,int>`	`<tuple>`	Return (entry, tail_len, cycle_len) from Floyd's	--

Implementation (Contest-Ready)

Version A -- Minimal Contest Template

Binary lifting for $k$ -th successor + DFS-based cycle decomposition.

cpp

#include <bits/stdc++.h>
using namespace std;

const int LOG = 30; // 2^30 > 10^9; use 60 for k <= 10^18

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    cin >> n;
    vector<int> nxt(n);
    for (int i = 0; i < n; i++) cin >> nxt[i];

    // --- Binary lifting ---
    vector<array<int, LOG>> lift(n);
    for (int v = 0; v < n; v++) lift[v][0] = nxt[v];
    for (int j = 1; j < LOG; j++)
        for (int v = 0; v < n; v++)
            lift[v][j] = lift[lift[v][j - 1]][j - 1];

    auto query = [&](int v, long long k) -> int {
        for (int j = 0; j < LOG; j++)
            if ((k >> j) & 1)
                v = lift[v][j];
        return v;
    };

    // --- Cycle decomposition (DFS coloring) ---
    vector<int> color(n, 0);
    vector<bool> on_cycle(n, false);
    for (int i = 0; i < n; i++) {
        if (color[i]) continue;
        vector<int> path;
        int v = i;
        while (!color[v]) {
            color[v] = 1;
            path.push_back(v);
            v = nxt[v];
        }
        if (color[v] == 1) {
            int u = v;
            do { on_cycle[u] = true; u = nxt[u]; } while (u != v);
        }
        for (int u : path) color[u] = 2;
    }

    int q;
    cin >> q;
    while (q--) {
        int v; long long k;
        cin >> v >> k;
        cout << query(v, k) << "\n";
    }
}

Version B -- Explained Version

Same logic with detailed comments, plus Floyd's cycle detection utility.

cpp

#include <bits/stdc++.h>
using namespace std;

// LOG = 30 handles k up to ~10^9.
// For k up to 10^18, set LOG = 60 (memory: n * 60 ints).
const int LOG = 30;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    cin >> n;
    vector<int> nxt(n);
    for (int i = 0; i < n; i++) cin >> nxt[i];

    // ---- Binary Lifting Table ----
    // lift[v][j] = the node you reach after 2^j steps from v.
    vector<array<int, LOG>> lift(n);

    // Base case: one step = direct successor.
    for (int v = 0; v < n; v++)
        lift[v][0] = nxt[v];

    // Double the jump: to go 2^j steps, go 2^(j-1) then 2^(j-1) again.
    for (int j = 1; j < LOG; j++)
        for (int v = 0; v < n; v++)
            lift[v][j] = lift[lift[v][j - 1]][j - 1];

    // Decompose k into binary; chain set-bit jumps.
    auto query = [&](int v, long long k) -> int {
        for (int j = 0; j < LOG; j++)
            if ((k >> j) & 1)
                v = lift[v][j];
        return v;
    };

    // ---- Cycle Decomposition (DFS Coloring) ----
    // Each weakly-connected component of a functional graph has exactly
    // one cycle. Nodes either sit ON a cycle or on a "tail" leading in.
    //
    // Trace paths. If we revisit a gray (in-progress) node, everything
    // from that node back to itself forms the cycle.

    vector<int> color(n, 0); // 0=white, 1=gray, 2=black
    vector<bool> on_cycle(n, false);
    vector<int> cycle_id(n, -1);
    int num_cycles = 0;

    for (int i = 0; i < n; i++) {
        if (color[i]) continue;
        vector<int> path;
        int v = i;
        while (!color[v]) {
            color[v] = 1;
            path.push_back(v);
            v = nxt[v];
        }
        if (color[v] == 1) {
            // Found a new cycle: mark every node from v around the loop.
            int cid = num_cycles++;
            int u = v;
            do {
                on_cycle[u] = true;
                cycle_id[u] = cid;
                u = nxt[u];
            } while (u != v);
        }
        for (int u : path) color[u] = 2;
    }

    // Tail distance: how many steps from v to its cycle entry.
    vector<int> tail_dist(n, 0);
    for (int v = 0; v < n; v++) {
        if (on_cycle[v]) { cycle_id[v] = cycle_id[v]; continue; }
        int d = 0, u = v;
        while (!on_cycle[u]) { u = nxt[u]; d++; }
        tail_dist[v] = d;
        cycle_id[v] = cycle_id[u];
    }

    // ---- Floyd's Cycle Detection (single start, O(1) extra space) ----
    // Use when the graph is implicit (e.g., f(x) = x^2 + c mod p)
    // and you cannot store the full successor array.
    auto floyd = [&](int start) -> tuple<int, int, int> {
        // Phase 1: tortoise (slow, +1) and hare (fast, +2) meet inside cycle.
        int slow = nxt[start], fast = nxt[nxt[start]];
        while (slow != fast) {
            slow = nxt[slow];
            fast = nxt[nxt[fast]];
        }
        // Phase 2: find cycle entry.
        slow = start;
        int tail_len = 0;
        while (slow != fast) {
            slow = nxt[slow];
            fast = nxt[fast];
            tail_len++;
        }
        int entry = slow;
        // Phase 3: measure cycle length.
        int cyc_len = 1;
        int cur = nxt[entry];
        while (cur != entry) { cur = nxt[cur]; cyc_len++; }
        return {entry, tail_len, cyc_len};
    };

    // Example: auto [entry, tlen, clen] = floyd(0);

    int q;
    cin >> q;
    while (q--) {
        int v; long long k;
        cin >> v >> k;
        cout << query(v, k) << "\n";
    }
}

Version C -- Standalone Floyd's Cycle Detection

A self-contained function for Floyd's tortoise-and-hare algorithm. Use this when the graph is implicit (e.g., $f (x) = x^{2} + c mod p$ ) and you cannot afford to store the full successor array. All three phases are separated and commented.

The rho (ρ) shape detected by Floyd's algorithm:

  start
    |
    v
    a --> b --> c --> d       <-- "tail" (length = mu)
                     |
                     v
              +---> [e] --> f --> g --> h
              |      ^                  |
              |      |  cycle entry     |
              |      +<--- i <--- j <---+
              |
              +---- cycle (length = lambda)

  Phase 1: slow (+1) and fast (+2) MEET somewhere inside the cycle.
  Phase 2: find the cycle ENTRY node (the node where tail meets cycle).
  Phase 3: measure the cycle LENGTH by walking from entry back to entry.

Step-by-step trace on the example graph (nxt = {1,2,3,4,5,3,1,6}):

Starting from node 0:

  Phase 1 -- Find meeting point:
  +------+------+------+
  | Step | slow | fast |
  +------+------+------+
  |  0   |  1   |  2   |  slow=nxt[0]=1, fast=nxt[nxt[0]]=nxt[1]=2
  |  1   |  2   |  4   |  slow=nxt[1]=2, fast=nxt[nxt[2]]=nxt[3]=4
  |  2   |  3   |  3   |  slow=nxt[2]=3, fast=nxt[nxt[4]]=nxt[5]=3
  +------+------+------+
                          slow == fast == 3  --> MEET

  Phase 2 -- Find cycle entry:
  Reset slow to start (0), keep fast at meeting point (3).
  Both advance by +1:
  +------+------+------+
  | Step | slow | fast |
  +------+------+------+
  |  0   |  0   |  3   |
  |  1   |  1   |  4   |  slow=nxt[0]=1, fast=nxt[3]=4
  |  2   |  2   |  5   |  slow=nxt[1]=2, fast=nxt[4]=5
  |  3   |  3   |  3   |  slow=nxt[2]=3, fast=nxt[5]=3
  +------+------+------+
                          slow == fast == 3  --> ENTRY = 3
                          tail_length = 3 (steps taken)

  Phase 3 -- Measure cycle length:
  Walk from entry until return:
  +------+------+
  | Step | node |
  +------+------+
  |  0   |  4   |  start from nxt[entry]=nxt[3]=4
  |  1   |  5   |  nxt[4]=5
  |  2   |  3   |  nxt[5]=3  --> back at entry!
  +------+------+
                    cycle_length = 3

Complete compilable code:

cpp

#include <bits/stdc++.h>
using namespace std;

// Floyd's Cycle Detection -- standalone function.
// Works with any successor function f: domain -> domain.
// Returns {cycle_entry, tail_length, cycle_length}.
// Time: O(tail_length + cycle_length), Space: O(1).
template <typename T, typename F>
tuple<T, int, int> floyd_cycle(T start, F f) {
    // Phase 1: find meeting point inside the cycle.
    // slow moves +1, fast moves +2. They must meet inside the cycle.
    T slow = f(start), fast = f(f(start));
    while (slow != fast) {
        slow = f(slow);
        fast = f(f(fast));
    }

    // Phase 2: find cycle entry.
    // Reset slow to start. Advance both by +1 until they meet.
    // Mathematical proof: they meet exactly at the cycle entry.
    slow = start;
    int tail_len = 0;
    while (slow != fast) {
        slow = f(slow);
        fast = f(fast);
        tail_len++;
    }
    T entry = slow;

    // Phase 3: measure cycle length.
    // Walk from entry until we return to entry.
    int cyc_len = 1;
    T cur = f(entry);
    while (cur != entry) {
        cur = f(cur);
        cyc_len++;
    }

    return {entry, tail_len, cyc_len};
}

int main() {
    // Example: functional graph with 8 nodes.
    vector<int> nxt = {1, 2, 3, 4, 5, 3, 1, 6};

    auto [entry, tail, cycle] = floyd_cycle(0, [&](int v) {
        return nxt[v];
    });

    cout << "Cycle entry: " << entry << "\n";   // 3
    cout << "Tail length: " << tail << "\n";     // 3
    cout << "Cycle length: " << cycle << "\n";   // 3

    // Works with implicit functions too (no array needed):
    long long c = 1, p = 1000000007;
    auto [e2, t2, c2] = floyd_cycle(2LL, [&](long long x) {
        return ((__int128)x * x + c) % p;
    });
    cout << "Implicit function cycle length: " << c2 << "\n";
}

Note on Floyd's vs DFS coloring: Floyd's detects the cycle from a single start node in $O (1)$ space. If you need cycle info for all $n$ nodes, use the DFS coloring approach from Version A/B instead -- it runs in $O (n)$ total. See also the DFS traversal patterns file for background on the gray/black coloring technique.

Operations Reference

Operation	Time	Space
Build lifting table	$O (n \log k)$	$O (n \log k)$
$k$ -th successor query	$O (\log k)$	$O (1)$
DFS cycle decomposition (all nodes)	$O (n)$	$O (n)$
Tail distance (all nodes)	$O (n)$	$O (n)$
Floyd's cycle detection (one start)	$O (λ + μ)$	$O (1)$

$n$ = nodes, $k$ = steps, $λ$ = tail length, $μ$ = cycle length.

Problem Patterns & Tricks

Pattern 1: k-th Successor Queries

The bread-and-butter application. Given a functional graph and queries $(v, k)$ : "where is $v$ after $k$ steps?" Build the lifting table, answer each query in $O (\log k)$ .

cpp

// After building lift[][]:
auto ans = query(v, k); // 3 lines of code, see template above

Examples: CSES -- Planets Queries I, AtCoder ABC 167D -- Teleporter

Pattern 2: Permutation Cycle Decomposition

A permutation of ${0, \dots, n - 1}$ is a bijective functional graph -- every node has in-degree 1 AND out-degree 1, so the entire graph is a union of disjoint cycles (no tails). To apply a permutation $k$ times: for each cycle of length $L$ , each element shifts by $k mod L$ positions.

cpp

vector<bool> vis(n, false);
vector<int> result(n);
for (int i = 0; i < n; i++) {
    if (vis[i]) continue;
    vector<int> cyc;
    for (int v = i; !vis[v]; v = p[v]) {
        vis[v] = true;
        cyc.push_back(v);
    }
    int L = cyc.size();
    for (int j = 0; j < L; j++)
        result[cyc[j]] = a[cyc[(j + k) % L]];
}

Examples: CF 1327D -- Infinite Path, CF 1551D2 -- Domino (hard version)

Pattern 3: Floyd's for Implicit Functions

When the successor function is computed on-the-fly (e.g., $f (x) = x^{2} + c (\mod p)$ in Pollard's rho), storing the full graph is impossible. Floyd's tortoise-and-hare finds the cycle entry and length using $O (1)$ extra memory.

cpp

auto f = [](long long x) { return ((__int128)x * x + c) % p; };
long long slow = f(x0), fast = f(f(x0));
while (slow != fast) { slow = f(slow); fast = f(f(fast)); }
// ... find entry and length as in Version B above

Examples: Pollard's rho integer factorization, CF 1020B -- Badge (simpler variant)

Pattern 4: Path Aggregation with Binary Lifting

Store extra information alongside each jump: the minimum edge, maximum value, or XOR sum over the $2^{j}$ -step path. Same doubling trick.

Comparison to binary lifting on trees:

On trees (see LCA with Binary Lifting), lifting moves toward the root and terminates. Path aggregation between $u$ and $v$ splits into $u \to LCA$ and $v \to LCA$ . On functional graphs, there is no root -- cycle nodes loop forever. Path aggregation is simpler: you always walk forward from a single start node for $k$ steps. No LCA computation is needed; just decompose $k$ into powers of two and merge.

  Tree lifting:        u ---> ... ---> LCA <--- ... <--- v
                        (need to find LCA, merge two paths)

  Functional graph:    u ---> ... ---> (k steps forward)
                        (single direction, no LCA needed)

Full compilable example -- minimum value over $k$ steps:

cpp

#include <bits/stdc++.h>
using namespace std;

const int LOG = 30;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    cin >> n;
    vector<int> nxt(n);
    vector<long long> w(n);  // weight/value at each node
    for (int i = 0; i < n; i++) cin >> nxt[i];
    for (int i = 0; i < n; i++) cin >> w[i];

    // Each entry stores: destination after 2^j steps,
    // and the minimum w[] value along those 2^j nodes.
    struct Entry { int dest; long long mn; };
    vector<array<Entry, LOG>> lift(n);

    // Base: one step from v visits nxt[v], min value is w[v].
    for (int v = 0; v < n; v++)
        lift[v][0] = {nxt[v], w[v]};

    // Build: merge two halves of size 2^(j-1).
    for (int j = 1; j < LOG; j++)
        for (int v = 0; v < n; v++) {
            auto [mid, mn1] = lift[v][j - 1];
            auto [dst, mn2] = lift[mid][j - 1];
            lift[v][j] = {dst, min(mn1, mn2)};
        }

    // Query: minimum value over k steps starting from v.
    auto query = [&](int v, long long k) -> pair<int, long long> {
        long long ans = LLONG_MAX;
        for (int j = 0; j < LOG; j++) {
            if ((k >> j) & 1) {
                ans = min(ans, lift[v][j].mn);
                v = lift[v][j].dest;
            }
        }
        return {v, ans};
    };

    int q;
    cin >> q;
    while (q--) {
        int v; long long k;
        cin >> v >> k;
        auto [dest, mn] = query(v, k);
        cout << dest << " " << mn << "\n";
    }
}

Adaptations: Replace min with max, +, ^ (XOR), or gcd for other aggregations. The merge function must be associative. For non-commutative operations (e.g., matrix multiplication), the order of the two halves matters -- always merge left-half then right-half.

Examples: CSES -- Planets Queries II (distance variant), weighted functional graph problems on CF

Pattern 5: Cycle Structure for Every Node

Find which cycle each node reaches, the tail distance, and the cycle length. The DFS coloring approach runs in $O (n)$ -- see Version B. Useful for questions like "do nodes $u$ and $v$ converge?" (same cycle?) or "after how many steps does node $v$ first revisit a state?" (tail + cycle length).

Examples: CSES -- Planets Cycles, CF 1020B -- Badge

Pattern 6: Functional Graph on Modular Arithmetic

Many number-theory problems define $f (x) = g (x) mod m$ for some function $g$ , creating a functional graph on ${0, \dots, m - 1}$ . Since the domain is finite, trajectories are eventually periodic. Binary lifting or Floyd's lets you handle huge iteration counts.

Examples: CF 1146C -- Tree Diameter (iterated function variant), various ad-hoc modular iteration problems on CF

Contest Cheat Sheet

+----------------------------------------------------------------+
|            FUNCTIONAL GRAPHS -- CHEAT SHEET                    |
+----------------------------------------------------------------+
| WHEN TO USE:                                                   |
|   - Each node has exactly one outgoing edge (out-degree 1)     |
|   - "Where is node v after k steps?"  (k can be huge)         |
|   - Permutation cycle decomposition                            |
|   - Implicit function cycle detection (Pollard's rho, etc.)   |
+----------------------------------------------------------------+
| BINARY LIFTING:                                                |
|   lift[v][0] = nxt[v]                                         |
|   lift[v][j] = lift[lift[v][j-1]][j-1]                        |
|   query(v,k): for each set bit j in k, v = lift[v][j]        |
+----------------------------------------------------------------+
| CYCLE DETECTION (DFS):                                         |
|   Color nodes: 0=white, 1=gray, 2=black                      |
|   Trace path; if you hit a gray node => cycle found           |
+----------------------------------------------------------------+
| COMPLEXITY:                                                    |
|   Build:  O(n log k) time and space                           |
|   Query:  O(log k) per query                                  |
|   Cycles: O(n) full decomposition                             |
+----------------------------------------------------------------+
| WATCH OUT:                                                     |
|   - LOG = 30 for k <= 10^9,  LOG = 60 for k <= 10^18         |
|   - 0-indexed vs 1-indexed nodes                              |
|   - Memory: n * LOG ints  (200K * 60 * 4B = 48MB -- tight!)  |
|   - Floyd's needs start != nxt[start] to avoid trivial loop   |
+----------------------------------------------------------------+

Gotchas & Debugging

LOG too small. If $k$ can reach $10^{18}$ , you need LOG = 60, not 30. With LOG = 30 the high bits of $k$ are silently ignored, giving wrong answers on large $k$ only. Symptom: passes small tests, WA on large.
Memory with LOG = 60. The table is $n \times 60$ ints. For $n = 2 \times 10^{5}$ that is $\approx 48$ MB -- within typical 256 MB limits but tight. If $n$ is larger, consider answering queries offline or using cycle decomposition + modular arithmetic instead of lifting.
0-indexed vs 1-indexed. If nodes are 1-indexed, make sure the lifting table is sized to $n + 1$ and loops start at 1. Off-by-one here causes out-of-bounds reads that may not crash but return garbage.
Self-loops. A node with $n x t [v] = v$ is a valid functional graph (a cycle of length 1). Floyd's algorithm handles this, but make sure your DFS coloring doesn't special-case it incorrectly.
Multiple components. A functional graph on $n$ nodes can have several disjoint components, each with its own cycle. Don't assume a single cycle. The DFS coloring approach naturally handles all components.
Floyd's starting point. Floyd's algorithm detects the cycle reachable from a given start node. If you need cycle info for ALL nodes, use DFS coloring ( $O (n)$ ), not Floyd's from each node ( $O (n \cdot (λ + μ))$ worst case).
Confusing successor direction. In a functional graph, edges go $v \to n x t [v]$ . The "tree" part is the tails, and they point toward the cycle. If you need to go backward (find predecessors), build a reverse adjacency list -- each cycle node can have multiple predecessors.
Debugging tip: brute force. For small $n$ and $k$ ( $\leq 1000$ ), simulate naively and compare with binary lifting output. Stress test with random functional graphs (random nxt[] arrays).

Mental Traps

Trap 1: "I found where the pointers meet -- that's the cycle entry."

No. The meeting point is inside the cycle, but it is usually not the entry. Phase 2 is needed to find the actual entry:

  Phase 1: Tortoise (+1) and Hare (+2) meet INSIDE the cycle.

    tail (length t)          cycle (length L)
  S -> a -> b -> [c -> d -> e -> c]
                       ^
                       |
                  Meeting point (not necessarily c!)

  Phase 2: Reset tortoise to S. Advance both +1.
           They meet at the cycle ENTRY (node c).

  Why: Let M = meeting point. Tortoise traveled t + x steps
       (x = distance from entry to M inside cycle).
       Hare traveled 2(t + x) = t + x + kL for some k.
       So t + x = kL, meaning t = kL - x.
       Walking t more steps from M: (x + t) = x + kL - x = kL
       -> lands back at entry. Tortoise also walks t steps
       from S -> arrives at entry. They meet at entry.

  Phase 3: From entry, walk +1 until you return. Count = L.

Trap 2: "Binary lifting works the same as on trees."

On a tree, every node has a unique ancestor path to the root. In a functional graph, tail nodes have a unique successor path, but cycle nodes loop forever. If you apply tree-style binary lifting without considering cycles, you get correct answers for $k$ -th successor queries -- the table naturally handles the wrap-around. But if you need to compute distances between two nodes, you need cycle-aware logic:

  Tree: every node has finite depth.
        lift[v][j] always moves toward root.
        Eventually lift[v][j] = root for large j.

  Functional graph: cycle nodes have infinite depth.
        lift[v][j] wraps around the cycle.
        lift[v][j] is well-defined for all j.

  Distance between u and v in a functional graph:
  +------+------+-----------------------------------------+
  | u    | v    | How to compute distance                 |
  +------+------+-----------------------------------------+
  | tail | tail | Simple subtraction of depths            |
  | cycle| cycle| min of two arc distances (mod L)        |
  | tail | cycle| tail distance + cycle arc               |
  | diff | comp | Unreachable (distance = -1)             |
  +------+------+-----------------------------------------+

  Tree binary lifting doesn't handle the cycle cases.

Self-Test

Answer without looking at the code above.

[ ] Rho structure. Draw the rho shape and explain why every component of a functional graph has exactly one cycle.
[ ] Binary lifting build. Write the two nested loops that build the lifting table. What is lift[v][0]? What is the recurrence for lift[v][j]?
[ ] Query decomposition. Given $k = 13$ (binary: $1101$ ), trace which table entries you look up to answer "where is node $v$ after 13 steps?"
[ ] Floyd's three phases. From memory, describe what each phase of Floyd's tortoise-and-hare algorithm computes and what it returns.
[ ] LOG sizing. If $k$ can be up to $10^{18}$ , what should LOG be set to? What is the memory cost for $n = 2 \times 10^{5}$ ?

Practice Problems

#	Problem	Source	Difficulty	Key Idea
1	Planets Queries I	CSES	1500	Direct binary lifting; $k$ -th successor
2	Teleporter	AtCoder ABC 167D	1500	Single $k$ -th successor query, $k \leq 10^{18}$
3	Badge	CF 1020B	1600	Follow from each node until a revisit; find cycle entry
4	Planets Cycles	CSES	1700	Cycle length + tail distance for every node
5	Planets Queries II	CSES	1800	Distance between two nodes in a functional graph
6	Infinite Path	CF 1327D	1900	Permutation cycles; enumerate divisors of cycle length
7	Museums Tour	CF 1137C	2100	Functional-graph structure on SCC-condensed graph

12. Mastery Corner

💡 The Aha Moment: In a finite functional graph, every path MUST eventually revisit a node -- because there are only n nodes but infinite steps. That revisit creates the cycle. The "rho" shape isn't a special case; it's the ONLY possible shape.

🏠 Real-Life Analogy: Like following "next" links on a chain of web pages where each page links to exactly one other page -- you'll always end up going in circles eventually.

If I Had 5 Minutes

Functional graph = every node has exactly one outgoing edge (out-degree 1)
Structure: tails feeding into disjoint cycles (rho/ρ shape)
Binary lifting: lift[v][j] = lift[lift[v][j-1]][j-1], query in O(log k)
Floyd's tortoise-hare: O(1) space cycle detection (3 phases: meet, entry, length)
Permutation = bijective functional graph (all cycles, no tails)

Pattern Fingerprint

Constraint	Technique
"Each node has one successor, find k-th position"	Binary lifting, O(n log k) build
"Detect cycle in sequence f(x), f(f(x)), ..."	Floyd's tortoise and hare, O(1) space
"Apply permutation k times"	Cycle decomposition, k mod cycle_length
"All nodes: distance to cycle / cycle membership"	DFS coloring in O(n)

Rating Progression

CF Rating	What You Should Handle
1200	Simulate following edges, detect simple cycles
1500	Binary lifting for k-th successor queries
1800	Floyd's cycle detection, permutation cycle decomposition
2100	Path aggregation with lifting, distance between nodes on functional graphs

Before You Code Checklist

[ ] Is LOG large enough? (30 for k<=10^9, 60 for k<=10^18)
[ ] Is memory OK? (n x LOG x 4 bytes -- check against 256MB limit)
[ ] Are nodes 0-indexed or 1-indexed? (Lifting table size must match)
[ ] Does the problem need cycle info for ALL nodes (use DFS) or just one start (use Floyd's)?
[ ] For permutations: using k mod cycle_length for each cycle?

What Would You Do If...?

k is up to 10^18 but n is only 200,000? -- Use LOG=60. Memory = 200000 x 60 x 4B = 48MB, fits in 256MB. Each query is 60 lookups.
You can't store the successor array (implicit function like f(x) = x²+c mod p)? -- Use Floyd's tortoise-and-hare for O(1) space cycle detection. Can't use binary lifting without the array.
Two nodes start on different components -- can they ever meet? -- No. Each component has its own cycle. Nodes on different components never meet. Check cycle_id equality.

Historical Origin

Floyd's cycle detection algorithm (published 1967, attributed to Robert W. Floyd) uses the "tortoise and hare" metaphor -- a slow pointer and a fast pointer that must meet inside any cycle. The algorithm is central to Pollard's rho factoring method (1975), which finds prime factors of large numbers by detecting cycles in the sequence x -> x² + c mod n.

The Mistake That Teaches

Bug: "My binary lifting gives correct answers for small k but wrong answers when k > 2^30." Root cause: LOG was set to 30 but the problem allowed k up to 10^18. The query loop for (int j = 0; j < LOG; j++) ignores all bits above position 29. For k = 10^18, bits 30-59 are silently dropped. Fix: Set LOG = 60 and ensure the lifting table is built with 60 levels. Also check memory: n x 60 x 4 bytes.

Mnemonic Anchor

"Rho knows where you go" -- every functional graph forms the Greek letter ρ (rho): a tail leading into a cycle. Once you see ρ, binary lifting answers any "where after k steps?" question.

When NOT to Use This

Graph has nodes with out-degree > 1 -- not a functional graph; use general BFS/DFS or Dijkstra
Need shortest path between arbitrary pairs -- functional graph structure doesn't help; use Floyd-Warshall or BFS
The successor function changes between queries -- lifting table is invalidated; need dynamic data structures
k is small (<= n) -- just simulate; binary lifting's O(n log k) build is overkill

Debug This

Bug 1: What's wrong with this Floyd's implementation?

cpp

int slow = start, fast = start;
while (slow != fast) {  // BUG: they start equal!
    slow = nxt[slow];
    fast = nxt[nxt[fast]];
}

Answer: Both pointers start at start, so slow != fast is immediately false and the loop never executes. Fix: initialize slow = nxt[start], fast = nxt[nxt[start]] (one step ahead).

Bug 2: Why does this lifting table give wrong answers?

cpp

const int LOG = 30;
vector<array<int, LOG>> lift(n);
for (int v = 0; v < n; v++) lift[v][0] = nxt[v];
for (int j = 1; j < LOG; j++)
    for (int v = 0; v < n; v++)
        lift[v][j] = lift[v][j - 1];  // BUG

Answer: The recurrence is wrong. lift[v][j] = lift[v][j-1] just copies the previous level. Correct: lift[v][j] = lift[lift[v][j-1]][j-1] -- jump 2^(j-1) from v, then jump 2^(j-1) again.

Bug 3: This cycle decomposition misses some cycles. Why?

cpp

vector<int> color(n, 0);
for (int i = 0; i < n; i++) {
    if (color[i]) continue;
    int v = i;
    while (!color[v]) {
        color[v] = 2;  // BUG: marking as black immediately
        v = nxt[v];
    }
}

Answer: Nodes are marked BLACK (2) instead of GRAY (1) during traversal. When the path reaches a previously-visited node, we can't distinguish between "revisiting a node in our current path" (cycle found) vs "reaching a node from a previous traversal" (no new cycle). Fix: mark as GRAY (1) during traversal, then BLACK (2) after.

Bug 4: This query function gives wrong answers for k=0.

cpp

auto query = [&](int v, long long k) -> int {
    for (int j = LOG - 1; j >= 0; j--)
        if ((k >> j) & 1) v = lift[v][j];
    return v;
};

Answer: This is actually correct for k=0 (no bits set, returns v unchanged). Trick question! But watch out: iterating high-to-low vs low-to-high doesn't matter for functional graphs since the operations commute. Both orders give the same result.

Functional Graphs ​

Contents ​

Intuition ​

The Pain ​

The Key Insight ​

Visual Walkthrough ​

The Invariant ​

What the Code Won't Teach You ​

When to Reach for This ​

C++ STL Reference ​

Implementation (Contest-Ready) ​

Version A -- Minimal Contest Template ​

Version B -- Explained Version ​

Version C -- Standalone Floyd's Cycle Detection ​

Operations Reference ​

Problem Patterns & Tricks ​

Pattern 1: k-th Successor Queries ​

Pattern 2: Permutation Cycle Decomposition ​

Pattern 3: Floyd's for Implicit Functions ​

Pattern 4: Path Aggregation with Binary Lifting ​

Pattern 5: Cycle Structure for Every Node ​

Pattern 6: Functional Graph on Modular Arithmetic ​

Contest Cheat Sheet ​

Gotchas & Debugging ​

Mental Traps ​

Self-Test ​

Practice Problems ​

12. Mastery Corner ​

If I Had 5 Minutes ​

Pattern Fingerprint ​

Rating Progression ​

Before You Code Checklist ​

What Would You Do If...? ​

Historical Origin ​

The Mistake That Teaches ​

Mnemonic Anchor ​

When NOT to Use This ​

Debug This ​

Further Reading ​