Skip to content

Matrix Exponentiation

Quick Revisit

  • USE WHEN: linear recurrence with n up to 10^18; state transitions expressible as matrix multiply
  • INVARIANT: state vector times transition matrix raised to n equals the n-th term
  • TIME: O(k^3 log n) | SPACE: O(k^2)
  • CLASSIC BUG: off-by-one in exponent -- matrix power n vs n-1 depending on base case indexing
  • PRACTICE: ../12-practice-ladders/08-ladder-mixed.md

AL-12 | Turn any linear recurrence into a matrix power and compute it in O(k3logn) -- the standard trick for "compute the n-th term, where n1018" problems.

Difficulty: [Intermediate]Prerequisites: Math Fundamentals, DP IntroCF Rating Range: 1700-2000

Contest Frequency: [*] Specialized -- appears in linear recurrence problems


Contents


Intuition

"Compute F(1018)mod(109+7)."

The textbook DP is clean and correct -- but hopelessly slow:

cpp
// O(n) -- works for n <= 10^7, dies for n = 10^18
vector<long long> dp(n + 1);
dp[0] = 0; dp[1] = 1;
for (int i = 2; i <= n; i++)
    dp[i] = (dp[i - 1] + dp[i - 2]) % MOD;

At 109 iterations per second, 1018 steps takes roughly 30 years. Even if you only store two variables (no vector), the loop count itself is the bottleneck. You cannot iterate your way out of this -- you need to jump.

The same wall appears whenever a recurrence has small state but astronomical index: tribonacci at n=1015, tiling a 3×n grid for n=1012, or counting paths of length k=109 in a graph.

The Key Insight

Any linear recurrence f(n)=a1f(n1)+a2f(n2)++akf(nk) can be written as a single matrix multiplication, and the matrix power Mn can be computed in O(k3logn) using binary exponentiation.

The trick is to pack the last k terms into a column vector and find a k×k transition matrix M such that:

[f(n)f(n1)f(nk+1)]=M[f(n1)f(n2)f(nk)]

Applying M once advances the state by one step. Applying it n times gives Mnv0, and Mn is computable in O(k3logn) via repeated squaring -- the same binary exponentiation you already know for integers, but with matrix multiplication instead of scalar multiplication.

Analogy -- fast-forward button. The DP loop is playing a tape at 1x. Binary exponentiation lets you double the playback speed at each step: M1M2M4M8 You compose only the powers whose bits are set in n, so you reach Mn in log2n squarings.

Visual Walkthrough

Fibonacci as matrix multiplication:

The recurrence F(n)=F(n1)+F(n2) with F(0)=0,F(1)=1 becomes:

    [ F(n+1) ]   [ 1  1 ]   [ F(n)   ]
    [        ] = [       ] * [         ]
    [ F(n)   ]   [ 1  0 ]   [ F(n-1) ]

State vector transformation -- one step visualized:

  STATE VECTOR TRANSFORMATION:  [F_n, F_{n-1}] * M = [F_{n+1}, F_n]

    +--------+       +---------+       +----------+
    | F_n    |       | 1    1  |       | F_{n+1}  |
    |        |   *   |         |   =   |          |
    | F_{n-1}|       | 1    0  |       | F_n      |
    +--------+       +---------+       +----------+
     state(n)       transition M       state(n+1)

  How each entry is computed:
    F_{n+1} = 1*F_n + 1*F_{n-1}    <-- row 0 of M gives the recurrence
    F_n     = 1*F_n + 0*F_{n-1}    <-- row 1 of M shifts the old value down

  Example: [F_5, F_4] = [5, 3]
    [5, 3] * [[1,1],[1,0]] = [5+3, 5+0] = [8, 5] = [F_6, F_5]  OK

Binary exponentiation tree -- how M13 is computed:

  BINARY EXPONENTIATION TREE FOR M^13
  13 = 1101 in binary = 8 + 4 + 1

  Squaring chain (compute all needed powers):
  +---------+      +---------+      +---------+      +---------+
  |  M^1    | ---> |  M^2    | ---> |  M^4    | ---> |  M^8    |
  | (given) | sq   | = M*M   | sq   | =(M^2)^2| sq   | =(M^4)^2|
  +---------+      +---------+      +---------+      +---------+
       |                                  |                |
       | bit 0 = 1                        | bit 2 = 1      | bit 3 = 1
       | (use it!)                        | (use it!)      | (use it!)
       v                                  v                v
     result = M^1         result = M^1 * M^4    result *= M^8
                          result = M^5          result = M^13

  Only 3 squarings + 2 multiplications instead of 12 multiplications!

  Bit scan of 13 = 1101:
    bit 0: SET   --> result = M^1
    bit 1: CLEAR --> skip (but still square: base = M^2)
    bit 2: SET   --> result = result * M^4 = M^5
    bit 3: SET   --> result = result * M^8 = M^13

Let M=[1110] and v0=[10]=[F(1)F(0)].

Then Mnv0=[F(n+1)F(n)], so the answer is the bottom element.

Walk through F(5) via binary exponentiation:

n=5=(101)2, so we need M5=M4M1.

Step 1: Compute powers of M by repeated squaring
--------------------------------------------------------------
  M^1 = [ 1  1 ]
        [ 1  0 ]

  M^2 = M^1 * M^1 = [ 1*1+1*1  1*1+1*0 ] = [ 2  1 ]
                     [ 1*1+0*1  1*1+0*0 ]   [ 1  1 ]

  M^4 = M^2 * M^2 = [ 2*2+1*1  2*1+1*1 ] = [ 5  3 ]
                     [ 1*2+1*1  1*1+1*1 ]   [ 3  2 ]

Step 2: Multiply powers whose bits are set in n = 5 = (101)_2
--------------------------------------------------------------
  bit 0 (set):    result = M^1         = [ 1  1 ]
                                         [ 1  0 ]

  bit 1 (clear):  skip M^2

  bit 2 (set):    result = M^4 * result
                         = [ 5  3 ] * [ 1  1 ] = [ 8  5 ]
                           [ 3  2 ]   [ 1  0 ]   [ 5  3 ]

Step 3: Apply to base vector
--------------------------------------------------------------
  M^5 * v0 = [ 8  5 ] * [ 1 ] = [ 8 ]  -->  F(6) = 8
             [ 5  3 ]   [ 0 ]   [ 5 ]  -->  F(5) = 5  <-- answer

Check: F(5)=5. Correct. We used 2 squarings + 1 multiply instead of 5 DP iterations -- the gap grows to billions of saved steps when n=1018.

The Invariant

+-------------------------------------------------------------------+
|  INVARIANT                                                         |
|                                                                    |
|  At each step i of binary exponentiation:                          |
|                                                                    |
|    base = M^(2^i)                                                  |
|    result = M^(sum of 2^j for all set bits j < i that are set)     |
|                                                                    |
|  After processing all bits of n:                                   |
|    result = M^n                                                    |
|                                                                    |
|  CORRECTNESS: M^(2^(i+1)) = (M^(2^i))^2  -- squaring is valid    |
|               M^n = product of M^(2^j) for set bits j in n        |
|                                                                    |
|  COMPLEXITY: O(log n) squarings, each costs O(k^3) for k x k      |
|              matrices.  Total: O(k^3 * log n).                     |
+-------------------------------------------------------------------+

Convention box — Fibonacci indexing

Most off-by-one errors in matrix exponentiation come from mixing up two equivalent-but-different conventions for how the state vector and matrix power line up. Pick one and stick to it for the whole problem:

+----------------------------------------------------------------------+
|  CONVENTION A (used in this file's walkthrough)                       |
|    base vector  v_0 = [F(1), F(0)]^T = [1, 0]^T                      |
|    after k applications:  M^k * v_0 = [F(k+1), F(k)]^T               |
|    -> to get F(n), compute M^n * v_0 and read the BOTTOM entry.      |
+----------------------------------------------------------------------+
|  CONVENTION B                                                         |
|    base vector  v_1 = [F(1), F(0)]^T  but pull the power down by one |
|    M^(n-1) * v_1 = [F(n), F(n-1)]^T                                  |
|    -> to get F(n), compute M^(n-1) * v_1 and read the TOP entry.     |
+----------------------------------------------------------------------+
|  Pick ONE. Mixing the two is the #1 source of off-by-one bugs in     |
|  matrix exponentiation. Test on F(0), F(1), F(2), F(3) by hand       |
|  before trusting the code.                                            |
+----------------------------------------------------------------------+

State-vector design — the actually hard part

Once you can multiply matrices, exponentiation is mechanical. The skill that takes practice is deciding what entries belong in the state vector. The rule is: the state must contain exactly the values needed for one matrix multiplication to advance time by one step.

Three common state-vector idioms:

  • Pure recurrence of order k. State is [f(n),f(n1),,f(nk+1)]T. Top row of M encodes the recurrence; the rest is a shift.
  • Recurrence with a polynomial inhomogeneous term (e.g. f(n)=2f(n1)+n). Augment the state with n and 1. Each polynomial term in n adds one row: a constant adds 1, a linear term adds n, a quadratic term adds n2.
  • Adjacency matrix on a graph. State is the indicator vector of the current vertex (or distribution). (Ak)ij counts walks of length k.

If you cannot write down "applying M once advances time by one step," the state is incomplete — add whatever is missing before touching the matrix entries.

When to Reach for This

Trigger phrases in problem statements:

TriggerWhy it signals matrix exponentiation
"compute f(n) for n1018"Index too large for DP loop
"linear recurrence"Direct matrix formulation
"number of paths of length k"Adjacency matrix to the k-th power
"k-th Fibonacci / tribonacci / n-bonacci"Classic matrix speedup
"tiling a m×n grid, n1012"State vector over column profiles
"i=0nMi" (geometric series of matrices)Divide-and-conquer on the sum with matrix exp

Decision flowchart:

  Is there a linear recurrence of order k (small k)?
     |
     +--YES--> Is n large (> ~10^7)?
     |            |
     |            +--YES--> Matrix exponentiation  O(k^3 log n)
     |            +--NO---> Plain DP loop  O(k * n)
     |
     +--NO---> Not applicable (consider other techniques)

What the Code Won't Teach You

The matrix multiplication code is mechanical. What it obscures:

Semi-ring generalization. Standard matrix exponentiation uses (+,×) -- you sum products. Replace the operations with (min,+) and you get shortest-path computation: (Ak)ij = shortest walk of exactly k edges from i to j. Replace with (max,+) for longest paths, or (OR,AND) for reachability. The binary exponentiation structure works for any semi-ring -- the code looks identical, only combine changes.

SEMI-RING MATRIX EXPONENTIATION:

  +------------------------+---------------+----------------------+
  |  (X, Y) semi-ring      | identity for X|  what A^k computes   |
  +------------------------+---------------+----------------------+
  |  (+, *)  standard      |  0            |  # of walks of len k |
  |  (min,+) tropical      |  +inf         |  shortest k-edge walk|
  |  (max,+)               |  -inf         |  longest k-edge walk |
  |  (OR,AND)              |  false        |  k-step reachability |
  +------------------------+---------------+----------------------+

State augmentation is a modelling skill, not a code skill. When the recurrence has a constant term (f(n)=2f(n1)+3), you augment the state with a dummy "1" row. When it has a polynomial term ($f(n) = f(n{-}1)

  • n^2$), you augment with 1,n,n2. The matrix grows, but the code is unchanged -- the hard part is choosing the right state vector, and no amount of staring at operator* teaches that.

The semi-ring generalization turns matrix exponentiation into a shortest/longest path finder. Replace addition with min and multiplication with + in the matrix multiply routine, and Ak[u][v] gives the shortest path from u to v using exactly k edges. This is Bellman-Ford in matrix form: Ak=AAA (k times) with the (min,+) semi-ring. Binary exponentiation still works, so you get shortest paths using exactly k edges in O(n3logk). The same trick with (max,+) finds longest paths. The code template for matrix exponentiation supports all of these with a one-line change to the multiplication function.


3. Core Mechanics

3.1 Matrix Multiplication

For two k×k matrices A and B, the product C=A×B is:

C[i][j]=p=0k1A[i][p]B[p][j](modm)

Complexity: O(k3) per multiplication.

Key properties that make this work:

  • Associativity: (AB)C=A(BC) -- lets us group multiplications freely.
  • Identity element: the identity matrix I with I[i][i]=1, zeros elsewhere.
  • Not commutative: ABBA in general -- order matters in chained multiplications.

3.2 Matrix Binary Exponentiation

Exactly like scalar binary exponentiation, but replace scalar * with matrix *:

matpow(M, n):
    result = I          // k x k identity matrix
    base   = M
    while n > 0:
        if n is odd:
            result = result * base
        base = base * base
        n = n >> 1
    return result

Multiplications performed: at most 2log2n. Total complexity: O(k3logn).

BINARY EXPONENTIATION OF MATRICES -- trace for M⁹:

  9 in binary: 1 0 0 1

  Start: result = I (identity), base = M

  bit 0 (1): result = I x M = M        base = M² 
  bit 1 (0): result = M  (no change)   base = M⁴
  bit 2 (0): result = M  (no change)   base = M⁸
  bit 3 (1): result = M x M⁸ = M⁹     base = M¹⁶ (unused)

  Only 2 matrix multiplies for "result x base"
  Plus 3 squarings for "base x base"
  Total: 5 matrix multiplies instead of 8 (naïve)

  ┌───────────────────────────────────────────┐
  │   Exponent bits:  1   0   0   1           │
  │                   │           │            │
  │   Multiply:      xM⁸        xM¹           │
  │                   │           │            │
  │   Result:    I -> M -> ─── -> ─── -> M⁹      │
  │                                            │
  │   Squarings: M->M²->M⁴->M⁸  (3 squarings)  │
  │   Total ops: 3 squarings + 2 multiplies   │
  └───────────────────────────────────────────┘

3.3 Building the Transition Matrix

Given a recurrence f(n)=c1f(n1)+c2f(n2)++ckf(nk):

  State vector:   s(n) = [ f(n), f(n-1), ..., f(n-k+1) ]^T

  Transition matrix M (k x k):

  Row 0:   [ c_1   c_2   c_3  ...  c_{k-1}   c_k  ]   <-- recurrence coefficients
  Row 1:   [  1     0     0   ...    0         0   ]   <-- shift: f(n-1) <- f(n-1)
  Row 2:   [  0     1     0   ...    0         0   ]   <-- shift: f(n-2) <- f(n-2)
    ...           ...
  Row k-1: [  0     0     0   ...    1         0   ]   <-- shift: f(n-k+1) <- f(n-k+1)

Example -- Tribonacci T(n)=T(n1)+T(n2)+T(n3):

  M = [ 1  1  1 ]     v0 = [ T(2) ]   [ 1 ]
      [ 1  0  0 ]          [ T(1) ] = [ 1 ]
      [ 0  1  0 ]          [ T(0) ]   [ 0 ]

Then Mn2v0 gives [T(n)T(n1)T(n2)].


Implementation

4.1 Matrix Struct

cpp
#include <bits/stdc++.h>
using namespace std;

using ll = long long;
const ll MOD = 1e9 + 7;

struct Matrix {
    int n;                        // dimension (n x n)
    vector<vector<ll>> a;

    Matrix(int n) : n(n), a(n, vector<ll>(n, 0)) {}

    // Identity matrix factory
    static Matrix identity(int n) {
        Matrix I(n);
        for (int i = 0; i < n; i++) I.a[i][i] = 1;
        return I;
    }

    Matrix operator*(const Matrix& o) const {
        Matrix res(n);
        for (int i = 0; i < n; i++)
            for (int p = 0; p < n; p++) {
                if (a[i][p] == 0) continue;        // skip zeros early
                for (int j = 0; j < n; j++)
                    res.a[i][j] = (res.a[i][j] + a[i][p] * o.a[p][j]) % MOD;
            }
        return res;
    }
};

Notes:

  • The p-loop is in the middle (i-p-j order) so the inner loop streams over o.a[p][j] -- better cache behavior than i-j-p.
  • The if (a[i][p] == 0) continue skip is cheap and helps on sparse matrices.
  • For k5 (most contest problems) this runs in nanoseconds per multiply.

4.2 Matrix Binary Exponentiation

cpp
Matrix matpow(Matrix base, ll exp) {
    Matrix result = Matrix::identity(base.n);
    while (exp > 0) {
        if (exp & 1) result = result * base;
        base = base * base;
        exp >>= 1;
    }
    return result;
}

4.3 Solving Fibonacci

cpp
ll fibonacci(ll n) {
    if (n <= 1) return n;

    Matrix M(2);
    M.a = {{1, 1},
            {1, 0}};

    Matrix result = matpow(M, n - 1);
    // result.a[0][0] = F(n), result.a[0][1] = F(n-1)
    return result.a[0][0];
}

Why n - 1? We start from v0=[F(1),F(0)]T=[1,0]T. Multiplying by M once gives [F(2),F(1)]T. So Mn1 applied to v0 yields [F(n),F(n1)]T. Since the second column of Mn1 gets multiplied by F(0)=0, the answer is simply result.a[0][0].

4.4 General Linear Recurrence Solver

cpp
// Solve f(n) = c[0]*f(n-1) + c[1]*f(n-2) + ... + c[k-1]*f(n-k)
// init[0] = f(0), init[1] = f(1), ..., init[k-1] = f(k-1)
ll solve_recurrence(vector<ll> c, vector<ll> init, ll n) {
    int k = (int)c.size();
    if (n < k) return init[n] % MOD;

    // Build transition matrix
    Matrix M(k);
    for (int j = 0; j < k; j++)
        M.a[0][j] = (c[j] % MOD + MOD) % MOD;     // first row = coefficients
    for (int i = 1; i < k; i++)
        M.a[i][i - 1] = 1;                          // sub-diagonal = 1

    Matrix result = matpow(M, n - (k - 1));

    // Multiply result by initial state vector [f(k-1), f(k-2), ..., f(0)]
    ll ans = 0;
    for (int j = 0; j < k; j++)
        ans = (ans + result.a[0][j] % MOD * (init[k - 1 - j] % MOD)) % MOD;
    return (ans + MOD) % MOD;
}

Usage for Fibonacci: solve_recurrence({1, 1}, {0, 1}, n)Usage for Tribonacci: solve_recurrence({1, 1, 1}, {0, 0, 1}, n)


ASCII Trace: Computing Fibonacci F(6) via Matrix Exponentiation

Recurrence: F(n) = F(n-1) + F(n-2),  F(0)=0, F(1)=1

Matrix form: [F(n+1)]  =  [1  1]^n  x  [F(1)]  =  M^n x [1]
             [F(n)  ]     [1  0]       [F(0)]         [0]

Goal: compute M^6 using binary exponentiation.  6 = 110₂

Initialize: result = I = [1 0]    base = M = [1 1]
                         [0 1]               [1 0]

Step 1: bit=0 (LSB of 6=110₂)  -> 0, skip multiply
  base = M² = [1 1] x [1 1] = [2 1]
              [1 0]   [1 0]   [1 1]

Step 2: bit=1                   -> 1, result *= base
  result = I x M² = [2 1]
                     [1 1]
  base = M⁴ = [2 1] x [2 1] = [5 3]
              [1 1]   [1 1]   [3 2]

Step 3: bit=1 (MSB)             -> 1, result *= base
  result = [2 1] x [5 3] = [13 8]
           [1 1]   [3 2]   [ 8 5]

result = M⁶ = [13  8]
              [ 8  5]

Answer: F(6) = result[1][0] = 8   OK  (0,1,1,2,3,5,8)
        F(7) = result[0][0] = 13  OK

5. Patterns and Variations

Pattern 1: Fibonacci / k-bonacci

The most common entry point. Recurrence order k, matrix size k×k.

RecurrencekTransition matrix sizeTime
Fibonacci22×2O(8logn)
Tribonacci33×3O(27logn)
Tetranacci44×4O(64logn)

Pattern 2: Recurrence with Constant Term

Suppose f(n)=2f(n1)+3. This is not a pure linear recurrence -- there is an additive constant. Fix: augment the state with a dummy variable that stays 1:

  State: [ f(n), 1 ]^T

  M = [ 2  3 ]     Meaning:  f(n)   = 2*f(n-1) + 3*1
      [ 0  1 ]               1      = 0*f(n-1) + 1*1

Works for any polynomial additive term by adding more dummy dimensions (1,n,n2,).

STATE AUGMENTATION -- ADDING A CONSTANT TERM:

  Recurrence:  f(n) = 2*f(n-1) + 3        (constant +3)

  Without augmentation:  CANNOT express as matrix multiply
  With augmentation:     add a "1" that stays 1 forever

  +----------+     +---------+     +----------+
  | f(n+1)   |     |  2   3  |     |  f(n)    |
  |          |  =  |         |  *  |          |
  |   1      |     |  0   1  |     |   1      |
  +----------+     +---------+     +----------+

  f(3): start [f(1),1]=[1,1] -> M*[1,1]=[5,1] -> M*[5,1]=[13,1]
  Check: f(2)=2*1+3=5, f(3)=2*5+3=13  (correct)

Pattern 3: Tiling Problems

"Count the ways to tile a 3×n grid with 1×2 dominoes."

The state is the column profile -- which cells in the current column are already covered by a horizontal domino started in the previous column. For a 3×n grid there are 23=8 possible profiles, giving an 8×8 transition matrix. Many entries are zero (invalid transitions), so the effective matrix is often smaller. Apply Mn to the initial state.

COLUMN PROFILE DP FOR TILING:

  3*n grid, 1*2 dominoes.  State = which cells in column c are
  already covered by a horizontal domino from column c-1.

  Column c-1   Column c     Profile for col c
  +---+        +---+
  | # |------->| . |        bit 0: covered (#->.)
  |   |        |   |        bit 1: empty
  | # |------->| . |        bit 2: covered (#->.)
  +---+        +---+
  Profile = 101 (binary) = cells 0,2 pre-covered

  Transition matrix M[from_profile][to_profile] = 1 if valid
  Answer = M^n applied to initial empty profile (all zeros)

Pattern 4: Counting Paths of Length k in a Graph

Theorem: If A is the adjacency matrix of a graph with n vertices, then (Ak)[i][j] counts the number of walks of length exactly k from vertex i to vertex j.

cpp
// Count paths of length k from src to dst in a graph with n vertices
ll count_paths(vector<vector<int>>& adj, int n, int src, int dst, ll k) {
    Matrix A(n);
    for (int u = 0; u < n; u++)
        for (int v : adj[u])
            A.a[u][v] = (A.a[u][v] + 1) % MOD;

    Matrix result = matpow(A, k);
    return result.a[src][dst];
}

Complexity: O(n3logk). This is efficient when n is small (say n100) and k is huge.

PATH COUNTING VIA MATRIX POWER:

  Graph:                    Adjacency matrix A:
    0 ──-> 1                   │ 0  1  1 │
    │    ↗│                   │ 0  0  1 │
    ↓  ╱  ↓                   │ 1  0  0 │
    2 <-── 

  Paths of length 2 from 0:
    0->1->2, 0->2->0              A² = │ 1  0  1 │
                                    │ 1  0  0 │
                                    │ 0  1  1 │

  A²[0][0] = 1: path 0->2->0
  A²[0][2] = 1: path 0->1->2

  Paths of length 3 from 0:     A³ = A² x A = │ 0  2  1 │
                                                │ 0  1  1 │
                                                │ 1  0  1 │

  A³[0][1] = 2: paths 0->1->2->? ... 
  (0->2->0->1 and 0->1->2->? -> actually need to verify)

  ┌──────────────────────────────────────────────┐
  │ A^k[i][j] = number of directed paths of     │
  │ EXACTLY k edges from i to j.                 │
  │                                              │
  │ Works because matrix multiply sums over      │
  │ all possible intermediate vertices:           │
  │ (A^k)[i][j] = Σ_m (A^{k-1})[i][m] * A[m][j]│
  └──────────────────────────────────────────────┘

Pattern 5: Sum of a Geometric Matrix Series

Compute S=I+M+M2++Mn1.

Divide-and-conquer approach:

  • If n is even: S(n)=(I+Mn/2)S(n/2)
  • If n is odd: S(n)=I+MS(n1)

Or use the augmented-matrix trick -- build a (2k)×(2k) block matrix:

  T = [ M   I ]       Then T^n = [ M^n        S(n)     ]
      [ 0   I ]                  [  0          I        ]

Extract the top-right k×k block of Tn to get S(n).


Complexity Analysis

OperationTimeSpace
Matrix multiply (k×k)O(k3)O(k2)
Matrix exponentiation MnO(k3logn)O(k2)
General recurrence of order kO(k3logn)O(k2)
Graph paths of length k (n vertices)O(n3logk)O(n2)

Practical limits in contests (within ~1 second):

  k (matrix size)  |  Max n (approx)
  -----------------+------------------
       2           |  10^18    (easily)
       3           |  10^18
      10           |  10^18
      50           |  10^15
     100           |  10^12
     200           |  10^9     (tight)

The bottleneck is the k3 factor in each multiplication. For k>200, each squaring step becomes expensive and the constant is large.


Edge Cases and Pitfalls

Pitfall 1: Off-by-One in Exponent

The most common bug. If your base vector is [f(k1),,f(0)]T and you want f(n), the exponent is n(k1), not n.

  Wrong:  matpow(M, n)         -->  gives f(n + k - 1)
  Right:  matpow(M, n - k + 1) -->  gives f(n)

Always test with small values: compute f(k) using your matrix formula and verify it matches the recurrence.

Pitfall 2: Modular Arithmetic in Matrix Multiply

Every intermediate sum must be reduced mod m. Since each product a[i][p]b[p][j] can be up to (109+6)21018, it fits in long long. But summing k such products can overflow if k>9. Solutions:

  • Reduce after each addition (done in the code above).
  • Or use __int128 for the accumulator and reduce once at the end.

Pitfall 3: n=0 Edge Case

M0=I (identity matrix). Make sure your code handles exp = 0 correctly -- the while (exp > 0) loop naturally returns result = I, which is correct.

Pitfall 4: Negative Coefficients

If the recurrence has negative coefficients (e.g., f(n)=2f(n1)f(n2)), add MOD before taking % to avoid negative entries in the matrix:

cpp
M.a[0][j] = (c[j] % MOD + MOD) % MOD;

Mental Traps

"Matrix exponentiation is only for Fibonacci-like problems." Fibonacci is the simplest instance. The technique applies whenever the state can be packed into a vector and the transition is a linear map: path counting in graphs (Ak), tiling problems (column-profile DP), linear recurrences with constant/polynomial terms (state augmentation). If you reach for matrix exponentiation only when you see "Fibonacci," you miss half its applications.

MATRIX EXPONENTIATION -- WIDER THAN YOU THINK:

  Problem type              State size k     Matrix meaning
  ──────────────────────    ────────────     ──────────────────
  Fibonacci / k-bonacci     2-4              recurrence coefficients
  f(n) = af(n-1) + c        3               augmented with constant
  Tiling m*n grid            2^m             column-profile DP
  Path count in graph       |V|              adjacency matrix
  Shortest k-edge path      |V|              (min,+) adjacency

"k×k matrix multiplication is O(k2)." It is O(k3). With logn squarings, total cost is O(k3logn). For k=100,n=1018: 1003×606×107 -- fine. For k=200: 2003×605×108 -- TLE risk. Always compute the constant before coding.

Trap: "State size k doesn't matter -- matrix exponentiation is always O(logn)."

It's O(k3logn). The k3 factor from matrix multiplication dominates when k is even moderately large. For k=2 (Fibonacci), this is 8logn -- trivial. For k=50, it's 125000logn. For k=200, it's 8×106logn, which exceeds 108 when n=1018 (log2n60). Know the crossover: k50 is comfortable; k=100 is tight; k>150 may TLE.

STATE SIZE vs RUNTIME:

  n = 10¹⁸,  log₂(n) ~= 60

  k    │  k³ x 60    │  Status
  ─────┼─────────────┼──────────
    2  │       480   │  trivial
    5  │     7,500   │  fast
   10  │    60,000   │  fast
   50  │ 7,500,000   │  ok
  100  │ 60,000,000  │  tight (1-2s)
  200  │480,000,000  │  TLE X

  ┌─────────────────────────────────────────┐
  │ Rule of thumb: k <= 70 is always safe.  │
  │ k > 100: profile before submitting.     │
  │ k > 200: look for a different approach  │
  │ (Berlekamp-Massey, Kitamasa, etc.)      │
  └─────────────────────────────────────────┘

Trap: "If the recurrence has f(n)=c1f(n1)+c2f(n2)++ckf(nk)+d, I need a (k+1)×(k+1) matrix."

Only if d is a constant. If d depends on n (e.g., f(n)=2f(n1)+n), you need to encode n itself in the state. For d=n: state = [f(n),n,1], so the matrix is (k+2)×(k+2). For d=n2: you need n2, n, and 1 in the state. Each polynomial term adds a row. Getting the augmented state wrong means the matrix doesn't represent the recurrence.

STATE AUGMENTATION FOR NON-CONSTANT TERMS:

  Recurrence: f(n) = 2*f(n-1) + n

  State: [f(n), n, 1]ᵀ

  Transition:
    f(n+1) = 2*f(n) + (n+1) = 2*f(n) + n + 1
    n+1    = n + 1
    1      = 1

  Matrix:
    ┌           ┐   ┌           ┐   ┌        ┐
    │ f(n+1)    │   │ 2   1   1 │   │ f(n)   │
    │ n+1       │ = │ 0   1   1 │ x │ n      │
    │ 1         │   │ 0   0   1 │   │ 1      │
    └           ┘   └           ┘   └        ┘

  Check: f(n+1) = 2*f(n) + 1*n + 1*1 = 2f(n) + n + 1  OK
         n+1    = 0*f(n) + 1*n + 1*1 = n + 1            OK
         1      = 0*f(n) + 0*n + 1*1 = 1                OK

Worked Examples

Example 1: Fibonacci, n=1018

Problem: Compute F(n)mod(109+7).

cpp
int main() {
    ll n;
    cin >> n;
    cout << fibonacci(n) << "\n";
    return 0;
}

Using the fibonacci function from Section 4.3.

Trace for n=10:

  M = [ 1 1 ]     matpow(M, 9):
      [ 1 0 ]
                   9 = 1001 in binary
                   result starts as I

  bit 0 (set):    result = I * M   = M
  bit 1 (clear):  base = M^2, skip
  bit 2 (clear):  base = M^4, skip
  bit 3 (set):    result = M^8 * M = M^9

  M^9 = [ 55  34 ]
        [ 34  21 ]

  F(10) = M^9[0][0] = 55.   Correct.

Example 2: Counting Paths of Length k

Problem: Given a directed graph with n50 vertices and m edges, count the number of walks of length exactly k1018 from vertex 1 to vertex n, modulo 109+7.

cpp
int main() {
    int n, m;
    ll k;
    cin >> n >> m >> k;

    Matrix A(n);
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        u--; v--;
        A.a[u][v] = (A.a[u][v] + 1) % MOD;
    }

    Matrix result = matpow(A, k);
    cout << result.a[0][n - 1] << "\n";
    return 0;
}

Example 3: Recurrence with Constant Term

Problem: f(n)=3f(n1)+2f(n2)+5, with f(0)=1,f(1)=2. Compute f(n) for n1018.

Augment the state to handle the constant:

cpp
int main() {
    ll n;
    cin >> n;
    if (n == 0) { cout << 1; return 0; }
    if (n == 1) { cout << 2; return 0; }

    // State: [f(n), f(n-1), 1]^T
    // Transition:
    //   f(n)   = 3*f(n-1) + 2*f(n-2) + 5*1
    //   f(n-1) = 1*f(n-1) + 0*f(n-2) + 0*1
    //   1      = 0*f(n-1) + 0*f(n-2) + 1*1

    Matrix M(3);
    M.a = {{3, 2, 5},
            {1, 0, 0},
            {0, 0, 1}};

    Matrix R = matpow(M, n - 1);

    // base vector: [f(1), f(0), 1] = [2, 1, 1]
    ll ans = (R.a[0][0] * 2 % MOD + R.a[0][1] * 1 % MOD + R.a[0][2] * 1 % MOD) % MOD;
    cout << ans << "\n";
    return 0;
}

Self-Test

  • [ ] Write the O(k3) matrix multiplication with correct index order and % MOD -- no peeking
  • [ ] Construct the transition matrix for f(n)=2f(n1)+3f(n2)+5 (constant term) -- what is the matrix dimension?
  • [ ] Given a 3-node directed graph, compute the number of walks of length 4 from node 0 to node 2 by hand using A4
  • [ ] Explain why M0=I (identity) and what goes wrong if your code initializes result to all-zeros instead
  • [ ] State the practical size limit: for which k does O(k3log1018) fit in 1 second?
  • [ ] Construct the 3×3 transition matrix for the recurrence f(n)=f(n1)+2f(n2)+3 -- write out the augmented state vector and verify the matrix by computing f(3) from f(1),f(0).
  • [ ] Count the number of paths of length 4 from vertex 0 to vertex 2 in a 3-node directed graph with adjacency matrix A=[[0,1,1],[1,0,1],[0,1,0]] -- compute A4 by hand (or by repeated squaring).
  • [ ] Compute F10 (10th Fibonacci number) using matrix exponentiation: write M9[1,0]T with the 2x2 Fibonacci matrix, using binary expansion 9=10012.
  • [ ] Explain how to encode the recurrence f(n)=f(n1)+n2 as matrix exponentiation -- what is the state vector and the matrix size?
  • [ ] Modify the matrix multiplication to use the (min,+) semi-ring and compute the shortest 3-edge path between all pairs in a 3-node graph with edge weights.

Practice Problems

#ProblemSourceKey IdeaDifficulty
1FibonacciCSESDirect 2×2 matrix exponentiationCF ~1600
2Graph Paths ICSESAdjacency matrix to the k-th powerCF ~1700
3Graph Paths IICSESShortest path of exactly k edges via min-plus matrixCF ~1900
4C. Neko does MathsCF Div 2Recurrence accelerationCF ~1800
5E. The Number of InversionsCF Div 1Matrix exponentiation on DP stateCF ~2000
6D. Magic GemsCF Div 2Linear recurrence, k-bonacci variantCF ~1700
7E. Pchelyonok and SegmentsCF Div 2Matrix exponentiation + combinatoricsCF ~2000
8B. Parity of InversionsCF Div 2Matrix exponentiation trickCF ~1800

Recommended progression: Start with CSES Fibonacci (#1), then Graph Paths I (#2) to see the adjacency matrix pattern. Move to Magic Gems (#6) for general recurrences, then tackle the harder CF problems.


10. Key Takeaways

+-------------------------------------------------------------------+
|  CHEAT SHEET -- MATRIX EXPONENTIATION                               |
|                                                                    |
|  1. Identify a linear recurrence of order k                        |
|     f(n) = c1*f(n-1) + c2*f(n-2) + ... + ck*f(n-k)              |
|                                                                    |
|  2. Build the k x k transition matrix M                            |
|     - Row 0: recurrence coefficients [c1, c2, ..., ck]            |
|     - Rows 1..k-1: identity shift (1 on sub-diagonal)             |
|                                                                    |
|  3. Build the initial state vector v0                              |
|     v0 = [f(k-1), f(k-2), ..., f(0)]^T                           |
|                                                                    |
|  4. Compute M^(n - k + 1) * v0 via binary exponentiation          |
|     - The top entry of the result is f(n)                          |
|                                                                    |
|  5. Extensions:                                                    |
|     - Constant term  -->  augment state with dummy "1"             |
|     - Graph paths    -->  A^k where A = adjacency matrix           |
|     - Tiling         -->  column-profile DP, matrix exp on states  |
|     - Matrix sum     -->  augmented block matrix trick             |
|                                                                    |
|  COMPLEXITY: O(k^3 * log n)   (k = state size, n = target index)  |
+-------------------------------------------------------------------+

One-sentence summary: Pack your DP state into a vector, write the transition as a matrix, and binary-exponentiate your way from O(n) to O(k3logn).


The Aha Moment

If you can write: vn=Avn1, then vn=Anv0. And An can be computed in O(k3logn) via binary exponentiation -- the same squaring trick you use for modular exponentiation, but with matrix multiplication instead of scalar multiplication. Any linear recurrence with constant coefficients becomes a matrix power.

The real "aha": the hard part isn't the exponentiation -- it's formulating the state vector and transition matrix. Once you write v and A, the code is always the same 20 lines.


Pattern Fingerprint

Constraint / SignalTechnique
"Compute f(n) where n1018" and f is a linear recurrenceMatrix exponentiation
"k-th order recurrence, k10, n1018"k×k matrix raised to the n-th power
"Count paths of length n in a graph with 100 nodes"Adjacency matrix to the n-th power
DP with transitions like dp[i]=adp[i1]+bdp[i2]+cPack into state vector, add constant as extra dimension
"Number of tilings / strings of length n with constraints"Build DFA/NFA transition matrix, exponentiate
"n is huge but state space is small" (200)Matrix exponentiation is the go-to

Rating Progression

CF RatingWhat You Should Be Able To Do
1200Know Fibonacci can be computed in O(logn) via matrix exponentiation
1500Formulate a 2×2 or 3×3 transition matrix for a given recurrence
1800Handle recurrences with constant terms (extend state vector); count graph paths via adjacency matrix power
2100Build transition matrices from DFA/NFA; combine matrix exponentiation with segment tree; handle block-diagonal optimizations

Before You Code Checklist

  • [ ] Write the recurrence on paper first -- identify the state: what values do you need to carry from step n1 to step n?
  • [ ] Initialize the identity matrix correctly -- I[i][i]=1, all others 0; a wrong identity gives wrong results silently
  • [ ] Zero-initialize the product matrix in multiply() -- forgetting this is the #1 bug
  • [ ] Handle the constant term -- if f(n)=2f(n1)+3, add a "1" to the state vector and an extra row/column for the constant
  • [ ] Check matrix dimensions -- the state vector size k determines k×k matrices; off-by-one in k corrupts everything

What Would You Do If...?

Scenario 1: "The recurrence has a non-constant coefficient, e.g., f(n)=nf(n1)." Matrix exponentiation only works for constant-coefficient linear recurrences. If the coefficient depends on n, you can't use a single transition matrix. Sometimes you can reformulate (e.g., take logs), use segment tree of matrices for different ranges, or accept O(n) DP.

Scenario 2: "The matrix is 200×200 and n=1018."O(k3logn)=2003×604.8×108 -- borderline. Optimize: use bitwise operations if entries are 0/1 (bitset multiplication gives k3/64), or look for structure (sparse matrix, block diagonal) to reduce the effective dimension.

Scenario 3: "You need Mnmodp for multiple values of n." Precompute M1,M2,M4,M8, (at most logn matrices). Then for each query, combine the relevant powers. If queries are online, this is still O(k3logn) per query, but the precomputation is shared.


The Mistake That Teaches

Story: You need the n-th Fibonacci number modulo 109+7 where n1018. You write the 2×2 matrix, implement matrix multiplication and binary exponentiation, and test: F(10)=55 (ok), F(50) matches. You submit -- Wrong Answer.

After debugging, you print the identity matrix your mat_pow starts with... and it's all zeros. You initialized result as a zero matrix instead of the identity matrix. The exponentiation computed An0=0 for every input.

The fix:

cpp
Matrix result;
// MUST initialize as identity
for (int i = 0; i < k; i++) result.a[i][i] = 1;

Small test cases accidentally passed because your base case handling returned v0 directly for n=0, bypassing the matrix code. The bug only manifested for large n.

Lesson: Always verify your identity matrix initialization, and test with n=1 (not just n=0 or large n).


Debug This

Bug 1:

Find the bug in this matrix multiplication
cpp
struct Matrix {
    long long a[MAXK][MAXK];
};

Matrix multiply(Matrix &A, Matrix &B) {
    Matrix C;
    for (int i = 0; i < k; i++)
        for (int j = 0; j < k; j++)
            for (int p = 0; p < k; p++)
                C.a[i][j] = (C.a[i][j] + A.a[i][p] * B.a[p][j]) % MOD;
    return C;
}

Bug: Matrix C; is declared but never zero-initialized. C.a[i][j] contains garbage values, so the accumulation C.a[i][j] + ... produces wrong results. In C++, local struct members are not zero-initialized by default.

Fix: Add Matrix C = {}; or memset(C.a, 0, sizeof(C.a)); before the loops.

Bug 2:

Find the bug in this matrix exponentiation
cpp
Matrix mat_pow(Matrix A, long long n) {
    Matrix result;
    memset(result.a, 0, sizeof(result.a));
    for (int i = 0; i < k; i++) result.a[i][i] = 1;  // identity

    while (n) {
        if (n & 1) result = multiply(result, A);
        A = multiply(A, A);
        n >>= 1;
    }
    return result;
}

// Fibonacci: F(n) = F(n-1) + F(n-2)
// A = {{1,1},{1,0}}, v = {F(1), F(0)} = {1, 0}
// Answer: (mat_pow(A, n) * v)[0]

Bug: For n=0, this returns the identity matrix times v, giving v[0]=F(1)=1. But F(0)=0. The exponent should be n, not n1 or n+1, depending on how you define the base. Here, An[F(1),F(0)]T=[F(n+1),F(n)]T, so the answer is element [1] (second component), not [0].

Fix: Return (mat_pow(A, n) * v)[1] to get F(n), or adjust the initial vector to [F(0),F(1)]=[0,1].

Bug 3:

Find the bug in this recurrence-to-matrix conversion
cpp
// Recurrence: f(n) = 2*f(n-1) + 3*f(n-2) + 5
// State vector: [f(n), f(n-1)]
// Transition matrix:
long long A[2][2] = {
    {2, 3},
    {1, 0}
};

Bug: The recurrence has a constant term (+5), but the matrix doesn't account for it. A 2×2 matrix can only express f(n)=2f(n1)+3f(n2), missing the +5.

Fix: Extend the state vector to [f(n),f(n1),1] and use a 3×3 matrix:

cpp
long long A[3][3] = {
    {2, 3, 5},  // f(n) = 2*f(n-1) + 3*f(n-2) + 5*1
    {1, 0, 0},  // f(n-1) = 1*f(n-1)
    {0, 0, 1}   // 1 = 1 (constant stays constant)
};

Common Mistakes

  • Off-by-one in the exponent. If your base vector holds f(k-1)..f(0), you need M^(n-k+1), not M^n.
  • Forgetting the constant term. Recurrences with additive constants need an extra dimension (append a 1 to the state vector).
  • Integer overflow in matrix multiply. Each cell accumulates k products of large values -- intermediate sums overflow even with long long if you don't reduce mod p at each step.
  • Wrong identity matrix for base case. M^0 must be the identity matrix, not the zero matrix.

Historical Origin

Matrix exponentiation as a computational technique piggybacks on two ideas: Arthur Cayley's 1858 formalization of matrix algebra, and the ancient binary exponentiation method (known in India by 200 BCE for scalar powers). The specific application to linear recurrences became a competitive programming staple in the early 2000s, popularized by problems on TopCoder and Codeforces that featured n1018.


Mnemonic Anchor

"Pack the state, build the matrix, square your way to 1018."


ConceptLink
Math Fundamentals01-math-fundamentals
DP Intro02-dp-intro-1d
Gaussian Elimination10-gaussian-elimination
PracticeLadder - Mixed

Built for the climb from Codeforces 1100 to 2100.