Skip to content

Number Theory

Essential toolkit for contest math: primes, divisors, modular arithmetic, and multiplicative functions. Appears constantly in Codeforces Div 2 B-D and is the backbone of any problem tagged math or number theory.

Difficulty: [Intermediate]Prerequisites: Math FundamentalsCF Rating Range: 1300-1800

What this file covers, in order:

  1. GCD and the Euclidean algorithm (revisited from a primes/divisibility angle).
  2. Sieve of Eratosthenes — primes up to N in O(NloglogN).
  3. Linear sieve and smallest-prime-factor table — primes plus O(logn) factorization.
  4. Factorization in isolation (trial division up to 1012, Miller-Rabin / Pollard rho up to 1018).
  5. Divisor counts and divisor sums.
  6. Euler's totient ϕ(n) and Mobius μ(n) — multiplicative functions on the sieve.
  7. Modular inverse via Fermat (revisited; full extgcd treatment lives in the next file).

Note on overlap with Math Fundamentals. Fundamentals teaches the GCD algorithm. This file teaches what GCD reveals about primes, divisibility, and factorization patterns — and how those interact with sieves and multiplicative functions. The repetition is deliberate, but it has a different teaching purpose.

Quick Revisit

  • USE WHEN: problem involves primes, divisors, factorization, or Euler's totient
  • INVARIANT: every integer > 1 has a unique prime factorization
  • TIME: O(n log log n) sieve, O(sqrt(n)) factorization | SPACE: O(n) for sieve
  • CLASSIC BUG: sieve inner loop starting at 2i instead of ii (correct but unnecessarily slow)
  • PRACTICE: see practice problems in this chapter

Contents

Contest Frequency: ** Regular -- modular arithmetic appears frequently


Intuition

SIEVE OF ERATOSTHENES -- CROSSING OUT COMPOSITES:

  Start: all marked prime
   2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20

  p=2: cross out 4,6,8,10,12,14,16,18,20  (start from 2^2=4)
   2  3  X  5  X  7  X  X  X  11  X  13  X  X  X  17  X  19  X

  p=3: cross out 9,15  (start from 3^2=9; 6,12,18 already crossed)
   2  3  X  5  X  7  X  X  X  11  X  13  X  X  X  17  X  19  X

  p=4: skip (composite).  p=5: 5^2=25 > 20 -> STOP.

  Primes <= 20:  {2, 3, 5, 7, 11, 13, 17, 19}

You need gcd(48,18). The brute-force approach: try every integer d from 1 to min(a,b)=18 and check whether d divides both a and b.

d:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
48: Y  Y  Y  Y  N  Y  N  Y  N  N  N  Y  N  N  N  Y  N  N
18: Y  Y  Y  N  N  Y  N  N  Y  N  N  N  N  N  N  N  N  Y
     *  *  *        *

That is 18 divisibility checks just for two small numbers. For a,b1018, you would need up to 1018 iterations -- completely hopeless. We need something fundamentally faster.

The Euclidean algorithm replaces the problem gcd(a,b) with the strictly smaller problem gcd(b,amodb), and the remainder shrinks so fast that the whole computation finishes in O(logmin(a,b)) steps.

Why does this work? If d divides both a and b, then d also divides aqb for any integer q -- in particular d divides amodb. Conversely, any common divisor of b and amodb divides a. So the set of common divisors never changes; only the numbers get smaller.

Think of it like making change: instead of counting coins one by one, you hand over the largest bills first and only deal with the leftover. Each "mod" operation is a big leap, not a tiny step.

Where it breaks: the inputs must be non-negative integers. Passing negative values to % in C++ gives implementation-defined signs before C++11 and truncation-toward-zero since C++11 -- always ensure a,b0. Also, gcd(0,0) is conventionally 0, but some library functions disagree; guard the edge case.

Visual Walkthrough

Compute gcd(48,18) with the Euclidean algorithm:

Step   a     b     a mod b    Reasoning
----   ----  ----  --------   -----------------------------------
  1    48    18      12       48 = 2*18 + 12  -->  gcd(48,18) = gcd(18,12)
  2    18    12       6       18 = 1*12 +  6  -->  gcd(18,12) = gcd(12, 6)
  3    12     6       0       12 = 2* 6 +  0  -->  gcd(12, 6) = gcd( 6, 0)
  4     6     0      ---      b = 0, done     -->  answer = 6

4 steps, 4 operations. The naive scan checked 18 candidates. For larger inputs the gap explodes: gcd(1018,1017) needs roughly 60 Euclidean steps versus 1017 trial divisions.

Worked Example: Sieve of Eratosthenes for N = 30

  Initial array (all marked prime):
   2  3  4  5  6  7  8  9 10
  11 12 13 14 15 16 17 18 19 20
  21 22 23 24 25 26 27 28 29 30

  ── p = 2: cross out 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 ──
   2  3  x  5  x  7  x  9  x
  11  x 13  x 15  x 17  x 19  x
  21  x 23  x 25  x 27  x 29  x
                                    (14 composites marked)

  ── p = 3: cross out 9, 15, 21, 27  (start at 3² = 9) ──
   2  3  x  5  x  7  x  x  x
  11  x 13  x  x  x 17  x 19  x
   x  x 23  x 25  x  x  x 29  x
                                    (4 new composites)

  ── p = 5: cross out 25  (start at 5² = 25) ──
   2  3  x  5  x  7  x  x  x
  11  x 13  x  x  x 17  x 19  x
   x  x 23  x  x  x  x  x 29  x
                                    (1 new composite)

  Next prime: p = 7.  But 7² = 49 > 30 -> STOP.

  Primes <= 30: { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 }   (10 primes)

  Total inner-loop iterations: 14 + 4 + 1 = 19
  (matches the ∑ N/p ~= N*ln(ln N) bound)

In C++20 the standard library gives you this for free:

cpp
#include <numeric>
int g = std::gcd(48, 18); // 6

The Invariant

+------------------------------------------------------------------+
|  INVARIANT                                                        |
|                                                                   |
|  gcd(a, b)  =  gcd(b, a mod b)        for all a >= 0, b > 0     |
|                                                                   |
|  CONVERGENCE GUARANTEE                                            |
|  The remainder (a mod b) < b, and after every two consecutive     |
|  steps the larger argument drops by at least half.                |
|  Therefore the algorithm terminates in at most                    |
|  2 * floor(log2(min(a, b))) + 2 steps  =  O(log min(a, b)).      |
+------------------------------------------------------------------+

Trace it in the walkthrough: 48181260. After steps 1-2 the "large" argument went from 48 to 12 -- more than halved. After steps 3-4 it went from 12 to 0. Each pair of steps at least halves the problem, giving the logarithmic bound.

When to Reach for This

Trigger phrases in problem statements:

PhraseWhy it signals number theory
"gcd", "greatest common divisor"Direct application of Euclidean algorithm
"lcm", "least common multiple"lcm(a,b)=a/gcd(a,b)b (divide first to avoid overflow)
"coprime", "gcd=1"Coprimality checks, Euler's totient, Mobius function
"Euler's totient", "ϕ(n)"Counts integers coprime to n; built on GCD/factorization
"modular inverse", "a1modm"Requires gcd(a,m)=1; computed via extended GCD or Fermat
"answer modulo 109+7"Modular arithmetic pipeline: binpow + modinv
"number of divisors", "d(n)"Prime factorization, then (ei+1)

Counter-examples -- when this is NOT the right tool:

  • "Find the k-th prime" -- this is a sieve problem, not a GCD problem.
  • "Maximize GCD by choosing a subset" -- often greedy or DP on divisors, not raw Euclidean algorithm.
  • "Count multiples in a range" -- arithmetic, not number theory machinery. Just R/d(L1)/d.

What the Code Won't Teach You

The sieve code runs; modinv returns the right number. What remains invisible:

The scale guide is the real skill. Knowing which algorithm to reach for at each input size is more important than knowing the algorithms themselves. Trial division up to n handles n1012. Miller-Rabin handles primality for n1018. Pollard's rho factors n1018 in O(n1/4). A precomputed sieve covers bulk queries for n107. Picking the wrong tool for the scale is the most common mistake in contest number theory.

FACTORIZATION / PRIMALITY SCALE GUIDE:

  +----------------------+------------------------+--------------+
  |  n range             |  Primality test         |  Factorize   |
  +----------------------+------------------------+--------------+
  |  n <= 10^7           |  Precomputed sieve O(1) |  SPF sieve   |
  |  n <= 10^12          |  Trial div O(sqrt n)    |  Trial div   |
  |  n <= 10^18          |  Miller-Rabin O(log^2 n)|  Pollard rho |
  +----------------------+------------------------+--------------+
  Wrong tool at wrong scale = TLE or MLE.

Multiplicative functions compose with the sieve, not with each other.ϕ(n), d(n), σ(n) are all multiplicative. The linear sieve computes them for all nN in O(N) by exploiting the smallest prime factor decomposition. The key insight: when p|n and p2|n, the update formula differs from the coprime case. Getting this branch wrong is the #1 bug in linear sieve implementations, and no amount of reading phi[p*n] = phi[n] * (p-1) tells you why the branch exists.


C++ STL Reference

Function / ClassHeaderWhat it doesTime Complexity
__gcd(a, b) / gcd(a, b)<algorithm> / <numeric>Greatest common divisor. gcd (C++17) handles 0 correctly.O(logmin(a,b))
lcm(a, b)<numeric>Least common multiple (C++17). Watch overflow: lcm may exceed long long.O(logmin(a,b))
__int128built-in128-bit integer. Use when intermediate products overflow long long.--
bitset<N><bitset>Compact boolean array. Can be used as a sieve with N/64 memory.O(N/64) per bitwise op
vector<bool><vector>Bit-packed boolean vector. Fine for sieve, but [] returns proxy.O(1) access
pow / powl<cmath>Avoid for integer work -- floating point errors. Write your own binpow.--

Implementation (Contest-Ready)

Version 1: Minimal Contest Template

cpp
#include <bits/stdc++.h>
using namespace std;

using ll = long long;
const int MOD = 1e9 + 7;

// --- Sieve of Eratosthenes ---
vector<int> sieve(int n) {
    vector<bool> is_prime(n + 1, true);
    is_prime[0] = is_prime[1] = false;
    for (int i = 2; i * i <= n; i++)
        if (is_prime[i])
            for (int j = i * i; j <= n; j += i)
                is_prime[j] = false;
    vector<int> primes;
    for (int i = 2; i <= n; i++)
        if (is_prime[i]) primes.push_back(i);
    return primes;
}

// --- Smallest Prime Factor sieve ---
vector<int> spf_sieve(int n) {
    vector<int> spf(n + 1);
    iota(spf.begin(), spf.end(), 0);
    for (int i = 2; i * i <= n; i++)
        if (spf[i] == i)
            for (int j = i * i; j <= n; j += i)
                if (spf[j] == j) spf[j] = i;
    return spf;
}

// --- Factorize using SPF table ---
vector<pair<int,int>> factorize(int n, const vector<int>& spf) {
    vector<pair<int,int>> factors;
    while (n > 1) {
        int p = spf[n], cnt = 0;
        while (n % p == 0) { n /= p; cnt++; }
        factors.emplace_back(p, cnt);
    }
    return factors;
}

// --- Trial division factorization (no sieve needed) ---
vector<pair<ll,int>> factorize(ll n) {
    vector<pair<ll,int>> factors;
    for (ll d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            int cnt = 0;
            while (n % d == 0) { n /= d; cnt++; }
            factors.emplace_back(d, cnt);
        }
    }
    if (n > 1) factors.emplace_back(n, 1);
    return factors;
}

// --- Binary exponentiation ---
ll binpow(ll base, ll exp, ll mod) {
    ll result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// --- Modular inverse (mod must be prime) ---
ll modinv(ll a, ll mod) {
    return binpow(a, mod - 2, mod);
}

// --- Extended GCD ---
ll extgcd(ll a, ll b, ll &x, ll &y) {
    if (b == 0) { x = 1; y = 0; return a; }
    ll x1, y1;
    ll g = extgcd(b, a % b, x1, y1);
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// --- Modular inverse (general, mod need not be prime) ---
ll modinv_general(ll a, ll mod) {
    ll x, y;
    ll g = extgcd(a, mod, x, y);
    if (g != 1) return -1; // inverse doesn't exist
    return (x % mod + mod) % mod;
}

// --- Euler's totient for single n ---
ll euler_phi(ll n) {
    ll result = n;
    for (ll d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            while (n % d == 0) n /= d;
            result -= result / d;
        }
    }
    if (n > 1) result -= result / n;
    return result;
}

// --- Euler's totient sieve ---
vector<int> phi_sieve(int n) {
    vector<int> phi(n + 1);
    iota(phi.begin(), phi.end(), 0);
    for (int i = 2; i <= n; i++) {
        if (phi[i] == i) { // i is prime
            for (int j = i; j <= n; j += i)
                phi[j] -= phi[j] / i;
        }
    }
    return phi;
}

int main() {
    // Example: factorize and compute inverse
    auto primes = sieve(1000000);
    auto sp = spf_sieve(1000000);

    int n = 360;
    auto f = factorize(n, sp);
    for (auto [p, e] : f) cout << p << "^" << e << " ";
    cout << "\n";

    ll inv5 = modinv(5, MOD);
    cout << "5^{-1} mod 1e9+7 = " << inv5 << "\n";
    cout << "check: " << 5LL * inv5 % MOD << "\n";
}

Version 2: Explained Version

cpp
#include <bits/stdc++.h>
using namespace std;

using ll = long long;
const int MOD = 1e9 + 7;

// ======= Sieve of Eratosthenes =======
// Returns a list of all primes <= n.
// Time: O(n log log n).  Space: O(n).
// We only start crossing out at i*i because smaller multiples
// of i were already marked by smaller primes.
vector<int> sieve(int n) {
    vector<bool> is_prime(n + 1, true);
    is_prime[0] = is_prime[1] = false;
    for (int i = 2; i * i <= n; i++) {
        if (is_prime[i]) {
            // Mark all multiples of i starting from i^2.
            // Multiples below i^2 (like 2*i, 3*i, ...) were already
            // marked when we processed 2, 3, etc.
            for (int j = i * i; j <= n; j += i)
                is_prime[j] = false;
        }
    }
    vector<int> primes;
    for (int i = 2; i <= n; i++)
        if (is_prime[i]) primes.push_back(i);
    return primes;
}

// ======= Smallest Prime Factor (SPF) Sieve =======
// spf[i] = smallest prime that divides i.
// Allows O(log n) factorization of any n in range.
// iota fills spf[i] = i, so primes keep spf[p] = p.
vector<int> spf_sieve(int n) {
    vector<int> spf(n + 1);
    iota(spf.begin(), spf.end(), 0); // spf[i] = i initially
    for (int i = 2; i * i <= n; i++) {
        if (spf[i] == i) { // i is prime (nobody set a smaller factor)
            for (int j = i * i; j <= n; j += i) {
                if (spf[j] == j) // only update if not already set
                    spf[j] = i;
            }
        }
    }
    return spf;
}

// Factorize n using SPF table: repeatedly divide by spf[n].
// Returns vector of (prime, exponent) pairs.
vector<pair<int,int>> factorize(int n, const vector<int>& spf) {
    vector<pair<int,int>> factors;
    while (n > 1) {
        int p = spf[n], cnt = 0;
        while (n % p == 0) { n /= p; cnt++; }
        factors.emplace_back(p, cnt);
    }
    return factors;
}

// ======= Binary Exponentiation =======
// Computes base^exp % mod in O(log exp).
// Key idea: express exp in binary. For each bit, either
// multiply result by base (if bit is 1) or skip.
// Always keep intermediate values < mod^2 to avoid overflow.
// With mod ~ 1e9, base*base ~ 1e18 which fits in long long.
ll binpow(ll base, ll exp, ll mod) {
    ll result = 1;
    base %= mod; // ensure base < mod to start
    while (exp > 0) {
        if (exp & 1)                     // current bit is 1
            result = result * base % mod;
        base = base * base % mod;        // square the base
        exp >>= 1;                       // shift to next bit
    }
    return result;
}

// ======= Modular Inverse via Fermat's Little Theorem =======
// If mod is prime, a^{mod-1} = 1 (mod), so a^{-1} = a^{mod-2}.
// This is the go-to method for MOD = 1e9+7.
ll modinv(ll a, ll mod) {
    return binpow(a, mod - 2, mod);
}

// ======= Extended GCD =======
// Finds x, y such that a*x + b*y = gcd(a, b).
// When gcd(a, mod) = 1, the x gives us a^{-1} mod b.
// Works for any modulus, not just prime.
ll extgcd(ll a, ll b, ll &x, ll &y) {
    if (b == 0) {
        x = 1; y = 0;
        return a;
    }
    ll x1, y1;
    ll g = extgcd(b, a % b, x1, y1);
    // Back-substitute: a*x + b*y = g
    // becomes b*x1 + (a%b)*y1 = g
    // => a*y1 + b*(x1 - (a/b)*y1) = g
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// Modular inverse via ExtGCD. Returns -1 if no inverse.
ll modinv_general(ll a, ll mod) {
    ll x, y;
    ll g = extgcd(a, mod, x, y);
    if (g != 1) return -1;
    return (x % mod + mod) % mod; // ensure positive
}

// ======= Euler's Totient =======
// phi(n) = count of k in [1,n] with gcd(k,n) = 1.
// Formula: phi(n) = n * product of (1 - 1/p) for each prime p | n.
// We compute it during trial division to avoid floating point.
ll euler_phi(ll n) {
    ll result = n;
    for (ll d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            while (n % d == 0) n /= d;
            result -= result / d; // equivalent to result *= (1 - 1/d)
        }
    }
    if (n > 1) result -= result / n; // remaining prime factor
    return result;
}

// ======= Totient Sieve =======
// Computes phi[i] for all i in [0, n].
// Same idea as prime sieve: if i is prime, update all multiples.
vector<int> phi_sieve(int n) {
    vector<int> phi(n + 1);
    iota(phi.begin(), phi.end(), 0); // phi[i] = i initially
    for (int i = 2; i <= n; i++) {
        if (phi[i] == i) { // i is prime
            for (int j = i; j <= n; j += i)
                phi[j] -= phi[j] / i; // multiply by (1 - 1/i)
        }
    }
    return phi;
}

int main() {
    auto sp = spf_sieve(1000000);

    // Factorize 360 = 2^3 * 3^2 * 5
    auto f = factorize(360, sp);
    for (auto [p, e] : f)
        cout << p << "^" << e << " ";
    cout << "\n"; // Output: 2^3 3^2 5^1

    // Modular inverse: 5 * inv(5) = 1 (mod 1e9+7)
    ll inv5 = modinv(5, MOD);
    cout << 5LL * inv5 % MOD << "\n"; // Output: 1

    // Euler's totient: phi(360) = 96
    cout << euler_phi(360) << "\n"; // Output: 96
}

Operations Reference

OperationTimeSpace
Sieve of Eratosthenes (primes up to N)O(NloglogN)O(N)
SPF sieve (smallest prime factor up to N)O(NloglogN)O(N)
Factorize via SPF tableO(logn)O(logn)
Trial division factorizationO(n)O(logn)
Binary exponentiationO(loge)O(1)
Modular inverse (Fermat)O(logm)O(1)
Extended GCDO(logmin(a,b))O(logmin(a,b)) stack
Euler's totient (single n)O(n)O(1)
Euler's totient sieve up to NO(NloglogN)O(N)
GCD (Euclidean)O(logmin(a,b))O(1)


🔍 Why This Works -- Euler's Totient is Multiplicative

Claim: If gcd(a,b)=1, then φ(ab)=φ(a)φ(b).

Proof sketch via CRT: By the Chinese Remainder Theorem, when gcd(a,b)=1, there's a bijection between Zab and Za×Zb (mapping x(xmoda,xmodb)). Under this bijection, gcd(x,ab)=1 if and only if gcd(xmoda,a)=1 AND gcd(xmodb,b)=1.

So the count of integers coprime to ab (which is φ(ab)) equals the count of pairs where the first is coprime to a and the second is coprime to b -- that's φ(a)φ(b).

This is why φ(n) can be computed from the prime factorization: φ(p1a1pkak)=piai1(pi1).

Problem Patterns & Tricks

Pattern 1: Sieve + Query

Precompute a sieve (prime, SPF, or totient) up to N, then answer Q queries in O(1) or O(logn) each. Nearly every "for each ni, compute ..." problem with ni106 uses this.

cpp
auto sp = spf_sieve(1000000);
while (q--) {
    int n; cin >> n;
    auto f = factorize(n, sp);
    // answer using f
}

Problems: CF 576A, CF 154B

Pattern 2: Divisor Enumeration

Enumerate all divisors of n from its prime factorization, or sieve-style: for each d from 1 to N, iterate over multiples d,2d,3d, Total work is O(NlogN) (harmonic series).

cpp
// All divisors of every number up to N
vector<vector<int>> divisors(N + 1);
for (int d = 1; d <= N; d++)
    for (int m = d; m <= N; m += d)
        divisors[m].push_back(d);

Problems: CF 1512G, CF 236B

Pattern 3: "Modulo 1e9+7" -- Modular Arithmetic Pipeline

When the answer involves factorials, binomial coefficients, or counting: precompute factorial and inverse factorial arrays, then answer combinations in O(1).

cpp
vector<ll> fac(N+1), inv_fac(N+1);
fac[0] = 1;
for (int i = 1; i <= N; i++) fac[i] = fac[i-1] * i % MOD;
inv_fac[N] = modinv(fac[N], MOD);
for (int i = N - 1; i >= 0; i--) inv_fac[i] = inv_fac[i+1] * (i+1) % MOD;

auto C = [&](int n, int k) -> ll {
    if (k < 0 || k > n) return 0;
    return fac[n] % MOD * inv_fac[k] % MOD * inv_fac[n-k] % MOD;
};

Problems: CF 1462E, CF 1312D

Pattern 4: GCD/LCM Manipulation

Problems asking about GCD of subarrays, LCM of subsets, or "make all elements equal by GCD operations." Key insight: GCD of a range can only decrease (or stay) as you extend the range, giving at most O(logmax) distinct values.

Problems: CF 1632D, CF 475D

Pattern 5: Euler's Totient in Counting

EULER'S TOTIENT phi(12) -- WHICH NUMBERS ARE COPRIME TO 12?

  12 = 2^2 * 3

   1  2  3  4  5  6  7  8  9  10  11  12
   Y  .  .  .  Y  .  Y  .  .   .   Y   .

  Y = gcd(k, 12) = 1    ->  {1, 5, 7, 11}  ->  phi(12) = 4
  Formula: 12 * (1 - 1/2) * (1 - 1/3) = 12 * 1/2 * 2/3 = 4

"Count pairs (a,b) with 1abn and gcd(a,b)=1." Answer: i=1nϕ(i). Or "count lattice points visible from origin" -- same idea.

Problems: CF 1295D, CF 431C

Pattern 6: Multiplicative Functions + Sieve

Problems where you need to compute a multiplicative function f(n) for all nN. Use a modified sieve: for each prime p, update f at multiples of p. Common for Mobius function μ(n) or sum-of-divisors.

Problems: CF 1139D, CF 757E

Pattern 7: Chinese Remainder Theorem (CRT)

Given multiple modular constraints, combine them into one. Used when a problem gives conditions modulo different primes or when you need to recover a value from remainders.

cpp
// CRT for two equations: x = r1 (mod m1), x = r2 (mod m2)
// Requires gcd(m1, m2) = 1
pair<ll,ll> crt(ll r1, ll m1, ll r2, ll m2) {
    ll x, y;
    extgcd(m1, m2, x, y);
    ll mod = m1 * m2;
    ll ans = (r1 + m1 * ((r2 - r1) % m2 * x % m2) % mod + mod) % mod;
    return {ans, mod};
}

Problems: CF 687B, CF 338D


Contest Cheat Sheet

+--------------------------------------------------------------+
|                  NUMBER THEORY CHEAT SHEET                    |
+--------------------------------------------------------------+
| WHEN TO USE:                                                  |
|  - "count/check primes" --> sieve                            |
|  - "answer modulo 1e9+7" --> binpow + modinv                 |
|  - "gcd of range/subset" --> gcd properties                  |
|  - "coprime pairs" --> Euler totient                          |
|  - "multiple mod constraints" --> CRT                         |
+--------------------------------------------------------------+
| KEY CODE:                                                     |
|  ll binpow(ll b, ll e, ll m) {                               |
|    ll r=1; b%=m;                                              |
|    while(e>0){if(e&1)r=r*b%m; b=b*b%m; e>>=1;} return r; }  |
|  ll modinv(ll a, ll m) { return binpow(a, m-2, m); }         |
+--------------------------------------------------------------+
| COMPLEXITIES:                                                 |
|  Sieve to N        : O(N log log N) time, O(N) space         |
|  SPF factorize     : O(log n) per query after O(N) sieve     |
|  Trial division    : O(sqrt(n)) per number                    |
|  binpow / modinv   : O(log n) time, O(1) space               |
|  ExtGCD            : O(log n) time                            |
+--------------------------------------------------------------+
| OVERFLOW GUARD: 1LL * a * b % MOD  (not a * b % MOD)         |
| PRIME CHECK: MOD = 1e9+7 is prime --> use Fermat inverse      |
+--------------------------------------------------------------+

Gotchas & Debugging

  • Overflow on multiplication. Two int values multiplied can overflow int silently. Always cast: 1LL * a * b % MOD. If MOD is near 1018, even long long * long long overflows -- use __int128 or binary multiplication.

  • Sieve size off-by-one. If you need primes up to N, allocate vector<bool>(N + 1). Accessing is_prime[N] on a size-N vector is UB.

  • Forgetting base case in binpow. binpow(0, 0, MOD) returns 1 in the template above. Make sure that's what you want.

  • Negative mod results. In C++, -7 % 3 == -1, not 2. Always do (x % MOD + MOD) % MOD when x might be negative.

  • ExtGCD returning negative x. The x from extgcd can be negative. Normalize: (x % mod + mod) % mod.

  • Using pow() from <cmath>. Floating-point pow(2, 30) might return 1073741823.9999 and truncate to the wrong integer. Use binpow or 1LL << 30.

  • Sieving too large. Sieve of 107 is fine (~10 MB). Sieve of 108 needs ~100 MB and may TLE or MLE on CF. For n up to 1012, use trial division up to n instead.

  • Totient sieve vs single totient. Totient sieve up to N is O(NloglogN). For a single large n, use the trial-division formula.

  • Inverse of 0. modinv(0, MOD) is undefined. If your code computes binpow(0, MOD-2, MOD), it returns 0 -- silently wrong. Guard against it.

Mental Traps

"Primality testing = sieve." Sieve is for bulk computation: all primes up to N. Primality testing is for one number. For a single n1018, use Miller-Rabin (O(log2n)). For all primes up to 107, use sieve. Using sieve for 1018 (impossible memory) or Miller-Rabin for "all primes up to 106" (1000x slower than sieve) are category errors.

PRIMALITY TESTING != PRIME ENUMERATION:

  +----------------------------------------------------------+
  |  "Is 10^18 + 7 prime?"   ->  Miller-Rabin  (per-query)  |
  |  "List all primes <= 10^7" ->  Sieve         (bulk)      |
  |                                                          |
  |  WRONG: sieve up to 10^18      (impossible: 10^18 bytes)|
  |  WRONG: Miller-Rabin * 10^7    (works but 1000x slower) |
  +----------------------------------------------------------+

"Euler's theorem always works: aϕ(m)1(modm)." Only when gcd(a,m)=1. When gcd(a,m)>1 the theorem does not apply. Example: 4ϕ(6)=42=164(mod6), not 1. Blindly applying Euler's theorem to reduce exponents when a shares a factor with m gives wrong answers -- and the output looks plausible, so you won't catch it without a targeted test case.


Common Mistakes

  1. Sieve inner loop starting at 2*i instead of i*i. Both are correct, but 2*i is O(n log n) instead of O(n log log n). For n = 10^7 this matters.
  2. Forgetting that 1 is not prime. Special-case n = 1 in primality tests and factorization routines.
  3. Euler's totient for prime powers. phi(p^k) = p^{k-1} * (p - 1), not p^k - 1. The formula involves the prime base, not just the power.
  4. Integer overflow in sieve. i * i overflows int when i > 46340. Use (long long)i * i or start the inner loop at i * i with a long long variable.
  5. Forgetting to handle the remaining factor. After trial division up to sqrt(n), if n > 1 then n itself is a prime factor. Omitting this loses the largest prime.
  6. Brute-forcing pairs when counting by GCD. For "count pairs (i,j) with gcd(ai,aj)=1" on N=105, an O(N2) loop TLEs, and "for each value, count coprime elements" is still O(Nmax_val). The right move is Möbius inversion: let cnt[d] = number of elements divisible by d, then recover pairs with gcd = d by inclusion-exclusion over multiples -- total O(max_vallogmax_val). When you need to count pairs by GCD, think multiplicative functions and inversion, not brute force. The harmonic sum (N/d) is O(NlogN), and it is your friend.

Debug Drills

Drill 1. Sieve marks primes as composite.

cpp
vector<bool> is_prime(n + 1, true);
is_prime[0] = is_prime[1] = false;
for (int i = 2; i <= n; i++) {
    if (is_prime[i]) {
        for (int j = i + i; j <= n; j += i)
            is_prime[j] = false;
    }
}
// Works, but TLE for n = 10^7. What's the optimization?
What's wrong?

Start the inner loop at j = i * i instead of j = i + i. All multiples i*k where k < i have already been sieved by smaller primes. Also change outer loop to i * i <= n. This reduces work from O(N log log N) with a larger constant to the theoretical minimum.

Note: when i * i overflows int, use (long long)i * i <= n.

Drill 2. Wrong totient computation.

cpp
int phi(int n) {
    int result = n;
    for (int p = 2; p * p <= n; p++) {
        if (n % p == 0) {
            result -= result / p;
            while (n % p == 0) n /= p;
        }
    }
    // Missing something here?
    return result;
}
What's wrong?

If n > 1 after the loop, there's a remaining prime factor larger than √(original n). Add:

cpp
if (n > 1) result -= result / n;

Without this, numbers like φ(14) = φ(2x7) would miss the factor of 7.

Drill 3. Integer overflow in factorization check.

cpp
vector<int> factorize(int n) {
    vector<int> factors;
    for (int d = 2; d * d <= n; d++) {  // What if n ~ 10^9?
        while (n % d == 0) {
            factors.push_back(d);
            n /= d;
        }
    }
    if (n > 1) factors.push_back(n);
    return factors;
}
What's wrong?

d * d overflows int when d ~= 46341 (since 46341² > 2^31 - 1). Fix: use (long long)d * d <= n in the loop condition. Alternatively, precompute sqrt_n and compare d <= sqrt_n.


Self-Test

  • [ ] Write the Sieve of Eratosthenes from memory -- including the p2 optimization and correct array bounds
  • [ ] Compute ϕ(12) by hand using 12=223 and the formula ϕ(n)=n(11/p1)(11/p2)
  • [ ] State Fermat's little theorem and use it to compute 3100mod7 without a calculator
  • [ ] Explain when trial division is fast enough vs when Miller-Rabin is needed -- give the size threshold
  • [ ] State what "multiplicative function" means and give two examples (ϕ, d)

Practice Problems

RatingYou should be able to...
CF 1200Implement basic sieve, check primality, compute GCD
CF 1500Use SPF sieve for O(log n) factorization; enumerate divisors in O(√n)
CF 1800Apply Euler's totient formula; use harmonic sieve for bulk divisor queries
CF 2100Implement Möbius inversion; solve multiplicative-function problems; optimize with Dirichlet convolution
#ProblemSourceDifficultyKey Idea
1Almost PrimeCF 26A800Count numbers with exactly 2 distinct prime factors (sieve + factorize)
2T-primesCF 230B1300A number has exactly 3 divisors iff it's a perfect square of a prime
3Noldbach ProblemCF 17A1000Sieve + check if prime is sum of two consecutive primes
4Sherlock and his GirlfriendCF 776B1200SPF sieve to separate primes from composites
5Divisor SummationCF 1512G1500Enumerate divisors via harmonic sieve, answer queries
6Obtain the StringCF 1295D1600Euler totient application (Obtain a string via coprime counting)
7Array GCDCF 475D1700GCD of subarrays -- distinct GCD values are O(nlogmax)
8Same GCDsCF 1295D1700Direct Euler totient: count x in [0,m) with gcd(a+x,m)=gcd(a,m)
9Orac and LCMCF 1350B1400Suffix GCD + pairwise LCM insight
10Bash and a Tough Math PuzzleCF 914D1900Segment tree on GCD with single-element replacement queries
11ExponentiationCSES1300Modular exponentiation
12Counting DivisorsCSES1400Sieve-based divisor count for multiple queries
13Common DivisorsCSES1500Find maximum GCD of any pair
14Sum of DivisorsCSES1600Harmonic sum trick for divisor sums
15Prime MultiplesCSES1700Inclusion-exclusion with prime factors

Further Reading


Recap

  • Sieve of Eratosthenes: mark composites in O(n log log n). Linear sieve gives O(n) with smallest prime factor for each number.
  • Trial division: factor n in O(sqrt(n)) by testing primes up to sqrt(n). Don't forget the remaining factor if n > 1.
  • Euler's totient: phi(n) = n * product(1 - 1/p) over distinct prime factors p. Multiplicative function.
  • Key pattern: "factor the number, formula the answer" -- most number theory problems reduce to prime factorization plus a formula.

Built for the climb from Codeforces 1100 to 2100.