Appearance
Randomization Techniques
Quick Revisit
- USE WHEN: Deterministic solution can be hacked (adversarial input), or you need random pivots/hashes/bases
- INVARIANT: Randomize the part the adversary controls (pivot choice, hash base/mod) so worst-case inputs don't exist
- TIME: Expected O(n log n) for randomized quicksort/select; O(n) for hashing | SPACE: O(n)
- CLASSIC BUG: Using a fixed seed or fixed hash constants—adversary reverse-engineers them and forces collisions/worst-case
- PRACTICE: 01-ladder-1100-to-1400
Difficulty: Intermediate | CF Rating Range: 1400–1800 | Prerequisites: Math Fundamentals, String Hashing | Applies to: CF 1100–2100
Contents
- Intuition
- Core Mechanics
- C++ STL Reference -- The
<random>Toolkit - Randomized Hashing
- Randomized Algorithms
- The Birthday Paradox and Collision Analysis
- Anti-Hack Strategies for Codeforces
- Common Patterns
- Complexity Analysis
- Edge Cases and Pitfalls
- Contest Cheat Sheet
- Self-Test
- Practice Problems
- Probabilistic Method in Competitive Programming
- If I Had 5 Minutes
- Pattern Fingerprint
- Rating Progression
- Before You Code Checklist
- What Would You Do If...?
- Historical Note
- The Mistake That Teaches
- When NOT to Use Randomization
- Debug This
- Cross-References
- Extended Practice Problems
- Further Reading
- Next Up
- Appendix: Quick Reference
Intuition
You submit a clean deterministic solution on Codeforces. It passes pretests. Then during the hacking phase, someone drops a crafted input and your solution gets TLE or WA.
Consider quicksort with the classic "always pick the first element as pivot" strategy. Against a sorted array of
text
Input: [1, 2, 3, 4, 5, ..., 100000]
Pivot: 1
\--- partition splits into (0 elements) | (99999 elements)
Pivot: 2
\--- partition splits into (0 elements) | (99998 elements)
...
Result: n + (n-1) + (n-2) + ... + 1 = n*(n+1)/2 comparisons --> TLEOr consider polynomial string hashing with a fixed base and modulus. An adversary reverse-engineers your constants and constructs colliding strings, turning your
Deterministic algorithms have knowable worst cases—and on competitive programming platforms with open hacking, the adversary will find them.
"Adding randomness makes adversarial attacks impossible—the enemy can't predict your random choices."
When you pick a random pivot, no input arrangement is consistently bad. When you pick a random hash base at runtime, no pre-built collision set works. The adversary would need to predict your random seed—which they cannot.
The shift in thinking: instead of guaranteeing
Visual Walkthrough—Random Pivot Selection
text
Array: [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
--- Deterministic (first element) on sorted input ---
Sorted: [1, 1, 2, 3, 3, 4, 5, 5, 6, 9]
Pivot=1 -> split: [] | [1,2,3,3,4,5,5,6,9] depth grows linearly
Pivot=1 -> split: [] | [2,3,3,4,5,5,6,9]
... O(n) depth --> O(n^2) total
--- Randomized pivot on the SAME sorted input ---
Sorted: [1, 1, 2, 3, 3, 4, 5, 5, 6, 9]
Random pivot=5 -> split: [1,1,2,3,3,4] | [5,6,9] ~balanced!
Left: random pivot=2 -> [1,1] | [3,3,4] ~balanced!
Right: random pivot=6 -> [5] | [9] ~balanced!
... O(log n) expected depth --> O(n log n) expected totalEach random pivot has at least a
The Invariant
"Expected running time is
regardless of input distribution."
For randomized quicksort, the expected number of comparisons is
For randomized hashing with base chosen uniformly from
What the Code Won't Teach You
The templates show how to randomize, but not why each choice matters. Here are insights that only come from debugging wrong submissions:
Hashing safety is about the modulus, not the base. Polynomial rolling hash treats a string as a number in base
text
Modulus M Collision prob (n=10^5 comparisons)
+--------------------------------------------------+
| 10^9 + 7 ~10^10 / 2*10^9 ~ 5.0 | <-- UNSAFE (expect collisions!)
| 2^61 - 1 ~10^10 / 2^62 ~ 2*10^{-9} | <-- Safe
| Double hash ~10^10 / 2^122 ~ negligible | <-- Very safe
+--------------------------------------------------+unordered_map is a landmine. The default hash function on Codeforces is known. An adversary constructs SafeHash struct from the Anti-Hack Strategies section, which XORs each key with a runtime random constant—shifting the collision distribution from adversary-controlled to uniformly random.
text
Default unordered_map with adversarial keys:
Bucket 0: key1 -> key2 -> key3 -> ... -> keyN O(n) lookup!
Bucket 1: (empty)
Bucket 2: (empty)
...
SafeHash unordered_map (same keys, random salt):
Bucket 0: key7 -> key42 O(1) expected
Bucket 1: key3 -> key99 -> key15 O(1) expected
Bucket 2: key1 O(1) expected
... (keys distributed ~uniformly)When randomization is the intended solution, look for signals:
When to Reach for This
Trigger phrases and situations:
- "anti-hack"—you need a solution that cannot be broken by adversarial input
- "probabilistic"—the problem explicitly allows small error probability
- "expected value solution"—you need expected
, not worst-case - Hash collisions—your deterministic hash keeps getting hacked
- Randomized algorithms—Miller-Rabin, Pollard rho, Freivalds' algorithm
- Large search spaces—random sampling finds a witness faster than enumeration
Core Mechanics
Random Number Generation in C++
Never use rand()—it is slow, low quality, and only 15-bit on some platforms. Use the Mersenne Twister mt19937 or mt19937_64:
cpp
#include <bits/stdc++.h>
using namespace std;
// Seed with chrono to prevent predictable output
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
mt19937_64 rng64(chrono::steady_clock::now().time_since_epoch().count());
// Generate a random integer in [a, b] inclusive
int rand_int(int a, int b) {
return uniform_int_distribution<int>(a, b)(rng);
}
// Generate a random long long in [a, b] inclusive
long long rand_ll(long long a, long long b) {
return uniform_int_distribution<long long>(a, b)(rng64);
}Why chrono seed? A fixed seed lets a hacker replicate your random sequence. A chrono-based seed changes with every submission.
Random Shuffling
The simplest and most powerful anti-hack technique—shuffling the input before processing destroys any adversarial ordering:
cpp
vector<int> a(n);
for (int i = 0; i < n; i++) cin >> a[i];
shuffle(a.begin(), a.end(), rng);
// Now process a[] -- no adversarial ordering possibleUse it for quicksort, treap insertion, or any algorithm whose worst case depends on input order.
Random Pivot Selection (Quicksort / Quickselect)
cpp
int partition(vector<int>& a, int lo, int hi) {
int pivot_idx = rand_int(lo, hi);
swap(a[pivot_idx], a[hi]);
int pivot = a[hi];
int i = lo;
for (int j = lo; j < hi; j++) {
if (a[j] <= pivot) {
swap(a[i], a[j]);
i++;
}
}
swap(a[i], a[hi]);
return i;
}
void quicksort(vector<int>& a, int lo, int hi) {
if (lo >= hi) return;
int p = partition(a, lo, hi);
quicksort(a, lo, p - 1);
quicksort(a, p + 1, hi);
}Expected comparisons:
C++ STL Reference -- The <random> Toolkit
Everything you need lives in <random> (plus shuffle from <algorithm>). Forget rand() and srand()—15-bit range on MSVC, global state, non-uniform distribution. The modern C++ facilities are fast, high-quality, and safe when seeded properly.
| Class / Function | Header | What It Does | Complexity |
|---|---|---|---|
mt19937 | <random> | 32-bit Mersenne Twister engine; period | |
mt19937_64 | <random> | 64-bit variant for long long ranges | |
uniform_int_distribution<T>(a, b) | <random> | Uniform integer in | |
uniform_real_distribution<T>(a, b) | <random> | Uniform real in | |
discrete_distribution<T>({w₀, w₁, ...}) | <random> | Weighted random index; returns | |
bernoulli_distribution(p) | <random> | Returns true with probability | |
shuffle(first, last, gen) | <algorithm> | Uniform random permutation (Fisher-Yates internally) | |
sample(first, last, out, n, gen) | <algorithm> | Choose | |
nth_element(first, nth, last) | <algorithm> | Rearranges so element at nth is in sorted position; |
Usage patterns you'll actually reach for in contest:
cpp
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
// Random int in [lo, hi]
int x = uniform_int_distribution<int>(lo, hi)(rng);
// Random long long -- use the 64-bit engine
mt19937_64 rng64(chrono::steady_clock::now().time_since_epoch().count());
long long y = uniform_int_distribution<long long>(lo, hi)(rng64);
// Shuffle a vector -- kills adversarial ordering
shuffle(v.begin(), v.end(), rng);
// Pick k random elements without replacement (C++17)
vector<int> chosen;
sample(v.begin(), v.end(), back_inserter(chosen), k, rng);
// k-th smallest element without full sort
nth_element(v.begin(), v.begin() + k, v.end());
// v[k] is now the (k+1)-th smallest; elements before it are <= v[k]text
Distribution overview:
uniform_int_distribution(1, 6) -> 1 2 3 4 5 6 (each 1/6)
uniform_real_distribution(0, 1) -> [0.0 ................. 1.0)
bernoulli_distribution(0.3) -> false (70%) | true (30%)
discrete_distribution({3, 1, 1}) -> 0 (60%) 1 (20%) 2 (20%)Two things that bite people:
random_shufflewas removed in C++17. Always useshufflewith an explicit engine as the third argument. If you seerandom_shufflein old code or editorials, replace it immediately.rng() % nis biased whenrng.max() + 1is not divisible byn. Useuniform_int_distributioninstead—it handles rejection sampling internally so every value in the range is equally likely.
Randomized Hashing
Polynomial Hashing with Random Base
The standard polynomial hash of string
Anti-hack version: choose
cpp
const long long MOD = (1LL << 61) - 1; // Mersenne prime, very fast modular arithmetic
long long mod_mul(long long a, long long b) {
// Compute (a * b) % MOD without overflow using __int128
return (__int128)a * b % MOD;
}
long long mod_add(long long a, long long b) {
a += b;
if (a >= MOD) a -= MOD;
return a;
}
struct RollingHash {
int n;
long long base;
vector<long long> h, pw;
RollingHash(const string& s) : n(s.size()), h(n + 1), pw(n + 1) {
base = rand_ll(256, MOD - 1);
pw[0] = 1;
for (int i = 0; i < n; i++) {
pw[i + 1] = mod_mul(pw[i], base);
h[i + 1] = mod_add(mod_mul(h[i], base), s[i]);
}
}
// Hash of s[l..r] (0-indexed, inclusive)
long long query(int l, int r) {
return mod_add(h[r + 1], MOD - mod_mul(h[l], pw[r - l + 1]));
}
};Why
Double Hashing (Belt and Suspenders)
For problems where even a single collision is fatal, use two independent random bases:
cpp
struct DoubleHash {
RollingHash h1, h2;
DoubleHash(const string& s) : h1(s), h2(s) {}
pair<long long, long long> query(int l, int r) {
return {h1.query(l, r), h2.query(l, r)};
}
};Collision probability drops to
Randomized Algorithms
Miller-Rabin Primality Test
Deterministic primality testing typically uses trial division up to
cpp
long long power(long long a, long long d, long long mod) {
long long result = 1;
a %= mod;
while (d > 0) {
if (d & 1) result = (__int128)result * a % mod;
a = (__int128)a * a % mod;
d >>= 1;
}
return result;
}
bool miller_rabin(long long n, long long a) {
if (n % a == 0) return n == a;
long long d = n - 1;
int r = 0;
while (d % 2 == 0) { d /= 2; r++; }
long long x = power(a, d, n);
if (x == 1 || x == n - 1) return true;
for (int i = 0; i < r - 1; i++) {
x = (__int128)x * x % n;
if (x == n - 1) return true;
}
return false;
}
bool is_prime(long long n) {
if (n < 2) return false;
// Deterministic for n < 3.317e23 with these bases
for (long long a : {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37}) {
if (!miller_rabin(n, a)) return false;
}
return true;
}The fixed witness set
Freivalds' Algorithm (Matrix Multiplication Verification)
Given three
Pick a random binary vector
cpp
bool freivalds(vector<vector<long long>>& A,
vector<vector<long long>>& B,
vector<vector<long long>>& C, int n, int iterations = 20) {
for (int iter = 0; iter < iterations; iter++) {
vector<long long> r(n);
for (int i = 0; i < n; i++) r[i] = rng() & 1;
// Compute Br
vector<long long> Br(n, 0);
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
Br[i] += B[i][j] * r[j];
// Compute A(Br)
vector<long long> ABr(n, 0);
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
ABr[i] += A[i][j] * Br[j];
// Compute Cr
vector<long long> Cr(n, 0);
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
Cr[i] += C[i][j] * r[j];
if (ABr != Cr) return false;
}
return true; // Correct with probability >= 1 - 2^{-iterations}
}Error probability after
Random Sampling / Witness Search
Many problems reduce to: find an element with property
Example: given a multiset where the majority element appears
cpp
// Find an element appearing > n/2 times (guaranteed to exist)
int find_majority(vector<int>& a) {
int n = a.size();
while (true) {
int candidate = a[rand_int(0, n - 1)];
int count = 0;
for (int x : a) count += (x == candidate);
if (count > n / 2) return candidate;
}
// Expected iterations: < 2
}Las Vegas vs Monte Carlo — A Deeper Look
Las Vegas algorithms always produce the correct answer; only the running time is random. Think of them as "keep trying until you succeed."
Monte Carlo algorithms always terminate within a fixed time bound, but the answer may be wrong with some bounded probability
text
+------------------------------------------------------------------+
| Property | Las Vegas | Monte Carlo |
+------------------------------------------------------------------+
| Correctness | ALWAYS correct | Correct w.p. >= 1-eps |
| Runtime | Random (expected bound) | Deterministic bound |
| Error control | Run longer / retry | Repeat k times: |
| | | error <= eps^k |
| CP examples | Randomized Quicksort | Miller-Rabin (random |
| | Randomized Quickselect | witnesses) |
| | Treap insert/delete | Freivalds' algorithm |
| | Random pivot partition | Randomized hashing |
| When to use | Need exact answer, | Approximate answer OK, |
| | can tolerate variable | need guaranteed speed |
| | runtime | |
+------------------------------------------------------------------+Converting between the two:
- Monte Carlo → Las Vegas: if you can verify the answer cheaply, run the MC algorithm, verify, and retry on failure. For example, Miller-Rabin identifies a candidate; trial division up to small bounds confirms it.
- Las Vegas → Monte Carlo: run the LV algorithm for a fixed time budget. If it hasn't finished, output a default answer. You sacrifice correctness for a timing guarantee.
text
Las Vegas runtime distribution Monte Carlo error reduction
by repetition
Probability Error prob
^ ^
| * | *
| * * | *
| * * | *
| * * * | * *
| * * * * * | * * * *
+-------------------> Time +--------------------> k (rounds)
Expected = O(n log n) eps^k shrinks fast
(but could be longer) (20 rounds -> ~10^{-6})Contest rule of thumb: if the problem says "print any valid answer" or has
Cross-reference: Miller-Rabin (above) is the classic Monte Carlo example; randomized quicksort (Random Pivot Selection) is the classic Las Vegas example.
Randomized Quickselect — Expected O(n) Median Finding
Quickselect finds the
Partition around a random pivot—exactly like quicksort—but recurse into only one side: the side that contains index
text
Find the 4th smallest element (k=3, 0-indexed) in [7, 2, 9, 1, 5, 8, 3]:
Step 1: Random pivot = 5
Partition: [2, 1, 3] | 5 | [7, 9, 8]
pivot is at index 3 == k --> FOUND! Answer = 5
(Lucky case. Usually takes a few rounds.)
Step 2 (if pivot had landed at index 5):
Random pivot = 8
Partition: [7, 2, 1, 5, 3] | 8 | [9]
pivot at index 5, k=3 < 5 --> recurse LEFT on [7, 2, 1, 5, 3]
Step 3: Random pivot = 2
Partition: [1] | 2 | [7, 5, 3]
pivot at index 1, k=3 > 1 --> recurse RIGHT on [7, 5, 3], k'=3-2=1
Step 4: Random pivot = 5
Partition: [3] | 5 | [7]
pivot at index 1 == k' --> FOUND! Answer = 5Full C++ implementation:
cpp
#include <bits/stdc++.h>
using namespace std;
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
// Returns the k-th smallest element (0-indexed) in a[lo..hi]
int quickselect(vector<int>& a, int lo, int hi, int k) {
while (lo < hi) {
// Random pivot
int pivot_idx = uniform_int_distribution<int>(lo, hi)(rng);
swap(a[pivot_idx], a[hi]);
int pivot = a[hi];
// Three-way partition (handles duplicates efficiently)
int lt = lo, gt = hi; // a[lo..lt-1] < pivot, a[gt+1..hi] > pivot
int i = lo;
while (i <= gt) {
if (a[i] < pivot) swap(a[i++], a[lt++]);
else if (a[i] > pivot) swap(a[i], a[gt--]);
else i++;
}
// Now a[lt..gt] == pivot
if (k < lt) hi = lt - 1;
else if (k > gt) lo = gt + 1;
else return a[k]; // pivot is the answer
}
return a[lo];
}
// Convenience: find the k-th smallest in the entire array
int kth_smallest(vector<int> a, int k) {
// Note: pass by value to avoid modifying the original
return quickselect(a, 0, (int)a.size() - 1, k);
}Complexity analysis:
- Expected comparisons:
(proof via linearity of expectation). - Each pivot lands in the middle half with probability
—a "good" pivot that reduces the subarray by at least . - After a good pivot, remaining size
. Expected good pivots before the next recursive call: . - Total work:
. - Worst case is
, but the probability of exceeding is exponentially small.
Contest tip: for safety, shuffle the array before calling quickselect, or use
nth_elementfrom the STL, which uses introselect (quickselect with a fallback to median-of-medians for guaranteedworst case).
Treap — Randomized Binary Search Tree
A treap (tree + heap) is a BST where each node also carries a random priority. It maintains BST order on keys and max-heap order on priorities simultaneously.
Why it works: random priorities force the tree into the shape of a random BST, which has expected depth
text
Treap with keys {1, 3, 5, 7, 9} and random priorities:
BST property: left < root < right (on keys)
Heap property: parent priority > child priority
(5, pri=97)
/ \
(3, pri=84) (9, pri=91)
/ /
(1, pri=32) (7, pri=45)
Verify BST on keys: 1 < 3 < 5 < 7 < 9 <-- correct
Verify heap on pri: 97 > 84, 97 > 91 <-- root highest
84 > 32 <-- left subtree
91 > 45 <-- right subtreeCore Operations—Split and Merge:
All other treap operations reduce to two primitives:
text
SPLIT(t, key) --> (L, R)
L contains all nodes with key <= key
R contains all nodes with key > key
(5,97) (3,84) (5,97)
/ \ split(4) / / \
(3,84) (9,91) --------> (1,32) (empty) (9,91)
/ / /
(1,32) (7,45) L (7,45)
R
MERGE(L, R) --> T (all keys in L < all keys in R)
Pick root by higher priority, recursively merge the rest.
(3,84) (9,91) (9,91)
/ / merge / \
(1,32) (7,45) --------> (3,84) (empty)
/ \
(1,32) (7,45)
91 > 84, so 9 is root.
Left of 9 = merge(L, left_child_of_9) = merge({1,3}, {7})
84 > 45, so 3 is root of that subtree.
Final: (9, pri=91)
/
(3, pri=84)
/ \
(1, pri=32) (7, pri=45)Full C++ implementation (implicit treap with split/merge):
cpp
#include <bits/stdc++.h>
using namespace std;
mt19937 rng_treap(chrono::steady_clock::now().time_since_epoch().count());
struct Node {
int key, priority, size;
Node *left, *right;
Node(int k) : key(k), priority(rng_treap()), size(1),
left(nullptr), right(nullptr) {}
};
int sz(Node* t) { return t ? t->size : 0; }
void update(Node* t) {
if (t) t->size = 1 + sz(t->left) + sz(t->right);
}
// Split treap t into L (keys <= key) and R (keys > key)
void split(Node* t, int key, Node*& L, Node*& R) {
if (!t) { L = R = nullptr; return; }
if (t->key <= key) {
split(t->right, key, t->right, R);
L = t;
} else {
split(t->left, key, L, t->left);
R = t;
}
update(t);
}
// Merge two treaps L and R (all keys in L < all keys in R)
Node* merge(Node* L, Node* R) {
if (!L || !R) return L ? L : R;
if (L->priority > R->priority) {
L->right = merge(L->right, R);
update(L);
return L;
} else {
R->left = merge(L, R->left);
update(R);
return R;
}
}
// Insert key into treap rooted at t
void insert(Node*& t, int key) {
Node *L, *R;
split(t, key, L, R);
t = merge(merge(L, new Node(key)), R);
}
// Erase one occurrence of key from treap rooted at t
void erase(Node*& t, int key) {
if (!t) return;
if (t->key == key) {
Node* tmp = merge(t->left, t->right);
delete t;
t = tmp;
} else if (key < t->key) {
erase(t->left, key);
} else {
erase(t->right, key);
}
update(t);
}
// Find k-th smallest (0-indexed)
int kth(Node* t, int k) {
int left_size = sz(t->left);
if (k < left_size) return kth(t->left, k);
if (k == left_size) return t->key;
return kth(t->right, k - left_size - 1);
}
// Count elements strictly less than key
int count_less(Node* t, int key) {
if (!t) return 0;
if (t->key < key) return 1 + sz(t->left) + count_less(t->right, key);
return count_less(t->left, key);
}Why Treap over std::set?
std::set is fine for insert/find/erase, but it cannot:
- Find the
-th smallest element in - Count elements in a range in
- Split and merge in
—essential for interval operations - Serve as a sequence with split/merge at arbitrary positions (implicit treap)
text
Operation comparison:
+----------------------------------------------------------+
| Operation | std::set | Treap | Notes |
+----------------------------------------------------------+
| Insert/Delete | O(log n) | O(log n)* | *expected |
| Find k-th element | O(n) | O(log n)* | treap wins |
| Split at key | O(n) | O(log n)* | treap wins |
| Merge two sets | O(n log n) | O(log n)* | treap wins |
| Count in range | O(n) | O(log n)* | treap wins |
+----------------------------------------------------------+
*expected time with random prioritiesCross-reference: Treaps combine ideas from Sorting and Searching (BST property) and randomization (heap with random priorities).
Reservoir Sampling & Fisher-Yates Shuffle
Reservoir Sampling (Algorithm R)
Problem: given a stream of
Maintain a "reservoir" of
text
Stream: a, b, c, d, e, f, g (k=3)
Step 1: reservoir = [a, b, c] (first k items)
Step 2: i=4, include d with prob 3/4=0.75
Say YES -> replace random slot -> [a, d, c]
Step 3: i=5, include e with prob 3/5=0.60
Say NO -> reservoir unchanged -> [a, d, c]
Step 4: i=6, include f with prob 3/6=0.50
Say YES -> replace random slot -> [f, d, c]
Step 5: i=7, include g with prob 3/7=0.43
Say NO -> reservoir unchanged -> [f, d, c]
Each of the 7 elements had exactly 3/7 probability of being in the final set.Full C++ implementation:
cpp
#include <bits/stdc++.h>
using namespace std;
mt19937 rng_reservoir(chrono::steady_clock::now().time_since_epoch().count());
// Reservoir sampling: select k elements uniformly from stream
vector<int> reservoir_sample(vector<int>& stream, int k) {
int n = stream.size();
vector<int> reservoir(stream.begin(), stream.begin() + min(k, n));
for (int i = k; i < n; i++) {
int j = uniform_int_distribution<int>(0, i)(rng_reservoir);
if (j < k) {
reservoir[j] = stream[i];
}
}
return reservoir;
}Correctness proof sketch: by induction. After processing
- Base case (
): all elements are in the reservoir, . - Inductive step: element
enters with probability . An existing element survives if either: (a) the new element is not chosen: probability , or (b) the new element is chosen but replaces a different slot: probability . Total survival probability: . By the inductive hypothesis, the element was present with probability . New probability: . ∎
Fisher-Yates Shuffle (Knuth Shuffle)
Problem: generate a uniformly random permutation of an array in
cpp
void fisher_yates(vector<int>& a) {
int n = a.size();
for (int i = n - 1; i > 0; i--) {
int j = uniform_int_distribution<int>(0, i)(rng);
swap(a[i], a[j]);
}
}Correctness: at step
text
Fisher-Yates on [A, B, C, D]:
Step i=3: swap a[3] with a[rand(0..3)]
4 choices: [A,B,C,D] [A,B,D,C] [A,D,C,B] [D,B,C,A]
(swap w/0) (swap w/1) (swap w/2) (swap w/3=noop)
Step i=2: swap a[2] with a[rand(0..2)]
3 choices for each of the 4 above = 12 states
Step i=1: swap a[1] with a[rand(0..1)]
2 choices for each = 24 states = 4! permutations <-- all equally likelyWarning:
random_shufflewas deprecated in C++14 and removed in C++17. Always useshuffle(a.begin(), a.end(), rng)instead.
Anti-Hash Attack — Why Random Bases Are Essential
This section digs into the mechanics of hash attacks, complementing the overview in the Anti-Hack Strategies section.
How Fixed Bases Get Hacked
With a fixed hash base (e.g., base = 31, mod = 1e9+7), an attacker can reverse-engineer collisions two ways:
Polynomial root attack: a collision between strings
means —a polynomial of degree in , with at most roots mod . Since is known, the attacker constructs strings that are roots of that polynomial. Thue-Morse attack: strings built from the Thue-Morse sequence produce collisions for any base against a given modulus. This is the method behind the well-known
unordered_maphack on Codeforces.
text
Attack on fixed base=31, mod=10^9+7:
Attacker knows: hash("ab") = 31*97 + 98 = 3105 (mod 10^9+7)
Attacker constructs s2 with hash("s2") = 3105 (mod 10^9+7)
--> guaranteed collision!
With random base chosen at RUNTIME:
Attacker prepares collision for base=31... but you picked base=7392841.
--> collision probability = L / (2^61 - 1) ~ 10^{-13}
+-------------------------------------------------------+
| "A fixed base is a lock whose key is publicly known. |
| A random base is a lock whose key is chosen after |
| the attacker has already left." |
+-------------------------------------------------------+Birthday Paradox Collision Probability Table
How many hash comparisons
text
+----------------------------------------------------------------+
| Hash space | Size | Q for 50% | Q for safe |
| | | collision | (< 10^{-6}) |
+----------------------------------------------------------------+
| mod 10^9+7 | ~10^9 | ~31,623 | ~44 |
| mod 10^9+9 | ~10^9 | ~31,623 | ~44 |
| mod 2^32 | ~4*10^9 | ~65,536 | ~92 |
| mod 2^61-1 | ~2.3*10^18 | ~1.5*10^9 | ~2,147 |
| Double hash | ~5.3*10^36 | ~2.3*10^18 | ~3*10^15 |
| (2^61-1)^2 | | | |
+----------------------------------------------------------------+
"Q for safe" = max Q such that collision prob < 10^{-6}
Formula: Q < sqrt(2 * N * 10^{-6})Practical rules for Codeforces:
- Never use
mod = 10^9 + 7alone withhash comparisons. mod = 2^{61} - 1is the safe default—good up to ~comparisons. - For interactive or hacking-enabled problems, use double hash.
- Always choose the base randomly at runtime.
Cross-reference: see String Hashing for the full polynomial hashing tutorial, and Hash Maps for
unordered_mapanti-hack patterns.
The Birthday Paradox and Collision Analysis
Among
A collision becomes likely (
Implications for competitive programming:
| Hash space size | Collision likely after |
|---|---|
Rule of thumb: if you are making
Anti-Hack Strategies for Codeforces
Strategy 1: Shuffle the Input
If your algorithm depends on element order—BST insertion, quicksort, convex hull trick ordering—shuffle first:
cpp
shuffle(a.begin(), a.end(), rng);Strategy 2: Randomize Hash Parameters
Never hardcode hash bases or moduli—choose them at runtime:
cpp
const long long MOD = (1LL << 61) - 1;
const long long BASE = rand_ll(256, MOD - 1);Strategy 3: Randomize Map/Set Hash
unordered_map on Codeforces is vulnerable to collision attacks via the known default hash. Use a custom hash with a random component:
cpp
struct SafeHash {
static uint64_t splitmix64(uint64_t x) {
x += 0x9e3779b97f4a7c15;
x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
return x ^ (x >> 31);
}
size_t operator()(uint64_t x) const {
static const uint64_t FIXED_RANDOM =
chrono::steady_clock::now().time_since_epoch().count();
return splitmix64(x + FIXED_RANDOM);
}
};
unordered_map<long long, int, SafeHash> safe_map;Strategy 4: Random Perturbation for Geometry
Floating-point geometry problems can fail on degenerate cases—collinear points, overlapping coordinates. A tiny random perturbation resolves them:
cpp
// Perturb points slightly to avoid degeneracies
for (auto& [x, y] : points) {
x += uniform_real_distribution<double>(-1e-9, 1e-9)(rng);
y += uniform_real_distribution<double>(-1e-9, 1e-9)(rng);
}Strategy 5: Multiple Random Runs
When a single run might fail, repeat and take the best result:
cpp
int best = INT_MAX;
for (int attempt = 0; attempt < 30; attempt++) {
shuffle(a.begin(), a.end(), rng);
int result = solve(a);
best = min(best, result);
}
cout << best << "\n";Common Patterns
Random Hashing for String Comparison
Problem shape: compare
Build a rolling hash with a random base. Each comparison is RollingHash struct from the Randomized Hashing section.
Random Sampling for Witness Finding
Problem shape: find an element satisfying property
Sample uniformly at random and verify. Expected attempts:
- Finding a vertex in the minimum cut (sample
vertices) - Finding a color that appears frequently
- Finding a random edge in a dense subgraph
Birthday Paradox for Collision Detection
Problem shape: find two elements that share a property (e.g., same hash, same remainder).
Generate
- Pollard's rho factorization
- Finding cycles in iterated functions
- Collision search in meet-in-the-middle
Randomized Reduction to Simpler Problem
Problem shape: a complex problem that becomes tractable for random inputs.
Randomly transform the input to destroy adversarial structure, then solve the expected easy case. Examples:
- Random coordinate rotation to avoid axis-aligned edge cases in geometry
- Random edge contraction for minimum cut (Karger's algorithm)
- Random XOR basis for linear algebra over GF(2)
Complexity Analysis
Each randomized technique trades a deterministic guarantee for a probabilistic one. The table below shows both the time cost and the error bound you accept.
| Technique | Time Complexity | Error Probability |
|---|---|---|
| Random quicksort | N/A (always correct) | |
| Random quickselect | N/A (always correct) | |
| Polynomial hash check | ||
| Miller-Rabin (12 bases) | ||
| Freivalds' ( | ||
| Random sampling ( | Tunable |
Las Vegas vs. Monte Carlo:
- Las Vegas: always correct, expected runtime bound—randomized quicksort, quickselect, random sampling with verification.
- Monte Carlo: always fast, bounded error probability—Miller-Rabin (with random witnesses), Freivalds' algorithm, randomized hashing.
Monte Carlo solutions are accepted in competitive programming when the error probability is below
Edge Cases and Pitfalls
Pitfall 1: Using rand() Instead of mt19937
rand() has too many problems to count: 15-bit range on many platforms, poor distribution, global state, and slow speed. Always use mt19937.
Pitfall 2: Fixed Seeds
cpp
// BAD: Hacker can replicate your random sequence
mt19937 rng(42);
// GOOD: Unpredictable seed
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());Pitfall 3: Insufficient Hash Space
Using mod = 10^9 + 7 with
Pitfall 4: Not Enough Iterations
For Monte Carlo algorithms, always compute the required repetitions. If each attempt succeeds with probability
| Success prob | Attempts for |
|---|---|
Pitfall 5: Modular Arithmetic Overflow
When MOD is near a * b overflows 64-bit. Use __int128:
cpp
long long safe_mul(long long a, long long b, long long mod) {
return (__int128)a * b % mod;
}Pitfall 6: Biased Random Ranges
cpp
// BAD: biased if RAND_MAX+1 is not divisible by n
int x = rand() % n;
// GOOD: uniform distribution
int x = uniform_int_distribution<int>(0, n - 1)(rng);Contest Cheat Sheet
text
+----------------------------------------------------------------------+
| RANDOMIZATION CHEAT SHEET |
+----------------------------------------------------------------------+
| SETUP (paste at top of every solution): |
| |
| mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
| |
| int rand_int(int a, int b) { |
| return uniform_int_distribution<int>(a, b)(rng); |
| } |
+----------------------------------------------------------------------+
| ANTI-HACK PATTERNS: |
| |
| Adversarial order: shuffle(a.begin(), a.end(), rng); |
| Hash collision: base = rand_int(256, MOD - 1); |
| Map blowup: unordered_map<K, V, SafeHash> mp; |
+----------------------------------------------------------------------+
| COMMON TECHNIQUES: |
| |
| Random sampling: pick random element, check property |
| -> if >= 1/k of elements satisfy, expected k attempts |
| -> 30 attempts gives < 10^{-6} failure for k = 2 |
| |
| Birthday paradox: O(sqrt(N)) samples -> likely collision in N |
| -> mod 10^9+7: collisions after ~sqrt(10^9) ~= 31623 |
| -> mod 2^61-1: safe up to ~sqrt(2^61) ~= 1.5x10^9 |
| |
| Monte Carlo: repeat k times for error <= 2^{-k} |
| -> k = 20 gives error < 10^{-6} (when each round p >= 1/2) |
+----------------------------------------------------------------------+
| KEY COMPLEXITIES: |
| |
| Randomized quicksort: O(n log n) expected, always correct |
| Randomized quickselect: O(n) expected, always correct |
| nth_element (STL): O(n) expected |
| Miller-Rabin (12 bases): O(log^2 n), exact for n < 3.3x10^23 |
| Treap (split/merge): O(log n) expected per operation |
| Poly hash check: O(n), error <= L/p per comparison |
| Freivalds' (k rounds): O(k*n^2), error <= 2^{-k} |
+----------------------------------------------------------------------+
| SAFE HASH FOR unordered_map (splitmix64): |
| |
| struct SafeHash { |
| static uint64_t splitmix64(uint64_t x) { |
| x += 0x9e3779b97f4a7c15; |
| x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9; |
| x = (x ^ (x >> 27)) * 0x94d049bb133111eb; |
| return x ^ (x >> 31); |
| } |
| size_t operator()(uint64_t x) const { |
| static const uint64_t FIXED_RANDOM = |
| chrono::steady_clock::now().time_since_epoch().count(); |
| return splitmix64(x + FIXED_RANDOM); |
| } |
| }; |
| // Usage: unordered_map<long long, int, SafeHash> mp; |
+----------------------------------------------------------------------+
| LAS VEGAS vs MONTE CARLO: |
| |
| Las Vegas: always correct, random runtime |
| (quicksort, quickselect, treap) |
| Monte Carlo: always fast, bounded error probability |
| (Miller-Rabin, Freivalds', hash comparison) |
+----------------------------------------------------------------------+Self-Test
Verify you can do all of these without looking at the notes:
- [ ] Write the three-line
mt19937seeding pattern plus arand_int(a, b)helper from memory. Explain whychronoseeding matters on Codeforces. - [ ] Given
hash comparisons using mod , estimate the collision probability via the birthday paradox. State why switching to mod or double hashing fixes it. - [ ] Distinguish Las Vegas from Monte Carlo: give one example of each from this chapter and state precisely what guarantee each provides (correctness vs runtime).
- [ ] State the birthday paradox threshold: how many random samples from a space of size
before a collision is expected? Write the formula. - [ ] Write
SafeHash(the splitmix64 variant) forunordered_mapfrom memory. Explain what attack it defends against and why the defaultstd::hash<int>is vulnerable. - [ ] A problem says "
of elements satisfy property ." How many random samples are needed to find one with probability ? Show the calculation. - [ ] Explain why
rng() % nis biased and what to use instead. - [ ] State the expected number of comparisons in randomized quicksort and why the worst case
is astronomically unlikely.
Practice Problems
| # | Problem | Key Technique | Difficulty |
|---|---|---|---|
| 1 | CF 364D - Ghd | Random sampling + GCD | CF 2000 |
| 2 | CF 869E - The Untended Antiquity | 2D hashing with random values | CF 2100 |
| 3 | CF 1746F - Kazaee | Randomized parity check | CF 2100 |
| 4 | CF 585F - Digits of Number Pi | Aho-Corasick + random hashing | CF 2100 |
| 5 | CF 1178F2 - Long Colorful Strip | Random shuffle + brute force | CF 1900 |
| 6 | CF 1523D - Love-Hate | Random sampling + SOS DP | CF 2000 |
| 7 | CF 1477C - Nezzar and Nice Beatmap | Random permutation construction | CF 1400 |
| 8 | CF 727F - Polycarp's Problems | Randomized binary search | CF 1600 |
| 9 | CF 1354D - Multiset | Binary search + anti-hack shuffle | CF 1400 |
| 10 | CF 1514D - Cut and Stick | Random sampling for majority element | CF 1600 |
| 11 | Playlist | Sliding window with hash set for distinct elements | CSES 1400 |
| 12 | String Matching | Can solve with randomized polynomial hashing | CSES 1500 |
Suggested Progression
- Start: #7, #9 (direct application of shuffle / random selection)
- Build: #8, #10 (random sampling with verification)
- Strengthen: #1, #6 (random sampling + complex verification)
- Master: #2, #3 (randomized hashing in advanced settings)
Probabilistic Method in Competitive Programming
The probabilistic method proves that an object with a desired property exists by showing that a random construction achieves it with positive probability. In CP this translates directly: randomly generate a solution and it is likely to be valid.
Random Coloring Example
Problem: given a graph with
Color each vertex independently, each color with probability
cpp
// Random 2-coloring that cuts >= m/2 edges in expectation
// Repeat a few times, take the best
int best_cut = 0;
vector<int> best_color;
for (int attempt = 0; attempt < 30; attempt++) {
vector<int> color(n);
for (int i = 0; i < n; i++) color[i] = rng() & 1;
int cut = 0;
for (auto& [u, v] : edges)
cut += (color[u] != color[v]);
if (cut > best_cut) {
best_cut = cut;
best_color = color;
}
}
// best_cut is very likely >= m/2text
Random 2-coloring on a 6-vertex graph:
(W)----(B) W = white, B = black
/ \ / \ Edges between different colors = CUT
(B) (W)(B) (W) Edges between same colors = UNCUT
\ | | /
(B)--(W) Cut edges: 7 out of 10 (>= m/2 = 5) OKIf I Had 5 Minutes
Randomization doesn't make your algorithm lucky—it makes the adversary blind. By injecting randomness into your decisions, you transform worst-case inputs into average-case inputs: no fixed input can be consistently bad against unpredictable choices. Think of it as shuffling a deck before dealing—once the cards are randomized, any stacked arrangement is neutralized.
The mnemonic to burn in: SHUFFLE—SEED—SPACE—the three S's of competitive-programming randomization:
- Shuffle the input to destroy adversarial ordering
- Seed with chrono to prevent replication across submissions
- Space must be large (
) to survive the birthday paradox
Last-minute review before a contest:
- Seed:
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count()); - Shuffle:
shuffle(a.begin(), a.end(), rng);—kills adversarial order - Safe hash: random base + mod
—kills collision attacks - Safe map:
SafeHashwith splitmix64 +FIXED_RANDOM—killsunordered_maphacks - Repeat: for Monte Carlo, 20–30 iterations gives error
Pattern Fingerprint
| Constraint Pattern | Technique | Why |
|---|---|---|
| "Find any valid answer" + large | Random sampling + verify | Expected |
| String comparison, | Random-base polynomial hash | Prevents adversarial collision construction |
unordered_map with | SafeHash (splitmix64 + random salt) | Default hash is publicly known and hackable |
| Sorted/structured input + TLE risk | shuffle() before processing | Destroys adversarial ordering |
Randomized quickselect / nth_element | Expected | |
| Dynamic ordered set with | Treap with split/merge | |
| Primality test, | Miller-Rabin with 12 fixed witnesses | |
| Matrix product verification | Freivalds' algorithm | |
| "Majority element in subarray" | Random sampling + count | |
| Interactive / hacking-enabled problem | Double hash + chrono seed | Maximum protection against adversary |
Rating Progression
CF 1200 (Beginner)
text
+----------------------------------------------------------+
| What you'll see: |
| - Problems solvable with sort + greedy |
| - Randomization not required, but shuffle() can save |
| you from adversarial TLE on quicksort-based ideas |
| Key skill: know how to seed mt19937 and call shuffle() |
+----------------------------------------------------------+CF 1500 (Intermediate)
text
+----------------------------------------------------------+
| What you'll see: |
| - Hashing problems where random base prevents hacks |
| - "Find any valid element" via random sampling |
| - unordered_map TLE from collision attacks |
| Key skill: SafeHash, random-base polynomial hashing, |
| random sampling for majority/witness finding |
+----------------------------------------------------------+CF 1800 (Advanced)
text
+----------------------------------------------------------+
| What you'll see: |
| - Treap as ordered set with k-th element queries |
| - Miller-Rabin for large primality checks |
| - Randomized algorithms as the intended solution |
| Key skill: treap split/merge, quickselect, Freivalds', |
| probability analysis for error bounds |
+----------------------------------------------------------+CF 2100 (Expert)
text
+----------------------------------------------------------+
| What you'll see: |
| - Probabilistic method arguments (random coloring) |
| - Pollard rho factorization |
| - Randomized complexity reductions |
| - Problems where randomization IS the editorial |
| Key skill: designing custom randomized algorithms, |
| analyzing expected value and error probability |
+----------------------------------------------------------+Before You Code Checklist
Before writing a randomized solution, verify:
- [ ] Seed is unpredictable: using
chrono::steady_clock, not a fixed constant- [ ] Distribution is uniform: using
uniform_int_distribution, notrng() % n- [ ] Enough iterations: for Monte Carlo, calculate
needed for error - [ ] Hash space is large enough:
where = number of comparisons - [ ] Solution is testable: you can stress-test against a brute-force solution
What Would You Do If...?
Scenario 1: your unordered_map solution passes on your machine but gets TLE on Codeforces with
The default hash is being attacked. The attacker constructed keys that all land in the same bucket, turning SafeHash (Strategy 3 in the Anti-Hack Strategies section).
Scenario 2: your rolling hash solution gets WA on test 42, but passes all other tests including stress tests with
Likely a hash collision. The birthday paradox says
Scenario 3: you need the median of
Use randomized quickselect (nth_element from the STL. Don't sort the full array when you only need one order statistic.
Historical Note
The systematic study of randomized algorithms began with Michael Rabin's 1976 paper introducing the Miller-Rabin primality test and randomized closest-pair algorithms. László Babai coined the term "Las Vegas algorithm" in 1979. Cecilia Aragon and Raimund Seidel invented treaps in 1989, elegantly fusing two classical data structures with a dash of randomness.
The Mistake That Teaches
The Debugging Story: "Why Does My Hash Keep Colliding?"
Alice writes a string matching solution for a Codeforces round. She uses polynomial hashing with base = 31 and mod = 1e9 + 7. Her solution passes pretests. During hacking, Bob submits a test with 200,000 short strings.
Alice's solution gets Wrong Answer.
She's confused—her hash logic is correct. She tests locally with random strings: no issues. She tests with Bob's input: collisions everywhere.
What happened? Bob ran a script that generates strings with identical hashes for base 31 mod
The fix (three characters changed):
cpp
// BEFORE (hackable):
const long long BASE = 31;
// AFTER (safe):
const long long BASE = rand_ll(256, MOD - 1);The lesson: a hardcoded base is a publicly visible lock combination. A runtime-random base is a lock that changes with every submission. The adversary arrives with the old key and finds it doesn't fit.
text
Alice's timeline:
t=0 Submit with BASE=31 --> AC on pretests
t=30 Bob downloads her code, sees BASE=31
t=31 Bob generates collision strings for BASE=31
t=32 Bob hacks Alice --> WA
Alice v2:
t=0 Submit with BASE=rand_ll() --> AC on pretests
t=30 Bob downloads her code, sees BASE=rand_ll()
t=31 Bob: "I can't predict the base..." --> hack FAILSWhen NOT to Use Randomization
Randomization is powerful, but not always the right call:
When deterministic
exists and suffices: don't add complexity. std::sortis already introsort—quicksort with a heapsort fallback—giving guaranteed. When the problem requires exact answers with a proof: some problems, especially interactive ones, need deterministic guarantees. A Monte Carlo approach with
error probability may fail on the one test case it gets wrong. When
is small enough for brute force: randomization adds code complexity. If , just enumerate. When the random approach has high constant factors:
mt19937is not free. In ultra-tight time limit problems (, TL = 1s), the overhead of random number generation in every operation can tip the balance. When you cannot verify the answer: Las Vegas algorithms need a cheap verification step. If verification is as expensive as the computation itself, random restarts buy nothing.
Debug This
Exercise 1: this shuffle is biased. Why?
cpp
// BUGGY: biased shuffle
void bad_shuffle(vector<int>& a) {
int n = a.size();
for (int i = 0; i < n; i++) {
int j = uniform_int_distribution<int>(0, n - 1)(rng);
swap(a[i], a[j]);
}
}Answer
The range of j should be [i, n-1] (or [0, i]), not [0, n-1]. With [0, n-1], there are j from [0, i] (backward) or [i, n-1] (forward).
Exercise 2: this hash produces collisions even with random base. Why?
cpp
// BUGGY: broken hash
long long hash_string(const string& s) {
long long h = 0;
long long base = rand_ll(2, 1000000006);
long long mod = 1000000007;
for (char c : s) {
h = h * base + c;
h %= mod;
}
return h;
}Answer
The base is re-randomized on every call! Two calls to hash_string on the same string will likely produce different hashes because each call picks a new random base. The base must be generated ONCE (globally or in a struct constructor) and reused for all hash computations in the same run.
Exercise 3: this reservoir sampling has off-by-one. Where?
cpp
// BUGGY: off-by-one in reservoir sampling
vector<int> sample(vector<int>& stream, int k) {
vector<int> res(stream.begin(), stream.begin() + k);
for (int i = k; i < (int)stream.size(); i++) {
int j = uniform_int_distribution<int>(0, i - 1)(rng);
if (j < k) res[j] = stream[i];
}
return res;
}Answer
The distribution should be uniform_int_distribution<int>(0, i), not (0, i - 1). With (0, i-1), the
Exercise 4: this treap merge has a silent bug that corrupts the tree. What is it?
cpp
// BUGGY: treap merge
Node* merge(Node* L, Node* R) {
if (!L) return R;
if (!R) return L;
if (L->priority > R->priority) {
L->right = merge(L->right, R);
return L;
} else {
R->left = merge(L, R->left);
return R;
}
}Answer
The update() call is missing! After modifying L->right or R->left, the size field is stale. Any subsequent call to sz() or kth() will return wrong results. The fix: add update(L); return L; and update(R); return R; respectively.
Cross-References
String Hashing: full treatment of polynomial hashing, rolling hash, and substring comparison. The random-base technique from the Randomized Hashing section is applied there for anti-hack safety.
Hash Maps and Unordered Maps: explains why
unordered_mapis vulnerable and howSafeHash(Anti-Hack Strategies) prevents collision attacks. Essential reading for Codeforces contestants.Sorting and Searching: covers deterministic sorting algorithms. Randomized quicksort and quickselect above are randomized extensions of the partition-based techniques discussed there.
Extended Practice Problems
| # | Problem | Key Technique | Difficulty |
|---|---|---|---|
| 13 | CF 1523D - Love-Hate | Random sampling + SOS DP | CF 2000 |
| 14 | CF 896C - Willem, Chtholly and Seniorious | Chtholly tree (randomized interval merging) | CF 2100 |
| 15 | CF 1763C - Another Array Problem | Random observation for small cases | CF 1400 |
| 16 | CF 1154G - Minimum Possible LCM | Random sampling + GCD tricks | CF 1900 |
| 17 | CF 1081G - Mergesort Strikes Back | Expected value of random permutations | CF 2400 |
| 18 | CF 1705E - Mark and Professor | Randomized hashing on bitsets | CF 2100 |
| 19 | CF 1554D - Diane | Hash-based string construction | CF 1800 |
| 20 | CF 1142B - Lynyrd Skynyrd | Random sampling for cycle detection | CF 1800 |
Extended Progression
- Warm-up (CF 1200–1400): #7, #9, #15—shuffle, basic random selection
- Foundation (CF 1400–1600): #8, #10, #19—sampling, hashing patterns
- Intermediate (CF 1600–1900): #1, #5, #16, #20—sampling + complex logic
- Advanced (CF 1900–2100): #2, #3, #6, #14, #18—advanced randomized DS
- Expert (CF 2100+): #4, #13, #17—deep probability + randomized analysis
Further Reading
- cp-algorithms: Randomized Algorithms—Miller-Rabin, Pollard's rho, and randomized primality testing with full C++ implementations.
- CF Blog: Blowing up unordered_map—Neal Wu's classic post explaining why the default hash is hackable and presenting the splitmix64
SafeHashsolution that everyone now uses. - CF Blog: Anti-hash tests—how to construct strings that collide under a known base-mod pair, and why random bases defeat these attacks.
- CF Blog: On the Probability of Hash Collision—birthday paradox analysis applied to competitive programming hash sizes, with concrete probability tables.
- CP-Algorithms: Treap (Cartesian Tree)—full treatment of implicit and explicit treaps with split/merge operations and applications to range queries.
- Wikipedia: Randomized algorithm—good overview of Las Vegas vs. Monte Carlo classification with complexity theory context.
- Original papers: Rabin (1976)—randomized primality and closest pair; Aragon & Seidel (1989)—treaps; Freivalds (1979)—randomized matrix verification; Schwartz-Zippel (1980)—polynomial identity testing.
- Related chapters in this notebook: String Hashing, Math Fundamentals, Number Theory
Next Up
Continue to: Binary Lifting and LCA for techniques where preprocessing meets tree structure, or return to Sorting and Searching to solidify the deterministic foundations that randomization builds upon.
Appendix: Quick Reference
cpp
// --- Paste this block at the top of your solution ---
#include <bits/stdc++.h>
using namespace std;
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
mt19937_64 rng64(chrono::steady_clock::now().time_since_epoch().count());
int rand_int(int a, int b) {
return uniform_int_distribution<int>(a, b)(rng);
}
long long rand_ll(long long a, long long b) {
return uniform_int_distribution<long long>(a, b)(rng64);
}
struct SafeHash {
static uint64_t splitmix64(uint64_t x) {
x += 0x9e3779b97f4a7c15;
x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9;
x = (x ^ (x >> 27)) * 0x94d049bb133111eb;
return x ^ (x >> 31);
}
size_t operator()(uint64_t x) const {
static const uint64_t FIXED_RANDOM =
chrono::steady_clock::now().time_since_epoch().count();
return splitmix64(x + FIXED_RANDOM);
}
};
// --- End of randomization toolkit ---