Skip to content

Complexity Analysis

Quick Revisit

  • USE WHEN: You see n <= ... in constraints and need to pick algorithm family
  • INVARIANT: ~10^8 simple ops per second; n*complexity must fit in time limit
  • TIME: O(1) mental check | SPACE: O(1)
  • CLASSIC BUG: Forgetting hidden constant factors or nested loops that multiply (e.g., O(n) loop inside O(n) loop = O(n^2))
  • PRACTICE: ../12-practice-ladders/01-ladder-1100-to-1400.md

Read the constraints, know the algorithm. This is the single most important meta-skill in competitive programming—translating n into the family of algorithms that will pass.

Difficulty: Beginner | Prerequisites: Fast I/O and Pragmas

Big-O notation was introduced by German mathematician Paul Bachmann in 1894 in Die Analytische Zahlentheorie. Edmund Landau popularized it—which is why the family of asymptotic symbols (O, Ω, Θ, o, ω) is called Landau notation. Donald Knuth later argued for the precise distinction between O (upper bound) and Θ (tight bound) that competitive programmers use today.

The practical upshot: "Constraints are the cheat code." Every problem ships with its own hint—the input bounds reveal the intended complexity class before you understand the problem. Read n first, think second. Think of it like choosing transport by distance: walking covers 1 km (n5000), a car covers 100 km (n106), a plane covers 10,000 km (n108). Pick the wrong vehicle and you never arrive on time.


Contents


Intuition

Reading the Constraints Is Half the Solution

Before you write a single line of code, look at the input bounds. They tell you exactly how fast your solution must be. A 2-second time limit on Codeforces means roughly 108 simple operations. If n=2×105, anything O(n2) does 4×1010 operations—dead on arrival. You need O(nlogn) or better.

The constraint nX is not just a limit—it is the problem setter whispering the intended time complexity. Once you internalize the constraint-to-complexity mapping, you can eliminate 90% of wrong approaches in 10 seconds, before writing a single line of code.

Counting Loop Iterations

Single loop—nothing surprising:

cpp
for (int i = 0; i < n; i++) { ... }  // invariant: processed i elements so far -> O(n)

Independent nested loops—multiply:

cpp
for (int i = 0; i < n; i++)           // invariant: completed i full sweeps of j
    for (int j = 0; j < m; j++) { ... }  // invariant: processed i*m + j pairs -> O(n * m)

Dependent inner loop—this one trips people up. Consider:

cpp
for (int i = 0; i < n; i++)           // invariant: completed i iterations
    for (int j = i; j < n; j++) { ... }  // invariant: j starts at i, runs n-i times

When i=0, the inner loop runs n times. When i=1, it runs n1 times. Total iterations:

n+(n1)+(n2)++1=n(n1)2=O(n2)

The constant 1/2 vanishes in big-O. Still quadratic. Still TLEs at n=105.

The Harmonic Sum: Why for(i=1..n, j+=i) Is O(nlogn)

This pattern is everywhere—sieve of Eratosthenes, divisor enumeration, computing Euler's totient:

cpp
for (int i = 1; i <= n; i++)               // invariant: processed multiples of 1..i-1
    for (int j = i; j <= n; j += i) { ... } // invariant: visits floor(n/i) multiples of i

For a fixed i, the inner loop runs n/i times. Total work:

i=1nnini=1n1i=nHnnlnn

Hn is the harmonic number. It grows like lnn (plus the Euler-Mascheroni constant γ0.577, but constants don't matter for big-O). So this double loop is O(nlogn), not O(n2).

This is why the Sieve of Eratosthenes runs in O(nloglogn)—the sum only runs over primes, and p prime1/ploglogn.

Harmonic Sum Visual Proof -- Why Hn=O(lnn)

The harmonic series Hn=1+12+13++1n grows logarithmically. Here is the integral approximation argument:

The function f(x)=1/x is decreasing. For each integer k, the rectangle of height 1/k and width 1 sits above the curve on [k,k+1]. So:

Hn=k=1n1k1+1n1xdx=1+lnn

Similarly, the rectangles sit below the curve shifted by one:

Hn1n+11xdx=ln(n+1)

Therefore ln(n+1)Hn1+lnn, which gives Hn=Θ(lnn).

text
  1/x
  ^
  |
1 +##
  |# ##
  |#   ###
  |#   |  ####
  |#   |  |   ######
  |#   |  |   |     ###########
  +----+--+---+-----+-----------+--->  x
  1    2  3   4     5    ...    n

  Bars -- rectangles of height 1/k
  Curve under bars -- integral of 1/x -- ln(n)
  Sum of bar areas -- H_n ~ ln(n)

How this applies to the sieve of Eratosthenes:

The sieve marks composites by iterating over multiples of each prime pn:

cpp
for (int p = 2; p <= n; p++) {           // invariant: all composites with smallest factor < p are marked
    if (!is_prime[p]) continue;
    for (int j = p*p; j <= n; j += p)    // invariant: marks multiples of p starting at p^2
        is_prime[j] = false;
}

Work for prime p: approximately n/p operations. Total over all primes:

p primennp=np prime1pnln(lnn)

This is even better than O(nlogn) because the sum over reciprocals of primes grows much slower than the full harmonic sum. See Math Toolkit for more on prime sieves.

Recursion Trees Tell You Everything

Forget formulas for a moment. Draw the tree.

Naive Fibonaccifib(n) = fib(n-1) + fib(n-2):

Each call spawns two children. The tree has depth n. The number of nodes roughly doubles each level. Total nodes 2n. So T(n)=O(2n).

With memoization, each value fib(k) is computed exactly once. The second call for any k returns instantly. Total unique calls: n. So T(n)=O(n).

Binary search—each call eliminates half the array:

One recursive call, problem size halved. The tree is a single chain of depth log2n. Each level does O(1) work. Total: O(logn).

Merge sort—two calls on halves, then O(n) merge:

Two children per node, each on n/2 elements. At every level the total work across all nodes is O(n). The tree has log2n levels. Total: O(nlogn).

The Master Theorem (Just Three Cases)

For recurrences of the form T(n)=aT(n/b)+O(nd):

CaseConditionResultWho dominates?
1d>logbaO(nd)Root (top-level work)
2d=logbaO(ndlogn)Every level tied
3d<logbaO(nlogba)Leaves

Examples:

  • Binary search: T(n)=1T(n/2)+O(1). a=1,b=2,d=0. log21=0=d Case 2 O(n0logn)=O(logn). ✓

  • Merge sort: T(n)=2T(n/2)+O(n). a=2,b=2,d=1. log22=1=d Case 2 O(nlogn). ✓

  • Karatsuba multiplication: T(n)=3T(n/2)+O(n). a=3,b=2,d=1. log231.585>d Case 3 O(n1.585). Faster than the schoolbook O(n2).

  • Naive matrix multiply (Strassen-like split with 8 subproblems): T(n)=8T(n/2)+O(n2). log28=3>2=d Case 3 O(n3). No improvement (Strassen reduces a from 8 to 7).

text
  Work per level in the recursion tree

  Case 1 -- d > log_b(a)        Case 3 -- d < log_b(a)
  Root dominates                 Leaves dominate

  Level 0  |############|       Level 0  |##          |
  Level 1  |########    |       Level 1  |####        |
  Level 2  |#####       |       Level 2  |#######     |
  Level 3  |###         |       Level 3  |############|
           +------------+                +------------+
           Result -- n^d                 Result -- n^log_b(a)

  Case 2 -- d = log_b(a)
  All levels tied

  Level 0  |############|
  Level 1  |############|
  Level 2  |############|
  Level 3  |############|
           +------------+
           Result -- n^d * log n

Amortized: When the Worst Case Lies

Sometimes a single operation is expensive, but it can't happen often enough to matter.

Two-pointer / sliding window:

cpp
int j = 0;
for (int i = 0; i < n; i++) {       // invariant: window starts at i
    while (j < n && ok(i, j)) j++;   // invariant: j only increases across all iterations
    // process window [i, j)
}

The inner while looks scary—isn't this O(n2)? No. The variable j only moves forward. Across all iterations of the outer loop, j increments at most n times total. So the entire thing is O(n).

Stack-based problems (next greater element, etc.):

cpp
stack<int> st;
for (int i = 0; i < n; i++) {                      // invariant: elements 0..i-1 processed
    while (!st.empty() && a[st.top()] <= a[i])
        st.pop();                                    // invariant: each element popped at most once total
    st.push(i);                                      // invariant: each element pushed exactly once
}

Each element is pushed once and popped at most once. Total push + pop operations 2n. So O(n) overall, even though the inner while can pop many elements in a single iteration.

Dynamic array doubling (std::vector):

Most push_back calls are O(1). Occasionally the vector doubles, copying all n elements—O(n). But doublings happen at sizes 1,2,4,8,,n. Total copy cost: 1+2+4++n=2n1=O(n). Spread over n pushes, each push is O(1) amortized.

text
  Array doubling -- amortized O(1)

  push   1   2   3   4   5   6   7   8   9
  cap    1   2   4   4   8   8   8   8  16
         *   *   *       *               *    <-- realloc here
  cost   1   1   2       4               8    <-- copy cost

  Total copies -- 1+1+2+4+8 -- 16 -- 2n-1
  Spread over n pushes --> O(1) each

The Constraint Table You Should Memorize

This is the single most useful thing in this file. Burn it into memory:

ConstraintTarget ComplexityAlgorithm Families
n10O(n!)brute force, permutations, backtracking
n20O(2n) or O(2nn)bitmask DP, subset enumeration
n80-100O(n4)4D DP (rare)
n500O(n3)Floyd-Warshall, matrix chain DP
n5000O(n2)2D DP, quadratic sorting, brute pairs
n105O(nn)sqrt decomposition, Mo's algorithm
n106O(nlogn)sorting, segment trees, BIT, merge sort
n108O(n)prefix sums, linear scan, two-pointer

These are approximate. O(nlogn) at n=5×105 is ~107 ops—plenty of room. O(n2) at n=104 is 108—tight but usually fine. The constraint is the fingerprint of the intended algorithm: n5000 means the setter expects O(n2); n105 points to O(nlogn) or O(nn); n106 demands O(nlogn) or better.

text
  Operations at n = 100000          10^8 budget
                                         |
  O(n)        *                          |          10^5       SAFE
  O(n log n)  ***                        |          2*10^6     SAFE
  O(n sqrt n) **************             |          3*10^7     SAFE
  O(n^2)      ###############################-->    10^10      TLE
              +--------------------------+
              0                         10^8

Worked Example: Picking Your Approach

Problem: Given an array of n2×105 integers, find the longest subarray with sum k.

Step 1: n=2×105. The table says O(nlogn) or better.

Step 2: O(n2) brute force (check all subarrays) does 4×1010 ops. Way too slow.

Step 3: Think of O(n) or O(nlogn) techniques for subarrays: sliding window, prefix sums + binary search, two pointers.

Step 4: If all elements are non-negative, two-pointer gives O(n). If elements can be negative, prefix sums + binary search (or a multiset) gives O(nlogn).

You picked the algorithm before writing any code, just from the constraint.

What the Code Won't Teach You

Use the constraint table as the first step, not the last. The correct workflow is: read problem, read constraints, decide target complexity, think of algorithms in that class, pick one. Most people do it backwards: think of an algorithm, code it, then check if it is fast enough. That is a recipe for wasted time.

text
  WRONG workflow                CORRECT workflow
  +--------------+              +--------------+
  | Read problem |              | Read problem |
  +------v-------+              +------v-------+
         |                             |
  +------v-------+              +------v-------+
  | Think of an  |              | Read n       |
  | algorithm    |              | Check table  |
  +------v-------+              +------v-------+
         |                             |
  +------v-------+              +------v-------+
  | Code it      |              | Target cplx  |
  +------v-------+              +------v-------+
         |                             |
  +------v-------+              +------v-------+
  | Check if     |              | Pick algo in |
  | fast enough  |              | that class   |
  +------v-------+              +------v-------+
         |                             |
    TLE -- start over            Code + AC

O(nn) is a real target, not a fallback. When n105 and you need something between O(nlogn) and O(n2), O(nn) via Mo's algorithm or sqrt decomposition is often the intended approach. Recognizing this as a distinct target saves you from spending 30 minutes trying to find an O(nlogn) algorithm that does not exist.

text
         Complexity Spectrum at n = 10^5

  n          n log n      n sqrt n       n^2
  10^5       2*10^6       3*10^7         10^10
   |            |              |             |
   v            v              v             v
  -+-----------+--------------+-------------+---->
   |            |              |             |
   |   FAST    |  STANDARD   |   VIABLE   |  TLE
   |            |              |             |
   +<-- safe -->+<--- safe --->+<-- safe -->+##TLE##
                                    ^
                                    |
                        Do not skip this range

"Will it TLE?" requires estimating the constant, too. The same O(nlogn) can be 10x apart in practice: what matters is C×X(n), where C is the constant factor hiding inside the big-O. Data structure choice drives C more than algorithm choice does.

text
  Same O(n log n) -- different constants

  BIT         -- 10ns per op
  10^6 * 20 * 10ns  -- 200ms       SAFE

  Seg tree    -- 100ns per op
  10^6 * 20 * 100ns -- 2000ms      TIGHT

The analysis above was all reasoning about loops and recurrences. Sometimes a picture makes the argument click faster—here are the key complexity classes drawn out.


Visual Intuition

Recursion Tree: Naive Fibonacci (Exponential Blowup)

text
                        fib(5)
                       /      \
                   fib(4)      fib(3)
                  /     \       /    \
              fib(3)  fib(2) fib(2) fib(1)
              /   \    / \    / \
          fib(2) fib(1) . .  . .
           / \
          .   .

Level 0:  1 call                       = 1
Level 1:  2 calls                      = 2
Level 2:  4 calls                      = 4
Level 3:  up to 8 calls                = 8
...
Level n:  up to 2^n calls

Total ~ 2^n  -->  O(2^n)

Each unique fib(k) is recomputed many times. Memoization collapses this entire tree into a single chain of n calls.

Recursion Tree: Merge Sort (O(nlogn) Work)

text
Level 0:  [  1  5  3  7  2  6  4  8  ]          merge cost: n = 8
              /                  \
Level 1:  [1  5  3  7]      [2  6  4  8]        merge cost: 4 + 4 = 8
            /      \           /      \
Level 2:  [1  5]  [3  7]   [2  6]  [4  8]       merge cost: 2+2+2+2 = 8
           / \     / \       / \     / \
Level 3:  [1][5] [3][7]   [2][6] [4][8]         merge cost: 0 (base)

           log2(8) = 3 levels
           Each level does O(n) work
           Total: O(n log n)

Harmonic Sum: Which Multiples Get Visited

For n=12, each row shows which indices j the inner loop touches:

text
i=1:  *  *  *  *  *  *  *  *  *  *  *  *    (12 hits)
i=2:  *  .  *  .  *  .  *  .  *  .  *  .    ( 6 hits)
i=3:  *  .  .  *  .  .  *  .  .  *  .  .    ( 4 hits)
i=4:  *  .  .  .  *  .  .  .  *  .  .  .    ( 3 hits)
i=5:  *  .  .  .  .  *  .  .  .  .  *  .    ( 2 hits)
i=6:  *  .  .  .  .  .  *  .  .  .  .  .    ( 2 hits)
i=7:  *  .  .  .  .  .  .  .  .  .  .  .    ( 1 hit )
...
      j: 1  2  3  4  5  6  7  8  9  10 11 12

Total = 12 + 6 + 4 + 3 + 2 + 2 + 1 + 1 + 1 + 1 + 1 + 1
      = 35   (vs n^2 = 144)
      ~ n * ln(n) = 12 * 2.48 ~ 30   (close!)

C++ STL Complexity Reference

Know the cost of what you call. A single set::find inside a loop turns O(n) into O(nlogn).

OperationContainer / AlgorithmComplexity
sort(a, a+n)<algorithm>O(nlogn)
stable_sort(a, a+n)<algorithm>O(nlogn)
nth_element(a, a+k, a+n)<algorithm>O(n) average
lower_bound(a, a+n, x)<algorithm>O(logn)
unique(a, a+n)<algorithm>O(n)
next_permutation<algorithm>O(n) per call, O(n!n) total
__builtin_popcount(x)GCC built-inO(1)
vector::push_back<vector>O(1) amortized
vector::insert (middle)<vector>O(n)
set/map::find<set> / <map>O(logn)
set/map::insert<set> / <map>O(logn)
unordered_set/map::find<unordered_set>O(1) avg, O(n) worst
priority_queue::push/pop<queue>O(logn)
deque::push_front/back<deque>O(1)
string::substr(pos, len)<string>O(len)
string::find(pattern)<string>O(nm) worst
bitset<N>::count()<bitset>O(N/64)

Trap: unordered_map is O(1) average but can degrade to O(n) with adversarial input (hash collisions). On Codeforces, people will hack you. Use a custom hash or stick with map if you're unsure.


Implementation

Complexity Benchmarker

Paste this into a local file to build intuition for how fast each complexity class actually runs:

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    auto bench = [](string label, auto fn) {
        auto t0 = chrono::high_resolution_clock::now();
        fn();
        auto t1 = chrono::high_resolution_clock::now();
        double ms = chrono::duration<double, milli>(t1 - t0).count();
        cout << label << ": " << fixed << setprecision(1) << ms << " ms\n";
    };

    const int N = 1e6;
    volatile int sink = 0;

    bench("O(n)      n=1e6", [&]() {
        for (int i = 0; i < N; i++) sink += i;          // invariant: summed 0..i-1
    });

    bench("O(n logn) n=1e6", [&]() {
        vector<int> v(N);
        iota(v.begin(), v.end(), 0);
        sort(v.begin(), v.end(), greater<>());           // invariant: v sorted descending after
    });

    const int M = 5000;
    bench("O(n^2)    n=5000", [&]() {
        for (int i = 0; i < M; i++)                      // invariant: processed i outer iterations
            for (int j = i; j < M; j++) sink += j;       // invariant: summed i..j-1
    });

    const int K = 500;
    bench("O(n^3)    n=500", [&]() {
        for (int i = 0; i < K; i++)                      // invariant: processed i outer iterations
            for (int j = 0; j < K; j++)                  // invariant: processed i*K + j middle iterations
                for (int k = 0; k < K; k++) sink += k;   // invariant: summed 0..k-1
    });

    return 0;
}

Will-It-TLE Estimator

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;

    // Rule of thumb: ~10^8 simple ops per second on CF judges.
    // Time limit is usually 1-2 seconds => budget ~2*10^8 ops.

    auto estimate = [](long long ops) -> string {
        // 2 * 10^8 ops in 2 seconds is our budget
        const long long BUDGET = 2e8;
        if (ops <= BUDGET) return "SAFE";
        if (ops <= 5 * BUDGET) return "TIGHT";
        return "TLE";
    };

    long long n2 = (long long)n * n;
    long long nlogn = (long long)n * (int)(log2(n) + 1);
    long long nsqrtn = (long long)n * (int)sqrt(n);

    cout << "n = " << n << "\n";
    cout << "O(n):        " << n        << " ops -> " << estimate(n) << "\n";
    cout << "O(n log n):  " << nlogn    << " ops -> " << estimate(nlogn) << "\n";
    cout << "O(n sqrt n): " << nsqrtn   << " ops -> " << estimate(nsqrtn) << "\n";
    cout << "O(n^2):      " << n2       << " ops -> " << estimate(n2) << "\n";

    // Example: n=200000
    // O(n):        200000       -> SAFE
    // O(n log n):  3400000      -> SAFE
    // O(n sqrt n): 89442000     -> SAFE
    // O(n^2):      40000000000  -> TLE

    return 0;
}

Operations Reference

The Constraint → Complexity Table

This is the expanded reference. When you see a constraint during a contest, jump to the row and read across—the algorithm family column narrows your search instantly. See Recursion and Backtracking for details on the brute-force families and STL Algorithms for built-in O(nlogn) tools.

nMax ops ()Target O()Typical algorithms
8104O(n!n)brute permutations with checking, TSP brute force
103.6×106O(n!)brute permutations, backtracking, all orderings
15-16108O(2nn2)bitmask DP with transitions, Hamiltonian path DP
202×107O(2nn)bitmask DP, subset enumeration, SOS DP
25108O(2n)subset DP, inclusion-exclusion over subsets
40107O(2n/2n)meet-in-the-middle (split into two halves of 20)
100108O(n4)4D DP, small brute force, Gaussian elimination
4006×107O(n3)DP on intervals, matrix exponentiation base
5001.25×108O(n3)Floyd-Warshall, matrix chain DP, network flow (small)
50002.5×107O(n2)2D DP, brute pairs, quadratic sorting, LCS
104108O(n2) (tight)bubble/insertion sort, 2D DP with small constant
1053.2×107O(nn)sqrt decomp, Mo's algorithm, block decomposition
5×105107O(nlog2n)merge sort tree, CDQ divide-and-conquer, persistent seg tree
1062×107O(nlogn)sort, seg tree, BIT, merge sort, convex hull trick, CDQ
107107O(n) or O(nloglogn)linear sieve, suffix array (SA-IS), Euler tour
108108O(n)prefix sums, two-pointer, linear DP, simple iteration
109--O(n) or O(n1/3)prime factorization, Pollard rho, baby-step giant-step
1012--O(n)number-theoretic functions, prime counting (Lucy DP)
1018--O(logn) or O(n3)binary search, matrix exponentiation, fast exponentiation, math

Operations Per Second Rule of Thumb

On a typical Codeforces judge (and competitive programming judges in general):

  • Simple operations (add, compare, array access): 108-109 / sec
  • With cache misses (random memory access): 107-108 / sec
  • Heavy operations (modular division, map lookups): 107 / sec

A 2-second time limit on CF gives you roughly 2×108 to 5×108 "effective operations," depending on how simple each op is. When in doubt, assume 108 per second and give yourself a 2x safety margin.


Problem Patterns & Tricks

Constraint Reading → Algorithm Selection

The first thing you should do on every problem: read n, check the table, eliminate impossible complexity classes. If n2×105, you know O(n2) won't work. Stop thinking about brute force and focus on O(nlogn) or O(n) approaches.

CF 1462D - Add to Neighbour and Remove: n500, so O(n3) is fine. A nested loop over segments works.

Is This Really O(n2)?

Many solutions look quadratic but are actually linear or O(nlogn) due to amortized analysis. Before you dismiss a two-loop solution:

  • Does the inner variable only increase across outer iterations? Amortized O(n).
  • Does each element get processed a bounded number of times? O(n).

CF 1237D - Balanced Playlist: Two-pointer solution with inner loop that appears O(n2) but is O(n).

The nn Sweet Spot

When n105 and you need something between O(nlogn) and O(n2), think sqrt decomposition or Mo's algorithm. Block size n gives O(nn) which is 107.5—fits comfortably.

CF 86D - Powerful Array: Classic Mo's algorithm. Offline range queries in O((n+q)n).

When O(nlog2n) Is Fine

Don't panic about extra log factors. At n=105:

  • O(nlogn)1.7×106
  • O(nlog2n)2.8×107

Both are well within budget. Merge sort tree, persistent segment tree with binary search—the extra log is almost always acceptable.

The Harmonic Trick in Disguise

Any time a problem involves iterating over divisors or multiples, expect the harmonic sum. This includes:

  • Sieve of Eratosthenes variants
  • Counting divisors for each number 1..n
  • GCD/LCM-related preprocessing

Total work: O(nlogn), not O(n2).

CF 1154G - Minimum Possible LCM: Iterate over all multiples of each value. Harmonic sum gives O(nlogn).

2n With Meet-in-the-Middle

If n40 (not 20), split into two halves of 20. Enumerate subsets of each half (220106), combine with sorting + binary search. Total: O(2n/2n).

CF 888E - Maximum Subsequence: n35, modular sum. Classic meet-in-the-middle.


Contest Cheat Sheet

text
+------------------------------------------------------------------+
|                   COMPLEXITY QUICK REFERENCE                      |
+------------------------------------------------------------------+
| CONSTRAINT -> COMPLEXITY -> THINK OF                              |
|------------------------------------------------------------------|
| n <= 10      O(n!)         permutations, brute force              |
| n <= 20      O(2^n)        bitmask DP, subsets                    |
| n <= 500     O(n^3)        Floyd, matrix DP, 3 nested loops       |
| n <= 5000    O(n^2)        2D DP, brute pairs                     |
| n <= 1e5     O(n*sqrt(n))  Mo's, sqrt decomposition               |
| n <= 1e6     O(n log n)    sort, seg tree, BIT, divide & conquer  |
| n <= 1e8     O(n)          prefix sums, two-pointer, greedy       |
|------------------------------------------------------------------|
| BUDGET: ~10^8 simple ops per second                               |
| 2s TL => ~2*10^8 ops budget (with safety margin)                  |
|------------------------------------------------------------------|
| AMORTIZED O(n):                                                   |
|   two-pointer (j only moves forward)                              |
|   stack (each elem pushed/popped once)                            |
|   vector doubling (geometric sum = 2n)                            |
|------------------------------------------------------------------|
| HARMONIC SUM: for(i=1..n) for(j=i;j<=n;j+=i) => O(n log n)      |
|------------------------------------------------------------------|
| MASTER THEOREM: T(n) = a*T(n/b) + O(n^d)                         |
|   d > log_b(a): O(n^d)     d = log_b(a): O(n^d logn)             |
|   d < log_b(a): O(n^(log_b(a)))                                  |
|------------------------------------------------------------------|
| STL COSTS: sort O(nlogn)  set/map O(logn)  umap O(1) avg         |
|            nth_element O(n)  push_back O(1) amort                 |
+------------------------------------------------------------------+

Gotchas & Debugging

Hidden Constants Kill

Big-O hides constants. std::map is O(logn) but with a constant 5-10x larger than a sorted array + binary search due to pointer chasing and cache misses. unordered_map is O(1) average but has high overhead per lookup compared to a flat array.

Rule of thumb: if you can use an array, use an array. Replace map<int,int> with a plain array when keys are bounded.

O(nlogn) Can Lose to O(nn)

A segment tree with lazy propagation (O(nlogn)) has a significant constant from the recursive structure and cache-unfriendly access. Mo's algorithm (O(nn)) can be faster in practice up to n105 because it's just array scans. Profile before over-engineering.

Codeforces TLs Are Generous -- But Not Infinite

CF usually sets TL = 2-3x the intended solution's runtime. An O(nlogn) solution where the setter expected O(n) will pass, but an O(n2) solution where O(nlogn) was expected will not. Don't rely on "maybe my O(n2) is fast enough"—it almost never is at n=105.

Nested STL Calls Stack Complexity

cpp
for (int i = 0; i < n; i++)
    s.erase(s.find(a[i]));  // O(log n) per iteration => O(n log n) total // invariant: a[0..i-1] removed from s

This is fine. But watch out for:

cpp
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)                   // invariant: checked i*n + j pairs so far
        if (s.count(a[i] + a[j])) ans++;  // O(n^2 log n) -- ouch

Each count is O(logn), multiplied by the O(n2) loop. At n=105 this is 1011.7 ops. Use unordered_set to drop it to O(n2) (still too slow at n=105, but the point is: know what your STL calls cost).

When computing n2 or n×m to estimate ops, use long long:

cpp
long long ops = (long long)n * n;  // not int * int

If your estimate overflows, your solution will definitely TLE anyway. But also watch for overflow inside your actual solution—any time you multiply two values that can be up to 105, the product can reach 1010, which overflows int.

Mental Traps

Trap: Reading constraints without computing the implied complexity. Many competitors see n2×105 and jump straight to coding whatever algorithm first came to mind. The constraint is a hint from the problem setter about the intended complexity class. Use it before writing code.

text
  WRONG                              RIGHT
  -----                              -----
  n = 200000                         n = 200000
       |                                  |
       v                                  v
  Pick any algorithm                 200000^2 = 4*10^10
       |                                  |
       v                             4*10^10 >> 10^8 budget
  Code n^2 solution                       |
       |                                  v
       v                             Target: n log n
  +----------+                            |
  |## TLE  ##|                            v
  +----------+                       +----------+
                                     |##  AC  ##|
                                     +----------+

Trap: Assuming "O(n log n) means safe" without computing.O(nlogn) is not a magic pass. At n=106, logn20, so 2×107 ops—fine. But if the "log" is actually "log2" from a nested binary search, that is 4×108—potentially TLE with a heavy constant.

text
  n = 10^6           n log n            n log^2 n
                        |                   |
                        v                   v
                   20 * 10^6           400 * 10^6
                        |                   |
                   +----v----+         +----v----+
                   | 2*10^7  |         | 4*10^8  |
                   |  SAFE   |         |  TIGHT  |
                   +---------+         +---------+
                                            |
                                            v
                               heavy constant --> TLE

Trap: Not recognizing the harmonic sum pattern. A nested loop where the inner loop runs n/i times for each i looks quadratic to the untrained eye. It is O(nlogn). This appears in divisor enumeration, sieve variants, and any "multiples" computation.

text
  WRONG analysis              RIGHT analysis
  --------------              ---------------
  for i in 1 to n             for i in 1 to n
    for j from i step i         for j from i step i
       |                           |
       v                           v
  Two nested loops             inner runs n/i times
       |                           |
       v                           v
  Must be n^2                  sum = n/1 + n/2 + ... + n/n
       |                           = n * H(n)
       v                           |
  # Reject valid  #                v
  # n log n       #           n log n <-- SAFE
  # approach      #

So far we've focused on how many operations your code runs. But the judge also limits how much memory you can use.


Space Complexity and Stack Depth

Time isn't the only resource the judge measures. Memory Limit Exceeded (MLE) is just as fatal as TLE, and harder to debug because you rarely get a helpful message—just a verdict.

Auxiliary Space vs. Input Space

Input space is the memory the input itself occupies—the array you read in, the graph's adjacency list. You can't avoid this.

Auxiliary space is the extra memory your algorithm uses beyond the input. When a problem says "solve in O(1) extra space," it means auxiliary space only.

AlgorithmTotal SpaceAuxiliary Space
Merge sortO(n)O(n)—needs a temporary array
Quicksort (in-place)O(n)O(logn)—recursion stack
Heap sortO(n)O(1)—sorts in place
BFS on graphO(n+m)O(n)—visited + queue
DFS on graph (recursive)O(n+m)O(n)—recursion stack

In competitive programming, the distinction usually does not matter—you care about total memory. But know the vocabulary for interview-style problems.

Why Space Matters

Typical Codeforces memory limit: 256 MB. Some problems give 64 MB or 512 MB, so always check.

Quick reference for a single allocation:

DeclarationCalculationMemory
int a[1000000]106×4 bytes4 MB
long long a[1000000]106×8 bytes8 MB
int a[5000][5000]25×106×4 bytes100 MB—tight!
long long a[5000][5000]25×106×8 bytes200 MB—likely MLE

Key sizes: sizeof(int) = 4, sizeof(long long) = 8, sizeof(double) = 8, sizeof(bool) = 1 (but vector<bool> is 1 bit per entry).

Counting Memory Usage

StructureMemory Formula
Static array T a[n]sizeof(T) * n bytes
vector<T> of size nsizeof(T) * n + ~24 bytes overhead
Adjacency list (undirected, m edges)2m× sizeof(pair<int,int>) = 16m bytes
Segment tree over n elements4n× sizeof(node)
Recursion of depth dd× stack-frame size (typically 50-200 bytes per call)

Rule of thumb: add up all your large arrays and containers. If the total is within ~80 % of the memory limit, you're usually safe. The remaining 20 % covers code, constants, and OS overhead.

Common MLE Causes (and Fixes)

2-D DP with both dimensions 105

cpp
// dp[100000][100000] = 10^{10} ints = 40 GB -- impossible
int dp[100000][100000]; // MLE

Fix—rolling array: if each row only depends on the previous row, keep just two rows:

cpp
int dp[2][100000]; // 0.8 MB -- fine
for (int i = 1; i <= n; i++) {             // invariant: dp[prv] holds row i-1 results
    int cur = i & 1, prv = cur ^ 1;
    for (int j = 0; j < m; j++)            // invariant: dp[cur][0..j-1] already computed
        dp[cur][j] = /* transition using dp[prv][...] */;
}

Adjacency matrix for large n

cpp
bool adj[100000][100000]; // 10^{10} bytes = 10 GB -- impossible

Fix: use an adjacency list. It uses O(n+m) memory, not O(n2).

Storing all BFS/DFS states

Pushing every intermediate state into a vector can blow up memory when the state space is large. Keep only what you need (e.g., distances, not full paths).

MLE Estimation Table

Use this during contests to quickly check whether an array fits:

text
Memory Limit | Max int (4 B) elements | Max long long (8 B) elements
-------------+------------------------+-----------------------------
  64 MB      |      16 x 10^6         |         8 x 10^6
 256 MB      |      64 x 10^6         |        32 x 10^6
 512 MB      |     128 x 10^6         |        64 x 10^6
   1 GB      |     256 x 10^6         |       128 x 10^6

Formula: max elements=limit in bytessizeof(T).

Stack Depth and Recursion

The call stack is a separate, limited memory region:

  • Default stack size: ~1-8 MB (platform and judge dependent).
  • Each recursive call adds a stack frame: local variables + return address + saved registers ~= 50-200 bytes, depending on how many locals you declare.
  • Max safe recursion depth: roughly 104-105 calls.

A DFS on a path graph of 105 nodes recurses 105 deep. With 200 bytes per frame, that's ~20 MB—possible stack overflow on many judges.

Stack depth rules of thumb:

Recursion DepthRisk LevelNotes
104SafeDefault stack handles this on all judges
104105BorderlineWorks on most CF judges, may fail locally
105106DangerousNeeds stack size increase or iterative conversion
>106Almost certain crashMust convert to iterative

Example: DFS on a tree with n=105 nodes is fine on Codeforces (which gives ~256 MB stack). But a tree with n=106 nodes and a worst-case path graph will recurse 106 deep—at 100 bytes per frame, that's 100 MB of stack, which will likely crash.

Fixes:

cpp
// Fix 1: Increase stack size via compiler pragma (works on CF with G++)
// Add at the top of your file:
#pragma comment(linker, "/STACK:1000000000")  // Windows MSVC
// On Linux, compile with: g++ -O2 -Wl,-z,stacksize=268435456 sol.cpp

// Fix 2: On Linux systems (local testing), before running:
// $ ulimit -s unlimited
// This removes the stack size limit for the current shell session.
// NOTE: This does NOT work on online judges -- it's for local testing only.

// Fix 3: Convert recursion to an explicit stack (always works, recommended)
void dfs_iterative(int start) {
    stack<int> st;
    st.push(start);
    while (!st.empty()) {                  // invariant: st contains unvisited frontier
        int v = st.top(); st.pop();
        if (visited[v]) continue;
        visited[v] = true;
        for (int u : adj[v])               // invariant: all neighbors of v get queued
            if (!visited[u]) st.push(u);
    }
}

// Fix 4: For DFS specifically, use a parent-tracking iterative version
// that preserves DFS order (unlike the stack version above which reverses it)
void dfs_iterative_ordered(int root) {
    struct Frame { int v, parent, child_idx; };
    stack<Frame> st;
    st.push({root, -1, 0});
    visited[root] = true;
    while (!st.empty()) {                  // invariant: simulates call stack
        auto& [v, par, ci] = st.top();
        if (ci < (int)adj[v].size()) {
            int u = adj[v][ci++];
            if (u != par && !visited[u]) {
                visited[u] = true;
                st.push({u, v, 0});        // "recurse" into child
            }
        } else {
            st.pop();                       // "return" from this call
        }
    }
}

When to convert recursion to iteration:

  • Tree/graph with n>5×104 nodes and potential chain/path structure
  • DP with recursion depth proportional to n where n>105
  • Any problem where you get Runtime Error (RE) and suspect stack overflow
  • When the judge does not support stack size pragmas (rare but happens on some OJs)

Bottom line: estimate memory the same way you estimate time. Add up your arrays, check the limit, and watch your recursion depth. See Recursion and Backtracking for more on converting recursive algorithms to iterative ones.


When Complexity Analysis Fails/Misleads

Complexity analysis is your most powerful tool, but it has blind spots. Know when it lies to you:

Constants matter more than you think. An O(n) algorithm with a constant factor of 100 is slower than O(nlogn) with constant 1 at n<21001030. In practice: std::map (O(logn) with huge constant) loses to a hash map (O(1) average with moderate constant) and both lose to a plain array (O(1) with minimal constant).

Cache effects can dominate. A 2D array traversed column-by-column is O(n2) just like row-by-row—same big-O. But column traversal causes cache misses and can be 5-10x slower. Big-O says nothing about cache behavior.

cpp
// Same O(n^2) -- but wildly different wall-clock time for large n
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        a[i][j] += 1;   // row-major: FAST (sequential memory access)

for (int j = 0; j < n; j++)
    for (int i = 0; i < n; i++)
        a[i][j] += 1;   // column-major: SLOW (cache thrashing)

Worst case vs. average case. Quicksort is O(n2) worst case but O(nlogn) average—and in practice it beats merge sort (which is always O(nlogn)) thanks to cache locality. std::sort in GCC uses introsort: quicksort that falls back to heapsort if recursion gets too deep, giving worst-case O(nlogn) guarantee.

Amortized ≠ per-operation guarantee.vector::push_back is O(1) amortized, but a single push can take O(n). In real-time systems this matters. In CP it almost never does, but be aware when interleaving operations with time-sensitive logic.

The constant in O(nlog2n) adds up. At n=105: log2n289, so nlog2n2.9×107—fine. At n=106: log2n400, so 4×108—tight with any overhead. Don't dismiss extra log factors at large n.


Debug This -- What's the Real Complexity?

Determine the actual time complexity of each snippet. Answers are below (try before peeking!).

Snippet 1:

cpp
int ans = 0;
for (int i = 1; i <= n; i++)        // outer: 1 to n
    for (int j = 1; j <= n; j *= 2) // inner: 1, 2, 4, 8, ...
        ans += i + j;

Snippet 2:

cpp
int ans = 0;
for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
        for (int k = j; k < n; k++)
            ans++;

Snippet 3:

cpp
int ans = 0;
for (int i = 1; i <= n; i++) {
    int j = i;
    while (j > 0) {
        ans += j % 10;
        j /= 10;
    }
}

Snippet 4:

cpp
vector<int> v;
for (int i = 0; i < n; i++) {
    v.push_back(i);
    if (v.size() > 100)
        v.erase(v.begin());  // erase from front
}
Answers (click to reveal)

Snippet 1: O(nlogn). The inner loop runs log2n times (doubling each iteration). Outer loop runs n times. Total: n×logn.

Snippet 2: O(n3). For fixed i and j, the k loop runs nj times. Summing over j: j=0n1(nj)=n(n+1)/2=O(n2). Over all i: n×O(n2)=O(n3). (Not O(n4)!)

Snippet 3: O(nlogn). The inner while loop on j runs log10i+1 times (number of digits of i). Sum: i=1nlog10i=log10(n!)=O(nlogn) by Stirling's approximation.

Snippet 4: O(nmin(n,100))=O(n) because v.size() is capped at 101. Each erase(v.begin()) shifts at most 100 elements. So total work is O(100n)=O(n) despite erase being "O(size)" in general.


What Would You Do If...?

Scenario 1: You read the constraints and see n2×105, q2×105. The problem asks for range sum queries after point updates. What complexity do you target?

Answer: O((n+q)logn). This screams BIT or segment tree. O(nq) brute force does 4×1010 ops—instant TLE. Each query and update must be O(logn).

Scenario 2: n40, and you need to find if any subset sums to a target S. Your first instinct is bitmask DP over all 240 subsets, but 2401012—TLE.

Answer: Meet-in-the-middle. Split into two halves of 20. Enumerate 220106 subsets for each half. Sort one half, binary search from the other. Total: O(2n/2n).

Scenario 3: n106 and you need to count the number of divisors for every integer from 1 to n. A naive approach checks all divisors up to i for each i.

Answer: Use the "iterate over multiples" pattern: for each d from 1 to n, increment the divisor count for d,2d,3d,. This is the harmonic sum pattern: total ops =n/1+n/2++n/n=O(nlogn). Much faster than O(nn).


Before You Code Checklist

Run through this list before writing a single line on any contest problem:

  • [ ] Read n: What is the largest input dimension? Check all constraints (n, m, q, k, T).
  • [ ] Compute the budget: n2 at this n is how many ops? Compare to 2×108.
  • [ ] Identify the complexity class: Use the constraint table. What family of algorithms fits?
  • [ ] Check space: Will your arrays fit in the memory limit? Especially watch 2D arrays and adjacency matrices.
  • [ ] Watch for hidden factors: Does your approach use set/map (add logn)? Multiple test cases (multiply by T)?

If I Had 5 Minutes

Speed reference—the five things to know if you read nothing else:

  • 108 rule: A program does ~108 simple ops per second. Budget = time limit x 108.
  • Constraint → Complexity: n202n. n5000n2. n106nlogn.
  • Harmonic sum: for(i=1..n) for(j=i; j<=n; j+=i) is O(nlogn), not O(n2).
  • Amortized O(n): If a variable only moves forward across all iterations, the total is O(n).
  • Memory: int a[N] uses 4N bytes. 256 MB fits ~6.4×107 ints.

Rating Progression

How complexity analysis skills evolve as you grow on Codeforces:

CF 1200 (Beginner): You can look up the constraint table and pick O(n2) vs O(nlogn). You recognize that n=105 means "no brute force." Most Div 2 A/B problems test this directly.

CF 1500 (Intermediate): You recognize amortized O(n) patterns like two-pointer and monotone stack without second-guessing. You estimate operation counts mentally and know when O(nlog2n) is still safe. You never submit an O(n2) solution at n=105.

CF 1800 (Advanced): You spot the harmonic sum pattern in divisor/multiple problems instantly. You know when O(nn) is the intended complexity (Mo's, sqrt decomp). You can analyze recursion trees for non-standard recurrences. You make space-time tradeoffs (rolling array, bitset compression).

CF 2100+ (Expert): You reason about complexity with multiple parameters (O(n2kk) for bitmask DP with parameter k). You recognize when meet-in-the-middle applies (n40). You design algorithms to hit a target complexity, not just verify one. You understand when "TLE" means "wrong approach" versus "needs constant optimization." See Bit Manipulation for bitmask DP patterns.


The Mistake That Teaches

It was a Div 2 C. The constraint said n105, and I had a clean two-pointer solution—but inside the inner loop I called s.count(x) on a std::set. "It's just a lookup," I thought. "O(1)." Wrong. set::count is O(logn), not O(1). My "O(n)" solution was actually O(nlogn).

That time it still passed (the extra log fit in budget). But two weeks later, the same mistake on a problem with n=106 and tight TL gave me TLE. I spent 40 minutes optimizing the wrong thing before realizing the issue was one function call.

Lesson: Always mentally multiply the complexity of STL operations into your total. An innocent .count(), .find(), or .substr() inside a loop can silently add a log factor or worse. Refer to the STL Containers reference when in doubt.


Self-Test

Before moving to practice problems, verify you can do these without looking back at the text:

  • [ ] Given n=3000, immediately state whether O(n2) and O(n3) are safe, tight, or TLE for a 2-second time limit.
  • [ ] Explain why for(i=1..n) for(j=i; j<=n; j+=i) is O(nlogn), not O(n2), using the harmonic sum argument.
  • [ ] For the two-pointer pattern where j only moves forward, give a rigorous amortized argument that total operations are O(n).
  • [ ] Name three STL operations and their complexities, including one commonly mistaken for O(1) but actually O(logn).
  • [ ] Given n=2×105 and an O(nlogn) algorithm at 50 ns per operation, compute whether it passes a 2-second time limit.
  • [ ] Convert the constraint n20 into a target complexity class and name two algorithm families that fit.
  • [ ] Explain why vector::push_back is O(1) amortized using the geometric series argument.

Practice Problems

Problems ordered by difficulty. CF ratings shown where applicable.

#ProblemKey SkillCF RatingDifficulty
1CSES - Missing NumberO(n) vs O(nlogn)—know which you need~800****
2CF 1462D - Add to Neighbour and Removen500, so O(n2) or O(n3) is fine1100****
3CSES - Sum of Two ValuesTwo-pointer or hash map -> O(n) or O(nlogn)~1200****
4CF 1251C - Minimize the IntegerGreedy O(n)—constraints confirm no DP needed1200****
5CF 1237D - Balanced PlaylistTwo-pointer amortized O(n) despite nested loop1800****
6CF 86D - Powerful ArrayMo's algorithm O(nn)2000****
7CF 888E - Maximum SubsequenceMeet-in-the-middle O(2n/2)1800****
8CF 1154G - Minimum Possible LCMHarmonic sum trick O(nlogn)1900****
9AtCoder ABC 169F - Knapsack for All SubsetsCounting DP—constraint dictates O(nS)~2000****

Further Reading


Next Up: Arrays and Strings—where you apply these complexity skills to the most common data structures in competitive programming. Every array technique has a complexity signature; now you know how to read it.

Built for the climb from Codeforces 1100 to 2100.