Debugging and Stress Testing

Quick Revisit
USE WHEN: WA on a hidden test and staring at code is not finding the bug
INVARIANT: If brute-force is correct, any input where sol and brute disagree reveals the bug
TIME: O(minutes of your time) | SPACE: 3 files (sol, brute, gen) + stress script
CLASSIC BUG: Generator produces only easy cases (e.g., sorted input)—randomize all parameters
PRACTICE: ../12-practice-ladders/01-ladder-1100-to-1400.md

Difficulty: Beginner | CF Rating: 1000–1400 | Prerequisites: Basic C++, basic bash/shell

Intuition
Visual Intuition
C++ STL Reference
Implementation (Contest-Ready)
Generator Reference
Common Bug Patterns
Contest Cheat Sheet
Gotchas and Meta-Debugging
Self-Test Checklist
Practice Problems
Further Reading
What to Solve Next

Intuition

Stop Staring at Your Code

You submitted your solution. WA on test 47. The test has $n = 200 000$ elements and you can't eyeball what went wrong. You re-read your code five times. You add a +1 somewhere, resubmit. Still WA.

Staring is the wrong strategy. Your brain wrote the bug—the same brain is now trying to find it, reading what it meant to write rather than what it actually wrote. You need a mechanical process that removes your bias from the equation entirely.

That process is stress testing.

When to Reach for This

Stress testing is the right move whenever you hit one of these walls:

WA on hidden test cases—you passed the samples but the judge says no.
Passes samples but fails submission—your logic looks correct on the examples, yet something is off.
Can't find the bug by reading code—you have re-read your solution three times and see nothing wrong.
Edge case uncertainty—you are not sure your code handles $n = 1$ , empty input, all-same values, or maximum constraints.
Off-by-one suspicion—your loop bounds, binary search, or index arithmetic feel fragile.

If any of these apply, stop staring and start stress testing. The cost is 5–10 minutes of setup; the payoff is a concrete failing input you can actually read.

The Idea in One Sentence

Write a slow-but-obviously-correct brute force, generate thousands of small random inputs, and run both solutions until their outputs disagree. Then you have a tiny failing test you can actually debug by hand.

The Three Files You Need

Every stress test has three programs and one script:

File	Purpose
`sol.cpp`	Your real solution—the one that might be buggy
`brute.cpp`	A trivially correct $O (n^{2})$ or $O (n!)$ solution that is definitely right
`gen.cpp`	A random test generator that writes to stdout
`stress.sh`	A bash loop that wires everything together

The flow looks like this:

text

                    +----------+
                    |  gen.cpp |
                    |  (seed)  |
                    +----+-----+
                         |
                         v
                    +-----------+
                    | input.txt |
                    +-----+-----+
                     /          \
                    v            v
            +----------+   +-----------+
            | sol.cpp  |   | brute.cpp |
            +----+-----+   +-----+-----+
                 |                |
                 v                v
            +----------+   +-----------+
            | out1.txt |   | out2.txt  |
            +----+-----+   +-----+-----+
                  \              /
                   v            v
                  +------+-------+
                  | diff out1    |
                  |      out2    |
                  +--------------+
                        |
               +--------+--------+
               |                 |
               v                 v
           MATCH              MISMATCH!
          (next i)          (print input,
                              stop)

If the diff finds a mismatch, you have a small test case that breaks your solution. Now you can trace through it by hand.

Writing Generators That Actually Work

Never use rand(). Its quality is implementation-defined, its range is tiny on some platforms, and seeding it properly is annoying. Use the Mersenne Twister instead:

cpp

#include <bits/stdc++.h>
using namespace std;

int main(int argc, char* argv[]) {
    mt19937 rng(atoi(argv[1]));  // seed from command line

    // random int in [lo, hi]
    auto randint = [&](int lo, int hi) {
        return uniform_int_distribution<int>(lo, hi)(rng);
    };

    int n = randint(1, 10);  // keep small for stress testing!
    cout << n << "\n";
    for (int i = 0; i < n; i++) {
        cout << randint(-100, 100) << " \n"[i == n - 1];
    }
}

The seed comes from argv[1] so each iteration of the stress loop produces a different test. Keep $n$ small—you want tests you can read when a mismatch surfaces.

Random Arrays

cpp

int n = randint(1, 10);
cout << n << "\n";
for (int i = 0; i < n; i++)
    cout << randint(1, 1000) << " \n"[i == n - 1];

Random Permutations

cpp

int n = randint(1, 8);
vector<int> p(n);
iota(p.begin(), p.end(), 1);
shuffle(p.begin(), p.end(), rng);
cout << n << "\n";
for (int i = 0; i < n; i++)
    cout << p[i] << " \n"[i == n - 1];

Random Trees

Use the random-parent method: node $i$ picks a random parent among $1 \dots i - 1$ .

cpp

int n = randint(2, 10);
cout << n << "\n";
for (int i = 2; i <= n; i++) {
    int parent = randint(1, i - 1);
    cout << parent << " " << i << "\n";
}

Alternatively, use a Pruefer sequence for a uniformly random labeled tree:

cpp

int n = randint(2, 10);
vector<int> code(n - 2);
for (int& x : code) x = randint(1, n);
// decode Pruefer sequence
vector<int> deg(n + 1, 1);
for (int x : code) deg[x]++;
int leaf = 1;
while (deg[leaf] != 1) leaf++;
int ptr = leaf;
vector<pair<int,int>> edges;
for (int x : code) {
    edges.push_back({ptr, x});
    deg[x]--;
    if (deg[x] == 1 && x < leaf) {
        ptr = x;
    } else {
        leaf++;
        while (leaf <= n && deg[leaf] != 1) leaf++;
        ptr = leaf;
    }
}
edges.push_back({ptr, n});
cout << n << "\n";
for (auto [u, v] : edges) cout << u << " " << v << "\n";

Random Graphs

Be careful to avoid duplicate edges and self-loops:

cpp

int n = randint(2, 6);
int max_edges = n * (n - 1) / 2;
int m = randint(n - 1, min(max_edges, 2 * n));  // connected-ish
set<pair<int,int>> used;
cout << n << " " << m << "\n";
// first make a spanning tree to guarantee connectivity
vector<int> perm(n);
iota(perm.begin(), perm.end(), 1);
shuffle(perm.begin(), perm.end(), rng);
for (int i = 1; i < n; i++) {
    int u = perm[randint(0, i - 1)], v = perm[i];
    if (u > v) swap(u, v);
    used.insert({u, v});
    cout << u << " " << v << "\n";
}
// add random edges until we have m
while ((int)used.size() < m) {
    int u = randint(1, n), v = randint(1, n);
    if (u == v) continue;
    if (u > v) swap(u, v);
    if (used.count({u, v})) continue;
    used.insert({u, v});
    cout << u << " " << v << "\n";
}

Random Strings

cpp

int n = randint(1, 15);
string s(n, ' ');
for (char& c : s) c = 'a' + randint(0, 25);
cout << s << "\n";

For problems on small alphabets, use randint(0, 1) for binary strings or randint(0, 2) for {a, b, c}.

Writing Brute Force: Stupid but Correct

Your brute force does not need to be fast. It needs to be obviously correct—so obvious that even a cursory read would catch any bug. If the problem says "find the minimum sum subarray," try all $O (n^{2})$ subarrays:

cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n; cin >> n;
    vector<int> a(n);
    for (int& x : a) cin >> x;

    int ans = INT_MAX;
    for (int i = 0; i < n; i++) {
        int sum = 0;
        for (int j = i; j < n; j++) {
            sum += a[j];
            ans = min(ans, sum);
        }
    }
    cout << ans << "\n";
}

If the problem is on graphs and you're unsure about your Dijkstra, write Floyd-Warshall. If it's DP, brute-force with bitmask over all subsets. The goal is zero cleverness—zero room for bugs.

text

  COMPARISON PIPELINE

  seed ---> +-------+
            | gen   |---> input
            +-------+        |
                        +----+----+
                        |         |
                        v         v
                   +--------+ +--------+
                   | brute  | |  sol   |
                   | O(n^2) | | O(nlgn)|
                   +---+----+ +---+----+
                       |          |
                       v          v
                   expected    actual
                       |          |
                       +----+-----+
                            |
                            v
                       +--------+
                       |  diff  |
                       +---+----+
                           |
                     +-----+------+
                     |            |
                   match      MISMATCH
                     |            |
                     v            v
                  next seed    STOP
                               print input

The Stress Loop

Here is the complete stress.sh—copy-paste it into every contest directory:

bash

#!/bin/bash
set -e
g++ -O2 -o sol sol.cpp
g++ -O2 -o brute brute.cpp
g++ -O2 -o gen gen.cpp
echo "Compiled. Running stress test..."
for i in $(seq 1 10000); do
    ./gen $i > input.txt
    ./sol < input.txt > out1.txt
    ./brute < input.txt > out2.txt
    if ! diff -q out1.txt out2.txt > /dev/null; then
        echo "MISMATCH on test $i"
        echo "--- Input ---"
        cat input.txt
        echo "--- Expected (brute) ---"
        cat out2.txt
        echo "--- Got (sol) ---"
        cat out1.txt
        break
    fi
    if (( i % 100 == 0 )); then
        echo "Passed $i tests..."
    fi
done

Run it: chmod +x stress.sh && ./stress.sh

If it passes 10,000 tests, your solution is probably correct (or your generator isn't producing the right edge cases—more on that later).

text

  STRESS TESTING WORKFLOW LOOP

  +-------+     +-------+     +--------+
  | Write |---->| Write |---->| Write  |
  |  sol  |     | brute |     |  gen   |
  +-------+     +-------+     +---+----+
                                  |
              +-------------------+
              |
              v
  +-----------------------+
  | Run stress loop       |<---------+
  | gen -> sol -> brute   |          |
  | -> diff               |          |
  +----------+------------+          |
             |                       |
       +-----+------+                |
       |            |                |
     match      mismatch             |
       |            |                |
       v            v                |
  +---------+  +----------+          |
  | 10k     |  | Add cerr |          |
  | passed  |  | Trace by |          |
  | Submit  |  | hand     |          |
  +---------+  +-----+----+          |
                     |               |
                     v               |
                 +--------+          |
                 | Fix    |----------+
                 | bug    |
                 +--------+

Debug Printing with cerr

When stress testing finds a failing case, you need to figure out where in your code the bug lives. Add cerr prints:

cpp

cerr << "i=" << i << " dp[i]=" << dp[i] << "\n";

cerr goes to stderr, so it won't pollute your stdout (which is what the judge reads). But you don't want these prints running on the judge either—they slow you down. Guard them:

cpp

#ifdef LOCAL
#define dbg(x) cerr << #x << " = " << (x) << "\n"
#else
#define dbg(x)
#endif

Compile locally with g++ -DLOCAL -O2 sol.cpp and dbg(variable) prints on your machine but vanishes on the judge.

Here is a more powerful debug macro that handles multiple arguments:

cpp

#ifdef LOCAL
#define dbg(...) cerr << "[" << #__VA_ARGS__ << "]: ", _dbg(__VA_ARGS__)
void _dbg() { cerr << "\n"; }
template<class T, class... U>
void _dbg(T t, U... u) { cerr << t; if (sizeof...(u)) cerr << ", "; _dbg(u...); }
#else
#define dbg(...)
#endif

Usage: dbg(i, j, dp[i][j]) prints [i, j, dp[i][j]]: 3, 5, 42.

Assert Your Assumptions

If you assume an array index is in bounds, prove it:

cpp

assert(0 <= idx && idx < n);

assert() crashes your program with a message naming the exact line that failed. That is infinitely more useful than a silent wrong answer with no indication of where things went sideways. Use it liberally during debugging—strip it before submitting if you are worried about speed, though assert is nearly free in practice.

The Edge Case Checklist

Before you even stress test, run these by hand or in your generator:

text

+-------------------------------------------------------+
|  EDGE CASES TO ALWAYS CHECK                           |
+-------------------------------------------------------+
|  * n = 0 or n = 1                                     |
|  * All elements the same                              |
|  * Array already sorted / reverse sorted              |
|  * Maximum values (watch for overflow!)               |
|  * Negative numbers (especially with modular arith)   |
|  * Empty graph (m = 0)                                |
|  * Disconnected graph                                 |
|  * Self-loops and duplicate edges                     |
|  * Single-node graph                                  |
|  * n = max constraint (does it TLE or MLE?)           |
|  * Answer is 0 or answer is negative                  |
|  * All queries are the same                           |
|  * Strings of length 1                                |
+-------------------------------------------------------+

A good generator will produce some of these naturally when $n$ is small. If you want guaranteed edge cases, add them as hardcoded tests inside your stress loop before the random phase.

Print, Trace, Repeat

Once you have a failing test—say $n = 4$ , array = [3, -1, 2, -5]—the process is:

Run your solution on that input with cerr debug prints.
Run the brute force on the same input.
On paper, trace through your algorithm step by step.
Find the first moment where your program's state diverges from what it should be.

That divergence point is your bug. It is almost always an off-by-one, a wrong comparison operator, or an uninitialized variable.

text

  BUG BISECTION STRATEGY
  Binary search on your code

  +---------- your code ----------+
  | line 1                        |
  |   ***                         |
  | line 50  <--- add print here  |
  |   ***                         |
  | line 100                      |
  +-------------------------------+
              |
        +-----+------+
        |            |
   correct at    wrong at
   line 50       line 50
        |            |
        v            v
  +-----------+ +-----------+
  | Search    | | Search    |
  | lines     | | lines     |
  | 51--100   | | 1--49     |
  +-----------+ +-----------+
        |            |
        v            v
  check line 75  check line 25
        |            |
        v            v
     narrow        narrow
     further       further
        |            |
        +-----+------+
              |
              v
       Exact bug line

What the Code Won't Teach You

Debugging is hypothesis testing. Every bug investigation follows the same loop: form a hypothesis about what is wrong, design a test that would falsify it, run the test, update your hypothesis. Simple on paper, genuinely difficult in practice.

Most beginners skip the hypothesis entirely and go straight to changing code—debugging by random mutation. That approach occasionally works, but only by accident, and it leaves you no closer to understanding what went wrong.

text

  Hypothesis-driven debugging loop

  +----------+     +----------+     +---------+
  | Observe  |---->| Form     |---->| Design  |
  | failure  |     | hypothe- |     | test    |
  +----------+     | sis      |     +----+----+
       ^           +----------+          |
       |                                 v
       |           +----------+     +---------+
       +-----------| Update   |<----| Run     |
                   | belief   |     | test    |
                   +----------+     +---------+

The bug is usually in the part you were most confident about. Humans are systematically overconfident about code they wrote recently or understand best. The bug hides in "trivial" pieces—I/O parsing, an index calculation, a base case—not in the clever algorithm you have been studying. When debugging, verify the boring parts first.

Stress testing is a force multiplier, not a last resort. Do not wait until you have spent 30 minutes staring at code to reach for it. If you have any doubt about correctness—especially on greedy or DP problems—write the brute force first and stress test before your first submission. Finding a bug before submission costs minutes; finding it after can cost an hour or a contest. Once you can trace a stress test by hand, watching the loop run gives you even stronger intuition for where bugs actually hide.

Visual Intuition

Stress Test in Action

Suppose your problem is: "Given an array, find the maximum subarray sum."

Your sol.cpp uses Kadane's algorithm, but you wrote max(0, ...) instead of max(a[i], ...)—a classic bug when all elements are negative.

text

gen output (seed=17):     sol output:     brute output:

  4                         0               -1
  -3 -1 -5 -2

  diff says: MISMATCH!

Your solution prints 0, but the right answer is -1 (the subarray [-1]). The code resets to 0 instead of taking the element itself. Bug found in 30 seconds, not 30 minutes.

Debug Output for Tracing

With dbg() prints inside Kadane's:

text

Input: 4 elements: -3 -1 -5 -2

[i, a[i], cur, best]: 0, -3, 0, 0      <-- cur should be -3, not 0!
[i, a[i], cur, best]: 1, -1, 0, 0      <-- bug propagates
[i, a[i], cur, best]: 2, -5, 0, 0
[i, a[i], cur, best]: 3, -2, 0, 0

Output: 0   (wrong -- should be -1)

Line $i = 0$ reveals the bug: cur should be $- 3$ but is $0$ because of the faulty max(0, cur + a[i]).

C++ STL Reference

Function / Class	Header	What it does	Time
`mt19937 rng(seed)`	`<random>`	Mersenne Twister PRNG, 32-bit	$O (1)$ per call
`mt19937_64 rng(seed)`	`<random>`	64-bit Mersenne Twister	$O (1)$ per call
`uniform_int_distribution<int>(a,b)`	`<random>`	Random int in $[a, b]$	$O (1)$
`shuffle(begin, end, rng)`	`<algorithm>`	Random permutation of range	$O (n)$
`assert(expr)`	`<cassert>`	Abort with message if `expr` is false	$O (1)$
`cerr << x`	`<iostream>`	Print to stderr (judge ignores)	$O (1)$
`iota(begin, end, val)`	`<numeric>`	Fill range with `val, val+1, ...`	$O (n)$
`atoi(s)`	`<cstdlib>`	Convert C-string to int	$O (l e n)$
`chrono::steady_clock`	`<chrono>`	Seed from clock for non-reproducible	$O (1)$

For seeding with time when you don't need reproducibility:

cpp

mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());

But for stress testing, use argv[1]—you need reproducible seeds to replay failing cases.

Implementation (Contest-Ready)

Minimal Template—Just Copy and Go

gen.cpp (random array):

cpp

#include <bits/stdc++.h>
using namespace std;
int main(int argc, char* argv[]) {
    mt19937 rng(atoi(argv[1]));
    auto ri = [&](int a, int b) { return uniform_int_distribution<int>(a, b)(rng); };
    int n = ri(1, 10);
    cout << n << "\n";
    for (int i = 0; i < n; i++) cout << ri(-100, 100) << " \n"[i == n - 1];
}

brute.cpp (max subarray sum, $O (n^{2})$ — correct by construction):

cpp

#include <bits/stdc++.h>
using namespace std;
int main() {
    int n; cin >> n;
    vector<int> a(n);
    for (int& x : a) cin >> x;
    int ans = a[0];
    for (int i = 0; i < n; i++) {
        int s = 0;
        for (int j = i; j < n; j++) { s += a[j]; ans = max(ans, s); }
    }
    cout << ans << "\n";
}

stress.sh:

bash

#!/bin/bash
set -e
g++ -O2 -o sol sol.cpp
g++ -O2 -o brute brute.cpp
g++ -O2 -o gen gen.cpp
for i in $(seq 1 10000); do
    ./gen $i > input.txt
    ./sol < input.txt > out1.txt
    ./brute < input.txt > out2.txt
    if ! diff -q out1.txt out2.txt > /dev/null; then
        echo "MISMATCH on test $i"
        cat input.txt
        echo "---"
        echo "Expected:"; cat out2.txt
        echo "Got:"; cat out1.txt
        break
    fi
done
echo "Done."

Debug macro (paste at the top of sol.cpp):

cpp

#ifdef LOCAL
#define dbg(...) cerr << "[" << #__VA_ARGS__ << "]: ", _dbg(__VA_ARGS__)
void _dbg() { cerr << "\n"; }
template<class T, class... U>
void _dbg(T t, U... u) { cerr << t; if (sizeof...(u)) cerr << ", "; _dbg(u...); }
#else
#define dbg(...)
#endif

Explained Version

gen.cpp (random tree, with comments):

cpp

#include <bits/stdc++.h>
using namespace std;

int main(int argc, char* argv[]) {
    // Seed from command line so each stress iteration is different
    mt19937 rng(atoi(argv[1]));

    // Helper: random integer in [a, b] inclusive
    auto ri = [&](int a, int b) {
        return uniform_int_distribution<int>(a, b)(rng);
    };

    // Keep n small so failing cases are human-readable
    int n = ri(2, 10);
    cout << n << "\n";

    // Random tree: node i picks a parent from [1, i-1]
    // This always produces a valid tree (connected, n-1 edges, no cycles)
    for (int i = 2; i <= n; i++) {
        int parent = ri(1, i - 1);
        // Output edges in either order; shuffle if problem expects it
        cout << parent << " " << i << "\n";
    }
}

gen.cpp (random graph, with comments):

cpp

#include <bits/stdc++.h>
using namespace std;

int main(int argc, char* argv[]) {
    mt19937 rng(atoi(argv[1]));
    auto ri = [&](int a, int b) {
        return uniform_int_distribution<int>(a, b)(rng);
    };

    int n = ri(2, 6);
    int max_m = n * (n - 1) / 2;
    int m = ri(n - 1, min(max_m, 2 * n));

    cout << n << " " << m << "\n";

    // Build a random spanning tree first (guarantees connectivity)
    vector<int> ord(n);
    iota(ord.begin(), ord.end(), 1);
    shuffle(ord.begin(), ord.end(), rng);

    set<pair<int,int>> edges;
    for (int i = 1; i < n; i++) {
        int u = ord[ri(0, i - 1)], v = ord[i];
        if (u > v) swap(u, v);
        edges.insert({u, v});
        cout << u << " " << v << "\n";
    }

    // Fill remaining edges randomly, avoiding duplicates and self-loops
    while ((int)edges.size() < m) {
        int u = ri(1, n), v = ri(1, n);
        if (u == v) continue;
        if (u > v) swap(u, v);
        if (edges.count({u, v})) continue;
        edges.insert({u, v});
        cout << u << " " << v << "\n";
    }
}

Generator Reference

The table below covers every common input shape you will encounter and how to generate it:

Generator Type	What it produces	Key Parameters	Notes
Random array	Array of $n$ integers	$n$ , value range $[l o, h i]$	Most common; keep $n \leq 10$ for stress
Random permutation	Permutation of $1 \dots n$	$n$	Use `shuffle` with `mt19937`
Random tree	Tree on $n$ nodes	$n$	Random parent or Pruefer sequence
Random graph	Simple undirected graph	$n$ , $m$	Span tree + random edges; avoid duplicates
Random string	String of length $n$	$n$ , alphabet size	`'a' + randint(0, k-1)`
Random queries	$q$ queries on a structure	$q$ , query type, ranges	Match format of problem input
Weighted edges	Graph with edge weights	$n$ , $m$ , weight range	Add `randint(1, W)` per edge
Random DAG	Directed acyclic graph	$n$ , $m$	Edges only from lower to higher node index
Pair of arrays	Two arrays (e.g., for matching)	$n$ , value ranges	Two loops, same $n$

Common Bug Patterns

Pattern 1: WA on a Large Test

You passed samples but fail on test 47 with $n = 10^{5}$ . You can't read the input. Stress test with small $n$ —the bug almost always reproduces on $n \leq 10$ .

Pattern 2: Wrong Answer, but Only Sometimes

Your solution works on most tests but not all. Increase the stress loop count to 50,000+ and broaden your generator to cover more shapes (sorted, reverse-sorted, all-same).

Pattern 3: TLE on Specific Inputs

Write a worst-case generator instead of a random one. If you suspect your algorithm degrades on sorted input, generate sorted arrays. If you suspect quadratic behavior from a hash map, generate keys with collisions.

Pattern 4: Runtime Error

Compile with sanitizer flags:

bash

g++ -O2 -fsanitize=address,undefined -o sol sol.cpp

-fsanitize=address catches out-of-bounds access, use-after-free, and similar memory errors. -fsanitize=undefined catches signed overflow, shift errors, and null dereference. Both give you the exact line and a stack trace—far more useful than a bare crash.

Pattern 5: Integer Overflow

Your intermediate calculation exceeds $2^{31} - 1$ . Symptoms: works on small tests, fails silently on large ones. Quick check:

cpp

assert(1LL * a * b <= 2e18);  // verify product fits in long long

Or stress test with values near $10^{9}$ and watch for results that go negative.

Pattern 6: Off-by-One in Binary Search

Your binary search answer is off by 1. Stress test against a brute-force linear scan. The mismatch surfaces quickly and the tiny input makes the bug easy to trace.

Pattern 7: Uninitialized Variables

Globals are zero-initialized in C++, but locals are not. If you reuse a local variable across iterations without resetting it, the bug is intermittent and maddening. Sanitizers (-fsanitize=memory with Clang) can catch this automatically.

Contest Cheat Sheet

text

+================================================================+
|                    STRESS TEST WORKFLOW                        |
+================================================================+
|  1. Write sol.cpp   (your actual solution)                     |
|  2. Write brute.cpp (dumb O(n^2) or O(n!) that is correct)    |
|  3. Write gen.cpp   (random input, small n, seed from argv[1]) |
|  4. Run stress.sh   (loop: gen -> sol -> brute -> diff)        |
|  5. Found mismatch? Add cerr prints, trace by hand.            |
|  6. Fix bug. Re-run stress. Repeat until 10k tests pass.       |
+----------------------------------------------------------------+
|  COMPILER FLAGS                                                |
+----------------------------------------------------------------+
|  Normal:      g++ -O2 -o sol sol.cpp                           |
|  Debug:       g++ -DLOCAL -g -O0 -o sol sol.cpp               |
|  Sanitizers:  g++ -O2 -fsanitize=address,undefined -o sol ...  |
+----------------------------------------------------------------+
|  QUICK DEBUG CHECKLIST                                         |
+----------------------------------------------------------------+
|  [ ] Integer overflow? Use long long.                          |
|  [ ] Off-by-one? Check loop bounds.                            |
|  [ ] Uninitialized variable? Check all locals.                 |
|  [ ] Wrong data type? int vs long long.                        |
|  [ ] Edge case n=0 or n=1?                                     |
|  [ ] Forgot to read all input?                                 |
|  [ ] Multiple test cases: forgot to reset globals?             |
+================================================================+

Gotchas and Meta-Debugging

Sometimes the stress test itself is the problem. When your loop runs clean but you still can't trust it, work through this list.

Your brute force also has a bug. If both solutions are wrong in the same way, diff will never catch it. Verify your brute on the sample input before running the loop.

Your generator doesn't match the input format. If the problem expects $1$ -indexed nodes and your generator outputs $0$ -indexed, both solutions may parse differently. Always match the exact input format from the problem statement.

You forgot to recompile. You fixed sol.cpp but ran the old binary. Add set -e to your bash script and always compile at the start—the stress.sh template above does this automatically.

Generator range is too narrow. If you generate $n \in [1, 3]$ and values in $[1, 5]$ , you might never produce the pattern that triggers the bug. Widen your ranges after initial testing.

Generator never produces edge cases. A uniform random generator rarely gives you all-same arrays or sorted arrays. Explicitly add them:

cpp

// Force edge cases 10% of the time
int type = ri(1, 10);
if (type == 1) {
    // all same
    int val = ri(-100, 100);
    for (int i = 0; i < n; i++) cout << val << " \n"[i == n - 1];
} else if (type == 2) {
    // sorted
    vector<int> a(n);
    for (int& x : a) x = ri(-100, 100);
    sort(a.begin(), a.end());
    for (int i = 0; i < n; i++) cout << a[i] << " \n"[i == n - 1];
} else {
    // random
    for (int i = 0; i < n; i++) cout << ri(-100, 100) << " \n"[i == n - 1];
}

Windows line endings. If you develop on Windows, diff may flag every line because of \r\n vs \n. Use diff --strip-trailing-cr or convert files first.

Multiple valid outputs. If the problem says "print any valid answer," plain diff won't work. Write a checker instead: a program that reads the input and the output and verifies the output is correct. Replace the diff line in your stress script with:

bash

./checker input.txt out1.txt

Mental Traps

Trap: "It looks right, so it must be right."

Code that looks correct and code that is correct are different things. Your brain reads what you intended to write, not what is actually on the screen. Visual inspection has a dangerously high false-positive rate. Trust your tests, not your eyes.

text

  WRONG                            RIGHT

  Read code --> Looks fine         Read code --> Write test
       ^            |                   ^             |
       |            v                   |             v
       +---- Re-read again             +---- Run test
                                              |       |
                                           PASS    FAIL
                                              |       |
                                           OK      Fix

Trap: "I'll add cerr and figure it out."

Scattering prints without a clear question to answer produces walls of noise, not insight. Before you print anything, ask: what should this value be, and what does the code actually produce? The mismatch is the bug. Print exactly the variable that should reveal the discrepancy, nothing more.

text

  WRONG -- scatter prints          RIGHT -- targeted hypothesis

  +--sol-cpp----------+            +--sol-cpp---------+
  | # print a         |            |                  |
  | # print b         |            |   hypothesis:    |
  | # print c         |            |   x should be 5  |
  | # print d         |            |   # print x  <-- |
  | # print e         |            |                  |
  +-------------------+            +------------------+
  Result: wall of noise            Result: x is 3
  Action: stare at wall            Action: fix x

Trap: "The brute force might be wrong too."

When your stress test finds a discrepancy, resist the urge to doubt the brute force. Write it to be maximally simple—literally enumerate all possibilities and check each one, even if that means $O (2^{n})$ . The less clever the code, the fewer places a bug can hide.

text

  WRONG -- clever brute            RIGHT -- trivial brute

  +--brute-cpp--------+           +--brute-cpp--------+
  | O(n log n)         |           | O(2^n)            |
  | optimized logic    |           | try ALL subsets   |
  | * * * bugs * * *   |           | check EACH one    |
  +--------------------+           +-------------------+
       |                                |
       v                                v
  Risk: BOTH wrong                 Risk: slow but correct
  Stress test finds nothing        Stress test catches real bugs

Self-Test Checklist

Before moving on, verify you can do these without looking back at the notes:

[ ] Write a complete stress test setup (generator + brute force + comparison loop in bash) for a simple array problem in under 10 minutes.
[ ] State three conditions you should assert in every non-trivial solution (e.g., array bounds, DP state validity, non-negative values).
[ ] Describe the "binary search debugging" technique—how to isolate a bug in a 200-line solution by adding prints only at the midpoint first.
[ ] Explain why stress testing with $n \leq 5$ is sometimes insufficient and what input size range is more appropriate.
[ ] Name two bugs that are almost impossible to find by reading code but easy to find by stress testing.
[ ] Write a random tree generator from memory using the random-parent method.
[ ] Explain when you need a custom checker instead of diff and write the stress loop modification.

If you want to pressure-test that understanding, work through these scenarios out loud before you peek at how you'd actually handle them:

Your solution passes the samples but gets WA on test 3—how do you turn that into a minimal failing case you can read?
Your brute force is itself too slow to stress-test against—what is the minimum viable brute force you can still trust?
You suspect an off-by-one but can't spot it—how does a stress test restricted to $n = 1, 2, 3$ surface it fast?

Practice Problems

Problems where stress testing is especially useful—the solution logic is tricky enough that bugs are likely, and brute force is easy to write.

#	Problem	Source	Difficulty	Why stress test helps
1	Maximum Subarray Sum	CSES	CF ~1000	Classic Kadane bug with all-negative arrays
2	Nearest Smaller Values	CSES	CF ~1200	Stack-based, off-by-one errors common
3	Towers	CSES	CF ~1200	Greedy logic easy to get wrong
4	CF 1676D - X-Sum	CF	CF 1000	Diagonal indexing bugs
5	CF 1915D - Unnatural Conditions	CF	CF 800	Constructive; easy to miss cases
6	CF 1873E - Building an Aquarium	CF	CF 1100	Binary search boundary bug
7	Subarray Sums II	CSES	CF ~1300	Prefix sums with negative values
8	CF 1850F - We Were Both Children	CF	CF 1300	Sieve-like logic, overcounting bugs
9	Playlist	CSES	CF ~1200	Two-pointer / sliding window boundary
10	CF 1729D - Friends and the Restaurant	CF	CF 1200	Greedy pairing, sorting edge cases

Drill the setup: Pick any problem from the 1100–1400 practice ladder, solve it, then deliberately introduce a bug and stress-test until you find it.
Generators for structures: Practice writing generators for trees and graphs—you will need them for data structure problems.
Combine with techniques: Many bugs hide in essential technique implementations (binary search boundaries, two-pointer invariants, greedy proofs). Stress testing is how you verify those.
Contest workflow: Read Debugging Under Pressure to integrate stress testing into your timed contest strategy.

Debugging and Stress Testing ​

Contents ​

Intuition ​

Stop Staring at Your Code ​

When to Reach for This ​

The Idea in One Sentence ​

The Three Files You Need ​

Writing Generators That Actually Work ​

Random Arrays ​

Random Permutations ​

Random Trees ​

Random Graphs ​

Random Strings ​

Writing Brute Force: Stupid but Correct ​

The Stress Loop ​

Debug Printing with cerr ​

Assert Your Assumptions ​

The Edge Case Checklist ​

Print, Trace, Repeat ​

What the Code Won't Teach You ​

Visual Intuition ​

Stress Test in Action ​

Debug Output for Tracing ​

C++ STL Reference ​

Implementation (Contest-Ready) ​

Minimal Template—Just Copy and Go ​

Explained Version ​

Generator Reference ​

Common Bug Patterns ​

Pattern 1: WA on a Large Test ​

Pattern 2: Wrong Answer, but Only Sometimes ​

Pattern 3: TLE on Specific Inputs ​

Pattern 4: Runtime Error ​

Pattern 5: Integer Overflow ​

Pattern 6: Off-by-One in Binary Search ​

Pattern 7: Uninitialized Variables ​

Contest Cheat Sheet ​

Gotchas and Meta-Debugging ​

Mental Traps ​

Self-Test Checklist ​

Practice Problems ​

Further Reading ​

The Mistake That Teaches ​

See Also ​

What to Solve Next ​

Debugging and Stress Testing

Contents

Intuition

Stop Staring at Your Code

When to Reach for This

The Idea in One Sentence

The Three Files You Need

Writing Generators That Actually Work

Random Arrays

Random Permutations

Random Trees

Random Graphs

Random Strings

Writing Brute Force: Stupid but Correct

The Stress Loop

Debug Printing with cerr

Assert Your Assumptions

The Edge Case Checklist

Print, Trace, Repeat

What the Code Won't Teach You

Visual Intuition

Stress Test in Action

Debug Output for Tracing

C++ STL Reference

Implementation (Contest-Ready)

Minimal Template—Just Copy and Go

Explained Version

Generator Reference

Common Bug Patterns

Pattern 1: WA on a Large Test

Pattern 2: Wrong Answer, but Only Sometimes

Pattern 3: TLE on Specific Inputs

Pattern 4: Runtime Error

Pattern 5: Integer Overflow

Pattern 6: Off-by-One in Binary Search

Pattern 7: Uninitialized Variables

Contest Cheat Sheet

Gotchas and Meta-Debugging

Mental Traps

Self-Test Checklist

Practice Problems

Further Reading

The Mistake That Teaches

See Also

What to Solve Next