Skip to content

Debugging and Stress Testing

Quick Revisit

  • USE WHEN: WA on a hidden test and staring at code is not finding the bug
  • INVARIANT: If brute-force is correct, any input where sol and brute disagree reveals the bug
  • TIME: O(minutes of your time) | SPACE: 3 files (sol, brute, gen) + stress script
  • CLASSIC BUG: Generator produces only easy cases (e.g., sorted input)—randomize all parameters
  • PRACTICE: ../12-practice-ladders/01-ladder-1100-to-1400.md

Difficulty: Beginner | CF Rating: 1000–1400 | Prerequisites: Basic C++, basic bash/shell


Contents


Intuition

Stop Staring at Your Code

You submitted your solution. WA on test 47. The test has n=200000 elements and you can't eyeball what went wrong. You re-read your code five times. You add a +1 somewhere, resubmit. Still WA.

Staring is the wrong strategy. Your brain wrote the bug—the same brain is now trying to find it, reading what it meant to write rather than what it actually wrote. You need a mechanical process that removes your bias from the equation entirely.

That process is stress testing.

When to Reach for This

Stress testing is the right move whenever you hit one of these walls:

  • WA on hidden test cases—you passed the samples but the judge says no.
  • Passes samples but fails submission—your logic looks correct on the examples, yet something is off.
  • Can't find the bug by reading code—you have re-read your solution three times and see nothing wrong.
  • Edge case uncertainty—you are not sure your code handles n=1, empty input, all-same values, or maximum constraints.
  • Off-by-one suspicion—your loop bounds, binary search, or index arithmetic feel fragile.

If any of these apply, stop staring and start stress testing. The cost is 5–10 minutes of setup; the payoff is a concrete failing input you can actually read.

The Idea in One Sentence

Write a slow-but-obviously-correct brute force, generate thousands of small random inputs, and run both solutions until their outputs disagree. Then you have a tiny failing test you can actually debug by hand.

The Three Files You Need

Every stress test has three programs and one script:

FilePurpose
sol.cppYour real solution—the one that might be buggy
brute.cppA trivially correct O(n2) or O(n!) solution that is definitely right
gen.cppA random test generator that writes to stdout
stress.shA bash loop that wires everything together

The flow looks like this:

text
                    +----------+
                    |  gen.cpp |
                    |  (seed)  |
                    +----+-----+
                         |
                         v
                    +-----------+
                    | input.txt |
                    +-----+-----+
                     /          \
                    v            v
            +----------+   +-----------+
            | sol.cpp  |   | brute.cpp |
            +----+-----+   +-----+-----+
                 |                |
                 v                v
            +----------+   +-----------+
            | out1.txt |   | out2.txt  |
            +----+-----+   +-----+-----+
                  \              /
                   v            v
                  +------+-------+
                  | diff out1    |
                  |      out2    |
                  +--------------+
                        |
               +--------+--------+
               |                 |
               v                 v
           MATCH              MISMATCH!
          (next i)          (print input,
                              stop)

If the diff finds a mismatch, you have a small test case that breaks your solution. Now you can trace through it by hand.

Writing Generators That Actually Work

Never use rand(). Its quality is implementation-defined, its range is tiny on some platforms, and seeding it properly is annoying. Use the Mersenne Twister instead:

cpp
#include <bits/stdc++.h>
using namespace std;

int main(int argc, char* argv[]) {
    mt19937 rng(atoi(argv[1]));  // seed from command line

    // random int in [lo, hi]
    auto randint = [&](int lo, int hi) {
        return uniform_int_distribution<int>(lo, hi)(rng);
    };

    int n = randint(1, 10);  // keep small for stress testing!
    cout << n << "\n";
    for (int i = 0; i < n; i++) {
        cout << randint(-100, 100) << " \n"[i == n - 1];
    }
}

The seed comes from argv[1] so each iteration of the stress loop produces a different test. Keep n small—you want tests you can read when a mismatch surfaces.

Random Arrays

cpp
int n = randint(1, 10);
cout << n << "\n";
for (int i = 0; i < n; i++)
    cout << randint(1, 1000) << " \n"[i == n - 1];

Random Permutations

cpp
int n = randint(1, 8);
vector<int> p(n);
iota(p.begin(), p.end(), 1);
shuffle(p.begin(), p.end(), rng);
cout << n << "\n";
for (int i = 0; i < n; i++)
    cout << p[i] << " \n"[i == n - 1];

Random Trees

Use the random-parent method: node i picks a random parent among 1i1.

cpp
int n = randint(2, 10);
cout << n << "\n";
for (int i = 2; i <= n; i++) {
    int parent = randint(1, i - 1);
    cout << parent << " " << i << "\n";
}

Alternatively, use a Pruefer sequence for a uniformly random labeled tree:

cpp
int n = randint(2, 10);
vector<int> code(n - 2);
for (int& x : code) x = randint(1, n);
// decode Pruefer sequence
vector<int> deg(n + 1, 1);
for (int x : code) deg[x]++;
int leaf = 1;
while (deg[leaf] != 1) leaf++;
int ptr = leaf;
vector<pair<int,int>> edges;
for (int x : code) {
    edges.push_back({ptr, x});
    deg[x]--;
    if (deg[x] == 1 && x < leaf) {
        ptr = x;
    } else {
        leaf++;
        while (leaf <= n && deg[leaf] != 1) leaf++;
        ptr = leaf;
    }
}
edges.push_back({ptr, n});
cout << n << "\n";
for (auto [u, v] : edges) cout << u << " " << v << "\n";

Random Graphs

Be careful to avoid duplicate edges and self-loops:

cpp
int n = randint(2, 6);
int max_edges = n * (n - 1) / 2;
int m = randint(n - 1, min(max_edges, 2 * n));  // connected-ish
set<pair<int,int>> used;
cout << n << " " << m << "\n";
// first make a spanning tree to guarantee connectivity
vector<int> perm(n);
iota(perm.begin(), perm.end(), 1);
shuffle(perm.begin(), perm.end(), rng);
for (int i = 1; i < n; i++) {
    int u = perm[randint(0, i - 1)], v = perm[i];
    if (u > v) swap(u, v);
    used.insert({u, v});
    cout << u << " " << v << "\n";
}
// add random edges until we have m
while ((int)used.size() < m) {
    int u = randint(1, n), v = randint(1, n);
    if (u == v) continue;
    if (u > v) swap(u, v);
    if (used.count({u, v})) continue;
    used.insert({u, v});
    cout << u << " " << v << "\n";
}

Random Strings

cpp
int n = randint(1, 15);
string s(n, ' ');
for (char& c : s) c = 'a' + randint(0, 25);
cout << s << "\n";

For problems on small alphabets, use randint(0, 1) for binary strings or randint(0, 2) for {a, b, c}.

Writing Brute Force: Stupid but Correct

Your brute force does not need to be fast. It needs to be obviously correct—so obvious that even a cursory read would catch any bug. If the problem says "find the minimum sum subarray," try all O(n2) subarrays:

cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n; cin >> n;
    vector<int> a(n);
    for (int& x : a) cin >> x;

    int ans = INT_MAX;
    for (int i = 0; i < n; i++) {
        int sum = 0;
        for (int j = i; j < n; j++) {
            sum += a[j];
            ans = min(ans, sum);
        }
    }
    cout << ans << "\n";
}

If the problem is on graphs and you're unsure about your Dijkstra, write Floyd-Warshall. If it's DP, brute-force with bitmask over all subsets. The goal is zero cleverness—zero room for bugs.

text
  COMPARISON PIPELINE

  seed ---> +-------+
            | gen   |---> input
            +-------+        |
                        +----+----+
                        |         |
                        v         v
                   +--------+ +--------+
                   | brute  | |  sol   |
                   | O(n^2) | | O(nlgn)|
                   +---+----+ +---+----+
                       |          |
                       v          v
                   expected    actual
                       |          |
                       +----+-----+
                            |
                            v
                       +--------+
                       |  diff  |
                       +---+----+
                           |
                     +-----+------+
                     |            |
                   match      MISMATCH
                     |            |
                     v            v
                  next seed    STOP
                               print input

The Stress Loop

Here is the complete stress.sh—copy-paste it into every contest directory:

bash
#!/bin/bash
set -e
g++ -O2 -o sol sol.cpp
g++ -O2 -o brute brute.cpp
g++ -O2 -o gen gen.cpp
echo "Compiled. Running stress test..."
for i in $(seq 1 10000); do
    ./gen $i > input.txt
    ./sol < input.txt > out1.txt
    ./brute < input.txt > out2.txt
    if ! diff -q out1.txt out2.txt > /dev/null; then
        echo "MISMATCH on test $i"
        echo "--- Input ---"
        cat input.txt
        echo "--- Expected (brute) ---"
        cat out2.txt
        echo "--- Got (sol) ---"
        cat out1.txt
        break
    fi
    if (( i % 100 == 0 )); then
        echo "Passed $i tests..."
    fi
done

Run it: chmod +x stress.sh && ./stress.sh

If it passes 10,000 tests, your solution is probably correct (or your generator isn't producing the right edge cases—more on that later).

text
  STRESS TESTING WORKFLOW LOOP

  +-------+     +-------+     +--------+
  | Write |---->| Write |---->| Write  |
  |  sol  |     | brute |     |  gen   |
  +-------+     +-------+     +---+----+
                                  |
              +-------------------+
              |
              v
  +-----------------------+
  | Run stress loop       |<---------+
  | gen -> sol -> brute   |          |
  | -> diff               |          |
  +----------+------------+          |
             |                       |
       +-----+------+                |
       |            |                |
     match      mismatch             |
       |            |                |
       v            v                |
  +---------+  +----------+          |
  | 10k     |  | Add cerr |          |
  | passed  |  | Trace by |          |
  | Submit  |  | hand     |          |
  +---------+  +-----+----+          |
                     |               |
                     v               |
                 +--------+          |
                 | Fix    |----------+
                 | bug    |
                 +--------+

Debug Printing with cerr

When stress testing finds a failing case, you need to figure out where in your code the bug lives. Add cerr prints:

cpp
cerr << "i=" << i << " dp[i]=" << dp[i] << "\n";

cerr goes to stderr, so it won't pollute your stdout (which is what the judge reads). But you don't want these prints running on the judge either—they slow you down. Guard them:

cpp
#ifdef LOCAL
#define dbg(x) cerr << #x << " = " << (x) << "\n"
#else
#define dbg(x)
#endif

Compile locally with g++ -DLOCAL -O2 sol.cpp and dbg(variable) prints on your machine but vanishes on the judge.

Here is a more powerful debug macro that handles multiple arguments:

cpp
#ifdef LOCAL
#define dbg(...) cerr << "[" << #__VA_ARGS__ << "]: ", _dbg(__VA_ARGS__)
void _dbg() { cerr << "\n"; }
template<class T, class... U>
void _dbg(T t, U... u) { cerr << t; if (sizeof...(u)) cerr << ", "; _dbg(u...); }
#else
#define dbg(...)
#endif

Usage: dbg(i, j, dp[i][j]) prints [i, j, dp[i][j]]: 3, 5, 42.

Assert Your Assumptions

If you assume an array index is in bounds, prove it:

cpp
assert(0 <= idx && idx < n);

assert() crashes your program with a message naming the exact line that failed. That is infinitely more useful than a silent wrong answer with no indication of where things went sideways. Use it liberally during debugging—strip it before submitting if you are worried about speed, though assert is nearly free in practice.

The Edge Case Checklist

Before you even stress test, run these by hand or in your generator:

text
+-------------------------------------------------------+
|  EDGE CASES TO ALWAYS CHECK                           |
+-------------------------------------------------------+
|  * n = 0 or n = 1                                     |
|  * All elements the same                              |
|  * Array already sorted / reverse sorted              |
|  * Maximum values (watch for overflow!)               |
|  * Negative numbers (especially with modular arith)   |
|  * Empty graph (m = 0)                                |
|  * Disconnected graph                                 |
|  * Self-loops and duplicate edges                     |
|  * Single-node graph                                  |
|  * n = max constraint (does it TLE or MLE?)           |
|  * Answer is 0 or answer is negative                  |
|  * All queries are the same                           |
|  * Strings of length 1                                |
+-------------------------------------------------------+

A good generator will produce some of these naturally when n is small. If you want guaranteed edge cases, add them as hardcoded tests inside your stress loop before the random phase.

Once you have a failing test—say n=4, array = [3, -1, 2, -5]—the process is:

  1. Run your solution on that input with cerr debug prints.
  2. Run the brute force on the same input.
  3. On paper, trace through your algorithm step by step.
  4. Find the first moment where your program's state diverges from what it should be.

That divergence point is your bug. It is almost always an off-by-one, a wrong comparison operator, or an uninitialized variable.

text
  BUG BISECTION STRATEGY
  Binary search on your code

  +---------- your code ----------+
  | line 1                        |
  |   ***                         |
  | line 50  <--- add print here  |
  |   ***                         |
  | line 100                      |
  +-------------------------------+
              |
        +-----+------+
        |            |
   correct at    wrong at
   line 50       line 50
        |            |
        v            v
  +-----------+ +-----------+
  | Search    | | Search    |
  | lines     | | lines     |
  | 51--100   | | 1--49     |
  +-----------+ +-----------+
        |            |
        v            v
  check line 75  check line 25
        |            |
        v            v
     narrow        narrow
     further       further
        |            |
        +-----+------+
              |
              v
       Exact bug line

What the Code Won't Teach You

Debugging is hypothesis testing. Every bug investigation follows the same loop: form a hypothesis about what is wrong, design a test that would falsify it, run the test, update your hypothesis. Simple on paper, genuinely difficult in practice.

Most beginners skip the hypothesis entirely and go straight to changing code—debugging by random mutation. That approach occasionally works, but only by accident, and it leaves you no closer to understanding what went wrong.

text
  Hypothesis-driven debugging loop

  +----------+     +----------+     +---------+
  | Observe  |---->| Form     |---->| Design  |
  | failure  |     | hypothe- |     | test    |
  +----------+     | sis      |     +----+----+
       ^           +----------+          |
       |                                 v
       |           +----------+     +---------+
       +-----------| Update   |<----| Run     |
                   | belief   |     | test    |
                   +----------+     +---------+

The bug is usually in the part you were most confident about. Humans are systematically overconfident about code they wrote recently or understand best. The bug hides in "trivial" pieces—I/O parsing, an index calculation, a base case—not in the clever algorithm you have been studying. When debugging, verify the boring parts first.

Stress testing is a force multiplier, not a last resort. Do not wait until you have spent 30 minutes staring at code to reach for it. If you have any doubt about correctness—especially on greedy or DP problems—write the brute force first and stress test before your first submission. Finding a bug before submission costs minutes; finding it after can cost an hour or a contest. Once you can trace a stress test by hand, watching the loop run gives you even stronger intuition for where bugs actually hide.


Visual Intuition

Stress Test in Action

Suppose your problem is: "Given an array, find the maximum subarray sum."

Your sol.cpp uses Kadane's algorithm, but you wrote max(0, ...) instead of max(a[i], ...)—a classic bug when all elements are negative.

text
gen output (seed=17):     sol output:     brute output:

  4                         0               -1
  -3 -1 -5 -2

  diff says: MISMATCH!

Your solution prints 0, but the right answer is -1 (the subarray [-1]). The code resets to 0 instead of taking the element itself. Bug found in 30 seconds, not 30 minutes.

Debug Output for Tracing

With dbg() prints inside Kadane's:

text
Input: 4 elements: -3 -1 -5 -2

[i, a[i], cur, best]: 0, -3, 0, 0      <-- cur should be -3, not 0!
[i, a[i], cur, best]: 1, -1, 0, 0      <-- bug propagates
[i, a[i], cur, best]: 2, -5, 0, 0
[i, a[i], cur, best]: 3, -2, 0, 0

Output: 0   (wrong -- should be -1)

Line i=0 reveals the bug: cur should be 3 but is 0 because of the faulty max(0, cur + a[i]).


C++ STL Reference

Function / ClassHeaderWhat it doesTime
mt19937 rng(seed)<random>Mersenne Twister PRNG, 32-bitO(1) per call
mt19937_64 rng(seed)<random>64-bit Mersenne TwisterO(1) per call
uniform_int_distribution<int>(a,b)<random>Random int in [a,b]O(1)
shuffle(begin, end, rng)<algorithm>Random permutation of rangeO(n)
assert(expr)<cassert>Abort with message if expr is falseO(1)
cerr << x<iostream>Print to stderr (judge ignores)O(1)
iota(begin, end, val)<numeric>Fill range with val, val+1, ...O(n)
atoi(s)<cstdlib>Convert C-string to intO(len)
chrono::steady_clock<chrono>Seed from clock for non-reproducibleO(1)

For seeding with time when you don't need reproducibility:

cpp
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());

But for stress testing, use argv[1]—you need reproducible seeds to replay failing cases.


Implementation (Contest-Ready)

Minimal Template—Just Copy and Go

gen.cpp (random array):

cpp
#include <bits/stdc++.h>
using namespace std;
int main(int argc, char* argv[]) {
    mt19937 rng(atoi(argv[1]));
    auto ri = [&](int a, int b) { return uniform_int_distribution<int>(a, b)(rng); };
    int n = ri(1, 10);
    cout << n << "\n";
    for (int i = 0; i < n; i++) cout << ri(-100, 100) << " \n"[i == n - 1];
}

brute.cpp (max subarray sum, O(n2) — correct by construction):

cpp
#include <bits/stdc++.h>
using namespace std;
int main() {
    int n; cin >> n;
    vector<int> a(n);
    for (int& x : a) cin >> x;
    int ans = a[0];
    for (int i = 0; i < n; i++) {
        int s = 0;
        for (int j = i; j < n; j++) { s += a[j]; ans = max(ans, s); }
    }
    cout << ans << "\n";
}

stress.sh:

bash
#!/bin/bash
set -e
g++ -O2 -o sol sol.cpp
g++ -O2 -o brute brute.cpp
g++ -O2 -o gen gen.cpp
for i in $(seq 1 10000); do
    ./gen $i > input.txt
    ./sol < input.txt > out1.txt
    ./brute < input.txt > out2.txt
    if ! diff -q out1.txt out2.txt > /dev/null; then
        echo "MISMATCH on test $i"
        cat input.txt
        echo "---"
        echo "Expected:"; cat out2.txt
        echo "Got:"; cat out1.txt
        break
    fi
done
echo "Done."

Debug macro (paste at the top of sol.cpp):

cpp
#ifdef LOCAL
#define dbg(...) cerr << "[" << #__VA_ARGS__ << "]: ", _dbg(__VA_ARGS__)
void _dbg() { cerr << "\n"; }
template<class T, class... U>
void _dbg(T t, U... u) { cerr << t; if (sizeof...(u)) cerr << ", "; _dbg(u...); }
#else
#define dbg(...)
#endif

Explained Version

gen.cpp (random tree, with comments):

cpp
#include <bits/stdc++.h>
using namespace std;

int main(int argc, char* argv[]) {
    // Seed from command line so each stress iteration is different
    mt19937 rng(atoi(argv[1]));

    // Helper: random integer in [a, b] inclusive
    auto ri = [&](int a, int b) {
        return uniform_int_distribution<int>(a, b)(rng);
    };

    // Keep n small so failing cases are human-readable
    int n = ri(2, 10);
    cout << n << "\n";

    // Random tree: node i picks a parent from [1, i-1]
    // This always produces a valid tree (connected, n-1 edges, no cycles)
    for (int i = 2; i <= n; i++) {
        int parent = ri(1, i - 1);
        // Output edges in either order; shuffle if problem expects it
        cout << parent << " " << i << "\n";
    }
}

gen.cpp (random graph, with comments):

cpp
#include <bits/stdc++.h>
using namespace std;

int main(int argc, char* argv[]) {
    mt19937 rng(atoi(argv[1]));
    auto ri = [&](int a, int b) {
        return uniform_int_distribution<int>(a, b)(rng);
    };

    int n = ri(2, 6);
    int max_m = n * (n - 1) / 2;
    int m = ri(n - 1, min(max_m, 2 * n));

    cout << n << " " << m << "\n";

    // Build a random spanning tree first (guarantees connectivity)
    vector<int> ord(n);
    iota(ord.begin(), ord.end(), 1);
    shuffle(ord.begin(), ord.end(), rng);

    set<pair<int,int>> edges;
    for (int i = 1; i < n; i++) {
        int u = ord[ri(0, i - 1)], v = ord[i];
        if (u > v) swap(u, v);
        edges.insert({u, v});
        cout << u << " " << v << "\n";
    }

    // Fill remaining edges randomly, avoiding duplicates and self-loops
    while ((int)edges.size() < m) {
        int u = ri(1, n), v = ri(1, n);
        if (u == v) continue;
        if (u > v) swap(u, v);
        if (edges.count({u, v})) continue;
        edges.insert({u, v});
        cout << u << " " << v << "\n";
    }
}

Generator Reference

The table below covers every common input shape you will encounter and how to generate it:

Generator TypeWhat it producesKey ParametersNotes
Random arrayArray of n integersn, value range [lo,hi]Most common; keep n10 for stress
Random permutationPermutation of 1nnUse shuffle with mt19937
Random treeTree on n nodesnRandom parent or Pruefer sequence
Random graphSimple undirected graphn, mSpan tree + random edges; avoid duplicates
Random stringString of length nn, alphabet size'a' + randint(0, k-1)
Random queriesq queries on a structureq, query type, rangesMatch format of problem input
Weighted edgesGraph with edge weightsn, m, weight rangeAdd randint(1, W) per edge
Random DAGDirected acyclic graphn, mEdges only from lower to higher node index
Pair of arraysTwo arrays (e.g., for matching)n, value rangesTwo loops, same n

Common Bug Patterns

Pattern 1: WA on a Large Test

You passed samples but fail on test 47 with n=105. You can't read the input. Stress test with small n—the bug almost always reproduces on n10.

Pattern 2: Wrong Answer, but Only Sometimes

Your solution works on most tests but not all. Increase the stress loop count to 50,000+ and broaden your generator to cover more shapes (sorted, reverse-sorted, all-same).

Pattern 3: TLE on Specific Inputs

Write a worst-case generator instead of a random one. If you suspect your algorithm degrades on sorted input, generate sorted arrays. If you suspect quadratic behavior from a hash map, generate keys with collisions.

Pattern 4: Runtime Error

Compile with sanitizer flags:

bash
g++ -O2 -fsanitize=address,undefined -o sol sol.cpp

-fsanitize=address catches out-of-bounds access, use-after-free, and similar memory errors. -fsanitize=undefined catches signed overflow, shift errors, and null dereference. Both give you the exact line and a stack trace—far more useful than a bare crash.

Pattern 5: Integer Overflow

Your intermediate calculation exceeds 2311. Symptoms: works on small tests, fails silently on large ones. Quick check:

cpp
assert(1LL * a * b <= 2e18);  // verify product fits in long long

Or stress test with values near 109 and watch for results that go negative.

Your binary search answer is off by 1. Stress test against a brute-force linear scan. The mismatch surfaces quickly and the tiny input makes the bug easy to trace.

Pattern 7: Uninitialized Variables

Globals are zero-initialized in C++, but locals are not. If you reuse a local variable across iterations without resetting it, the bug is intermittent and maddening. Sanitizers (-fsanitize=memory with Clang) can catch this automatically.


Contest Cheat Sheet

text
+================================================================+
|                    STRESS TEST WORKFLOW                        |
+================================================================+
|  1. Write sol.cpp   (your actual solution)                     |
|  2. Write brute.cpp (dumb O(n^2) or O(n!) that is correct)    |
|  3. Write gen.cpp   (random input, small n, seed from argv[1]) |
|  4. Run stress.sh   (loop: gen -> sol -> brute -> diff)        |
|  5. Found mismatch? Add cerr prints, trace by hand.            |
|  6. Fix bug. Re-run stress. Repeat until 10k tests pass.       |
+----------------------------------------------------------------+
|  COMPILER FLAGS                                                |
+----------------------------------------------------------------+
|  Normal:      g++ -O2 -o sol sol.cpp                           |
|  Debug:       g++ -DLOCAL -g -O0 -o sol sol.cpp               |
|  Sanitizers:  g++ -O2 -fsanitize=address,undefined -o sol ...  |
+----------------------------------------------------------------+
|  QUICK DEBUG CHECKLIST                                         |
+----------------------------------------------------------------+
|  [ ] Integer overflow? Use long long.                          |
|  [ ] Off-by-one? Check loop bounds.                            |
|  [ ] Uninitialized variable? Check all locals.                 |
|  [ ] Wrong data type? int vs long long.                        |
|  [ ] Edge case n=0 or n=1?                                     |
|  [ ] Forgot to read all input?                                 |
|  [ ] Multiple test cases: forgot to reset globals?             |
+================================================================+

Gotchas and Meta-Debugging

Sometimes the stress test itself is the problem. When your loop runs clean but you still can't trust it, work through this list.

Your brute force also has a bug. If both solutions are wrong in the same way, diff will never catch it. Verify your brute on the sample input before running the loop.

Your generator doesn't match the input format. If the problem expects 1-indexed nodes and your generator outputs 0-indexed, both solutions may parse differently. Always match the exact input format from the problem statement.

You forgot to recompile. You fixed sol.cpp but ran the old binary. Add set -e to your bash script and always compile at the start—the stress.sh template above does this automatically.

Generator range is too narrow. If you generate n[1,3] and values in [1,5], you might never produce the pattern that triggers the bug. Widen your ranges after initial testing.

Generator never produces edge cases. A uniform random generator rarely gives you all-same arrays or sorted arrays. Explicitly add them:

cpp
// Force edge cases 10% of the time
int type = ri(1, 10);
if (type == 1) {
    // all same
    int val = ri(-100, 100);
    for (int i = 0; i < n; i++) cout << val << " \n"[i == n - 1];
} else if (type == 2) {
    // sorted
    vector<int> a(n);
    for (int& x : a) x = ri(-100, 100);
    sort(a.begin(), a.end());
    for (int i = 0; i < n; i++) cout << a[i] << " \n"[i == n - 1];
} else {
    // random
    for (int i = 0; i < n; i++) cout << ri(-100, 100) << " \n"[i == n - 1];
}

Windows line endings. If you develop on Windows, diff may flag every line because of \r\n vs \n. Use diff --strip-trailing-cr or convert files first.

Multiple valid outputs. If the problem says "print any valid answer," plain diff won't work. Write a checker instead: a program that reads the input and the output and verifies the output is correct. Replace the diff line in your stress script with:

bash
./checker input.txt out1.txt

Mental Traps

Trap: "It looks right, so it must be right."

Code that looks correct and code that is correct are different things. Your brain reads what you intended to write, not what is actually on the screen. Visual inspection has a dangerously high false-positive rate. Trust your tests, not your eyes.

text
  WRONG                            RIGHT

  Read code --> Looks fine         Read code --> Write test
       ^            |                   ^             |
       |            v                   |             v
       +---- Re-read again             +---- Run test
                                              |       |
                                           PASS    FAIL
                                              |       |
                                           OK      Fix

Trap: "I'll add cerr and figure it out."

Scattering prints without a clear question to answer produces walls of noise, not insight. Before you print anything, ask: what should this value be, and what does the code actually produce? The mismatch is the bug. Print exactly the variable that should reveal the discrepancy, nothing more.

text
  WRONG -- scatter prints          RIGHT -- targeted hypothesis

  +--sol-cpp----------+            +--sol-cpp---------+
  | # print a         |            |                  |
  | # print b         |            |   hypothesis:    |
  | # print c         |            |   x should be 5  |
  | # print d         |            |   # print x  <-- |
  | # print e         |            |                  |
  +-------------------+            +------------------+
  Result: wall of noise            Result: x is 3
  Action: stare at wall            Action: fix x

Trap: "The brute force might be wrong too."

When your stress test finds a discrepancy, resist the urge to doubt the brute force. Write it to be maximally simple—literally enumerate all possibilities and check each one, even if that means O(2n). The less clever the code, the fewer places a bug can hide.

text
  WRONG -- clever brute            RIGHT -- trivial brute

  +--brute-cpp--------+           +--brute-cpp--------+
  | O(n log n)         |           | O(2^n)            |
  | optimized logic    |           | try ALL subsets   |
  | * * * bugs * * *   |           | check EACH one    |
  +--------------------+           +-------------------+
       |                                |
       v                                v
  Risk: BOTH wrong                 Risk: slow but correct
  Stress test finds nothing        Stress test catches real bugs

Self-Test Checklist

Before moving on, verify you can do these without looking back at the notes:

  • [ ] Write a complete stress test setup (generator + brute force + comparison loop in bash) for a simple array problem in under 10 minutes.
  • [ ] State three conditions you should assert in every non-trivial solution (e.g., array bounds, DP state validity, non-negative values).
  • [ ] Describe the "binary search debugging" technique—how to isolate a bug in a 200-line solution by adding prints only at the midpoint first.
  • [ ] Explain why stress testing with n5 is sometimes insufficient and what input size range is more appropriate.
  • [ ] Name two bugs that are almost impossible to find by reading code but easy to find by stress testing.
  • [ ] Write a random tree generator from memory using the random-parent method.
  • [ ] Explain when you need a custom checker instead of diff and write the stress loop modification.

If you want to pressure-test that understanding, work through these scenarios out loud before you peek at how you'd actually handle them:

  1. Your solution passes the samples but gets WA on test 3—how do you turn that into a minimal failing case you can read?
  2. Your brute force is itself too slow to stress-test against—what is the minimum viable brute force you can still trust?
  3. You suspect an off-by-one but can't spot it—how does a stress test restricted to n=1,2,3 surface it fast?

Practice Problems

Problems where stress testing is especially useful—the solution logic is tricky enough that bugs are likely, and brute force is easy to write.

#ProblemSourceDifficultyWhy stress test helps
1Maximum Subarray SumCSESCF ~1000Classic Kadane bug with all-negative arrays
2Nearest Smaller ValuesCSESCF ~1200Stack-based, off-by-one errors common
3TowersCSESCF ~1200Greedy logic easy to get wrong
4CF 1676D - X-SumCFCF 1000Diagonal indexing bugs
5CF 1915D - Unnatural ConditionsCFCF 800Constructive; easy to miss cases
6CF 1873E - Building an AquariumCFCF 1100Binary search boundary bug
7Subarray Sums IICSESCF ~1300Prefix sums with negative values
8CF 1850F - We Were Both ChildrenCFCF 1300Sieve-like logic, overcounting bugs
9PlaylistCSESCF ~1200Two-pointer / sliding window boundary
10CF 1729D - Friends and the RestaurantCFCF 1200Greedy pairing, sorting edge cases

Further Reading


The Mistake That Teaches

Round 814, problem C. You submit confidently—WA on test 7. You stare at the code for 15 minutes, finding nothing.

Finally you write a stress test: random n10, compare with brute force. After 200 tests, the case n=3 with values [1, 1, 2] fails. The bug? You wrote i < n instead of i <= n in a loop. Thirty seconds to fix; fifteen minutes wasted staring. The stress test found it in 0.3 seconds.

The bug is never where you think it is. Stress testing removes human bias by letting random inputs find the edge case your eyes missed. The one line worth memorizing: don't debug with your eyes—debug with random inputs.

See Also


What to Solve Next

The mechanical process is yours. Now build the muscle memory to reach for it instinctively:

  • Drill the setup: Pick any problem from the 1100–1400 practice ladder, solve it, then deliberately introduce a bug and stress-test until you find it.
  • Generators for structures: Practice writing generators for trees and graphs—you will need them for data structure problems.
  • Combine with techniques: Many bugs hide in essential technique implementations (binary search boundaries, two-pointer invariants, greedy proofs). Stress testing is how you verify those.
  • Contest workflow: Read Debugging Under Pressure to integrate stress testing into your timed contest strategy.

Built for the climb from Codeforces 1100 to 2100.