Appearance
Sorting and Searching
Quick Revisit
- USE WHEN: Problem involves finding pairs, counting duplicates, or answering repeated queries on an array
- INVARIANT: After sorting, equal elements are adjacent and binary search works in O(log n)
- TIME: O(n log n) sort + O(log n) per query | SPACE: O(n) or O(1) if in-place
- CLASSIC BUG: Using <= in comparator instead of < (violates strict weak ordering, causes UB)
- PRACTICE: ../12-practice-ladders/01-ladder-1100-to-1400.md
Sort first, ask questions later.
Difficulty: Beginner | Prerequisites: Arrays and Strings | CF Rating Relevance: 800–1200 | Contest Frequency: Common—fundamental to nearly every contest
Contents
- Intuition
- How Sorting Works Under the Hood
- Counting Sort and Radix Sort
- C++ STL Reference
- Implementation (Contest-Ready)
- Operations Reference
- Problem Patterns & Tricks
- When NOT to Sort
- Before You Code—Sorting Checklist
- Contest Cheat Sheet
- Common Mistakes & Debugging
- Self-Test
- Practice Problems
- Further Reading
Intuition
You have this array of 8 elements and need to answer repeated "find" queries:
text
a = | 5 | 2 | 8 | 2 | 9 | 1 | 2 | 7 |
0 1 2 3 4 5 6 7Query 1: "Is 7 in the array?" You scan left to right: check index 0 (5, no), 1 (2, no), 2 (8, no), 3 (2, no), 4 (9, no), 5 (1, no), 6 (2, no), 7 (7, yes). That is 7 comparisons.
Query 2: "Is 3 in the array?" Scan all 8 elements, never find it. 8 comparisons.
Query 3: "Is 9 in the array?" Check 0..4—found at index 4. 5 comparisons.
Three queries, roughly
We are scanning the entire array from scratch every single time. There has to be a better way.
Sorting pays a one-time
Think of a messy desk buried in unsorted papers. Every time someone asks "do you have the invoice for April 3rd?" you rummage through the entire pile—that is linear search. Now imagine spending twenty minutes organizing everything by date into a sorted file cabinet. That upfront effort is real, but afterward every lookup is near-instant: flip to the right section, narrow down, done. Sorting is building that file cabinet. Binary search is the fast lookup it enables.
The analogy maps well: the papers are array elements, the date ordering is the sort key, and "flip to the middle, decide which half" is exactly binary search. One place it breaks down: a file cabinet lets you slip new papers into the right spot easily, but a sorted array does not—insertion costs set or multiset instead.
Visual Walkthrough
Same array. First, pay the upfront cost—sort it:
text
Before: | 5 | 2 | 8 | 2 | 9 | 1 | 2 | 7 |
After: | 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
0 1 2 3 4 5 6 7
Cost of sorting: O(n log n) = roughly 8 * 3 = 24 comparisonsNow answer the same three queries with binary search:
Query 1: "Is 7 in the array?"
text
Step 1:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^ ^ ^
lo mid hi
lo=0, hi=7, mid=3. a[3]=2 < 7 --> lo = 4
Step 2:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^ ^ ^
lo mid hi
lo=4, hi=7, mid=5. a[5]=7 == 7 --> FOUND. 2 comparisons.Query 2: "Is 3 in the array?"
text
Step 1:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^ ^ ^
lo mid hi
lo=0, hi=7, mid=3. a[3]=2 < 3 --> lo = 4
Step 2:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^ ^ ^
lo mid hi
lo=4, hi=7, mid=5. a[5]=7 > 3 --> hi = 4
Step 3:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^
lo,hi
lo=4, hi=4, mid=4. a[4]=5 > 3 --> hi = 3
lo > hi --> NOT FOUND. 3 comparisons.Query 3: "Is 9 in the array?"
text
Step 1:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^ ^ ^
lo mid hi
lo=0, hi=7, mid=3. a[3]=2 < 9 --> lo = 4
Step 2:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^ ^ ^
lo mid hi
lo=4, hi=7, mid=5. a[5]=7 < 9 --> lo = 6
Step 3:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^ ^
lo hi
^
mid
lo=6, hi=7, mid=6. a[6]=8 < 9 --> lo = 7
Step 4:
| 1 | 2 | 2 | 2 | 5 | 7 | 8 | 9 |
^
lo,hi
lo=7, hi=7, mid=7. a[7]=9 == 9 --> FOUND. 4 comparisons.Comparison tally:
text
+--------------------------------------------------+
| Unsorted (linear) Sorted (binary) |
| Query 1: 7 ops 2 ops |
| Query 2: 8 ops 3 ops |
| Query 3: 5 ops 4 ops |
| ------------------------------------------ |
| Queries: 20 ops 9 ops |
| Sort cost: 0 ops ~24 ops |
| TOTAL: 20 ops ~33 ops |
+--------------------------------------------------+With just 3 queries, the upfront sort nearly breaks even. At
The Invariant
text
+------------------------------------------------------------------+
| INVARIANT: After sorting, a[i] <= a[i+1] for all valid i. |
| This monotone ordering guarantees that if a[mid] < target, then |
| every element in a[0..mid] is also < target, so the entire left |
| half can be eliminated. Each comparison discards half the search |
| space, yielding O(log n) per query. |
+------------------------------------------------------------------+Why is the invariant maintained? std::sort establishes it, and binary search is read-only—it never modifies the array. As long as you do not insert, remove, or overwrite elements between the sort and the searches, the invariant holds.
Connect back to the walkthrough: look at Query 1, Step 1. We check a[3] = 2 < 7. Because the array is sorted, we know a[0], a[1], a[2] are all a[3] = 2 tells us nothing about any other position—we would still have to check them one by one.
What the Code Won't Teach You
Sorting is an enabler, not a goal. You rarely sort because the problem says "sort this array." You sort because it unlocks binary search, two pointers, greedy, sweep line, or deduplication. The real skill is recognizing that a problem becomes tractable after sorting—and then choosing what to sort by.
Custom comparator design is where the bugs hide. The call to std::sort is one line; the comparator is where you encode problem-specific logic. Get it wrong and you trigger undefined behavior—not just a wrong answer. Train yourself to always verify strict weak ordering: never return true for equal elements.
text
WHAT SORTING UNLOCKS -- THE ENABLING MAP
=========================================
sort(a)
|
+---> Binary search O(log n) lookups
| lower_bound / upper_bound
|
+---> Two pointers O(n) pair finding
| converge from both ends
|
+---> Greedy process "best first"
| interval scheduling, knapsack
|
+---> Sweep line sorted events
| overlap counting, union of intervals
|
+---> Deduplication sort + unique
| coordinate compression
|
'---> Adjacent scan O(n) closest pair
min difference must be neighborsnth_element is the secret weapon nobody uses. When a problem asks for the std::nth_element does it in
When to Reach for This
Trigger phrases—if you see these in a problem statement, think sort + search:
- "Find whether element X exists in the array" or "how many elements satisfy..."—sort, then
lower_bound/upper_boundforlookups. - "Sort by a custom key" or "minimize/maximize by processing in some order"—custom comparator sort as a preprocessing step for greedy.
- "Coordinate compression" or "values up to
but only distinct values"—sort + unique to compress coordinates. - "Count pairs with sum/difference equal to K"—sort enables two-pointer or binary search pairing.
- "Answer
offline queries about membership or rank"—sort the array once, answer all queries in .
Counter-examples—looks like sort + search but is not:
- "Find the K-th smallest element (single query, no repeated lookups)." You do not need a full sort.
nth_elementsolves this inaverage—cheaper than . See the STL reference below for nth_element. - "Elements are inserted and deleted between queries." A static sorted array cannot handle dynamic updates efficiently (
per insert). Use a set,multiset, or a balanced BST structure instead. See:08-set-and-multiset.md. - "The array is already given in sorted order." Do not re-sort—skip straight to binary search. This sounds obvious, but in contest stress it is easy to add a redundant
sortcall that wastes time and—worse—might destroy a secondary ordering you need.
How Sorting Works Under the Hood
You call sort() and it just works—but knowing what it does helps you pick the right tool and predict performance.
Merge Sort
Idea: Divide the array in half, recursively sort each half, then merge the two sorted halves into one.
text
Input: [5, 2, 8, 1, 9, 3]
Split: [5, 2, 8] [1, 9, 3]
Split: [5] [2, 8] [1] [9, 3]
Split: [5] [2] [8] [1] [9] [3]
Merge: [5] [2, 8] -> [2, 5, 8] [1] [3, 9] -> [1, 3, 9]
Merge: [2, 5, 8] + [1, 3, 9]
compare 2 vs 1 -> take 1
compare 2 vs 3 -> take 2
compare 5 vs 3 -> take 3
compare 5 vs 9 -> take 5
take 8, take 9
Result: [1, 2, 3, 5, 8, 9]Complexity:
In C++, stable_sort() uses merge sort (or a variant). Use it when stability matters—for example, sorting by one key while preserving the order from a previous sort on another key.
Quicksort
Idea: Pick a pivot element, partition the array into elements
- Average case:
—when the pivot splits the array roughly in half each time. - Worst case:
—when the pivot is always the smallest or largest element (e.g., already-sorted input with a naive first-element pivot).
Why std::sort is safe: The C++ standard library does not use plain quicksort. It uses introsort—quicksort that monitors recursion depth and switches to heapsort if it gets too deep. This guarantees
So when you call sort(), you get the best of three worlds: quicksort's speed, heapsort's worst-case guarantee, and insertion sort's efficiency on small ranges.
Merge sort was invented by John von Neumann in 1945 for the EDVAC computer. Quicksort followed from Tony Hoare in 1959. The
When to Use What
| Need | Use | Why |
|---|---|---|
| General sorting | sort()—introsort | |
| Stable sort (preserve order of equals) | stable_sort()—merge sort | |
| Partial sort (top | partial_sort() | [first, mid) |
nth_element() | ||
| Already nearly sorted | sort() still fine | Introsort handles it; stable_sort is also fast here |
| Small integer range | Counting sort (see below) |
Counting Sort and Radix Sort
Comparison-based sorts have a theoretical lower bound of
Counting Sort
When it applies: All values are integers in range
Idea: Count how many times each value appears, then reconstruct the sorted array from the counts.
Complexity:
Complexity Proof: Counting sort performs exactly three passes:
- One pass over
elements to count occurrences— . - One pass over the
buckets to accumulate prefix sums— . - One pass over
elements to place each into its sorted position— .
Total:
Simple version (in-place, not stable—fine for most CP problems):
cpp
void counting_sort(vector<int>& a, int max_val) {
vector<int> cnt(max_val + 1, 0);
for (int x : a) cnt[x]++; // Invariant: cnt[v] = number of occurrences of v seen so far
int idx = 0;
for (int v = 0; v <= max_val; v++) // Invariant: idx = total elements placed so far
while (cnt[v]--) a[idx++] = v;
}Stable version (preserves relative order of equal elements—needed when elements carry extra data):
cpp
// Stable counting sort: elements with the same key appear in their original order
vector<int> counting_sort_stable(const vector<int>& a, int max_val) {
int n = a.size();
vector<int> cnt(max_val + 1, 0);
for (int x : a) cnt[x]++; // Invariant: cnt[v] = occurrences of v in a
// Convert counts to starting positions
for (int v = 1; v <= max_val; v++) // Invariant: cnt[v] = first output index for value v
cnt[v] += cnt[v - 1];
vector<int> out(n);
for (int i = n - 1; i >= 0; i--) { // Invariant: elements a[i+1..n-1] are already placed
out[--cnt[a[i]]] = a[i];
}
return out;
}When you will see it in CP:
- Sorting by frequency—count occurrences, then iterate values in order.
- Bucket-style problems where values are bounded (e.g., grades 0–100, characters 'a'–'z').
- Coordinate compression when the value range is already small.
- Building frequency arrays for string problems (character histograms).
Important: If sort().
Radix Sort
Idea: Sort integers by processing one digit at a time, from least significant to most significant. Each digit-level sort uses counting sort as a subroutine.
Complexity:
Complexity Proof: Each of the
Working C++ implementation (base-256 LSD radix sort for non-negative integers):
cpp
void radix_sort(vector<int>& a) {
int n = a.size();
vector<int> buf(n);
for (int shift = 0; shift < 32; shift += 8) { // Invariant: elements are sorted by bits [0..shift-1]
int cnt[256] = {};
for (int i = 0; i < n; i++) // Invariant: cnt[d] = count of elements with digit d at current byte
cnt[(a[i] >> shift) & 0xFF]++;
for (int i = 1; i < 256; i++) // Invariant: cnt[i] = starting output position for digit i
cnt[i] += cnt[i - 1];
for (int i = n - 1; i >= 0; i--) // Invariant: elements a[i+1..n-1] already placed (stable)
buf[--cnt[(a[i] >> shift) & 0xFF]] = a[i];
swap(a, buf);
}
}When to use each:
- Counting sort: When all values are in a small range
where . Perfect for character frequencies, grades, small bounded values. - Radix sort: When you have many large integers (e.g.,
values up to ) and need to beat std::sort's constant factor. Rarely needed in CP, but essential for extreme time limits. std::sort: Default choice. Only switch to counting/radix when profiling shows you need the constant-factor speedup or when values are naturally bounded.
Radix sort is rarely coded from scratch in competitive programming—sort() is almost always fast enough, and radix sort's constant-factor advantage only materializes for very large
C++ STL Reference
The full toolkit for sorting and searching in the C++ standard library. You will use sort and lower_bound constantly; the rest are situational but worth knowing.
| Function / Class | Header | What it does | Time Complexity |
|---|---|---|---|
sort(first, last) | <algorithm> | IntroSort. Sorts range in ascending order. | |
sort(first, last, cmp) | <algorithm> | Sort with custom comparator cmp. | |
stable_sort(first, last) | <algorithm> | Merge sort. Preserves relative order of equal elements. | |
stable_sort(first, last, cmp) | <algorithm> | Stable sort with custom comparator. | Same as above |
partial_sort(first, mid, last) | <algorithm> | Sorts only [first, mid)—the smallest mid - first elements. | mid - first |
nth_element(first, nth, last) | <algorithm> | Rearranges so element at nth is what would be there if sorted. Elements before nth are | |
lower_bound(first, last, val) | <algorithm> | Iterator to first element val. Range must be sorted. | |
upper_bound(first, last, val) | <algorithm> | Iterator to first element val. Range must be sorted. | |
equal_range(first, last, val) | <algorithm> | Returns {lower_bound, upper_bound} as a pair. | |
binary_search(first, last, val) | <algorithm> | Returns true if val exists in sorted range. | |
find(first, last, val) | <algorithm> | Linear search. Returns iterator to first match or last. | |
count(first, last, val) | <algorithm> | Counts occurrences of val in range. | |
ranges::sort(r, cmp, proj) | <algorithm> | C++20 ranges sort with optional projection. | |
is_sorted(first, last) | <algorithm> | Checks if range is sorted. | |
merge(f1, l1, f2, l2, out) | <algorithm> | Merges two sorted ranges into out. |
Lambda comparator examples:
cpp
// Sort descending
sort(a.begin(), a.end(), [](int x, int y){ return x > y; });
// Sort pairs by second element, break ties by first
sort(v.begin(), v.end(), [](const auto& a, const auto& b){
return a.second != b.second ? a.second < b.second : a.first < b.first;
});
// C++20 ranges with projection (sort structs by field)
ranges::sort(events, {}, &Event::end_time); // sort by end_time ascendingImplementation (Contest-Ready)
Minimal Contest Template
cpp
#include <bits/stdc++.h>
using namespace std;
int main(){
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n; cin >> n;
vector<int> a(n);
for(auto& x : a) cin >> x;
sort(a.begin(), a.end());
int q; cin >> q;
while(q--){
int x; cin >> x;
// Invariant: a is sorted, so lower_bound/upper_bound are valid
// count of elements equal to x
auto lo = lower_bound(a.begin(), a.end(), x);
auto hi = upper_bound(a.begin(), a.end(), x);
int cnt = hi - lo;
// index of first element >= x (-1 if none)
int pos = (lo != a.end()) ? (int)(lo - a.begin()) : -1;
cout << cnt << " " << pos << "\n";
}
}Sort + Binary Search with Custom Comparator (Explained)
cpp
#include <bits/stdc++.h>
using namespace std;
int main(){
ios::sync_with_stdio(false);
cin.tie(nullptr);
int n; cin >> n;
// Store (value, original_index) so we can recover positions after sorting
vector<pair<int,int>> a(n);
for(int i = 0; i < n; i++){
cin >> a[i].first;
a[i].second = i; // Invariant: a[i].second = original index of element i
}
// Sort by value ascending; ties broken by original index
// The comparator must satisfy strict weak ordering:
// cmp(x, x) must be false
// if cmp(x, y) then !cmp(y, x)
sort(a.begin(), a.end(), [](const pair<int,int>& x, const pair<int,int>& y){
if(x.first != y.first) return x.first < y.first;
return x.second < y.second;
});
int q; cin >> q;
while(q--){
int x; cin >> x;
// lower_bound: iterator to first element with .first >= x
// We search using a custom comparator that compares only .first
auto lo = lower_bound(a.begin(), a.end(), make_pair(x, INT_MIN));
// upper_bound: iterator to first element with .first > x
auto hi = upper_bound(a.begin(), a.end(), make_pair(x, INT_MAX));
int cnt = (int)(hi - lo);
if(cnt > 0){
// The original index of the first occurrence
int orig_idx = lo->second;
cout << cnt << " " << orig_idx << "\n";
} else {
cout << "0 -1\n";
}
}
}Operations Reference
Time and space costs for every sorting and searching operation covered above.
| Operation | Time | Space | Notes |
|---|---|---|---|
sort | IntroSort, in-place | ||
stable_sort | Extra memory for merge | ||
nth_element | In-place, partial order | ||
partial_sort (top | Heapselect | ||
lower_bound / upper_bound | Requires sorted range | ||
binary_search | Requires sorted range | ||
find (unsorted) | Linear scan | ||
count (unsorted) | Linear scan | ||
count via upper - lower (sorted) | Two binary searches | ||
merge two sorted ranges | Into output range | ||
Build sorted structure (set/multiset) | Alternative to sort+BS |
Where Does the O(n log n) Come From?
Merge sort divides the array in half recursively, then merges the sorted halves.
- How many levels? Each level halves the problem:
. That's levels. - Work per level? At each level, every element participates in exactly one merge. Merging two sorted halves costs
total across all merges at that level. - Total:
levels per level .
This also proves the lower bound: any comparison-based sort must make
Problem Patterns & Tricks
"Find/count pairs with property X" + N <= 10^5—reach for Sort + Two Pointers or Sort + Binary Search.
Sort + Binary Search (Existence / Count)
Sort once, then answer "is X in the array?" or "how many elements fall in lower_bound / upper_bound—
cpp
// Count elements in [L, R]
int cnt = upper_bound(a.begin(), a.end(), R) - lower_bound(a.begin(), a.end(), L);CF examples: CF 706B—Interesting drink (direct lower/upper bound), CF 474B—Worms.
Sort + Two Pointers (Pair / Triplet Finding)
Sort, then converge two pointers from both ends to find pairs summing to a target. Generalizes to 3SUM by fixing one element and running two pointers on the rest.
cpp
sort(a.begin(), a.end());
int lo = 0, hi = n - 1;
while(lo < hi){
// Invariant: answer (if it exists) has one index in [lo..] and one in [..hi]
int s = a[lo] + a[hi];
if(s == target) { /* found */ lo++; hi--; }
else if(s < target) lo++;
else hi--;
}CF examples: CF 1850E—Cardboard for Pictures, CF 1915D—Unnatural Language Processing.
Custom Sort for Greedy
Many greedy problems reduce to "sort by the right criterion, then sweep." The art is choosing what to sort by.
- Interval scheduling: sort by end time, greedily pick non-overlapping intervals.
- Job scheduling with deadlines: sort by deadline.
- Fractional knapsack: sort by value/weight ratio descending.
cpp
// Interval scheduling: sort by end time
sort(intervals.begin(), intervals.end(), [](const auto& a, const auto& b){
return a.second < b.second;
});CF examples: CF 1472D—Even-Odd Game, CF 1353D—Constructing the Array.
Sort + Unique (Deduplication / Coordinate Compression)
Sort, then remove duplicates with erase(unique(...), end()). The fundamental preprocessing step for coordinate compression and counting distinct values.
cpp
sort(a.begin(), a.end());
a.erase(unique(a.begin(), a.end()), a.end());
// a now contains sorted unique elementsCF examples: CF 1etz C—various compression problems, CF 1560C—Infinity Table.
Binary Search on Answer
Not searching in an array—searching over a value space. "What is the minimum check(X) is true?" Sort is usually a subroutine inside check().
cpp
long long lo = 0, hi = 2e18;
while(lo < hi){
// Invariant: answer is in [lo, hi]
long long mid = lo + (hi - lo) / 2;
if(check(mid)) hi = mid;
else lo = mid + 1;
}
// lo == smallest valid answerCF examples: CF 1623C—Balance the Bits, CF 1760E—Binary Inversions.
Sort to Detect / Count Inversions
Merge sort counts inversions in
cpp
// Merge sort inversion count (sketch)
long long merge_count(vector<int>& a, int l, int r){
if(r - l <= 1) return 0;
int m = (l + r) / 2;
long long cnt = merge_count(a, l, m) + merge_count(a, m, r);
vector<int> buf;
int i = l, j = m;
while(i < m && j < r){
// Invariant: buf contains merged elements from a[l..i-1] and a[m..j-1]
if(a[i] <= a[j]) buf.push_back(a[i++]);
else { buf.push_back(a[j++]); cnt += m - i; }
}
while(i < m) buf.push_back(a[i++]); // Invariant: remaining left-half elements are all > buf.back()
while(j < r) buf.push_back(a[j++]); // Invariant: remaining right-half elements are all > buf.back()
copy(buf.begin(), buf.end(), a.begin() + l);
return cnt;
}CF examples: CF 1490F—Equalize the Array, CF 1430E—String Reversal.
Sort Events (Sweep Line)
Convert intervals to (start, +1) and (end, −1) events, sort by coordinate, then sweep left to right tracking the active count. The pattern handles overlap counting, maximum simultaneous events, and interval union.
cpp
vector<pair<int,int>> events; // {coordinate, +1 or -1}
for(auto& [l, r] : intervals){
events.push_back({l, +1});
events.push_back({r + 1, -1}); // +1 for closed intervals
}
sort(events.begin(), events.end());
int cur = 0, mx = 0;
for(auto& [pos, delta] : events){
cur += delta; // Invariant: cur = number of active intervals at position pos
mx = max(mx, cur);
}CF examples: CF 1355C—Count Triangles, CF 1462D—Add to Neighbour and Remove.
When NOT to Sort
- When order matters—if the problem asks about the original array order (e.g., longest increasing subsequence), sorting destroys the information you need. See Divide and Conquer for merge-sort-based approaches that preserve index information.
- When N is very small—for
, brute force is often simpler and fast enough. See Complete Search. - When you need online processing—if elements arrive one at a time, consider a balanced BST or heap instead of re-sorting.
- When value range is tiny—if all values are in
, counting sort or direct array indexing beats . See Counting Sort and Radix Sort above.
Before You Code—Sorting Checklist
- [ ] I confirmed the sort order matches my algorithm's needs (ascending? descending? custom?)
- [ ] My comparator uses strict
<, NOT<= - [ ] I checked if sort stability matters (does relative order of equal elements affect the answer?)
- [ ] I verified N*log(N) fits in the time limit (N <= 10^6 → ~20 million ops, fine)
- [ ] If using binary search after sort: am I searching on the right property?
Contest Cheat Sheet
Mnemonic: "SCUB"—Sort first, Compare with strict-less-than, Use lower_bound, Beware stability.
If I Had 5 Minutes
- Default to
sort(a.begin(), a.end())—it's O(N log N) and enough for N <= 10^6 - Pair/struct sorting: use
tie()for multi-key, not manual comparators - Need stability?
stable_sort(). Need partial?nth_element()is O(N) - After sorting: binary search with
lower_bound/upper_bound, NOT linear scan - Custom comparator must be strict (
<not<=) or you get UB
text
+--------------------------------------------------------------+
| SORTING & SEARCHING CHEAT SHEET |
+--------------------------------------------------------------+
| WHEN TO USE: |
| - Need to find/count elements quickly --> sort + BS |
| - Need pairs with target sum --> sort + 2ptr |
| - Greedy needs "best first" ordering --> custom sort |
| - Count distinct / compress coords --> sort + unique |
| - Count elements in range [L,R] --> upper - lower |
+--------------------------------------------------------------+
| KEY STL CALLS: |
| sort(a.begin(), a.end()); |
| sort(a.begin(), a.end(), [](int x, int y){ return x>y; }); |
| auto it = lower_bound(a.begin(), a.end(), x); |
| int cnt = upper_bound(all, R) - lower_bound(all, L); |
| nth_element(a.begin(), a.begin()+k, a.end()); // kth elem |
+--------------------------------------------------------------+
| COMPLEXITY: |
| sort: O(n log n) time, O(log n) space |
| stable_sort: O(n log n) time, O(n) space |
| lower/upper: O(log n) per query |
| nth_element: O(n) average |
+--------------------------------------------------------------+
| WATCH OUT: |
| - Comparator must be strict weak ordering (no cmp(x,x)) |
| - lower_bound/upper_bound need SORTED range |
| - Use (long long) for products that overflow int |
+--------------------------------------------------------------+Common Mistakes & Debugging
Strict Weak Ordering Violations
The leading cause of mysterious runtime errors with sort. Your comparator cmp must satisfy:
- Irreflexivity:
cmp(a, a)isfalse. - Asymmetry: if
cmp(a, b)then!cmp(b, a). - Transitivity: if
cmp(a, b)andcmp(b, c)thencmp(a, c).
Wrong: return a.x <= b.x;—returns true for equal elements. Right: return a.x < b.x;
On GCC with _GLIBCXX_DEBUG, this triggers an assertion. On judges, it can cause a wrong answer, an infinite loop, or a segfault.
Custom Comparator Gotchas
cpp
// WRONG: causes undefined behavior (not strict weak ordering)
sort(a.begin(), a.end(), [](int x, int y) { return x <= y; });
// RIGHT: use strict less-than
sort(a.begin(), a.end(), [](int x, int y) { return x < y; });
// WRONG: overflow when computing a - b for large values
sort(a.begin(), a.end(), [](int x, int y) { return x - y < 0; }); // overflow!
// RIGHT: direct comparison
sort(a.begin(), a.end(), [](int x, int y) { return x < y; });
// WRONG: lambda captures dangling reference
int key;
cin >> key;
auto cmp = [&key](int x, int y) { return abs(x - key) < abs(y - key); };
// Safe here, but if 'key' goes out of scope before sort finishes, UB.
// Prefer capture by value [key] when the captured variable might not outlive the lambda.
// Multi-key sort using tie (clean and safe):
sort(events.begin(), events.end(), [](const auto& a, const auto& b) {
return tie(a.time, a.type) < tie(b.time, b.type);
});
// tie() compares lexicographically -- first by time, then by type on ties.
// This is MUCH cleaner and safer than:
// if (a.time != b.time) return a.time < b.time; return a.type < b.type;Signed / Unsigned Comparison
lower_bound returns an iterator. Computing it - a.begin() gives a signed ptrdiff_t, but a.size() is size_t (unsigned). Comparing signed and unsigned can produce bugs when the signed value is negative.
cpp
// Dangerous:
if(it - a.begin() < a.size()) // mixed sign comparison
// Safe:
int pos = (int)(it - a.begin());
if(pos < n) // both intWhen stable_sort Matters
Sort by key stable_sort, and elements with equal sort, that prior ordering is lost. This matters for multi-criteria sorts done in multiple passes.
Sort Stability—Visual
text
Input: [(3,'a'), (1,'b'), (3,'c'), (2,'d')]
Stable sort by first element:
[(1,'b'), (2,'d'), (3,'a'), (3,'c')]
^^^^^^^^^^^^^^^^
Original relative order preserved!
Unstable sort by first element:
[(1,'b'), (2,'d'), (3,'c'), (3,'a')] <-- 'c' and 'a' may swap
^^^^^^^^^^^^^^^^
Relative order NOT guaranteed!
+----------+------------+------------------+
| Sort | Stable? | When to use |
+----------+------------+------------------+
| sort() | NO | Default, fastest |
| stable_ | YES | Multi-key sort, |
| sort() | | preserve order |
+----------+------------+------------------+Forgetting to Sort Before Binary Search
lower_bound and upper_bound require a sorted range. Calling them on an unsorted range compiles without complaint but returns garbage. If binary search gives wrong answers, first check that you actually sorted.
Off-by-One with upper_bound
upper_bound(a.begin(), a.end(), x) returns an iterator past the last element equal to x. To get the index of the last element
cpp
auto it = upper_bound(a.begin(), a.end(), x);
if(it != a.begin()){
--it;
// *it is the largest element <= x
}Integer Overflow in Comparators
Comparing by difference (e.g., return a - b < 0) overflows int for large values. Use direct comparison (return a < b) or cast to long long.
Mental Traps
Trap 1: "sort is just sort." Sorting is not the destination—it is the departure gate. The moment you call sort(), you should already know what technique comes next: binary search? two pointers? greedy sweep? If you sort without a follow-up plan, you are spending
text
THE SORT -> THEN-WHAT PIPELINE
==============================
Problem: "find all pairs with sum = target"
sort(a)
|
+--------- ? ---------+
| |
Binary search Two pointers
for each a[i], lo=0, hi=n-1
search target-a[i] converge inward
| |
O(n log n) O(n)
total total
Both work. Two pointers is faster.
But BOTH require the sort step first.Trap 2: "lower_bound works on any container."std::lower_bound(s.begin(), s.end(), val) on a std::set compiles but runs in s.lower_bound(val) for
The Mistake That Teaches
In a Div2-C, I needed to sort intervals by end time for activity selection. I wrote return a.end <= b.end; as my comparator. It passed locally with GCC, cleared the sample tests, and then crashed with Runtime Error on test 5. The <= comparator violates strict weak ordering: when two intervals share the same end time, the comparator claims both a < b and b < a—a contradiction. Introsort relies on that property and can segfault or infinite-loop when it's broken. Changing <= to < fixed everything. Lesson: strict weak ordering is not a suggestion; it is a contract.
What Would You Do If...?
- You need to sort by multiple keys? Use
tie(key1, key2)in your comparator. It is cleaner and less error-prone than writingif (a.key1 != b.key1) return a.key1 < b.key1; return a.key2 < b.key2;. - Sort + binary search gives TLE? Check whether you are sorting inside a loop. Sort once outside, then binary search inside.
- You need the k-th smallest element but not a full sort? Use
nth_element()—average. It rearranges so the element at position is what would be there if fully sorted; elements before it are it, elements after are it.
Self-Test
Close the file and answer these from memory. If any gives you pause, re-read the relevant section before continuing.
- [ ] Write a custom comparator for sorting
(deadline, duration)pairs by deadline ascending, ties broken by duration ascending—with valid strict weak ordering. - [ ] Explain why
std::lower_boundonstd::setiterators isand how to get . - [ ] Count pairs
(i, j)witha[i] + a[j] = targetinusing sort + binary search. - [ ] Name three algorithms or techniques that require sorted input as a prerequisite.
- [ ] State when
nth_elementis preferable tosortand what complexity advantage it gives.
Debug This!
Bug 1: Why does this give the wrong count of values
cpp
vector<int> a = {1, 3, 5, 7, 9};
int x = 5;
int count = lower_bound(a.begin(), a.end(), x) - a.begin();
// Expected: 3 (values 1, 3, 5)
// Got: 2Answer: lower_bound returns an iterator to the first element upper_bound: upper_bound(a.begin(), a.end(), x) - a.begin() returns 3.
Bug 2: Why does this RE on some test cases?
cpp
sort(a.begin(), a.end(), [](pair<int,int> x, pair<int,int> y) {
return x.first * y.second <= y.first * x.second; // Two bugs!
});Answer: Two bugs: (1) <= violates strict weak ordering—when x.first * y.second == y.first * x.second, both sides return true, which is a contradiction; (2) the multiplication can overflow int for large values. Fix: use < and cast to long long.
Bug 3: Why does sorting not help here?
cpp
// Find longest increasing subsequence
vector<int> a = {3, 1, 4, 1, 5, 9};
sort(a.begin(), a.end());
// Now find longest run of increasing... but it's ALL increasing after sort!Answer: Sorting destroys the original positions. LIS depends on which elements appear in order, not which values are adjacent after sorting. You need patience sorting or DP on the original sequence, not sort().
Bug 4: nth_element gives wrong result?
cpp
vector<int> a = {5, 3, 1, 4, 2};
nth_element(a.begin(), a.begin() + 2, a.end());
cout << a[2]; // Expected: median (3)
// This is actually correct! But people often expect a[k] = k-th smallest
// when nth_element uses 0-indexed position. a[2] IS the 3rd smallest = 3.Answer: This actually works correctly. The gotcha: people sometimes pass a.begin() + k expecting the a.begin() + k - 1.
Practice Problems
| # | Problem | Source | Difficulty | Key Idea |
|---|---|---|---|---|
| 1 | Vanya and Lanterns | CF 492B | 1200 | Sort + check gaps |
| 2 | Interesting drink | CF 706B | 1100 | Sort + upper_bound per query |
| 3 | Factory Robots | CSES | Easy | Sort + unique count |
| 4 | Towers | CSES | Easy | Sort / multiset greedy |
| 5 | Even-Odd Game | CF 1472D | 1200 | Sort descending + greedy |
| 6 | Count Triangles | CF 1355C | 1500 | Sort + two pointers / binary search |
| 7 | String Reversal | CF 1430E | 1700 | Sort-based inversion counting |
| 8 | Magic Powder - 2 | CF 670D2 | 1500 | Binary search on answer |
| 9 | Playlist | CSES | Medium | Sort + sliding window / two pointers |
| 10 | Tasks | CSES | Medium | Sort by deadline, greedy scheduling |
Rating Progression
- At CF 1200: Sort arrays and use
lower_boundfor existence checks. - At CF 1500: Design custom comparators for greedy problems; use sort + two pointers.
- At CF 1800: Exploit sort stability for multi-key problems; use counting sort for
when the value range is small. - At CF 2100: Combine sort with sweep line, merge sort for inversions, or sort events for complex simulations.
Further Reading
- cp-algorithms: Sorting—survey of sorting algorithms and their properties.
- C++ Reference: std::sort—full specification and complexity guarantees.
- C++ Reference: std::lower_bound—iterator-based binary search details.
- CF Blog: Strict Weak Ordering—why your comparator crashes.
Related Techniques
- Binary Search—sorting enables
lookups and "binary search on the answer" - Two Pointers—sorted arrays make two-pointer sweeps possible for pair/sum problems
- Greedy—many greedy strategies begin with "sort by some key, then iterate"
- Coordinate Compression—using sort + unique as a preprocessing step to map large value ranges to small indices
- Prefix Sums—combining sorted arrays with prefix sums for range counting
- Sliding Window—sliding window techniques that often pair with sorted order
- Data Structure Selection Guide—when to pick sort + search vs.
set/map/ heap - Practice Ladder 1100–1400—curated problems at the right difficulty for drilling these patterns
What to Solve Next: Work through the practice problems above at your current rating level, then move on to Binary Search, which builds directly on the sorted-array foundation from this chapter. Once comfortable with both, Two Pointers is the natural next technique to pair with sorting.