Appearance
Offline Queries
Quick Revisit
- USE WHEN: all queries known upfront — reorder them to share computation between overlapping ranges
- INVARIANT: queries sorted by key (endpoint, time) so consecutive queries share most of their work
- TIME: O((n+q) log n) with sorting + data structure | SPACE: O(n + q)
- CLASSIC BUG: forgetting to store original query indices — answers must be output in input order
- PRACTICE: 08-ladder-mixed
AL-38 | Read all queries before answering any of them, reorder for efficient processing. Turns
Difficulty: [Intermediate]Prerequisites: Sorting and Searching, Fenwick Tree. DSU is a later application (offline connectivity), not a hard prerequisite for the first reading. CF Rating Range: 1400--1800
Contents
- 2. Intuition
- 3. C++ STL Reference
- 4. Implementation (Contest-Ready)
- 5. Operations Reference
- 6. Problem Patterns & Tricks
- 7. Contest Cheat Sheet
- 8. Gotchas & Debugging
- 9. Self-Test
- 10. Practice Problems
- 11. Further Reading
Intuition
You have an array of
Consider 5 queries on a 6-element array:
a = [4, 2, 7, 1, 5, 3]
1 2 3 4 5 6
Q0: sum [1, 4] Q1: sum [2, 6] Q2: sum [1, 3]
Q3: sum [3, 6] Q4: sum [2, 5]Q0 touches positions 1--4. Q4 touches 2--5. They overlap on 2--4, yet online processing scans those positions twice. Q1 and Q3 overlap on 3--6 -- scanned twice again. Every shared sub-range is redundant work that we pay for but gain nothing from. With
If you know all queries upfront, you can reorder them to minimize redundant work -- sorting by endpoint, sweep line, or grouping lets you share computation across queries.
Think of a librarian who receives 50 book requests at once. Instead of walking to the shelves in request order -- romance, science, romance, history, science -- she batches by shelf location: all romance books in one trip, all science books in the next. The total number of books fetched is the same, but the walking (the redundant scanning) is eliminated.
Offline query processing does exactly this: read every query first, then choose a processing order that lets a single sweep over the data answer all of them.
Online processing: Offline processing:
Q1 --> process --> answer Q1 --+
Q2 --> process --> answer Q2 --+--> sort --> batch process --> answers
Q3 --> process --> answer Q3 --+
... ...
Qn --> process --> answer Qn --+
Each query independent. Queries reordered for efficiency.
Total: O(n * q) Total: O((n + q) log n)A Small Taxonomy
Not all offline methods reorder queries the same way. Three distinct flavors show up across this chapter:
- Timeline-preserving (sweep + sort key). Queries are sorted by a monotone key (right endpoint, weight threshold, time) and processed in that order while a single data structure sweeps. Examples: sort-by-R + BIT, offline DSU. The relative order of "events" is fixed; only the order of queries changes.
- Spatial / pointer-based (Mo's). Queries on a static array are reordered to minimize total pointer movement. The data structure is rebuilt incrementally as the window slides; there is no global timeline.
- Answer-guessing / batched search (parallel binary search, CDQ). Queries do not get reordered up-front; instead, all queries are processed in lockstep against trial answers (PBS) or partitioned recursively across a midpoint (CDQ). The work per query stays the same; the saving comes from sharing the expensive "replay" or "merge" step across queries.
This file covers the first flavor. The rest of the chapter covers the other two.
Visual Walkthrough
Array a = [4, 2, 7, 1, 5, 3] (1-indexed). Five range-sum queries:
Q0: [1, 4] Q1: [2, 6] Q2: [1, 3] Q3: [3, 6] Q4: [2, 5]Step 1 -- Sort queries by right endpoint
Q2: [1, 3] r=3
Q0: [1, 4] r=4
Q4: [2, 5] r=5
Q1: [2, 6] r=6
Q3: [3, 6] r=6Step 2 -- Initialize a BIT (Fenwick tree) of size 6, sweep pointer
BIT state: [4 . . . . .]
^
i=1
No query has r=1. Continue.Step 3 -- Advance sweep:
BIT state: [4 2 7 . . .]
1 2 3
^
i=3
--> Q2: [1,3] = prefix(3) - prefix(0) = 13 - 0 = 13Step 4 -- Advance sweep:
BIT state: [4 2 7 1 . .]
^
i=4
--> Q0: [1,4] = prefix(4) - prefix(0) = 14 - 0 = 14Step 5 -- Advance sweep:
BIT state: [4 2 7 1 5 .]
^
i=5
--> Q4: [2,5] = prefix(5) - prefix(1) = 15 - 4 = 11Step 6 -- Advance sweep:
BIT state: [4 2 7 1 5 3]
^
i=6
--> Q1: [2,6] = prefix(6) - prefix(1) = 22 - 4 = 18
--> Q3: [3,6] = prefix(6) - prefix(2) = 22 - 6 = 16Each element is added to the BIT exactly once. Each query is a single
The Invariant
+-------------------------------------------------------------------+
| INVARIANT: After the sweep pointer has advanced to position r, |
| the BIT reflects the full prefix a[1..r], and every query whose |
| right endpoint <= r has been answered. |
+-------------------------------------------------------------------+This guarantees correctness: when we evaluate a query
When to Reach for This
Trigger phrases in problem statements:
- "All queries are given upfront" or "read all queries before answering."
- "No updates between queries" -- the array (or graph) is static.
- "Sort queries" -- editorials or hints mention reordering queries.
- "For each query, find ..." with a static data set.
Counter-examples (this technique does NOT apply):
- Online queries with updates -- if elements change between queries, you cannot reorder. Use a segment tree with lazy propagation instead.
- Queries must be answered in arrival order -- some problems enforce an ordering (e.g., each query depends on the answer to the previous one). You cannot reorder; consider persistent data structures.
- Forced online via XOR decryption -- some problems XOR the next query with the previous answer, making it impossible to know future queries. Mo's algorithm with modifications or persistent structures may help, but basic offline reordering cannot.
Decision tree for choosing an offline technique:
Which offline technique to use?
Queries on ranges?
/ \
YES NO --> coordinate compression,
/ offline connectivity (DSU)
Need updates
between queries?
/ \
YES NO
| |
Online Can you maintain
required state incrementally?
/ \
YES NO
| |
Sort by R Mo's algorithm
+ BIT/segtree O((n+q)*sqrt(n))
O((n+q)*log n)What the Code Won't Teach You
The design question is always: "what order makes this cheap?" The topic file shows several orderings (by right endpoint, by weight). But in contests you encounter new situations and must reason fresh. Ask yourself: if I sort queries by X, what property does that give me about consecutive queries? Does that property make some operation (BIT update, pointer move) amortize to near-zero?
Offline + divide and conquer is a pattern. A family of offline techniques uses D&C: split the query range
at the midpoint, count contributions where updates come from the left half and queries from the right half, recurse (CDQ D&C). The offline component is that you have all updates and queries upfront and can assign them to halves. "Offline" is about information, not implementation. The real question is: does knowing future queries let you avoid redundant work? If yes, offline wins. If queries are inherently sequential (each depends on the last), offline buys you nothing.
C++ STL Reference
| Function / Class | Header | What it does | Time Complexity |
|---|---|---|---|
sort(first, last) | <algorithm> | Sort queries/edges by chosen key | |
sort(first, last, cmp) | <algorithm> | Sort with custom comparator (lambda) | |
iota(first, last, val) | <numeric> | Fill index array with 0, 1, 2, ... for indirect sort | |
lower_bound(first, last, val) | <algorithm> | Binary search for coordinate compression | |
unique(first, last) | <algorithm> | Remove consecutive duplicates (after sort) for compression | |
vector<array<int,K>> | <array> | Lightweight query struct with default < by first element |
Implementation (Contest-Ready)
Version 1 -- Minimal: Offline Distinct Values (Sort by R + BIT)
cpp
#include <bits/stdc++.h>
using namespace std;
struct BIT {
int n;
vector<int> t;
BIT(int n) : n(n), t(n + 1) {}
void update(int i, int v) { for (; i <= n; i += i & -i) t[i] += v; }
int query(int i) { int s = 0; for (; i > 0; i -= i & -i) s += t[i]; return s; }
int query(int l, int r) { return query(r) - query(l - 1); }
};
int main() {
int n, q;
scanf("%d%d", &n, &q);
vector<int> a(n + 1);
for (int i = 1; i <= n; i++) scanf("%d", &a[i]);
vector<array<int, 3>> qr(q); // {l, r, original_index}
for (int i = 0; i < q; i++) {
scanf("%d%d", &qr[i][0], &qr[i][1]);
qr[i][2] = i;
}
sort(qr.begin(), qr.end(), [](auto& a, auto& b) {
return a[1] < b[1]; // sort by right endpoint
});
BIT bit(n);
unordered_map<int, int> last;
vector<int> ans(q);
int ptr = 0;
for (int i = 1; i <= n; i++) {
if (last.count(a[i]))
bit.update(last[a[i]], -1);
bit.update(i, 1);
last[a[i]] = i;
while (ptr < q && qr[ptr][1] == i) {
ans[qr[ptr][2]] = bit.query(qr[ptr][0], i);
ptr++;
}
}
for (int i = 0; i < q; i++)
printf("%d\n", ans[i]);
}Version 2 -- Explained: Offline Connectivity (Sort + DSU)
cpp
#include <bits/stdc++.h>
using namespace std;
struct DSU {
vector<int> parent, rank_;
DSU(int n) : parent(n), rank_(n, 0) {
iota(parent.begin(), parent.end(), 0);
}
int find(int x) {
// Path compression: safe here because we never need to undo.
while (parent[x] != x) x = parent[x] = parent[parent[x]];
return x;
}
bool unite(int a, int b) {
a = find(a); b = find(b);
if (a == b) return false;
// Union by rank keeps tree depth O(log n).
if (rank_[a] < rank_[b]) swap(a, b);
parent[b] = a;
if (rank_[a] == rank_[b]) rank_[a]++;
return true;
}
bool connected(int a, int b) { return find(a) == find(b); }
};
int main() {
int n, m, q;
scanf("%d%d%d", &n, &m, &q);
// Each edge has a weight. We sort edges by weight so we can add them
// incrementally -- lightest first.
struct Edge { int u, v, w; };
vector<Edge> edges(m);
for (auto& [u, v, w] : edges) scanf("%d%d%d", &u, &v, &w);
sort(edges.begin(), edges.end(), [](auto& a, auto& b) {
return a.w < b.w;
});
// Each query asks: "are u and v connected using only edges with
// weight <= w?" We sort queries by w so they align with the edge
// sweep -- once we reach query weight w, all relevant edges are
// already in the DSU.
struct Query { int u, v, w, idx; };
vector<Query> queries(q);
for (int i = 0; i < q; i++) {
scanf("%d%d%d", &queries[i].u, &queries[i].v, &queries[i].w);
queries[i].idx = i; // preserve input order for output
}
sort(queries.begin(), queries.end(), [](auto& a, auto& b) {
return a.w < b.w;
});
DSU dsu(n);
vector<int> ans(q);
int ei = 0; // pointer into sorted edge array
for (auto& [u, v, w, idx] : queries) {
// Add every edge whose weight fits within this query's limit.
// Because both arrays are sorted by weight, each edge is
// processed at most once across all queries -- total O(m * a(n)).
while (ei < m && edges[ei].w <= w) {
dsu.unite(edges[ei].u, edges[ei].v);
ei++;
}
ans[idx] = dsu.connected(u, v) ? 1 : 0;
}
for (int i = 0; i < q; i++)
printf("%s\n", ans[i] ? "YES" : "NO");
}Operations Reference
| Operation | Time | Space |
|---|---|---|
| Sort queries | ||
| Sort edges | ||
| Sweep + BIT update/query (per element) | -- | |
| Sweep + BIT total | ||
| Sweep + DSU unite/find (per operation) | -- | |
| Sweep + DSU total | ||
| Overall (BIT variant) | ||
| Overall (DSU variant) |
Problem Patterns & Tricks
Pattern 1: Sort Queries by Right Endpoint + BIT
Sort all
Queries sorted by right endpoint:
Array indices: [ 0 1 2 3 4 5 6 7 8 9 ]
Q3: |--L=1------R=3--| process first (R=3)
Q1: |-----L=2---------R=5--| process second (R=5)
Q5: |--------L=3-----------R=7--| process third (R=7)
Q2: |-----------L=4-----------R=8--| process fourth (R=8)
<-- sweep line moves left to right -->
adding elements one by one into BITWorks for: distinct count, number of values in a range, count of elements satisfying a positional property over
Examples: CSES Distinct Values Queries, CF 220B.
Pattern 2: Sort Queries by Right Endpoint + Segment Tree
Same sweep, but use a segment tree when you need range-min/max instead of prefix sums. Typical use: track the minimum distance between two equal elements, updated as the sweep advances.
cpp
// Sketch: when sweep hits position i with value a[i],
// if a[i] was last seen at position p, update seg_tree at p
// with distance (i - p). Query [l, r] for range minimum.Example: CF 522D (Closest Equals).
Pattern 3: Sort Events and Queries Together (Sweep Line)
Treat array elements and queries as events on a shared axis. Sort everything by position (or coordinate). Process events as you sweep. Useful when queries and data points live in the same space.
cpp
// Merge points and queries into one event list
// Negative type = data point, non-negative = query index
vector<pair<int,int>> events;
sort(events.begin(), events.end());
for (auto [pos, type] : events) {
if (type < 0) bit.update(/* ... */);
else ans[type] = bit.query(/* ... */);
}Example: CF 1311F (Moving Points -- sort by coordinate, BIT for count/sum).
Pattern 4: Offline Connectivity with DSU
Queries ask "connected?" under a weight or time threshold. Sort edges and queries by threshold. Sweep through, adding edges to a DSU, answering queries at the right moment. Each edge is added at most once.
Example: CF 1851G (Vlad and the Mountains).
Pattern 5: Coordinate Compression
The simplest offline trick: read all values (from both data and queries) upfront, then compress to rank space. Required when values reach
cpp
vector<int> vals(a.begin(), a.end());
for (auto& [l, r, v] : queries) vals.push_back(v);
sort(vals.begin(), vals.end());
vals.erase(unique(vals.begin(), vals.end()), vals.end());
auto compress = [&](int x) {
return (int)(lower_bound(vals.begin(), vals.end(), x) - vals.begin());
};Example: Any BIT/segment tree problem with values up to
Pattern 6: DSU with Rollback (Offline Dynamic Connectivity)
When you need to add and remove edges, use DSU with rollback. Union by rank only (no path compression -- must be reversible). Save a checkpoint before temporary unions and restore afterwards.
cpp
struct RollbackDSU {
vector<int> par, rnk;
vector<pair<int,int>> history;
RollbackDSU(int n) : par(n), rnk(n, 0) { iota(par.begin(), par.end(), 0); }
int find(int x) { while (par[x] != x) x = par[x]; return x; }
bool unite(int a, int b) {
a = find(a); b = find(b);
if (a == b) return false;
if (rnk[a] < rnk[b]) swap(a, b);
history.push_back({b, par[b]});
par[b] = a;
if (rnk[a] == rnk[b]) { history.push_back({~a, rnk[a]}); rnk[a]++; }
return true;
}
int save() { return (int)history.size(); }
void rollback(int checkpoint) {
while ((int)history.size() > checkpoint) {
auto [node, val] = history.back(); history.pop_back();
if (node < 0) rnk[~node] = val;
else par[node] = val;
}
}
};Used in divide-and-conquer on query time intervals. Find is
Example: CF 1140F (Extending Set of Points).
Contest Cheat Sheet
+------------------------------------------------------------------+
| OFFLINE QUERIES CHEAT SHEET |
+------------------------------------------------------------------+
| WHEN TO USE: |
| - All queries given upfront (no forced online) |
| - Range queries on a static array |
| - "Connected using edges with weight <= w?" type queries |
| - Values up to 10^9 --> coordinate compression |
+------------------------------------------------------------------+
| KEY TEMPLATES: |
| sort(queries.begin(), queries.end(), by_right_endpoint); |
| for (int i = 1; i <= n; i++) { |
| update_BIT(i); |
| while (ptr < q && queries[ptr].r == i) |
| ans[queries[ptr].idx] = BIT.query(queries[ptr].l, i); |
| } |
+------------------------------------------------------------------+
| COMPLEXITY: |
| Sweep + BIT: O((n + q) log n) time | O(n + q) space |
| Sweep + DSU: O((m + q) a(n)) time | O(n + m + q) space |
+------------------------------------------------------------------+
| REMEMBER: |
| - Store original query index for output order |
| - 1-indexed BIT needs 1-indexed queries |
| - Sort edges AND queries by the SAME key |
| - DSU rollback: NO path compression, union by rank only |
+------------------------------------------------------------------+Gotchas & Debugging
Forgot to store query index. You sort queries but need to output answers in input order. Always carry the original index. Use
array<int,3>or a struct with anidxfield.0-indexed vs 1-indexed. BIT is naturally 1-indexed. If your array is 0-indexed, shift everything by 1 before touching the BIT.
Process order within the sweep. Update the data structure for position
before answering queries with . If you answer first, you miss the contribution of . unordered_mapTLE. Worst-caseper lookup due to hash collisions. For "last occurrence" tracking, prefer a plain vector<int>after coordinate compression, or use a custom hash.DSU path compression prevents rollback. If you need rollback, use union by rank without path compression.
find()becomesinstead of amortized . Integer overflow in BIT. If you sum values (not just count), use
long longin the BIT array.Queries with
. Shouldn't happen per most problem constraints, but guard against it if you generate sub-queries internally.
Mental Traps
"If I can store all queries, I can solve the problem offline." Storing queries is necessary but not sufficient. The benefit of offline processing comes from reordering queries to amortize cost. If you store all queries and process them in input order, you've gained nothing over online processing.
"Offline techniques are a fallback when online is too slow." Online techniques are sometimes strictly better (e.g., segment trees with lazy propagation). Offline is better when: (1) queries can be reordered to enable cheaper processing, or (2) the problem's structure inherently allows precomputation. The choice depends on query structure, not just time complexity.
"Offline means I can use any order I want." You can reorder queries, but the correct order depends entirely on your algorithm. Mo's order (by block +
) is optimal for Mo's. Sort by right endpoint for BIT sweeps. Sort by query parameter for parallel binary search. The wrong sort order gives a correct-looking but slow or wrong algorithm. "Offline always means Mo's algorithm." Mo's is one offline technique among many. Others include sort-by-R + BIT, CDQ divide and conquer, parallel binary search, and persistent structures. Jumping to Mo's on every offline problem misses cleaner, faster solutions.
The mistake that teaches (sort-by-R distinct count). I once sorted queries by left endpoint instead of right, while sweeping left-to-right. My BIT didn't contain all information for the range at query time, and 40% of answers were silently wrong. Debugging breakthrough: print the sweep pointer alongside each query's R. Always verify that your sweep direction and sort key are consistent.
The off-by-one on first occurrence. On CSES Distinct Values Queries, I wrote
BIT[i] += 1beforeBIT[last[a[i]]] -= 1. For the first occurrence (whenlast[a[i]] == 0), this caused a spurious-1at position 0 of the BIT -- corrupting all prefix sums. Always checkif (last[a[i]] != 0)or initializelastto-1. Off-by-one on sentinel values is the #1 offline query bug.
Debug Drills
Drill 1: DSU with rollback -- but rollback doesn't actually undo the union.
cpp
struct DSU {
vector<int> par, rank_;
vector<pair<int,int>> history; // (node, old_parent)
DSU(int n) : par(n), rank_(n, 0) { iota(par.begin(), par.end(), 0); }
int find(int x) { while (par[x] != x) x = par[x]; return x; }
bool unite(int a, int b) {
a = find(a); b = find(b);
if (a == b) return false;
if (rank_[a] < rank_[b]) swap(a, b);
history.push_back({b, par[b]});
par[b] = a;
if (rank_[a] == rank_[b]) rank_[a]++;
return true;
}
void rollback(int saved) {
while ((int)history.size() > saved) {
auto [node, old_par] = history.back();
history.pop_back();
par[node] = old_par;
}
}
};What's wrong?
When rank_[a] == rank_[b] and we increment rank_[a], the rollback only restores par[b] but never restores rank_[a]. After rollback, rank_[a] is permanently inflated, corrupting future union-by-rank decisions. Fix: also push the old rank onto the history stack and restore it during rollback.
Drill 2: Offline distinct-count queries with sort-by-R sweep.
cpp
vector<int> ans(Q);
sort(queries.begin(), queries.end(), [](auto& a, auto& b) {
return a.r < b.r;
});
int ptr = 0;
for (auto& [l, r, idx] : queries) {
while (ptr < r) {
ptr++;
if (last_pos[a[ptr]] != -1)
bit.update(last_pos[a[ptr]], -1);
bit.update(ptr, 1);
last_pos[a[ptr]] = ptr;
}
ans[idx] = bit.query(r) - bit.query(l - 1);
}What's wrong?
The pointer advances with ptr++ before processing, making it 1-indexed on entry -- but if the array a[] and BIT are 0-indexed, a[ptr] is off-by-one. More critically, the while condition ptr < r means when the loop exits, ptr == r, but the element at index r has already been processed (since ptr++ happens first). If r is 0-indexed, the element at index r is never processed. Fix: use while (ptr <= r) { /* process a[ptr] */ ptr++; } or adjust to while (ptr < r) { /* process a[ptr+1] first, then ptr++ */ } depending on your indexing convention.
Drill 3: Forgetting to map answers back to original indices.
cpp
sort(queries.begin(), queries.end(), [](auto& a, auto& b) {
return a.r < b.r;
});
vector<int> ans(Q);
for (int i = 0; i < Q; i++) {
// ... process query i ...
ans[i] = result; // BUG
}
for (int i = 0; i < Q; i++)
cout << ans[i] << "\n";What's wrong?
After sorting, query i in the sorted order is not the i-th original query. Writing ans[i] = result stores the answer at the wrong index. Fix: use ans[queries[i].original_index] = result so the answer is placed at the position corresponding to the original input order.
Drill 4: Sorting queries but forgetting to store indices.
cpp
sort(queries.begin(), queries.end(), [](auto& a, auto& b) {
return a.r < b.r;
});
for (int i = 0; i < q; i++) {
// process queries[i]
ans[i] = bit.query(queries[i].r) - bit.query(queries[i].l - 1);
}
// Output: for (int i = 0; i < q; i++) cout << ans[i] << "\n";What's wrong?
After sorting, queries[i] is no longer the i-th original query. The answer ans[i] is stored at the sorted position, not the original position. Fix: store the original index in each query struct and use it:
cpp
ans[queries[i].idx] = bit.query(queries[i].r) - bit.query(queries[i].l - 1);Without this, answers are printed in sorted-by-R order instead of input order.
Drill 5: Offline DSU -- processing order.
cpp
// Edges sorted by weight, queries sorted by weight threshold
int ei = 0;
for (auto& [threshold, u, v, idx] : queries) {
while (ei < m && edges[ei].w < threshold) // wrong comparator
dsu.unite(edges[ei].u, edges[ei].v), ei++;
ans[idx] = dsu.find(u) == dsu.find(v);
}What's wrong?
The condition edges[ei].w < threshold uses strict less-than. If the query asks "are u and v connected using edges with weight <= threshold?", edges with weight equal to the threshold are excluded. Fix: use <=:
cpp
while (ei < m && edges[ei].w <= threshold)This off-by-one on the comparator is especially insidious because it only fails when an edge weight exactly equals a query threshold -- which may not appear in small test cases.
Drill 6: Last occurrence tracking.
cpp
map<int, int> last_pos;
for (int i = 1; i <= n; i++) {
if (last_pos.count(a[i]))
bit.update(last_pos[a[i]], -1);
bit.update(i, 1);
last_pos[a[i]] = i;
// answer all queries with r == i
while (qi < q && queries[qi].r == i) {
ans[queries[qi].idx] = bit.query(queries[qi].r) - bit.query(queries[qi].l);
qi++;
}
}What's wrong?
The answer should be bit.query(queries[qi].r) - bit.query(queries[qi].l - 1), not bit.query(queries[qi].l). With a 1-indexed BIT, bit.query(x) returns the prefix sum from 1 to x. The count of distinct values in [L, R] is prefix(R) - prefix(L-1). Using prefix(L) excludes position L itself from the count, giving an answer that's too small by 1 whenever position L holds a "last occurrence."
Debug Checklist
- Debugging tip. Print queries after sorting to verify the sweep order. Check that every query gets answered exactly once (counter should equal
at the end).
Self-Test
Before moving on, check that you can do each of these without looking back at the notes:
- [ ] Define "offline" vs. "online" query processing in one sentence each -- in terms of the constraint on query processing order, not in terms of technique.
- [ ] Explain why sorting queries by right endpoint lets a BIT-based sweep answer range queries in
total time. - [ ] Write the boilerplate for sorting queries with stored indices and outputting answers in original order (~10 lines of C++).
- [ ] Name three different offline techniques (beyond Mo's) and describe the problem structure each is suited for.
- [ ] Given a new problem statement, identify whether it permits offline processing or forces online -- look for XOR decryption, query-answer dependencies, or "all queries given upfront."
- [ ] Explain when DSU with rollback is needed instead of DSU with path compression, and why path compression prevents rollback.
- [ ] Describe how coordinate compression enables offline techniques when values reach
but the number of distinct values is small.
Practice Problems
Rating Progression
| CF Rating | What to Master |
|---|---|
| 1200 | Recognize when all queries are known upfront; sort queries by one endpoint and sweep with a pointer or prefix sum. |
| 1500 | Sort-by-R technique with a BIT to answer "distinct elements in range" or "last occurrence" style queries. |
| 1800 | DSU with rollback for connectivity queries on dynamic edge sets; understand when union-by-rank is mandatory (no path compression). Offline DSU (sort edges + queries by weight/time), divide-and-conquer on time. |
| 2100 | Combine offline reordering with segment trees or persistent structures; recognize when offline + sweep beats a heavy online DS. CDQ + BIT, parallel binary search + offline DSU, offline + persistent structures. |
| # | Problem | Source | Difficulty | Key Idea |
|---|---|---|---|---|
| 1 | Distinct Values Queries | CSES | Easy | Sort by R, BIT last-occurrence trick |
| 2 | CF 220B -- Little Elephant and Array | CF | Medium | Sort by R, BIT counts values appearing exactly |
| 3 | CF 522D -- Closest Equals | CF | Medium | Sort by R, segment tree tracks min distance between equal values |
| 4 | CF 1311F -- Moving Points | CF | Medium | Sort points by coordinate, BIT for pairwise distance sum |
| 5 | CF 1851G -- Vlad and the Mountains | CF | Medium | Sort queries and edges by height, offline DSU |
| 6 | CF 1140F -- Extending Set of Points | CF | Hard | DSU with rollback, divide-and-conquer on time |
| 7 | CF 600E -- Lomsat gelral | CF | Hard | DSU on tree (small-to-large merge), offline subtree queries |
Further Reading
- cp-algorithms: Disjoint Set Union -- DSU with rollback details.
- CF Blog: Offline Query Techniques (Enchom) -- survey of offline methods.
- CF Blog: DSU on Tree -- small-to-large merging for tree queries.
- See:
03-mo-algorithm.mdfor the most powerful offline range query technique. - See:
02-sqrt-decomposition.mdfor block-based decomposition. - See:
04-cdq-divide-and-conquer.mdfor multi-dimensional partial order problems via divide and conquer. - See:
05-parallel-binary-search.mdfor batching multiple binary searches over the same event timeline.