Appearance
Linear Basis (XOR Basis)
Quick Revisit
- USE WHEN: XOR subset queries -- max XOR, min XOR, k-th XOR, count distinct XOR values
- INVARIANT: basis[k] has bit k as its highest set bit; at most 60 basis vectors for 64-bit values
- TIME: O(n log V) build, O(log V) per query | SPACE: O(log V)
- CLASSIC BUG: forgetting to reduce basis to RREF before k-th XOR query
- PRACTICE: ../12-practice-ladders/08-ladder-mixed.md
A Gaussian elimination structure over
| Difficulty | Advanced |
| CF Rating | 1800 - 2100 |
| Prerequisites | Bit Manipulation |
Contents
- Why XOR subsets are tricky
- Building the basis
- Worked example
- What you can do with a basis
- Applications in competitive programming
- Visual intuition
- C++ STL reference
- Implementation (contest-ready)
- Operations reference
- Problem patterns and tricks
- Contest cheat sheet
- Gotchas and debugging
- Self-Test
- Practice problems
- Further reading
Why XOR subsets are tricky
Here is a deceptively simple question: given
Brute force checks all
Just like in ordinary linear algebra, every vector space has a basis -- a minimal set of vectors that spans the whole space. For 64-bit integers the dimension is bounded by the number of bits actually used, so we never need more basis vectors than there are bits in the input. Building that basis is Gaussian elimination, adapted for XOR.
Sizing the basis array — match the input bound exactly. The right MAXLOG depends on the input, not the integer type:
Input bound Bits needed basisarray size(fits int)30 basis[30](fits long long)60 basis[60](signed long long, no negatives)63 basis[63]unsigned 64-bit values 64 basis[64](useuint64_t)The conventional
basis[60]choice is tuned tocontest bounds. If the problem hands you values up to , basis[60]silently misses the top two bits. Always read the constraint and set the bound to match.
Building the basis: Gaussian elimination over GF(2)
We maintain an array basis[0..59]. Slot
- Scan from the highest bit of
downward. - Let
be the current highest bit of . - If
basis[k] == 0-- no existing element has this leading bit. Storethere and stop. We just increased the rank by one. - If
basis[k] != 0-- XORwith basis[k]. This clears bitin and we repeat with the new, smaller .
- If
- If
becomes 0, it was linearly dependent -- discard it.
This is exactly row reduction. Each successful insertion adds one independent element. The invariant is: every stored value has a unique leading bit that no other stored value shares.
Worked example: inserting 7, 5, 3, 6
Let us trace through with four numbers. We show binary representations and which basis slots fill up.
Insert 7 (binary 111). Highest bit is bit 2. basis[2] is empty, so store 7 there.
basis[2] = 111 (7)
basis[1] = 000 (empty)
basis[0] = 000 (empty)Insert 5 (binary 101). Highest bit is bit 2. basis[2] = 7, so XOR: basis[1] is empty, so store 2 there.
basis[2] = 111 (7)
basis[1] = 010 (2)
basis[0] = 000 (empty)Note: we stored 2, not the original 5. That is fine -- the span is the same.
Insert 3 (binary 011). Highest bit is bit 1. basis[1] = 2, so XOR: basis[0] is empty, store 1.
basis[2] = 111 (7)
basis[1] = 010 (2)
basis[0] = 001 (1)All three slots are full -- rank 3, meaning
Insert 6 (binary 110). Highest bit is bit 2. XOR with basis[2]: basis[0]:
basis[2] = 111 (7) (unchanged)
basis[1] = 010 (2)
basis[0] = 001 (1)What you can do with a basis
Once built, the basis supports several powerful queries.
Can a value be represented? Run the same insertion procedure. If it reduces to 0, yes. If it would be inserted (non-zero remainder), no.
Maximum XOR. Greedily accumulate from the highest basis slot downward: if XOR-ing basis[k] with the running answer increases it, take it. Since basis[k] has bit
Minimum non-zero XOR. The lowest-index non-zero slot in the basis (after reducing to row echelon form) gives the minimum.
Merge two bases. Insert every element of one basis into the other -- at most 60 insertions.
Count of distinct XOR values. Exactly
Row-echelon vs reduced row-echelon — a subtle distinction
The standard insertion procedure described above produces a basis in row-echelon form (REF): each
basis[k]has bitas its highest set bit, but lower bits may be set, and may also be set in other basis vectors.
Query Form needed Insert / can-be-represented / maximum XOR Row echelon (REF) — what insert builds Minimum non-zero XOR Row echelon -th smallest XOR / ordered enumeration Reduced row echelon (RREF) RREF additionally forces: for each
basis[k], no other basis vector has bitset. That eliminates the cross-talk between basis vectors so that XOR-ing in any subset of basis vectors yields a value whose bit pattern at the pivot positions exactly matches the inclusion pattern. That is what makes the bijection "binary representation of ↔ subset of basis vectors ↔ -th XOR value" work. Skipping the reduce step before a
-th-smallest query is the silent bug the quick revisit warns about — max XOR will keep working, so it is easy to overlook.
SPAN OF THE BASIS (rank 3, values 0-7):
basis[2]=111 basis[1]=010 basis[0]=001
XOR combination Value
───────────────────── ─────
(none) 000 = 0
basis[0] 001 = 1
basis[1] 010 = 2
basis[1] ^ basis[0] 011 = 3
basis[2] 111 = 7
basis[2] ^ basis[0] 110 = 6
basis[2] ^ basis[1] 101 = 5
basis[2]^basis[1]^basis[0] 100 = 4
All 2^3 = 8 values reachable. This is a complete 3-bit space.Applications in competitive programming
- Maximum XOR subarray: Compute prefix XOR
. Any subarray XOR is . Fix , find maximum over all -- insert all previous prefix XORs into a basis and query max XOR with current . - XOR spanning tree: All spanning trees of a graph have XOR values differing by cycle XORs. Build a basis of cycle XORs, then find the spanning tree XOR that maximizes (or minimizes) the target.
- Path XOR in trees: XOR along a path = XOR of root-to-
and root-to- . Combine with basis for subtree queries. - Linearly independent subsets: The basis size is the maximum number of elements you can choose so that no subset XORs to 0.
Visual intuition
Basis slots during insertion
Inserting: 7(111) 5(101) 3(011) 6(110)
Step 1: x = 7 = 111
bit 2 slot empty --> store
[bit 2] 111 << 7 goes here
[bit 1] ---
[bit 0] ---
Step 2: x = 5 = 101
bit 2 slot has 111 --> x ^= 111 --> x = 010
bit 1 slot empty --> store
[bit 2] 111
[bit 1] 010 << 5 ^ 7 = 2 goes here
[bit 0] ---
Step 3: x = 3 = 011
bit 1 slot has 010 --> x ^= 010 --> x = 001
bit 0 slot empty --> store
[bit 2] 111
[bit 1] 010
[bit 0] 001 << 3 ^ 2 = 1 goes here
Step 4: x = 6 = 110
bit 2 slot has 111 --> x ^= 111 --> x = 001
bit 0 slot has 001 --> x ^= 001 --> x = 000
x == 0 --> DISCARD (dependent)
[bit 2] 111 (no change)
[bit 1] 010
[bit 0] 001Greedy maximum XOR walkthrough
Goal: maximize ans, starting from ans = 000
Check basis[2] = 111:
ans ^ 111 = 111 > 000? YES --> ans = 111
Check basis[1] = 010:
ans ^ 010 = 101 < 111? NO (bit 2 would turn off)
Wait -- 111 ^ 010 = 101. Bit 2 still set? Yes (101 has bit 2).
But 101 < 111, so skip? Actually the greedy rule is:
"if bit k of ans is 0, XOR in basis[k]"
ans = 111: bit 1 is already 1 --> skip basis[1].
Check basis[0] = 001:
bit 0 of ans is already 1 --> skip.
Result: ans = 111 = 7. Maximum achievable XOR.The greedy rule is simple: for each basis slot
What the Code Won't Teach You
The code compiles, runs, and gives correct answers. What it hides:
Why the greedy for max-XOR works. Each basis slot basis[k] into the running answer toggles bit
The persistent/index-aware basis for range queries. To answer "maximum XOR from elements in
STANDARD vs INDEX-AWARE INSERTION:
Standard insert(x): Index-aware insert(x, idx):
+---------------------------+ +-----------------------------------+
| for bit k high to low: | | for bit k high to low: |
| if basis[k]==0: | | if basis[k]==0: |
| basis[k]=x; return | | basis[k]=x; index[k]=idx |
| x ^= basis[k] | | else if idx > index[k]: |
| | | swap(x, basis[k]) |
| | | swap(idx, index[k]) |
| | | x ^= basis[k] |
+---------------------------+ +-----------------------------------+
Same span, but index-aware lets you answer range queries.Reduced row echelon form (for k-th smallest)
Before reduction: After reduction:
[bit 2] 111 [bit 2] 100
[bit 1] 010 [bit 1] 010
[bit 0] 001 [bit 0] 001
Reduction: for each basis[k], clear bit k in all other basis elements.
basis[2] ^= basis[1] --> 111 ^ 010 = 101
basis[2] ^= basis[0] --> 101 ^ 001 = 100
Now k-th smallest (0-indexed, excluding 0):
k=1 -> take basis[0] = 001 = 1
k=2 -> take basis[1] = 010 = 2
k=3 -> take basis[0]^basis[1] = 011 = 3
k=4 -> take basis[2] = 100 = 4
...and so on, reading k in binary.C++ STL reference
| Function / Expression | What it does | Notes |
|---|---|---|
x ^ y | Bitwise XOR | Associative, commutative |
x & y | Bitwise AND | |
x | y | Bitwise OR | |
x >> k | Right shift by | Divides by |
1LL << k | Bit | Use LL for |
__builtin_clzll(x) | Count leading zeros (64-bit) | Undefined for |
63 - __builtin_clzll(x) | Highest set bit index | Common idiom |
__builtin_popcountll(x) | Number of set bits | |
bit_width(x) | C++20: | <bit> header |
Implementation (contest-ready)
Minimal contest template
cpp
#include <bits/stdc++.h>
using namespace std;
struct LinearBasis {
static const int MAXLOG = 60;
long long basis[MAXLOG];
int sz;
LinearBasis() : sz(0) { memset(basis, 0, sizeof basis); }
bool insert(long long x) {
for (int i = MAXLOG - 1; i >= 0; i--) {
if (!((x >> i) & 1)) continue;
if (!basis[i]) { basis[i] = x; sz++; return true; }
x ^= basis[i];
}
return false; // dependent
}
bool can_represent(long long x) {
for (int i = MAXLOG - 1; i >= 0; i--) {
if ((x >> i) & 1) x ^= basis[i];
}
return x == 0;
}
long long max_xor() {
long long ans = 0;
for (int i = MAXLOG - 1; i >= 0; i--)
ans = max(ans, ans ^ basis[i]);
return ans;
}
long long min_xor() {
for (int i = 0; i < MAXLOG; i++)
if (basis[i]) return basis[i];
return 0; // empty basis
}
void reduce() {
for (int i = MAXLOG - 1; i >= 0; i--)
if (basis[i])
for (int j = i + 1; j < MAXLOG; j++)
if ((basis[j] >> i) & 1)
basis[j] ^= basis[i];
}
long long kth_smallest(long long k) {
// Call reduce() first. Returns k-th smallest (1-indexed).
// If 0 is reachable (sz < total elements), adjust k.
vector<long long> compact;
for (int i = 0; i < MAXLOG; i++)
if (basis[i]) compact.push_back(basis[i]);
if (k >= (1LL << (int)compact.size())) return -1;
long long ans = 0;
for (int i = (int)compact.size() - 1; i >= 0; i--)
if ((k >> i) & 1)
ans ^= compact[i];
return ans;
}
void merge(const LinearBasis &other) {
for (int i = MAXLOG - 1; i >= 0; i--)
if (other.basis[i]) insert(other.basis[i]);
}
int size() const { return sz; }
};
int main() {
int n;
scanf("%d", &n);
LinearBasis lb;
for (int i = 0; i < n; i++) {
long long x;
scanf("%lld", &x);
lb.insert(x);
}
printf("%lld\n", lb.max_xor());
return 0;
}Explained version
cpp
#include <bits/stdc++.h>
using namespace std;
struct LinearBasis {
static const int MAXLOG = 60; // enough for values up to ~10^18
long long basis[MAXLOG];
int sz; // number of independent vectors in the basis
LinearBasis() : sz(0) {
memset(basis, 0, sizeof basis);
}
// Try to insert x into the basis.
// Returns true if x was independent (basis grew by 1).
// Returns false if x was already representable (dependent).
bool insert(long long x) {
for (int i = MAXLOG - 1; i >= 0; i--) {
if (!((x >> i) & 1)) continue; // bit i not set, skip
if (!basis[i]) {
// No basis element has bit i as leading bit.
// x becomes the representative for this bit position.
basis[i] = x;
sz++;
return true;
}
// basis[i] exists with leading bit i.
// XOR x with it to clear bit i in x, then keep reducing.
x ^= basis[i];
}
// x reduced to 0 -- it was a linear combination of existing basis.
return false;
}
// Check whether x can be formed by XOR of some subset of the basis.
// Same logic as insert, but we don't modify the basis.
bool can_represent(long long x) {
for (int i = MAXLOG - 1; i >= 0; i--) {
if ((x >> i) & 1) x ^= basis[i];
}
return x == 0;
}
// Maximum XOR achievable from any subset of the basis.
// Greedy: from highest bit down, XOR in basis[i] if it increases ans.
// Since basis[i] has bit i as leading bit, XOR-ing it in sets bit i
// without disturbing any higher bits already accumulated.
long long max_xor() {
long long ans = 0;
for (int i = MAXLOG - 1; i >= 0; i--)
ans = max(ans, ans ^ basis[i]);
return ans;
}
// Minimum non-zero XOR achievable.
// In row echelon form (our invariant), the smallest basis element
// is the answer because it cannot be made smaller by XOR with others.
long long min_xor() {
for (int i = 0; i < MAXLOG; i++)
if (basis[i]) return basis[i];
return 0; // empty basis -- only 0 is representable
}
// Reduce to reduced row echelon form:
// for each basis[i], clear bit i in every other basis element.
// After this, each basis element has EXACTLY one unique bit set
// that no other element shares -- and may have lower bits set too,
// but those lower bits do NOT appear as leading bits of other elements.
//
// Actually, reduced row echelon means: for each basis[j] with j > i,
// if bit i is set in basis[j], XOR it out using basis[i].
void reduce() {
for (int i = MAXLOG - 1; i >= 0; i--) {
if (!basis[i]) continue;
for (int j = i + 1; j < MAXLOG; j++) {
if ((basis[j] >> i) & 1) {
basis[j] ^= basis[i];
}
}
}
}
// k-th smallest XOR value (1-indexed, among non-zero values).
// MUST call reduce() before using this.
//
// After reduction, collect non-zero basis elements into a compact list.
// The achievable values biject with binary numbers 0..2^sz - 1:
// bit j of k tells whether to include compact[j].
long long kth_smallest(long long k) {
vector<long long> compact;
for (int i = 0; i < MAXLOG; i++)
if (basis[i]) compact.push_back(basis[i]);
// k must be in [1, 2^sz - 1] for non-zero values.
// If the original set can produce 0 (i.e., sz < n), then
// k=1 means 0, and you should subtract 1 from k before calling.
if (k >= (1LL << (int)compact.size())) return -1;
long long ans = 0;
for (int i = (int)compact.size() - 1; i >= 0; i--)
if ((k >> i) & 1)
ans ^= compact[i];
return ans;
}
// Merge another basis into this one.
// At most MAXLOG insertions, so O(MAXLOG^2).
void merge(const LinearBasis &other) {
for (int i = MAXLOG - 1; i >= 0; i--)
if (other.basis[i]) insert(other.basis[i]);
}
int size() const { return sz; }
};
int main() {
int n;
scanf("%d", &n);
LinearBasis lb;
for (int i = 0; i < n; i++) {
long long x;
scanf("%lld", &x);
lb.insert(x);
}
// Example: print max XOR, min non-zero XOR, count of distinct values
printf("Max XOR: %lld\n", lb.max_xor());
printf("Min XOR: %lld\n", lb.min_xor());
printf("Distinct XOR: %lld\n", 1LL << lb.size());
// k-th smallest query
lb.reduce();
for (long long k = 1; k < (1LL << lb.size()); k++) {
printf("k=%lld -> %lld\n", k, lb.kth_smallest(k));
}
return 0;
}Operations reference
| Operation | Time Complexity | Space | Notes |
|---|---|---|---|
insert(x) | |||
can_represent(x) | Same loop as insert | ||
max_xor() | Single greedy pass | ||
min_xor() | Scan for first non-zero slot | ||
reduce() | One-time preprocessing | ||
kth_smallest(k) | Requires reduce() first | ||
merge(other) | Insert each of other's elements | ||
size() | Returns basis rank | ||
| Build from |
Problem patterns and tricks
1. Maximum XOR subset
The most direct application. Insert all values, call max_xor().
cpp
LinearBasis lb;
for (auto x : a) lb.insert(x);
cout << lb.max_xor() << "\n";Problems: CF 959F -- offline queries sorted by index, maintain basis as you add elements.
2. Maximum XOR subarray (prefix XOR trick)
Any subarray XOR
PREFIX XOR TRICK FOR MAXIMUM XOR SUBARRAY:
Array a: [3, 5, 2, 6]
Prefix XOR p: [0, 3, 6, 4, 2]
^ ^
p[0] p[4]
Subarray XOR(a[1..3]) = p[3] ^ p[0] = 4 ^ 0 = 4
Subarray XOR(a[2..4]) = p[4] ^ p[1] = 2 ^ 3 = 1
For each r, maximize p[r] ^ p[l-1] over all l <= r
--> insert p[0..r-1] into basis, query max XOR with p[r]cpp
vector<long long> p(n + 1);
for (int i = 0; i < n; i++) p[i + 1] = p[i] ^ a[i];
LinearBasis lb;
long long ans = 0;
for (int r = 1; r <= n; r++) {
lb.insert(p[r - 1]);
// max XOR of p[r] with any element in the basis
long long cur = p[r];
for (int i = 59; i >= 0; i--)
cur = max(cur, cur ^ lb.basis[i]);
ans = max(ans, cur);
}Problems: CF 845G -- Shortest Path Problem, which is really about XOR paths in a graph.
3. XOR spanning tree / cycle space
In an undirected weighted graph, fix any spanning tree. Every non-tree edge creates a cycle whose XOR is computable. Build a basis of these cycle XORs. The set of achievable path XORs between two nodes = (tree path XOR) max_xor.
XOR CYCLE SPACE IN A GRAPH:
1 ---5--- 2 Spanning tree edges: 1-2(5), 2-3(3)
| / | Non-tree edge: 1-3(7) --> cycle XOR = 5^3^7 = 1
7 3 |
| / | Path 1->3 via tree: 5 ^ 3 = 6
3 -------- 3 Path 1->3 via non-tree: 7
Difference: 6 ^ 7 = 1 (= cycle XOR)
To minimize path XOR: tree_path_xor ^ (optimal subset of cycle basis)Problems: CF 845G
4. Online insertion with offline queries
Sort queries by time or index. Maintain a linear basis, inserting elements as you go. Each query only sees elements inserted so far.
Problems: CF 959F -- sort queries by right endpoint, insert array elements left to right.
5. Path XOR in trees
Root the tree. For each node
Problems: Tree XOR queries (various gym problems).
6. k-th smallest XOR value
Build basis, reduce, then use kth_smallest. Remember to handle whether 0 is achievable: if the number of inserted elements exceeds the basis size, then some element was dependent, meaning 0 is in the span. In that case, kth_smallest.
cpp
LinearBasis lb;
int n = a.size();
for (auto x : a) lb.insert(x);
lb.reduce();
bool zero_reachable = (lb.size() < n);
// if zero_reachable, k=1 is 0, k=2 is kth_smallest(1), etc.Problems: CF 1101G -- Zero XOR Subset (check if 0 is achievable, which means basis size <
7. Counting subsets with a given XOR
The number of subsets of
Problems: Various counting problems on CF.
Contest cheat sheet
+------------------------------------------------------------------+
| LINEAR BASIS (XOR BASIS) CHEAT SHEET |
+------------------------------------------------------------------+
| WHEN TO USE: |
| - Maximum/minimum XOR of a subset |
| - k-th smallest XOR value |
| - Count distinct achievable XOR values (2^rank) |
| - Check if value is XOR-representable |
| - XOR cycle space in graphs |
+------------------------------------------------------------------+
| KEY CODE: |
| bool insert(ll x) { |
| for (int i=59;i>=0;i--) { |
| if (!((x>>i)&1)) continue; |
| if (!basis[i]) { basis[i]=x; return true; } |
| x ^= basis[i]; |
| } |
| return false; |
| } |
+------------------------------------------------------------------+
| COMPLEXITY: |
| insert O(log V) | max_xor O(log V) |
| represent? O(log V) | kth_small O(log V) + reduce |
| merge O(log^2 V) | reduce O(log^2 V) |
| build O(n log V) | space O(log V) |
+------------------------------------------------------------------+
| REMEMBER: |
| - Call reduce() BEFORE kth_smallest |
| - Use long long (not int) for 60-bit values |
| - Distinct XOR count = 2^(basis size) |
| - Zero reachable iff sz < n (some element was dependent) |
+------------------------------------------------------------------+Gotchas and debugging
Using int instead of long long. Values can be up to 1LL << k must use long long. A bare 1 << 40 is undefined behavior.
Forgetting to reduce() before kth_smallest. Without reduction, basis elements share bits across slots. The bijection between
Off-by-one in basis size vs. element count. The basis size is the number of non-zero slots, not size() should count only successful insertions. Getting this wrong breaks the zero-reachability check and the
Empty basis edge case. If no elements are inserted (or all are 0), the basis is empty. max_xor() and min_xor() should return 0. kth_smallest with
Handling 0 in k-th smallest. Zero is always reachable when the basis rank is less than the number of inserted elements (some element was dependent). If 0 should be counted as the "first" value, subtract 1 from
cpp
if (zero_reachable) {
if (k == 1) { cout << 0; return; }
k--; // shift because 0 is the smallest
}
cout << lb.kth_smallest(k);Merging bases is
Re-inserting a value. Inserting the same value twice is harmless -- it reduces to 0 the second time. But if you track counts or indices, be careful.
Mental Traps
"The basis is unique -- insertion order doesn't matter." The span of the basis is unique (same vector space regardless of order), but the actual vectors stored in basis[] depend on insertion order. Insert {5, 7} vs {7, 5} and you get different entries. This never affects query results (max, min, representability), but it can confuse debugging when you compare basis arrays across two implementations.
INSERTION ORDER AFFECTS VECTORS, NOT SPAN:
Order A: insert 7(111), then 5(101) Order B: insert 5(101), then 7(111)
+----------------------------+ +----------------------------+
| basis[2] = 111 (from 7) | | basis[2] = 101 (from 5) |
| basis[1] = 010 (5^7=2) | | basis[1] = 010 (7^5=2) |
+----------------------------+ +----------------------------+
Different vectors ---- Same span = {0, 2, 5, 7}"Failed insertion means the element is useless." A failed insertion (
DEPENDENT != USELESS:
Elements: {7, 5, 3, 6} Basis rank = 3 Dependent: {6}
# subsets XOR-ing to any reachable target: 2^(4-3) = 2
XOR to 0: {} and {7,5,3,6}
XOR to 7: {7} and {5,3,6}
XOR to 2: {5,7} and {3,6}
Each dependent element DOUBLES the solution count.Self-Test
- [ ] Write XOR basis insertion from memory -- the loop from high bit to low, placement-vs-XOR branch
- [ ] Write the max-XOR greedy query and explain why higher bits dominate lower ones
- [ ] Given a basis of rank 3 built from 5 elements, state how many distinct XOR values are achievable (including 0) and how many subsets produce each value
- [ ] Explain why
kth_smallestrequiresreduce()first -- what breaks without reduced row echelon form? - [ ] Describe how to check if a value
is representable using the basis, without modifying it
Practice problems
| # | Problem | Source | Difficulty | Key Idea |
|---|---|---|---|---|
| 1 | Xor-MST | CF 888G | Hard (2300) | Divide-and-conquer on bits + basis |
| 2 | Shortest Path Problem? | CF 845G | Medium (2100) | Cycle XOR basis in graph |
| 3 | Zero XOR Subset | CF 1101G | Medium (1900) | Check if basis rank < |
| 4 | Mahmoud and Ehab and yet another xor task | CF 959F | Medium (2100) | Offline queries + basis |
| 5 | Linear Algebra Test | CF 678F | Medium (2100) | Basis over GF(2) |
| 6 | Two Teams Composing | CF 1335E2 | Easy (1800) | Basis counting |
| 7 | XOR on Segment | CF 242E | Medium (2000) | Segment tree + bitwise operations |
| 8 | Compatible Numbers | CF 165E | Medium (2200) | Bitmask DP / XOR structure |
Further reading
- CP-Algorithms: Linear Algebra over GF(2) -- general reference on Gaussian elimination for XOR
- Codeforces Blog: Linear Basis -- concise explanation with code
- Codeforces Blog: XOR basis tutorial -- detailed walkthrough with problems
- USACO Guide: Xor Basis -- additional problems and editorial-style explanations
The Aha Moment
A set of
The real "aha": building the XOR basis is just Gaussian elimination on binary numbers. Insert each number top-bit-first: if the highest set bit isn't claimed by an existing basis vector, this number becomes a new basis vector. If it is, XOR with that basis vector and continue. After processing all numbers, the basis has at most 60 elements, and you can answer "maximum XOR", "minimum XOR", "k-th smallest XOR", and "can we make value
Pattern Fingerprint
| Constraint / Signal | Technique |
|---|---|
| "Maximum XOR of any subset" | XOR basis -- greedily XOR basis vectors from high to low |
| "Can we obtain XOR value | Insert all values into basis; try to reduce |
| "Count distinct XOR values" | |
| "k-th smallest XOR of any subset" | Reduce basis to RREF, treat |
| "Maximum XOR path from | Any path XOR cycle XORs; insert all cycle XORs into basis |
| "Online insertion + max XOR query" | XOR basis supports |
| "XOR of subarray" with prefix XOR trick | Combine prefix XOR with XOR basis on a sliding window or offline |
Rating Progression
| CF Rating | What You Should Be Able To Do |
|---|---|
| 1200 | Know XOR properties: |
| 1500 | Understand that XOR basis exists and reduces |
| 1800 | Implement XOR basis for max/min/count queries; handle "XOR path in graph" via cycle space |
| 2100 | Implement k-th smallest XOR; combine XOR basis with data structures (segment tree, divide and conquer); handle online queries with incremental insertion |
Before You Code Checklist
- [ ] Initialize basis array to all zeros --
long long basis[60] = {} - [ ] Insert from the highest bit downward -- for each number, find its highest set bit, check if that bit's slot is occupied, XOR if so, and continue
- [ ] For "maximum XOR": greedily XOR each basis vector -- start with
, for each basis vector from high to low, set - [ ] For "k-th smallest": convert to reduced row echelon form -- each basis vector should have a unique leading bit with all other basis vectors having 0 in that position
- [ ] Don't forget the empty subset -- XOR of the empty subset is 0; whether this counts depends on the problem
What Would You Do If...?
Scenario 1: "Find the maximum XOR of a path between any two nodes in a graph." Any two paths between the same endpoints differ by a set of cycle XORs. So: find any spanning tree, compute path XOR from root to each node, then insert all "back edge" cycle XORs into an XOR basis. For a query
Scenario 2: "Find the k-th smallest XOR of any subset,
Scenario 3: "You need XOR basis queries on a subarray basis[r] represents elements
The Mistake That Teaches
Story: You're solving "maximum XOR of any subset." You build the basis, then to find the maximum, you XOR ALL basis vectors together. Test case: basis = {4, 2, 1} (binary: 100, 010, 001). XOR all:
Next test: basis = {7, 3} (binary: 111, 011). XOR all:
The fix: Use the greedy algorithm:
cpp
long long ans = 0;
for (int i = 59; i >= 0; i--)
if (basis[i]) ans = max(ans, ans ^ basis[i]);This works because ans ^ basis[i] > ans precisely when basis[i] has a set bit that ans doesn't -- and processing from high bits to low ensures we never undo a good choice.
Lesson: "XOR everything together" is NOT the maximum XOR. Use the greedy bit-by-bit approach.
Debug This
Bug 1:
Find the bug in this XOR basis insertion
cpp
long long basis[60];
void insert(long long val) {
for (int i = 59; i >= 0; i--) {
if (!(val >> i & 1)) continue;
if (basis[i]) {
val ^= basis[i];
} else {
basis[i] = val;
return;
}
}
// val became 0 -- linearly dependent, discard
}
// Maximum XOR query:
long long max_xor() {
long long ans = 0;
for (int i = 0; i < 60; i++) // low to high
ans = max(ans, ans ^ basis[i]);
return ans;
}Bug: The max_xor function iterates from low bits to high bits. This can produce suboptimal results because a low-bit basis vector might set a bit that a later high-bit vector would have set anyway (alongside other bits). The greedy algorithm must go from high to low to ensure the highest bits are set first.
Fix: Change to for (int i = 59; i >= 0; i--).
Bug 2:
Find the bug in this "can we make value X" check
cpp
bool can_make(long long x) {
for (int i = 59; i >= 0; i--) {
if (!(x >> i & 1)) continue;
if (basis[i]) x ^= basis[i];
else return true; // this bit can be set freely?
}
return x == 0;
}Bug: When basis[i] is 0 (no basis vector for bit true -- but this is backwards! If there's no basis vector for bit false.
Fix: Swap the return values:
cpp
else return false; // can't produce this bitAnd the final return x == 0; becomes return true; (if
Wait, actually the original final return x == 0 is correct for success. Just fix the else:
cpp
else return false;Bug 3:
Find the bug in this k-th smallest XOR
cpp
// After building basis, compute k-th smallest distinct XOR (1-indexed)
long long kth_xor(long long k) {
// Compact basis: collect non-zero basis vectors
vector<long long> b;
for (int i = 0; i < 60; i++)
if (basis[i]) b.push_back(basis[i]);
int r = b.size();
if (k > (1LL << r)) return -1; // only 2^r distinct values
long long ans = 0;
for (int i = 0; i < r; i++) {
if (k >> i & 1) ans ^= b[i];
}
return ans;
}Bug: This requires the basis to be in reduced row echelon form (RREF) for the bit-selector interpretation to work correctly. If the basis is in standard (non-reduced) form, basis vectors may share set bits, and the mapping from
Fix: Before calling kth_xor, reduce the basis to RREF:
cpp
for (int i = 59; i >= 0; i--)
if (basis[i])
for (int j = i + 1; j < 60; j++)
if (basis[j] >> i & 1) basis[j] ^= basis[i];Also: if the empty subset (XOR = 0) is a valid answer, adjust
Common Mistakes
- Querying k-th XOR without RREF. The standard basis form has shared lower bits between vectors. Reduce to RREF first or the bit-selector mapping is wrong.
- Forgetting the empty subset. XOR of zero elements is 0. If 0 is a valid answer, adjust k accordingly in k-th queries.
- Wrong bit width. Using 30 bits when values can reach 10^18 (need 60 bits).
- Inserting duplicates and expecting rank to increase. Inserting the same value twice reduces to 0 -- harmless but rank stays the same.
Historical Origin
The XOR basis is a specialization of Gaussian elimination over finite fields, a technique with roots in Évariste Galois's 19th-century work on field theory. The specific application to binary (GF(2)) problems in competitive programming became popular in the 2010s, driven by Codeforces problems requiring efficient XOR subset queries -- transforming what seems like an exponential search into a
Mnemonic Anchor
"60 basis vectors span every XOR -- insert high-bit-first, query greedily."
Recap and Cross-Links
| Concept | Link |
|---|---|
| Bit Manipulation | 07-bit-manipulation |
| Gaussian Elimination | 10-gaussian-elimination |
| Matrix Exponentiation | 09-matrix-exponentiation |
| Practice | Ladder - Mixed |