Lyndon Factorization

Quick Revisit
USE WHEN: Finding lexicographically smallest rotation or decomposing a string into Lyndon words
INVARIANT: s = w₁w₂…wₖ where each wᵢ is a Lyndon word and w₁ ≥ w₂ ≥ … ≥ wₖ lexicographically
TIME: O(n) | SPACE: O(1)
CLASSIC BUG: Not handling the case when s[j] < s[k] (new Lyndon word starts) vs s[j] > s[k] (emit factors of length period) — swapping the two produces wrong factorization
PRACTICE: 07-ladder-strings

Duval's algorithm decomposes any string into Lyndon words in $O (n)$ time and $O (1)$ space, enabling minimum rotation and cyclic string comparisons.

Difficulty	Prerequisites
Advanced (CF 1800-2100)	KMP and Z-Function

From Rotations to Lyndon Words
Watching the Algorithm Work
More Worked Examples
Relevant String Operations
Implementation
Complexity at a Glance
Problem Patterns and Tricks
Contest Cheat Sheet
Gotchas and Debugging
Self-Test
Practice Problems
Further Reading

From Rotations to Lyndon Words

Suppose you need the lexicographically smallest rotation of "cabba". The five rotations are:

Start	Rotation
0	`cabba`
1	`abbac`
2	`bbaca`
3	`bacab`
4	`acabb`

The answer is "abbac" (start index 1). The brute-force way generates all $n$ rotations and sorts them -- $O (n^{2} \log n)$ at best. We can do it in $O (n)$ with zero hassle, and the road there passes through a beautiful concept called a Lyndon word.

Lyndon words

A string $w$ is a Lyndon word if it is strictly less than every one of its proper suffixes. "Strictly" is the key word -- equal doesn't count.

String	Lyndon?	Why
`"a"`	Yes	No proper suffixes to violate the rule
`"ab"`	Yes	`"ab"` $<$ `"b"`
`"aab"`	Yes	`"aab"` $<$ `"ab"` $<$ `"b"`
`"abcd"`	Yes	Strictly increasing -- always smaller than any suffix
`"ba"`	No	`"ba"` $>$ `"a"`
`"aa"`	No	`"aa"` $>$ `"a"` (prefix comparison: `"a"` is shorter)
`"abab"`	No	Not primitive -- repeats `"ab"`

  Checking if "aab" is a Lyndon word:

  "aab" vs all proper suffixes:
  
  "aab"  <  "ab"  ?   a < a? equal -> a < b? YES  OK
  "aab"  <  "b"   ?   a < b? YES                  OK
  
  All suffixes are strictly larger -> "aab" IS Lyndon
  
  Checking if "abab" is a Lyndon word:
  
  "abab" <  "bab"  ?  a < b? YES                  OK
  "abab" <  "ab"   ?  a=a, b=b, a=end -> "ab" shorter
                       "abab" > "ab"               X  FAIL
  
  Not all suffixes are strictly larger -> "abab" is NOT Lyndon

Two handy equivalences for a Lyndon word $w$ :

$w$ is the unique smallest rotation of itself. Every other rotation is strictly larger.
$w$ is primitive (cannot be written as a shorter string repeated) and is the smallest rotation of itself.

If either condition fails, $w$ isn't Lyndon.

The factorization theorem

Every string $s$ has a unique decomposition

s = w_{1} w_{2} \dots w_{k}

where each $w_{i}$ is a Lyndon word and

w_{1} \geq w_{2} \geq \dots \geq w_{k}

(lexicographically non-increasing). This is the Lyndon factorization of $s$ .

Let's work through "abbaabba" by hand. We need to split it into non-increasing Lyndon words:

Start scanning from the left. Is "a" Lyndon? Yes (single char). Can we extend? "ab" is Lyndon ("ab" $<$ "b"). Keep going: "abb" -- check suffixes: "abb" $<$ "bb" $<$ "b". Still Lyndon.
Try extending to "abba". Check "abba" against suffix "a": "abba" $>$ "a". That breaks it -- "abba" is not Lyndon.
First Lyndon word: "abb".
Remaining string: "aabba". Try "a", "aa" (not Lyndon -- "aa" $>$ "a"). Try "aab": "aab" $<$ "ab" $<$ "b". Lyndon. Try "aabb": "aabb" $<$ "abb" $<$ "bb" $<$ "b". Still Lyndon.
Try "aabba" -- suffix "a": "aabba" $>$ "a". Not Lyndon.
Second Lyndon word: "aabb".
Remaining: "a". Single character -- always Lyndon. Third word: "a".

Result: "abb" | "aabb" | "a".

Check non-increasing: "abb" $>$ "aabb" (position 1: b $>$ a) and "aabb" $>$ "a". Correct.

Duval's algorithm

Chen, Fox, and Lyndon proved the factorization theorem exists; Duval gave an $O (n)$ time, $O (1)$ extra space algorithm to compute it. The algorithm uses three variables:

$i$ : start of the current group being factorized -- this never moves backward.
$j$ : scanning position, moving right through the string.
$k$ : comparison position within the candidate Lyndon word pattern. The current candidate Lyndon word is $s [i . . j - 1]$ , and $s [k]$ is the position in that candidate that the new character $s [j]$ is being compared against.

Starting with $j = i + 1$ and $k = i$ , the algorithm compares $s [j]$ with $s [k]$ and handles three cases. The two unequal cases are the heart of Duval -- swap them and the algorithm produces wrong factorizations. The mental picture for each:

$s [j] = s [k]$ (repetition tracking). The new character matches the corresponding position in our candidate Lyndon word. We might be inside a repeated copy of the same word. Advance both $j$ and $k$ -- $k$ moves through the candidate word's positions in lockstep, $j$ moves through the actual string.

$s [j] > s [k]$ (extend a better candidate). The new character is larger than what the candidate word would have predicted. This is good news for the Lyndon property: a single character larger than every previous character strictly extends the candidate to a longer Lyndon word covering all of $s [i . . j]$ . Reset $k$ back to $i$ -- future characters are now compared against the start of this new, longer word -- and advance $j$ .

$s [j] < s [k]$ (emit factors of the discovered period and restart). The new character is smaller than predicted. The current Lyndon word cannot extend any further: appending $s [j]$ would put a smaller character in a position where the word's structure required something at least as large. At this moment the word has fully crystallized: its length is the period $ℓ = j - k$ , and the segment $s [i . . j - 1]$ contains some number of complete copies of it (one or more). Emit each complete copy, advance $i$ past them all, and restart the outer scan from there.

The asymmetry between the two unequal cases is what makes the algorithm correct: $s [j] > s [k]$ means the candidate is still growing, $s [j] < s [k]$ means the candidate is finished and its repetitions need to be output. If you mix them up -- e.g., reset $k$ on the smaller case -- the algorithm produces non-Lyndon factors or the wrong number of factors.

The whole thing terminates when $i$ reaches $n$ , and the total work across all iterations is $O (n)$ .

  How i, j, k evolve on a tiny string s = "ab a ab" (= "abaab"):

  +---+---+---+--------+--------+---------+----------------------------+
  | i | j | k |  s[j]  |  s[k]  | compare | action                     |
  +---+---+---+--------+--------+---------+----------------------------+
  | 0 | 1 | 0 |   b    |   a    |  j > k  | extend: k -> 0, j -> 2     |
  | 0 | 2 | 0 |   a    |   a    |  j = k  | repeat: k -> 1, j -> 3     |
  | 0 | 3 | 1 |   a    |   b    |  j < k  | BREAK; period = j - k = 2  |
  +---+---+---+--------+--------+---------+----------------------------+
  Emit: copies of length 2 starting at i=0 while i <= k=1.
        First emit "ab" (positions 0..1), i -> 2.  Now i = 2 > k = 1, stop.

  +---+---+---+--------+--------+---------+----------------------------+
  | 2 | 3 | 2 |   a    |   a    |  j = k  | repeat: k -> 3, j -> 4     |
  | 2 | 4 | 3 |   b    |   a    |  j > k  | extend: k -> 2, j -> 5=n   |
  +---+---+---+--------+--------+---------+----------------------------+
  j reached end. period = j - k = 5 - 2 = 3.
  Emit one copy of length 3 starting at i=2: "aab" (positions 2..4), i -> 5.

  Result for "abaab":  "ab" | "aab"
  Check non-increasing: "ab" >= "aab" (b > a at pos 1)  OK

Minimum rotation via Lyndon factorization

Here's the key trick: to find the smallest rotation of $s$ , run Duval on the conceptual doubled string $s + s$ (using modular indexing to avoid actually allocating it). Since the Lyndon factorization of $s + s$ is non-increasing, the last Lyndon word boundary that starts before position $n$ marks the start of the minimum rotation.

Why? Later words in the factorization are lexicographically $\leq$ earlier ones. The last boundary before position $n$ corresponds to the smallest Lyndon word that aligns with a valid rotation start -- and a Lyndon word is the smallest rotation of itself.

This gives $O (n)$ time and $O (1)$ extra space.

What the Code Won't Teach You

Why Duval's algorithm is truly $O (n)$ . The 15-line implementation hides an amortized argument: variable $i$ (the start of the current group) never decreases and advances a total of exactly $n$ across all iterations. When the comparison $s [j] < s [k]$ triggers a break, $i$ jumps forward by at least one Lyndon-word length. The inner comparisons are charged to the advancement of $j$ , which also never retreats. So the total work across all rounds is $O (n)$ -- but you'll never see this from reading the code alone.

The Burrows-Wheeler connection. Sorting all rotations of a string is the core of the Burrows-Wheeler Transform (used in bzip2 compression). The SA-IS suffix array algorithm classifies suffixes into S-type and L-type, mirroring Lyndon-word structure. Understanding Lyndon factorization gives you a second lens on why suffix array construction works.

Counting Lyndon words is a combinatorics problem. The number of Lyndon words of length $n$ over a $k$ -letter alphabet is $\frac{1}{n} \sum_{d ∣ n} μ (n / d) k^{d}$ (Möbius inversion). This formula appears in necklace-counting problems and connects string algorithms to number theory -- a bridge the implementation alone never reveals.

The "aha moment": uniqueness of the factorization. The Chen-Fox-Lyndon theorem says every string has a unique factorization into Lyndon words $w_{1}, w_{2}, \dots, w_{k}$ with $w_{1} \geq w_{2} \geq \dots \geq w_{k}$ . This isn't just an existence result -- it's what makes the factorization useful. Because the decomposition is unique and canonical, it serves as a normal form: two strings have the same Lyndon factorization if and only if they share specific structural properties. This canonicality is what makes it useful as a building block in proofs and reductions, not just as an algorithm.

Contest frequency is low, but when it appears, nothing else works. Lyndon factorization shows up in maybe 1 in 200 contest problems -- but those problems are essentially unsolvable without it. The classic signal is "find the lexicographically smallest rotation" (direct application), but subtler appearances include string comparison problems where you need a canonical decomposition, or problems involving primitive roots of strings. The technique is worth knowing at the 1900+ level precisely because it's rare enough that most competitors don't know it, giving you a decisive edge when it appears.

Minimum rotation is the gateway application. The most common contest use of Lyndon factorization: given a string $s$ , find the starting index of its lexicographically smallest rotation. Build the Lyndon factorization of $s + s$ , and the answer is the starting position of the last Lyndon factor that begins before index $n$ . This runs in $O (n)$ time and $O (1)$ space -- strictly better than the $O (n)$ time $O (n)$ space of Booth's algorithm below. If you know Duval's algorithm, you get minimum rotation for free.

Booth's algorithm

Booth's algorithm is an independent $O (n)$ approach to minimum rotation that uses a modified failure function, similar to KMP's prefix function. It processes the doubled string $s + s$ , maintaining:

A failure function array $f$ (like KMP's), indexed relative to the current best candidate.
A candidate starting position $k$ for the best rotation seen so far.

At each step, when a character comparison fails:

If the new character is smaller than expected, we've found a potentially better starting position -- update $k$ .
Otherwise, fall back using the failure function, just like KMP.

Booth's uses $O (n)$ extra space for the failure function array. Since the Duval-based approach achieves the same result with $O (1)$ space, it is generally preferable in contests. But Booth's is still worth knowing -- some older editorials reference it, and the KMP connection deepens your intuition for failure functions.

Watching the Algorithm Work

Duval's algorithm on "abbaabba"

  String:  a   b   b   a   a   b   b   a
  Index:   0   1   2   3   4   5   6   7

  Round 1 (i = 0):
  +----+----+------+------+---------+-----------+
  |  j |  k | s[j] | s[k] | Compare | Action    |
  +----+----+------+------+---------+-----------+
  |  1 |  0 |  b   |  a   |  j > k  | k=0, ext  |
  |  2 |  0 |  b   |  a   |  j > k  | k=0, ext  |
  |  3 |  0 |  a   |  a   |  j = k  | k=1, rep  |
  |  4 |  1 |  a   |  b   |  j < k  | BREAK     |
  +----+----+------+------+---------+-----------+
  len = j - k = 4 - 1 = 3
  Output: "abb"  (positions 0..2)
  i advances to 3

  Round 2 (i = 3):
  +----+----+------+------+---------+-----------+
  |  j |  k | s[j] | s[k] | Compare | Action    |
  +----+----+------+------+---------+-----------+
  |  4 |  3 |  a   |  a   |  j = k  | k=4, rep  |
  |  5 |  4 |  b   |  a   |  j > k  | k=3, ext  |
  |  6 |  3 |  b   |  a   |  j > k  | k=3, ext  |
  |  7 |  3 |  a   |  a   |  j = k  | k=4, rep  |
  |  8 |    |      |      | j >= n  | END       |
  +----+----+------+------+---------+-----------+
  len = j - k = 8 - 4 = 4
  Output: "aabb"  (positions 3..6)
  i advances to 7

  Round 3 (i = 7):
  j = 8 >= n immediately.
  Output: "a"  (position 7)
  i advances to 8.  Done.

Factorization result

  a   b   b | a   a   b   b | a
  +---------+---------------+---+
  |  "abb"  |    "aabb"     |"a"|
  +---------+---------------+---+
    Lyndon   >=   Lyndon   >= L

All three are Lyndon words, and they appear in non-increasing order.

Minimum rotation of "cabba"

  s = "cabba"  (n = 5)

  Doubled (conceptual):  c  a  b  b  a  c  a  b  b  a
                          0  1  2  3  4  5  6  7  8  9

  Duval on doubled string (modular indexing):

  Round 1: i=0
    s[0]='c' vs s[1]='a' -> 'c' > 'a' -> BREAK immediately
    Output: "c" (len 1), i -> 1

  Round 2: i=1   (i < n=5, so ans = 1)
    Scans j = 2..9, building the Lyndon word "abbac"
    s[1]='a' < s[2]='b' -> extend
    s[1]='a' < s[3]='b' -> extend
    s[1]='a' = s[4]='a' -> repetition track, k -> 2
    s[2]='b' < s[5]='c' -> extend (k resets to 1)
    ...continues matching through j=10 -> END
    Output: "abbac" (len 5), i -> 6

  i = 6 >= n = 5, outer loop exits.
  Last boundary before n: position 1.

  Answer: rotation starting at index 1 = "abbac"

  +-------+---------+
  | Start | Rotation|
  +-------+---------+
  |   0   | cabba   |
  |   1   | abbac   | <-- minimum
  |   2   | bbaca   |
  |   3   | bacab   |
  |   4   | acabb   |
  +-------+---------+

More Worked Examples

These additional examples hit the edge cases that trip people up in contests.

All identical characters: "aaaa"

Every single character is its own Lyndon word because "aa" is NOT Lyndon ("aa" $>$ "a" by prefix comparison). The algorithm sees nothing but equality, and when $j$ reaches the end of the string, it outputs single-character words for the entire length.

  String:  a   a   a   a
  Index:   0   1   2   3

  Round 1 (i = 0):
  +----+----+------+------+---------+-----------+
  |  j |  k | s[j] | s[k] | Compare | Action    |
  +----+----+------+------+---------+-----------+
  |  1 |  0 |  a   |  a   |  j = k  | k=1, rep  |
  |  2 |  1 |  a   |  a   |  j = k  | k=2, rep  |
  |  3 |  2 |  a   |  a   |  j = k  | k=3, rep  |
  |  4 |    |      |      | j >= n  | END       |
  +----+----+------+------+---------+-----------+
  len = j - k = 4 - 3 = 1
  Output: "a" (pos 0), "a" (pos 1), "a" (pos 2), "a" (pos 3)
  i advances to 4.  Done.

  Result: "a" | "a" | "a" | "a"

  Each is trivially Lyndon.
  Non-increasing: "a" >= "a" >= "a" >= "a"  OK

This is the most common edge case to test -- if your implementation gets stuck in an infinite loop on all-identical input, the inner while condition is wrong.

Repeated Lyndon word: "ababab"

When a string is a perfect repetition of a Lyndon word, Duval's algorithm detects the pattern through its equality-tracking mechanism. Variable $k$ advances in lockstep with $j$ , and when $j$ reaches the end, the algorithm knows exactly how many copies it found.

  String:  a   b   a   b   a   b
  Index:   0   1   2   3   4   5

  Round 1 (i = 0):
  +----+----+------+------+---------+-----------+
  |  j |  k | s[j] | s[k] | Compare | Action    |
  +----+----+------+------+---------+-----------+
  |  1 |  0 |  b   |  a   |  j > k  | k=0, ext  |
  |  2 |  0 |  a   |  a   |  j = k  | k=1, rep  |
  |  3 |  1 |  b   |  b   |  j = k  | k=2, rep  |
  |  4 |  2 |  a   |  a   |  j = k  | k=3, rep  |
  |  5 |  3 |  b   |  b   |  j = k  | k=4, rep  |
  |  6 |    |      |      | j >= n  | END       |
  +----+----+------+------+---------+-----------+
  len = j - k = 6 - 4 = 2
  Output: "ab" (pos 0), "ab" (pos 2), "ab" (pos 4)
  i advances to 6.  Done.

  Result: "ab" | "ab" | "ab"

  Each factor is the Lyndon word "ab".
  Non-increasing: "ab" >= "ab" >= "ab"  OK

Notice how the algorithm never backtracks. Variable $j$ scans forward once; the equality branch tracks which "copy number" we're seeing, and at the end the word length $ℓ = j - k$ tells us the period. This is how Lyndon factorization reveals periodic structure for free.

Decreasing string: "dcba"

When every character is smaller than the one before it, nothing can extend -- each character breaks immediately.

  String:  d   c   b   a
  Index:   0   1   2   3

  Round 1: i=0, j=1, k=0
    s[1]='c' < s[0]='d' -> BREAK
    Output: "d" (pos 0),  i -> 1

  Round 2: i=1, j=2, k=1
    s[2]='b' < s[1]='c' -> BREAK
    Output: "c" (pos 1),  i -> 2

  Round 3: i=2, j=3, k=2
    s[3]='a' < s[2]='b' -> BREAK
    Output: "b" (pos 2),  i -> 3

  Round 4: i=3, j=4 >= n -> END
    Output: "a" (pos 3),  i -> 4

  Result: "d" | "c" | "b" | "a"
  Non-increasing: d >= c >= b >= a  OK

A fully decreasing string produces the maximum number of Lyndon factors -- one per character. Conversely, a fully increasing string like "abcd" is itself a single Lyndon word (since "abcd" $<$ every proper suffix).

A real-world string: "banana"

This hits an interesting case where the algorithm finds multiple copies of the same Lyndon word embedded in a natural word.

  String:  b   a   n   a   n   a
  Index:   0   1   2   3   4   5

  Round 1 (i = 0):
    j=1, k=0: s[1]='a' < s[0]='b' -> BREAK
    len = 1,  Output: "b" (pos 0),  i -> 1

  Round 2 (i = 1):
  +----+----+------+------+---------+-----------+
  |  j |  k | s[j] | s[k] | Compare | Action    |
  +----+----+------+------+---------+-----------+
  |  2 |  1 |  n   |  a   |  j > k  | k=1, ext  |
  |  3 |  1 |  a   |  a   |  j = k  | k=2, rep  |
  |  4 |  2 |  n   |  n   |  j = k  | k=3, rep  |
  |  5 |  3 |  a   |  a   |  j = k  | k=4, rep  |
  |  6 |    |      |      | j >= n  | END       |
  +----+----+------+------+---------+-----------+
  len = j - k = 6 - 4 = 2
  Output: "an" (pos 1), "an" (pos 3)
  i advances to 5.  But wait -- i=5 > k=4, so the while(i<=k) loop
  only outputs two copies and leaves i=5.

  Round 3 (i = 5):
    j=6 >= n immediately.
    Output: "a" (pos 5),  i -> 6.  Done.

  Result: "b" | "an" | "an" | "a"

  Check: "b" >= "an"? (b > a at position 0)   OK
         "an" >= "an"? (equal)                  OK
         "an" >= "a"?  (equal prefix, "an" longer) OK

  All Lyndon: "b" OK, "an" (an < n) OK, "a" OK

Minimum rotation of "dbacd"

A slightly trickier minimum-rotation example showing how Duval on the doubled string tracks the answer.

  s = "dbacd"  (n = 5)

  Doubled (conceptual):  d  b  a  c  d  d  b  a  c  d
                          0  1  2  3  4  5  6  7  8  9

  Round 1: i=0
    s[0]='d' vs s[1]='b' -> 'd' > 'b' -> BREAK
    Output: "d" (len 1),  i -> 1

  Round 2: i=1   (i < n=5, so ans = 1)
    s[1]='b' vs s[2]='a' -> 'b' > 'a' -> BREAK
    Output: "b" (len 1),  i -> 2

  Round 3: i=2   (i < n=5, so ans = 2)
    s[2]='a' vs s[3]='c' -> extend (c > a), k=2
    s[2]='a' vs s[4]='d' -> extend (d > a), k=2
    s[2]='a' vs s[5]='d' -> extend (d > a), k=2
    s[2]='a' vs s[6]='b' -> extend (b > a), k=2
    s[2]='a' vs s[7]='a' -> equal, k=3
    s[3]='c' vs s[8]='c' -> equal, k=4
    s[4]='d' vs s[9]='d' -> equal, k=5
    j=10 >= 2n -> END
    len = 10 - 5 = 5
    Output: "acddb" (positions 2..6), i -> 7

  i = 7 >= n = 5, outer loop exits.
  Last boundary before n: position 2.

  Answer: rotation starting at index 2 = "acddba"

  Wait -- let's verify:
  +-------+-----------+
  | Start | Rotation  |
  +-------+-----------+
  |   0   | dbacd     |
  |   1   | bacdd     |
  |   2   | acddb     | <-- minimum
  |   3   | cddba     |
  |   4   | ddbac     |
  +-------+-----------+

  "acddb" is the lexicographically smallest.  OK

Relevant String Operations

These standard library features are useful alongside Lyndon factorization:

Operation	Syntax	Notes
Lexicographic compare	`s.compare(pos, len, t, pos2, len2)`	Returns $< 0$ , $0$ , $> 0$
Substring	`s.substr(pos, len)`	$O (len)$ copy
Rotate in-place	`rotate(s.begin(), s.begin()+k, s.end())`	Left-rotate by $k$
Concatenation	`s + s`	For doubled-string tricks
Operator `<`	`s < t`	Lexicographic; $O (min (\| s \|, \| t \|))$

In contest code you rarely call these -- the algorithms below work character by character. But rotate is handy for verification and testing.

Implementation

Contest template

Short, copy-paste ready. Three functions covering all common needs.

cpp

#include <bits/stdc++.h>
using namespace std;

vector<string> duval(const string &s) {
    int n = s.size();
    vector<string> res;
    int i = 0;
    while (i < n) {
        int j = i + 1, k = i;
        while (j < n && s[k] <= s[j]) {
            if (s[k] < s[j]) k = i;
            else k++;
            j++;
        }
        while (i <= k) {
            res.push_back(s.substr(i, j - k));
            i += j - k;
        }
    }
    return res;
}

int min_rotation(const string &s) {
    int n = s.size();
    int i = 0, ans = 0;
    while (i < n) {
        ans = i;
        int j = i + 1, k = i;
        while (j < 2 * n && s[k % n] <= s[j % n]) {
            if (s[k % n] < s[j % n]) k = i;
            else k++;
            j++;
        }
        while (i <= k) i += j - k;
    }
    return ans;
}

int booth(const string &s) {
    int n = s.size();
    string t = s + s;
    vector<int> f(2 * n, -1);
    int k = 0;
    for (int j = 1; j < 2 * n; j++) {
        int i = f[j - 1 - k];
        while (i != -1 && t[j] != t[k + i + 1]) {
            if (t[j] < t[k + i + 1]) k = j - i - 1;
            i = f[i];
        }
        if (t[j] != t[k + i + 1]) {
            if (t[j] < t[k + i + 1]) k = j;
            f[j - k] = -1;
        } else {
            f[j - k] = i + 1;
        }
    }
    return k;
}

int main() {
    string s;
    cin >> s;

    auto fact = duval(s);
    for (auto &w : fact) cout << "[" << w << "]";
    cout << "\n";

    int idx1 = min_rotation(s);
    int idx2 = booth(s);
    cout << "Min rotation at " << idx1 << ": "
         << s.substr(idx1) + s.substr(0, idx1) << "\n";
    cout << "Booth confirms: " << idx2 << "\n";
}

Explained version

Same algorithms with detailed commentary on every step.

cpp

#include <bits/stdc++.h>
using namespace std;

/*
 * Duval's algorithm for Lyndon factorization.
 *
 * Returns the unique decomposition s = w1 w2 ... wk where each wi
 * is a Lyndon word and w1 >= w2 >= ... >= wk lexicographically.
 *
 * Time: O(n)   Space: O(1) beyond the output vector
 */
vector<string> duval(const string &s) {
    int n = s.size();
    vector<string> res;
    int i = 0;

    while (i < n) {
        // j scans forward; k tracks our position inside the candidate
        // Lyndon word for comparison (starts at the word's beginning).
        int j = i + 1;
        int k = i;

        while (j < n && s[k] <= s[j]) {
            if (s[k] < s[j]) {
                // s[j] is strictly larger than the corresponding position
                // in the candidate word. The entire segment i..j becomes
                // one bigger Lyndon word. Reset k to start comparing
                // future characters against the beginning of this word.
                k = i;
            } else {
                // s[j] matches exactly. We might be seeing another copy
                // of the same Lyndon word repeating. Advance k to keep
                // tracking which position in the pattern we're matching.
                k++;
            }
            j++;
        }

        // Now s[j] < s[k] or j reached end-of-string.
        // The Lyndon word has length (j - k). Positions i..j-1 contain
        // one or more complete copies of it (plus maybe a prefix that
        // belongs to the next group -- but i won't reach that far).
        int len = j - k;
        while (i <= k) {
            res.push_back(s.substr(i, len));
            i += len;
        }
        // i has moved past all complete copies. Any leftover partial
        // match is already at the correct position for the next round.
    }

    return res;
}

/*
 * Minimum rotation via Duval on the doubled string.
 *
 * Conceptually runs Lyndon factorization on s+s using modular indexing
 * (no actual concatenation). The last Lyndon word boundary that starts
 * before position n is the answer.
 *
 * Time: O(n)   Space: O(1)
 */
int min_rotation(const string &s) {
    int n = s.size();
    if (n == 0) return 0;

    int i = 0, ans = 0;

    while (i < n) {
        // Every time we enter this loop with i < n, we have a valid
        // rotation candidate. Record it -- the last one wins because
        // the factorization is non-increasing.
        ans = i;

        int j = i + 1, k = i;

        // Same Duval logic, but on a string of length 2n accessed
        // via modular indexing. This lets us "see" rotations that
        // wrap around the original string boundary.
        while (j < 2 * n && s[k % n] <= s[j % n]) {
            if (s[k % n] < s[j % n]) k = i;
            else k++;
            j++;
        }

        // Advance i past all complete copies of this Lyndon word.
        while (i <= k) i += j - k;
    }

    return ans;
}

/*
 * Booth's algorithm -- minimum rotation via modified failure function.
 *
 * Processes the doubled string s+s, maintaining a candidate start k.
 * The failure function f[] is built relative to the current candidate,
 * similar to KMP's prefix function but updated whenever a smaller
 * rotation is discovered.
 *
 * Time: O(n)   Space: O(n)
 */
int booth(const string &s) {
    int n = s.size();
    if (n == 0) return 0;

    string t = s + s;
    vector<int> f(2 * n, -1);
    int k = 0;  // best starting position so far

    for (int j = 1; j < 2 * n; j++) {
        // i is the fallback position relative to candidate k.
        int i = f[j - 1 - k];

        // Walk back through the failure chain on mismatch.
        while (i != -1 && t[j] != t[k + i + 1]) {
            // If t[j] is smaller, we found a better candidate.
            if (t[j] < t[k + i + 1]) k = j - i - 1;
            i = f[i];
        }

        if (t[j] != t[k + i + 1]) {
            // Still mismatched after exhausting the failure chain.
            if (t[j] < t[k + i + 1]) k = j;
            f[j - k] = -1;
        } else {
            // Characters match -- extend the failure function.
            f[j - k] = i + 1;
        }
    }

    return k;
}

int main() {
    string s;
    cin >> s;

    cout << "Lyndon factorization:\n";
    auto fact = duval(s);
    for (auto &w : fact) cout << "  [" << w << "]";
    cout << "\n";

    int idx = min_rotation(s);
    string rotated = s.substr(idx) + s.substr(0, idx);
    cout << "Smallest rotation: \"" << rotated
         << "\" (starting at index " << idx << ")\n";

    int idx2 = booth(s);
    cout << "Booth confirms: index " << idx2 << "\n";
}

Complexity at a Glance

Operation	Time	Space	Notes
Lyndon factorization (Duval)	$O (n)$	$O (1)$ *	*Excluding the output vector
Min rotation (Duval, mod-index)	$O (n)$	$O (1)$	Preferred in contests
Min rotation (Booth)	$O (n)$	$O (n)$	Failure array of size $2 n$
Min rotation (Duval on `s+s`)	$O (n)$	$O (n)$	Explicit string doubling
Naive all-rotations sort	$O (n^{2} \log n)$	$O (n^{2})$	For comparison only

All three $O (n)$ methods make a single pass (or equivalent) and involve no hashing, no suffix structures, and no recursion. Duval with modular indexing is the lightest -- no allocations at all.

Problem Patterns and Tricks

1. Minimum cyclic rotation

The bread and butter. Given a string, find the lexicographically smallest rotation. Call min_rotation() and you're done.

cpp

string smallest_rotation(const string &s) {
    int k = min_rotation(s);
    return s.substr(k) + s.substr(0, k);
}

Appears whenever a problem says "cyclic" and "lexicographic" in the same sentence -- necklaces, circular DNA sequences, rotating lock combinations.

Why it works -- the deeper explanation. The Lyndon factorization of $s + s$ is a non-increasing sequence of Lyndon words. A Lyndon word is the unique smallest rotation of itself. So among all the words in the factorization of $s + s$ , the last one starting before position $n$ corresponds to the lexicographically smallest segment that aligns with a valid rotation boundary. The "non-increasing" guarantee means later = smaller-or-equal, so the last boundary wins.

  Why "last boundary before n" works:

  s = "cabba" -> s+s = "cabbacabba"

  Factorization of s+s:
    "c" | "abbac" | "abba"
     ^       ^
     i=0     i=1 (< n=5, record ans=1)

  "c" >= "abbac" >= "abba"  (non-increasing)

  The boundary at i=1 gives rotation "abbac" -- which IS the smallest.
  We don't need to compare all rotations; the non-increasing property
  guarantees the last valid boundary is the best.

Problems: CSES 1110 "Minimal Rotation", SPOJ MINMOVE

2. Necklace equivalence

Two strings represent the same "necklace" if one is a rotation of the other. To test equivalence, compute the canonical (smallest) rotation of each and compare.

cpp

bool same_necklace(const string &a, const string &b) {
    if (a.size() != b.size()) return false;
    int n = a.size();
    int ka = min_rotation(a), kb = min_rotation(b);
    for (int i = 0; i < n; i++)
        if (a[(ka + i) % n] != b[(kb + i) % n])
            return false;
    return true;
}

To group $m$ strings into necklace equivalence classes, compute all canonical forms in $O (m n)$ total, then sort and deduplicate.

Worked example -- counting distinct necklaces.

Given {"abc", "bca", "cab", "xyz", "zxy"}, how many distinct necklaces?

  Canonical rotations:
    "abc" -> min_rotation = 0 -> "abc"
    "bca" -> min_rotation = 1 -> "abc"    (same necklace!)
    "cab" -> min_rotation = 2 -> "abc"    (same necklace!)
    "xyz" -> min_rotation = 0 -> "xyz"
    "zxy" -> min_rotation = 1 -> "xyz"    (same necklace!)

  Distinct canonical forms: {"abc", "xyz"}
  Answer: 2 distinct necklaces.

This shows up in chemistry (molecular ring isomers), music theory (pitch class sets), and combinatorics. The $O (n)$ per-string cost of min_rotation makes it practical even for large datasets.

Problems: Kattis "minrotation", classical necklace counting

  Necklace equivalence via canonical rotation:

  String A: "c a b"        String B: "b c a"
            ╭───╮                    ╭───╮
            │ c │                    │ b │
           ╱     ╲                  ╱     ╲
         ╭───╮ ╭───╮            ╭───╮ ╭───╮
         │ b │ │ a │            │ a │ │ c │
         ╰───╯ ╰───╯            ╰───╯ ╰───╯

  min_rotation("cab") = 1 -> "abc"
  min_rotation("bca") = 2 -> "abc"

  Same canonical form -> same necklace OK

3. Cyclic substring matching

Given a pattern $p$ and a text $t$ , find all positions where any rotation of $p$ occurs as a substring of $t$ .

Strategy: compute the canonical rotation of $p$ , double $t$ to $t + t$ , and search $t + t$ for the canonical form using KMP or Z-function. This handles the wrap-around nature of rotations.

For the classic variant -- counting distinct rotations of $p$ that appear in $t$ -- the approach combines Lyndon factorization with a suffix automaton:

cpp

// Count how many distinct rotations of p appear as substrings of t
int count_cyclic_occurrences(const string &t, const string &p) {
    int n = p.size();
    // Build suffix automaton on t
    // Feed each rotation of p (= substrings of p+p of length n)
    // into the automaton, collecting distinct matches.
    // Lyndon factorization helps identify the canonical
    // starting point and avoid redundant checks.
    // See CF 235C for the full approach.
}

The key insight: rather than checking all $n$ rotations separately, the suffix automaton lets you slide through them in $O (n)$ total by advancing one character and dropping one character from the current window.

Problems: CF 235C "Cyclical Quest"

4. Lexicographic comparison of cyclic strings

Sometimes the problem isn't to find the minimum rotation but to compare two cyclic strings. Are all rotations of $a$ lexicographically less than all rotations of $b$ ? Or: given two positions $i, j$ in a circular string, which rotation is smaller?

The approach: compute min_rotation for both strings. If you need to compare arbitrary rotations (not just the minimum), you can combine Lyndon factorization with suffix arrays -- but for most contest problems, canonicalization via min_rotation + direct comparison is sufficient.

cpp

// Compare cyclic strings: is every rotation of a < every rotation of b?
// Equivalent to: canonical(a) < canonical(b)
bool cyclic_less(const string &a, const string &b) {
    int ka = min_rotation(a), kb = min_rotation(b);
    int n = a.size(), m = b.size();
    for (int i = 0; i < min(n, m); i++) {
        if (a[(ka + i) % n] != b[(kb + i) % m])
            return a[(ka + i) % n] < b[(kb + i) % m];
    }
    return n < m;
}

5. Suffix array connection

The Lyndon factorization is deeply connected to suffix arrays. In the SA-IS algorithm (one of the fastest linear-time suffix array constructions), the classification of suffixes into S-type and L-type mirrors the Lyndon word structure.

More directly: all proper suffixes of a Lyndon word are lexicographically larger than the word itself. This means suffixes within a Lyndon factor appear in a predictable order in the suffix array, which some problems exploit for faster construction or querying.

  Connection between Lyndon factors and suffix array ordering:

  s = "abb|aabb|a"   (Lyndon factorization from earlier)

  Within "abb" (a Lyndon word), suffixes are:
    "abb" < "bb" < "b"
  These appear in this relative order in the suffix array.

  Within "aabb" (a Lyndon word), suffixes are:
    "aabb" < "abb" < "bb" < "b"
  Again, the word itself comes first among its suffixes.

  This ordering property is what SA-IS exploits for linear-time
  suffix array construction.

6. String periodicity analysis

The Lyndon factorization reveals periodic structure. If the factorization of $s$ consists of $k$ copies of the same Lyndon word $w$ followed by a prefix of $w$ , then $s$ has period $| w |$ .

cpp

bool has_period(const string &s, int p) {
    auto fact = duval(s);
    if (fact.empty()) return false;
    const string &w = fact[0];
    if ((int)w.size() != p) return false;
    for (auto &f : fact)
        if (f != w && w.substr(0, f.size()) != f)
            return false;
    return true;
}

This connects to the KMP failure function: the smallest period of $s$ equals $n - π [n - 1]$ . Lyndon factorization gives an alternative lens on the same structure.

Worked example -- detecting periods via factorization.

  s = "abcabcab"

  Lyndon factorization: "abcabcab"
  Wait -- is "abcabcab" Lyndon? Check: "abcabcab" < "bcabcab"? Yes (a<b).
  "abcabcab" < "cabcab"? Yes (a<c). "abcabcab" < "abcab"? Same prefix...
  "abcabcab" vs "abcab": a=a, b=b, c=c, a=a, b=b, then "abcab" ends.
  "abcabcab" > "abcab" (longer, same prefix). So NOT Lyndon!

  Actual factorization: run Duval...
    i=0: "abc" extends (a<b<c all increasing). Then s[3]='a'=s[0]='a'
    -> repeat tracking. s[4]='b'=s[1]='b', s[5]='c'=s[2]='c',
    s[6]='a'=s[3]... wait, k would be at 3, s[k]=s[0]='a'=s[6]='a',
    s[7]='b'=s[1]='b'. j=8>=n. len=8-4=...

  Actually: the factorization is "abc" | "abc" | "ab"

  All factors equal to "abc" or a prefix of it -> period = 3.
  Matches KMP: π[7] = 5, so period = 8 - 5 = 3.  OK

Problems: CF 825F "String Compression"

7. Runs and Lyndon roots

A run (maximal repetition) in a string is a maximal interval $[l, r]$ where the substring has period $p \leq (r - l + 1) / 2$ . Every run has a Lyndon root -- the lexicographically smallest rotation of its period. This connection is used in the Bannai-I-Nakashima-Shinohara algorithm for computing all runs in $O (n)$ time.

  Runs and their Lyndon roots:

  s = "aabaabaab"

  Run [0, 5]: "aabaab" has period 3 ("aab")
    Lyndon root of "aab": "aab" itself (already Lyndon: aab < ab < b)

  Run [3, 8]: "abaab" ... actually this isn't periodic enough.
  Let's check: "aabaabaab" -- the run [0, 8] has period 3.
    "aab" repeated 3 times = "aabaabaab"
    Lyndon root: "aab" OK

  The Lyndon root is unique per run and provides a canonical
  representative for the repetition pattern.

Understanding this won't appear directly in a contest, but it deepens your intuition for why Lyndon words keep showing up in string algorithms -- they are the natural "atoms" that repetitive structure is built from.

8. Splitting a cyclic string into primitive components

Given a circular string, sometimes you need to decompose it into its primitive root (the shortest string whose repetitions form the full string). Lyndon factorization handles this: if the factorization consists of $k$ identical Lyndon words, the string has primitive root equal to that word and period $n / k$ .

cpp

// Returns the primitive root of a string (shortest w such that
// s = w repeated k times, possibly with a prefix remainder)
string primitive_root(const string &s) {
    auto fact = duval(s);
    return fact[0];  // First Lyndon factor is always the root
                     // (only if all factors are identical)
}

// Check if s is a perfect power (s = w^k for some k > 1)
bool is_power(const string &s) {
    auto fact = duval(s);
    if (fact.size() <= 1) return false;
    for (auto &f : fact)
        if (f != fact[0]) return false;
    return true;
}

Contest Cheat Sheet

+-----------------------------------------------------------+
|  LYNDON FACTORIZATION - QUICK REFERENCE                   |
+-----------------------------------------------------------+
|                                                           |
|  Lyndon word: s < all its proper suffixes (strictly)      |
|  Factorization: s = w1 w2 ... wk, each wi Lyndon,        |
|                 w1 >= w2 >= ... >= wk                     |
|                                                           |
|  Duval (factorization):   O(n) time, O(1) space          |
|  Min rotation (Duval):    O(n) time, O(1) space          |
|  Min rotation (Booth):    O(n) time, O(n) space          |
|                                                           |
|  When to use:                                             |
|  - "smallest/lexicographic rotation"                      |
|  - "cyclic equivalence" or "necklace"                     |
|  - "canonical form of cyclic string"                      |
|  - comparing strings up to rotation                       |
|                                                           |
|  Min rotation (Duval, O(1) space):                        |
|    int i=0, ans=0;                                        |
|    while (i < n) {                                        |
|      ans = i;                                             |
|      int j = i+1, k = i;                                 |
|      while (j < 2*n && s[k%n] <= s[j%n])                 |
|        if (s[k%n] < s[j%n]) k = i; else k++;             |
|        j++;                                               |
|      while (i <= k) i += j-k;                             |
|    } // ans = start of min rotation                       |
|                                                           |
+-----------------------------------------------------------+

Gotchas and Debugging

1. Strict inequality in the Lyndon definition. A Lyndon word must be strictly less than all proper suffixes. The string "aa" is not Lyndon because "aa" $>$ "a" (a prefix is always less than the full string in standard lexicographic order). The factorization of "aa" is "a" | "a" -- two separate words. If you mistakenly treat "aa" as Lyndon, your factorization will be wrong.

2. Off-by-one in Booth's algorithm. Booth's failure function is indexed relative to the current best candidate $k$ . Confusing absolute positions ( $j$ ) with relative positions ( $j - k$ ) is the most common implementation bug. If you get wrong answers only on certain inputs, double-check every f[j - k] access.

3. Empty string. Both duval() and min_rotation() should handle the empty string gracefully. The factorization of "" is empty, and min_rotation("") returns 0. The implementations above handle this correctly (the while loops simply don't execute).

4. All identical characters."aaaa" factorizes as "a" | "a" | "a" | "a" -- four single-character Lyndon words. The minimum rotation is 0. Make sure your inner while loop exits promptly when $s [j] = s [k]$ eventually leads to a break. It will: after $k$ advances past the first copy, the comparison s[k] \leq s[j] with $k$ past the boundary forces termination.

5. The <= in Duval's inner loop. The condition is s[k] <= s[j], not s[k] < s[j]. The equality branch handles the repetition-tracking case. Changing <= to < produces incorrect factorizations -- the algorithm won't detect repeated copies of a Lyndon word.

6. Forgetting the doubled string for min rotation. If you run Duval on $s$ alone, you get the factorization -- but the minimum rotation does not simply start at the last Lyndon word of $s$ . You need the doubled string or modular indexing to see rotations that wrap around the boundary. Example: "cabba" has factorization "c" | "abb" | "a" -- the last word starts at position 4, giving rotation "acabb". But the actual minimum is "abbac" (position 1), which only appears when looking at s + s.

7. Modular indexing bounds. In the Duval-based min_rotation, the inner loop goes up to $j < 2 n$ . If you accidentally use $j < n$ , you won't see rotations that wrap around. The outer loop condition i < n is correct -- we only need Lyndon word boundaries in the first half of the doubled string.

8. Returning boundaries instead of substrings. In contest code you often want start positions, not substrings (to avoid $O (n)$ allocation per word). Here's the variant:

cpp

vector<int> duval_starts(const string &s) {
    int n = s.size();
    vector<int> starts;
    int i = 0;
    while (i < n) {
        int j = i + 1, k = i;
        while (j < n && s[k] <= s[j]) {
            if (s[k] < s[j]) k = i;
            else k++;
            j++;
        }
        while (i <= k) {
            starts.push_back(i);
            i += j - k;
        }
    }
    return starts;
}

Mental Traps

Trap 1: "Lyndon factorization is only for finding the smallest rotation."

That's the most common application, but it's not the only one. The factorization reveals periodic structure, connects to suffix array construction (SA-IS), and provides canonical representatives for necklace equivalence classes. If you treat it as a single-purpose tool, you'll miss problems where the factorization's structure -- the sequence of Lyndon words and their lengths -- is itself the answer.

  What Lyndon factorization actually unlocks:

  ┌─────────────────────────────────────────────┐
  │            Lyndon Factorization              │
  │         s = w₁ w₂ ... wₖ (non-increasing)   │
  └──────┬──────────┬──────────┬────────────────┘
         │          │          │
         ▼          ▼          ▼
   ┌──────────┐ ┌────────┐ ┌──────────────────┐
   │ Min      │ │Necklace│ │ Period detection  │
   │ rotation │ │ canon  │ │ (all wᵢ equal?)  │
   └──────────┘ └────────┘ └──────────────────┘
         │                         │
         ▼                         ▼
   ┌──────────┐            ┌──────────────────┐
   │ SA-IS    │            │ String            │
   │ suffix   │            │ compression       │
   │ classify │            │ (CF 825F)         │
   └──────────┘            └──────────────────┘

Trap 2: "The uniqueness of the factorization is obvious."

It's not. Many different ways to split a string into Lyndon words exist. The Chen-Fox-Lyndon theorem says only one such split is non-increasing -- and that's the hard part to prove. If you don't appreciate uniqueness, you can't explain why Duval's algorithm always produces the correct answer regardless of input. The comparison logic ( $s [j] < s [k]$ triggers output, $s [j] > s [k]$ extends, $s [j] = s [k]$ tracks repetition) implicitly enforces the non-increasing constraint -- skipping this understanding means you can't debug the algorithm when it breaks.

Common Mistakes

Booth's failure-function confusion with KMP. "My Booth's algorithm returned index 0 for the string abab, but the minimum rotation is abab starting at index 0 -- wait, shouldn't it be abab?" Actually abab has rotations: abab, baba, abab, baba. The minimum is abab at index 0, which is correct. The real mistake was on a string like baa -- rotations are baa, aab, aba. The minimum is aab (index 1). The bug was in the failure-function update: when s[i] < s[j], the pointer j should reset to 0 (not stay put). In Booth's variant, you must handle the < case by resetting the candidate start position. Confusing Booth's failure function with KMP's is the classic trap.

Debug Drills

Drill 1 -- Duval's algorithm: wrong reset after mismatch

cpp

int i = 0;
while (i < n) {
    int j = i, k = i + 1;
    while (k < n && s[j] <= s[k]) {
        j = (s[j] == s[k]) ? j + 1 : i;  // BUG
        k++;
    }
    while (i <= j) {
        i += k - j;
    }
}

What's wrong?

When s[j] < s[k] (strict), j should reset to i (the start of the current tentative Lyndon word). But the condition s[j] <= s[k] groups == and < together incorrectly. When s[j] == s[k], we should advance j to j+1. When s[j] < s[k], we should reset j to i. The fix:

cpp

if (s[j] < s[k]) {
    j = i;   // extend current Lyndon word
} else {
    j++;     // s[j] == s[k], continue comparing
}

Wait -- actually in Duval's algorithm: if s[k] > s[j], we reset j = i (current block can be extended). If s[k] == s[j], we increment j. If s[k] < s[j], we break and output. The original code has the condition inverted.

Drill 2 -- Booth's algorithm: forgetting to double the string

cpp

vector<int> f(n, -1);
int k = 0;
for (int j = 1; j < n; j++) {  // should be 2*n
    // ... Booth's logic ...
}

What's wrong?

Booth's algorithm works on the doubled string s + s. The loop must iterate up to 2*n, and the failure array must have size 2*n. With only n iterations, you miss rotations that start in the second half:

cpp

vector<int> f(2 * n, -1);
int k = 0;
for (int j = 1; j < 2 * n; j++) {
    char sj = s[j % n];
    // ... rest of Booth's logic using s[... % n] ...
}

Drill 3 -- Outputting Lyndon factors in wrong order

cpp

// Inside Duval's algorithm, outputting factors:
while (i <= j) {
    factors.push_back(s.substr(i, k - j));
    i += k - j;
    j += k - j;  // BUG: j should not change
}

What's wrong?

After the inner while-loop breaks, the period length is k - j. We output complete factors of length k - j starting from i. The variable j should not be modified in this output loop. Fix:

cpp

int len = k - j;
while (i <= j) {
    factors.push_back(s.substr(i, len));
    i += len;
}

Self-Test

Before moving to problems, verify you can do these without looking at the notes:

[ ] State the definition of a Lyndon word precisely -- including why "abab" is NOT Lyndon (it equals a rotation of itself) and why "ab" IS Lyndon
[ ] State the Chen-Fox-Lyndon theorem: every string has a unique factorization into Lyndon words in non-increasing lexicographic order
[ ] Explain why Duval's algorithm is $O (n)$ : variable $i$ never decreases, and the total advancement of $j$ across all rounds is bounded by $n$
[ ] Hand-trace Duval on "aab" and verify the result is non-increasing Lyndon words
[ ] Describe how the Lyndon factorization of $s + s$ yields the smallest rotation of $s$ , and identify which boundary gives the starting index

Practice Problems

CF Rating	What You Should Be Able to Do
1200	Know what a rotation is and find the minimum rotation by brute force.
1500	Implement Booth's algorithm for minimum rotation; understand what a Lyndon word is.
1800	Implement Duval's algorithm; use Lyndon factorization to solve cyclic-comparison problems.
2100	Combine Lyndon factorization with suffix arrays or SAM for advanced cyclic substring problems.

Problem	Source	Difficulty	Key Technique
Minimal Rotation	CSES 1110	CF ~1400	Direct `min_rotation()` application
MINMOVE	SPOJ	CF ~1400	Direct minimum rotation
Glass Beads	POJ 1509	CF ~1500	Minimum rotation, classic problem
Cyclical Quest	CF 235C	CF 2400	Cyclic substrings via suffix automaton
String Compression	CF 825F	CF 2200	Period analysis with Z-function + DP
Password	CF 126B	CF 1700	Prefix-suffix matching (KMP/Z, related)
Minimum Cyclic Shift	Timus 1423	CF ~1500	Direct Duval or Booth application
Isomorphic Strings	CF 985F	CF 2300	Hashing + cyclic comparison
Necklace	UVa 11475	CF ~1600	Palindromic necklace via min rotation + KMP
Periodic Strings	CF 1295C	CF 1400	String period analysis (Lyndon structure insight)
Anthem of Berland	CF 808G	CF 2200	KMP with cyclic pattern variants

Suggested Progression

Start with CSES 1110, SPOJ MINMOVE, Timus 1423 -- direct application of min_rotation. Write the code from scratch, don't copy-paste. These build muscle memory for Duval's three-case logic.
Build with POJ 1509, UVa 11475, CF 1295C -- slightly more involved problems where you need to combine min rotation or periodicity analysis with another technique.
Strengthen with CF 126B, CF 985F -- problems where Lyndon-adjacent thinking (prefix functions, cyclic hashing) is the core technique. These test your understanding of the ideas behind Lyndon factorization, not just the implementation.
Master with CF 235C, CF 825F, CF 808G -- hard problems requiring suffix automata, DP, or multi-technique combinations where cyclic string structure is one ingredient among several.

The first three are direct applications. CF 235C requires combining cyclic rotation ideas with a suffix automaton. CF 825F uses period detection (related to Lyndon structure). CF 126B exercises the KMP-family thinking that Booth's algorithm also builds on.

Lyndon Factorization ​

Contents ​

From Rotations to Lyndon Words ​

Lyndon words ​

The factorization theorem ​

Duval's algorithm ​

Minimum rotation via Lyndon factorization ​

What the Code Won't Teach You ​

Booth's algorithm ​

Watching the Algorithm Work ​

Duval's algorithm on "abbaabba" ​

Factorization result ​

Minimum rotation of "cabba" ​

More Worked Examples ​

All identical characters: "aaaa" ​

Repeated Lyndon word: "ababab" ​

Decreasing string: "dcba" ​

A real-world string: "banana" ​

Minimum rotation of "dbacd" ​

Relevant String Operations ​

Implementation ​

Contest template ​

Explained version ​

Complexity at a Glance ​

Problem Patterns and Tricks ​

1. Minimum cyclic rotation ​

2. Necklace equivalence ​

3. Cyclic substring matching ​

4. Lexicographic comparison of cyclic strings ​

5. Suffix array connection ​

6. String periodicity analysis ​

7. Runs and Lyndon roots ​

8. Splitting a cyclic string into primitive components ​

Contest Cheat Sheet ​

Gotchas and Debugging ​

Mental Traps ​

Common Mistakes ​

Debug Drills ​

Self-Test ​

Practice Problems ​

Suggested Progression ​

Further Reading ​

Related Techniques ​

Lyndon Factorization

Contents

From Rotations to Lyndon Words

Lyndon words

The factorization theorem

Duval's algorithm

Minimum rotation via Lyndon factorization

What the Code Won't Teach You

Booth's algorithm

Watching the Algorithm Work

Duval's algorithm on "abbaabba"

Factorization result

Minimum rotation of "cabba"

More Worked Examples

All identical characters: "aaaa"

Repeated Lyndon word: "ababab"

Decreasing string: "dcba"

A real-world string: "banana"

Minimum rotation of "dbacd"

Relevant String Operations

Implementation

Contest template

Explained version

Complexity at a Glance

Problem Patterns and Tricks

1. Minimum cyclic rotation

2. Necklace equivalence

3. Cyclic substring matching

4. Lexicographic comparison of cyclic strings

5. Suffix array connection

6. String periodicity analysis

7. Runs and Lyndon roots

8. Splitting a cyclic string into primitive components

Contest Cheat Sheet

Gotchas and Debugging

Mental Traps

Common Mistakes

Debug Drills

Self-Test

Practice Problems

Suggested Progression

Further Reading

Related Techniques