How do I find accidentally repeated words like 'the the' in a document?
Answer
/\(\<\w\+\>\)\_s\+\1\>
Explanation
When writing or editing text, repeated words like "the the" or "is is" are a common typo that spell checkers often miss. Vim's regex engine supports backreferences, which let you find any word that is immediately followed by itself, even across line boundaries.
How it works
- \( and \) - define a capture group
- \<\w\+\> - match a whole word (\< and \> are word boundaries, \w\+ matches one or more word characters)
- \_s\+ - match one or more whitespace characters, including newlines (\_s is Vim's "whitespace including newline" atom)
- \1 - backreference to the first capture group, meaning the same word must appear again
- \> - trailing word boundary, so the repeat must be a complete word ("the then" does not match)
Example
Given this text:
The quick brown fox jumped over
over the lazy dog and the the cat.
The pattern matches:
- over\nover - repeated across a line break
- the the - repeated on the same line
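For checking text outside Vim, the same backreference idea can be sketched in Python. This is a rough counterpart, not the Vim pattern itself: it assumes Python's re module, uses \b in place of \< and \>, and relies on \s already matching newlines. The name find_repeated_words is made up for this illustration. Python's \w is Unicode-aware by default, so results may differ from Vim on non-ASCII text.

```python
import re

# Rough Python counterpart of the Vim pattern \(\<\w\+\>\)\_s\+\1\>
# \b stands in for \< and \>; \s+ already spans line breaks.
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b")

def find_repeated_words(text):
    """Return (word, matched_text) pairs for each immediately repeated word."""
    return [(m.group(1), m.group(0)) for m in REPEATED_WORD.finditer(text)]

sample = "The quick brown fox jumped over\nover the lazy dog and the the cat."
for word, matched in find_repeated_words(sample):
    print(repr(word), repr(matched))
```

On the sample text this reports the same two repeats as the Vim search: "over" across the line break and "the" on the second line.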
Tips
- Combine with substitute to auto-fix, confirming each replacement:
  :%s/\(\<\w\+\>\)\_s\+\1\>/\1/gc
- The c flag lets you review each match before replacing, since not all repeated words are errors
- This is particularly useful for proofreading long documents, README files, or commit messages
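The substitute tip can also be approximated in a script. The sketch below assumes Python's re module and a made-up helper name, collapse_repeated_words. Unlike the c flag in Vim, it replaces every match without asking, so it is best run on a copy and reviewed afterwards.

```python
import re

# \b stands in for Vim's \< and \>; \s+ also matches line breaks.
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b")

def collapse_repeated_words(text):
    """Replace each 'word <whitespace> word' pair with a single 'word'."""
    return REPEATED_WORD.sub(r"\1", text)

print(collapse_repeated_words("the lazy dog and the the cat"))
```

A single pass of this Python version only collapses pairs, so a run of three identical words is left as two and needs a second pass.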