vimtricks.wiki Concise Vim tricks, one at a time.

How do I collapse consecutive duplicate words across a file?

Answer

:%s/\v<(\w+)\s+\1>/\1/g

Explanation

OCR cleanup, copy-paste artifacts, and rushed note-taking often produce repeated words like "the the" or "is is". Instead of fixing them by hand, you can remove adjacent duplicates across the whole buffer with one regex substitution. Only the second copy and the whitespace in front of it are deleted, so surrounding punctuation and spacing stay as they were.

How it works

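  • :%s/.../.../g runs the substitution on every line of the buffer (%) and replaces every match on each line (g)
  • \v enables very-magic mode, so (, ), +, < and > work without backslashes and the pattern stays readable
  • <(\w+) anchors at the start of a word and captures the whole word into group 1
  • \s+ matches one or more spaces or tabs between the repeated words (it does not cross line breaks)
  • \1> requires the same captured text again, ending at a word boundary, so "the then" is left alone
  • Replacement \1 keeps only one copy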
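
The pieces above can be tried in isolation with Vim's substitute() function, which applies the same pattern to a string instead of the buffer. This is only an illustrative sketch: the sample strings are made up, and it assumes 'ignorecase' is off. The comment under each call shows what it echoes.

```vim
" Full pattern: the duplicate "is" collapses, and "This" is left alone.
echo substitute('This is is a test.', '\v<(\w+)\s+\1>', '\1', 'g')
" This is a test.

" Without the leading <, the "is" inside "This" pairs with the next "is".
echo substitute('This is fine', '\v(\w+)\s+\1>', '\1', 'g')
" This fine

" Without the trailing >, "the" pairs with the start of "then".
echo substitute('the then', '\v<(\w+)\s+\1', '\1', 'g')
" then
```

The last two calls show why both word boundaries matter: dropping either one lets the pattern eat part of a longer word.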

This pattern is especially useful before proofreading passes, search indexing, or generating diff-friendly output from noisy source text.

Example

Before:

This is is a test.
We we should should fix this.

Run:

:%s/\v<(\w+)\s+\1>/\1/g

After:

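This is a test.
We we should fix this.

"We we" is left in place because the match is case-sensitive unless 'ignorecase' is set; the \c tip below handles that case.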

Tips

  • Add the c flag (.../gc) when you want to confirm each match.
  • For case-insensitive cleanup, put \c anywhere in the pattern, for example right after \v. The replacement keeps the capitalization of the first copy.
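
The tips above, written out as full commands. The third command is not part of the original trick; it is an assumed extension for runs of three or more copies, which the basic pattern only shortens by one pair per pass ("the the the" becomes "the the").

```vim
" Ask before each replacement.
:%s/\v<(\w+)\s+\1>/\1/gc

" Ignore case, so "We we" collapses to "We".
:%s/\v\c<(\w+)\s+\1>/\1/g

" Assumed extension: collapse a run of any length in one pass.
:%s/\v<(\w+)(\s+\1)+>/\1/g
```

The confirm variant is the safer choice on prose where a doubled word can be intentional, such as "had had" or "that that".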
