Sacha Chua introduced me to
regexp-opt, an Emacs function that I hadn’t seen before. The idea is that
regexp-opt will take a list of strings and return a more or less optimal regexp to recognize any of those strings.
Because it does analysis and restructuring of the list of strings,
regexp-opt will generally produce a more efficient regexp than the equivalent
(mapconcat 'regexp-quote strings "\\|")
In addition to the list of strings,
regexp-opt takes an optional second argument that if not
nil will surround the regexp with grouping parentheses. If the second argument is
regexp-opt also surrounds the regexp with
Here’s an example from the source code for
regexp-opt that shows how it optimizes a regexp.
(let ((strings '("cond" "if" "when" "unless" "while" "let" "let*" "progn" "prog1" "prog2" "save-restriction" "save-excursion" "save-window-excursion" "save-current-buffer" "save-match-data" "catch" "throw" "unwind-protect" "condition-case"))) (concat "(" (regexp-opt strings t) "\\>")) => "(\\(c\\(atch\\|ond\\(ition-case\\)?\\)\\|if\\|let\\*?\\|prog[12n]\\|save-\\(current-buffer\\|excursion\\|match-data\\|restriction\\|window-excursion\\)\\|throw\\|un\\(less\\|wind-protect\\)\\|wh\\(en\\|ile\\)\\)\\>"
This regexp takes approximately two-thirds the amount of time that the equivalent
mapconcat regexp does.
Simon Marshall, the code’s author, notes that the package was written to produce efficient regexps, not to produce regexps efficiently and therefore you should avoid using it in-line too much unless you use
eval-when-compile. See the commentary in
regexp-opt.el for more on this. You can, of course, easily get to the code from the function description by clicking on the file name.