Documentation
Commentary
The "opt" in "regexp-opt" stands for "optim\\(al\\|i[sz]e\\)".
This package generates a regexp from a given list of strings, one that matches
any one of those strings, so that the regexp generated by:
(regexp-opt strings)
is equivalent to, but more efficient than, the regexp generated by:
(mapconcat 'regexp-quote strings "\\|")
For example:
(let ((strings '("cond" "if" "when" "unless" "while"
                 "let" "let*" "progn" "prog1" "prog2"
                 "save-restriction" "save-excursion" "save-window-excursion"
                 "save-current-buffer" "save-match-data"
                 "catch" "throw" "unwind-protect" "condition-case")))
  (concat "(" (regexp-opt strings t) "\\>"))
=> "(\\(c\\(atch\\|ond\\(ition-case\\)?\\)\\|if\\|let\\*?\\|prog[12n]\\|save-\\(current-buffer\\|excursion\\|match-data\\|restriction\\|window-excursion\\)\\|throw\\|un\\(less\\|wind-protect\\)\\|wh\\(en\\|ile\\)\\)\\>"
Searching using the above example `regexp-opt' regexp takes approximately
two-thirds of the time taken using the equivalent `mapconcat' regexp.
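The timing claim can be checked on your own data with a rough sketch like the following. The helper names `my-regexp-count' and `my-regexp-compare' are hypothetical, not part of regexp-opt.el, and the figures will vary with the Emacs version, the strings and the text searched. The sketch assumes that no string in the list is a prefix of another, so that both regexps match exactly the same text at each position.

```elisp
;; Sketch only: `my-regexp-count' and `my-regexp-compare' are
;; hypothetical helpers written for this illustration.
(require 'regexp-opt)
(require 'benchmark)

(defun my-regexp-count (regexp text)
  "Return the number of matches of REGEXP in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search nil)
          (count 0))
      (while (re-search-forward regexp nil t)
        (setq count (1+ count)))
      count)))

(defun my-regexp-compare (strings text)
  "Compare the `regexp-opt' and `mapconcat' regexps for STRINGS on TEXT.
Return a plist holding both match counts and the elapsed time in
seconds for 100 searches with each regexp."
  (let ((optimized (regexp-opt strings))
        (naive (mapconcat #'regexp-quote strings "\\|")))
    (list :optimized-count (my-regexp-count optimized text)
          :naive-count (my-regexp-count naive text)
          ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME).
          :optimized-time (car (benchmark-run 100
                                 (my-regexp-count optimized text)))
          :naive-time (car (benchmark-run 100
                             (my-regexp-count naive text))))))
```

Calling (my-regexp-compare '("cond" "if" "when" "unless" "while") text) on a large buffer's worth of Lisp code should report equal counts; the two times are what to compare.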
Since this package was written to produce efficient regexps, not regexps
efficiently, it is probably not a good idea to in-line too many calls in
your code, unless you use the following trick with `eval-when-compile':
(defvar definition-regexp
  (eval-when-compile
    (concat "^("
            (regexp-opt '("defun" "defsubst" "defmacro" "defalias"
                          "defvar" "defconst") t)
            "\\>")))
The byte-compiled code will be as if you had defined the variable thus:
(defvar definition-regexp
  "^(\\(def\\(alias\\|const\\|macro\\|subst\\|un\\|var\\)\\)\\>")
Note that if you use this trick for all instances of `regexp-opt' and
`regexp-opt-depth' in your code, regexp-opt.el would only have to be loaded
at compile time. But note also that using this trick means that should
regexp-opt.el be changed, perhaps to fix a bug or to add a feature to
improve the efficiency of `regexp-opt' regexps, you would have to recompile
your code for such changes to take effect.
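As an illustration of where `regexp-opt-depth' is useful, here is a minimal sketch. `regexp-opt-depth' returns the number of non-shy grouping constructs in a regexp, which tells you the index of the first group that follows a `regexp-opt' fragment. The function name `my-definition-name' and the keyword list are assumptions made for this example only.

```elisp
;; Sketch only: `my-definition-name' is a hypothetical helper.
(require 'regexp-opt)

(defun my-definition-name (form-text)
  "Return the name defined by FORM-TEXT, or nil if it is not a definition."
  (let* ((keywords (regexp-opt '("defun" "defvar" "defconst") t))
         ;; Number of capturing groups contributed by KEYWORDS.  This
         ;; may differ between Emacs versions, so it is computed
         ;; rather than hard-coded.
         (depth (regexp-opt-depth keywords))
         (regexp (concat "\\`(" keywords " +\\([^ ()]+\\)")))
    (when (string-match regexp form-text)
      ;; The name group follows every group inside KEYWORDS.
      (match-string (1+ depth) form-text))))
```

For example, (my-definition-name "(defvar my-example-var 1)") picks out the variable name without the caller needing to know how many groups `regexp-opt' generated.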
Originally written for font-lock.el, from an idea from Stig's hl319.el, with
thanks for ideas also to Michael Ernst, Bob Glickstein, Dan Nicolaescu and
Stefan Monnier.
No doubt `regexp-opt' doesn't always produce optimal regexps, so code, ideas
or any other information to improve things are welcome.
One possible improvement would be to compile '("aa" "ab" "ba" "bb")
into "[ab][ab]" rather than "a[ab]\\|b[ab]". I'm not sure it's worth
it but if someone knows how to do it without going through too many
contortions, I'm all ears.
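To make the suggested improvement concrete, the following sketch checks that the compact form and the alternation form accept the same strings. It does not show how to generate the compact form, which is the open question above. The helper name `my-same-language-p' is hypothetical, and the check is only as thorough as the candidate list it is given.

```elisp
;; Sketch only: `my-same-language-p' is a hypothetical helper.
(defun my-same-language-p (regexp-a regexp-b candidates)
  "Return non-nil if REGEXP-A and REGEXP-B accept the same CANDIDATES.
Each regexp is anchored so that it must match a whole candidate string."
  (let ((case-fold-search nil)
        (anchored-a (concat "\\`\\(?:" regexp-a "\\)\\'"))
        (anchored-b (concat "\\`\\(?:" regexp-b "\\)\\'"))
        (same t))
    (dolist (s candidates same)
      ;; Normalize both results to t or nil before comparing.
      (unless (eq (and (string-match-p anchored-a s) t)
                  (and (string-match-p anchored-b s) t))
        (setq same nil)))))
```

A call such as (my-same-language-p "[ab][ab]" "a[ab]\\|b[ab]" '("aa" "ab" "ba" "bb" "ac" "ca" "a" "abb" "")) compares the two forms on both matching and non-matching inputs.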
Requires
Dependencies
No package dependencies recorded.
Consumers
Reverse Dependencies
- blorg
- lpc-mode
- proc-mode
- sawfish
- sendmail-mode
- yaham
- emacs
- table
- actionscript-mode
- auto-complete-nxml
- company-bibtex
- door-gnus
- emojify
- ess
- fuzzy
- genrnc
- gmpl-mode
- inform-mode
- jdee
- navi2ch
- pillar
- plsense
- smarty-mode
- tup-mode
- vbasense
- w3m-https-everywhere
- writegood-mode
- yard-mode
- ecmascript-mode
- pddl-mode
- perl-completion
- vm
- auto-read-only
- mupad
- apache-mode
- php-mode
- poly-erb
- comint-hyperlink
- selectrum
- direx
- zephir-mode
- leaf-tree
- rainbow-mode
- pikchr-mode
- liblouis
- verilog-mode
- latex-table-wizard
- use-package
- uiua-mode
- xkb-mode
- splunk-mode
- which-key
- arscript-mode
- dpkg-dev-el
- slug