34.4 Regular Expression Searching
In GNU Emacs, you can search for the next match for a regular expression (see Syntax of Regexps) either incrementally or not. For incremental search commands, see Regular Expression Search in The GNU Emacs Manual. Here we describe only the search functions useful in programs. The principal one is re-search-forward
.
These search functions convert the regular expression to multibyte if the buffer is multibyte; they convert the regular expression to unibyte if the buffer is unibyte. See Text Representations.
command
re-search-forward regexp \&optional limit noerror countβ
This function searches forward in the current buffer for a string of text that is matched by the regular expression regexp
. The function skips over any amount of text that is not matched by regexp
, and leaves point at the end of the first match found. It returns the new value of point.
If limit
is non-nil
, it must be a position in the current buffer. It specifies the upper bound to the search. No match extending after that position is accepted. If limit
is omitted or nil
, it defaults to the end of the accessible portion of the buffer.
What re-search-forward
does when the search fails depends on the value of noerror
:
nil
β
Signal a search-failed
error.
t
β
Do nothing and return nil
.
anything elseβ
Move point to limit
(or the end of the accessible portion of the buffer) and return nil
.
The argument noerror
only affects valid searches which fail to find a match. Invalid arguments cause errors regardless of noerror
.
If count
is a positive number n
, the search is done n
times; each successive search starts at the end of the previous match. If all these successive searches succeed, the function call succeeds, moving point and returning its new value. Otherwise the function call fails, with results depending on the value of noerror
, as described above. If count
is a negative number -n
, the search is done n
times in the opposite (backward) direction.
In the following example, point is initially before the βT
β. Evaluating the search call moves point to the end of that line (between the βt
β of βhat
β and the newline).
---------- Buffer: foo ----------
I read "βThe cat in the hat
comes back" twice.
---------- Buffer: foo ----------
(re-search-forward "[a-z]+" nil t 5)
β 27
---------- Buffer: foo ----------
I read "The cat in the hatβ
comes back" twice.
---------- Buffer: foo ----------
command
re-search-backward regexp \&optional limit noerror countβ
This function searches backward in the current buffer for a string of text that is matched by the regular expression regexp
, leaving point at the beginning of the first text found.
This function is analogous to re-search-forward
, but they are not simple mirror images. re-search-forward
finds the match whose beginning is as close as possible to the starting point. If re-search-backward
were a perfect mirror image, it would find the match whose end is as close as possible. However, in fact it finds the match whose beginning is as close as possible (and yet ends before the starting point). The reason for this is that matching a regular expression at a given spot always works from beginning to end, and starts at a specified beginning position.
A true mirror-image of re-search-forward
would require a special feature for matching regular expressions from end to beginning. Itβs not worth the trouble of implementing that.
function
string-match regexp string \&optional startβ
This function returns the index of the start of the first match for the regular expression regexp
in string
, or nil
if there is no match. If start
is non-nil
, the search starts at that index in string
.
For example,
(string-match
"quick" "The quick brown fox jumped quickly.")
β 4
(string-match
"quick" "The quick brown fox jumped quickly." 8)
β 27
The index of the first character of the string is 0, the index of the second character is 1, and so on.
If this function finds a match, the index of the first character beyond the match is available as (match-end 0)
. See Match Data.
(string-match
"quick" "The quick brown fox jumped quickly." 8)
β 27
(match-end 0)
β 32
function
string-match-p regexp string \&optional startβ
This predicate function does what string-match
does, but it avoids modifying the match data.
function
looking-at regexpβ
This function determines whether the text in the current buffer directly following point matches the regular expression regexp
. βDirectly following" means precisely that: the search is βanchored" and it can succeed only starting with the first character following point. The result is t
if so, nil
otherwise.
This function does not move point, but it does update the match data. See Match Data. If you need to test for a match without modifying the match data, use looking-at-p
, described below.
In this example, point is located directly before the βT
β. If it were anywhere else, the result would be nil
.
---------- Buffer: foo ----------
I read "βThe cat in the hat
comes back" twice.
---------- Buffer: foo ----------
(looking-at "The cat in the hat$")
β t
function
looking-back regexp limit \&optional greedyβ
This function returns t
if regexp
matches the text immediately before point (i.e., ending at point), and nil
otherwise.
Because regular expression matching works only going forward, this is implemented by searching backwards from point for a match that ends at point. That can be quite slow if it has to search a long distance. You can bound the time required by specifying a non-nil
value for limit
, which says not to search before limit
. In this case, the match that is found must begin at or after limit
. Hereβs an example:
---------- Buffer: foo ----------
I read "βThe cat in the hat
comes back" twice.
---------- Buffer: foo ----------
(looking-back "read \"" 3)
β t
(looking-back "read \"" 4)
β nil
If greedy
is non-nil
, this function extends the match backwards as far as possible, stopping when a single additional previous character cannot be part of a match for regexp
. When the match is extended, its starting position is allowed to occur before limit
.
As a general recommendation, try to avoid using looking-back
wherever possible, since it is slow. For this reason, there are no plans to add a looking-back-p
function.
function
looking-at-p regexpβ
This predicate function works like looking-at
, but without updating the match data.
variable
search-spaces-regexpβ
If this variable is non-nil
, it should be a regular expression that says how to search for whitespace. In that case, any group of spaces in a regular expression being searched for stands for use of this regular expression. However, spaces inside of constructs such as β[β¦]
β and β*
β, β+
β, β?
β are not affected by search-spaces-regexp
.
Since this variable affects all regular expression search and match constructs, you should bind it temporarily for as small as possible a part of the code.