Regular Expressions
Regular expressions can be used for searching for patterns
rather than literals. For example, it is possible to
search for variables in SciTE property files,
which look like $(name.subname) with the regular expression:
\$([a-z.]+)
(or \$\([a-z.]+\) in posix mode).
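As a rough illustration outside SciTE, the same search can be tried with Python's re module. This is only a sketch: Python's engine is not the one Scintilla uses, its parentheses behave as in posix mode, and the sample property line is made up.

```python
import re

# Python treats ( and ) as grouping, like posix mode, so the literal
# parentheses of $(name.subname) must be escaped.
variable = re.compile(r"\$\([a-z.]+\)")

# Hypothetical line from a properties file.
line = "command.go.*.py=python $(file.name) $(args)"

found = variable.findall(line)
print(found)
```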
Replacement with regular expressions allows complex
transformations with the use of tagged expressions.
For example, pairs of numbers separated by a ',' could
be reordered by replacing the regular expression:
\([0-9]+\),\([0-9]+\)
(or ([0-9]+),([0-9]+) in posix mode, or even (\d+),(\d+))
with:
\2,\1
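The reordering can be sketched with Python's re.sub, used here as a stand-in for SciTE's Replace command. Python's syntax matches the posix-mode form of the pattern, and the input string is an invented example.

```python
import re

# ( ) are capture groups here, as in posix mode; \2 and \1 in the
# replacement refer to the second and first tagged sections.
swapped = re.sub(r"([0-9]+),([0-9]+)", r"\2,\1", "10,20 and 3,45")
print(swapped)
```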
Regular expression syntax depends on a parameter: find.replace.regexp.posix
If set to 0, syntax uses the old Unix style where \( and \) mark capturing sections while ( and ) are themselves.
If set to 1, syntax uses the more common style where ( and ) mark capturing sections while \( and \) are plain parentheses.
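The difference between the two settings amounts to swapping the roles of bare and backslashed parentheses. The helper below, unix_to_posix, is a hypothetical name and not part of SciTE. It is a minimal sketch that rewrites an old-style pattern into the posix-mode form, and it deliberately ignores character sets, so parentheses inside [ ] are escaped as well.

```python
def unix_to_posix(pattern: str) -> str:
    # Hypothetical helper: swap the roles of bare and backslashed
    # parentheses. Character sets are not treated specially.
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Backslashed parentheses were capturing marks: emit them bare.
            # Any other escape is copied unchanged.
            out.append(nxt if nxt in "()" else ch + nxt)
            i += 2
        else:
            # Bare parentheses were literals: escape them.
            out.append("\\" + ch if ch in "()" else ch)
            i += 1
    return "".join(out)

print(unix_to_posix(r"\([0-9]+\),\([0-9]+\)"))
print(unix_to_posix(r"\$([a-z.]+)"))
```

Applied to the two old-style patterns quoted earlier, the sketch produces their posix-mode equivalents.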
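Any character matches itself, unless it is a special character (metachar): . \ [ ] * + ^ $ and, in posix mode, ( ).
.
matches any character.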
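\
matches the character following it, with these exceptions:
\a, \b, \f, \n, \r, \t, \v match the corresponding C escape char, respectively BEL, BS, FF, LF, CR, TAB and VT. Note that \r and \n are never matched because in Scintilla, regular expression searches are made line per line (stripped of end-of-line chars).
A \ followed by a round bracket (when not in posix mode), a digit 1 to 9, an angle bracket, one of d, D, s, S, w or W, or x and two hexadecimal digits has a special meaning.
Backslash is used as an escape character for all other metacharacters, and itself.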
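The line-per-line behaviour can be imitated in Python to show why \r and \n never match. The function search_lines is a hypothetical name, and the sketch only approximates what Scintilla does internally.

```python
import re

def search_lines(pattern, text):
    # Search each line separately, with end-of-line chars stripped,
    # and return (line number, matched text) pairs.
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = re.search(pattern, line)
        if match:
            hits.append((number, match.group(0)))
    return hits

text = "one\ttwo\r\nthree\n"
tab_hits = search_lines(r"\t", text)   # the TAB sits inside line 1
lf_hits = search_lines(r"\n", text)    # no stripped line contains LF
print(tab_hits, lf_hits)
```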
[set]
matches one of the characters in the set. If the first character in the set is ^, it matches the characters NOT in the set, i.e. complements the set. A shorthand S-E (start dash end) is used to specify a set of characters S up to E, inclusive. The special characters ] and - have no special meaning if they appear as the first chars in the set. To include both, put - first: [-]A-Z] (or just backslash them).
example | match |
[-]|] | matches any of the 3 chars '-', ']' and '|' |
[]-|] | matches any char from ']' to '|' |
[a-z] | any lowercase alpha |
[^-]] | any char except - and ] |
[^A-Z] | any char except uppercase alpha |
[a-zA-Z] | any alpha |
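Some of these sets can be checked with Python's re module, with one caveat: Python only treats ] as a literal when it is the very first character of the set, so forms such as [-]|] and [^-]] need the backslashed spelling there. The helper in_set is a hypothetical name used for this sketch.

```python
import re

def in_set(set_pattern, char):
    # True when the single character belongs to the set.
    return re.fullmatch(set_pattern, char) is not None

lower = in_set(r"[a-z]", "q")          # lowercase letter
not_upper = in_set(r"[^A-Z]", "Q")     # complemented set rejects uppercase
in_range = in_set(r"[]-|]", "a")       # 'a' lies between ']' and '|'
escaped = in_set(r"[\-\]|]", "]")      # backslashed - and ] are literals
print(lower, not_upper, in_range, escaped)
```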
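*
any of the single-character forms above (a plain character, ., an escaped character or a set), followed by the closure char (*), matches zero or more matches of that form.
+
same as *, except that it matches one or more. Both * and + are greedy: they match as much as possible.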
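The difference between the two closures can be sketched in Python, whose * and + are also greedy. The sample strings are invented for illustration.

```python
import re

text = "ab aab b"
# a*b : zero or more 'a' then 'b'; a+b : at least one 'a' then 'b'.
star = re.findall(r"a*b", text)
plus = re.findall(r"a+b", text)

# Greedy: the closure takes the longest run of digits it can.
greedy = re.search(r"[0-9]+", "id 12345 x").group(0)
print(star, plus, greedy)
```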
\(form\)
a regular expression enclosed as \(form\) (or (form) with posix flag) matches what form matches. The enclosure creates a set of tags, used for back-references (\1 to \9) and for pattern substitution. The tagged forms are numbered starting from 1.
\ followed by a digit 1 to 9
matches whatever the previously tagged regular expression with that number matched.
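A back-reference inside the pattern itself can be sketched in Python, using the posix-mode spelling of the tagged section. The name first_doubled and the word list are assumptions made for the example.

```python
import re

# ([a-z]) tags one lowercase letter; \1 then requires the same letter again.
doubled = re.compile(r"([a-z])\1")

def first_doubled(word):
    # Return the first doubled letter pair in word, or None.
    match = doubled.search(word)
    return match.group(0) if match else None

pairs = [first_doubled(w) for w in ["letter", "regex", "coffee"]]
print(pairs)
```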
\< \>
a regular expression starting with a \< construct and/or ending with a \> construct restricts the pattern matching to the beginning of a word, and/or the end of a word. A word is defined to be a character string beginning and/or ending with the characters A-Z a-z 0-9 and _. Scintilla extends this definition by user setting. The word must also be preceded and/or followed by any character outside those mentioned.
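Python's re module has no \< and \> constructs. Its single \b assertion matches at either edge of a word, so the sketch below uses \b as an approximate stand-in for both. The sample text is invented, and it shows that _ and digits count as word characters.

```python
import re

text = "cat concat catalog cat_1"

starts = re.findall(r"\bcat\w*", text)   # words beginning with "cat"
ends = re.findall(r"\w*cat\b", text)     # words ending with "cat"
whole = re.findall(r"\bcat\b", text)     # "cat" as a complete word
print(starts, ends, whole)
```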
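\l
a backslash followed by d, D, s, S, w or W becomes a character class: \d matches decimal digits, \D any char except decimal digits, \s whitespace, \S any char except whitespace, \w alphanumerics and underscore, \W any char except alphanumerics and underscore.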
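The class escapes (\d for digits, \w for word characters, \s for whitespace) can be tried in Python, which uses the same letters. This is a sketch on an invented sample string. Python's classes also cover non-ASCII characters, which may differ from Scintilla's.

```python
import re

sample = "x_1 = 42;\ty2"

digits = re.findall(r"\d+", sample)   # runs of decimal digits
words = re.findall(r"\w+", sample)    # runs of alphanumerics and _
blanks = re.findall(r"\s", sample)    # single whitespace characters
print(digits, words, blanks)
```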
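\xHH
a backslash followed by x and two hexadecimal digits becomes the character whose code is equal to these digits.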
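A hexadecimal escape can be sketched in Python, which accepts the same \xHH notation in patterns. The searched string is an invented example.

```python
import re

# \x41 is the character with hexadecimal code 41 ("A"); \x2B is "+",
# taken literally here rather than as a closure.
match = re.search(r"\x41\x2B", "xA+y")
matched_text = match.group(0)
print(matched_text, match.start())
```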
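^ $
a regular expression starting with a ^ character and/or ending with a $ character restricts the pattern matching to the beginning of the line, and/or the end of the line.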
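Line anchoring can be sketched in Python by testing each line separately, which mirrors the line-per-line searching described earlier. The sample lines are invented.

```python
import re

lines = ["# comment", "value = 1  # trailing", "end;"]

# ^# matches only when # is the first character of the line.
starts_with_hash = [l for l in lines if re.search(r"^#", l)]
# ;$ matches only when ; is the last character of the line.
ends_with_semicolon = [l for l in lines if re.search(r";$", l)]
print(starts_with_hash, ends_with_semicolon)
```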
Most of this documentation was originally written by Ozan S. Yigit.
Additions by Neil Hodgson and Philippe Lhoste.
All of this document is in the public domain.