PERLREREF(1)           Perl Programmers Reference Guide           PERLREREF(1)

NAME

       perlreref - Perl Regular Expressions Reference

DESCRIPTION

       This is a quick reference to Perl's regular expressions.  For full
       information see perlre and perlop, as well as the "SEE ALSO" section in
       this document.

   OPERATORS
       "=~" determines to which variable the regex is applied.  In its
       absence, $_ is used.

           $var =~ /foo/;

       "!~" determines to which variable the regex is applied, and negates the
       result of the match; it returns false if the match succeeds, and true
       if it fails.

           $var !~ /foo/;
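
       As a minimal sketch of the two binding operators (the variable
       names and sample strings are made up for illustration):

```perl
use strict;
use warnings;

my $var = "seafood";

# =~ applies the pattern to $var instead of $_
my $has_foo = ($var =~ /foo/) ? 1 : 0;

# !~ is the negated form: true when the pattern does NOT match
my $lacks_bar = ($var !~ /bar/) ? 1 : 0;

# With no binding operator the match is applied to $_
my @hits = grep { /foo/ } ("food", "bar", "foo");

print "$has_foo $lacks_bar @hits\n";   # prints "1 1 food foo"
```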

       "m/pattern/msixpogcdualn" searches a string for a pattern match,
       applying the given options.

           m  Multiline mode - ^ and $ match internal lines
           s  match as a Single line - . matches \n
           i  case-Insensitive
           x  eXtended legibility - free whitespace and comments
           p  Preserve a copy of the matched string -
              ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be defined.
           o  compile pattern Once
           g  Global - all occurrences
           c  don't reset pos on failed matches when using /g
           a  restrict \d, \s, \w and [:posix:] to match ASCII only
           aa (two a's) additionally, under /i, ASCII never matches non-ASCII
           l  match according to current locale
           u  match according to Unicode rules
           d  match according to native rules unless something indicates
              Unicode
           n  Non-capture mode. Don't let () fill in $1, $2, etc...

       If 'pattern' is an empty string, the last successfully matched regex is
       used. Delimiters other than '/' may be used for both this operator and
       the following ones. The leading "m" can be omitted if the delimiter is
       '/'.
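
       The following sketch exercises a few of the modifiers listed above
       (/i, /m, /g, /x and /s). The sample text and variable names are
       illustrative assumptions, not part of the reference.

```perl
use strict;
use warnings;

my $text = "Alpha beta\nGAMMA delta\n";

# /i ignores case, /m lets ^ match after each internal newline,
# and /g in list context returns every capture.
my @first_words = $text =~ /^([a-z]+)/gim;

# /x allows whitespace and comments inside the pattern.
my ($year, $month) = "2021-03-31" =~ /
    (\d{4})   # four-digit year
    -
    (\d{2})   # two-digit month
/x;

# Without /s the dot stops at a newline; with /s it crosses it.
my $without_s = ($text =~ /beta.GAMMA/)  ? 1 : 0;
my $with_s    = ($text =~ /beta.GAMMA/s) ? 1 : 0;

print "@first_words | $year $month | $without_s $with_s\n";
# prints "Alpha GAMMA | 2021 03 | 0 1"
```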

       "qr/pattern/msixpodualn" lets you store a regex in a variable, or pass
       one around. The modifiers are as for "m//" and are stored within the
       regex.
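
       A small sketch of storing and reusing a compiled regex; $word and
       $pair are hypothetical names chosen for this example:

```perl
use strict;
use warnings;

# The /i modifier travels with the compiled regex.
my $word = qr/[a-z]+/i;

# A qr// object can be interpolated into a larger pattern.
my $pair = qr/($word)=($word)/;

my ($key, $value) = "Colour=Red" =~ $pair;

print "$key -> $value\n";   # prints "Colour -> Red"
```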

       "s/pattern/replacement/msixpogcedualnr" substitutes matches of
       'pattern' with 'replacement'. The modifiers are as for "m//", with
       two additions:

           e  Evaluate 'replacement' as an expression
           r  Return the substituted string and leave the original string
              untouched

       'e' may be specified multiple times. 'replacement' is interpreted as a
       double quoted string unless a single-quote ("'") is the delimiter.
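
       To illustrate the two extra modifiers, here is a sketch using made-up
       data; /r requires Perl 5.14 or later.

```perl
use strict;
use warnings;

my $prices = "apple 3, pear 5";

# /e evaluates the replacement as Perl code; /g applies it to every match.
# The substitution runs on a copy so $prices itself is not modified.
(my $doubled = $prices) =~ s/(\d+)/$1 * 2/ge;

# /r returns the changed string and leaves $prices untouched.
# The replacement is a double-quoted string, so \u works in it.
my $capitalised = $prices =~ s/(\w+)/\u$1/gr;

print "$doubled | $capitalised | $prices\n";
# prints "apple 6, pear 10 | Apple 3, Pear 5 | apple 3, pear 5"
```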

       "m?pattern?" is like "m/pattern/" but matches only once. No alternate
       delimiters can be used.  To let it match again, call reset().
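
       A sketch of the once-only behaviour, assuming the code runs in a
       single package; count_first() is a hypothetical helper:

```perl
use strict;
use warnings;

# Counts how many of the given strings the once-only match accepts.
sub count_first {
    my $n = 0;
    for my $s (@_) {
        $n++ if $s =~ m?a?;
    }
    return $n;
}

my $first  = count_first("a1", "a2", "a3");  # only the first string matches
my $second = count_first("a1", "a2");        # already matched once: nothing
reset;                                       # re-arm m?pattern?
my $third  = count_first("a1", "a2");        # matches once again

print "$first $second $third\n";   # prints "1 0 1"
```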

   SYNTAX
        \       Escapes the character immediately following it
        .       Matches any single character except a newline (unless /s is
                  used)
        ^       Matches at the beginning of the string (or line, if /m is used)
        $       Matches at the end of the string (or line, if /m is used)
        *       Matches the preceding element 0 or more times
        +       Matches the preceding element 1 or more times
        ?       Matches the preceding element 0 or 1 times
        {...}   Specifies a range of occurrences for the element preceding it
        [...]   Matches any one of the characters contained within the brackets
        (...)   Groups subexpressions for capturing to $1, $2...
        (?:...) Groups subexpressions without capturing (cluster)
        |       Matches either the subexpression preceding or following it
        \g1 or \g{1}, \g2 ...    Matches the text from the Nth group
        \1, \2, \3 ...           Matches the text from the Nth group
        \g-1 or \g{-1}, \g-2 ... Matches the text from the Nth previous group
        \g{name}     Named backreference
        \k<name>     Named backreference
        \k'name'     Named backreference
        (?P=name)    Named backreference (python syntax)
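
       The backreference forms can be sketched as follows; the sample
       strings and the group name "quote" are assumptions made for the
       example.

```perl
use strict;
use warnings;

# \1 repeats whatever group 1 captured: find a doubled word.
my ($dup) = "this is is a test" =~ /\b(\w+)\s+\1\b/;

# \g{-1} and \g{-2} count backwards from the current position.
my $is_abba = ("abba" =~ /\A(\w)(\w)\g{-1}\g{-2}\z/) ? 1 : 0;

# \k<name> refers to a named group; here it demands the same closing quote.
# A named group is also numbered, so the text between the quotes is $2.
my $quoted;
$quoted = $2 if q{say "hi" or 'bye'} =~ /(?<quote>["'])(.*?)\k<quote>/;

print "$dup $is_abba $quoted\n";   # prints "is 1 hi"
```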

   ESCAPE SEQUENCES
       These work as in normal strings.

          \a       Alarm (beep)
          \e       Escape
          \f       Formfeed
          \n       Newline
          \r       Carriage return
          \t       Tab
          \037     Char whose ordinal is the 3 octal digits, max \777
          \o{2307} Char whose ordinal is the octal number, unrestricted
          \x7f     Char whose ordinal is the 2 hex digits, max \xFF
          \x{263a} Char whose ordinal is the hex number, unrestricted
          \cx      Control-x
          \N{name} A named Unicode character or character sequence
          \N{U+263D} A Unicode character by hex ordinal

          \l  Lowercase next character
          \u  Titlecase next character
          \L  Lowercase until \E
          \U  Uppercase until \E
          \F  Foldcase until \E
          \Q  Disable pattern metacharacters until \E
          \E  End modification

       For Titlecase, see "Titlecase".

       This one works differently from normal strings:

          \b  An assertion, not backspace, except in a character class
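
       A few of these escapes in action, as a sketch with invented sample
       data:

```perl
use strict;
use warnings;

my $literal = "a.b*c";

# Interpolated as-is, '.' and '*' act as metacharacters.
my $as_pattern = ("aXbbc" =~ /\A$literal\z/)     ? 1 : 0;

# \Q...\E quotes them, so only the literal text matches.
my $as_text    = ("aXbbc" =~ /\A\Q$literal\E\z/) ? 1 : 0;
my $exact      = ("a.b*c" =~ /\A\Q$literal\E\z/) ? 1 : 0;

# \L lowercases until \E; \u titlecases the next character.
my $name = "\u\LjOHN sMITH\E";

# \x{...} and \N{U+...} name the same character by hex ordinal.
my $same = ("\x{263a}" eq "\N{U+263A}") ? 1 : 0;

# Inside a character class \b is a backspace, not a word boundary.
my $backspace = ("a\bb" =~ /a[\b]b/) ? 1 : 0;

print "$as_pattern $as_text $exact | $name | $same $backspace\n";
# prints "1 0 1 | John smith | 1 1"
```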

   CHARACTER CLASSES
          [amy]    Match 'a', 'm' or 'y'
          [f-j]    Dash specifies "range"
          [f-j-]   Dash escaped or at start or end means 'dash'
          [^f-j]   Caret indicates "match any character _except_ these"

       The following sequences (except "\N") work within or without a
       character class.  The first six are locale aware, all are Unicode
       aware. See perllocale and perlunicode for details.

          \d      A digit
          \D      A nondigit
          \w      A word character
          \W      A non-word character
          \s      A whitespace character
          \S      A non-whitespace character
          \h      A horizontal whitespace
          \H      A non horizontal whitespace
          \N      A non newline (when not followed by '{NAME}';
                  not valid in a character class; equivalent to [^\n]; it's
                  like '.' without /s modifier)
          \v      A vertical whitespace
          \V      A non vertical whitespace
          \R      A generic newline           (?>\v|\x0D\x0A)

          \pP     Match P-named (Unicode) property
          \p{...} Match Unicode property with name longer than 1 character
          \PP     Match non-P
          \P{...} Match lack of Unicode property with name longer than 1 char
          \X      Match Unicode extended grapheme cluster

       POSIX character classes and their Unicode and Perl equivalents:

                   ASCII-         Full-
          POSIX    range          range    backslash
        [[:...:]]  \p{...}        \p{...}   sequence    Description

        -----------------------------------------------------------------------
        alnum   PosixAlnum       XPosixAlnum            'alpha' plus 'digit'
        alpha   PosixAlpha       XPosixAlpha            Alphabetic characters
        ascii   ASCII                                   Any ASCII character
        blank   PosixBlank       XPosixBlank   \h       Horizontal whitespace;
                                                          full-range also
                                                          written as
                                                          \p{HorizSpace} (GNU
                                                          extension)
        cntrl   PosixCntrl       XPosixCntrl            Control characters
        digit   PosixDigit       XPosixDigit   \d       Decimal digits
        graph   PosixGraph       XPosixGraph            'alnum' plus 'punct'
        lower   PosixLower       XPosixLower            Lowercase characters
        print   PosixPrint       XPosixPrint            'graph' plus 'space',
                                                          but not any Controls
        punct   PosixPunct       XPosixPunct            Punctuation and Symbols
                                                          in ASCII-range; just
                                                          punct outside it
        space   PosixSpace       XPosixSpace   \s       Whitespace
        upper   PosixUpper       XPosixUpper            Uppercase characters
        word    PosixWord        XPosixWord    \w       'alnum' + Unicode marks
                                                          + connectors, like
                                                          '_' (Perl extension)
        xdigit  ASCII_Hex_Digit  XPosixXDigit           Hexadecimal digit,
                                                          ASCII-range is
                                                          [0-9A-Fa-f]

       There are also various synonyms, such as "\p{Alpha}" for
       "\p{XPosixAlpha}"; all are listed in "Properties accessible through
       \p{} and \P{}" in perluniprops.

       Within a character class:

           POSIX      traditional   Unicode
         [:digit:]       \d        \p{Digit}
         [:^digit:]      \D        \P{Digit}
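
       The difference between ASCII-range and full-range matching can be
       sketched like this. The sample strings are invented, and the /u and
       /a modifiers assume Perl 5.14 or later.

```perl
use strict;
use warnings;

# "cafe" with an e-acute (U+00E9) as its last letter, then a number
my $s = "caf\x{e9} 42";

# Under /u, U+00E9 is a word character; under /a only [A-Za-z0-9_] count.
my @unicode = $s =~ /(\w+)/gu;
my @ascii   = $s =~ /(\w+)/ga;

# POSIX classes are written inside a bracketed character class.
my @runs   = "a1b22" =~ /([[:digit:]]+)/g;
my $hex_ok = ("1F" =~ /\A[[:xdigit:]]+\z/) ? 1 : 0;

# \p{Lu} matches any uppercase letter.
my @caps = "Hello World" =~ /(\p{Lu})/g;

print length($unicode[0]), " $ascii[0] | @runs | $hex_ok | @caps\n";
# prints "4 caf | 1 22 | 1 | H W"
```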

   ANCHORS
       All are zero-width assertions.

          ^  Match string start (or line, if /m is used)
          $  Match string end (or line, if /m is used) or before newline
          \b{} Match boundary of type specified within the braces
          \B{} Match wherever \b{} doesn't match
          \b Match word boundary (between \w and \W)
          \B Match except at word boundary (between \w and \w or \W and \W)
          \A Match string start (regardless of /m)
          \Z Match string end (before optional newline)
          \z Match absolute string end
          \G Match where previous m//g left off
          \K Keep the stuff left of the \K, don't include it in $&
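
       A sketch contrasting \Z with \z, and showing \G and \K at work. The
       token labels and the path are illustrative assumptions.

```perl
use strict;
use warnings;

my $line = "end\n";
my $upper_Z = ($line =~ /end\Z/) ? 1 : 0;   # \Z tolerates one final newline
my $lower_z = ($line =~ /end\z/) ? 1 : 0;   # \z does not

# \G with /gc resumes where the previous match stopped, which allows a
# simple tokenizer: a failed attempt leaves pos() where it was.
my $src = "ab12cd";
my @tokens;
while (1) {
    if    ($src =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1" }
    elsif ($src =~ /\G(\d+)/gc)    { push @tokens, "NUM:$1"  }
    else                           { last }
}

# \K keeps everything to its left out of the matched text, so only the
# final path component is replaced.
(my $path = "/usr/local/bin/perl") =~ s{.*/\K[^/]+\z}{perl5};

print "$upper_Z $lower_z | @tokens | $path\n";
# prints "1 0 | WORD:ab NUM:12 WORD:cd | /usr/local/bin/perl5"
```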

   QUANTIFIERS
       Quantifiers are greedy by default: at the leftmost position where a
       match is possible, each takes as many repetitions as it can.

          Maximal Minimal Possessive Allowed range
          ------- ------- ---------- -------------
          {n,m}   {n,m}?  {n,m}+     Must occur at least n times
                                     but no more than m times
          {n,}    {n,}?   {n,}+      Must occur at least n times
          {n}     {n}?    {n}+       Must occur exactly n times
          *       *?      *+         0 or more times (same as {0,})
          +       +?      ++         1 or more times (same as {1,})
          ?       ??      ?+         0 or 1 time (same as {0,1})

       The possessive forms (new in Perl 5.10) prevent backtracking: what gets
       matched by a pattern with a possessive quantifier will not be
       backtracked into, even if that causes the whole match to fail.

       There is no quantifier "{,n}". That's currently illegal; use "{0,n}"
       instead.
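
       The three quantifier flavours can be compared with a short sketch;
       the sample strings are assumptions made for the example.

```perl
use strict;
use warnings;

my $html = "<b>one</b><b>two</b>";

# Maximal (greedy): .* runs to the last closing tag.
my ($greedy) = $html =~ m{<b>(.*)</b>};

# Minimal: .*? stops at the first closing tag.
my ($lazy) = $html =~ m{<b>(.*?)</b>};

# Possessive: a++ takes every 'a' and never gives one back, so the
# trailing 'a' in the pattern can no longer match.
my $plain      = ("aaa" =~ /a+a/)  ? 1 : 0;
my $possessive = ("aaa" =~ /a++a/) ? 1 : 0;

# {n,m} is greedy too: it takes four digits here, not two.
my ($bounded) = "1234567" =~ /\A(\d{2,4})/;

print "$greedy | $lazy | $plain $possessive | $bounded\n";
# prints "one</b><b>two | one | 1 0 | 1234"
```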

   EXTENDED CONSTRUCTS
          (?#text)          A comment
          (?:...)           Groups subexpressions without capturing (cluster)
          (?pimsx-imsx:...) Enable/disable option (as per m// modifiers)
          (?=...)           Zero-width positive lookahead assertion
          (*pla:...)        Same, starting in 5.32; experimentally in 5.28
          (*positive_lookahead:...) Same, same versions as *pla
          (?!...)           Zero-width negative lookahead assertion
          (*nla:...)        Same, starting in 5.32; experimentally in 5.28
          (*negative_lookahead:...) Same, same versions as *nla
          (?<=...)          Zero-width positive lookbehind assertion
          (*plb:...)        Same, starting in 5.32; experimentally in 5.28
          (*positive_lookbehind:...) Same, same versions as *plb
          (?<!...)          Zero-width negative lookbehind assertion
          (*nlb:...)        Same, starting in 5.32; experimentally in 5.28
          (*negative_lookbehind:...) Same, same versions as *nlb
          (?>...)           Grab what we can, prohibit backtracking
          (*atomic:...)     Same, starting in 5.32; experimentally in 5.28
          (?|...)           Branch reset
          (?<name>...)      Named capture
          (?'name'...)      Named capture
          (?P<name>...)     Named capture (python syntax)
          (?[...])          Extended bracketed character class
          (?{ code })       Embedded code, return value becomes $^R
          (??{ code })      Dynamic regex, return value used as regex
          (?N)              Recurse into subpattern number N
          (?-N), (?+N)      Recurse into Nth previous/next subpattern
          (?R), (?0)        Recurse at the beginning of the whole pattern
          (?&name)          Recurse into a named subpattern
          (?P>name)         Recurse into a named subpattern (python syntax)
          (?(cond)yes|no)
          (?(cond)yes)      Conditional expression, where "(cond)" can be:
                            (?=pat)   lookahead; also (*pla:pat)
                                      (*positive_lookahead:pat)
                            (?!pat)   negative lookahead; also (*nla:pat)
                                      (*negative_lookahead:pat)
                            (?<=pat)  lookbehind; also (*plb:pat)
                                      (*positive_lookbehind:pat)
                            (?<!pat)  negative lookbehind; also (*nlb:pat)
                                      (*negative_lookbehind:pat)
                            (N)       subpattern N has matched something
                            (<name>)  named subpattern has matched something
                            ('name')  named subpattern has matched something
                            (?{code}) code condition
                            (R)       true if recursing
                            (RN)      true if recursing into Nth subpattern
                            (R&name)  true if recursing into named subpattern
                            (DEFINE)  always false; no 'no' pattern is allowed
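
       Several of these constructs are sketched below: named capture,
       lookaround, branch reset, recursion and a conditional. All sample
       data and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Named captures land in %+; copy them out while the match is in scope.
my %date;
if ("2021-03-31" =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
    %date = %+;
}

# Lookbehind plus lookahead: a zero-width spot that has a digit before it
# and a multiple of three digits after it, up to the end of the string.
(my $n = "1234567") =~ s/(?<=\d)(?=(?:\d{3})+\z)/,/g;

# Branch reset: both alternatives capture into $1.
my ($size) = "12em" =~ /(?|(\d+)px|(\d+)em)/;

# (?1) recurses into group 1, which matches balanced parentheses.
my ($group) = "f(a(b)c)" =~ /(\((?:[^()]++|(?1))*\))/;

# Conditional: require ')' only if the optional '(' in group 1 matched.
my $paren = qr/\A(\()?\w+(?(1)\))\z/;
my @ok;
push @ok, ($_ =~ $paren ? 1 : 0) for "x", "(x)", "(x", "x)";

print "$date{year}/$date{month}/$date{day} $n $size $group @ok\n";
# prints "2021/03/31 1,234,567 12 (a(b)c) 1 1 0 0"
```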

   VARIABLES
          $_    Default variable for operators to use

          $`    Everything prior to the matched string
          $&    Entire matched string
          $'    Everything after the matched string

          ${^PREMATCH}   Everything prior to the matched string
          ${^MATCH}      Entire matched string
          ${^POSTMATCH}  Everything after the matched string

       Note to those still using Perl 5.18 or earlier: The use of "$`", "$&"
       or "$'" will slow down all regex use within your program. Consult
       perlvar for "@-" to see equivalent expressions that won't cause a
       slowdown.  See also Devel::SawAmpersand. Starting with Perl 5.10, you
       can also use the equivalent variables "${^PREMATCH}", "${^MATCH}" and
       "${^POSTMATCH}", but for them to be defined, you have to specify the
       "/p" (preserve) modifier on your regular expression.  From Perl 5.20
       on, the use of "$`", "$&" and "$'" makes no speed difference.

          $1, $2 ...  hold the Nth captured expression
          $+    Last parenthesized pattern match
          $^N   Holds the most recently closed capture
          $^R   Holds the result of the last (?{...}) expr
          @-    Offsets of starts of groups. $-[0] holds start of whole match
          @+    Offsets of ends of groups. $+[0] holds end of whole match
          %+    Named capture groups
          %-    Named capture groups, as array refs

       Captured groups are numbered according to their opening paren.
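
       As a sketch of how these variables relate after one successful
       match (the input string and the %seen hash are hypothetical):

```perl
use strict;
use warnings;

my $str = "key=value; other";
my %seen;

# /p makes ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} available.
if ($str =~ /(?<k>\w+)=(?<v>\w+)/p) {
    %seen = (
        pre        => ${^PREMATCH},
        match      => ${^MATCH},
        post       => ${^POSTMATCH},
        key        => $+{k},        # named capture via %+
        value      => $+{v},
        last_paren => $+,           # last parenthesized match
        # @- and @+ give the offsets of group 2, so substr can rebuild it
        by_offset  => substr($str, $-[2], $+[2] - $-[2]),
    );
}

print "[$seen{pre}] [$seen{match}] [$seen{post}] ",
      "$seen{key} $seen{value} $seen{last_paren} $seen{by_offset}\n";
# prints "[] [key=value] [; other] key value value value"
```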

   FUNCTIONS
          lc          Lowercase a string
          lcfirst     Lowercase first char of a string
          uc          Uppercase a string
          ucfirst     Titlecase first char of a string
          fc          Foldcase a string

          pos         Return or set current match position
          quotemeta   Quote metacharacters
          reset       Reset m?pattern? status
          study       Analyze string for optimizing matching

          split       Use a regex to split a string into parts

       The first five of these are like the escape sequences "\L", "\l", "\U",
       "\u", and "\F".  For Titlecase, see "Titlecase"; for Foldcase, see
       "Foldcase".
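
       A brief sketch of some of these functions, using invented input;
       fc() assumes Perl 5.16 or later with the 'fc' feature enabled.

```perl
use strict;
use warnings;
use feature 'fc';

# split takes a regex as its separator.
my @fields = split /\s*,\s*/, "a , b,c";

# quotemeta backslashes every ASCII non-word character.
my $safe = quotemeta("1+1=2");

# A scalar-context /g match records where it stopped; pos() reads that.
my $s = "aXbXc";
my $found = $s =~ /X/g;
my $after_first = pos($s);
pos($s) = 0;                      # rewind so the next /g starts over

# Case functions mirror \L, \u and \F.
my $name = ucfirst(lc("hELLO"));
my $same = (fc("HeLLo") eq fc("hEllO")) ? 1 : 0;

print "@fields | $safe | $after_first | $name | $same\n";
# prints "a b c | 1\+1\=2 | 2 | Hello | 1"
```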

   TERMINOLOGY
       Titlecase

       Unicode concept which most often is equal to uppercase, but for certain
       characters like the German "sharp s" there is a difference.

       Foldcase

       Unicode form that is useful when comparing strings regardless of case,
       as certain characters have complex one-to-many case mappings. Primarily
       a variant of lowercase.
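
       Both terms can be illustrated with characters whose case mappings
       differ. This is a sketch: U+01C6 and U+1E9E are chosen here as
       examples and are not mentioned in the reference itself. fc()
       assumes Perl 5.16 or later.

```perl
use strict;
use warnings;
use feature 'fc';

# U+01C6 (small letter dz with caron) has distinct title and upper forms.
my $dz    = "\x{1C6}";
my $title = sprintf "U+%04X", ord(ucfirst($dz));
my $upper = sprintf "U+%04X", ord(uc($dz));

# U+1E9E (capital sharp s) folds to two characters, which lc() does not do.
my $folded  = fc("\x{1E9E}");
my $lowered = lc("\x{1E9E}");

print "$title $upper $folded ", length($lowered), "\n";
# prints "U+01C5 U+01C4 ss 1"
```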

AUTHOR

       Iain Truskett. Updated by the Perl 5 Porters.

       This document may be distributed under the same terms as Perl itself.

SEE ALSO

       ·   perlretut for a tutorial on regular expressions.

       ·   perlrequick for a rapid tutorial.

       ·   perlre for more details.

       ·   perlvar for details on the variables.

       ·   perlop for details on the operators.

       ·   perlfunc for details on the functions.

       ·   perlfaq6 for FAQs on regular expressions.

       ·   perlrebackslash for a reference on backslash sequences.

       ·   perlrecharclass for a reference on character classes.

       ·   The re module to alter behaviour and aid debugging.

       ·   "Debugging Regular Expressions" in perldebug.

       ·   perluniintro, perlunicode, charnames and perllocale for details on
           regexes and internationalisation.

       ·   Mastering Regular Expressions by Jeffrey Friedl
           (<http://oreilly.com/catalog/9780596528126/>) for a thorough
           grounding and reference on the topic.

THANKS

       David P.C. Wollmann, Richard Soderberg, Sean M. Burke, Tom
       Christiansen, Jim Cromie, and Jeffrey Goff for useful advice.



perl v5.32.1                      2021-03-31                      PERLREREF(1)