REGEX(3) Linux Programmer's Manual REGEX(3)

NAME
regcomp, regexec, regerror, regfree - POSIX regex functions

SYNOPSIS
#include <sys/types.h>
#include <regex.h>

int regcomp(regex_t *preg, const char *regex, int cflags);
int regexec(const regex_t *preg, const char *string, size_t nmatch,
            regmatch_t pmatch[], int eflags);
size_t regerror(int errcode, const regex_t *preg, char *errbuf,
                size_t errbuf_size);
void regfree(regex_t *preg);

POSIX REGEX COMPILING
regcomp() compiles a regular expression into a form that is suitable
for subsequent regexec() searches.

regcomp() is supplied with preg, a pointer to a pattern buffer storage
area; regex, a pointer to the null-terminated pattern string; and
cflags, flags that determine the type of compilation.

All regular expression searching must be done via a compiled pattern
buffer, so regexec() must always be supplied with the address of a
pattern buffer that has been initialized by regcomp().
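
The compile-then-search order described above can be sketched as follows. This is a minimal illustration, not a reference implementation; the helper name matches_basic() is hypothetical, and it assumes basic regular expression syntax (cflags of 0) and a caller that only needs a yes/no answer.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` contains a match for the
 * basic regular expression `pattern`, 0 if not, -1 on any error. */
int matches_basic(const char *pattern, const char *text)
{
    regex_t preg;
    int rc;

    assert(pattern != NULL && text != NULL);

    /* Compile first: regexec() only accepts a compiled pattern buffer. */
    if (regcomp(&preg, pattern, 0) != 0)
        return -1;

    /* nmatch is 0 and pmatch is NULL because no offsets are wanted. */
    rc = regexec(&preg, text, 0, NULL, 0);
    regfree(&preg);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

A call such as matches_basic("^ab*c$", "abbbc") would be expected to return 1, while a malformed pattern makes regcomp() fail and the helper return -1.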

cflags may be the bitwise-or of zero or more of the following:

REG_EXTENDED
    Use POSIX Extended Regular Expression syntax when interpreting
    regex. If not set, POSIX Basic Regular Expression syntax is used.

REG_ICASE
    Do not differentiate case. Subsequent regexec() searches using
    this pattern buffer will be case insensitive.

REG_NOSUB
    Do not report the position of matches. The nmatch and pmatch
    arguments to regexec() are ignored if the pattern buffer supplied
    was compiled with this flag set.

REG_NEWLINE
    Match-any-character operators don't match a newline.

    A nonmatching list ([^...]) not containing a newline does not
    match a newline.

    The match-beginning-of-line operator (^) matches the empty string
    immediately after a newline, regardless of whether eflags, the
    execution flags of regexec(), contains REG_NOTBOL.

    The match-end-of-line operator ($) matches the empty string
    immediately before a newline, regardless of whether eflags
    contains REG_NOTEOL.
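
To see how the compilation flags change the outcome of a search, a small wrapper that takes cflags as a parameter can help. The sketch below is illustrative only; search_with_flags() is a hypothetical name, and REG_NOSUB is always added on the assumption that only a yes/no answer is needed.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile `pattern` with the given `cflags` and
 * search `text`. Returns 1 on a match, 0 on no match, -1 on error. */
int search_with_flags(const char *pattern, const char *text, int cflags)
{
    regex_t preg;
    int rc;

    assert(pattern != NULL && text != NULL);

    /* REG_NOSUB is OR-ed in because match offsets are not used here. */
    if (regcomp(&preg, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&preg, text, 0, NULL, 0);
    regfree(&preg);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

With this wrapper, "ab|cd" finds "cd" only when REG_EXTENDED is given (in basic syntax the bar is an ordinary character), "hello" finds "HELLO" only with REG_ICASE, and "^b" finds the second line of "a\nb" only with REG_NEWLINE.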

POSIX REGEX MATCHING
regexec() matches a null-terminated string against the precompiled
pattern buffer, preg. nmatch and pmatch are used to provide information
regarding the location of any matches. eflags is either zero or the
bitwise-or of one or both of REG_NOTBOL and REG_NOTEOL, which change
the matching behaviour as described below.

REG_NOTBOL
    The match-beginning-of-line operator always fails to match (but
    see the compilation flag REG_NEWLINE above). This flag may be used
    when different portions of a string are passed to regexec() and
    the beginning of the string should not be interpreted as the
    beginning of the line.

REG_NOTEOL
    The match-end-of-line operator always fails to match (but see the
    compilation flag REG_NEWLINE above).
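
The "different portions of a string" case for REG_NOTBOL typically arises when a caller walks through a string and restarts the search after each match. The following is one possible sketch, assuming basic syntax and non-overlapping matches; count_matches() is a hypothetical helper, and the handling of empty matches is a design choice of this example rather than something the interface prescribes.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count non-overlapping matches of the basic
 * regular expression `pattern` in `text`; -1 on compile error. */
int count_matches(const char *pattern, const char *text)
{
    regex_t preg;
    regmatch_t m[1];
    const char *p = text;
    int count = 0;
    int eflags = 0;

    assert(pattern != NULL && text != NULL);

    if (regcomp(&preg, pattern, 0) != 0)
        return -1;

    while (regexec(&preg, p, 1, m, eflags) == 0) {
        count++;
        if (m[0].rm_eo > m[0].rm_so) {
            p += m[0].rm_eo;          /* resume just past this match */
        } else if (p[m[0].rm_eo] == '\0') {
            break;                    /* empty match at end of string */
        } else {
            p += m[0].rm_eo + 1;      /* empty match: step one byte on */
        }
        /* Later calls start mid-string, so ^ must not match at p. */
        eflags = REG_NOTBOL;
    }

    regfree(&preg);
    return count;
}
```

Without REG_NOTBOL on the later calls, a pattern such as "^a" would match at the start of every remaining portion of "aaa" instead of only once.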

BYTE OFFSETS
Unless REG_NOSUB was set for the compilation of the pattern buffer, it
is possible to obtain the location of each substring match. pmatch must
be dimensioned to have at least nmatch elements. regexec() fills these
in with the offsets of the substring matches. Any unused structure
elements will contain the value -1.

The regmatch_t structure, which is the type of pmatch, is defined in
<regex.h>.

    typedef struct
    {
        regoff_t rm_so;
        regoff_t rm_eo;
    } regmatch_t;

Element 0 of pmatch describes the match of the entire regular
expression; each subsequent element i describes the match of the i-th
parenthesized subexpression. An rm_so value that is not -1 gives the
start offset of that substring within the string. The corresponding
rm_eo value gives the end offset of the match, which is the offset of
the first character after the matching text.
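
As an illustration of how the offsets are typically consumed, the sketch below copies two subexpression matches out of the searched string and checks for the -1 marker on an unused element. The helper names, the "name=digits" input format, and the pattern are assumptions made for this example only.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Copy the bytes described by `m` out of `text` into `dst` (`size` bytes,
 * always null-terminated). An unused element (rm_so == -1) yields "". */
static void copy_span(char *dst, size_t size, const char *text, regmatch_t m)
{
    size_t len = (m.rm_so == -1) ? 0 : (size_t)(m.rm_eo - m.rm_so);

    if (len >= size)
        len = size - 1;
    if (len > 0)
        memcpy(dst, text + m.rm_so, len);
    dst[len] = '\0';
}

/* Hypothetical helper: parse "name" or "name=digits". Returns the number
 * of fields found (1 or 2), or -1 if `text` does not match. */
int split_key_value(const char *text, char *key, char *value, size_t size)
{
    regex_t preg;
    regmatch_t pm[4];   /* [0] whole, [1] name, [2] "=digits", [3] digits */
    int rc;

    assert(text != NULL && key != NULL && value != NULL && size > 0);

    if (regcomp(&preg, "^([a-z]+)(=([0-9]+))?$", REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&preg, text, 4, pm, 0);
    regfree(&preg);
    if (rc != 0)
        return -1;

    /* rm_eo - rm_so is the length of each matched substring. */
    copy_span(key, size, text, pm[1]);
    copy_span(value, size, text, pm[3]);
    return pm[3].rm_so == -1 ? 1 : 2;
}
```

For the input "port=8080" this would be expected to yield "port" and "8080"; for "verbose" the optional group takes no part in the match, so pm[3].rm_so is -1 and the value comes back empty.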

POSIX ERROR REPORTING
regerror() turns the error codes that can be returned by both regcomp()
and regexec() into error message strings.

regerror() is passed the error code, errcode; the pattern buffer, preg;
a pointer to a character string buffer, errbuf; and the size of the
string buffer, errbuf_size. It returns the size of the errbuf required
to contain the null-terminated error message string. If errbuf is not
NULL and errbuf_size is nonzero, errbuf is filled in with the first
errbuf_size - 1 characters of the error message and a terminating null
byte.
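
Because regerror() reports the buffer size it needs, a common approach is to call it twice: once with no buffer to learn the size, and once to fetch the message. The following is a minimal sketch of that pattern; regex_error_message() is a hypothetical helper name, and returning a heap-allocated string is just one possible design.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc()ed copy of the message for
 * `errcode`, or NULL if allocation fails. The caller frees the result. */
char *regex_error_message(int errcode, const regex_t *preg)
{
    /* First call: no buffer, only ask how many bytes are required.
     * The returned size includes the terminating null byte. */
    size_t needed = regerror(errcode, preg, NULL, 0);
    char *msg;

    assert(needed > 0);

    msg = malloc(needed);
    if (msg != NULL)
        regerror(errcode, preg, msg, needed);   /* second call: fill it */
    return msg;
}
```

A caller would pass the nonzero value returned by regcomp() or regexec() together with the pattern buffer it was using, print the message, and then free() it.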

POSIX PATTERN BUFFER FREEING
Supplying regfree() with a precompiled pattern buffer, preg, frees the
memory allocated to the pattern buffer by the compiling process,
regcomp().
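
A pattern buffer can be reused for any number of searches before it is freed, so the usual arrangement is one regcomp(), many regexec() calls, and one regfree(). The sketch below illustrates that lifetime; count_matching() and its array-of-strings interface are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: count how many of the `n` strings in `lines`
 * match the basic regular expression `pattern`; -1 on compile error. */
int count_matching(const char *pattern, const char *const lines[], size_t n)
{
    regex_t preg;
    int count = 0;

    assert(pattern != NULL && lines != NULL);

    if (regcomp(&preg, pattern, REG_NOSUB) != 0)
        return -1;                  /* compilation failed */

    /* The same compiled buffer serves every search. */
    for (size_t i = 0; i < n; i++)
        if (regexec(&preg, lines[i], 0, NULL, 0) == 0)
            count++;

    regfree(&preg);                 /* release it once, when done */
    return count;
}
```

After regfree() the buffer must be compiled again with regcomp() before it is passed to regexec().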

RETURN VALUE
regcomp() returns zero for a successful compilation or an error code
for failure.

regexec() returns zero for a successful match or REG_NOMATCH for
failure.

ERRORS
The following errors can be returned by regcomp():

REG_BADBR
    Invalid use of back reference operator.

REG_BADPAT
    Invalid use of pattern operators such as group or list.

REG_BADRPT
    Invalid use of repetition operators, such as using '*' as the
    first character.

REG_EBRACE
    Unmatched brace interval operators.

REG_EBRACK
    Unmatched bracket list operators.

REG_ECOLLATE
    Invalid collating element.

REG_ECTYPE
    Unknown character class name.

REG_EEND
    Nonspecific error. This is not defined by POSIX.2.

REG_EESCAPE
    Trailing backslash.

REG_EPAREN
    Unmatched parenthesis group operators.

REG_ERANGE
    Invalid use of the range operator; for example, the ending point
    of the range occurs prior to the starting point.

REG_ESIZE
    Compiled regular expression requires a pattern buffer larger than
    64 kB. This is not defined by POSIX.2.

REG_ESPACE
    The regex routines ran out of memory.

REG_ESUBREG
    Invalid back reference to a subexpression.

CONFORMING TO
POSIX.1-2001.

SEE ALSO
regex(7), GNU regex manual

GNU 1998-05-08 REGEX(3)