regexpr(3GEN)     String Pattern-Matching Library Functions     regexpr(3GEN)

NAME
regexpr, compile, step, advance - regular expression compile and match
routines

SYNOPSIS
     cc [flag]... [file]... -lgen [library]...

     #include <regexpr.h>

     char *compile(char *instring, char *expbuf, const char *endbuf);

     int step(const char *string, const char *expbuf);

     int advance(const char *string, const char *expbuf);

     extern char *loc1, *loc2, *locs;

     extern int nbra, regerrno, reglength;

     extern char *braslist[], *braelist[];

DESCRIPTION
These routines compile regular expressions and match the compiled
expressions against lines. The regular expressions compiled are in the
form used by ed(1).

The parameter instring is a null-terminated string representing the
regular expression.

The parameter expbuf points to the place where the compiled regular
expression is to be placed. If expbuf is NULL, compile() uses
malloc(3C) to allocate the space for the compiled regular expression.
If an error occurs, this space is freed. The caller is responsible for
freeing this space once the compiled regular expression is no longer
needed.

The parameter endbuf is one more than the highest address where the
compiled regular expression may be placed. This argument is ignored if
expbuf is NULL. If the compiled expression cannot fit in
(endbuf−expbuf) bytes, compile() returns NULL and regerrno (see below)
is set to 50.

For step() and advance(), the parameter string is a pointer to a string
of characters to be checked for a match. This string should be
null-terminated.

For step() and advance(), the parameter expbuf is the compiled regular
expression obtained by a call to compile().

The function step() returns non-zero if the given string matches the
regular expression, and zero if it does not. If there is a match,
step() sets two external character pointers, loc1 and loc2, as a side
effect. loc1 points to the first character that matched the regular
expression, and loc2 points to the character after the last character
that matched. Thus, if the regular expression matches the entire line,
loc1 points to the first character of string and loc2 points to the
null at the end of string.
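
The loc1 and loc2 semantics can be tried out on systems that lack
libgen. The sketch below is not the regexpr interface: it swaps in
POSIX regcomp() and regexec() from <regex.h> as a stand-in, and
match_span() is a hypothetical helper name. It assumes that POSIX basic
regular expressions are close enough to the ed(1) syntax for simple
patterns. The offsets it reports play the role of loc1 − string and
loc2 − string.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of regexpr.h. Finds the first match of
 * the basic regular expression `pattern` in `string` with POSIX
 * regcomp()/regexec(), used here as a stand-in for compile()/step().
 * Returns 1 on a match, 0 on no match, -1 if the pattern fails to compile.
 */
int match_span(const char *pattern, const char *string,
               size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m[1];
    int found;

    assert(pattern != NULL && string != NULL);
    assert(start != NULL && end != NULL);

    /* cflags of 0 selects POSIX basic regular expressions. */
    if (regcomp(&re, pattern, 0) != 0)
        return -1;

    found = (regexec(&re, string, 1, m, 0) == 0);
    if (found) {
        *start = (size_t)m[0].rm_so;  /* analogue of loc1 - string */
        *end = (size_t)m[0].rm_eo;    /* analogue of loc2 - string */
    }
    regfree(&re);
    return found;
}
```

With pattern ^abc$ and string abc, the helper is expected to report
offsets 0 and 3, which corresponds to loc1 pointing at the first
character and loc2 pointing at the terminating null.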

The purpose of step() is to step through the string argument until a
match is found or until the end of string is reached. If the regular
expression begins with ^, step() tries to match the regular expression
at the beginning of the string only.

The advance() function is similar to step(), but it sets only the
variable loc2 and always restricts matches to the beginning of the
string.

To find successive matches in the same string of characters, set locs
equal to loc2 and call step() with string equal to loc2. locs is used
by commands like ed and sed so that global substitutions like s/y*//g
do not loop forever. It is NULL by default.
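
The reason a guard such as locs is needed shows up in any loop over
successive matches: a pattern like y* can match the empty string, so
the loop must be forced to make progress. The following sketch
illustrates that idea with POSIX regcomp() and regexec() as a stand-in
for compile() and step(); it does not use locs itself. count_matches()
is a hypothetical helper, and skipping one character after an empty
match is an assumption about how to make progress, not a description
of what ed or sed do internally.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of regexpr.h. Counts successive
 * non-overlapping, non-empty matches of the basic regular expression
 * `pattern` in `string`. Returns -1 if the pattern fails to compile.
 */
int count_matches(const char *pattern, const char *string)
{
    regex_t re;
    regmatch_t m[1];
    const char *p = string;
    int count = 0;

    assert(pattern != NULL && string != NULL);

    if (regcomp(&re, pattern, 0) != 0)
        return -1;

    /* After the first search, p is no longer the start of the line. */
    while (regexec(&re, p, 1, m, p == string ? 0 : REG_NOTBOL) == 0) {
        if (m[0].rm_eo == m[0].rm_so) {
            /* Empty match: stop at the end, otherwise skip one character
             * so the loop cannot spin on the same position forever. */
            if (p[m[0].rm_so] == '\0')
                break;
            p += m[0].rm_so + 1;
            continue;
        }
        count++;
        p += m[0].rm_eo;  /* resume right after the match, like loc2 */
    }
    regfree(&re);
    return count;
}
```

Without the empty-match branch, the call count_matches("y*", "ayyb")
would never advance past the first character.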

The external variable nbra gives the number of subexpressions in the
compiled regular expression. braslist and braelist are arrays of
character pointers that point to the start and end of the nbra
subexpressions in the matched string. For example, after calling
step() or advance() with string sabcdefg and regular expression
\(abcdef\), braslist[0] will point at a and braelist[0] will point at
g. These arrays are used by commands like ed and sed for substitute
replacement patterns that contain the \n notation for subexpressions.
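
The sabcdefg example can be reproduced with the POSIX regex interface,
used here only as a stand-in for step() and the braslist/braelist
arrays. In this sketch, first_group_span() is a hypothetical helper;
re_nsub plays the role of nbra, and the offsets of the first
subexpression play the role of braslist[0] − string and
braelist[0] − string.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of regexpr.h. Matches the basic regular
 * expression `pattern` against `string` and reports the offsets of the
 * first \( \) subexpression. Returns 1 if that subexpression took part
 * in a match, 0 if not, -1 if the pattern fails to compile.
 */
int first_group_span(const char *pattern, const char *string,
                     size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m[2];  /* m[0] is the whole match, m[1] the first group */
    int found = 0;

    assert(pattern != NULL && string != NULL);
    assert(start != NULL && end != NULL);

    if (regcomp(&re, pattern, 0) != 0)
        return -1;

    /* re.re_nsub is the number of subexpressions, the analogue of nbra. */
    if (re.re_nsub >= 1 &&
        regexec(&re, string, 2, m, 0) == 0 &&
        m[1].rm_so != -1) {
        *start = (size_t)m[1].rm_so;
        *end = (size_t)m[1].rm_eo;
        found = 1;
    }
    regfree(&re);
    return found;
}
```

For string sabcdefg and pattern \(abcdef\), the reported offsets are 1
and 7, the positions of a and g.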

Note that it is not necessary to use the external variables regerrno,
nbra, loc1, loc2, locs, braelist, and braslist if one is only checking
whether or not a string matches a regular expression.

EXAMPLES
Example 1 The following is similar to the regular expression code from
grep:

     #include <regexpr.h>
      . . .
     /* A NULL expbuf asks compile() to allocate the buffer with malloc(3C). */
     if ((expbuf = compile(*argv, (char *)0, (char *)0)) == (char *)0)
             regerr(regerrno);
      . . .
     /* A non-zero result means linebuf contains a match. */
     if (step(linebuf, expbuf))
             succeed();

RETURN VALUES
If compile() succeeds, it returns a non-NULL pointer whose value
depends on expbuf. If expbuf is non-NULL, compile() returns a pointer
to the byte after the last byte in the compiled regular expression.
The length of the compiled regular expression is stored in reglength.
Otherwise, compile() returns a pointer to the space allocated by
malloc(3C).

The functions step() and advance() return non-zero if the given string
matches the regular expression, and zero if it does not.

ERRORS
If an error is detected when compiling the regular expression,
compile() returns a NULL pointer and sets regerrno to one of the
non-zero error numbers indicated below:

ERROR     MEANING
11        Range endpoint too large.
16        Bad Number.
25        "\digit" out of range.
36        Illegal or missing delimiter.
41        No remembered string search.
42        \(~\) imbalance.
43        Too many \(.
44        More than 2 numbers given in \{~\}.
45        } expected after \.
46        First number exceeds second in \{~\}.
49        [] imbalance.
50        Regular expression overflow.
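
A caller that wants to report compile() failures could map regerrno to
the messages in the table above. The following is a minimal sketch
only: regerr_message() is a hypothetical helper that is not provided by
regexpr.h, and the message strings are copied from the table with its
apparent typos corrected.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of regexpr.h. Returns the message for a
 * regerrno value listed in the ERRORS table, or NULL for any other value.
 */
const char *regerr_message(int code)
{
    static const struct {
        int code;
        const char *text;
    } table[] = {
        { 11, "Range endpoint too large." },
        { 16, "Bad Number." },
        { 25, "\"\\digit\" out of range." },
        { 36, "Illegal or missing delimiter." },
        { 41, "No remembered string search." },
        { 42, "\\(~\\) imbalance." },
        { 43, "Too many \\(." },
        { 44, "More than 2 numbers given in \\{~\\}." },
        { 45, "} expected after \\." },
        { 46, "First number exceeds second in \\{~\\}." },
        { 49, "[] imbalance." },
        { 50, "Regular expression overflow." },
    };
    size_t i;

    /* Linear search is fine for a table of this size. */
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i].code == code)
            return table[i].text;
    }
    return NULL;
}
```

A reporting routine such as the regerr() call in the grep-style example
could print regerr_message(regerrno) when it is non-NULL and fall back
to the bare number otherwise.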

ATTRIBUTES
See attributes(5) for descriptions of the following attributes:

┌─────────────────────────────┬─────────────────────────────┐
│      ATTRIBUTE TYPE         │      ATTRIBUTE VALUE        │
├─────────────────────────────┼─────────────────────────────┤
│MT-Level                     │MT-Safe                      │
└─────────────────────────────┴─────────────────────────────┘

SEE ALSO
ed(1), grep(1), sed(1), malloc(3C), attributes(5), regexp(5)

NOTES
When compiling multi-threaded applications, the _REENTRANT flag must be
defined on the compile line. This flag should only be used in
multi-threaded applications.

SunOS 5.11                       29 Dec 1996                     regexpr(3GEN)