Str(3) OCaml library Str(3)

NAME
Str - Regular expressions and high-level string processing

Module
Module Str

Documentation
Module Str
 : sig end

Regular expressions and high-level string processing

=== Regular expressions ===

type regexp

The type of compiled regular expressions.

val regexp : string -> regexp

Compile a regular expression. The following constructs are recognized:

- . Matches any character except newline.

- * (postfix) Matches the preceding expression zero, one or several
times.

- + (postfix) Matches the preceding expression one or several times.

- ? (postfix) Matches the preceding expression once or not at all.

- [..] Character set. Ranges are denoted with -, as in [a-z]. An
initial ^, as in [^0-9], complements the set. To include a ] character
in a set, make it the first character of the set. To include a -
character in a set, make it the first or the last character of the set.

- ^ Matches at beginning of line (either at the beginning of the
matched string, or just after a newline character).

- $ Matches at end of line (either at the end of the matched string, or
just before a newline character).

- \| (infix) Alternative between two expressions.

- \(..\) Grouping and naming of the enclosed expression.

- \1 The text matched by the first \(...\) expression (\2 for the
second expression, and so on up to \9).

- \b Matches word boundaries.

- \ Quotes special characters. The special characters are $^.*+?[].

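The constructs listed above can be combined as in the following minimal
sketch. The names date_re, pet_re and is_date are illustrative only and
are not part of Str. Inside an OCaml string literal every backslash of
the regexp syntax must itself be escaped, so \( is written "\\(". The
example assumes the str library is linked into the program.

```ocaml
(* Three groups of digits separated by '-'.  The regexp
   \([0-9]+\)-\([0-9]+\)-\([0-9]+\) needs doubled backslashes
   when written as an OCaml string literal. *)
let date_re = Str.regexp "\\([0-9]+\\)-\\([0-9]+\\)-\\([0-9]+\\)"

(* Alternation is written \| and not as a bare '|'. *)
let pet_re = Str.regexp "cat\\|dog"

(* True if the whole of [s] matches [date_re]: the match must start at
   position 0 and stop exactly at the end of the string. *)
let is_date s =
  Str.string_match date_re s 0 && Str.match_end () = String.length s
```

Comparing Str.match_end () with the string length is one way to require
a complete match, since $ also matches just before a newline.
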
val regexp_case_fold : string -> regexp

Same as regexp, but the compiled expression will match text in a
case-insensitive way: uppercase and lowercase letters will be
considered equivalent.

val quote : string -> string

Str.quote s returns a regexp string that matches exactly s and nothing
else.

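As a small illustration of why quoting matters, the sketch below
searches for a literal piece of text that happens to contain special
characters. The helper name contains_literal is hypothetical and not
part of Str.

```ocaml
(* True if [needle] occurs literally somewhere in [haystack].
   Str.quote protects characters such as '.' and '+' so that they are
   not interpreted as regexp operators. *)
let contains_literal needle haystack =
  let re = Str.regexp (Str.quote needle) in
  try
    ignore (Str.search_forward re haystack 0);
    true
  with Not_found -> false
```

Without the call to Str.quote, the needle "a.b" would also be found in
"xaxby", because an unquoted . matches any character.
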
val regexp_string : string -> regexp

Str.regexp_string s returns a regular expression that matches exactly s
and nothing else.

val regexp_string_case_fold : string -> regexp

Str.regexp_string_case_fold is similar to Str.regexp_string, but the
regexp matches in a case-insensitive way.

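A possible use of the literal-string constructors is a case-insensitive
prefix test. This is only a sketch; the name starts_with_ci is an
assumption made for the example and does not exist in Str.

```ocaml
(* True if [s] begins with [prefix], ignoring letter case.
   regexp_string_case_fold treats [prefix] as literal text, so special
   characters in it need no quoting. *)
let starts_with_ci prefix s =
  Str.string_match (Str.regexp_string_case_fold prefix) s 0
```

Because Str.string_match anchors the match at the given position, the
test succeeds on "OCaml library" but not on "The OCaml".
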
=== String matching and searching ===

val string_match : regexp -> string -> int -> bool

string_match r s start tests whether a substring of s that starts at
position start matches the regular expression r. The first character
of a string has position 0, as usual.

val search_forward : regexp -> string -> int -> int

search_forward r s start searches the string s for a substring matching
the regular expression r. The search starts at position start and
proceeds towards the end of the string. Return the position of the
first character of the matched substring, or raise Not_found if no
substring matches.

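Repeated forward searches can collect every match position. The
following is a sketch under the assumption that overlapping matches are
not wanted; the name all_positions is illustrative and not part of Str.

```ocaml
(* Positions of all non-overlapping matches of [re] in [s], left to
   right.  After each match the search resumes at the end of that
   match, or one character further if the match was empty, so the loop
   always makes progress. *)
let all_positions re s =
  let rec loop start acc =
    if start > String.length s then List.rev acc
    else
      match Str.search_forward re s start with
      | pos ->
        let next = max (Str.match_end ()) (pos + 1) in
        loop next (pos :: acc)
      | exception Not_found -> List.rev acc
  in
  loop 0 []
```

With these definitions, all_positions (Str.regexp "ab") "abcabab"
evaluates to [0; 3; 5].
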
val search_backward : regexp -> string -> int -> int

search_backward r s last searches the string s for a substring matching
the regular expression r. The search first considers substrings that
start at position last and proceeds towards the beginning of the
string. Return the position of the first character of the matched
substring; raise Not_found if no substring matches.

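A backward search started at the end of the string finds the last
match. The helper below is a sketch; the name last_index is an
assumption made for the example.

```ocaml
(* Position of the last match of [re] in [s], or None if there is no
   match.  The search starts at position String.length s and moves
   towards position 0. *)
let last_index re s =
  try Some (Str.search_backward re s (String.length s))
  with Not_found -> None
```

For instance, last_index (Str.regexp "ab") "abcabab" evaluates to
Some 5.
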
val string_partial_match : regexp -> string -> int -> bool

Similar to Str.string_match, but also returns true if the argument
string is a prefix of a string that matches. This includes the case of
a true complete match.

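Partial matching is useful for checking input that is still being
typed. The sketch below assumes a hypothetical hexadecimal literal
syntax; hex_re and could_become_hex are names invented for the example.

```ocaml
(* A lowercase hexadecimal literal such as 0x1f. *)
let hex_re = Str.regexp "0x[0-9a-f]+"

(* True if [typed] is a complete match or could still be extended into
   one by appending more characters. *)
let could_become_hex typed = Str.string_partial_match hex_re typed 0
```

Here "0" and "0x" are accepted as possible beginnings of a literal,
while "1" is rejected because no extension of it can match.
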
val matched_string : string -> string

matched_string s returns the substring of s that was matched by the
latest Str.string_match, Str.search_forward or Str.search_backward.
The user must make sure that the parameter s is the same string that
was passed to the matching or searching function.

val match_beginning : unit -> int

match_beginning() returns the position of the first character of the
substring that was matched by Str.string_match, Str.search_forward or
Str.search_backward.

val match_end : unit -> int

match_end() returns the position of the character following the last
character of the substring that was matched by Str.string_match,
Str.search_forward or Str.search_backward.

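The three accessors above read the result of the most recent match, so
they are normally called right after a successful search. A minimal
sketch, with first_number as an illustrative name:

```ocaml
(* The first run of digits in [s], with its start position and the
   position just past its end, or None if [s] contains no digit. *)
let first_number s =
  let re = Str.regexp "[0-9]+" in
  match Str.search_forward re s 0 with
  | _ ->
    (* Same string [s] as was passed to search_forward. *)
    let text = Str.matched_string s in
    let b = Str.match_beginning () in
    let e = Str.match_end () in
    Some (text, b, e)
  | exception Not_found -> None
```

For instance, first_number "abc 123 def" evaluates to
Some ("123", 4, 7).
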
val matched_group : int -> string -> string

matched_group n s returns the substring of s that was matched by the
nth group \(...\) of the regular expression during the latest
Str.string_match, Str.search_forward or Str.search_backward. The user
must make sure that the parameter s is the same string that was passed
to the matching or searching function. matched_group n s raises
Not_found if the nth group of the regular expression was not matched.
This can happen with groups inside alternatives \|, options ? or
repetitions *. For instance, the empty string will match \(a\)*, but
matched_group 1 will raise Not_found because the first group itself was
not matched.

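Since an optional group may be left unmatched, callers often convert
Not_found into an option. The sketch below assumes a simple key=value
input format invented for the example; key_value is not part of Str.

```ocaml
(* Parse "key=digits", where the digits are optional.  Group 1 is the
   key; group 2 is the value and may not take part in the match. *)
let key_value s =
  let re = Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)?" in
  if Str.string_match re s 0 then
    let key = Str.matched_group 1 s in
    let value =
      try Some (Str.matched_group 2 s) with Not_found -> None
    in
    Some (key, value)
  else None
```

For "port=80" the second group is present; for "port=" the overall
match still succeeds but the second group is reported as None.
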
val group_beginning : int -> int

group_beginning n returns the position of the first character of the
substring that was matched by the nth group of the regular expression.

Raises

Not_found if the nth group of the regular expression was not matched.

Invalid_argument if there are fewer than n groups in the regular
expression.

val group_end : int -> int

group_end n returns the position of the character following the last
character of the substring that was matched by the nth group of the
regular expression.

Raises

Not_found if the nth group of the regular expression was not matched.

Invalid_argument if there are fewer than n groups in the regular
expression.

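Group positions can be used when the offsets matter more than the text
itself. A sketch, assuming a toy name@host pattern made up for the
example; the name spans is illustrative:

```ocaml
(* Find the first "name@host" in [s] and return the (start, end) spans
   of the name and of the host.  Each end is the position just past the
   last character of the group. *)
let spans s =
  let re = Str.regexp "\\([a-z]+\\)@\\([a-z]+\\)" in
  match Str.search_forward re s 0 with
  | _ ->
    let name = (Str.group_beginning 1, Str.group_end 1) in
    let host = (Str.group_beginning 2, Str.group_end 2) in
    Some (name, host)
  | exception Not_found -> None
```

For the input "to bob@example" this yields Some ((3, 6), (7, 14)), so
String.sub s 3 (6 - 3) recovers "bob".
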
=== Replacement ===

val global_replace : regexp -> string -> string -> string

global_replace regexp templ s returns a string identical to s, except
that all substrings of s that match regexp have been replaced by templ.
The replacement template templ can contain \1, \2, etc; these sequences
will be replaced by the text matched by the corresponding group in the
regular expression. \0 stands for the text matched by the whole regular
expression.

val replace_first : regexp -> string -> string -> string

Same as Str.global_replace, except that only the first substring
matching the regular expression is replaced.

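The template syntax of Str.global_replace and Str.replace_first can be
sketched as follows. The function names are illustrative only. As with
regexps, backslashes in the template are doubled in OCaml source.

```ocaml
(* Swap the two sides of every key=digits pair using \1 and \2. *)
let swap_pairs s =
  Str.global_replace
    (Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)") "\\2=\\1" s

(* Wrap every run of digits in angle brackets; \0 is the whole match. *)
let bracket_all s = Str.global_replace (Str.regexp "[0-9]+") "<\\0>" s

(* Same template, but only the first run of digits is replaced. *)
let bracket_first s = Str.replace_first (Str.regexp "[0-9]+") "<\\0>" s
```

For instance, swap_pairs "a=1, b=2" evaluates to "1=a, 2=b", while
bracket_all "a1b22" gives "a<1>b<22>" and bracket_first "a1b22" gives
"a<1>b22".
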
val global_substitute : regexp -> (string -> string) -> string -> string

global_substitute regexp subst s returns a string identical to s,
except that all substrings of s that match regexp have been replaced by
the result of function subst. The function subst is called once for
each matching substring, and receives s (the whole text) as argument.

val substitute_first : regexp -> (string -> string) -> string -> string

Same as Str.global_substitute, except that only the first substring
matching the regular expression is replaced.

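Because the substitution function receives the whole text rather than
the matched piece, it typically calls Str.matched_string on its
argument to recover the current match. A sketch, with illustrative
function names and assuming String.capitalize_ascii from the standard
library is available:

```ocaml
(* Capitalize every run of lowercase letters.  [whole] is the full
   input text; Str.matched_string extracts the current match from it. *)
let capitalize_words s =
  Str.global_substitute (Str.regexp "[a-z]+")
    (fun whole -> String.capitalize_ascii (Str.matched_string whole))
    s

(* Same substitution, applied to the first match only. *)
let capitalize_first_word s =
  Str.substitute_first (Str.regexp "[a-z]+")
    (fun whole -> String.capitalize_ascii (Str.matched_string whole))
    s
```

For instance, capitalize_words "hello big world" evaluates to
"Hello Big World", and capitalize_first_word on the same input gives
"Hello big world".
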
val replace_matched : string -> string -> string

replace_matched repl s returns the replacement text repl in which \1,
\2, etc. have been replaced by the text matched by the corresponding
groups in the most recent matching operation. s must be the same
string that was matched during this matching operation.

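Unlike the functions above, replace_matched returns only the expanded
template, not a modified copy of s. A sketch, assuming a
year-month-day input format chosen for the example; reformat_date is
not part of Str:

```ocaml
(* Turn "YYYY-MM-DD" into "DD/MM/YYYY".  Returns None if [s] does not
   start with three dash-separated groups of digits. *)
let reformat_date s =
  let re = Str.regexp "\\([0-9]+\\)-\\([0-9]+\\)-\\([0-9]+\\)" in
  if Str.string_match re s 0 then
    Some (Str.replace_matched "\\3/\\2/\\1" s)
  else None
```

For instance, reformat_date "2007-05-24" evaluates to
Some "24/05/2007".
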
=== Splitting ===

val split : regexp -> string -> string list

split r s splits s into substrings, taking as delimiters the substrings
that match r, and returns the list of substrings. For instance,
split (regexp "[ \t]+") s splits s into blank-separated words. An
occurrence of the delimiter at the beginning and at the end of the
string is ignored.

val bounded_split : regexp -> string -> int -> string list

Same as Str.split, but splits into at most n substrings, where n is
the extra integer parameter.

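The two splitting functions above can be sketched on blank-separated
input. The names words and first_and_rest are illustrative only.

```ocaml
(* Runs of spaces or tabs act as a single delimiter. *)
let blanks = Str.regexp "[ \t]+"

(* Split into words; leading and trailing blanks are ignored. *)
let words s = Str.split blanks s

(* Split into at most two pieces: the first word and the rest of the
   text, which keeps its inner blanks. *)
let first_and_rest s = Str.bounded_split blanks s 2
```

For instance, words "  the quick  fox " evaluates to
["the"; "quick"; "fox"], and first_and_rest "a b c d" evaluates to
["a"; "b c d"].
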
val split_delim : regexp -> string -> string list

Same as Str.split but occurrences of the delimiter at the beginning and
at the end of the string are recognized and returned as empty strings
in the result. For instance, split_delim (regexp " ") " abc " returns
[""; "abc"; ""], while split with the same arguments returns ["abc"].

val bounded_split_delim : regexp -> string -> int -> string list

Same as Str.bounded_split, but occurrences of the delimiter at the
beginning and at the end of the string are recognized and returned as
empty strings in the result.

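Keeping the empty fields matters for delimiter-separated records, where
the position of each field is significant. A sketch using a comma
delimiter chosen for the example; the function names are illustrative:

```ocaml
let comma = Str.regexp ","

(* All fields of a comma-separated line, including empty fields at the
   beginning and at the end. *)
let fields line = Str.split_delim comma line

(* At most two fields: the first one and the unsplit remainder. *)
let head_and_tail line = Str.bounded_split_delim comma line 2
```

For instance, fields ",a,,b," evaluates to [""; "a"; ""; "b"; ""],
whereas Str.split with the same arguments returns ["a"; ""; "b"].
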
type split_result =
 | Text of string
 | Delim of string

val full_split : regexp -> string -> split_result list

Same as Str.split_delim, but returns the delimiters as well as the
substrings contained between delimiters. The former are tagged Delim
in the result list; the latter are tagged Text. For instance,
full_split (regexp "[{}]") "{ab}" returns
[Delim "{"; Text "ab"; Delim "}"].

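Because the delimiters are kept, full_split can serve as a very small
tokenizer. The sketch below assumes a toy expression syntax with only
+ and - operators; tokens and render are names invented for the
example.

```ocaml
(* Split an expression such as "1+22-3" into operands (Text) and
   operators (Delim).  '-' comes first in the set so that it is taken
   literally. *)
let tokens s = Str.full_split (Str.regexp "[-+]") s

(* Rebuild a readable form with spaces around each operator. *)
let render toks =
  String.concat ""
    (List.map
       (function
         | Str.Text t -> t
         | Str.Delim d -> " " ^ d ^ " ")
       toks)
```

For instance, tokens "1+22-3" evaluates to
[Text "1"; Delim "+"; Text "22"; Delim "-"; Text "3"], and rendering
that list gives "1 + 22 - 3".
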
val bounded_full_split : regexp -> string -> int -> split_result list

Same as Str.bounded_split_delim, but returns the delimiters as well as
the substrings contained between delimiters. The former are tagged
Delim in the result list; the latter are tagged Text.

=== Extracting substrings ===

val string_before : string -> int -> string

string_before s n returns the substring of all characters of s that
precede position n (excluding the character at position n).

val string_after : string -> int -> string

string_after s n returns the substring of all characters of s that
follow position n (including the character at position n).

val first_chars : string -> int -> string

first_chars s n returns the first n characters of s. This is the same
function as Str.string_before.

val last_chars : string -> int -> string

last_chars s n returns the last n characters of s.

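The four extraction functions can be compared on a single input. This
is a minimal sketch; the sample string and the binding names are chosen
for the example.

```ocaml
let s = "hello world"

(* Characters at positions 0..4; position 5 is excluded. *)
let before = Str.string_before s 5

(* Characters from position 6 to the end; position 6 is included. *)
let after = Str.string_after s 6

(* The first five and the last five characters. *)
let first = Str.first_chars s 5
let last = Str.last_chars s 5
```

Here before and first are both "hello", and after and last are both
"world".
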
OCamldoc 2007-05-24 Str(3)