Str(3) OCaml library Str(3)

Str - Regular expressions and high-level string processing

Module Str : sig end

Regular expressions

type regexp

The type of compiled regular expressions.

val regexp : string -> regexp

Compile a regular expression. The following constructs are recognized:

- .  Matches any character except newline.

- *  (postfix) Matches the preceding expression zero, one or several
times.

- +  (postfix) Matches the preceding expression one or several times.

- ?  (postfix) Matches the preceding expression once or not at all.

- [..]  Character set. Ranges are denoted with - , as in [a-z] . An
initial ^ , as in [^0-9] , complements the set. To include a ]
character in a set, make it the first character of the set. To include
a - character in a set, make it the first or the last character of the
set.

- ^  Matches at beginning of line: either at the beginning of the
matched string, or just after a '\n' character.

- $  Matches at end of line: either at the end of the matched string, or
just before a '\n' character.

- \|  (infix) Alternative between two expressions.

- \(..\)  Grouping and naming of the enclosed expression.

- \1  The text matched by the first \(...\) expression ( \2 for the
second expression, and so on up to \9 ).

- \b  Matches word boundaries.

- \  Quotes special characters. The special characters are $^\.*+?[] .

Note: the argument to regexp is usually a string literal. In this case,
any backslash character in the regular expression must be doubled to
make it past the OCaml string parser. For example, the following
expression:

    let r = Str.regexp "hello \\([A-Za-z]+\\)" in
    Str.replace_first r "\\1" "hello world"

returns the string "world" .

In particular, if you want a regular expression that matches a single
backslash character, you need to quote it in the argument to regexp
(according to the last item of the list above) by adding a second
backslash. Then you need to quote both backslashes (according to the
syntax of string constants in OCaml) by doubling them again, so you
need to write four backslash characters: Str.regexp "\\\\" .

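The backslash-doubling rule can be checked with a minimal sketch such as the one below. It assumes the program is linked against the str library; the names r, result, backslash and pos are illustrative only.

```ocaml
(* Group syntax \(..\) is written \\(..\\) inside an OCaml string literal. *)
let r = Str.regexp "hello \\([A-Za-z]+\\)"

(* The template "\\1" is the two characters \1, i.e. the first group. *)
let result = Str.replace_first r "\\1" "hello world"

(* Four backslashes in the source give a regexp matching one backslash. *)
let backslash = Str.regexp "\\\\"

(* "a\\b" is the three-character string a, backslash, b. *)
let pos = Str.search_forward backslash "a\\b" 0

let () = Printf.printf "%s %d\n" result pos
```
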
val regexp_case_fold : string -> regexp

Same as regexp , but the compiled expression will match text in a
case-insensitive way: uppercase and lowercase letters will be
considered equivalent.

val quote : string -> string

Str.quote s returns a regexp string that matches exactly s and nothing
else.

val regexp_string : string -> regexp

Str.regexp_string s returns a regular expression that matches exactly s
and nothing else.

val regexp_string_case_fold : string -> regexp

Str.regexp_string_case_fold is similar to Str.regexp_string , but the
regexp matches in a case-insensitive way.

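As a hedged illustration of the difference between compiling a pattern and matching literal text, the sketch below uses the sample string "a.b" (chosen here because it contains the special character . ). All binding names are illustrative, and the str library is assumed to be linked.

```ocaml
(* As a pattern, "a.b" lets the dot match any character. *)
let as_pattern = Str.string_match (Str.regexp "a.b") "axb" 0

(* Str.quote escapes the dot, so only the literal text matches. *)
let quoted = Str.regexp (Str.quote "a.b")
let quoted_literal = Str.string_match quoted "a.b" 0
let quoted_other = Str.string_match quoted "axb" 0

(* Str.regexp_string builds the literal matcher in one step. *)
let direct_other = Str.string_match (Str.regexp_string "a.b") "axb" 0

(* The case-folding variants ignore letter case. *)
let folded = Str.string_match (Str.regexp_case_fold "hello") "HeLLo" 0
let folded_literal =
  Str.string_match (Str.regexp_string_case_fold "a.B") "A.b" 0
```
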
String matching and searching

val string_match : regexp -> string -> int -> bool

string_match r s start tests whether a substring of s that starts at
position start matches the regular expression r . The first character
of a string has position 0 , as usual.

val search_forward : regexp -> string -> int -> int

search_forward r s start searches the string s for a substring matching
the regular expression r . The search starts at position start and
proceeds towards the end of the string. Return the position of the
first character of the matched substring.

Raises Not_found if no substring matches.

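The difference between matching at a fixed position and searching forward might be sketched as follows. The sample strings and binding names are illustrative, and the str library is assumed to be linked.

```ocaml
let digits = Str.regexp "[0-9]+"

(* string_match only tests the given position; it does not scan ahead. *)
let at_start = Str.string_match digits "abc123" 0
let at_three = Str.string_match digits "abc123" 3

(* search_forward scans from the start position towards the end. *)
let found = Str.search_forward digits "abc123" 0

(* Not_found signals the absence of any match; wrap it when an option
   is more convenient. *)
let missing =
  try Some (Str.search_forward digits "abc" 0) with Not_found -> None
```
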
val search_backward : regexp -> string -> int -> int

search_backward r s last searches the string s for a substring matching
the regular expression r . The search first considers substrings that
start at position last and proceeds towards the beginning of the
string. Return the position of the first character of the matched
substring.

Raises Not_found if no substring matches.

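A small sketch of how the last argument bounds a backward search, using an illustrative string with three occurrences of "ab" (at positions 0, 3 and 6); the str library is assumed to be linked.

```ocaml
let ab = Str.regexp "ab"
let s = "ab ab ab"

(* Candidate start positions are tried at 7, then 6, and so on. *)
let from_end = Str.search_backward ab s 7

(* Starting at 5 skips the occurrence at 6 and finds the one at 3. *)
let from_middle = Str.search_backward ab s 5

(* Starting at 0 can only find a match that begins at position 0. *)
let from_zero = Str.search_backward ab s 0
```
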
val string_partial_match : regexp -> string -> int -> bool

Similar to Str.string_match , but also returns true if the argument
string is a prefix of a string that matches. This includes the case of
a true complete match.

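Partial matching is typically useful for validating input as it is typed. The following sketch, with illustrative names and inputs and assuming the str library is linked, contrasts it with string_match .

```ocaml
let r = Str.regexp "hello"

(* A complete match is also a partial match. *)
let complete = Str.string_partial_match r "hello" 0

(* "hel" could still be extended to "hello", so it is accepted. *)
let prefix = Str.string_partial_match r "hel" 0

(* string_match requires the whole pattern to match. *)
let strict = Str.string_match r "hel" 0

(* "help" cannot be extended into a match of the pattern. *)
let mismatch = Str.string_partial_match r "help" 0
```
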
val matched_string : string -> string

matched_string s returns the substring of s that was matched by the
last call to one of the following matching or searching functions:

- Str.string_match
- Str.search_forward
- Str.search_backward
- Str.string_partial_match
- Str.global_substitute
- Str.substitute_first

provided that none of the following functions was called in between:

- Str.global_replace
- Str.replace_first
- Str.split
- Str.bounded_split
- Str.split_delim
- Str.bounded_split_delim
- Str.full_split
- Str.bounded_full_split

Note: in the case of global_substitute and substitute_first , a call to
matched_string is only valid within the subst argument, not after
global_substitute or substitute_first returns.

The user must make sure that the parameter s is the same string that
was passed to the matching or searching function.

val match_beginning : unit -> int

match_beginning() returns the position of the first character of the
substring that was matched by the last call to a matching or searching
function (see Str.matched_string for details).

val match_end : unit -> int

match_end() returns the position of the character following the last
character of the substring that was matched by the last call to a
matching or searching function (see Str.matched_string for details).

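Because the result of the last match is kept as hidden global state, these accessors are read immediately after the search. A minimal sketch, with an illustrative input string and binding names, assuming the str library is linked:

```ocaml
let s = "key = value"
let word = Str.regexp "[a-z]+"

(* Search from position 4 (the '=' sign), so "key" is skipped. *)
let start = Str.search_forward word s 4

(* Read the match state right away, passing the same string s. *)
let text = Str.matched_string s
let first = Str.match_beginning ()
let past_last = Str.match_end ()

let () = Printf.printf "%S at [%d, %d)\n" text first past_last
```

Note that match_end () is one past the last matched character, so the matched text has length past_last - first .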
val matched_group : int -> string -> string

matched_group n s returns the substring of s that was matched by the
n-th group \(...\) of the regular expression that was matched by the
last call to a matching or searching function (see Str.matched_string
for details). The user must make sure that the parameter s is the same
string that was passed to the matching or searching function.

Raises Not_found if the n-th group of the regular expression was not
matched. This can happen with groups inside alternatives \| , options
? or repetitions * . For instance, the empty string will match
\(a\)* , but matched_group 1 "" will raise Not_found because the first
group itself was not matched.

val group_beginning : int -> int

group_beginning n returns the position of the first character of the
substring that was matched by the n-th group of the regular expression
that was matched by the last call to a matching or searching function
(see Str.matched_string for details).

Raises Not_found if the n-th group of the regular expression was not
matched.

Raises Invalid_argument if there are fewer than n groups in the regular
expression.

val group_end : int -> int

group_end n returns the position of the character following the last
character of the substring that was matched by the n-th group of the
regular expression that was matched by the last call to a matching or
searching function (see Str.matched_string for details).

Raises Not_found if the n-th group of the regular expression was not
matched.

Raises Invalid_argument if there are fewer than n groups in the regular
expression.

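The group accessors and their two failure modes could be exercised roughly as follows. The date-like pattern and all binding names are illustrative, and the str library is assumed to be linked.

```ocaml
let date = "2019-07-30"
let r = Str.regexp "\\([0-9]+\\)-\\([0-9]+\\)-\\([0-9]+\\)"

let matched = Str.string_match r date 0
let year = Str.matched_group 1 date
let month_start = Str.group_beginning 2
let month_end = Str.group_end 2

(* The pattern has only three groups, so group 5 is an invalid index. *)
let too_many =
  try ignore (Str.group_beginning 5); false
  with Invalid_argument _ -> true

(* \(a\)* matches the empty string without group 1 taking part. *)
let empty_ok = Str.string_match (Str.regexp "\\(a\\)*") "" 0
let unmatched =
  try Some (Str.matched_group 1 "") with Not_found -> None
```
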
Replacement

val global_replace : regexp -> string -> string -> string

global_replace regexp templ s returns a string identical to s , except
that all substrings of s that match regexp have been replaced by
templ . The replacement template templ can contain \1 , \2 , etc.;
these sequences will be replaced by the text matched by the
corresponding group in the regular expression. \0 stands for the text
matched by the whole regular expression.

val replace_first : regexp -> string -> string -> string

Same as Str.global_replace , except that only the first substring
matching the regular expression is replaced.

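A short sketch of template-based replacement, showing \0 and numbered groups. The inputs and binding names are illustrative; as elsewhere, backslashes are doubled in OCaml string literals and the str library is assumed to be linked.

```ocaml
let number = Str.regexp "[0-9]+"

(* \0 stands for the whole match. *)
let all_wrapped = Str.global_replace number "<\\0>" "a1b22"
let first_wrapped = Str.replace_first number "<\\0>" "a1b22"

(* \1 and \2 refer to the groups and may appear in any order. *)
let pair = Str.regexp "\\([a-z]+\\)@\\([a-z]+\\)"
let swapped = Str.global_replace pair "\\2 at \\1" "bob@example alice@test"
```
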
val global_substitute : regexp -> (string -> string) -> string -> string

global_substitute regexp subst s returns a string identical to s ,
except that all substrings of s that match regexp have been replaced by
the result of function subst . The function subst is called once for
each matching substring, and receives s (the whole text) as argument.

val substitute_first : regexp -> (string -> string) -> string -> string

Same as Str.global_substitute , except that only the first substring
matching the regular expression is replaced.

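Since subst receives the whole text rather than the matched substring, it usually calls Str.matched_string on its argument. A minimal sketch, where upcase is a hypothetical helper and the str library is assumed to be linked:

```ocaml
let word = Str.regexp "[a-z]+"

(* [whole] is the full input; matched_string extracts the current match. *)
let upcase whole = String.uppercase_ascii (Str.matched_string whole)

let every = Str.global_substitute word upcase "ab 12 cd"
let only_first = Str.substitute_first word upcase "ab 12 cd"
```
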
val replace_matched : string -> string -> string

replace_matched repl s returns the replacement text repl in which \1 ,
\2 , etc. have been replaced by the text matched by the corresponding
groups in the regular expression that was matched by the last call to a
matching or searching function (see Str.matched_string for details). s
must be the same string that was passed to the matching or searching
function.

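Unlike global_replace , replace_matched returns only the expanded template, not a copy of s . The sketch below, with illustrative names and input and assuming the str library is linked, expands a template after an explicit match.

```ocaml
let s = "John Smith"
let name = Str.regexp "\\([A-Za-z]+\\) \\([A-Za-z]+\\)"

(* Run the match first; replace_matched relies on its saved state. *)
let matched = Str.string_match name s 0

(* The result is the template with \1 and \2 filled in. *)
let swapped = Str.replace_matched "\\2, \\1" s
```
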
Splitting

val split : regexp -> string -> string list

split r s splits s into substrings, taking as delimiters the substrings
that match r , and returns the list of substrings. For instance,
split (regexp "[ \t]+") s splits s into blank-separated words. An
occurrence of the delimiter at the beginning or at the end of the
string is ignored.

val bounded_split : regexp -> string -> int -> string list

Same as Str.split , but splits into at most n substrings, where n is
the extra integer parameter.

val split_delim : regexp -> string -> string list

Same as Str.split but occurrences of the delimiter at the beginning and
at the end of the string are recognized and returned as empty strings
in the result. For instance, split_delim (regexp " ") " abc " returns
[""; "abc"; ""] , while split with the same arguments returns ["abc"] .

val bounded_split_delim : regexp -> string -> int -> string list

Same as Str.bounded_split , but occurrences of the delimiter at the
beginning and at the end of the string are recognized and returned as
empty strings in the result.

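The four splitting functions can be compared side by side with a sketch like this one. The inputs and binding names are illustrative, and the str library is assumed to be linked.

```ocaml
let blanks = Str.regexp "[ \t]+"
let comma = Str.regexp ","

(* Leading and trailing delimiters are ignored by split. *)
let words = Str.split blanks "  one two\tthree "

(* split_delim reports them as empty strings. *)
let with_empties = Str.split_delim (Str.regexp " ") " abc "

(* The bounded variants stop splitting once n pieces are reached;
   the last piece keeps the remaining delimiters. *)
let two_pieces = Str.bounded_split comma "a,b,c,d" 2
let two_pieces_delim = Str.bounded_split_delim comma ",a,b" 2
```
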
type split_result =
 | Text of string
 | Delim of string

val full_split : regexp -> string -> split_result list

Same as Str.split_delim , but returns the delimiters as well as the
substrings contained between delimiters. The former are tagged Delim
in the result list; the latter are tagged Text . For instance,
full_split (regexp "[{}]") "{ab}" returns
[Delim "{"; Text "ab"; Delim "}"] .

val bounded_full_split : regexp -> string -> int -> split_result list

Same as Str.bounded_split_delim , but returns the delimiters as well as
the substrings contained between delimiters. The former are tagged
Delim in the result list; the latter are tagged Text .

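Keeping the delimiters makes it possible to rebuild or re-render the input. The sketch below uses a hypothetical helper, render , that tags each piece; the inputs are illustrative and the str library is assumed to be linked.

```ocaml
let braces = Str.regexp "[{}]"
let pieces = Str.full_split braces "{ab}"

let comma_pieces = Str.full_split (Str.regexp ",") "a,b"

(* Hypothetical helper: show each piece with its tag. *)
let render = function
  | Str.Text t -> "Text:" ^ t
  | Str.Delim d -> "Delim:" ^ d

let rendered = String.concat " " (List.map render pieces)
```
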
Extracting substrings

val string_before : string -> int -> string

string_before s n returns the substring of all characters of s that
precede position n (excluding the character at position n ).

val string_after : string -> int -> string

string_after s n returns the substring of all characters of s that
follow position n (including the character at position n ).

val first_chars : string -> int -> string

first_chars s n returns the first n characters of s . This is the same
function as Str.string_before .

val last_chars : string -> int -> string

last_chars s n returns the last n characters of s .

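The inclusive and exclusive conventions of these four functions can be seen on a single illustrative string; the binding names are illustrative and the str library is assumed to be linked.

```ocaml
let s = "hello world"

(* Position 5 (the space) is excluded from the result. *)
let before = Str.string_before s 5

(* Position 6 (the 'w') is included in the result. *)
let after = Str.string_after s 6

let first = Str.first_chars s 5
let last = Str.last_chars s 5
```
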
OCamldoc 2019-07-30 Str(3)