Str(3) OCaml library Str(3)

NAME

Str - Regular expressions and high-level string processing

Module

Module Str

Documentation

Module Str
: sig end

Regular expressions and high-level string processing

=== Regular expressions ===

type regexp

The type of compiled regular expressions.

val regexp : string -> regexp

Compile a regular expression. The following constructs are recognized:

- . Matches any character except newline.

- * (postfix) Matches the preceding expression zero, one or several
times.

- + (postfix) Matches the preceding expression one or several times.

- ? (postfix) Matches the preceding expression once or not at all.

- [..] Character set. Ranges are denoted with -, as in [a-z]. An
initial ^, as in [^0-9], complements the set. To include a ] character
in a set, make it the first character of the set. To include a -
character in a set, make it the first or the last character of the set.

- ^ Matches at beginning of line (either at the beginning of the
matched string, or just after a newline character).

- $ Matches at end of line (either at the end of the matched string, or
just before a newline character).

- \| (infix) Alternative between two expressions.

- \(..\) Grouping and naming of the enclosed expression.

- \1 The text matched by the first \(...\) expression (\2 for the
second expression, and so on up to \9).

- \b Matches word boundaries.

- \ Quotes special characters. The special characters are $^\.*+?[].

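As an illustration of the syntax above, the following sketch compiles two patterns. It assumes the program is linked against the str library; the names date_re, pet_re and is_pet are hypothetical and not part of Str. Inside an OCaml string literal each regexp backslash must itself be escaped, so the group opener is written with two backslashes.

```ocaml
(* Sketch only; link with the str library,
   e.g. ocamlfind ocamlopt -package str -linkpkg. *)

(* Two digit groups separated by a dash; each regexp backslash is doubled
   because it sits inside an OCaml string literal. *)
let date_re = Str.regexp "\\([0-9]+\\)-\\([0-9]+\\)"

(* Alternation uses backslash-bar; a bare bar is an ordinary character. *)
let pet_re = Str.regexp "cat\\|dog"

(* Hypothetical helper: does s begin with one of the two alternatives? *)
let is_pet s = Str.string_match pet_re s 0
```

Here is_pet "dog" holds while is_pet "bird" does not.
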
val regexp_case_fold : string -> regexp

Same as regexp, but the compiled expression will match text in a
case-insensitive way: uppercase and lowercase letters will be
considered equivalent.

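A minimal sketch of case-insensitive matching, assuming the str library is linked. The names hello_ci and greets are hypothetical.

```ocaml
(* Sketch only; requires the str library. *)

(* Compiled once; matches the word in any mix of upper and lower case. *)
let hello_ci = Str.regexp_case_fold "hello"

(* Hypothetical helper: does s begin with the word, ignoring case? *)
let greets s = Str.string_match hello_ci s 0
```
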
val quote : string -> string

Str.quote s returns a regexp string that matches exactly s and nothing
else.

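Quoting matters when the text to search for comes from user input and may contain special characters. The sketch below, which assumes the str library is linked, wraps Str.quote in two hypothetical helpers (literal_re and contains_literal).

```ocaml
(* Sketch only; requires the str library. *)

(* Build a regexp that matches the characters of s literally. *)
let literal_re s = Str.regexp (Str.quote s)

(* Hypothetical helper: does hay contain needle as a plain substring? *)
let contains_literal needle hay =
  match Str.search_forward (literal_re needle) hay 0 with
  | _ -> true
  | exception Not_found -> false
```

Without the quoting step, the dot in a needle such as "a.c" would match any character.
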
val regexp_string : string -> regexp

Str.regexp_string s returns a regular expression that matches exactly s
and nothing else.

val regexp_string_case_fold : string -> regexp

Str.regexp_string_case_fold is similar to Str.regexp_string, but the
regexp matches in a case-insensitive way.

=== String matching and searching ===

val string_match : regexp -> string -> int -> bool

string_match r s start tests whether a substring of s that starts at
position start matches the regular expression r. The first character
of a string has position 0, as usual.

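The following sketch shows that string_match is anchored at the given start position but does not require the match to extend to the end of the string. It assumes the str library is linked; digits and starts_with_digits_at are hypothetical names.

```ocaml
(* Sketch only; requires the str library. *)

let digits = Str.regexp "[0-9]+"

(* Hypothetical helper: is there a run of digits beginning exactly at pos? *)
let starts_with_digits_at s pos = Str.string_match digits s pos
```

For "ab12" the test succeeds at position 2 and fails at position 0. For "12ab" it succeeds at position 0 even though letters follow the digits.
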
val search_forward : regexp -> string -> int -> int

search_forward r s start searches the string s for a substring matching
the regular expression r. The search starts at position start and
proceeds towards the end of the string. Returns the position of the
first character of the matched substring, or raises Not_found if no
substring matches.

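Because search_forward signals failure with an exception, callers often wrap it. A minimal sketch, assuming the str library is linked; index_of_opt is a hypothetical helper, not part of Str.

```ocaml
(* Sketch only; requires the str library. *)

(* Hypothetical helper: position of the first match, or None. *)
let index_of_opt re s =
  try Some (Str.search_forward re s 0) with Not_found -> None
```
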
val search_backward : regexp -> string -> int -> int

search_backward r s last searches the string s for a substring matching
the regular expression r. The search first considers substrings that
start at position last and proceeds towards the beginning of the
string. Returns the position of the first character of the matched
substring, or raises Not_found if no substring matches.

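A backward search starting from the end of the string finds the last occurrence. The sketch assumes the str library is linked; last_index_opt is a hypothetical helper.

```ocaml
(* Sketch only; requires the str library. *)

(* Hypothetical helper: start position of the last match, or None. *)
let last_index_opt re s =
  try Some (Str.search_backward re s (String.length s))
  with Not_found -> None
```
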
val string_partial_match : regexp -> string -> int -> bool

Similar to Str.string_match, but also returns true if the argument
string is a prefix of a string that matches. This includes the case of
a true complete match.

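Partial matching is useful for validating input as it is typed. A minimal sketch, assuming the str library is linked; keyword_re and could_become_keyword are hypothetical names.

```ocaml
(* Sketch only; requires the str library. *)

let keyword_re = Str.regexp "begin"

(* Hypothetical helper: can the text typed so far still grow into a match? *)
let could_become_keyword typed =
  Str.string_partial_match keyword_re typed 0
```

The input "beg" is accepted because it is a prefix of a matching string, while "bex" is rejected.
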
val matched_string : string -> string

matched_string s returns the substring of s that was matched by the
latest Str.string_match, Str.search_forward or Str.search_backward.
The user must make sure that the parameter s is the same string that
was passed to the matching or searching function.

val match_beginning : unit -> int

match_beginning() returns the position of the first character of the
substring that was matched by Str.string_match, Str.search_forward or
Str.search_backward.

val match_end : unit -> int

match_end() returns the position of the character following the last
character of the substring that was matched by Str.string_match,
Str.search_forward or Str.search_backward.

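The accessors above read state left behind by the most recent match, so they should be called before any other matching operation. The sketch below combines search_forward, matched_string and match_end to collect every match in a string. It assumes the str library is linked, and find_all is a hypothetical helper.

```ocaml
(* Sketch only; requires the str library. *)

(* Hypothetical helper: all non-overlapping matches of re in s, left to right. *)
let find_all re s =
  let rec loop pos acc =
    if pos > String.length s then List.rev acc
    else
      match Str.search_forward re s pos with
      | exception Not_found -> List.rev acc
      | start ->
          (* Read the match state immediately, before searching again. *)
          let stop = Str.match_end () in
          let m = Str.matched_string s in
          (* Advance at least one character so an empty match cannot loop. *)
          loop (max stop (start + 1)) (m :: acc)
  in
  loop 0 []
```
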
val matched_group : int -> string -> string

matched_group n s returns the substring of s that was matched by the
nth group \(...\) of the regular expression during the latest
Str.string_match, Str.search_forward or Str.search_backward. The user
must make sure that the parameter s is the same string that was passed
to the matching or searching function. matched_group n s raises
Not_found if the nth group of the regular expression was not matched.
This can happen with groups inside alternatives \|, options ? or
repetitions *. For instance, the empty string will match \(a\)*, but
matched_group 1 "" will raise Not_found because the first group itself
was not matched.

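A minimal sketch of group extraction, assuming the str library is linked. The names kv_re, parse_kv and group_opt are hypothetical.

```ocaml
(* Sketch only; requires the str library. *)

(* Group 1 is a lowercase key, group 2 a possibly empty run of digits. *)
let kv_re = Str.regexp "\\([a-z]+\\)=\\([0-9]*\\)"

(* Hypothetical helper: split "key=value" into its two groups. *)
let parse_kv s =
  if Str.string_match kv_re s 0 then
    Some (Str.matched_group 1 s, Str.matched_group 2 s)
  else None

(* Hypothetical helper: a group that did not take part in the match
   yields None instead of raising Not_found. *)
let group_opt n s =
  try Some (Str.matched_group n s) with Not_found -> None
```

After matching the empty string against \(a\)*, group_opt 1 "" returns None, which is the Not_found case described above.
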
val group_beginning : int -> int

group_beginning n returns the position of the first character of the
substring that was matched by the nth group of the regular expression.

Raises

Not_found if the nth group of the regular expression was not matched.

Invalid_argument if there are fewer than n groups in the regular
expression.

val group_end : int -> int

group_end n returns the position of the character following the last
character of the substring that was matched by the nth group of the
regular expression.

Raises

Not_found if the nth group of the regular expression was not matched.

Invalid_argument if there are fewer than n groups in the regular
expression.

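The two position functions give a half-open interval for a group: group_beginning is inclusive and group_end is exclusive. A sketch, assuming the str library is linked; kv_re and value_span are hypothetical names.

```ocaml
(* Sketch only; requires the str library. *)

let kv_re = Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)"

(* Hypothetical helper: (start, stop) positions of the value group. *)
let value_span s =
  if Str.string_match kv_re s 0 then
    Some (Str.group_beginning 2, Str.group_end 2)
  else None
```

For "width=42" the value occupies positions 6 and 7, so the span is (6, 8).
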
=== Replacement ===

val global_replace : regexp -> string -> string -> string

global_replace regexp templ s returns a string identical to s, except
that all substrings of s that match regexp have been replaced by
templ. The replacement template templ can contain \1, \2, etc.; these
sequences will be replaced by the text matched by the corresponding
group in the regular expression. \0 stands for the text matched by the
whole regular expression.

val replace_first : regexp -> string -> string -> string

Same as Str.global_replace, except that only the first substring
matching the regular expression is replaced.

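The template sequences are easiest to see side by side. The sketch assumes the str library is linked; digits, bracket_all, bracket_first and swap are hypothetical names. As with patterns, backslashes in the template are doubled inside OCaml string literals.

```ocaml
(* Sketch only; requires the str library. *)

let digits = Str.regexp "[0-9]+"

(* Backslash-zero in the template stands for the whole match. *)
let bracket_all s = Str.global_replace digits "<\\0>" s

(* Same template, but only the first match is rewritten. *)
let bracket_first s = Str.replace_first digits "<\\0>" s

(* Numbered sequences refer to groups; here the two sides are exchanged. *)
let swap s =
  Str.global_replace (Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)") "\\2=\\1" s
```
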
val global_substitute : regexp -> (string -> string) -> string -> string

global_substitute regexp subst s returns a string identical to s,
except that all substrings of s that match regexp have been replaced by
the result of function subst. The function subst is called once for
each matching substring, and receives s (the whole text) as argument.

val substitute_first : regexp -> (string -> string) -> string -> string

Same as Str.global_substitute, except that only the first substring
matching the regular expression is replaced.

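Since the substitution function receives the whole text rather than the matched piece, it typically calls Str.matched_string on its argument. A sketch under that assumption, with the str library linked; words, upcase_match, shout_all and shout_first are hypothetical names.

```ocaml
(* Sketch only; requires the str library. *)

let words = Str.regexp "[a-z]+"

(* The argument is the full input; matched_string extracts the current match. *)
let upcase_match whole = String.uppercase_ascii (Str.matched_string whole)

let shout_all s = Str.global_substitute words upcase_match s
let shout_first s = Str.substitute_first words upcase_match s
```
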
val replace_matched : string -> string -> string

replace_matched repl s returns the replacement text repl in which \1,
\2, etc. have been replaced by the text matched by the corresponding
groups in the most recent matching operation. s must be the same
string that was matched during this matching operation.

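Unlike the replace functions, replace_matched returns only the expanded template, not a rewritten copy of s. A minimal sketch, assuming the str library is linked; date_re and reorder are hypothetical names.

```ocaml
(* Sketch only; requires the str library. *)

let date_re = Str.regexp "\\([0-9]+\\)-\\([0-9]+\\)"

(* Hypothetical helper: match first, then expand the template from the groups. *)
let reorder s =
  if Str.string_match date_re s 0 then Some (Str.replace_matched "\\2/\\1" s)
  else None
```
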
=== Splitting ===

val split : regexp -> string -> string list

split r s splits s into substrings, taking as delimiters the substrings
that match r, and returns the list of substrings. For instance, split
(regexp "[ \t]+") s splits s into blank-separated words. An occurrence
of the delimiter at the beginning and at the end of the string is
ignored.

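The word-splitting example from the description can be written out as follows, assuming the str library is linked. The name words is hypothetical.

```ocaml
(* Sketch only; requires the str library. *)

(* Split on runs of spaces and tabs; leading and trailing blanks are ignored. *)
let words s = Str.split (Str.regexp "[ \t]+") s
```
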
val bounded_split : regexp -> string -> int -> string list

Same as Str.split, but splits into at most n substrings, where n is
the extra integer parameter.

val split_delim : regexp -> string -> string list

Same as Str.split, but occurrences of the delimiter at the beginning
and at the end of the string are recognized and returned as empty
strings in the result. For instance, split_delim (regexp " ") " abc "
returns [""; "abc"; ""], while split with the same arguments returns
["abc"].

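The difference between the splitting variants can be checked directly. The sketch assumes the str library is linked; the binding names are hypothetical.

```ocaml
(* Sketch only; requires the str library. *)

let space = Str.regexp " "
let comma = Str.regexp ","

(* Leading and trailing delimiters are dropped by split ... *)
let plain = Str.split space " abc "

(* ... but kept as empty strings by split_delim. *)
let with_edges = Str.split_delim space " abc "

(* With a bound of 2, the remainder of the string stays in one piece. *)
let head_and_rest = Str.bounded_split comma "a,b,c,d" 2
```
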
val bounded_split_delim : regexp -> string -> int -> string list

Same as Str.bounded_split, but occurrences of the delimiter at the
beginning and at the end of the string are recognized and returned as
empty strings in the result.

type split_result =
| Text of string
| Delim of string

val full_split : regexp -> string -> split_result list

Same as Str.split_delim, but returns the delimiters as well as the
substrings contained between delimiters. The former are tagged Delim
in the result list; the latter are tagged Text. For instance,
full_split (regexp "[{}]") "{ab}" returns [Delim "{"; Text "ab";
Delim "}"].

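Because full_split keeps both text and delimiters, concatenating the pieces gives back the input. A sketch, assuming the str library is linked; braces and rejoin are hypothetical names.

```ocaml
(* Sketch only; requires the str library. *)

let braces = Str.regexp "[{}]"

(* Hypothetical helper: glue the tagged pieces back together in order. *)
let rejoin parts =
  String.concat ""
    (List.map (function Str.Text t | Str.Delim t -> t) parts)
```
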
val bounded_full_split : regexp -> string -> int -> split_result list

Same as Str.bounded_split_delim, but returns the delimiters as well as
the substrings contained between delimiters. The former are tagged
Delim in the result list; the latter are tagged Text.

=== Extracting substrings ===

val string_before : string -> int -> string

string_before s n returns the substring of all characters of s that
precede position n (excluding the character at position n).

val string_after : string -> int -> string

string_after s n returns the substring of all characters of s that
follow position n (including the character at position n).

val first_chars : string -> int -> string

first_chars s n returns the first n characters of s. This is the same
function as Str.string_before.

val last_chars : string -> int -> string

last_chars s n returns the last n characters of s.

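The four extraction functions can be compared on one input. The sketch assumes the str library is linked; split_at is a hypothetical helper built from string_before and string_after.

```ocaml
(* Sketch only; requires the str library. *)

let s = "hello world"

let before = Str.string_before s 5   (* characters at positions 0 to 4 *)
let after = Str.string_after s 6     (* position 6 through the end *)
let first = Str.first_chars s 5      (* same result as string_before s 5 *)
let last = Str.last_chars s 5        (* the final five characters *)

(* Hypothetical helper: cut a string in two at position n. *)
let split_at str n = (Str.string_before str n, Str.string_after str n)
```
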
OCamldoc 2017-03-22 Str(3)