Str(3)                             OCamldoc                             Str(3)



NAME

       Str - Regular expressions and high-level string processing


Module

       Module   Str


Documentation

       Module Str
        : sig end


       Regular expressions and high-level string processing



       === Regular expressions ===


       type regexp


       The type of compiled regular expressions.



       val regexp : string -> regexp

       Compile a regular expression. The following constructs are recognized:

       - .  Matches any character except newline.

       - * (postfix) Matches the preceding expression zero, one or several
       times

       - + (postfix) Matches the preceding expression one or several times

       - ? (postfix) Matches the preceding expression once or not at all

       - [..] Character set. Ranges are denoted with - , as in [a-z] . An
       initial ^ , as in [^0-9] , complements the set. To include a ]
       character in a set, make it the first character of the set. To
       include a - character in a set, make it the first or the last
       character of the set.

       - ^ Matches at beginning of line: either at the beginning of the
       matched string, or just after a '\n' character.

       - $ Matches at end of line: either at the end of the matched string, or
       just before a '\n' character.

       - \| (infix) Alternative between two expressions.

       - \(..\) Grouping and naming of the enclosed expression.

       - \1 The text matched by the first \(...\) expression ( \2 for the
       second expression, and so on up to \9 ).

       - \b Matches word boundaries.

       - \ Quotes special characters. The special characters are $^\.*+?[] .

       Note: the argument to regexp is usually a string literal. In this case,
       any backslash character in the regular expression must be doubled to
       make it past the OCaml string parser. For example, the following
       expression:

           let r = Str.regexp "hello \\([A-Za-z]+\\)" in
           Str.replace_first r "\\1" "hello world"

       returns the string "world" .

       In particular, if you want a regular expression that matches a single
       backslash character, you need to quote it in the argument to regexp
       (according to the last item of the list above) by adding a second
       backslash. Then you need to quote both backslashes (according to the
       syntax of string constants in OCaml) by doubling them again, so you
       need to write four backslash characters: Str.regexp "\\\\" .
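
       As an aside that is not part of the original page: OCaml's quoted
       string literals (written {|...|} , available since OCaml 4.02) are
       not subject to backslash escaping, so they may be a convenient way to
       avoid the doubling described above. The sketch below assumes the str
       library is linked; the names r_escaped , r_quoted and backslash are
       purely illustrative.

```ocaml
(* Build with the str library, for instance:
   ocamlfind ocamlopt -package str -linkpkg example.ml *)

(* The same regexp written twice: once with doubled backslashes, once as a
   quoted string literal in which backslashes need no escaping. *)
let r_escaped = Str.regexp "hello \\([A-Za-z]+\\)"
let r_quoted = Str.regexp {|hello \([A-Za-z]+\)|}

let from_escaped = Str.replace_first r_escaped "\\1" "hello world"
let from_quoted = Str.replace_first r_quoted {|\1|} "hello world"

(* A regexp matching one literal backslash: two characters in the regexp,
   hence four in an ordinary literal but only two in a quoted literal. *)
let backslash = Str.regexp {|\\|}
let matches_backslash = Str.string_match backslash "\\" 0
```

       Both spellings compile to the same regular expression; which one
       reads better is a matter of taste.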



       val regexp_case_fold : string -> regexp

       Same as regexp , but the compiled expression will match text in a
       case-insensitive way: uppercase and lowercase letters will be
       considered equivalent.



       val quote : string -> string


       Str.quote s returns a regexp string that matches exactly s and nothing
       else.



       val regexp_string : string -> regexp


       Str.regexp_string s returns a regular expression that matches exactly s
       and nothing else.



       val regexp_string_case_fold : string -> regexp


       Str.regexp_string_case_fold is similar to Str.regexp_string , but the
       regexp matches in a case-insensitive way.
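
       The literal-matching functions above can be sketched as follows. This
       is only an illustration; the sample strings and binding names are
       assumptions, not taken from the library documentation.

```ocaml
(* Str.quote escapes the characters that are special in a regexp, so the
   result can be handed to Str.regexp and match the text literally. *)
let quoted = Str.quote "a.b"
let literal = Str.regexp quoted

let matches_itself = Str.string_match literal "a.b" 0
let matches_other = Str.string_match literal "axb" 0

(* regexp_string builds the literal regexp in one step; the case-folding
   variant additionally ignores letter case. *)
let exact = Str.regexp_string "1+1"
let folded = Str.regexp_string_case_fold "Hello"

let exact_ok = Str.string_match exact "1+1" 0
let folded_ok = Str.string_match folded "hELLO" 0
```

       Here quoted holds the four characters a\.b , so the dot no longer
       matches an arbitrary character.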




       === String matching and searching ===


       val string_match : regexp -> string -> int -> bool


       string_match r s start tests whether a substring of s that starts at
       position start matches the regular expression r . The first character
       of a string has position 0 , as usual.
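
       A small sketch of the anchoring behaviour described above: the match
       has to begin exactly at start , but it does not have to reach the end
       of the string. The sample inputs are illustrative.

```ocaml
let digits = Str.regexp "[0-9]+"

(* Digits begin at position 0; the trailing letters do not matter. *)
let at_start = Str.string_match digits "123abc" 0

(* No digits at position 0, so the test fails even though digits occur
   later in the string. *)
let not_at_0 = Str.string_match digits "abc123" 0

(* The same string matches when the test starts at position 3. *)
let at_3 = Str.string_match digits "abc123" 3
```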



       val search_forward : regexp -> string -> int -> int


       search_forward r s start searches the string s for a substring matching
       the regular expression r . The search starts at position start and
       proceeds towards the end of the string. Returns the position of the
       first character of the matched substring.


       Raises Not_found if no substring matches.



       val search_backward : regexp -> string -> int -> int


       search_backward r s last searches the string s for a substring matching
       the regular expression r . The search first considers substrings that
       start at position last and proceeds towards the beginning of the
       string. Returns the position of the first character of the matched
       substring.


       Raises Not_found if no substring matches.
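
       The two search directions can be sketched as follows. The helper
       find_opt is hypothetical (it is not part of Str) and simply converts
       the Not_found exception into an option.

```ocaml
let o = Str.regexp "o"
let text = "hello world"

(* Forward searches return the leftmost match at or after the start. *)
let first_o = Str.search_forward o text 0
let next_o = Str.search_forward o text 5

(* A backward search returns the rightmost match starting at or before
   the given position. *)
let last_o = Str.search_backward o text (String.length text - 1)

(* Hypothetical helper: report a failed search as None. *)
let find_opt r s start =
  try Some (Str.search_forward r s start) with Not_found -> None

let missing = find_opt (Str.regexp "z") text 0
```

       With this input, first_o is 4, next_o and last_o are both 7, and
       missing is None.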



       val string_partial_match : regexp -> string -> int -> bool

       Similar to Str.string_match , but also returns true if the argument
       string is a prefix of a string that matches. This includes the case of
       a true complete match.
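
       One possible use, offered here as an assumption rather than a
       documented purpose, is checking input that is still being typed. A
       minimal sketch:

```ocaml
let hello = Str.regexp "hello"

(* The input is only a prefix of a matching string, so a plain match
   fails while a partial match succeeds. *)
let full = Str.string_match hello "hel" 0
let partial = Str.string_partial_match hello "hel" 0

(* A complete match also counts as a partial match. *)
let complete = Str.string_partial_match hello "hello" 0
```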



       val matched_string : string -> string


       matched_string s returns the substring of s that was matched by the
       last call to one of the following matching or searching functions:

       - Str.string_match
       - Str.search_forward
       - Str.search_backward
       - Str.string_partial_match
       - Str.global_substitute
       - Str.substitute_first

       provided that none of the following functions was called in between:

       - Str.global_replace
       - Str.replace_first
       - Str.split
       - Str.bounded_split
       - Str.split_delim
       - Str.bounded_split_delim
       - Str.full_split
       - Str.bounded_full_split

       Note: in the case of global_substitute and substitute_first , a call to
       matched_string is only valid within the subst argument, not after
       global_substitute or substitute_first returns.

       The user must make sure that the parameter s is the same string that
       was passed to the matching or searching function.
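
       Because the result depends on the most recent matching call, one
       cautious pattern is to read the match immediately after the search,
       using the same string. A minimal sketch with illustrative names:

```ocaml
let text = "order 66 shipped"
let number = Str.regexp "[0-9]+"

(* Search, then read the matched text right away, before any other Str
   function gets a chance to overwrite the recorded match. *)
let found =
  match Str.search_forward number text 0 with
  | _pos -> Some (Str.matched_string text)
  | exception Not_found -> None
```

       Here found is Some "66" . Keeping the search and the read in one
       expression makes it harder to slip another Str call in between.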



       val match_beginning : unit -> int


       match_beginning() returns the position of the first character of the
       substring that was matched by the last call to a matching or searching
       function (see Str.matched_string for details).



       val match_end : unit -> int


       match_end() returns the position of the character following the last
       character of the substring that was matched by the last call to a
       matching or searching function (see Str.matched_string for details).
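
       The two positions delimit a half-open range, which the following
       sketch uses to extract the match by hand. The sample text is
       illustrative.

```ocaml
let text = "hello world"
let start_pos = Str.search_forward (Str.regexp "wor") text 0

(* Positions of the last match: first character, and one past the last. *)
let b = Str.match_beginning ()
let e = Str.match_end ()

(* The matched text spans positions b to e - 1. *)
let piece = String.sub text b (e - b)
```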



       val matched_group : int -> string -> string


       matched_group n s returns the substring of s that was matched by the
       n-th group \(...\) of the regular expression that was matched by the
       last call to a matching or searching function (see Str.matched_string
       for details). The user must make sure that the parameter s is the same
       string that was passed to the matching or searching function.


       Raises Not_found if the n-th group of the regular expression was not
       matched. This can happen with groups inside alternatives \| , options
       ? or repetitions * . For instance, the empty string will match
       \(a\)* , but matched_group 1 "" will raise Not_found because the first
       group itself was not matched.
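
       A sketch of group extraction, including the unmatched-group case
       described above. The helper group_opt is hypothetical and only wraps
       the Not_found exception.

```ocaml
let pair = Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)"
let input = "x=42"

(* Read both groups right after a successful match on the same string. *)
let key, value =
  if Str.string_match pair input 0 then
    (Str.matched_group 1 input, Str.matched_group 2 input)
  else ("", "")

(* Hypothetical helper: an unmatched group becomes None. *)
let group_opt n s = try Some (Str.matched_group n s) with Not_found -> None

(* The empty string matches, but the group itself took no part in it. *)
let star = Str.regexp "\\(a\\)*"
let empty_matches = Str.string_match star "" 0
let unmatched = group_opt 1 ""
```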



       val group_beginning : int -> int


       group_beginning n returns the position of the first character of the
       substring that was matched by the n-th group of the regular expression
       that was matched by the last call to a matching or searching function
       (see Str.matched_string for details).


       Raises Not_found if the n-th group of the regular expression was not
       matched.


       Raises Invalid_argument if there are fewer than n groups in the regular
       expression.



       val group_end : int -> int


       group_end n returns the position of the character following the last
       character of the substring that was matched by the n-th group of the
       regular expression that was matched by the last call to a matching or
       searching function (see Str.matched_string for details).


       Raises Not_found if the n-th group of the regular expression was not
       matched.


       Raises Invalid_argument if there are fewer than n groups in the regular
       expression.
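
       As with match_beginning and match_end , the group positions form a
       half-open range. A minimal sketch with an illustrative input:

```ocaml
let pair = Str.regexp "\\([a-z]+\\)=\\([0-9]+\\)"
let input = "x=42"
let matched = Str.string_match pair input 0

(* Bounds of the second group: first character, and one past the last. *)
let value_start = Str.group_beginning 2
let value_end = Str.group_end 2

(* Extract the group by hand from those bounds. *)
let value = String.sub input value_start (value_end - value_start)
```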




       === Replacement ===


       val global_replace : regexp -> string -> string -> string


       global_replace regexp templ s returns a string identical to s , except
       that all substrings of s that match regexp have been replaced by
       templ . The replacement template templ can contain \1 , \2 , etc;
       these sequences will be replaced by the text matched by the
       corresponding group in the regular expression. \0 stands for the text
       matched by the whole regular expression.



       val replace_first : regexp -> string -> string -> string

       Same as Str.global_replace , except that only the first substring
       matching the regular expression is replaced.
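
       The template sequences can be sketched as follows. Note that the
       backslashes in templates are doubled for the same reason as in
       regular expressions. Sample inputs are illustrative.

```ocaml
let digits = Str.regexp "[0-9]+"

(* \0 stands for the whole match, so each number gets wrapped. *)
let all = Str.global_replace digits "<\\0>" "a1b22"

(* Only the first number is wrapped here. *)
let first = Str.replace_first digits "<\\0>" "a1b22"

(* \1 and \2 refer to the groups, which lets the template swap them. *)
let dashed = Str.regexp "\\([a-z]+\\)-\\([a-z]+\\)"
let swapped = Str.global_replace dashed "\\2-\\1" "ab-cd ef-gh"
```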



       val global_substitute : regexp -> (string -> string) -> string -> string


       global_substitute regexp subst s returns a string identical to s ,
       except that all substrings of s that match regexp have been replaced by
       the result of function subst . The function subst is called once for
       each matching substring, and receives s (the whole text) as argument.



       val substitute_first : regexp -> (string -> string) -> string -> string

       Same as Str.global_substitute , except that only the first substring
       matching the regular expression is replaced.
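
       Since subst receives the whole text rather than the match, it would
       typically call Str.matched_string on its argument. A minimal sketch;
       upcase_match is an illustrative name.

```ocaml
let lower = Str.regexp "[a-z]+"

(* The argument s is the whole text; the current match is recovered
   through Str.matched_string. *)
let upcase_match s = String.uppercase_ascii (Str.matched_string s)

let all = Str.global_substitute lower upcase_match "ab 12 cd"
let first = Str.substitute_first lower upcase_match "ab 12 cd"
```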



       val replace_matched : string -> string -> string


       replace_matched repl s returns the replacement text repl in which \1 ,
       \2 , etc. have been replaced by the text matched by the corresponding
       groups in the regular expression that was matched by the last call to a
       matching or searching function (see Str.matched_string for details). s
       must be the same string that was passed to the matching or searching
       function.
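
       Unlike global_replace , this function returns only the expanded
       template, not the surrounding text. A minimal sketch with an
       illustrative input:

```ocaml
let addr = Str.regexp "\\([a-z]+\\)@\\([a-z]+\\)"
let input = "user@host"

(* Match first, then expand the template against that match. *)
let rewritten =
  if Str.string_match addr input 0 then
    Some (Str.replace_matched "\\2:\\1" input)
  else None
```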




       === Splitting ===


       val split : regexp -> string -> string list


       split r s splits s into substrings, taking as delimiters the substrings
       that match r , and returns the list of substrings. For instance,
       split (regexp "[ \t]+") s splits s into blank-separated words. An
       occurrence of the delimiter at the beginning or at the end of the
       string is ignored.



       val bounded_split : regexp -> string -> int -> string list

       Same as Str.split , but splits into at most n substrings, where n is
       the extra integer parameter.
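
       A short sketch of both functions; the inputs are illustrative.

```ocaml
let blanks = Str.regexp "[ \t]+"

(* Leading and trailing blanks are ignored, so only the words remain. *)
let words = Str.split blanks "  hello   world  "

(* With a bound of 2, everything after the first delimiter stays in one
   piece. *)
let comma = Str.regexp ","
let head_rest = Str.bounded_split comma "a,b,c,d" 2
```

       Here words is ["hello"; "world"] and head_rest is ["a"; "b,c,d"] .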



       val split_delim : regexp -> string -> string list

       Same as Str.split but occurrences of the delimiter at the beginning and
       at the end of the string are recognized and returned as empty strings
       in the result. For instance, split_delim (regexp " ") " abc " returns
       [""; "abc"; ""] , while split with the same arguments returns
       ["abc"] .



       val bounded_split_delim : regexp -> string -> int -> string list

       Same as Str.bounded_split , but occurrences of the delimiter at the
       beginning and at the end of the string are recognized and returned as
       empty strings in the result.
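
       The difference from split can be sketched as follows; the bounded
       example uses an illustrative comma-separated input.

```ocaml
let space = Str.regexp " "

(* Delimiters at either end produce empty strings with split_delim ... *)
let with_empties = Str.split_delim space " abc "

(* ... whereas split silently drops them. *)
let without = Str.split space " abc "

(* The leading delimiter counts as the first split, so the bound of 2
   leaves the rest of the string in one piece. *)
let bounded = Str.bounded_split_delim (Str.regexp ",") ",a,b" 2
```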



       type split_result =
        | Text of string
        | Delim of string



       val full_split : regexp -> string -> split_result list

       Same as Str.split_delim , but returns the delimiters as well as the
       substrings contained between delimiters. The former are tagged Delim
       in the result list; the latter are tagged Text . For instance,
       full_split (regexp "[{}]") "{ab}" returns
       [Delim "{"; Text "ab"; Delim "}"] .



       val bounded_full_split : regexp -> string -> int -> split_result list

       Same as Str.bounded_split_delim , but returns the delimiters as well as
       the substrings contained between delimiters. The former are tagged
       Delim in the result list; the latter are tagged Text .
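
       The tagged result is meant to be consumed by pattern matching. A
       minimal sketch; the function show is a hypothetical helper.

```ocaml
let braces = Str.regexp "[{}]"
let pieces = Str.full_split braces "{ab}"

(* Hypothetical helper: render each piece together with its tag. *)
let show = function
  | Str.Text t -> "Text:" ^ t
  | Str.Delim d -> "Delim:" ^ d

let shown = List.map show pieces
```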




       === Extracting substrings ===


       val string_before : string -> int -> string


       string_before s n returns the substring of all characters of s that
       precede position n (excluding the character at position n ).



       val string_after : string -> int -> string


       string_after s n returns the substring of all characters of s that
       follow position n (including the character at position n ).



       val first_chars : string -> int -> string


       first_chars s n returns the first n characters of s . This is the same
       function as Str.string_before .



       val last_chars : string -> int -> string


       last_chars s n returns the last n characters of s .
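
       The four extraction functions side by side, on an illustrative
       string:

```ocaml
let s = "hello world"

(* Position 5 is the space: it is excluded by string_before. *)
let before = Str.string_before s 5

(* Position 6 is the letter w: it is included by string_after. *)
let after = Str.string_after s 6

let first2 = Str.first_chars s 2
let last3 = Str.last_chars s 3
```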



2018-04-14                          source:                             Str(3)