1Str(3)                           OCaml library                          Str(3)
2
3
4

NAME

6       Str - Regular expressions and high-level string processing
7

Module

9       Module   Str
10

Documentation

12       Module Str
13        : sig end
14
15
16       Regular expressions and high-level string processing
17
18
19
20
21
22
23
24   Regular expressions
25       type regexp
26
27
28       The type of compiled regular expressions.
29
30
31
32       val regexp : string -> regexp
33
34       Compile a regular expression. The following constructs are recognized:
35
36       - .  Matches any character except newline.
37
38       -  *  (postfix)  Matches  the preceding expression zero, one or several
39       times
40
41       - + (postfix) Matches the preceding expression one or several times
42
43       - ?  (postfix) Matches the preceding expression once or not at all
44
45       - [..]  Character set. Ranges are denoted with - , as in  [a-z]  .   An
46       initial ^ , as in [^0-9] , complements the set.  To include a ] charac‐
47       ter in a set, make it the first character of the set. To  include  a  -
48       character in a set, make it the first or the last character of the set.
49
50       -  ^  Matches  at  beginning  of  line:  either at the beginning of the
51       matched string, or just after a '\n' character.
52
53       - $ Matches at end of line: either at the end of the matched string, or
54       just before a '\n' character.
55
56       - \| (infix) Alternative between two expressions.
57
58       - \(..\) Grouping and naming of the enclosed expression.
59
60       - \1 The text matched by the first \(...\) expression ( \2 for the sec‐
61       ond expression, and so on up to \9 ).
62
63       - \b Matches word boundaries.
64
65       - \ Quotes special characters.  The special characters are $^\.*+?[] .
66
67       Note: the argument to regexp is usually a string literal. In this case,
68       any  backslash  character  in the regular expression must be doubled to
69       make it past the  OCaml  string  parser.  For  example,  the  following
70       expression:
71       let r = Str.regexp "hello \\([A-Za-z]+\\)" in
72             Str.replace_first r "\\1" "hello world"
73       returns the string "world" .
74
75       In  particular,  if you want a regular expression that matches a single
76       backslash character, you need to quote it in  the  argument  to  regexp
77       (according to the last item of the list above) by adding a second back‐
78       slash. Then you need to quote both backslashes (according to the syntax
79       of  string  constants  in OCaml) by doubling them again, so you need to
80       write four backslash characters: Str.regexp "\\\\" .
81
82
83
84       val regexp_case_fold : string -> regexp
85
86       Same as regexp , but the compiled  expression  will  match  text  in  a
87       case-insensitive  way:  uppercase and lowercase letters will be consid‐
88       ered equivalent.
89
90
91
92       val quote : string -> string
93
94
95       Str.quote s returns a regexp string that matches exactly s and  nothing
96       else.
97
98
99
100       val regexp_string : string -> regexp
101
102
103       Str.regexp_string s returns a regular expression that matches exactly s
104       and nothing else.
105
106
107
108       val regexp_string_case_fold : string -> regexp
109
110
111       Str.regexp_string_case_fold is similar to Str.regexp_string ,  but  the
112       regexp matches in a case-insensitive way.
113
114
115
116
117   String matching and searching
118       val string_match : regexp -> string -> int -> bool
119
120
121       string_match  r  s  start tests whether a substring of s that starts at
122       position start matches the regular expression r .  The first  character
123       of a string has position 0 , as usual.
124
125
126
127       val search_forward : regexp -> string -> int -> int
128
129
130       search_forward r s start searches the string s for a substring matching
131       the regular expression r . The search starts at position start and pro‐
132       ceeds  towards the end of the string.  Return the position of the first
133       character of the matched substring.
134
135
136       Raises Not_found if no substring matches.
137
138
139
140       val search_backward : regexp -> string -> int -> int
141
142
143       search_backward r s last searches the string s for a substring matching
144       the  regular  expression r . The search first considers substrings that
145       start at position last and proceeds towards the  beginning  of  string.
146       Return the position of the first character of the matched substring.
147
148
149       Raises Not_found if no substring matches.
150
151
152
153       val string_partial_match : regexp -> string -> int -> bool
154
155       Similar  to  Str.string_match  ,  but also returns true if the argument
156       string is a prefix of a string that matches.  This includes the case of
157       a true complete match.
158
159
160
161       val matched_string : string -> string
162
163
164       matched_string  s  returns  the  substring of s that was matched by the
165       last call to one of the following matching or searching functions:
166
167       - Str.string_match
168
169
170       - Str.search_forward
171
172
173       - Str.search_backward
174
175
176       - Str.string_partial_match
177
178
179       - Str.global_substitute
180
181
182       - Str.substitute_first
183
184       provided that none of the following functions was called in between:
185
186       - Str.global_replace
187
188
189       - Str.replace_first
190
191
192       - Str.split
193
194
195       - Str.bounded_split
196
197
198       - Str.split_delim
199
200
201       - Str.bounded_split_delim
202
203
204       - Str.full_split
205
206
207       - Str.bounded_full_split
208
209       Note: in the case of global_substitute and substitute_first , a call to
210       matched_string  is  only  valid  within  the  subst argument, not after
211       global_substitute or substitute_first returns.
212
213       The user must make sure that the parameter s is the  same  string  that
214       was passed to the matching or searching function.
215
216
217
218       val match_beginning : unit -> int
219
220
221       match_beginning()  returns  the  position of the first character of the
222       substring that was matched by the last call to a matching or  searching
223       function (see Str.matched_string for details).
224
225
226
227       val match_end : unit -> int
228
229
230       match_end()  returns  the  position of the character following the last
231       character of the substring that was matched  by  the  last  call  to  a
232       matching or searching function (see Str.matched_string for details).
233
234
235
236       val matched_group : int -> string -> string
237
238
239       matched_group  n s returns the substring of s that was matched by the n
240       th group \(...\) of the regular expression that was matched by the last
241       call  to  a  matching or searching function (see Str.matched_string for
242       details).  The user must make sure that the parameter  s  is  the  same
243       string that was passed to the matching or searching function.
244
245
246       Raises  Not_found  if  the n th group of the regular expression was not
247       matched.  This can happen with groups inside alternatives \| ,  options
248       ?  or repetitions * .  For instance, the empty string will match \(a\)*
249       , but matched_group 1 "" will raise Not_found because the  first  group
250       itself was not matched.
251
252
253
254       val group_beginning : int -> int
255
256
257       group_beginning  n  returns  the position of the first character of the
258       substring that was matched by the n th group of the regular  expression
259       that  was  matched by the last call to a matching or searching function
260       (see Str.matched_string for details).
261
262
263       Raises Not_found if the n th group of the regular  expression  was  not
264       matched.
265
266
267       Raises Invalid_argument if there are fewer than n groups in the regular
268       expression.
269
270
271
272       val group_end : int -> int
273
274
275       group_end n returns the position of the character  following  the  last
276       character  of substring that was matched by the n th group of the regu‐
277       lar expression that was matched by the  last  call  to  a  matching  or
278       searching function (see Str.matched_string for details).
279
280
281       Raises  Not_found  if  the n th group of the regular expression was not
282       matched.
283
284
285       Raises Invalid_argument if there are fewer than n groups in the regular
286       expression.
287
288
289
290
291   Replacement
292       val global_replace : regexp -> string -> string -> string
293
294
295       global_replace  regexp templ s returns a string identical to s , except
296       that all substrings of s that match regexp have been replaced by  templ
297       .  The  replacement  template  templ  can  contain \1 , \2 , etc; these
298       sequences will be replaced by the text  matched  by  the  corresponding
299       group in the regular expression.  \0 stands for the text matched by the
300       whole regular expression.
301
302
303
304       val replace_first : regexp -> string -> string -> string
305
306       Same as Str.global_replace ,  except  that  only  the  first  substring
307       matching the regular expression is replaced.
308
309
310
311       val  global_substitute  :  regexp  ->  (string  -> string) -> string ->
312       string
313
314
315       global_substitute regexp subst s returns a  string  identical  to  s  ,
316       except that all substrings of s that match regexp have been replaced by
317       the result of function subst . The function subst is  called  once  for
318       each matching substring, and receives s (the whole text) as argument.
319
320
321
322       val substitute_first : regexp -> (string -> string) -> string -> string
323
324       Same  as  Str.global_substitute  , except that only the first substring
325       matching the regular expression is replaced.
326
327
328
329       val replace_matched : string -> string -> string
330
331
332       replace_matched repl s returns the replacement text repl in which \1  ,
333       \2  ,  etc. have been replaced by the text matched by the corresponding
334       groups in the regular expression that was matched by the last call to a
335       matching or searching function (see Str.matched_string for details).  s
336       must be the same string that was passed to the  matching  or  searching
337       function.
338
339
340
341
342   Splitting
343       val split : regexp -> string -> string list
344
345
346       split r s splits s into substrings, taking as delimiters the substrings
347       that match r , and returns the list of substrings.  For instance, split
348       (regexp "[ \t]+") s splits s into blank-separated words.  An occurrence
349       of the delimiter at the beginning or  at  the  end  of  the  string  is
350       ignored.
351
352
353
354       val bounded_split : regexp -> string -> int -> string list
355
356       Same  as  Str.split  , but splits into at most n substrings, where n is
357       the extra integer parameter.
358
359
360
361       val split_delim : regexp -> string -> string list
362
363       Same as Str.split but occurrences of the delimiter at the beginning and
364       at  the  end of the string are recognized and returned as empty strings
365       in the result.  For instance, split_delim (regexp " ") " abc "  returns
366       [""; "abc"; ""] , while split with the same arguments returns ["abc"] .
367
368
369
370       val bounded_split_delim : regexp -> string -> int -> string list
371
372       Same  as  Str.bounded_split  ,  but occurrences of the delimiter at the
373       beginning and at the end of the string are recognized and  returned  as
374       empty strings in the result.
375
376
377       type split_result =
378        | Text of string
379        | Delim of string
380
381
382
383
384
385       val full_split : regexp -> string -> split_result list
386
387       Same  as  Str.split_delim  ,  but returns the delimiters as well as the
388       substrings contained between delimiters.  The former are  tagged  Delim
389       in  the  result  list;  the  latter  are  tagged  Text .  For instance,
390       full_split (regexp "[{}]") "{ab}" returns [Delim "{"; Text "ab";  Delim
391       "}"] .
392
393
394
395       val bounded_full_split : regexp -> string -> int -> split_result list
396
397       Same as Str.bounded_split_delim , but returns the delimiters as well as
398       the substrings contained between delimiters.   The  former  are  tagged
399       Delim in the result list; the latter are tagged Text .
400
401
402
403
404   Extracting substrings
405       val string_before : string -> int -> string
406
407
408       string_before  s  n  returns  the substring of all characters of s that
409       precede position n (excluding the character at position n ).
410
411
412
413       val string_after : string -> int -> string
414
415
416       string_after s n returns the substring of all characters of s that fol‐
417       low position n (including the character at position n ).
418
419
420
421       val first_chars : string -> int -> string
422
423
424       first_chars s n returns the first n characters of s .  This is the same
425       function as Str.string_before .
426
427
428
429       val last_chars : string -> int -> string
430
431
432       last_chars s n returns the last n characters of s .
433
434
435
436
437
438OCamldoc                          2020-02-27                            Str(3)
Impressum