sscanf(3)                  Library Functions Manual                  sscanf(3)

NAME
sscanf, vsscanf - input string format conversion

LIBRARY
Standard C library (libc, -lc)

SYNOPSIS
#include <stdio.h>

int sscanf(const char *restrict str,
           const char *restrict format, ...);

#include <stdarg.h>

int vsscanf(const char *restrict str,
            const char *restrict format, va_list ap);

Feature Test Macro Requirements for glibc (see feature_test_macros(7)):

vsscanf():
    _ISOC99_SOURCE || _POSIX_C_SOURCE >= 200112L

DESCRIPTION
The sscanf() family of functions scans input according to format as
described below. This format may contain conversion specifications;
the results from such conversions, if any, are stored in the locations
pointed to by the pointer arguments that follow format. Each pointer
argument must be of a type that is appropriate for the value returned
by the corresponding conversion specification.

If the number of conversion specifications in format exceeds the number
of pointer arguments, the results are undefined. If the number of
pointer arguments exceeds the number of conversion specifications, then
the excess pointer arguments are evaluated, but are otherwise ignored.

sscanf() and vsscanf() read their input from the string pointed to by
str.

The vsscanf() function is analogous to vsprintf(3): it takes a va_list
in place of a variable number of pointer arguments.

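A minimal sketch of how vsscanf() is typically reached: a variadic
wrapper forwards its pointer arguments through a va_list. The helper
name scan_line() is hypothetical and is not part of any library.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: scan `line` according to `format`, forwarding
   the variable pointer arguments to vsscanf().  The return value is
   whatever vsscanf() returns. */
static int scan_line(const char *line, const char *format, ...)
{
    va_list ap;
    int n;

    va_start(ap, format);
    n = vsscanf(line, format, ap);
    va_end(ap);
    return n;
}
```

A call such as scan_line(text, "%15[^@]@%15s", user, host), with user
and host being 16-byte char arrays, then behaves like the equivalent
sscanf() call.
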
The format string consists of a sequence of directives which describe
how to process the sequence of input characters. If processing of a
directive fails, no further input is read, and sscanf() returns. A
"failure" can be either of the following: input failure, meaning that
input characters were unavailable, or matching failure, meaning that
the input was inappropriate (see below).

A directive is one of the following:

•  A sequence of white-space characters (space, tab, newline, etc.;
   see isspace(3)). This directive matches any amount of white space,
   including none, in the input.

•  An ordinary character (i.e., one other than white space or '%').
   This character must exactly match the next character of input.

•  A conversion specification, which commences with a '%' (percent)
   character. A sequence of characters from the input is converted
   according to this specification, and the result is placed in the
   corresponding pointer argument. If the next item of input does not
   match the conversion specification, the conversion fails; this is a
   matching failure.

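The three kinds of directive can be seen together in one format
string. The following is a sketch only; parse_pair() and the 16-byte
buffer size are assumptions made for the illustration.

```c
#include <stdio.h>

/* Hypothetical helper: parse "key = value" into two caller-supplied
   16-byte buffers.  Returns the sscanf() result unchanged. */
static int parse_pair(const char *line, char key[16], char value[16])
{
    /* %15[a-z]  conversion: 1 to 15 lowercase letters
       " "       white-space directive: any amount of white space,
                 including none
       "="       ordinary character: must match exactly
       %15s      conversion: skips white space, then reads up to 15
                 non-white-space bytes */
    return sscanf(line, "%15[a-z] = %15s", key, value);
}
```

With the input "name : value" the ordinary character '=' does not
match ':', so processing stops after one assignment and the call
returns 1 rather than 2.
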
Each conversion specification in format begins with either the
character '%' or the character sequence "%n$" (see below for the
distinction) followed by:

•  An optional '*' assignment-suppression character: sscanf() reads
   input as directed by the conversion specification, but discards the
   input. No corresponding pointer argument is required, and this
   specification is not included in the count of successful
   assignments returned by sscanf().

•  For decimal conversions, an optional quote character ('). This
   specifies that the input number may include thousands' separators
   as defined by the LC_NUMERIC category of the current locale. (See
   setlocale(3).) The quote character may precede or follow the '*'
   assignment-suppression character.

•  An optional 'm' character. This is used with string conversions
   (%s, %c, %[), and relieves the caller of the need to allocate a
   corresponding buffer to hold the input: instead, sscanf() allocates
   a buffer of sufficient size, and assigns the address of this buffer
   to the corresponding pointer argument, which should be a pointer to
   a char * variable (this variable does not need to be initialized
   before the call). The caller should subsequently free(3) this
   buffer when it is no longer required.

•  An optional decimal integer which specifies the maximum field
   width. Reading of characters stops either when this maximum is
   reached or when a nonmatching character is found, whichever happens
   first. Most conversions discard initial white space characters (the
   exceptions are noted below), and these discarded characters don't
   count toward the maximum field width. String input conversions
   store a terminating null byte ('\0') to mark the end of the input;
   the maximum field width does not include this terminator.

•  An optional type modifier character. For example, the l type
   modifier is used with integer conversions such as %d to specify
   that the corresponding pointer argument is a pointer to a long
   rather than a pointer to an int.

•  A conversion specifier that specifies the type of input conversion
   to be performed.

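As a sketch of assignment suppression combined with a maximum field
width, the following hypothetical helper, second_word(), skips one
word and stores the next. The buffer is one byte larger than the
field width to leave room for the terminating null byte.

```c
#include <stdio.h>

/* Hypothetical helper: copy the second white-space-separated word of
   `line` into `word`, storing at most 7 bytes plus the terminating
   null byte. */
static int second_word(const char *line, char word[8])
{
    /* %*s reads and discards the first word; it takes no pointer
       argument and is not counted in the return value. */
    return sscanf(line, "%*s %7s", word);
}
```

A second word longer than seven bytes is cut at the field width; the
remaining bytes stay unread.
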
The conversion specifications in format are of two forms, either
beginning with '%' or beginning with "%n$". The two forms should not
be mixed in the same format string, except that a string containing
"%n$" specifications can include %% and %*. If format contains '%'
specifications, then these correspond in order with successive pointer
arguments. In the "%n$" form (which is specified in POSIX.1-2001, but
not C99), n is a decimal integer that specifies that the converted
input should be placed in the location referred to by the n-th pointer
argument following format.

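A short sketch of the "%n$" form, assuming a C library that
implements this POSIX feature (glibc does). The helper name
swap_names() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read "<given> <family>".  The first word of
   input is stored through the SECOND pointer argument and the second
   word through the FIRST, as selected by the "%n$" numbers. */
static int swap_names(const char *line, char family[16], char given[16])
{
    return sscanf(line, "%2$15s %1$15s", family, given);
}
```

Because this form is not part of C99, a compiler run in a strict ISO
mode may warn about the format string.
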
Conversions
The following type modifier characters can appear in a conversion
specification:

h      Indicates that the conversion will be one of d, i, o, u, x, X,
       or n and the next pointer is a pointer to a short or unsigned
       short (rather than int).

hh     As for h, but the next pointer is a pointer to a signed char or
       unsigned char.

j      As for h, but the next pointer is a pointer to an intmax_t or a
       uintmax_t. This modifier was introduced in C99.

l      Indicates either that the conversion will be one of d, i, o, u,
       x, X, or n and the next pointer is a pointer to a long or
       unsigned long (rather than int), or that the conversion will be
       one of e, f, or g and the next pointer is a pointer to double
       (rather than float). If used with %c or %s, the corresponding
       parameter is considered as a pointer to a wide character or
       wide-character string respectively.

ll     (ell-ell) Indicates that the conversion will be one of b, d, i,
       o, u, x, X, or n and the next pointer is a pointer to a long
       long or unsigned long long (rather than int).

L      Indicates that the conversion will be either e, f, or g and the
       next pointer is a pointer to long double or (as a GNU extension)
       the conversion will be d, i, o, u, or x and the next pointer is
       a pointer to long long.

q      Equivalent to L. This specifier does not exist in ANSI C.

t      As for h, but the next pointer is a pointer to a ptrdiff_t.
       This modifier was introduced in C99.

z      As for h, but the next pointer is a pointer to a size_t. This
       modifier was introduced in C99.

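The pairing of type modifiers with pointer types can be sketched as
follows. The struct and the helper read_sample() are hypothetical.
The input is assumed to be well formed and in range; for untrusted
numeric input this page recommends strtol(3) and related functions
instead.

```c
#include <stddef.h>
#include <stdio.h>

struct sample {
    short  s;   /* %hd: pointer to short  */
    long   l;   /* %ld: pointer to long   */
    size_t z;   /* %zu: pointer to size_t */
    float  f;   /* %f:  pointer to float  */
    double d;   /* %lf: pointer to double */
};

/* Hypothetical helper: read five numbers, one per member. */
static int read_sample(const char *text, struct sample *out)
{
    return sscanf(text, "%hd %ld %zu %f %lf",
                  &out->s, &out->l, &out->z, &out->f, &out->d);
}
```

Note the difference from printf(3): there, %f serves for both float
and double, whereas here a double requires the l modifier.
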
The following conversion specifiers are available:

%      Matches a literal '%'. That is, %% in the format string matches
       a single input '%' character. No conversion is done (but
       initial white space characters are discarded), and assignment
       does not occur.

d      Deprecated. Matches an optionally signed decimal integer; the
       next pointer must be a pointer to int.

i      Deprecated. Matches an optionally signed integer; the next
       pointer must be a pointer to int. The integer is read in base
       16 if it begins with 0x or 0X, in base 8 if it begins with 0,
       and in base 10 otherwise. Only characters that correspond to
       the base are used.

o      Deprecated. Matches an unsigned octal integer; the next pointer
       must be a pointer to unsigned int.

u      Deprecated. Matches an unsigned decimal integer; the next
       pointer must be a pointer to unsigned int.

x      Deprecated. Matches an unsigned hexadecimal integer (that may
       optionally begin with a prefix of 0x or 0X, which is discarded);
       the next pointer must be a pointer to unsigned int.

X      Deprecated. Equivalent to x.

f      Deprecated. Matches an optionally signed floating-point number;
       the next pointer must be a pointer to float.

e      Deprecated. Equivalent to f.

g      Deprecated. Equivalent to f.

E      Deprecated. Equivalent to f.

a      Deprecated. (C99) Equivalent to f.

s      Matches a sequence of non-white-space characters; the next
       pointer must be a pointer to the initial element of a character
       array that is long enough to hold the input sequence and the
       terminating null byte ('\0'), which is added automatically. The
       input string stops at white space or at the maximum field width,
       whichever occurs first.

c      Matches a sequence of characters whose length is specified by
       the maximum field width (default 1); the next pointer must be a
       pointer to char, and there must be enough room for all the
       characters (no terminating null byte is added). The usual skip
       of leading white space is suppressed. To skip white space
       first, use an explicit space in the format.

[      Matches a nonempty sequence of characters from the specified set
       of accepted characters; the next pointer must be a pointer to
       char, and there must be enough room for all the characters in
       the string, plus a terminating null byte. The usual skip of
       leading white space is suppressed. The string is to be made up
       of characters in (or not in) a particular set; the set is
       defined by the characters between the open bracket [ character
       and a close bracket ] character. The set excludes those
       characters if the first character after the open bracket is a
       circumflex (^). To include a close bracket in the set, make it
       the first character after the open bracket or the circumflex;
       any other position will end the set. The hyphen character - is
       also special; when placed between two other characters, it adds
       all intervening characters to the set. To include a hyphen,
       make it the last character before the final close bracket. For
       instance, [^]0-9-] means the set "everything except close
       bracket, zero through nine, and hyphen". The string ends with
       the appearance of a character not in the (or, with a circumflex,
       in) set or when the field width runs out.

p      Matches a pointer value (as printed by %p in printf(3)); the
       next pointer must be a pointer to a pointer to void.

n      Nothing is expected; instead, the number of characters consumed
       thus far from the input is stored through the next pointer,
       which must be a pointer to int, or to a variant whose size
       matches the (optionally) supplied integer length modifier. This
       is not a conversion and does not increase the count returned by
       the function. The assignment can be suppressed with the *
       assignment-suppression character, but the effect on the return
       value is undefined. Therefore %*n conversions should not be
       used.

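The %[, %c, and %n specifiers are often used together. The following
sketch uses a hypothetical helper, parse_record(), and an assumed
16-byte name buffer.

```c
#include <stdio.h>

/* Hypothetical helper: split "<name>,<flag>", where <name> is a
   nonempty run of bytes other than ',' and <flag> is a single byte.
   Returns the number of bytes consumed, or -1 on a mismatch. */
static int parse_record(const char *text, char name[16], char *flag)
{
    int used = -1;

    /* %15[^,]  negated set: stops at ',' or after 15 bytes
       ,        ordinary character
       %c       exactly one byte; no white-space skip, no '\0' stored
       %n       stores the count consumed so far; it is not included
                in the value returned by sscanf() */
    if (sscanf(text, "%15[^,],%c%n", name, flag, &used) != 2)
        return -1;
    return used;
}
```

Since %[ does not skip leading white space and accepts any byte
outside the negated set, a name such as "disk 1" is stored with its
embedded space.
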
RETURN VALUE
On success, these functions return the number of input items
successfully matched and assigned; this can be fewer than provided
for, or even zero, in the event of an early matching failure.

The value EOF is returned if the end of input is reached before either
the first successful conversion or a matching failure occurs.

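The three outcomes described above can be told apart as in the
following sketch. The enum and the helper read_digits() are
hypothetical names; the digits are read as text, so no numeric
conversion is involved.

```c
#include <stdio.h>

enum scan_status { SCAN_OK, SCAN_MISMATCH, SCAN_EMPTY };

/* Hypothetical helper: read a leading run of digits, as text, into
   `digits` (at most 11 bytes plus the terminating null byte). */
static enum scan_status read_digits(const char *text, char digits[12])
{
    int n = sscanf(text, "%11[0-9]", digits);

    if (n == EOF)
        return SCAN_EMPTY;      /* input ended before any conversion */
    if (n == 0)
        return SCAN_MISMATCH;   /* early matching failure */
    return SCAN_OK;             /* n == 1: one item assigned */
}
```
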
ERRORS
EILSEQ Input byte sequence does not form a valid character.

EINVAL Not enough arguments; or format is NULL.

ENOMEM Out of memory.

ATTRIBUTES
For an explanation of the terms used in this section, see
attributes(7).

┌─────────────────────────────────────┬───────────────┬────────────────┐
│Interface                            │ Attribute     │ Value          │
├─────────────────────────────────────┼───────────────┼────────────────┤
│sscanf(), vsscanf()                  │ Thread safety │ MT-Safe locale │
└─────────────────────────────────────┴───────────────┴────────────────┘

STANDARDS
C11, POSIX.1-2008.

HISTORY
C89, POSIX.1-2001.

The q specifier is the 4.4BSD notation for long long, while ll or the
usage of L in integer conversions is the GNU notation.

The Linux version of these functions is based on the GNU libio library.
Take a look at the info documentation of GNU libc (glibc-1.08) for a
more concise description.

NOTES
The 'a' assignment-allocation modifier
Originally, the GNU C library supported dynamic allocation for string
inputs (as a nonstandard extension) via the a character. (This feature
is present at least as far back as glibc 2.0.) Thus, one could write
the following to have sscanf() allocate a buffer for a string, with a
pointer to that buffer being returned in *buf:

    char *buf;
    sscanf(str, "%as", &buf);

The use of the letter a for this purpose was problematic, since a is
also specified by the ISO C standard as a synonym for f (floating-point
input). POSIX.1-2008 instead specifies the m modifier for assignment
allocation (as documented in DESCRIPTION, above).

Note that the a modifier is not available if the program is compiled
with gcc -std=c99 or gcc -D_ISOC99_SOURCE (unless _GNU_SOURCE is also
specified), in which case the a is interpreted as a specifier for
floating-point numbers (see above).

Support for the m modifier was added to glibc 2.7, and new programs
should use that modifier instead of a.

As well as being standardized by POSIX, the m modifier has the
following further advantages over the use of a:

•  It may also be applied to %c conversion specifiers (e.g., %3mc).

•  It avoids ambiguity with respect to the %a floating-point conversion
   specifier (and is unaffected by gcc -std=c99 etc.).

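A sketch of the m modifier applied to %c, assuming a C library that
supports it (glibc 2.7 or later). The helper name first_three() is
hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return the first three bytes of `text` in a
   buffer allocated by sscanf(), or NULL on failure.  The buffer is
   NOT null-terminated, because %c never stores a '\0'.  The caller
   must free(3) the result. */
static char *first_three(const char *text)
{
    char *p = NULL;

    if (sscanf(text, "%3mc", &p) != 1)
        return NULL;
    return p;
}
```

Because no terminator is stored, the result should be handled with
length-bounded calls such as memcpy(3), or printed with a precision,
as in printf("%.3s", p).
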
BUGS
Numeric conversion specifiers
Use of the numeric conversion specifiers produces undefined behavior
for invalid input: if the result of a conversion cannot be represented
in the object that receives it (for example, an out-of-range integer),
the behavior is undefined. See C11 7.21.6.2/10
⟨https://port70.net/%7Ensz/c/c11/n1570.html#7.21.6.2p10⟩. This is a
bug in the ISO C standard, and not an inherent design issue with the
API. However, current implementations are not safe from that bug, so
using these specifiers is not recommended. Instead, programs should
use functions such as strtol(3) to parse numeric input. This manual
page deprecates use of the numeric conversion specifiers until they are
fixed by ISO C.

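One way to follow the strtol(3) recommendation is sketched below. The
helper parse_int() and its policy of rejecting trailing characters are
assumptions made for the illustration, not requirements of strtol(3).

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 int from `text`.  Returns 0 on
   success, or -1 if no digits were found, if trailing characters
   remain, or if the value does not fit in an int. */
static int parse_int(const char *text, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;      /* nothing parsed, or trailing characters */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;      /* out of range: reported, not undefined */
    *out = (int)v;
    return 0;
}
```

Unlike %d, an out-of-range value here produces a well-defined error
return.
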
Nonstandard modifiers
These functions are fully C99 conformant, but provide the additional
modifiers q and a as well as an additional behavior of the L and ll
modifiers. The latter may be considered to be a bug, as it changes the
behavior of modifiers defined in C99.

Some combinations of the type modifiers and conversion specifiers
defined by C99 do not make sense (e.g., %Ld). While they may have a
well-defined behavior on Linux, this need not be so on other
architectures. Therefore it is usually better to use modifiers that
are not defined by C99 at all, that is, use q instead of L in
combination with d, i, o, u, x, and X conversions or ll.

The usage of q is not the same as on 4.4BSD, as it may be used in float
conversions equivalently to L.

EXAMPLES
To use the dynamic allocation conversion specifier, specify m as a
length modifier (thus %ms or %m[range]). The caller must free(3) the
returned string, as in the following example:

    char *p;
    int n;

    errno = 0;
    n = sscanf(str, "%m[a-z]", &p);
    if (n == 1) {
        printf("read: %s\n", p);
        free(p);
    } else if (errno != 0) {
        perror("sscanf");
    } else {
        fprintf(stderr, "No matching characters\n");
    }

As shown in the above example, it is necessary to call free(3) only if
the sscanf() call successfully read a string.

SEE ALSO
getc(3), printf(3), setlocale(3), strtod(3), strtol(3), strtoul(3)

Linux man-pages 6.04              2023-03-30                       sscanf(3)