FSCANF(3P)                 POSIX Programmer's Manual                FSCANF(3P)

PROLOG
6 This manual page is part of the POSIX Programmer's Manual. The Linux
7 implementation of this interface may differ (consult the corresponding
8 Linux manual page for details of Linux behavior), or the interface may
9 not be implemented on Linux.

NAME
12 fscanf, scanf, sscanf — convert formatted input

SYNOPSIS
15 #include <stdio.h>
16
17 int fscanf(FILE *restrict stream, const char *restrict format, ...);
18 int scanf(const char *restrict format, ...);
19 int sscanf(const char *restrict s, const char *restrict format, ...);

DESCRIPTION
22 The functionality described on this reference page is aligned with the
23 ISO C standard. Any conflict between the requirements described here
24 and the ISO C standard is unintentional. This volume of POSIX.1‐2017
25 defers to the ISO C standard.
26
27 The fscanf() function shall read from the named input stream. The
28 scanf() function shall read from the standard input stream stdin. The
29 sscanf() function shall read from the string s. Each function reads
30 bytes, interprets them according to a format, and stores the results in
31 its arguments. Each expects, as arguments, a control string format
32 described below, and a set of pointer arguments indicating where the
33 converted input should be stored. The result is undefined if there are
34 insufficient arguments for the format. If the format is exhausted while
35 arguments remain, the excess arguments shall be evaluated but otherwise
36 ignored.
37
38 Conversions can be applied to the nth argument after the format in the
39 argument list, rather than to the next unused argument. In this case,
40 the conversion specifier character % (see below) is replaced by the
41 sequence "%n$", where n is a decimal integer in the range
42 [1,{NL_ARGMAX}]. This feature provides for the definition of format
43 strings that select arguments in an order appropriate to specific lan‐
44 guages. In format strings containing the "%n$" form of conversion spec‐
45 ifications, it is unspecified whether numbered arguments in the argu‐
46 ment list can be referenced from the format string more than once.
47
48 The format can contain either form of a conversion specification—that
49 is, % or "%n$"—but the two forms cannot be mixed within a single format
50 string. The only exception to this is that %% or %* can be mixed with
51 the "%n$" form. When numbered argument specifications are used, speci‐
52 fying the Nth argument requires that all the leading arguments, from
53 the first to the (N-1)th, are pointers.
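
As a rough illustration of the numbered form, the sketch below reads a word followed by an integer while storing through arguments that are passed in the opposite order. The helper name parse_name_count() and the buffer size are assumptions made for this example, and it presumes an implementation that supports the "%n$" form.

```c
#include <stdio.h>

/* Hypothetical helper (not part of any standard API).
 * Input has the form "<name> <count>", but the caller passes the
 * pointers as (count, name). Numbered conversions select the target
 * argument explicitly:
 *   %2$31s -> second pointer argument (name), at most 31 bytes + NUL
 *   %1$d   -> first pointer argument (count)
 * Both arguments are pointers, as the numbered form requires.
 * Returns the number of assigned items (2 on success). */
int parse_name_count(const char *line, int *count, char name[32])
{
    return sscanf(line, "%2$31s %1$d", count, name);
}
```

Calling parse_name_count("Hamster 25", &count, name) is expected to store 25 in count and "Hamster" in name on such an implementation.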
54
55 The fscanf() function in all its forms shall allow detection of a lan‐
56 guage-dependent radix character in the input string. The radix charac‐
57 ter is defined in the current locale (category LC_NUMERIC). In the
58 POSIX locale, or in a locale where the radix character is not defined,
59 the radix character shall default to a <period> ('.').
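
A minimal sketch of the radix behavior, assuming the hypothetical helper name roundtrip_radix(): it builds its input from whatever radix character the current LC_NUMERIC locale reports through localeconv(), so the same code reads "3.5" in the POSIX locale and "3,5" in a locale whose radix character is a comma.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: writes "3<radix>5" using the radix character of
 * the current locale, then reads it back with %lf.
 * Returns 1 and stores the value in *out on success. */
int roundtrip_radix(double *out)
{
    char buf[32];
    const struct lconv *lc = localeconv();

    /* decimal_point is "." in the POSIX locale. */
    snprintf(buf, sizeof buf, "3%s5", lc->decimal_point);
    return sscanf(buf, "%lf", out);
}
```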
60
61 The format is a character string, beginning and ending in its initial
62 shift state, if any, composed of zero or more directives. Each direc‐
63 tive is composed of one of the following: one or more white-space char‐
64 acters (<space>, <tab>, <newline>, <vertical-tab>, or <form-feed>); an
65 ordinary character (neither '%' nor a white-space character); or a con‐
66 version specification. Each conversion specification is introduced by
67 the character '%' or the character sequence "%n$", after which the fol‐
68 lowing appear in sequence:
69
70 * An optional assignment-suppressing character '*'.
71
72 * An optional non-zero decimal integer that specifies the maximum
73 field width.
74
75 * An optional assignment-allocation character 'm'.
76
* An optional length modifier that specifies the size of the receiving
object.
79
80 * A conversion specifier character that specifies the type of conver‐
81 sion to be applied. The valid conversion specifiers are described
82 below.
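
The parts of a conversion specification listed above can be seen together in the following sketch. The helper name parse_record() and the record layout are assumptions chosen for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: reads "<ignored int> <word> <long>".
 *   %*d  '*' suppresses assignment: the integer is matched, not stored
 *   %7s  field width 7: at most 7 bytes are read, plus a terminating NUL
 *   %ld  length modifier l: the argument is a pointer to long
 * Returns the number of assigned items; the suppressed %*d is not
 * counted, so full success is 2. */
int parse_record(const char *s, char word[8], long *value)
{
    return sscanf(s, "%*d %7s %ld", word, value);
}
```

With the input "99 abcdefghij 42" the field width stops the word after "abcdefg", the following %ld then sees 'h', and the call returns 1.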
83
84 The fscanf() functions shall execute each directive of the format in
85 turn. If a directive fails, as detailed below, the function shall
86 return. Failures are described as input failures (due to the unavail‐
87 ability of input bytes) or matching failures (due to inappropriate
88 input).
89
90 A directive composed of one or more white-space characters shall be
91 executed by reading input until no more valid input can be read, or up
92 to the first byte which is not a white-space character, which remains
93 unread.
94
95 A directive that is an ordinary character shall be executed as follows:
96 the next byte shall be read from the input and compared with the byte
97 that comprises the directive; if the comparison shows that they are not
98 equivalent, the directive shall fail, and the differing and subsequent
99 bytes shall remain unread. Similarly, if end-of-file, an encoding
100 error, or a read error prevents a character from being read, the direc‐
101 tive shall fail.
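
The difference between a white-space directive and an ordinary-character directive can be sketched as follows; parse_pair() is a hypothetical helper name used only for this example.

```c
#include <stdio.h>

/* Hypothetical helper: parses two integers separated by a comma.
 * In the format "%d ,%d":
 *   ' '  is a white-space directive: it skips any amount of white
 *        space in the input, including none
 *   ','  is an ordinary character: the next input byte must be ','
 *        or the directive fails and that byte stays unread
 * The second %d skips leading white space by itself.
 * Returns the number of assigned items. */
int parse_pair(const char *s, int *x, int *y)
{
    return sscanf(s, "%d ,%d", x, y);
}
```

Both "1,2" and "1   ,   2" yield 2 assignments, while "1;2" stops at the ';' and yields 1.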
102
103 A directive that is a conversion specification defines a set of match‐
104 ing input sequences, as described below for each conversion character.
105 A conversion specification shall be executed in the following steps.
106
107 Input white-space characters (as specified by isspace()) shall be
108 skipped, unless the conversion specification includes a [, c, C, or n
109 conversion specifier.
110
111 An item shall be read from the input, unless the conversion specifica‐
112 tion includes an n conversion specifier. An input item shall be defined
113 as the longest sequence of input bytes (up to any specified maximum
114 field width, which may be measured in characters or bytes dependent on
115 the conversion specifier) which is an initial subsequence of a matching
116 sequence. The first byte, if any, after the input item shall remain
117 unread. If the length of the input item is 0, the execution of the con‐
118 version specification shall fail; this condition is a matching failure,
119 unless end-of-file, an encoding error, or a read error prevented input
120 from the stream, in which case it is an input failure.
121
122 Except in the case of a % conversion specifier, the input item (or, in
123 the case of a %n conversion specification, the count of input bytes)
124 shall be converted to a type appropriate to the conversion character.
125 If the input item is not a matching sequence, the execution of the con‐
126 version specification fails; this condition is a matching failure.
127 Unless assignment suppression was indicated by a '*', the result of the
128 conversion shall be placed in the object pointed to by the first argu‐
129 ment following the format argument that has not already received a con‐
130 version result if the conversion specification is introduced by %, or
131 in the nth argument if introduced by the character sequence "%n$". If
132 this object does not have an appropriate type, or if the result of the
133 conversion cannot be represented in the space provided, the behavior is
134 undefined.
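
A short sketch of the input-item rule, using the hypothetical helper int_then_next(): %d takes the longest prefix that can form a decimal integer, and the first byte after it remains available to the next conversion.

```c
#include <stdio.h>

/* Hypothetical helper: reads an int from the front of s, then reads
 * the byte that %d left unread.
 * Returns the number of assigned items:
 *   2  an integer and a following byte were read
 *   1  an integer was read and the input ended
 *   0  the first byte cannot start an integer (matching failure) */
int int_then_next(const char *s, int *value, char *next)
{
    return sscanf(s, "%d%c", value, next);
}
```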
135
136 The %c, %s, and %[ conversion specifiers shall accept an optional
137 assignment-allocation character 'm', which shall cause a memory buffer
138 to be allocated to hold the string converted including a terminating
139 null character. In such a case, the argument corresponding to the con‐
140 version specifier should be a reference to a pointer variable that will
141 receive a pointer to the allocated buffer. The system shall allocate a
142 buffer as if malloc() had been called. The application shall be respon‐
143 sible for freeing the memory after usage. If there is insufficient mem‐
144 ory to allocate a buffer, the function shall set errno to [ENOMEM] and
145 a conversion error shall result. If the function returns EOF, any mem‐
146 ory successfully allocated for parameters using assignment-allocation
147 character 'm' by this call shall be freed before the function returns.
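
The following sketch shows one way the 'm' assignment-allocation character might be used. The helper name first_word() is an assumption for this example, and the code presumes an implementation that supports 'm'.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of the first
 * white-space-delimited word in s, or NULL if there is none.
 * With %ms the argument is a pointer to a char pointer; the buffer is
 * allocated as if by malloc(), so the caller must free() the result. */
char *first_word(const char *s)
{
    char *word = NULL;

    if (sscanf(s, "%ms", &word) != 1)
        return NULL;    /* nothing was assigned */
    return word;
}
```

A caller would typically write: char *w = first_word(line); if (w != NULL) { use it; free(w); }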
148
149 The length modifiers and their meanings are:
150
151 hh Specifies that a following d, i, o, u, x, X, or n conversion
152 specifier applies to an argument with type pointer to signed
153 char or unsigned char.
154
155 h Specifies that a following d, i, o, u, x, X, or n conversion
156 specifier applies to an argument with type pointer to short or
157 unsigned short.
158
159 l (ell) Specifies that a following d, i, o, u, x, X, or n conversion
160 specifier applies to an argument with type pointer to long or
161 unsigned long; that a following a, A, e, E, f, F, g, or G con‐
162 version specifier applies to an argument with type pointer to
163 double; or that a following c, s, or [ conversion specifier
164 applies to an argument with type pointer to wchar_t. If the
165 'm' assignment-allocation character is specified, the conver‐
166 sion applies to an argument with the type pointer to a pointer
167 to wchar_t.
168
169 ll (ell-ell)
170 Specifies that a following d, i, o, u, x, X, or n conversion
171 specifier applies to an argument with type pointer to long long
172 or unsigned long long.
173
174 j Specifies that a following d, i, o, u, x, X, or n conversion
175 specifier applies to an argument with type pointer to intmax_t
176 or uintmax_t.
177
178 z Specifies that a following d, i, o, u, x, X, or n conversion
179 specifier applies to an argument with type pointer to size_t or
180 the corresponding signed integer type.
181
182 t Specifies that a following d, i, o, u, x, X, or n conversion
183 specifier applies to an argument with type pointer to ptrdiff_t
184 or the corresponding unsigned type.
185
186 L Specifies that a following a, A, e, E, f, F, g, or G conversion
187 specifier applies to an argument with type pointer to long dou‐
188 ble.
189
190 If a length modifier appears with any conversion specifier other than
191 as specified above, the behavior is undefined.
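
As an illustration of how each length modifier pairs with an argument type, the sketch below reads one value per modifier. The structure sizes and the helper parse_sizes() are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One member per length modifier; the comment gives the matching
 * conversion specification. */
struct sizes {
    signed char sc;   /* %hhd */
    short       sh;   /* %hd  */
    long        lg;   /* %ld  */
    long long   ll;   /* %lld */
    intmax_t    im;   /* %jd  */
    size_t      sz;   /* %zu  */
    ptrdiff_t   pd;   /* %td  */
    double      db;   /* %lf  */
    long double ld;   /* %Lf  */
};

/* Hypothetical helper: reads nine white-space-separated values.
 * Returns the number of assigned items (9 on success). */
int parse_sizes(const char *s, struct sizes *out)
{
    return sscanf(s, "%hhd %hd %ld %lld %jd %zu %td %lf %Lf",
                  &out->sc, &out->sh, &out->lg, &out->ll, &out->im,
                  &out->sz, &out->pd, &out->db, &out->ld);
}
```

Note that, unlike fprintf(), a plain %f here would expect a pointer to float, so %lf is required for double.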
192
193 The following conversion specifiers are valid:
194
195 d Matches an optionally signed decimal integer, whose format is
196 the same as expected for the subject sequence of strtol() with
197 the value 10 for the base argument. In the absence of a size
198 modifier, the application shall ensure that the corresponding
199 argument is a pointer to int.
200
201 i Matches an optionally signed integer, whose format is the same
202 as expected for the subject sequence of strtol() with 0 for the
203 base argument. In the absence of a size modifier, the applica‐
204 tion shall ensure that the corresponding argument is a pointer
205 to int.
206
207 o Matches an optionally signed octal integer, whose format is the
208 same as expected for the subject sequence of strtoul() with the
209 value 8 for the base argument. In the absence of a size modi‐
210 fier, the application shall ensure that the corresponding argu‐
211 ment is a pointer to unsigned.
212
213 u Matches an optionally signed decimal integer, whose format is
214 the same as expected for the subject sequence of strtoul() with
215 the value 10 for the base argument. In the absence of a size
216 modifier, the application shall ensure that the corresponding
217 argument is a pointer to unsigned.
218
219 x Matches an optionally signed hexadecimal integer, whose format
220 is the same as expected for the subject sequence of strtoul()
221 with the value 16 for the base argument. In the absence of a
222 size modifier, the application shall ensure that the corre‐
223 sponding argument is a pointer to unsigned.
224
225 a, e, f, g
226 Matches an optionally signed floating-point number, infinity,
227 or NaN, whose format is the same as expected for the subject
228 sequence of strtod(). In the absence of a size modifier, the
229 application shall ensure that the corresponding argument is a
230 pointer to float.
231
232 If the fprintf() family of functions generates character string
233 representations for infinity and NaN (a symbolic entity encoded
234 in floating-point format) to support IEEE Std 754‐1985, the
235 fscanf() family of functions shall recognize them as input.
236
237 s Matches a sequence of bytes that are not white-space charac‐
238 ters. If the 'm' assignment-allocation character is not speci‐
239 fied, the application shall ensure that the corresponding argu‐
240 ment is a pointer to the initial byte of an array of char,
241 signed char, or unsigned char large enough to accept the
242 sequence and a terminating null character code, which shall be
243 added automatically. Otherwise, the application shall ensure
244 that the corresponding argument is a pointer to a pointer to a
245 char.
246
247 If an l (ell) qualifier is present, the input is a sequence of
248 characters that begins in the initial shift state. Each charac‐
249 ter shall be converted to a wide character as if by a call to
250 the mbrtowc() function, with the conversion state described by
251 an mbstate_t object initialized to zero before the first char‐
252 acter is converted. If the 'm' assignment-allocation character
253 is not specified, the application shall ensure that the corre‐
254 sponding argument is a pointer to an array of wchar_t large
255 enough to accept the sequence and the terminating null wide
256 character, which shall be added automatically. Otherwise, the
257 application shall ensure that the corresponding argument is a
258 pointer to a pointer to a wchar_t.
259
260 [ Matches a non-empty sequence of bytes from a set of expected
261 bytes (the scanset). The normal skip over white-space charac‐
262 ters shall be suppressed in this case. If the 'm' assignment-
263 allocation character is not specified, the application shall
264 ensure that the corresponding argument is a pointer to the ini‐
265 tial byte of an array of char, signed char, or unsigned char
266 large enough to accept the sequence and a terminating null
267 byte, which shall be added automatically. Otherwise, the
268 application shall ensure that the corresponding argument is a
269 pointer to a pointer to a char.
270
271 If an l (ell) qualifier is present, the input is a sequence of
272 characters that begins in the initial shift state. Each charac‐
273 ter in the sequence shall be converted to a wide character as
274 if by a call to the mbrtowc() function, with the conversion
275 state described by an mbstate_t object initialized to zero
276 before the first character is converted. If the 'm' assign‐
277 ment-allocation character is not specified, the application
278 shall ensure that the corresponding argument is a pointer to an
279 array of wchar_t large enough to accept the sequence and the
terminating null wide character, which shall be added automatically.
Otherwise, the application shall ensure that the corresponding
argument is a pointer to a pointer to a wchar_t.
284
285 The conversion specification includes all subsequent bytes in
286 the format string up to and including the matching <right-
287 square-bracket> (']'). The bytes between the square brackets
288 (the scanlist) comprise the scanset, unless the byte after the
289 <left-square-bracket> is a <circumflex> ('^'), in which case
290 the scanset contains all bytes that do not appear in the scan‐
291 list between the <circumflex> and the <right-square-bracket>.
292 If the conversion specification begins with "[]" or "[^]", the
293 <right-square-bracket> is included in the scanlist and the next
294 <right-square-bracket> is the matching <right-square-bracket>
295 that ends the conversion specification; otherwise, the first
296 <right-square-bracket> is the one that ends the conversion
297 specification. If a '-' is in the scanlist and is not the first
298 character, nor the second where the first character is a '^',
299 nor the last character, the behavior is implementation-defined.
300
301 c Matches a sequence of bytes of the number specified by the
302 field width (1 if no field width is present in the conversion
303 specification). No null byte is added. The normal skip over
304 white-space characters shall be suppressed in this case. If the
305 'm' assignment-allocation character is not specified, the
306 application shall ensure that the corresponding argument is a
307 pointer to the initial byte of an array of char, signed char,
308 or unsigned char large enough to accept the sequence. Other‐
309 wise, the application shall ensure that the corresponding argu‐
310 ment is a pointer to a pointer to a char.
311
312 If an l (ell) qualifier is present, the input shall be a
313 sequence of characters that begins in the initial shift state.
314 Each character in the sequence is converted to a wide character
315 as if by a call to the mbrtowc() function, with the conversion
316 state described by an mbstate_t object initialized to zero
317 before the first character is converted. No null wide charac‐
318 ter is added. If the 'm' assignment-allocation character is not
319 specified, the application shall ensure that the corresponding
320 argument is a pointer to an array of wchar_t large enough to
321 accept the resulting sequence of wide characters. Otherwise,
322 the application shall ensure that the corresponding argument is
323 a pointer to a pointer to a wchar_t.
324
325 p Matches an implementation-defined set of sequences, which shall
326 be the same as the set of sequences that is produced by the %p
327 conversion specification of the corresponding fprintf() func‐
328 tions. The application shall ensure that the corresponding
329 argument is a pointer to a pointer to void. The interpretation
330 of the input item is implementation-defined. If the input item
331 is a value converted earlier during the same program execution,
332 the pointer that results shall compare equal to that value;
333 otherwise, the behavior of the %p conversion specification is
334 undefined.
335
336 n No input is consumed. The application shall ensure that the
337 corresponding argument is a pointer to the integer into which
338 shall be written the number of bytes read from the input so far
339 by this call to the fscanf() functions. Execution of a %n con‐
340 version specification shall not increment the assignment count
341 returned at the completion of execution of the function. No
342 argument shall be converted, but one shall be consumed. If the
343 conversion specification includes an assignment-suppressing
344 character or a field width, the behavior is undefined.
345
346 C Equivalent to lc.
347
348 S Equivalent to ls.
349
350 % Matches a single '%' character; no conversion or assignment
351 occurs. The complete conversion specification shall be %%.
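
The integer conversion specifiers in the list above differ mainly in the base they assume. A small sketch, with hypothetical helper names, that contrasts %i, %d, and %x:

```c
#include <stdio.h>

/* Hypothetical helpers; each returns the number of assigned items. */

/* %i: base is taken from the prefix, as strtol() does with base 0
 * ("0x" or "0X" means hexadecimal, a leading "0" means octal). */
int parse_auto_base(const char *s, int *out)
{
    return sscanf(s, "%i", out);
}

/* %d: always decimal, so a leading 0 is just a digit. */
int parse_decimal(const char *s, int *out)
{
    return sscanf(s, "%d", out);
}

/* %x: always hexadecimal; an optional "0x" prefix is accepted, as by
 * strtoul() with base 16. */
int parse_hex(const char *s, unsigned *out)
{
    return sscanf(s, "%x", out);
}
```

For the input "010", parse_auto_base() stores 8 while parse_decimal() stores 10.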
352
353 If a conversion specification is invalid, the behavior is undefined.
354
355 The conversion specifiers A, E, F, G, and X are also valid and shall be
356 equivalent to a, e, f, g, and x, respectively.
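
The [ and c conversion specifiers described above do not skip leading white space on their own. The sketch below, with hypothetical helper names, shows a negated scanset used to split "key=value" text and the effect of a leading space in the format before %c.

```c
#include <stdio.h>

/* Hypothetical helper: splits "key=value".
 *   ' '       skips optional leading white space (a scanset would not)
 *   %15[^=]   1 to 15 bytes that are not '='
 *   '='       ordinary character, must match
 *   %31[^\n]  1 to 31 bytes that are not <newline>
 * The scanlists avoid '-' between other characters, since the text
 * above makes that case implementation-defined.
 * Returns the number of assigned items (2 on success). */
int parse_key_value(const char *s, char key[16], char value[32])
{
    return sscanf(s, " %15[^=]=%31[^\n]", key, value);
}

/* %c alone stores the very next byte, white space included. */
int next_byte(const char *s, char *out)
{
    return sscanf(s, "%c", out);
}

/* A white-space directive before %c skips to the first other byte. */
int next_nonspace(const char *s, char *out)
{
    return sscanf(s, " %c", out);
}
```

Because a scanset must match at least one byte, parse_key_value("=x", ...) reports a matching failure and returns 0.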
357
358 If end-of-file is encountered during input, conversion shall be termi‐
359 nated. If end-of-file occurs before any bytes matching the current con‐
360 version specification (except for %n) have been read (other than lead‐
361 ing white-space characters, where permitted), execution of the current
362 conversion specification shall terminate with an input failure. Other‐
363 wise, unless execution of the current conversion specification is ter‐
364 minated with a matching failure, execution of the following conversion
365 specification (if any) shall be terminated with an input failure.
366
367 Reaching the end of the string in sscanf() shall be equivalent to
368 encountering end-of-file for fscanf().
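
A brief sketch of how end of input shows up in the return value of sscanf(); scan_two_ints() is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: tries to read two integers from s.
 * Return values and their meaning:
 *   EOF  the string ended before the first conversion completed
 *   0    the first item was not an integer (matching failure)
 *   1    one integer was read, then the string ended or the next
 *        item did not match
 *   2    both integers were read */
int scan_two_ints(const char *s, int *a, int *b)
{
    return sscanf(s, "%d %d", a, b);
}
```

An empty string and a string of only white space both return EOF, while "5" returns 1 because the input failure occurs only after the first conversion has completed.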
369
370 If conversion terminates on a conflicting input, the offending input is
371 left unread in the input. Any trailing white space (including <newline>
372 characters) shall be left unread unless matched by a conversion speci‐
373 fication. The success of literal matches and suppressed assignments is
374 only directly determinable via the %n conversion specification.
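
Since the return value counts only assignments, a trailing literal that fails to match goes unnoticed unless %n is used. One possible pattern, with the hypothetical helper match_id():

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 only if s is exactly "id=<int>;".
 * The return value of sscanf() is 1 as soon as %d is assigned, whether
 * or not the following ';' matched. %n is reached only when every
 * earlier directive succeeded, so end stays -1 otherwise. */
int match_id(const char *s, int *id)
{
    int end = -1;

    if (sscanf(s, "id=%d;%n", id, &end) != 1)
        return 0;
    /* end is the number of bytes consumed; require the whole string. */
    return end >= 0 && s[end] == '\0';
}
```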
375
376 The fscanf() and scanf() functions may mark the last data access time‐
377 stamp of the file associated with stream for update. The last data
378 access timestamp shall be marked for update by the first successful
379 execution of fgetc(), fgets(), fread(), getc(), getchar(), getdelim(),
380 getline(), gets(), fscanf(), or scanf() using stream that returns data
381 not supplied by a prior call to ungetc().

RETURN VALUE
384 Upon successful completion, these functions shall return the number of
385 successfully matched and assigned input items; this number can be zero
386 in the event of an early matching failure. If the input ends before the
387 first conversion (if any) has completed, and without a matching failure
388 having occurred, EOF shall be returned. If an error occurs before the
389 first conversion (if any) has completed, and without a matching failure
390 having occurred, EOF shall be returned and errno shall be set to indi‐
391 cate the error. If a read error occurs, the error indicator for the
392 stream shall be set.
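
A sketch of a read loop that separates the three outcomes described above (an assignment count, zero on a matching failure, EOF on end of input or error). The helper name sum_ints() and its return convention are assumptions for this example.

```c
#include <stdio.h>

/* Hypothetical helper: adds up white-space-separated integers read
 * from stream and stores the sum in *total.
 * Returns 0 if the input ended cleanly, -1 on a matching failure or a
 * read error. */
int sum_ints(FILE *stream, long *total)
{
    long value;
    int rc;

    *total = 0;
    while ((rc = fscanf(stream, "%ld", &value)) == 1)
        *total += value;

    if (rc == EOF && !ferror(stream))
        return 0;   /* input ended before the next conversion */
    return -1;      /* rc == 0 (matching failure) or a read error */
}
```

After a matching failure the offending bytes are still unread, so a caller that wants to continue must consume them first, for example with fgetc().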

ERRORS
395 For the conditions under which the fscanf() functions fail and may
396 fail, refer to fgetc() or fgetwc().
397
398 In addition, the fscanf() function shall fail if:
399
400 EILSEQ Input byte sequence does not form a valid character.
401
402 ENOMEM Insufficient storage space is available.
403
404 In addition, the fscanf() function may fail if:
405
406 EINVAL There are insufficient arguments.
407
408 The following sections are informative.

EXAMPLES
411 The call:
412
    int i, n; float x; char name[50];
    n = scanf("%d%f%s", &i, &x, name);
416
417 with the input line:
418
    25 54.32E-1 Hamster
421
422 assigns to n the value 3, to i the value 25, to x the value 5.432, and
423 name contains the string "Hamster".
424
425 The call:
426
    int i; float x; char name[50];
    (void) scanf("%2d%f%*d %[0123456789]", &i, &x, name);
430
431 with input:
432
    56789 0123 56a72
435
436 assigns 56 to i, 789.0 to x, skips 0123, and places the string "56\0"
437 in name. The next call to getchar() shall return the character 'a'.
438
439 Reading Data into an Array
440 The following call uses fscanf() to read three floating-point numbers
441 from standard input into the input array.
442
    float input[3];
    fscanf(stdin, "%f %f %f", input, input+1, input+2);

APPLICATION USAGE
447 If the application calling fscanf() has any objects of type wint_t or
448 wchar_t, it must also include the <wchar.h> header to have these
449 objects defined.
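
For instance, a caller that reads into a wchar_t array with the l length modifier needs <wchar.h> for the type. The helper name first_wide_word() is an assumption for this sketch.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: reads the first white-space-delimited word of s
 * and stores it as wide characters.
 * %15ls converts each input character as if by mbrtowc(); the field
 * width leaves room for the terminating null wide character in out.
 * Returns 1 if a word was stored. */
int first_wide_word(const char *s, wchar_t out[16])
{
    return sscanf(s, "%15ls", out);
}
```

In the POSIX locale this handles plain ASCII input; multi-byte input additionally depends on the LC_CTYPE category selected with setlocale().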
450
451 For functions that allocate memory as if by malloc(), the application
452 should release such memory when it is no longer required by a call to
453 free(). For fscanf(), this is memory allocated via use of the 'm'
454 assignment-allocation character.

RATIONALE
This function is aligned with the ISO/IEC 9899:1999 standard, and in
doing so a few "obvious" things were not included. Specifically, the
459 set of characters allowed in a scanset is limited to single-byte char‐
460 acters. In other similar places, multi-byte characters have been per‐
461 mitted, but for alignment with the ISO/IEC 9899:1999 standard, it has
462 not been done here. Applications needing this could use the correspond‐
463 ing wide-character functions to achieve the desired results.

FUTURE DIRECTIONS
466 None.

SEE ALSO
Section 2.5, Standard I/O Streams, fprintf(), getc(), setlocale(),
strtod(), strtol(), strtoul(), wcrtomb()
471
The Base Definitions volume of POSIX.1‐2017, Chapter 7, Locale,
<inttypes.h>, <langinfo.h>, <stdio.h>, <wchar.h>

COPYRIGHT
476 Portions of this text are reprinted and reproduced in electronic form
477 from IEEE Std 1003.1-2017, Standard for Information Technology -- Por‐
478 table Operating System Interface (POSIX), The Open Group Base Specifi‐
479 cations Issue 7, 2018 Edition, Copyright (C) 2018 by the Institute of
480 Electrical and Electronics Engineers, Inc and The Open Group. In the
481 event of any discrepancy between this version and the original IEEE and
482 The Open Group Standard, the original IEEE and The Open Group Standard
483 is the referee document. The original Standard can be obtained online
484 at http://www.opengroup.org/unix/online.html .
485
486 Any typographical or formatting errors that appear in this page are
487 most likely to have been introduced during the conversion of the source
files to man page format. To report such errors, see
https://www.kernel.org/doc/man-pages/reporting_bugs.html .
490
IEEE/The Open Group                  2017                          FSCANF(3P)