FWSCANF(3P)             POSIX Programmer's Manual            FWSCANF(3P)

PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.

NAME
fwscanf, swscanf, wscanf - convert formatted wide-character input

SYNOPSIS
#include <stdio.h>
#include <wchar.h>

int fwscanf(FILE *restrict stream, const wchar_t *restrict format, ...);
int swscanf(const wchar_t *restrict ws,
    const wchar_t *restrict format, ...);
int wscanf(const wchar_t *restrict format, ...);

DESCRIPTION
The functionality described on this reference page is aligned with the
ISO C standard. Any conflict between the requirements described here
and the ISO C standard is unintentional. This volume of POSIX.1-2017
defers to the ISO C standard.

The fwscanf() function shall read from the named input stream. The
wscanf() function shall read from the standard input stream stdin. The
swscanf() function shall read from the wide-character string ws. Each
function reads wide characters, interprets them according to a format,
and stores the results in its arguments. Each expects, as arguments, a
control wide-character string format described below, and a set of
pointer arguments indicating where the converted input should be
stored. The result is undefined if there are insufficient arguments for
the format. If the format is exhausted while arguments remain, the
excess arguments are evaluated but are otherwise ignored.
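
As a minimal sketch of the calling convention described above, the
following helper reads three items from a wide-character string with
swscanf(). The helper name parse_record() and the 32-element buffer are
assumptions made for illustration only; they are not part of this
interface.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: parse "<id> <ratio> <name>" from a wide string.
 * name must have room for at least 32 wide characters. */
int parse_record(const wchar_t *ws, int *id, float *ratio, wchar_t *name)
{
    /* One pointer argument per conversion, in format order.
     * %31ls stores at most 31 wide characters plus the terminator. */
    return swscanf(ws, L"%d %f %31ls", id, ratio, name);
}
```

Called with L"25 54.32E-1 Hamster", the helper is expected to return 3,
one for each item that was converted and assigned.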

Conversions can be applied to the nth argument after the format in the
argument list, rather than to the next unused argument. In this case,
the conversion specifier wide character % (see below) is replaced by
the sequence "%n$", where n is a decimal integer in the range
[1,{NL_ARGMAX}]. This feature provides for the definition of format
wide-character strings that select arguments in an order appropriate to
specific languages. In format wide-character strings containing the
"%n$" form of conversion specifications, it is unspecified whether
numbered arguments in the argument list can be referenced from the
format wide-character string more than once.

The format can contain either form of a conversion specification (that
is, % or "%n$"), but the two forms cannot normally be mixed within a
single format wide-character string. The only exception to this is that
%% or %* can be mixed with the "%n$" form. When numbered argument
specifications are used, specifying the Nth argument requires that all
the leading arguments, from the first to the (N-1)th, are pointers.
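
The numbered form can be sketched as follows. The helper name
parse_day_month() is hypothetical, and the example assumes an
implementation that supports the "%n$" form for wide-character input;
this form is a POSIX extension and is not required by the ISO C
standard.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: the input is "<day> <month>", but the caller
 * passes the month pointer first. */
int parse_day_month(const wchar_t *ws, int *month, int *day)
{
    /* %2$d stores into the 2nd pointer after the format (day),
     * %1$d stores into the 1st pointer after the format (month). */
    return swscanf(ws, L"%2$d %1$d", month, day);
}
```

Each numbered argument is referenced exactly once here, which avoids
the unspecified case of referencing the same argument more than once.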

The fwscanf() function in all its forms allows for detection of a
language-dependent radix character in the input string, encoded as a
wide-character value. The radix character is defined in the current
locale (category LC_NUMERIC). In the POSIX locale, or in a locale where
the radix character is not defined, the radix character shall default
to a <period> ('.').

The format is a wide-character string composed of zero or more
directives. Each directive is composed of one of the following: one or
more white-space wide characters (<space>, <tab>, <newline>,
<vertical-tab>, or <form-feed>); an ordinary wide character (neither
'%' nor a white-space character); or a conversion specification. It is
unspecified whether an encoding error occurs if the format string
contains wchar_t values that do not correspond to members of the
character set of the current locale and the specified semantics do not
require that value to be processed by wcrtomb().

Each conversion specification is introduced by the '%' or by the
character sequence "%n$", after which the following appear in sequence:

 *  An optional assignment-suppressing character '*'.

 *  An optional non-zero decimal integer that specifies the maximum
    field width.

 *  An optional assignment-allocation character 'm'.

 *  An optional length modifier that specifies the size of the
    receiving object.

 *  A conversion specifier wide character that specifies the type of
    conversion to be applied. The valid conversion specifiers are
    described below.
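
The optional parts listed above can be combined as in the following
sketch. The helper name parse_fields() is an assumption made for
illustration.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper showing a field width, assignment suppression,
 * and a length modifier in one format. */
int parse_fields(const wchar_t *ws, int *head, long *tail)
{
    /* %3d : at most three input wide characters feed the first int.
     * %*d : the following digits are converted but not stored.
     * %ld : the l length modifier selects a long receiver. */
    return swscanf(ws, L"%3d%*d %ld", head, tail);
}
```

For the input L"12345 42" the helper is expected to return 2, because
the suppressed conversion does not count as an assigned item.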

The fwscanf() functions shall execute each directive of the format in
turn. If a directive fails, as detailed below, the function shall
return. Failures are described as input failures (due to the
unavailability of input bytes) or matching failures (due to
inappropriate input).

A directive composed of one or more white-space wide characters is
executed by reading input until no more valid input can be read, or up
to the first wide character which is not a white-space wide character,
which remains unread.

A directive that is an ordinary wide character shall be executed as
follows. The next wide character is read from the input and compared
with the wide character that comprises the directive; if the comparison
shows that they are not equivalent, the directive shall fail, and the
differing and subsequent wide characters remain unread. Similarly, if
end-of-file, an encoding error, or a read error prevents a wide
character from being read, the directive shall fail.
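
The following sketch illustrates white-space and ordinary wide
character directives. The helper name parse_range() is hypothetical.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: parse "<lo>-<hi>" with optional white space
 * around the separator. */
int parse_range(const wchar_t *ws, int *lo, int *hi)
{
    /* Each space in the format matches any amount of input white
     * space, including none. The '-' must match the input exactly. */
    return swscanf(ws, L"%d - %d", lo, hi);
}
```

If the separator does not match, as in L"1+2", the literal directive
fails after the first conversion and only one item is assigned.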

A directive that is a conversion specification defines a set of
matching input sequences, as described below for each conversion wide
character. A conversion specification is executed in the following
steps.

Input white-space wide characters (as specified by iswspace()) shall be
skipped, unless the conversion specification includes a [, c, or n
conversion specifier.

An item shall be read from the input, unless the conversion
specification includes an n conversion specifier wide character. An
input item is defined as the longest sequence of input wide characters,
not exceeding any specified field width, which is an initial
subsequence of a matching sequence. The first wide character, if any,
after the input item shall remain unread. If the length of the input
item is zero, the execution of the conversion specification shall fail;
this condition is a matching failure, unless end-of-file, an encoding
error, or a read error prevented input from the stream, in which case
it is an input failure.

Except in the case of a % conversion specifier, the input item (or, in
the case of a %n conversion specification, the count of input wide
characters) shall be converted to a type appropriate to the conversion
wide character. If the input item is not a matching sequence, the
execution of the conversion specification shall fail; this condition is
a matching failure. Unless assignment suppression was indicated by a
'*', the result of the conversion shall be placed in the object pointed
to by the first argument following the format argument that has not
already received a conversion result if the conversion specification is
introduced by %, or in the nth argument if introduced by the
wide-character sequence "%n$". If this object does not have an
appropriate type, or if the result of the conversion cannot be
represented in the space provided, the behavior is undefined.

The %c, %s, and %[ conversion specifiers shall accept an optional
assignment-allocation character 'm', which shall cause a memory buffer
to be allocated to hold the wide-character string converted including a
terminating null wide character. In such a case, the argument
corresponding to the conversion specifier should be a reference to a
pointer value that will receive a pointer to the allocated buffer. The
system shall allocate a buffer as if malloc() had been called. The
application shall be responsible for freeing the memory after usage. If
there is insufficient memory to allocate a buffer, the function shall
set errno to [ENOMEM] and a conversion error shall result. If the
function returns EOF, any memory successfully allocated for parameters
using assignment-allocation character 'm' by this call shall be freed
before the function returns.
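
A minimal sketch of the 'm' assignment-allocation character follows.
The helper name first_word() is hypothetical, and the example assumes
an implementation that supports 'm' for wide-character input, which the
ISO C standard does not require.

```c
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: return an allocated copy of the first
 * white-space-delimited word of ws, or NULL if none was converted.
 * The caller releases the result with free(). */
wchar_t *first_word(const wchar_t *ws)
{
    wchar_t *word = NULL;

    /* %mls: the function allocates the buffer and stores its address
     * through the pointer-to-pointer argument. */
    if (swscanf(ws, L"%mls", &word) != 1)
        return NULL;
    return word;
}
```

The returned buffer needs no size estimate from the caller, at the cost
of one free() call per successful conversion.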

The length modifiers and their meanings are:

hh      Specifies that a following d, i, o, u, x, X, or n conversion
        specifier applies to an argument with type pointer to signed
        char or unsigned char.

h       Specifies that a following d, i, o, u, x, X, or n conversion
        specifier applies to an argument with type pointer to short or
        unsigned short.

l (ell) Specifies that a following d, i, o, u, x, X, or n conversion
        specifier applies to an argument with type pointer to long or
        unsigned long; that a following a, A, e, E, f, F, g, or G
        conversion specifier applies to an argument with type pointer
        to double; or that a following c, s, or [ conversion specifier
        applies to an argument with type pointer to wchar_t. If the
        'm' assignment-allocation character is specified, the
        conversion applies to an argument with the type pointer to a
        pointer to wchar_t.

ll (ell-ell)
        Specifies that a following d, i, o, u, x, X, or n conversion
        specifier applies to an argument with type pointer to long long
        or unsigned long long.

j       Specifies that a following d, i, o, u, x, X, or n conversion
        specifier applies to an argument with type pointer to intmax_t
        or uintmax_t.

z       Specifies that a following d, i, o, u, x, X, or n conversion
        specifier applies to an argument with type pointer to size_t or
        the corresponding signed integer type.

t       Specifies that a following d, i, o, u, x, X, or n conversion
        specifier applies to an argument with type pointer to ptrdiff_t
        or the corresponding unsigned type.

L       Specifies that a following a, A, e, E, f, F, g, or G conversion
        specifier applies to an argument with type pointer to long
        double.

If a length modifier appears with any conversion specifier other than
as specified above, the behavior is undefined.
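
The pairing of length modifiers with receiver types can be sketched as
follows. The structure sizes and the helper parse_sizes() are
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical receiver: one member per length modifier shown. */
struct sizes {
    signed char tiny;   /* filled by %hhd */
    short       small;  /* filled by %hd  */
    long long   big;    /* filled by %lld */
    size_t      count;  /* filled by %zu  */
    double      ratio;  /* filled by %lf  */
};

int parse_sizes(const wchar_t *ws, struct sizes *out)
{
    /* Each modifier must agree with the pointed-to type; a mismatch
     * results in undefined behavior. */
    return swscanf(ws, L"%hhd %hd %lld %zu %lf",
                   &out->tiny, &out->small, &out->big,
                   &out->count, &out->ratio);
}
```

Note that %lf is needed for a double receiver; a plain %f expects a
pointer to float.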

The following conversion specifier wide characters are valid:

d       Matches an optionally signed decimal integer, whose format is
        the same as expected for the subject sequence of wcstol() with
        the value 10 for the base argument. In the absence of a size
        modifier, the application shall ensure that the corresponding
        argument is a pointer to int.

i       Matches an optionally signed integer, whose format is the same
        as expected for the subject sequence of wcstol() with 0 for the
        base argument. In the absence of a size modifier, the
        application shall ensure that the corresponding argument is a
        pointer to int.

o       Matches an optionally signed octal integer, whose format is the
        same as expected for the subject sequence of wcstoul() with the
        value 8 for the base argument. In the absence of a size
        modifier, the application shall ensure that the corresponding
        argument is a pointer to unsigned.

u       Matches an optionally signed decimal integer, whose format is
        the same as expected for the subject sequence of wcstoul() with
        the value 10 for the base argument. In the absence of a size
        modifier, the application shall ensure that the corresponding
        argument is a pointer to unsigned.

x       Matches an optionally signed hexadecimal integer, whose format
        is the same as expected for the subject sequence of wcstoul()
        with the value 16 for the base argument. In the absence of a
        size modifier, the application shall ensure that the
        corresponding argument is a pointer to unsigned.

a, e, f, g
        Matches an optionally signed floating-point number, infinity,
        or NaN whose format is the same as expected for the subject
        sequence of wcstod(). In the absence of a size modifier, the
        application shall ensure that the corresponding argument is a
        pointer to float.

        If the fwprintf() family of functions generates character
        string representations for infinity and NaN (a symbolic entity
        encoded in floating-point format) to support IEEE Std 754-1985,
        the fwscanf() family of functions shall recognize them as
        input.

s       Matches a sequence of non-white-space wide characters. If no l
        (ell) qualifier is present, characters from the input field
        shall be converted as if by repeated calls to the wcrtomb()
        function, with the conversion state described by an mbstate_t
        object initialized to zero before the first wide character is
        converted. If the 'm' assignment-allocation character is not
        specified, the application shall ensure that the corresponding
        argument is a pointer to a character array large enough to
        accept the sequence and the terminating null character, which
        shall be added automatically. Otherwise, the application shall
        ensure that the corresponding argument is a pointer to a
        pointer to a char.

        If the l (ell) qualifier is present and the 'm'
        assignment-allocation character is not specified, the
        application shall ensure that the corresponding argument is a
        pointer to an array of wchar_t large enough to accept the
        sequence and the terminating null wide character, which shall
        be added automatically. If the l (ell) qualifier is present and
        the 'm' assignment-allocation character is present, the
        application shall ensure that the corresponding argument is a
        pointer to a pointer to a wchar_t.

[       Matches a non-empty sequence of wide characters from a set of
        expected wide characters (the scanset). If no l (ell)
        qualifier is present, wide characters from the input field
        shall be converted as if by repeated calls to the wcrtomb()
        function, with the conversion state described by an mbstate_t
        object initialized to zero before the first wide character is
        converted. If the 'm' assignment-allocation character is not
        specified, the application shall ensure that the corresponding
        argument is a pointer to a character array large enough to
        accept the sequence and the terminating null character, which
        shall be added automatically. Otherwise, the application shall
        ensure that the corresponding argument is a pointer to a
        pointer to a char.

        If an l (ell) qualifier is present and the 'm'
        assignment-allocation character is not specified, the
        application shall ensure that the corresponding argument is a
        pointer to an array of wchar_t large enough to accept the
        sequence and the terminating null wide character. If an l (ell)
        qualifier is present and the 'm' assignment-allocation
        character is specified, the application shall ensure that the
        corresponding argument is a pointer to a pointer to a wchar_t.

        The conversion specification includes all subsequent wide
        characters in the format string up to and including the
        matching <right-square-bracket> (']'). The wide characters
        between the square brackets (the scanlist) comprise the
        scanset, unless the wide character after the
        <left-square-bracket> is a <circumflex> ('^'), in which case
        the scanset contains all wide characters that do not appear in
        the scanlist between the <circumflex> and the
        <right-square-bracket>. If the conversion specification begins
        with "[]" or "[^]", the <right-square-bracket> is included in
        the scanlist and the next <right-square-bracket> is the
        matching <right-square-bracket> that ends the conversion
        specification; otherwise, the first <right-square-bracket> is
        the one that ends the conversion specification. If a '-' is in
        the scanlist and is not the first wide character, nor the
        second where the first wide character is a '^', nor the last
        wide character, the behavior is implementation-defined.

c       Matches a sequence of wide characters of exactly the number
        specified by the field width (1 if no field width is present in
        the conversion specification).

        If no l (ell) length modifier is present, characters from the
        input field shall be converted as if by repeated calls to the
        wcrtomb() function, with the conversion state described by an
        mbstate_t object initialized to zero before the first wide
        character is converted. No null character is added. If the 'm'
        assignment-allocation character is not specified, the
        application shall ensure that the corresponding argument is a
        pointer to the initial element of a character array large
        enough to accept the sequence. Otherwise, the application shall
        ensure that the corresponding argument is a pointer to a
        pointer to a char.

        If an l (ell) length modifier is present and the 'm'
        assignment-allocation character is not specified, the
        application shall ensure that the corresponding argument is a
        pointer to the initial element of an array of wchar_t large
        enough to accept the sequence. No null wide character is added.
        If an l (ell) length modifier is present and the 'm'
        assignment-allocation character is specified, the application
        shall ensure that the corresponding argument is a pointer to a
        pointer to a wchar_t.

p       Matches an implementation-defined set of sequences, which shall
        be the same as the set of sequences that is produced by the %p
        conversion specification of the corresponding fwprintf()
        functions. The application shall ensure that the corresponding
        argument is a pointer to a pointer to void. The interpretation
        of the input item is implementation-defined. If the input item
        is a value converted earlier during the same program execution,
        the pointer that results shall compare equal to that value;
        otherwise, the behavior of the %p conversion is undefined.

n       No input is consumed. The application shall ensure that the
        corresponding argument is a pointer to the integer into which
        is to be written the number of wide characters read from the
        input so far by this call to the fwscanf() functions. Execution
        of a %n conversion specification shall not increment the
        assignment count returned at the completion of execution of the
        function. No argument shall be converted, but one shall be
        consumed. If the conversion specification includes an
        assignment-suppressing wide character or a field width, the
        behavior is undefined.

C       Equivalent to lc.

S       Equivalent to ls.

%       Matches a single '%' wide character; no conversion or
        assignment shall occur. The complete conversion specification
        shall be %%.
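
The scanset conversion described in the list above can be sketched as
follows. The helper name parse_pair() and the 16-element buffers are
assumptions made for illustration. The scanlists avoid the '-' wide
character, whose meaning inside a scanlist is implementation-defined.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: split "key=value" into two wide strings.
 * Each buffer must hold at least 16 wide characters. */
int parse_pair(const wchar_t *ws, wchar_t *key, wchar_t *value)
{
    /* %15l[^=]  : up to 15 wide characters other than '='.
     * =         : the literal separator.
     * %15l[^\n] : up to 15 wide characters other than <newline>. */
    return swscanf(ws, L"%15l[^=]=%15l[^\n]", key, value);
}
```

Because a scanset must match a non-empty sequence, an input with an
empty key such as L"=x" produces a matching failure and a return value
of zero.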

If a conversion specification is invalid, the behavior is undefined.

The conversion specifiers A, E, F, G, and X are also valid and shall be
equivalent to, respectively, a, e, f, g, and x.
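
The difference between the i conversion and a fixed-base conversion
such as x (or its equivalent X) can be sketched as follows. Both helper
names are hypothetical.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: the base is inferred from the input prefix,
 * as wcstol() does when its base argument is 0. */
int parse_auto(const wchar_t *ws, int *value)
{
    return swscanf(ws, L"%i", value);
}

/* Hypothetical helper: always hexadecimal. %X behaves like %x, and an
 * optional 0x prefix is accepted as for wcstoul() with base 16. */
int parse_hex(const wchar_t *ws, unsigned *value)
{
    return swscanf(ws, L"%X", value);
}
```

With parse_auto(), a leading 0 selects octal, so L"017" converts to 15
rather than 17; the d conversion is the safer choice when only decimal
input is expected.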

If end-of-file is encountered during input, conversion is terminated.
If end-of-file occurs before any wide characters matching the current
conversion specification (except for %n) have been read (other than
leading white-space, where permitted), execution of the current
conversion specification shall terminate with an input failure.
Otherwise, unless execution of the current conversion specification is
terminated with a matching failure, execution of the following
conversion specification (if any) shall be terminated with an input
failure.

Reaching the end of the string in swscanf() shall be equivalent to
encountering end-of-file for fwscanf().

If conversion terminates on a conflicting input, the offending input
shall be left unread in the input. Any trailing white space (including
<newline>) shall be left unread unless matched by a conversion
specification. The success of literal matches and suppressed
assignments is only directly determinable via the %n conversion
specification.
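
One common way to apply the %n conversion specification for this
purpose is sketched below: the count it stores shows how far the call
progressed, so trailing unconverted input can be detected. The helper
name parse_whole_int() is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: succeed only if ws holds one integer and
 * nothing else apart from surrounding white space. */
bool parse_whole_int(const wchar_t *ws, int *value)
{
    int end = -1;   /* remains -1 if the %n directive is never reached */

    /* The space before %n consumes any trailing white space. */
    if (swscanf(ws, L"%d %n", value, &end) != 1 || end < 0)
        return false;

    /* end is the number of wide characters read so far. */
    return ws[end] == L'\0';
}
```

The return value is compared with 1, not 2, because %n does not
increment the assignment count.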

The fwscanf() and wscanf() functions may mark the last data access
timestamp of the file associated with stream for update. The last data
access timestamp shall be marked for update by the first successful
execution of fgetwc(), fgetws(), fwscanf(), getwc(), getwchar(),
vfwscanf(), vwscanf(), or wscanf() using stream that returns data not
supplied by a prior call to ungetwc().
393
395 Upon successful completion, these functions shall return the number of
396 successfully matched and assigned input items; this number can be zero
397 in the event of an early matching failure. If the input ends before the
398 first conversion (if any) has completed, and without a matching failure
399 having occurred, EOF shall be returned. If an error occurs before the
400 first conversion (if any) has completed, and without a matching failure
401 having occurred, EOF shall be returned and errno shall be set to indi‐
402 cate the error. If a read error occurs, the error indicator for the
403 stream shall be set.
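
A caller can distinguish these outcomes along the following lines. This
is a sketch only; the enumeration scan_status and the helper
scan_two_ints() are assumptions made for illustration.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical classification of the return value. */
enum scan_status { SCAN_OK, SCAN_MISMATCH, SCAN_END };

enum scan_status scan_two_ints(const wchar_t *ws, int *a, int *b)
{
    int n = swscanf(ws, L"%d %d", a, b);

    if (n == EOF)
        return SCAN_END;        /* input ended before the first conversion */
    if (n < 2)
        return SCAN_MISMATCH;   /* fewer items assigned than requested */
    return SCAN_OK;
}
```

Comparing the result with the number of expected assignments, rather
than testing only for EOF, also catches input such as L"1 x", where the
first conversion succeeds and the second does not.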

ERRORS
For the conditions under which the fwscanf() functions shall fail and
may fail, refer to fgetwc().

In addition, the fwscanf() function shall fail if:

EILSEQ  Input byte sequence does not form a valid character.

ENOMEM  Insufficient storage space is available.

In addition, the fwscanf() function may fail if:

EINVAL  There are insufficient arguments.

The following sections are informative.

EXAMPLES
The call:

    int i, n; float x; char name[50];
    n = wscanf(L"%d%f%s", &i, &x, name);

with the input line:

    25 54.32E-1 Hamster

assigns to n the value 3, to i the value 25, to x the value 5.432, and
name contains the string "Hamster".

The call:

    int i; float x; char name[50];
    (void) wscanf(L"%2d%f%*d %[0123456789]", &i, &x, name);

with input:

    56789 0123 56a72

assigns 56 to i, 789.0 to x, skips 0123, and places the string "56\0"
in name. Because stdin is wide-oriented after the call to wscanf(), the
next call to getwchar() shall return the wide character L'a'.
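
The second call above can be reproduced without reading stdin by using
swscanf(), as in the following sketch. The helper name
demo_second_example(), the added field width of 49, and the trailing %n
are assumptions made for illustration; they are not part of the
original example.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: same conversions as the second example, with a
 * bounded scanset and a %n that reports where scanning stopped. */
int demo_second_example(const wchar_t *ws, int *i, float *x,
                        char name[50], int *consumed)
{
    /* Without the l modifier, %[ stores multi-byte characters in a
     * char array. The width 49 leaves room for the terminator. */
    return swscanf(ws, L"%2d%f%*d %49[0123456789]%n",
                   i, x, name, consumed);
}
```

For the input L"56789 0123 56a72" the value stored through consumed
indexes the first unread wide character, which is L'a'.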

APPLICATION USAGE
In format strings containing the '%' form of conversion specifications,
each argument in the argument list is used exactly once.

For functions that allocate memory as if by malloc(), the application
should release such memory when it is no longer required by a call to
free(). For fwscanf(), this is memory allocated via use of the 'm'
assignment-allocation character.

RATIONALE
None.

FUTURE DIRECTIONS
None.

SEE ALSO
Section 2.5, Standard I/O Streams, getwc(), fwprintf(), setlocale(),
wcstod(), wcstol(), wcstoul(), wcrtomb()

The Base Definitions volume of POSIX.1-2017, Chapter 7, Locale,
<inttypes.h>, <stdio.h>, <wchar.h>

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1-2017, Standard for Information Technology --
Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 7, 2018 Edition, Copyright (C) 2018 by the
Institute of Electrical and Electronics Engineers, Inc and The Open
Group. In the event of any discrepancy between this version and the
original IEEE and The Open Group Standard, the original IEEE and The
Open Group Standard is the referee document. The original Standard can
be obtained online at http://www.opengroup.org/unix/online.html .

Any typographical or formatting errors that appear in this page are
most likely to have been introduced during the conversion of the source
files to man page format. To report such errors, see
https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                 2017                     FWSCANF(3P)