FWSCANF(3P)              POSIX Programmer's Manual             FWSCANF(3P)

PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.

NAME
fwscanf, swscanf, wscanf - convert formatted wide-character input

SYNOPSIS
#include <stdio.h>
#include <wchar.h>

int fwscanf(FILE *restrict stream, const wchar_t *restrict format, ...);
int swscanf(const wchar_t *restrict ws,
       const wchar_t *restrict format, ...);
int wscanf(const wchar_t *restrict format, ...);

DESCRIPTION
The fwscanf() function shall read from the named input stream. The
wscanf() function shall read from the standard input stream stdin. The
swscanf() function shall read from the wide-character string ws. Each
function reads wide characters, interprets them according to a format,
and stores the results in its arguments. Each expects, as arguments, a
control wide-character string format described below, and a set of
pointer arguments indicating where the converted input should be
stored. The result is undefined if there are insufficient arguments for
the format. If the format is exhausted while arguments remain, the
excess arguments are evaluated but are otherwise ignored.
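
As a minimal sketch of the behavior described above, the helper below
uses swscanf() to read two integers from a wide-character string and
store them through pointer arguments. The name parse_point() and the
"X Y" input layout are assumptions made for illustration only.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: parse "X Y" from a wide-character string.
 * Returns the number of items assigned (2 on full success). */
int parse_point(const wchar_t *ws, int *x, int *y)
{
    /* Each %d skips leading white space, then reads a decimal integer. */
    return swscanf(ws, L"%d %d", x, y);
}
```

A caller would compare the return value against 2 before using *x and
*y, since a smaller value means that not every item was assigned.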

Conversions can be applied to the nth argument after the format in the
argument list, rather than to the next unused argument. In this case,
the conversion specifier wide character % (see below) is replaced by
the sequence "%n$", where n is a decimal integer in the range
[1,{NL_ARGMAX}]. This feature provides for the definition of format
wide-character strings that select arguments in an order appropriate to
specific languages. In format wide-character strings containing the
"%n$" form of conversion specifications, it is unspecified whether
numbered arguments in the argument list can be referenced from the
format wide-character string more than once.

The format can contain either form of a conversion specification (that
is, % or "%n$"), but the two forms cannot normally be mixed within a
single format wide-character string. The only exception to this is that
%% or %* can be mixed with the "%n$" form. When numbered argument
specifications are used, specifying the Nth argument requires that all
the leading arguments, from the first to the (N-1)th, are pointers.
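
The numbered form can be sketched as follows, assuming an
implementation that supports the "%n$" extension for the wide-character
functions. The helper parse_day_month() is hypothetical: the input
carries the day first, while the caller passes the month pointer first.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: the input is "DAY MONTH", but the arguments
 * are passed as (month, day). Numbered specifications select the
 * receiving argument explicitly. */
int parse_day_month(const wchar_t *ws, int *month, int *day)
{
    /* %2$d stores the first number in the 2nd argument (day);
     * %1$d stores the second number in the 1st argument (month). */
    return swscanf(ws, L"%2$d %1$d", month, day);
}
```

Every specification in this format uses the numbered form, and each
argument is referenced only once, in line with the restrictions above.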

The fwscanf() function in all its forms allows for detection of a
language-dependent radix character in the input string, encoded as a
wide-character value. The radix character is defined in the program's
locale (category LC_NUMERIC). In the POSIX locale, or in a locale where
the radix character is not defined, the radix character shall default
to a period ('.').
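
A small sketch of the locale dependence, using hypothetical helpers:
parse_decimal() converts with whatever LC_NUMERIC locale is current,
and parse_decimal_posix() first selects the "C" locale so that the
radix character is a period. Other locale names are not used here
because their availability depends on the system.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: parse one decimal number using the radix
 * character of the current LC_NUMERIC locale.
 * Returns 1 on success, 0 or EOF otherwise. */
int parse_decimal(const wchar_t *ws, double *out)
{
    return swscanf(ws, L"%lf", out);
}

/* Hypothetical helper: select the "C" locale for LC_NUMERIC, where
 * the radix character is '.', then parse. */
int parse_decimal_posix(const wchar_t *ws, double *out)
{
    setlocale(LC_NUMERIC, "C");
    return parse_decimal(ws, out);
}
```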

The format is a wide-character string composed of zero or more
directives. Each directive is composed of one of the following: one or
more white-space wide characters (<space>s, <tab>s, <newline>s,
<vertical-tab>s, or <form-feed>s); an ordinary wide character (neither
'%' nor a white-space character); or a conversion specification. Each
conversion specification is introduced by a '%' or the sequence "%n$",
after which the following appear in sequence:

* An optional assignment-suppressing character '*'.

* An optional non-zero decimal integer that specifies the maximum
  field width.

* An optional length modifier that specifies the size of the receiving
  object.

* A conversion specifier wide character that specifies the type of
  conversion to be applied. The valid conversion specifiers are
  described below.
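
The optional parts listed above can be combined in one format. The
sketch below is illustrative only; parse_fields() and its input layout
are assumptions, not part of the interface.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper showing the optional parts of a conversion
 * specification:
 *   %*d  assignment suppressed: an integer is read and discarded
 *   %3d  field width 3: at most three wide characters are consumed
 *   %ld  length modifier l: the receiving object is a long */
int parse_fields(const wchar_t *ws, int *a, long *b)
{
    return swscanf(ws, L"%*d %3d %ld", a, b);
}
```

With the input L"7 12345 99", the 7 is discarded, the field width stops
the second conversion after "123", and the third conversion continues
with the unread "45". The suppressed conversion does not count toward
the return value.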

The fwscanf() functions shall execute each directive of the format in
turn. If a directive fails, as detailed below, the function shall
return. Failures are described as input failures (due to the
unavailability of input bytes) or matching failures (due to
inappropriate input).

A directive composed of one or more white-space wide characters is
executed by reading input until no more valid input can be read, or up
to the first wide character which is not a white-space wide character,
which remains unread.

A directive that is an ordinary wide character shall be executed as
follows. The next wide character is read from the input and compared
with the wide character that comprises the directive; if the comparison
shows that they are not equivalent, the directive shall fail, and the
differing and subsequent wide characters remain unread. Similarly, if
end-of-file, an encoding error, or a read error prevents a wide
character from being read, the directive shall fail.
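
The two kinds of non-conversion directive can be sketched together. In
the hypothetical helper below, the format contains a white-space
directive followed by the ordinary wide character ','.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: parse "A,B" with optional white space around
 * the comma. Returns the number of integers assigned. */
int parse_pair(const wchar_t *ws, int *a, int *b)
{
    /* " ," is a white-space directive (matches zero or more white-space
     * wide characters) followed by the literal ','. The second %d skips
     * any white space after the comma on its own. */
    return swscanf(ws, L"%d ,%d", a, b);
}
```

If the separator is some other wide character, the literal directive
fails, the function returns after the first assignment, and the second
object is left untouched.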

A directive that is a conversion specification defines a set of
matching input sequences, as described below for each conversion wide
character. A conversion specification is executed in the following
steps.

Input white-space wide characters (as specified by iswspace()) shall be
skipped, unless the conversion specification includes a [, c, or n
conversion specifier.

An item shall be read from the input, unless the conversion
specification includes an n conversion specifier wide character. An
input item is defined as the longest sequence of input wide characters,
not exceeding any specified field width, which is an initial
subsequence of a matching sequence. The first wide character, if any,
after the input item shall remain unread. If the length of the input
item is zero, the execution of the conversion specification shall fail;
this condition is a matching failure, unless end-of-file, an encoding
error, or a read error prevented input from the stream, in which case
it is an input failure.

Except in the case of a % conversion specifier, the input item (or, in
the case of a %n conversion specification, the count of input wide
characters) shall be converted to a type appropriate to the conversion
wide character. If the input item is not a matching sequence, the
execution of the conversion specification shall fail; this condition is
a matching failure. Unless assignment suppression was indicated by a
'*', the result of the conversion shall be placed in the object pointed
to by the first argument following the format argument that has not
already received a conversion result if the conversion specification is
introduced by %, or in the nth argument if introduced by the
wide-character sequence "%n$". If this object does not have an
appropriate type, or if the result of the conversion cannot be
represented in the space provided, the behavior is undefined.
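
The white-space rule in the first step is easy to overlook for the c
conversion. The sketch below, using two hypothetical helpers, contrasts
"%lc", which does not skip leading white space, with " %lc", where an
explicit white-space directive does the skipping.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: return the first wide character of ws, white
 * space included, or WEOF if nothing could be read. */
wint_t first_raw(const wchar_t *ws)
{
    wchar_t c;
    return swscanf(ws, L"%lc", &c) == 1 ? (wint_t)c : WEOF;
}

/* Hypothetical helper: return the first non-white-space wide character
 * of ws, or WEOF if there is none. */
wint_t first_visible(const wchar_t *ws)
{
    wchar_t c;
    return swscanf(ws, L" %lc", &c) == 1 ? (wint_t)c : WEOF;
}
```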

The length modifiers and their meanings are:

hh     Specifies that a following d, i, o, u, x, X, or n conversion
       specifier applies to an argument with type pointer to signed
       char or unsigned char.

h      Specifies that a following d, i, o, u, x, X, or n conversion
       specifier applies to an argument with type pointer to short or
       unsigned short.

l (ell)
       Specifies that a following d, i, o, u, x, X, or n conversion
       specifier applies to an argument with type pointer to long or
       unsigned long; that a following a, A, e, E, f, F, g, or G
       conversion specifier applies to an argument with type pointer to
       double; or that a following c, s, or [ conversion specifier
       applies to an argument with type pointer to wchar_t.

ll (ell-ell)
       Specifies that a following d, i, o, u, x, X, or n conversion
       specifier applies to an argument with type pointer to long long
       or unsigned long long.

j      Specifies that a following d, i, o, u, x, X, or n conversion
       specifier applies to an argument with type pointer to intmax_t
       or uintmax_t.

z      Specifies that a following d, i, o, u, x, X, or n conversion
       specifier applies to an argument with type pointer to size_t or
       the corresponding signed integer type.

t      Specifies that a following d, i, o, u, x, X, or n conversion
       specifier applies to an argument with type pointer to ptrdiff_t
       or the corresponding unsigned type.

L      Specifies that a following a, A, e, E, f, F, g, or G conversion
       specifier applies to an argument with type pointer to long
       double.

If a length modifier appears with any conversion specifier other than
as specified above, the behavior is undefined.
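
A sketch of how the length modifiers pair with receiving objects. The
structure and the function parse_sizes() are hypothetical; the point is
only that each modifier must match the pointed-to type.

```c
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical record filled by parse_sizes(). */
struct sizes {
    unsigned char small;   /* read with %hhu */
    short         medium;  /* read with %hd  */
    long long     large;   /* read with %lld */
    size_t        count;   /* read with %zu  */
    double        ratio;   /* read with %lf  */
};

/* Returns the number of members assigned (5 on full success). */
int parse_sizes(const wchar_t *ws, struct sizes *s)
{
    return swscanf(ws, L"%hhu %hd %lld %zu %lf",
                   &s->small, &s->medium, &s->large,
                   &s->count, &s->ratio);
}
```

Passing, for example, a pointer to double for a plain %f would fall
under the undefined behavior described above, because %f without a
modifier expects a pointer to float.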

The following conversion specifier wide characters are valid:

d      Matches an optionally signed decimal integer, whose format is
       the same as expected for the subject sequence of wcstol() with
       the value 10 for the base argument. In the absence of a size
       modifier, the application shall ensure that the corresponding
       argument is a pointer to int.

i      Matches an optionally signed integer, whose format is the same
       as expected for the subject sequence of wcstol() with 0 for the
       base argument. In the absence of a size modifier, the
       application shall ensure that the corresponding argument is a
       pointer to int.

o      Matches an optionally signed octal integer, whose format is the
       same as expected for the subject sequence of wcstoul() with the
       value 8 for the base argument. In the absence of a size
       modifier, the application shall ensure that the corresponding
       argument is a pointer to unsigned.

u      Matches an optionally signed decimal integer, whose format is
       the same as expected for the subject sequence of wcstoul() with
       the value 10 for the base argument. In the absence of a size
       modifier, the application shall ensure that the corresponding
       argument is a pointer to unsigned.

x      Matches an optionally signed hexadecimal integer, whose format
       is the same as expected for the subject sequence of wcstoul()
       with the value 16 for the base argument. In the absence of a
       size modifier, the application shall ensure that the
       corresponding argument is a pointer to unsigned.

a, e, f, g
       Matches an optionally signed floating-point number, infinity, or
       NaN whose format is the same as expected for the subject
       sequence of wcstod(). In the absence of a size modifier, the
       application shall ensure that the corresponding argument is a
       pointer to float.

       If the fwprintf() family of functions generates character string
       representations for infinity and NaN (a symbolic entity encoded
       in floating-point format) to support IEEE Std 754-1985, the
       fwscanf() family of functions shall recognize them as input.

s      Matches a sequence of non white-space wide characters. If no l
       (ell) qualifier is present, characters from the input field
       shall be converted as if by repeated calls to the wcrtomb()
       function, with the conversion state described by an mbstate_t
       object initialized to zero before the first wide character is
       converted. The application shall ensure that the corresponding
       argument is a pointer to a character array large enough to
       accept the sequence and the terminating null character, which
       shall be added automatically.

       Otherwise, the application shall ensure that the corresponding
       argument is a pointer to an array of wchar_t large enough to
       accept the sequence and the terminating null wide character,
       which shall be added automatically.

[      Matches a non-empty sequence of wide characters from a set of
       expected wide characters (the scanset). If no l (ell) qualifier
       is present, wide characters from the input field shall be
       converted as if by repeated calls to the wcrtomb() function,
       with the conversion state described by an mbstate_t object
       initialized to zero before the first wide character is
       converted. The application shall ensure that the corresponding
       argument is a pointer to a character array large enough to
       accept the sequence and the terminating null character, which
       shall be added automatically.

       If an l (ell) qualifier is present, the application shall ensure
       that the corresponding argument is a pointer to an array of
       wchar_t large enough to accept the sequence and the terminating
       null wide character, which shall be added automatically.

       The conversion specification includes all subsequent wide
       characters in the format string up to and including the matching
       right square bracket (']'). The wide characters between the
       square brackets (the scanlist) comprise the scanset, unless the
       wide character after the left square bracket is a circumflex
       ('^'), in which case the scanset contains all wide characters
       that do not appear in the scanlist between the circumflex and
       the right square bracket. If the conversion specification begins
       with "[]" or "[^]", the right square bracket is included in the
       scanlist and the next right square bracket is the matching right
       square bracket that ends the conversion specification;
       otherwise, the first right square bracket is the one that ends
       the conversion specification. If a '-' is in the scanlist and is
       not the first wide character, nor the second where the first
       wide character is a '^', nor the last wide character, the
       behavior is implementation-defined.

c      Matches a sequence of wide characters of exactly the number
       specified by the field width (1 if no field width is present in
       the conversion specification).

       If no l (ell) length modifier is present, characters from the
       input field shall be converted as if by repeated calls to the
       wcrtomb() function, with the conversion state described by an
       mbstate_t object initialized to zero before the first wide
       character is converted. The corresponding argument shall be a
       pointer to the initial element of a character array large enough
       to accept the sequence. No null character is added.

       If an l (ell) length modifier is present, the corresponding
       argument shall be a pointer to the initial element of an array
       of wchar_t large enough to accept the sequence. No null wide
       character is added.

       Otherwise, the application shall ensure that the corresponding
       argument is a pointer to an array of wchar_t large enough to
       accept the sequence. No null wide character is added.

p      Matches an implementation-defined set of sequences, which shall
       be the same as the set of sequences that is produced by the %p
       conversion specification of the corresponding fwprintf()
       functions. The application shall ensure that the corresponding
       argument is a pointer to a pointer to void. The interpretation
       of the input item is implementation-defined. If the input item
       is a value converted earlier during the same program execution,
       the pointer that results shall compare equal to that value;
       otherwise, the behavior of the %p conversion is undefined.

n      No input is consumed. The application shall ensure that the
       corresponding argument is a pointer to the integer into which is
       to be written the number of wide characters read from the input
       so far by this call to the fwscanf() functions. Execution of a
       %n conversion specification shall not increment the assignment
       count returned at the completion of execution of the function.
       No argument shall be converted, but one shall be consumed. If
       the conversion specification includes an assignment-suppressing
       wide character or a field width, the behavior is undefined.

C      Equivalent to lc.

S      Equivalent to ls.

%      Matches a single '%' wide character; no conversion or assignment
       shall occur. The complete conversion specification shall be %%.
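
As a sketch of the [ and s conversions with the l (ell) modifier, the
hypothetical helper below splits a "key=value" wide string. The buffer
sizes (16 and 32 elements) are assumptions chosen for the example; the
field widths are one less, to leave room for the terminating null wide
character.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: split "key=value" into two wide strings.
 * key must have room for 16 wchar_t, value for 32 wchar_t.
 * Returns the number of strings assigned (2 on full success). */
int split_pair(const wchar_t *ws, wchar_t *key, wchar_t *value)
{
    /* %15l[^=]  up to 15 wide characters that are not '='
     * =         the literal separator
     * %31ls     up to 31 non-white-space wide characters */
    return swscanf(ws, L"%15l[^=]=%31ls", key, value);
}
```

Because the [ conversion requires a non-empty match, an input that
starts with '=' produces a matching failure and a return value of 0.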

If a conversion specification is invalid, the behavior is undefined.

The conversion specifiers A, E, F, G, and X are also valid and shall be
equivalent to, respectively, a, e, f, g, and x.
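
The integer specifiers can be compared with a short sketch. The helper
names are hypothetical: parse_auto() uses %i, which selects the base
from the prefix as wcstol() does with base 0, and parse_hex() uses %X,
which behaves exactly like %x.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: %i accepts decimal, octal (leading 0), and
 * hexadecimal (leading 0x or 0X) input. */
int parse_auto(const wchar_t *ws, int *out)
{
    return swscanf(ws, L"%i", out);
}

/* Hypothetical helper: %X is equivalent to %x. Both accept
 * hexadecimal digits in either case, with an optional 0x or 0X
 * prefix. */
int parse_hex(const wchar_t *ws, unsigned *out)
{
    return swscanf(ws, L"%X", out);
}
```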

If end-of-file is encountered during input, conversion is terminated.
If end-of-file occurs before any wide characters matching the current
conversion specification (except for %n) have been read (other than
leading white-space, where permitted), execution of the current
conversion specification shall terminate with an input failure.
Otherwise, unless execution of the current conversion specification is
terminated with a matching failure, execution of the following
conversion specification (if any) shall be terminated with an input
failure.

Reaching the end of the string in swscanf() shall be equivalent to
encountering end-of-file for fwscanf().

If conversion terminates on a conflicting input, the offending input
shall be left unread in the input. Any trailing white space (including
<newline>) shall be left unread unless matched by a conversion
specification. The success of literal matches and suppressed
assignments is only directly determinable via the %n conversion
specification.
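
Since the return value counts assignments only, a trailing literal
that fails to match goes unnoticed unless %n is used. The sketch below
shows one way to check it; parse_parenthesized() is a hypothetical
helper that accepts exactly "(N)".

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: accept exactly "(N)" and nothing else.
 * Returns 1 on success, 0 otherwise. */
int parse_parenthesized(const wchar_t *ws, int *out)
{
    int end = -1;               /* stays -1 unless %n is reached */

    if (swscanf(ws, L"(%d)%n", out, &end) != 1)
        return 0;               /* the integer was not converted */
    if (end < 0)
        return 0;               /* the closing ')' did not match */
    return ws[end] == L'\0';    /* reject trailing input */
}
```

The %n conversion is only executed if every directive before it
succeeded, so a value that is still -1 afterwards means the closing
parenthesis was missing.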

The fwscanf() and wscanf() functions may mark the st_atime field of the
file associated with stream for update. The st_atime field shall be
marked for update by the first successful execution of fgetc(),
fgetwc(), fgets(), fgetws(), fread(), getc(), getwc(), getchar(),
getwchar(), gets(), fscanf(), or fwscanf() using stream that returns
data not supplied by a prior call to ungetc().

RETURN VALUE
Upon successful completion, these functions shall return the number of
successfully matched and assigned input items; this number can be zero
in the event of an early matching failure. If the input ends before the
first matching failure or conversion, EOF shall be returned. If a read
error occurs, the error indicator for the stream is set, EOF shall be
returned, and errno shall be set to indicate the error.
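
The three outcomes described above can be told apart as in this
sketch. The enumeration and the helper read_int() are hypothetical
names introduced for the example.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical classification of a single-item scan. */
enum scan_result { SCAN_OK, SCAN_MISMATCH, SCAN_END };

/* Hypothetical helper: read one integer and classify the result. */
enum scan_result read_int(const wchar_t *ws, int *out)
{
    int rc = swscanf(ws, L"%d", out);

    if (rc == 1)
        return SCAN_OK;         /* one item matched and assigned */
    if (rc == EOF)
        return SCAN_END;        /* input ended before any conversion */
    return SCAN_MISMATCH;       /* rc == 0: early matching failure */
}
```

For swscanf() the end of the string plays the role of end-of-file, so
an empty or all-white-space string yields EOF rather than 0.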

ERRORS
For the conditions under which the fwscanf() functions shall fail and
may fail, refer to fgetwc().

In addition, fwscanf() may fail if:

EILSEQ Input byte sequence does not form a valid character.

EINVAL There are insufficient arguments.

The following sections are informative.

EXAMPLES
The call:

    int i, n; float x; char name[50];
    n = wscanf(L"%d%f%s", &i, &x, name);

with the input line:

    25 54.32E-1 Hamster

assigns to n the value 3, to i the value 25, to x the value 5.432, and
name contains the string "Hamster".

The call:

    int i; float x; char name[50];
    (void) wscanf(L"%2d%f%*d %[0123456789]", &i, &x, name);

with input:

    56789 0123 56a72

assigns 56 to i, 789.0 to x, skips 0123, and places the string "56\0"
in name. The next call to getwchar() shall return the wide character
L'a'.

APPLICATION USAGE
In format strings containing the '%' form of conversion specifications,
each argument in the argument list is used exactly once.

RATIONALE
None.

FUTURE DIRECTIONS
None.

SEE ALSO
getwc(), fwprintf(), setlocale(), wcstod(), wcstol(), wcstoul(),
wcrtomb(), the Base Definitions volume of IEEE Std 1003.1-2001,
Chapter 7, Locale, <langinfo.h>, <stdio.h>, <wchar.h>

COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html .

IEEE/The Open Group                  2003                    FWSCANF(3P)