1fwscanf(3C) Standard C Library Functions fwscanf(3C)
2
3
4
6 fwscanf, wscanf, swscanf, vfwscanf, vwscanf, vswscanf - convert format‐
7 ted wide-character input
8
10 #include <stdio.h>
11 #include <wchar.h>
12
13 int fwscanf(FILE *restrict stream, const wchar_t *restrict format, ...);
14
15
16 int wscanf(const wchar_t *restrict format, ...);
17
18
19 int swscanf(const wchar_t *restrict s, const wchar_t *restrict format,
20 ...);
21
22
23 #include <stdarg.h>
24 #include <stdio.h>
25 #include <wchar.h>
26
27 int vfwscanf(FILE *restrict stream, const wchar_t *restrict format,
28 va_list arg);
29
30
31 int vswscanf(const wchar_t *restrict ws, const wchar_t *restrict format,
32 va_list arg);
33
34
35 int vwscanf(const wchar_t *restrict format, va_list arg);
36
37
39 The fwscanf() function reads from the named input stream.
40
41
42 The wscanf() function reads from the standard input stream stdin.
43
44
45 The swscanf() function reads from the wide-character string s.
46
47
48 The vfwscanf(), vswscanf(), and vwscanf() functions are equivalent to
49 the fwscanf(), swscanf(), and wscanf() functions, respectively, except
50 that instead of being called with a variable number of arguments, they
51 are called with an argument list as defined by the <stdarg.h> header.
52 These functions do not invoke the va_end() macro. Applications using
53 these functions should call va_end(ap) afterwards to clean up.
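
As a minimal sketch of the va_list forms, the wrapper below forwards its
variable arguments to vswscanf() and then calls va_end() itself, since
vswscanf() does not. The name parse_fields() is hypothetical and is not
part of any library.

```c
#include <stdarg.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: forwards a variable argument list to vswscanf().
 * The caller of vswscanf() is responsible for va_end(). */
int parse_fields(const wchar_t *ws, const wchar_t *format, ...)
{
    va_list ap;
    int assigned;

    va_start(ap, format);
    assigned = vswscanf(ws, format, ap);
    va_end(ap);                 /* vswscanf() does not do this for us */
    return assigned;
}
```

A call such as parse_fields(L"10 20", L"%d %d", &a, &b) is expected to
behave like the equivalent swscanf() call.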
54
55
56 Each function reads wide-characters, interprets them according to a
57 format, and stores the results in its arguments. Each expects, as argu‐
58 ments, a control wide-character string format described below, and a
59 set of pointer arguments indicating where the converted input should be
60 stored. The result is undefined if there are insufficient arguments for
61 the format. If the format is exhausted while arguments remain, the
62 excess arguments are evaluated but are otherwise ignored.
63
64
65 Conversions can be applied to the nth argument after the format in the
66 argument list, rather than to the next unused argument. In this case,
67 the conversion wide-character % (see below) is replaced by the sequence
68 %n$, where n is a decimal integer in the range [1, NL_ARGMAX]. This
69 feature provides for the definition of format wide-character strings
70 that select arguments in an order appropriate to specific languages. In
71 format wide-character strings containing the %n$ form of conversion
72 specifications, it is unspecified whether numbered arguments in the
73 argument list can be referenced from the format wide-character string
74 more than once.
75
76
77 The format can contain either form of a conversion specification, that
78 is, % or %n$, but the two forms cannot normally be mixed within a sin‐
79 gle format wide-character string. The only exception to this is that %%
80 or %* can be mixed with the %n$ form.
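
The %n$ form can be sketched as follows, assuming the implementation
supports numbered arguments in the wide-character scanning functions.
The input supplies the day before the month, while the argument list
names the month first. The function name parse_day_month() is
hypothetical.

```c
#include <wchar.h>

/* Hypothetical helper: input looks like L"10 Jul".
 * %2$d   stores the first input field in the 2nd argument (day).
 * %1$7ls stores the second input field in the 1st argument (month),
 *        reading at most 7 wide-characters plus the terminating null. */
int parse_day_month(const wchar_t *ws, wchar_t month[8], int *day)
{
    return swscanf(ws, L"%2$d %1$7ls", month, day);
}
```

Only the format needs to change to accept a different field order; the
argument list stays the same.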
81
82
83 The fwscanf() function in all its forms allows for detection of a lan‐
84 guage-dependent radix character in the input string, encoded as a wide-
85 character value. The radix character is defined in the program's
86 locale (category LC_NUMERIC). In the POSIX locale, or in a locale where
87 the radix character is not defined, the radix character defaults to a
88 period (.).
89
90
91 The format is a wide-character string composed of zero or more direc‐
92 tives. Each directive is composed of one of the following: one or more
93 white-space wide-characters (space, tab, newline, vertical-tab or
94 form-feed characters); an ordinary wide-character (neither % nor a
95 white-space character); or a conversion specification. Each conversion
96 specification is introduced by a % or the sequence %n$ after which the
97 following appear in sequence:
98
99 o An optional assignment-suppressing character *.
100
101 o An optional non-zero decimal integer that specifies the max‐
102 imum field width.
103
104 o An optional length modifier that specifies the size of the
105 receiving object.
106
107 o A conversion specifier wide-character that specifies the
108 type of conversion to be applied. The valid conversion wide-
109 characters are described below.
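
The parts of a conversion specification listed above can be combined as
in the following sketch, which reads a date of the form 2008-07-10. The
function name parse_date() and the date layout are assumptions made for
illustration.

```c
#include <wchar.h>

/* Hypothetical helper illustrating the optional parts of a conversion
 * specification:
 *   %4d   maximum field width of 4, no length modifier (int)
 *   %*c   assignment suppressed: one wide-character is read and dropped
 *   %2ld  field width 2 with the l length modifier (long)
 * Suppressed conversions are not counted in the return value. */
int parse_date(const wchar_t *ws, int *year, int *month, long *day)
{
    return swscanf(ws, L"%4d%*c%2d%*c%2ld", year, month, day);
}
```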
110
111
112 The fwscanf() functions execute each directive of the format in turn.
113 If a directive fails, as detailed below, the function returns. Fail‐
114 ures are described as input failures (due to the unavailability of
115 input bytes) or matching failures (due to inappropriate input).
116
117
118 A directive composed of one or more white-space wide-characters is exe‐
119 cuted by reading input until no more valid input can be read, or up to
120 the first wide-character which is not a white-space wide-character,
121 which remains unread.
122
123
124 A directive that is an ordinary wide-character is executed as follows.
125 The next wide-character is read from the input and compared with the
126 wide-character that comprises the directive; if the comparison shows
127 that they are not equivalent, the directive fails, and the differing
128 and subsequent wide-characters remain unread.
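
A short sketch of white-space and ordinary wide-character directives
follows. The key name "size" and the function name parse_setting() are
hypothetical.

```c
#include <wchar.h>

/* Hypothetical helper: accepts input such as L"size=42" or
 * L"\t size = 42".
 * Each space in the format is a white-space directive that matches zero
 * or more white-space wide-characters in the input. The letters of
 * "size" and the '=' are ordinary wide-character directives that must
 * match the input exactly. */
int parse_setting(const wchar_t *ws, int *value)
{
    return swscanf(ws, L" size = %d", value);
}
```

If an ordinary wide-character does not match, as with the input
L"sise=42", the function returns before any assignment is made.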
129
130
131 A directive that is a conversion specification defines a set of match‐
132 ing input sequences, as described below for each conversion wide-char‐
133 acter. A conversion specification is executed in the following steps:
134
135
136 Input white-space wide-characters (as specified by iswspace(3C)) are
137 skipped, unless the conversion specification includes a [, c, or n con‐
138 version character.
139
140
141 An item is read from the input unless the conversion specification
142 includes an n conversion wide-character. The length of the item read is
143 limited to any specified maximum field width. In Solaris default mode,
144 the input item is defined as the longest sequence of input wide-charac‐
145 ters that forms a matching sequence. In some cases, fwscanf() might
146 need to read several extra wide-characters beyond the end of the input
147 item to find the end of a matching sequence. In C99/SUSv3 mode, the
148 input item is defined as the longest sequence of input wide-characters
149 that is, or is a prefix of, a matching sequence. With this definition,
150 fwscanf() need only read at most one wide-character beyond the end of
151 the input item. Therefore, in C99/SUSv3 mode, some sequences that are
152 acceptable to wcstod(3C), wcstol(3C), and similar functions are unac‐
153 ceptable to fwscanf(). In either mode, fwscanf() attempts to push back
154 any excess bytes read using ungetc(3C). Assuming all such attempts suc‐
155 ceed, the first wide-character, if any, after the input item remains
156 unread. If the length of the input item is 0, the conversion fails.
157 This condition is a matching failure unless end-of-file, an encoding
158 error, or a read error prevented input from the stream, in which case
159 it is an input failure.
160
161
162 Except in the case of a % conversion wide-character, the input item
163 (or, in the case of a %n conversion specification, the count of input
164 wide-characters) is converted to a type appropriate to the conversion
165 wide-character. If the input item is not a matching sequence, the exe‐
166 cution of the conversion specification fails; this condition is a
167 matching failure. Unless assignment suppression was indicated by a *,
168 the result of the conversion is placed in the object pointed to by the
169 first argument following the format argument that has not already
170 received a conversion result if the conversion specification is intro‐
171 duced by %, or in the nth argument if introduced by the wide-character
172 sequence %n$. If this object does not have an appropriate type, or if
173 the result of the conversion cannot be represented in the space pro‐
174 vided, the behavior is undefined.
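
The white-space skipping step described above can be illustrated with
the following sketch, which contrasts a c conversion with and without
a preceding white-space directive. Both function names are
hypothetical.

```c
#include <wchar.h>

/* %d skips leading white space, but %lc does not, so the
 * wide-character stored is whatever immediately follows the number. */
int read_tag_raw(const wchar_t *ws, int *n, wchar_t *ch)
{
    return swscanf(ws, L"%d%lc", n, ch);
}

/* The white-space directive before %lc consumes any white space first,
 * so the next non-white-space wide-character is stored. */
int read_tag_skip(const wchar_t *ws, int *n, wchar_t *ch)
{
    return swscanf(ws, L"%d %lc", n, ch);
}
```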
175
176
177 The length modifiers and their meanings are:
178
179 hh Specifies that a following d, i, o, u, x, X, or n con‐
180 version specifier applies to an argument with type
181 pointer to signed char or unsigned char.
182
183
184 h Specifies that a following d, i, o, u, x, X, or n con‐
185 version specifier applies to an argument with type
186 pointer to short or unsigned short.
187
188
189 l (ell) Specifies that a following d, i, o, u, x, X, or n con‐
190 version specifier applies to an argument with type
191 pointer to long or unsigned long; that a following a,
192 A, e, E, f, F, g, or G conversion specifier applies to
193 an argument with type pointer to double; or that a fol‐
194 lowing c, s, or [ conversion specifier applies to an
195 argument with type pointer to wchar_t.
196
197
198 ll (ell-ell) Specifies that a following d, i, o, u, x, X, or n con‐
199 version specifier applies to an argument with type
200 pointer to long long or unsigned long long.
201
202
203 j Specifies that a following d, i, o, u, x, X, or n con‐
204 version specifier applies to an argument with type
205 pointer to intmax_t or uintmax_t.
206
207
208 z Specifies that a following d, i, o, u, x, X, or n con‐
209 version specifier applies to an argument with type
210 pointer to size_t or the corresponding signed integer
211 type.
212
213
214 t Specifies that a following d, i, o, u, x, X, or n con‐
215 version specifier applies to an argument with type
216 pointer to ptrdiff_t or the corresponding unsigned
217 type.
218
219
220 L Specifies that a following a, A, e, E, f, F, g, or G
221 conversion specifier applies to an argument with type
222 pointer to long double.
223
224
225
226 If a length modifier appears with any conversion specifier other than
227 as specified above, the behavior is undefined.
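
The following sketch pairs each length modifier with a receiving object
of the matching type. The structure and function names are hypothetical
and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical container: one member per length modifier. */
struct sized_values {
    signed char  c;    /* hh */
    short        s;    /* h  */
    long         l;    /* l  with an integer conversion */
    long long    ll;   /* ll */
    intmax_t     j;    /* j  */
    size_t       z;    /* z  */
    ptrdiff_t    t;    /* t  */
    double       d;    /* l  with a floating-point conversion */
    long double  ld;   /* L  */
};

int parse_sized(const wchar_t *ws, struct sized_values *out)
{
    return swscanf(ws, L"%hhd %hd %ld %lld %jd %zu %td %lf %Lf",
                   &out->c, &out->s, &out->l, &out->ll, &out->j,
                   &out->z, &out->t, &out->d, &out->ld);
}
```

Passing a pointer whose type does not match the modifier, for example
a pointer to float with %lf, results in undefined behavior.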
228
229
230 The following conversion wide-characters are valid:
231
232 d Matches an optionally signed decimal integer, whose format
233 is the same as expected for the subject sequence of
234 wcstol(3C) with the value 10 for the base argument. In the
235 absence of a size modifier, the corresponding argument must
236 be a pointer to int.
237
238
239 i Matches an optionally signed integer, whose format is the
240 same as expected for the subject sequence of wcstol(3C) with
241 0 for the base argument. In the absence of a size modifier,
242 the corresponding argument must be a pointer to int.
243
244
245 o Matches an optionally signed octal integer, whose format is
246 the same as expected for the subject sequence of wcstoul(3C)
247 with the value 8 for the base argument. In the absence of a
248 size modifier, the corresponding argument must be a pointer
249 to unsigned int.
250
251
252 u Matches an optionally signed decimal integer, whose format
253 is the same as expected for the subject sequence of
254 wcstoul(3C) with the value 10 for the base argument. In the
255 absence of a size modifier, the corresponding argument must
256 be a pointer to unsigned int.
257
258
259 x Matches an optionally signed hexadecimal integer, whose for‐
260 mat is the same as expected for the subject sequence of
261 wcstoul(3C) with the value 16 for the base argument. In the
262 absence of a size modifier, the corresponding argument must
263 be a pointer to unsigned int.
264
265
266 a,e,f,g Matches an optionally signed floating-point number, whose
267 format is the same as expected for the subject sequence of
268 wcstod(3C). In the absence of a size modifier, the corre‐
269 sponding argument must be a pointer to float. The e, f, and
270 g specifiers match hexadecimal floating point values only in
271 C99/SUSv3 (see standards(5)) mode, but the a specifier
272 always matches hexadecimal floating point values.
273
274 These conversion specifiers match any subject sequence
275 accepted by wcstod(3C), including the INF, INFINITY, NAN,
276 and NAN(n-char-sequence) forms. The result of the conver‐
277 sion is the same as that of calling wcstod() (or wcstof() or
278 wcstold()) with the matching sequence, including the raising
279 of floating point exceptions and the setting of errno to
280 ERANGE, if applicable.
281
282
283 s Matches a sequence of non white-space wide-characters. If
284 no l (ell) qualifier is present, characters from the input
285 field are converted as if by repeated calls to the wcr‐
286 tomb(3C) function, with the conversion state described by an
287 mbstate_t object initialized to zero before the first wide-
288 character is converted. The corresponding argument must be
289 a pointer to a character array large enough to accept the
290 sequence and the terminating null character, which will be
291 added automatically.
292
293 Otherwise, the corresponding argument must be a pointer to
294 an array of wchar_t large enough to accept the sequence and
295 the terminating null wide-character, which will be added
296 automatically.
297
298
299 [ Matches a non-empty sequence of wide-characters from a set
300 of expected wide-characters (the scanset). If no l (ell)
301 qualifier is present, wide-characters from the input field
302 are converted as if by repeated calls to the wcrtomb() func‐
303 tion, with the conversion state described by an mbstate_t
304 object initialized to zero before the first wide-character
305 is converted. The corresponding argument must be a pointer
306 to a character array large enough to accept the sequence and
307 the terminating null character, which will be added auto‐
308 matically.
309
310 If an l (ell) qualifier is present, the corresponding argu‐
311 ment must be a pointer to an array of wchar_t large enough
312 to accept the sequence and the terminating null wide-char‐
313 acter, which will be added automatically.
314
315 The conversion specification includes all subsequent wide-
316 characters in the format string up to and including the
317 matching right square bracket (]). The wide-characters
318 between the square brackets (the scanlist) comprise the
319 scanset, unless the wide-character after the left square
320 bracket is a circumflex (^), in which case the scanset con‐
321 tains all wide-characters that do not appear in the scanlist
322 between the circumflex and the right square bracket. If the
323 conversion specification begins with [] or [^], the right
324 square bracket is included in the scanlist and the next
325 right square bracket is the matching right square bracket
326 that ends the conversion specification; otherwise the first
327 right square bracket is the one that ends the conversion
328 specification. If a minus-sign (-) is in the scanlist and is
329 not the first wide-character, nor the second where the first
330 wide-character is a ^, nor the last wide-character, it indi‐
331 cates a range of characters to be matched.
332
333
334 c Matches a sequence of wide-characters of the number speci‐
335 fied by the field width (1 if no field width is present in
336 the conversion specification). If no l (ell) qualifier is
337 present, wide-characters from the input field are converted
338 as if by repeated calls to the wcrtomb() function, with the
339 conversion state described by an mbstate_t object initial‐
340 ized to zero before the first wide-character is converted.
341 The corresponding argument must be a pointer to a character
342 array large enough to accept the sequence. No null charac‐
343 ter is added.
344
345 Otherwise, the corresponding argument must be a pointer to
346 an array of wchar_t large enough to accept the sequence. No
347 null wide-character is added.
348
349
350 p Matches the set of sequences that is the same as the set of
351 sequences that is produced by the %p conversion of the cor‐
352 responding fwprintf(3C) functions. The corresponding argu‐
353 ment must be a pointer to a pointer to void. If the input
354 item is a value converted earlier during the same program
355 execution, the pointer that results will compare equal to
356 that value; otherwise the behavior of the %p conversion is
357 undefined.
358
359
360 n No input is consumed. The corresponding argument must be a
361 pointer to the integer into which is to be written the num‐
362 ber of wide-characters read from the input so far by this
363 call to the fwscanf() functions. Execution of a %n conver‐
364 sion specification does not increment the assignment count
365 returned at the completion of execution of the function.
366
367
368 C Same as lc.
369
370
371 S Same as ls.
372
373
374 % Matches a single %; no conversion or assignment occurs. The
375 complete conversion specification must be %%.
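
The [, c, and l-qualified forms described in the list above can be
sketched together as follows. The record layout
"<name>,<three-letter code> <digits>" and the function name
parse_record() are assumptions made for illustration.

```c
#include <wchar.h>

/* Hypothetical helper for input such as L"Hamster,ABC 56a72".
 *   %15l[^,]  up to 15 wide-characters that are not a comma, stored as
 *             wchar_t with a terminating null wide-character
 *   %3lc      exactly 3 wide-characters, stored as wchar_t with no
 *             terminating null
 *   %7[0-9]   up to 7 digits, converted to a char array as if by
 *             wcrtomb(), with a terminating null character */
int parse_record(const wchar_t *ws, wchar_t name[16], wchar_t code[3],
                 char digits[8])
{
    return swscanf(ws, L"%15l[^,],%3lc %7[0-9]", name, code, digits);
}
```

The field widths are chosen to be one less than the array sizes for the
[ conversions, leaving room for the terminating null.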
376
377
378
379 If a conversion specification is invalid, the behavior is undefined.
380
381
382 The conversion characters A, E, F, G, and X are also valid and behave
383 the same as, respectively, a, e, f, g, and x.
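
A brief sketch of the integer conversions follows. It shows that %i
derives the base from the input prefix and that %X accepts the same
input as %x. The function name parse_numbers() is hypothetical.

```c
#include <wchar.h>

/* Hypothetical helper for input such as L"-12 0x1F 17 ff".
 *   %d  decimal, pointer to int
 *   %i  base taken from the prefix (0x for hex, 0 for octal)
 *   %o  octal, pointer to unsigned int
 *   %X  hexadecimal, same behavior as %x */
int parse_numbers(const wchar_t *ws, int *dec, int *any,
                  unsigned int *oct, unsigned int *hex)
{
    return swscanf(ws, L"%d %i %o %X", dec, any, oct, hex);
}
```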
384
385
386 If end-of-file is encountered during input, conversion is terminated.
387 If end-of-file occurs before any wide-characters matching the current
388 conversion specification (except for %n) have been read (other than
389 leading white-space, where permitted), execution of the current conver‐
390 sion specification terminates with an input failure. Otherwise, unless
391 execution of the current conversion specification is terminated with a
392 matching failure, execution of the following conversion specification
393 (if any) is terminated with an input failure.
394
395
396 Reaching the end of the string in swscanf() is equivalent to encounter‐
397 ing end-of-file for fwscanf().
398
399
400 If conversion terminates on a conflicting input, the offending input is
401 left unread in the input. Any trailing white space (including newline)
402 is left unread unless matched by a conversion specification. The suc‐
403 cess of literal matches and suppressed assignments is only directly
404 determinable via the %n conversion specification.
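
One way to use %n for this purpose is sketched below. The return value
alone cannot show whether the trailing literal "items" matched, so the
offset stored by %n is checked instead. The function name
is_item_count() and the input layout are hypothetical.

```c
#include <wchar.h>

/* Hypothetical helper: returns 1 only if ws is exactly
 * "<integer> items", otherwise 0.
 * end is preset to -1. It is overwritten only if every directive
 * before %n, including the literal "items", succeeded. */
int is_item_count(const wchar_t *ws, int *count)
{
    int end = -1;

    if (swscanf(ws, L"%d items%n", count, &end) != 1)
        return 0;
    /* %n does not add to the return value, so 1 means only %d matched. */
    return end >= 0 && ws[end] == L'\0';
}
```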
405
406
407 The fwscanf() and wscanf() functions may mark the st_atime field of the
408 file associated with stream for update. The st_atime field will be
409 marked for update by the first successful execution of fgetc(3C),
410 fgetwc(3C), fgets(3C), fgetws(3C), fread(3C), getc(3C), getwc(3C),
411 getchar(3C), getwchar(3C), gets(3C), fscanf(3C) or fwscanf() using
412 stream that returns data not supplied by a prior call to ungetc(3C).
413
415 Upon successful completion, these functions return the number of suc‐
416 cessfully matched and assigned input items; this number can be 0 in the
417 event of an early matching failure. If the input ends before the first
418 matching failure or conversion, EOF is returned. If a read error
419 occurs, the error indicator for the stream is set, EOF is returned, and
420 errno is set to indicate the error.
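
The possible return values can be sketched with a single format applied
to different inputs. The function name scan_pair() is hypothetical.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>

/* Hypothetical helper: expects two integers separated by a comma.
 * Returns the raw swscanf() result so the caller can tell apart:
 *   2    both items matched and assigned
 *   1    a failure occurred after the first assignment
 *   0    an early matching failure, nothing assigned
 *   EOF  the input ended before the first conversion */
int scan_pair(const wchar_t *ws, int *a, int *b)
{
    return swscanf(ws, L"%d,%d", a, b);
}
```

Callers should compare the result with the number of expected
assignments rather than testing only for EOF.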
421
423 For the conditions under which the fwscanf() functions will fail and
424 may fail, refer to fgetwc(3C).
425
426
427 In addition, fwscanf() may fail if:
428
429 EILSEQ Input byte sequence does not form a valid character.
430
431
432 EINVAL There are insufficient arguments.
433
434
436 In format strings containing the % form of conversion specifications,
437 each argument in the argument list is used exactly once.
438
440 Example 1 wscanf() example
441
442
443 The call:
444
445
446 int i, n; float x; char name[50];
447 n = wscanf(L"%d%f%s", &i, &x, name);
448
449
450
451 with the input line:
452
453
454 25 54.32E-1 Hamster
455
456
457
458 will assign to n the value 3, to i the value 25, to x the value 5.432,
459 and name will contain the string Hamster.
460
461
462
463 The call:
464
465
466 int i; float x; char name[50];
467 (void) wscanf(L"%2d%f%*d %[0123456789]", &i, &x, name);
468
469
470
471 with input:
472
473
474 56789 0123 56a72
475
476
477
478 will assign 56 to i, 789.0 to x, skip 0123, and place the string 56\0
479 in name. The next call to getchar(3C) will return the character a.
480
481
483 See attributes(5) for descriptions of the following attributes:
484
485
486
487
488 ┌─────────────────────────────┬─────────────────────────────┐
489 │ ATTRIBUTE TYPE │ ATTRIBUTE VALUE │
490 ├─────────────────────────────┼─────────────────────────────┤
491 │Interface Stability │Committed │
492 ├─────────────────────────────┼─────────────────────────────┤
493 │MT-Level │MT-Safe │
494 ├─────────────────────────────┼─────────────────────────────┤
495 │Standard │See standards(5). │
496 └─────────────────────────────┴─────────────────────────────┘
497
499 fgetc(3C), fgets(3C), fgetwc(3C), fgetws(3C), fread(3C), fscanf(3C),
500 fwprintf(3C), getc(3C), getchar(3C), gets(3C), getwc(3C), getwchar(3C),
501 setlocale(3C), strtod(3C), wcrtomb(3C), wcstod(3C), wcstol(3C),
502 wcstoul(3C), attributes(5), standards(5)
503
505 The behavior of the conversion specifier "%%" has changed for all of
506 the functions described on this manual page. Previously the "%%" speci‐
507 fier accepted a "%" character from input only if there were no preced‐
508 ing whitespace characters. The new behavior accepts "%" even if there
509 are preceding whitespace characters. This new behavior now aligns with
510 the description on this manual page and in various standards. If the
511 old behavior is desired, the conversion specification "%*[%]" can be
512 used.
513
514
515
516SunOS 5.11 10 Jul 2008 fwscanf(3C)