scan_utf8(3) Library Functions Manual scan_utf8(3)

scan_utf8 - decode an unsigned integer from UTF-8 encoding

#include <scan.h>

size_t scan_utf8(const char *src,size_t len,uint32_t *dest);

scan_utf8 decodes an unsigned integer in UTF-8 encoding from a memory
area holding binary data. It writes the decoded value to dest and
returns the number of bytes it read from src.

scan_utf8 never reads more than len bytes from src. If the sequence is
longer than that, or the memory area contains an invalid sequence,
scan_utf8 returns 0 and does not touch dest.
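
The contract above (bytes consumed on success, 0 on failure, dest left
untouched on failure) lends itself to a simple decode loop. The sketch
below is only an illustration: demo_scan_utf8 is a hypothetical stand-in
so the code compiles without the library, it handles only sequences of
1 to 4 bytes, and it is not the library's implementation. decode_all is
likewise an illustrative name. With the library available, call
scan_utf8 in its place.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, NOT the library's code. It mimics the
 * documented contract: returns bytes consumed, or 0 on failure,
 * and leaves *dest untouched on failure. Handles 1..4 bytes only. */
size_t demo_scan_utf8(const char *src, size_t len, uint32_t *dest) {
  const unsigned char *s = (const unsigned char *)src;
  uint32_t v, min;
  size_t n, i;

  if (len == 0) return 0;
  if (s[0] < 0x80) { *dest = s[0]; return 1; }
  else if ((s[0] & 0xe0) == 0xc0) { n = 2; v = s[0] & 0x1f; min = 0x80; }
  else if ((s[0] & 0xf0) == 0xe0) { n = 3; v = s[0] & 0x0f; min = 0x800; }
  else if ((s[0] & 0xf8) == 0xf0) { n = 4; v = s[0] & 0x07; min = 0x10000; }
  else return 0;               /* stray continuation or unsupported lead byte */
  if (n > len) return 0;       /* sequence would run past len */
  for (i = 1; i < n; ++i) {
    if ((s[i] & 0xc0) != 0x80) return 0;   /* not a continuation byte */
    v = (v << 6) | (uint32_t)(s[i] & 0x3f);
  }
  if (v < min) return 0;       /* overlong form */
  *dest = v;
  return n;
}

/* Decode at most cap values from buf into out. Returns the number of
 * values decoded; *consumed receives the number of bytes decoded. */
size_t decode_all(const char *buf, size_t len,
                  uint32_t *out, size_t cap, size_t *consumed) {
  size_t pos = 0, count = 0;
  while (pos < len && count < cap) {
    uint32_t v;
    size_t r = demo_scan_utf8(buf + pos, len - pos, &v);
    if (r == 0) break;         /* invalid or truncated: stop here */
    out[count++] = v;
    pos += r;
  }
  *consumed = pos;
  return count;
}
```

If *consumed is smaller than len after the call, the bytes starting at
buf + *consumed are either an invalid sequence or a sequence cut off by
the end of the buffer.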

The longest UTF-8 sequence is 5 bytes long. If the buffer holds more
bytes than that and scan_utf8 still fails, then the data was not a
valid UTF-8 encoded sequence.
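
When reading from a stream, the rule above can help decide whether a
failure is final or whether more input might still complete the
sequence. The following is a minimal sketch, not part of the library:
classify_scan and the scan_status names are hypothetical, and max_seq is
supplied by the caller as the longest sequence length stated above.

```c
#include <stddef.h>

enum scan_status {
  SCAN_OK,                    /* a value was decoded */
  SCAN_NEED_MORE_OR_INVALID,  /* short buffer: could be truncated or invalid */
  SCAN_INVALID                /* buffer was long enough, so the data is bad */
};

/* r:       return value of the scan call
 * len:     the len that was passed to that call
 * max_seq: longest possible sequence length, as documented */
enum scan_status classify_scan(size_t r, size_t len, size_t max_seq) {
  if (r > 0) return SCAN_OK;
  if (len > max_seq) return SCAN_INVALID;  /* "longer than that" and failed */
  return SCAN_NEED_MORE_OR_INVALID;
}
```

A caller would typically read more data on SCAN_NEED_MORE_OR_INVALID and
retry, and report an error on SCAN_INVALID or at end of input.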

fmt_utf8 and scan_utf8 implement the UTF-8 encoding scheme, but they
are meant to store arbitrary integers, not just Unicode code points.
Values above 0x10ffff are not valid in UTF-8. If you are using this
function to parse UTF-8, you need to reject them yourself (see RFC
3629).
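
A range check on the decoded value could look like the sketch below.
is_rfc3629_value is a hypothetical helper name. The surrogate test goes
beyond what the text above requires: RFC 3629 also excludes the range
0xd800 to 0xdfff, and this sketch makes no assumption about whether
scan_utf8 already rejects it.

```c
#include <stdint.h>

/* Returns 1 if v is allowed in UTF-8 as defined by RFC 3629, else 0. */
int is_rfc3629_value(uint32_t v) {
  if (v > 0x10ffff) return 0;                /* beyond the Unicode range */
  if (v >= 0xd800 && v <= 0xdfff) return 0;  /* UTF-16 surrogates */
  return 1;
}
```

Apply the check to the value written to dest after every successful
call, and treat a rejected value the same way as a return value of 0.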

fmt_utf8(3)

scan_utf8(3)