UNVIS(3bsd)                          LOCAL                         UNVIS(3bsd)

NAME

     unvis, strunvis, strnunvis, strunvisx, strnunvisx - decode a visual
     representation of characters

LIBRARY

     Utility functions from BSD systems (libbsd, -lbsd)

SYNOPSIS

     #include <vis.h>
     (See libbsd(7) for include usage.)

     int
     unvis(char *cp, int c, int *astate, int flag);

     int
     strunvis(char *dst, const char *src);

     int
     strnunvis(char *dst, size_t dlen, const char *src);

     int
     strunvisx(char *dst, const char *src, int flag);

     int
     strnunvisx(char *dst, size_t dlen, const char *src, int flag);

DESCRIPTION

     The unvis(), strunvis() and strunvisx() functions are used to decode a
     visual representation of characters, as produced by the vis(3bsd)
     function, back into the original form.

     The unvis() function is called with successive characters in c until a
     valid sequence is recognized, at which time the decoded character is
     available at the character pointed to by cp.

     The strunvis() function decodes the characters pointed to by src into the
     buffer pointed to by dst.  It copies src to dst, decoding any escape
     sequences along the way, and returns the number of characters placed
     into dst, or -1 if an invalid escape sequence was detected.  The buffer
     dst should be at least as large as src, because decoding never expands
     the input.
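
     The strunvisx() function does the same as the strunvis() function, but
     takes an additional flag argument that specifies the style in which src
     is encoded.  Currently, the supported flags are VIS_HTTPSTYLE and
     VIS_MIMESTYLE.

     The strnunvis() and strnunvisx() functions behave like strunvis() and
     strunvisx(), but take the size of dst in dlen and fail rather than write
     past the end of the buffer.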

     The unvis() function implements a state machine that can be used to
     decode an arbitrary stream of bytes.  All state associated with the bytes
     being decoded is stored outside the unvis() function (that is, a pointer
     to the state is passed in), so calls decoding different streams can be
     freely intermixed.  To start decoding a stream of bytes, first initialize
     an integer to zero.  Call unvis() with each successive byte, along with a
     pointer to this integer, and a pointer to a destination character.  The
     unvis() function has several return codes that must be handled properly.
     They are:

     0 (zero)         Another character is necessary; nothing has been
                      recognized yet.

     UNVIS_VALID      A valid character has been recognized and is available
                      at the location pointed to by cp.

     UNVIS_VALIDPUSH  A valid character has been recognized and is available
                      at the location pointed to by cp; however, the character
                      currently passed in should be passed in again.

     UNVIS_NOCHAR     A valid sequence was detected, but no character was
                      produced.  This return code is necessary to indicate a
                      logical break between characters.

     UNVIS_SYNBAD     An invalid escape sequence was detected, or the decoder
                      is in an unknown state.  The decoder is placed into the
                      starting state.

     When all bytes in the stream have been processed, call unvis() one more
     time with flag set to UNVIS_END to extract any remaining character (the
     character passed in is ignored).

     The flag argument is also used to specify the encoding style of the
     source.  If set to VIS_HTTPSTYLE or VIS_HTTP1808, unvis() will decode URI
     strings as specified in RFC 1808.  If set to VIS_HTTP1866, unvis() will
     decode entity references and numeric character references as specified in
     RFC 1866.  If set to VIS_MIMESTYLE, unvis() will decode MIME
     Quoted-Printable strings as specified in RFC 2045.  If set to
     VIS_NOESCAPE, unvis() will not decode ‘\’ quoted characters.

     The following code fragment illustrates a proper use of unvis().

           /* Assumes <err.h>, <stdio.h>, <stdlib.h> and <vis.h>. */
           int ch;
           int state = 0;
           char out;

           while ((ch = getchar()) != EOF) {
           again:
                   switch (unvis(&out, ch, &state, 0)) {
                   case 0:
                   case UNVIS_NOCHAR:
                           /* No output yet; feed the next byte. */
                           break;
                   case UNVIS_VALID:
                           (void)putchar(out);
                           break;
                   case UNVIS_VALIDPUSH:
                           /* Emit out, then pass ch in again. */
                           (void)putchar(out);
                           goto again;
                   case UNVIS_SYNBAD:
                           errx(EXIT_FAILURE, "Bad character sequence!");
                   }
           }
           /* Flush any character still pending in the state machine. */
           if (unvis(&out, '\0', &state, UNVIS_END) == UNVIS_VALID)
                   (void)putchar(out);

ERRORS

     The functions strunvis(), strnunvis(), strunvisx(), and strnunvisx() will
     return -1 on error and set errno to:

     [EINVAL]           An invalid escape sequence was detected, or the
                        decoder is in an unknown state.

     In addition, the functions strnunvis() and strnunvisx() can also set
     errno on error to:

     [ENOSPC]           Not enough space to perform the conversion.

SEE ALSO

     unvis(1), vis(1), vis(3bsd)

     R. Fielding, Relative Uniform Resource Locators, RFC 1808.

HISTORY

     The unvis() function first appeared in 4.4BSD.  The strnunvis() and
     strnunvisx() functions appeared in NetBSD 6.0.

BUGS

     The names VIS_HTTP1808 and VIS_HTTP1866 are wrong.  Percent-encoding was
     defined in RFC 1738, the original RFC for URLs.  RFC 1866 defines HTML
     2.0, an application of SGML, from which it inherits the concepts of
     numeric character references and entity references.

BSD                             March 12, 2011                             BSD