vis(3bsd)                            LOCAL                           vis(3bsd)

NAME

     vis, nvis, strvis, stravis, strnvis, strvisx, strnvisx, strenvisx, svis,
     snvis, strsvis, strsnvis, strsvisx, strsnvisx, strsenvisx - visually
     encode characters

LIBRARY

     Utility functions from BSD systems (libbsd, -lbsd)

SYNOPSIS

     #include <vis.h>
     (See libbsd(7) for include usage.)

     char *
     vis(char *dst, int c, int flag, int nextc);

     char *
     nvis(char *dst, size_t dlen, int c, int flag, int nextc);

     int
     strvis(char *dst, const char *src, int flag);

     int
     stravis(char **dst, const char *src, int flag);

     int
     strnvis(char *dst, size_t dlen, const char *src, int flag);

     int
     strvisx(char *dst, const char *src, size_t len, int flag);

     int
     strnvisx(char *dst, size_t dlen, const char *src, size_t len, int flag);

     int
     strenvisx(char *dst, size_t dlen, const char *src, size_t len, int flag,
         int *cerr_ptr);

     char *
     svis(char *dst, int c, int flag, int nextc, const char *extra);

     char *
     snvis(char *dst, size_t dlen, int c, int flag, int nextc,
         const char *extra);

     int
     strsvis(char *dst, const char *src, int flag, const char *extra);

     int
     strsnvis(char *dst, size_t dlen, const char *src, int flag,
         const char *extra);

     int
     strsvisx(char *dst, const char *src, size_t len, int flag,
         const char *extra);

     int
     strsnvisx(char *dst, size_t dlen, const char *src, size_t len, int flag,
         const char *extra);

     int
     strsenvisx(char *dst, size_t dlen, const char *src, size_t len, int flag,
         const char *extra, int *cerr_ptr);

DESCRIPTION

     The vis() function copies into dst a string which represents the
     character c.  If c needs no encoding, it is copied in unaltered.  The
     string is NUL-terminated, and a pointer to the end of the string is
     returned.  The maximum length of any encoding is four bytes (not
     including the trailing NUL); thus, when encoding a set of characters
     into a buffer, the size of the buffer should be at least four times the
     number of bytes encoded, plus one for the trailing NUL.  The flag
     parameter alters the default range of characters considered for
     encoding and the visual representation.  The additional character,
     nextc, is only used when selecting the VIS_CSTYLE encoding format
     (explained below).
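
     The sizing rule above (four bytes per input byte, plus one for the
     trailing NUL) can be sketched as a small helper.  The name
     sketch_vis_bufsize() is hypothetical and not part of libbsd; the
     overflow check is an added precaution, not something this page
     requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of libbsd: worst-case size of a dst
 * buffer for nbytes of input, or 0 if 4 * nbytes + 1 would overflow.
 */
size_t
sketch_vis_bufsize(size_t nbytes)
{
	size_t size;

	if (nbytes > (SIZE_MAX - 1) / 4)
		return 0;		/* request too large to represent */
	size = 4 * nbytes + 1;		/* four bytes each, plus the NUL */
	assert(size > nbytes);
	return size;
}
```

     A caller would typically pass the result to malloc(3) and treat a
     return value of 0 as an error.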

     The strvis(), stravis(), strnvis(), strvisx(), and strnvisx() functions
     copy into dst a visual representation of the string src.  The strvis()
     and strnvis() functions encode characters from src up to the first NUL.
     The strvisx() and strnvisx() functions encode exactly len characters
     from src (this is useful for encoding a block of data that may contain
     NULs).  Both forms NUL-terminate dst.  The size of dst must be at least
     four times the number of bytes encoded from src (plus one for the NUL).
     Both forms return the number of characters in dst (not including the
     trailing NUL).  The stravis() function allocates space dynamically to
     hold the string.  The “n” versions of the functions also take an
     additional argument dlen that indicates the length of the dst buffer.
     If dlen is not large enough to fit the converted string, then the
     strnvis() and strnvisx() functions return -1 and set errno to ENOSPC.
     The strenvisx() function takes an additional argument, cerr_ptr, that
     is used to pass in and out a multibyte conversion error flag.  This is
     useful when processing single characters at a time when it is possible
     that the locale may be set to something other than the locale of the
     characters in the input data.

     The functions svis(), snvis(), strsvis(), strsnvis(), strsvisx(),
     strsnvisx(), and strsenvisx() correspond to vis(), nvis(), strvis(),
     strnvis(), strvisx(), strnvisx(), and strenvisx() but have an additional
     argument extra, pointing to a NUL-terminated list of characters.  These
     characters will be copied encoded or backslash-escaped into dst.  These
     functions are useful, for example, to remove the special meaning that
     certain characters have to shells.

     The encoding is a unique, invertible representation composed entirely of
     graphic characters; it can be decoded back into the original form using
     the unvis(3bsd), strunvis(3bsd) or strnunvis(3bsd) functions.

     There are two parameters that can be controlled: the range of characters
     that are encoded (applies only to vis(), nvis(), strvis(), strnvis(),
     strvisx(), and strnvisx()), and the type of representation used.  By
     default, all non-graphic characters, except space, tab, and newline, are
     encoded (see isgraph(3)).  The following flags alter this:

     VIS_DQ      Also encode double quotes.

     VIS_GLOB    Also encode the magic characters (‘*’, ‘?’, ‘[’, and ‘#’)
                 recognized by glob(3).

     VIS_SHELL   Also encode the meta characters used by shells (in addition
                 to the glob characters): (‘'’, ‘`’, ‘"’, ‘;’, ‘&’, ‘<’, ‘>’,
                 ‘(’, ‘)’, ‘|’, ‘]’, ‘\’, ‘$’, ‘!’, ‘^’, and ‘~’).

     VIS_SP      Also encode space.

     VIS_TAB     Also encode tab.

     VIS_NL      Also encode newline.

     VIS_WHITE   Synonym for VIS_SP | VIS_TAB | VIS_NL.

     VIS_META    Synonym for VIS_WHITE | VIS_GLOB | VIS_SHELL.

     VIS_SAFE    Only encode “unsafe” characters.  Unsafe means control
                 characters which may cause common terminals to perform
                 unexpected functions.  Currently this form leaves space,
                 tab, newline, backspace, bell, and return unencoded, in
                 addition to all graphic characters.

     (The above flags have no effect for svis(), snvis(), strsvis(),
     strsnvis(), strsvisx(), and strsnvisx().  When using these functions,
     place all graphic characters to be encoded in an array pointed to by
     extra.  In general, the backslash character should be included in this
     array; see the warning on the use of the VIS_NOSLASH flag below.)
     There are six forms of encoding.  All forms use the backslash character
     ‘\’ to introduce a special sequence, and two backslashes represent a
     real backslash; the exceptions are VIS_HTTPSTYLE, which uses ‘%’, and
     VIS_MIMESTYLE, which uses ‘=’.  These are the visual formats:

     (default)   Use an ‘M’ to represent meta characters (characters with the
                 8th bit set), and use caret ‘^’ to represent control
                 characters (see iscntrl(3)).  The following formats are
                 used:

                 \^C    Represents the control character ‘C’.  Spans
                        characters ‘\000’ through ‘\037’, and ‘\177’ (as
                        ‘\^?’).

                 \M-C   Represents character ‘C’ with the 8th bit set.  Spans
                        characters ‘\241’ through ‘\376’.

                 \M^C   Represents control character ‘C’ with the 8th bit set.
                        Spans characters ‘\200’ through ‘\237’, and ‘\377’ (as
                        ‘\M^?’).

                 \040   Represents ASCII space.

                 \240   Represents Meta-space.

     VIS_CSTYLE  Use C-style backslash sequences to represent standard
                 non-printable characters.  The following sequences are used
                 to represent the indicated characters:

                       \a - BEL (007)
                       \b - BS (010)
                       \f - NP (014)
                       \n - NL (012)
                       \r - CR (015)
                       \s - SP (040)
                       \t - HT (011)
                       \v - VT (013)
                       \0 - NUL (000)

                 When using this format, the nextc parameter is looked at to
                 determine if a NUL character can be encoded as ‘\0’ instead
                 of ‘\000’.  If nextc is an octal digit, the latter
                 representation is used to avoid ambiguity.

                 Non-printable characters without C-style backslash sequences
                 use the default representation.

     VIS_OCTAL   Use a three digit octal sequence.  The form is ‘\ddd’ where d
                 represents an octal digit.

     VIS_CSTYLE | VIS_OCTAL
                 Same as VIS_CSTYLE except that non-printable characters
                 without C-style backslash sequences use a three digit octal
                 sequence.

     VIS_HTTPSTYLE
                 Use URI encoding as described in RFC 1738.  The form is ‘%xx’
                 where x represents a lower case hexadecimal digit.

     VIS_MIMESTYLE
                 Use MIME Quoted-Printable encoding as described in RFC 2045,
                 except that lines are not broken and CRLF is not handled.
                 The form is ‘=XX’ where X represents an upper case
                 hexadecimal digit.
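
     The output shapes listed above can be made concrete with a standalone
     sketch.  This is an illustrative re-implementation written from the
     descriptions on this page, not libbsd code; every sketch_* name is
     hypothetical.  Each function encodes a single byte that the caller has
     already selected for encoding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only, NOT libbsd code.  Each function writes the
 * NUL-terminated encoding of one byte into dst (at least 5 bytes) and
 * returns the length of that encoding.
 */

/* Default format: \^C, \M-C, \M^C, with \040 and \240 for the spaces. */
int
sketch_default(char *dst, unsigned char c)
{
	unsigned char low = c & 0177;	/* c with the 8th bit cleared */
	int meta = (c & 0200) != 0;	/* was the 8th bit set? */

	assert(dst != NULL);
	if (low == ' ')			/* space and Meta-space use octal */
		return sprintf(dst, "\\%03o", (unsigned int)c);
	if (low < 040 || low == 0177) {	/* control character range */
		char sym = (low == 0177) ? '?' : (char)(low + '@');

		if (meta)
			return sprintf(dst, "\\M^%c", sym);
		return sprintf(dst, "\\^%c", sym);
	}
	if (meta)			/* graphic character, 8th bit set */
		return sprintf(dst, "\\M-%c", (char)low);
	return sprintf(dst, "%c", (char)c);	/* nothing to encode */
}

/* VIS_CSTYLE: NUL is \0 unless the next character is an octal digit. */
int
sketch_cstyle_nul(char *dst, int nextc)
{
	assert(dst != NULL);
	if (nextc >= '0' && nextc <= '7') {
		strcpy(dst, "\\000");	/* long form avoids ambiguity */
		return 4;
	}
	strcpy(dst, "\\0");
	return 2;
}

/* VIS_OCTAL: backslash followed by three octal digits. */
int
sketch_octal(char *dst, unsigned char c)
{
	return sprintf(dst, "\\%03o", (unsigned int)c);
}

/* VIS_HTTPSTYLE: percent sign followed by two lower case hex digits. */
int
sketch_http(char *dst, unsigned char c)
{
	return sprintf(dst, "%%%02x", (unsigned int)c);
}

/* VIS_MIMESTYLE: equals sign followed by two upper case hex digits. */
int
sketch_mime(char *dst, unsigned char c)
{
	return sprintf(dst, "=%02X", (unsigned int)c);
}
```

     For example, sketch_default() turns byte 003 into ‘\^C’ and byte 0341
     into ‘\M-a’.  Which bytes are selected for encoding in the first place
     depends on the flags and the locale, which this sketch does not model.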

     There is one additional flag, VIS_NOSLASH, which inhibits the doubling of
     backslashes and the backslash before the default format (that is, control
     characters are represented by ‘^C’ and meta characters as ‘M-C’).  With
     this flag set, the encoding is ambiguous and non-invertible.
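
     The ambiguity can be demonstrated with a standalone sketch.  This is
     not libbsd code: sketch_encode() is a hypothetical, simplified encoder
     that handles only backslashes and 7-bit control characters in the
     default format, and copies every other byte through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only, NOT libbsd code.  Encodes len bytes of src into dst,
 * which must have room for 3 * len + 1 bytes.  When noslash is non-zero
 * the leading backslash and the backslash doubling are both dropped,
 * mirroring what this page says about VIS_NOSLASH.
 */
size_t
sketch_encode(char *dst, const char *src, size_t len, int noslash)
{
	char *p = dst;
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)src[i];

		if (c == '\\') {
			if (!noslash)
				*p++ = '\\';	/* double the backslash */
			*p++ = '\\';
		} else if ((c < 040 && c != '\t' && c != '\n') || c == 0177) {
			if (!noslash)
				*p++ = '\\';	/* introduce the sequence */
			*p++ = '^';
			*p++ = (c == 0177) ? '?' : (char)(c + '@');
		} else {
			*p++ = (char)c;		/* copied in unaltered */
		}
	}
	*p = '\0';
	return (size_t)(p - dst);
}
```

     With noslash set, the single byte 003 and the two-character string
     "^C" both encode to ‘^C’, so a decoder cannot tell them apart.
     Without it, they encode to ‘\^C’ and ‘^C’ respectively.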

MULTIBYTE CHARACTER SUPPORT

     These functions support multibyte character input.  The encoding
     conversion is influenced by the setting of the LC_CTYPE environment
     variable which defines the set of characters that can be copied without
     encoding.

     If VIS_NOLOCALE is set, processing is done assuming the C locale and
     overriding any other environment settings.

     When 8-bit data is present in the input, LC_CTYPE must be set to the
     correct locale or to the C locale.  If the locales of the data and the
     conversion are mismatched, multibyte character recognition may fail and
     encoding will be performed byte-by-byte instead.

     As noted above, dst must be at least four times the number of bytes
     processed from src.  But note that each multibyte character can be up
     to MB_LEN_MAX bytes, so in terms of multibyte characters, dst must be at
     least four times MB_LEN_MAX times the number of characters processed
     from src.
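
     The character-based bound can be sketched in the same way as the
     byte-based one.  The helper name sketch_vis_mb_bufsize() is
     hypothetical, and the extra byte for the trailing NUL is carried over
     from the sizing rule in DESCRIPTION as an assumption, since the
     paragraph above does not restate it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of libbsd: worst-case dst size for nchars
 * multibyte characters, or 0 if the computation would overflow size_t.
 */
size_t
sketch_vis_mb_bufsize(size_t nchars)
{
	const size_t per_char = 4 * (size_t)MB_LEN_MAX;

	assert(per_char >= 4);		/* MB_LEN_MAX is at least 1 */
	if (nchars > (SIZE_MAX - 1) / per_char)
		return 0;		/* request too large to represent */
	return per_char * nchars + 1;	/* plus one for the trailing NUL */
}
```

     When the input length is known in bytes rather than characters, the
     simpler bound of four times the byte count plus one is sufficient.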

ENVIRONMENT

     LC_CTYPE  Specify the locale of the input data.  Set to C if the input
               data locale is unknown.

ERRORS

     The functions nvis() and snvis() return NULL, and the functions
     strnvis(), strnvisx(), strsnvis(), and strsnvisx() return -1, when the
     destination buffer size dlen is too small to perform the conversion.  In
     that case errno is set to:

     [ENOSPC]  The destination buffer size is not large enough to perform the
               conversion.

SEE ALSO

     unvis(1), vis(1), glob(3), unvis(3bsd)

     T. Berners-Lee, Uniform Resource Locators (URL), RFC 1738.

     Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet
     Message Bodies, RFC 2045.

HISTORY

     The vis(), strvis(), and strvisx() functions first appeared in 4.4BSD.
     The svis(), strsvis(), and strsvisx() functions appeared in NetBSD 1.5.
     The buffer size limited versions of the functions (nvis(), strnvis(),
     strnvisx(), snvis(), strsnvis(), and strsnvisx()) appeared in NetBSD 6.0
     and FreeBSD 9.2.  Multibyte character support was added in NetBSD 7.0 and
     FreeBSD 9.2.

BSD                             April 22, 2017                             BSD