SNOBOL4IO(1)                    CSNOBOL4 Manual                   SNOBOL4IO(1)

snobol4io - SNOBOL4 file I/O

Macro SNOBOL4 originally depended on FORTRAN libraries, unit numbers
and FORMATs for input and output. CSNOBOL4 uses the C stdio(3) library
instead, but unit numbers (INTEGERs between 1 and 256) and record
lengths remain embedded in the Macro SNOBOL4 code.

I/O Associations
Output on a closed unit generates a fatal “Output error”; see
snobol4error(1).

The following variable/unit/file associations exist by default:

Variable    Unit    Association
INPUT       5       standard input (input)
OUTPUT      6       standard output (output)
TERMINAL    7       standard error (output)
TERMINAL    8       /dev/tty (input)

Named files
Input and output filenames can be supplied to the INPUT() and OUTPUT()
functions via an optional fourth argument.
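
As a minimal sketch of the fourth argument in use, the program below copies one file to another line by line. The file names in.txt and out.txt and the unit numbers 10 and 11 are assumptions chosen for illustration, and the error branches assume that INPUT() and OUTPUT() fail when the file cannot be opened.

```snobol4
* Copy in.txt to out.txt, one line per assignment.
* File names and unit numbers are hypothetical.
        INPUT(.IN, 10,, 'in.txt')               :F(NOIN)
        OUTPUT(.OUT, 11,, 'out.txt')            :F(NOOUT)
* Reading IN fails at end of file, which ends the loop.
LOOP    OUT = IN                                :S(LOOP)
        ENDFILE(10)
        ENDFILE(11)                             :(END)
NOIN    TERMINAL = 'cannot open in.txt'         :(END)
NOOUT   TERMINAL = 'cannot open out.txt'
END
```

The third argument is left empty here, so no I/O options are in effect.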

filename - (hyphen)
    is interpreted as stdin on INPUT() and stdout on OUTPUT().

sub-process I/O using PIPE and Pseudo-terminals
    If the filename begins with a single vertical bar (|), the
    remainder is used as a shell command whose stdin (in the case of
    OUTPUT()) or stdout (in the case of INPUT()) will be connected to
    the file variable via a pipe. If a pipe is opened by INPUT() in
    “update” mode, the connection will be bi-directional (on systems
    with socketpair and Unix-domain sockets). See below for how to
    associate a variable for I/O in both directions.

    If the filename begins with two vertical bars (||), the remainder is
    used as a shell command executed with stdin, stdout and stderr
    attached to the slave side of a pseudo-terminal (pty), if the
    system C library contains the forkpty(3) routine. Use of ptys is
    necessary when the program to be invoked cannot be run without a
    “terminal” for I/O. See below on how to properly associate the I/O
    variable.
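
The single-bar pipe form can be sketched as follows. The shell commands ls and sort -r, and the unit numbers, are assumptions chosen only for illustration; any command that writes to stdout or reads from stdin could take their place.

```snobol4
* Read the output of one command and feed it to another.
* "ls" and "sort -r" are illustrative commands only.
        INPUT(.LS, 10,, '|ls')                  :F(FAIL)
        OUTPUT(.SORT, 11,, '|sort -r')          :F(FAIL)
* Each line read from ls is written to the stdin of sort.
LOOP    SORT = LS                               :S(LOOP)
        ENDFILE(10)
* Closing the output unit lets sort see end of file.
        ENDFILE(11)                             :(END)
FAIL    TERMINAL = 'could not start command'
END
```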

magic paths /dev/stdin, /dev/stdout, and /dev/stderr
    /dev/stdin, /dev/stdout, and /dev/stderr refer to the current
    process's standard input, standard output and standard error I/O
    streams respectively, regardless of whether those special filenames
    exist on your system.

magic path /dev/fd/n
    /dev/fd/n uses fdopen(3) to open a new I/O stream associated with
    file descriptor number n, regardless of whether the special device
    entries exist.

magic paths /tcp/hostname/service, /udp/hostname/service
and /tls/hostname/service
    /tcp/hostname/service can be used to open a connection to a TCP
    server. /udp/hostname/service behaves similarly for UDP.
    /tls/hostname/service opens a TLS over TCP connection (NOTE! does
    not attempt to verify the certificate unless the "verify" option is
    used, and even then does not handle SNI or SAN). The path can be
    followed by a number of different slash-separated options:

    broadcast    Allow broadcast address (UDP only).
    dontroute    Enables routing bypass for outgoing messages.
    keepalive    Enables TCP connection keep alive messages.
    nodelay      Send TCP data without waiting.
    oobinline    Enables reception of out-of-band data in band.
    priv         Bind local port number under 1024 (if allowed).
    reuseaddr    Allow quick reuse of local addresses.
    verify       Attempt to verify server TLS certificate.
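
A minimal sketch of a network path with one trailing option is shown below. The host localhost and the service name daytime are assumptions (such a server must actually be running for the read to succeed), and the failure branch assumes that INPUT() fails when the connection cannot be made.

```snobol4
* Read one line from a TCP service, with keep-alive enabled.
* Host "localhost" and service "daytime" are hypothetical.
        INPUT(.NET, 10,, '/tcp/localhost/daytime/keepalive')  :F(NOCONN)
        OUTPUT = NET                                          :F(EMPTY)
        ENDFILE(10)                                           :(END)
NOCONN  TERMINAL = 'connection failed'                        :(END)
EMPTY   TERMINAL = 'no data received'
END
```

Further options would be appended the same way, each separated by a slash.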

magic pathname /dev/tmpfile
    /dev/tmpfile opens an anonymous temporary file for reading and
    writing, see tmpfile(3).

/dev/null and /dev/tty
    On non-POSIX systems /dev/null and /dev/tty are magical, and refer
    to the null device, and the user's terminal/console, respectively.

I/O Options
Originally the third argument specified record length for INPUT(), or a
FORTRAN FORMAT for OUTPUT().

CSNOBOL4 interprets it as a string of single-letter options; commas are
ignored. Some options affect only the I/O variable named in the first
argument; others affect any variable associated with the unit number in
the second argument.

digits
    A span of digits will set the input record length for the named I/O
    variable. This controls the maximum string that will be returned
    for regular text I/O, and the number of bytes returned for binary
    I/O. Record length is per-variable association; multiple variables
    may be associated with the same unit, but with different record
    lengths. The default record length for input is 1024. Lines longer
    than the record length will be silently truncated; since CSNOBOL4
    2.2, however, record length is only honored for binary I/O, and
    in text mode all characters up to a newline (ASCII Line Feed) are
    interpreted as a single line.

A   For OUTPUT() the unit will be opened for append access (ignored
    by INPUT()). All writes will occur at the end of the file at the
    time of the write, regardless of the file position before the
    write.

B   The unit will be opened for binary access. On input, newline
    characters have no special meaning; the number of bytes transferred
    depends on record length (see above). On output, no newline is
    appended.

    For terminal devices, all input from this unit will be done without
    special processing for line editing or EOF; the number of
    characters returned depends on the record length. Characters which
    deliver signals (including interrupt, kill, and suspend) are still
    processed. Units (with different fds) opened on the same terminal
    device operate independently; some can use binary mode, while
    others operate in text mode.

C   Character at a time I/O. A synonym for B,1.

E   Set the "close on exec" flag for the underlying file descriptor.
    Depends on support by the C library fopen(3) call for 'e' in the
    mode string for regular files. Honored for sockets regardless
    (but not on Windows).

J   Read and write compressed data in .xz format, using liblzma, as
    written by xz(1). If a digit 0 through 9 immediately follows the
    option, it will be interpreted as the compression level to use when
    writing. It's claimed that level zero is "sometimes faster than
    gzip -9 while compressing much better". The default compression
    level is 6; larger numbers will require more than 16 MiB of memory
    to decompress, and are useful only when compressing files
    bigger than 8 MiB (level 7), 16 MiB (level 8), and 32 MiB (level
    9). Matches the tar(1) command line option. Added in CSNOBOL4
    2.2.

j   Read and write compressed data in .bz2 format, using libbz2, as
    created by bzip2(1). If a digit 1 through 9 immediately follows
    the option, it will be interpreted as the compression level to use
    when writing. Matches the tar(1) command line option. Added in
    CSNOBOL4 2.2.

K   If an input line is longer than the input record length, return the
    line in multiple reads (breaK up the line) instead of discarding
    the extra characters. Added in CSNOBOL4 2.0. Obsolete in CSNOBOL4
    2.2.

T   Terminal mode. Writes are performed “unbuffered” (see below), and
    no newline characters are added. On input, newline characters are
    returned. Terminal mode affects only the referenced unit, and does
    not require opening a new file descriptor (i.e., by using a magic
    pathname): OUTPUT(.TT, 8, "T", "-"). Terminal mode is useful for
    outputting prompts in interactive programs.
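
The prompt use case for terminal mode might look like the sketch below. The variable names PROMPT and NAME are assumptions; the unit is taken from IO_FINDUNIT() rather than hard-coding a number that may already be in use.

```snobol4
* Print a prompt on standard output with no trailing newline,
* then read the reply from standard input.
        UNIT = IO_FINDUNIT()
        OUTPUT(.PROMPT, UNIT, 'T', '-')         :F(END)
        PROMPT = 'Name: '
        NAME = INPUT                            :F(END)
        OUTPUT = 'Hello, ' NAME
END
```

Because terminal mode writes are unbuffered, the prompt text should reach the terminal before the program blocks on input.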

Q   Quiet mode. Turns off input echo on terminals. Affects only input
    on this file descriptor.

U   Update mode. The unit is opened for both input and output.
    Example of associating a variable for I/O in both directions:
            unit = IO_FINDUNIT()
            INPUT(.name, unit, 'U', 'filepath')
            OUTPUT(.name, unit)

    This is useful when filepath is /dev/fd/n, where n is a file
    descriptor number returned by SERV_LISTEN(), or when filepath
    specifies a pipe (|command) or pseudo-terminal (||command).

    The above sequence is also useful when combined with fixed
    record length, binary mode and the SET() function for I/O to
    preexisting files. Performing OUTPUT() first will create a regular
    file if it does not exist, but will also truncate a preexisting
    file!

W   Unbuffered mode. Each output variable assignment causes an
    immediate I/O transfer to occur by direct read(2) or write(2)
    system calls, rather than collecting the data in a buffer for
    efficiency.

X   Open fails if the file exists (meaningless for /dev/fd/n). Depends
    on support by the C library fopen(3) call for 'x' in the mode
    string. Added in CSNOBOL4 2.1, where it was ignored for sockets. In
    CSNOBOL4 2.2 it applies to sockets, and means don't allow local
    socket address reuse.

Z   Reserved for .Z (compress(1)) style compression?!

z   Read and write compressed data in .gz format using zlib(3), as
    created by gzip(1). If a digit 0 through 9 immediately follows the
    option, it will be interpreted as the compression level to use when
    writing. Matches the tar(1) command line option. Added in
    CSNOBOL4 2.2.
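
As a sketch of how an option letter and a trailing digit combine, the program below writes a gzip-compressed file at level 9 and reads it back. It assumes CSNOBOL4 2.2 or later built with zlib; the file name demo.gz and the unit numbers are hypothetical.

```snobol4
* Write two lines to a .gz file at compression level 9.
        OUTPUT(.ZOUT, 10, 'z9', 'demo.gz')      :F(FAIL)
        ZOUT = 'first line'
        ZOUT = 'second line'
        ENDFILE(10)
* Read the file back; no level digit is needed on input.
        INPUT(.ZIN, 11, 'z', 'demo.gz')         :F(FAIL)
LOOP    OUTPUT = ZIN                            :S(LOOP)
        ENDFILE(11)                             :(END)
FAIL    TERMINAL = 'open failed'
END
```

The J and j options would be used the same way, with their own level ranges as described above.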

Other I/O extensions
SERV_LISTEN(), SET(), SSET()
    see snobol4func(1).

I/O Layers
The Macro SNOBOL4 and POSIX I/O architectures have subtleties which
interact, and are explained here:

Variable association
    Input and output are done by reading or writing variables associated
    with a unit number for I/O.

    Input (maximum) record lengths are associated with each variable
    association!

Unit number
    Multiple variables can be associated with the same unit number
    using the INPUT() and OUTPUT() functions.

    Each unit number refers to a stdio(3) stream (except on broken
    systems like Windows, where socket handles are incompatible with
    file handles, and all network I/O is performed “unbuffered”).

    Sequential named files can be associated with an I/O unit when the
    -r option is given on the command line! REWIND() should return to
    after the program END label!
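
The relationship between variables, units and record lengths can be sketched as follows. The file data.bin and its layout (a 4-byte header followed by 16-byte records) are hypothetical. The sketch also assumes that binary mode belongs to the unit, so the second association inherits it, and that a second INPUT() call without a filename attaches another variable to the unit that is already open.

```snobol4
* Two variables on one unit, each with its own record length.
* data.bin is hypothetical: 4-byte header, then 16-byte records.
        INPUT(.HDR, 10, 'B,4', 'data.bin')      :F(FAIL)
        INPUT(.REC, 10, '16')
* Reading HDR transfers 4 bytes; reading REC transfers 16.
        MAGIC = HDR                             :F(FAIL)
        N = 0
NEXT    R = REC                                 :F(DONE)
        N = N + 1                               :(NEXT)
DONE    OUTPUT = 'header ' SIZE(MAGIC) ' bytes, ' N ' records'
        ENDFILE(10)                             :(END)
FAIL    TERMINAL = 'cannot read data.bin'
END
```

Both variables share the unit's single file position, so each read continues where the previous one stopped.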

“Standard I/O” Stream
    snobol4(1) performs MOST I/O through “Standard Input/Output”
    streams. Multiple units can be associated with the same stdio
    stream (FILE struct) using magic pathnames (“-” and
    /dev/std{in,out,err}). Buffering is performed by the stdio layer.

Operating System file descriptor
    More than one stdio stream can be associated with the same O/S “fd”
    (by opening magic pathname “/dev/fd/n”).

    Each POSIX “fd” has a file position pointer, changed by reading,
    writing and the REWIND(), SET() and SSET() functions.

    Normally terminal device “special files” have one set of mode
    settings, but CSNOBOL4 associates (saves and restores) different
    terminal settings (echo and the number of characters returned on
    read) based on fd numbers.

Operating System open file object
    More than one “fd” slot can be associated with the same “open file”
    object, either in multiple forks, or by dup(2) of the same fd.
    This is often the case for stdin, stdout and stderr.

    Open file objects have flags which affect all associated fds,
    including input, output and append modes.

Operating System named file
    Independent opens of the same named “regular” file will have
    different open file objects, and thus have independent access modes
    and file positions.

    Terminal devices normally have one set of “line discipline” mode
    settings, but CSNOBOL4 maintains different settings for each file
    descriptor (see above).

This page was cut and pasted from various parts of the original
snobol4(1) man page, and still needs review and cleanup.

All extensions should be annotated with the version they appeared in
(and what other implementations they're compatible with or inspired by).

Record lengths.

Unit numbers.

snobol4(1), snobol4ezio(3)

CSNOBOL4B 2.3.1                  March 31, 2022                   SNOBOL4IO(1)