PERLIOL(1)             Perl Programmers Reference Guide             PERLIOL(1)

NAME
perliol - C API for Perl's implementation of IO in Layers.

SYNOPSIS
    /* Defining a layer ... */
    #include <perliol.h>

DESCRIPTION
This document describes the behavior and implementation of the PerlIO
abstraction described in perlapio when "USE_PERLIO" is defined.
15
16 History and Background
17 The PerlIO abstraction was introduced in perl5.003_02 but languished as
18 just an abstraction until perl5.7.0. However during that time a number
19 of perl extensions switched to using it, so the API is mostly fixed to
20 maintain (source) compatibility.
21
22 The aim of the implementation is to provide the PerlIO API in a
23 flexible and platform neutral manner. It is also a trial of an "Object
24 Oriented C, with vtables" approach which may be applied to Raku.
25
26 Basic Structure
27 PerlIO is a stack of layers.
28
The low levels of the stack work with the low-level operating system
calls (file descriptors in C), getting bytes in and out; the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl. The terms "above" and "below"
are used to refer to the relative positioning of the stack layers.
34
35 A layer contains a "vtable", the table of I/O operations (at C level a
36 table of function pointers), and status flags. The functions in the
37 vtable implement operations like "open", "read", and "write".
38
39 When I/O, for example "read", is requested, the request goes from Perl
40 first down the stack using "read" functions of each layer, then at the
41 bottom the input is requested from the operating system services, then
42 the result is returned up the stack, finally being interpreted as Perl
43 data.
44
45 The requests do not necessarily go always all the way down to the
46 operating system: that's where PerlIO buffering comes into play.
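
The flow described above can be sketched with a toy two-layer stack.
This is only a simplified model, not the real PerlIO code: "Layer",
"Source" and "Buffered" are hypothetical types invented for the
illustration, and the bottom layer reads from a string instead of
making system calls.

```c
#include <stddef.h>
#include <string.h>

typedef struct Layer Layer;
struct Layer {
    Layer  *next;                                   /* layer below, or NULL */
    size_t (*read)(Layer *self, char *dst, size_t n);
};

/* Bottom layer: stands in for the operating system.  It serves bytes
 * from a string and counts how often it is asked. */
typedef struct {
    Layer       base;
    const char *data;
    size_t      left;
    int         calls;
} Source;

static size_t source_read(Layer *self, char *dst, size_t n)
{
    Source *s = (Source *)self;
    if (n > s->left)
        n = s->left;
    memcpy(dst, s->data, n);
    s->data += n;
    s->left -= n;
    s->calls++;
    return n;
}

/* Buffering layer: goes to the layer below only when its buffer is empty. */
typedef struct {
    Layer  base;
    char   buf[8];
    size_t pos, len;
} Buffered;

static size_t buffered_read(Layer *self, char *dst, size_t n)
{
    Buffered *b = (Buffered *)self;
    size_t done = 0;
    while (done < n) {
        if (b->pos == b->len) {                     /* empty: ask below */
            b->len = self->next->read(self->next, b->buf, sizeof b->buf);
            b->pos = 0;
            if (b->len == 0)
                break;                              /* end of input */
        }
        dst[done++] = b->buf[b->pos++];
    }
    return done;
}
```

In this model a one-byte read on the top layer triggers a single
eight-byte request to the bottom layer, and the following reads are
served from the buffer without going further down.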
47
48 When you do an open() and specify extra PerlIO layers to be deployed,
49 the layers you specify are "pushed" on top of the already existing
50 default stack. One way to see it is that "operating system is on the
51 left" and "Perl is on the right".
52
53 What exact layers are in this default stack depends on a lot of things:
54 your operating system, Perl version, Perl compile time configuration,
55 and Perl runtime configuration. See PerlIO, "PERLIO" in perlrun, and
56 open for more information.
57
58 binmode() operates similarly to open(): by default the specified layers
59 are pushed on top of the existing stack.
60
61 However, note that even as the specified layers are "pushed on top" for
62 open() and binmode(), this doesn't mean that the effects are limited to
63 the "top": PerlIO layers can be very 'active' and inspect and affect
64 layers also deeper in the stack. As an example there is a layer called
65 "raw" which repeatedly "pops" layers until it reaches the first layer
66 that has declared itself capable of handling binary data. The "pushed"
67 layers are processed in left-to-right order.
68
69 sysopen() operates (unsurprisingly) at a lower level in the stack than
70 open(). For example in Unix or Unix-like systems sysopen() operates
71 directly at the level of file descriptors: in the terms of PerlIO
72 layers, it uses only the "unix" layer, which is a rather thin wrapper
73 on top of the Unix file descriptors.
74
75 Layers vs Disciplines
Initial discussion of the ability to modify IO streams' behaviour used
the term "discipline" for the entities which were added. This came (I
78 believe) from the use of the term in "sfio", which in turn borrowed it
79 from "line disciplines" on Unix terminals. However, this document (and
80 the C code) uses the term "layer".
81
82 This is, I hope, a natural term given the implementation, and should
83 avoid connotations that are inherent in earlier uses of "discipline"
84 for things which are rather different.
85
86 Data Structures
87 The basic data structure is a PerlIOl:
88
    typedef struct _PerlIO PerlIOl;
    typedef struct _PerlIO_funcs PerlIO_funcs;
    typedef PerlIOl *PerlIO;

    struct _PerlIO
    {
        PerlIOl *      next;   /* Lower layer */
        PerlIO_funcs * tab;    /* Functions for this layer */
        U32            flags;  /* Various flags for state */
    };
99
100 A "PerlIOl *" is a pointer to the struct, and the application level
101 "PerlIO *" is a pointer to a "PerlIOl *" - i.e. a pointer to a pointer
102 to the struct. This allows the application level "PerlIO *" to remain
103 constant while the actual "PerlIOl *" underneath changes. (Compare
104 perl's "SV *" which remains constant while its "sv_any" field changes
105 as the scalar's type changes.) An IO stream is then in general
106 represented as a pointer to this linked-list of "layers".
107
108 It should be noted that because of the double indirection in a "PerlIO
109 *", a "&(perlio->next)" "is" a "PerlIO *", and so to some degree at
110 least one layer can use the "standard" API on the next layer down.
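
The double indirection can be illustrated with a stripped-down model.
The sketch below is not perliol.h: "Layer", "Handle", "push_layer" and
"next_handle" are hypothetical stand-ins for "PerlIOl", "PerlIO",
"PerlIO_push" and "PerlIONext", assumed only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for PerlIOl: just the link and a name. */
typedef struct Layer Layer;
struct Layer {
    Layer      *next;   /* lower layer */
    const char *name;   /* stands in for tab->name */
};

/* Hypothetical stand-in for PerlIO: a table slot holding the top layer. */
typedef Layer *Handle;

/* Push a new layer on top of the stack that *h refers to.  The caller's
 * Handle* stays the same; only the content of the slot changes. */
static int push_layer(Handle *h, const char *name)
{
    Layer *l = malloc(sizeof *l);
    if (!l)
        return -1;
    l->name = name;
    l->next = *h;
    *h = l;
    return 0;
}

/* A Handle* is a pointer to a Layer*, so the address of a layer's
 * "next" field is itself a usable Handle* for the stack below it. */
static Handle *next_handle(Handle *h)
{
    return &(*h)->next;
}

/* Pop and free every layer reachable from h. */
static void free_layers(Handle *h)
{
    while (*h) {
        Layer *top = *h;
        *h = top->next;
        free(top);
    }
}
```

After two pushes the application still holds the same "Handle *", while
the slot it points at now refers to the newest layer.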
111
112 A "layer" is composed of two parts:
113
114 1. The functions and attributes of the "layer class".
115
116 2. The per-instance data for a particular handle.
117
118 Functions and Attributes
119 The functions and attributes are accessed via the "tab" (for table)
120 member of "PerlIOl". The functions (methods of the layer "class") are
121 fixed, and are defined by the "PerlIO_funcs" type. They are broadly the
122 same as the public "PerlIO_xxxxx" functions:
123
    struct _PerlIO_funcs
    {
        Size_t      fsize;
        char *      name;
        Size_t      size;
        IV          kind;
        IV          (*Pushed)(pTHX_ PerlIO *f,
                              const char *mode,
                              SV *arg,
                              PerlIO_funcs *tab);
        IV          (*Popped)(pTHX_ PerlIO *f);
        PerlIO *    (*Open)(pTHX_ PerlIO_funcs *tab,
                            PerlIO_list_t *layers, IV n,
                            const char *mode,
                            int fd, int imode, int perm,
                            PerlIO *old,
                            int narg, SV **args);
        IV          (*Binmode)(pTHX_ PerlIO *f);
        SV *        (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
        IV          (*Fileno)(pTHX_ PerlIO *f);
        PerlIO *    (*Dup)(pTHX_ PerlIO *f,
                           PerlIO *o,
                           CLONE_PARAMS *param,
                           int flags);
        /* Unix-like functions - cf sfio line disciplines */
        SSize_t     (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
        SSize_t     (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        SSize_t     (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        IV          (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
        Off_t       (*Tell)(pTHX_ PerlIO *f);
        IV          (*Close)(pTHX_ PerlIO *f);
        /* Stdio-like buffered IO functions */
        IV          (*Flush)(pTHX_ PerlIO *f);
        IV          (*Fill)(pTHX_ PerlIO *f);
        IV          (*Eof)(pTHX_ PerlIO *f);
        IV          (*Error)(pTHX_ PerlIO *f);
        void        (*Clearerr)(pTHX_ PerlIO *f);
        void        (*Setlinebuf)(pTHX_ PerlIO *f);
        /* Perl's snooping functions */
        STDCHAR *   (*Get_base)(pTHX_ PerlIO *f);
        Size_t      (*Get_bufsiz)(pTHX_ PerlIO *f);
        STDCHAR *   (*Get_ptr)(pTHX_ PerlIO *f);
        SSize_t     (*Get_cnt)(pTHX_ PerlIO *f);
        void        (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
    };
169
The first few members of the struct give a function table size for
compatibility check, a "name" for the layer, the size to "malloc" for
the per-instance data, and some flags which are attributes of the class
as a whole (such as whether it is a buffering layer). Then follow the
functions, which fall into four basic groups:
175
176 1. Opening and setup functions
177
178 2. Basic IO operations
179
180 3. Stdio class buffering options.
181
182 4. Functions to support Perl's traditional "fast" access to the
183 buffer.
184
185 A layer does not have to implement all the functions, but the whole
186 table has to be present. Unimplemented slots can be NULL (which will
187 result in an error when called) or can be filled in with stubs to
188 "inherit" behaviour from a "base class". This "inheritance" is fixed
189 for all instances of the layer, but as the layer chooses which stubs to
190 populate the table, limited "multiple inheritance" is possible.
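
As a rough illustration of this "inheritance by table", the sketch
below builds three tiny function tables. It is a model only: "Funcs",
"Stream" and every function name in it are hypothetical, and the real
"PerlIO_funcs" has many more slots.

```c
#include <stddef.h>

typedef struct Stream Stream;

/* Hypothetical two-slot function table. */
typedef struct {
    const char *name;
    long (*Tell)(Stream *s);
    int  (*Flush)(Stream *s);
} Funcs;

struct Stream {
    const Funcs *tab;
    long         pos;
    int          flushes;
};

/* "Base class" implementations that any layer may reuse. */
static long base_tell(Stream *s)  { return s->pos; }
static int  base_flush(Stream *s) { s->flushes++; return 0; }

/* A derived behaviour: report the position in 512-byte blocks. */
static long block_tell(Stream *s) { return s->pos / 512; }

/* Inherits everything from the base class. */
static const Funcs plain_layer   = { "plain",   base_tell,  base_flush };
/* Overrides Tell but inherits Flush. */
static const Funcs block_layer   = { "block",   block_tell, base_flush };
/* Leaves Flush unimplemented: the slot is present but NULL. */
static const Funcs noflush_layer = { "noflush", base_tell,  NULL };

/* Call the Flush slot, treating a NULL slot as an error. */
static int stream_flush(Stream *s)
{
    return s->tab->Flush ? s->tab->Flush(s) : -1;
}
```

The choice of stub is made once per table, so every instance of a
given layer shares the same behaviour.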
191
192 Per-instance Data
193 The per-instance data are held in memory beyond the basic PerlIOl
194 struct, by making a PerlIOl the first member of the layer's struct
195 thus:
196
    typedef struct
    {
        struct _PerlIO base;       /* Base "class" info */
        STDCHAR *      buf;        /* Start of buffer */
        STDCHAR *      end;        /* End of valid part of buffer */
        STDCHAR *      ptr;        /* Current position in buffer */
        Off_t          posn;       /* Offset of buf into the file */
        Size_t         bufsiz;     /* Real size of buffer */
        IV             oneword;    /* Emergency buffer */
    } PerlIOBuf;
207
208 In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
209 treated as a pointer to a PerlIOl.
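
The same technique in miniature, as a sketch with hypothetical types
("Base" standing in for "PerlIOl", "BufLayer" for a layer such as
"PerlIOBuf"):

```c
#include <stddef.h>

/* Hypothetical base struct standing in for PerlIOl. */
typedef struct Base {
    struct Base *next;
    unsigned     flags;
} Base;

/* Hypothetical derived layer: the base comes first, private data after. */
typedef struct {
    Base   base;
    char  *buf;
    size_t bufsiz;
} BufLayer;

/* Generic code sees only the base part. */
static void set_flag(Base *l, unsigned bit)
{
    l->flags |= bit;
}

/* Layer code recovers its own struct from the base pointer.  This is
 * valid only because "base" is the first member. */
static BufLayer *buf_self(Base *l)
{
    return (BufLayer *)l;
}
```

C guarantees that a pointer to a struct, suitably converted, points to
its first member and vice versa, which is what makes the cast in
"buf_self" safe.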
210
211 Layers in action.
             table           perlio          unix
         |           |
         +-----------+    +----------+    +--------+
PerlIO ->|           |--->|  next    |--->|  NULL  |
         +-----------+    +----------+    +--------+
         |           |    |  buffer  |    |   fd   |
         +-----------+    |          |    +--------+
         |           |    +----------+
220
221 The above attempts to show how the layer scheme works in a simple case.
222 The application's "PerlIO *" points to an entry in the table(s)
223 representing open (allocated) handles. For example the first three
224 slots in the table correspond to "stdin","stdout" and "stderr". The
225 table in turn points to the current "top" layer for the handle - in
226 this case an instance of the generic buffering layer "perlio". That
227 layer in turn points to the next layer down - in this case the low-
228 level "unix" layer.
229
230 The above is roughly equivalent to a "stdio" buffered stream, but with
231 much more flexibility:
232
233 • If Unix level "read"/"write"/"lseek" is not appropriate for (say)
234 sockets then the "unix" layer can be replaced (at open time or even
235 dynamically) with a "socket" layer.
236
237 • Different handles can have different buffering schemes. The "top"
238 layer could be the "mmap" layer if reading disk files was quicker
239 using "mmap" than "read". An "unbuffered" stream can be implemented
240 simply by not having a buffer layer.
241
242 • Extra layers can be inserted to process the data as it flows
243 through. This was the driving need for including the scheme in
244 perl 5.7.0+ - we needed a mechanism to allow data to be translated
245 between perl's internal encoding (conceptually at least Unicode as
246 UTF-8), and the "native" format used by the system. This is
247 provided by the ":encoding(xxxx)" layer which typically sits above
248 the buffering layer.
249
250 • A layer can be added that does "\n" to CRLF translation. This layer
251 can be used on any platform, not just those that normally do such
252 things.
253
254 Per-instance flag bits
255 The generic flag bits are a hybrid of "O_XXXXX" style flags deduced
256 from the mode string passed to "PerlIO_open()", and state bits for
257 typical buffer layers.
258
259 PERLIO_F_EOF
260 End of file.
261
262 PERLIO_F_CANWRITE
263 Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.
264
265 PERLIO_F_CANREAD
266 Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick).
267
268 PERLIO_F_ERROR
269 An error has occurred (for "PerlIO_error()").
270
271 PERLIO_F_TRUNCATE
272 Truncate file suggested by open mode.
273
274 PERLIO_F_APPEND
275 All writes should be appends.
276
277 PERLIO_F_CRLF
278 Layer is performing Win32-like "\n" mapped to CR,LF for output and
279 CR,LF mapped to "\n" for input. Normally the provided "crlf" layer
280 is the only layer that need bother about this. "PerlIO_binmode()"
281 will mess with this flag rather than add/remove layers if the
282 "PERLIO_K_CANCRLF" bit is set for the layers class.
283
284 PERLIO_F_UTF8
285 Data written to this layer should be UTF-8 encoded; data provided
286 by this layer should be considered UTF-8 encoded. Can be set on any
287 layer by ":utf8" dummy layer. Also set on ":encoding" layer.
288
289 PERLIO_F_UNBUF
290 Layer is unbuffered - i.e. write to next layer down should occur
291 for each write to this layer.
292
293 PERLIO_F_WRBUF
294 The buffer for this layer currently holds data written to it but
295 not sent to next layer.
296
297 PERLIO_F_RDBUF
298 The buffer for this layer currently holds unconsumed data read from
299 layer below.
300
301 PERLIO_F_LINEBUF
302 Layer is line buffered. Write data should be passed to next layer
303 down whenever a "\n" is seen. Any data beyond the "\n" should then
304 be processed.
305
306 PERLIO_F_TEMP
307 File has been "unlink()"ed, or should be deleted on "close()".
308
309 PERLIO_F_OPEN
310 Handle is open.
311
PERLIO_F_FASTGETS
This instance of this layer supports the "fast gets" interface.
Normally set based on "PERLIO_K_FASTGETS" for the class and by the
existence of the function(s) in the table. However a class that
normally provides that interface may need to avoid it on a
particular instance. The "pending" layer needs to do this when it
is pushed above a layer which does not support the interface.
(Perl's "sv_gets()" does not expect the stream's "fast gets"
behaviour to change during one "get".)
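
How the open-mode part of these flags might be deduced from a mode
string can be sketched as follows. This is an illustration only: the
"F_*" bit values and "mode_to_flags" are hypothetical, the real
"PERLIO_F_*" constants live in perliol.h, and the prefixes and b/t
suffixes that a real mode string may carry are ignored here.

```c
/* Hypothetical bit values for the illustration. */
enum {
    F_CANWRITE = 0x01,
    F_CANREAD  = 0x02,
    F_TRUNCATE = 0x04,
    F_APPEND   = 0x08
};

/* Derive flag bits from an fopen()-style mode such as "r", "w+" or "a".
 * Returns -1 if the mode does not start with r, w or a. */
static int mode_to_flags(const char *mode)
{
    int flags;

    switch (*mode++) {
    case 'r': flags = F_CANREAD;               break;
    case 'w': flags = F_CANWRITE | F_TRUNCATE; break;
    case 'a': flags = F_CANWRITE | F_APPEND;   break;
    default:  return -1;
    }
    /* A '+' in the rest of the mode permits both directions. */
    for (; *mode; mode++) {
        if (*mode == '+')
            flags |= F_CANREAD | F_CANWRITE;
    }
    return flags;
}
```

With these definitions "a+" yields the append, write and read bits
together, matching the descriptions of PERLIO_F_CANWRITE and
PERLIO_F_CANREAD above.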
321
322 Methods in Detail
323 fsize
324 Size_t fsize;
325
326 Size of the function table. This is compared against the value
327 PerlIO code "knows" as a compatibility check. Future versions may
328 be able to tolerate layers compiled against an old version of the
329 headers.
330
331 name
332 char * name;
333
334 The name of the layer whose open() method Perl should invoke on
335 open(). For example if the layer is called APR, you will call:
336
337 open $fh, ">:APR", ...
338
339 and Perl knows that it has to invoke the PerlIOAPR_open() method
340 implemented by the APR layer.
341
342 size
343 Size_t size;
344
345 The size of the per-instance data structure, e.g.:
346
347 sizeof(PerlIOAPR)
348
If this field is zero then "PerlIO_push" does not malloc anything
and assumes the layer's Pushed function will do any required layer
stack manipulation - used to avoid malloc/free overhead for dummy
layers. If the field is non-zero it must be at least the size of
"PerlIOl"; "PerlIO_push" will then allocate memory for the layer's
data structures and link the new layer onto the stream's stack. (If
the layer's Pushed method returns an error indication the layer is
popped again.)
357
358 kind
359 IV kind;
360
361 • PERLIO_K_BUFFERED
362
363 The layer is buffered.
364
365 • PERLIO_K_RAW
366
367 The layer is acceptable to have in a binmode(FH) stack - i.e.
368 it does not (or will configure itself not to) transform bytes
369 passing through it.
370
371 • PERLIO_K_CANCRLF
372
373 Layer can translate between "\n" and CRLF line ends.
374
375 • PERLIO_K_FASTGETS
376
377 Layer allows buffer snooping.
378
379 • PERLIO_K_MULTIARG
380
Used when the layer's open() accepts more arguments than usual.
The extra arguments should not come before the "MODE" argument.
When this flag is used it's up to the layer to validate the
args.
385
386 Pushed
    IV (*Pushed)(pTHX_ PerlIO *f, const char *mode,
                 SV *arg, PerlIO_funcs *tab);
388
389 The only absolutely mandatory method. Called when the layer is
390 pushed onto the stack. The "mode" argument may be NULL if this
391 occurs post-open. The "arg" will be non-"NULL" if an argument
392 string was passed. In most cases this should call
393 "PerlIOBase_pushed()" to convert "mode" into the appropriate
394 "PERLIO_F_XXXXX" flags in addition to any actions the layer itself
395 takes. If a layer is not expecting an argument it need neither
396 save the one passed to it, nor provide "Getarg()" (it could perhaps
397 "Perl_warn" that the argument was un-expected).
398
399 Returns 0 on success. On failure returns -1 and should set errno.
400
401 Popped
402 IV (*Popped)(pTHX_ PerlIO *f);
403
404 Called when the layer is popped from the stack. A layer will
405 normally be popped after "Close()" is called. But a layer can be
406 popped without being closed if the program is dynamically managing
407 layers on the stream. In such cases "Popped()" should free any
408 resources (buffers, translation tables, ...) not held directly in
409 the layer's struct. It should also "Unread()" any unconsumed data
410 that has been read and buffered from the layer below back to that
411 layer, so that it can be re-provided to what ever is now above.
412
Should normally return 0 (false), on success and on failure alike.
If "Popped()" returns true then perlio.c assumes that either the
layer has popped itself, or the layer is super special and needs to
be retained for other reasons.
417
418 Open
419 PerlIO * (*Open)(...);
420
421 The "Open()" method has lots of arguments because it combines the
422 functions of perl's "open", "PerlIO_open", perl's "sysopen",
423 "PerlIO_fdopen" and "PerlIO_reopen". The full prototype is as
424 follows:
425
    PerlIO *    (*Open)(pTHX_ PerlIO_funcs *tab,
                        PerlIO_list_t *layers, IV n,
                        const char *mode,
                        int fd, int imode, int perm,
                        PerlIO *old,
                        int narg, SV **args);
432
433 Open should (perhaps indirectly) call "PerlIO_allocate()" to
434 allocate a slot in the table and associate it with the layers
435 information for the opened file, by calling "PerlIO_push". The
436 layers is an array of all the layers destined for the "PerlIO *",
437 and any arguments passed to them, n is the index into that array of
438 the layer being called. The macro "PerlIOArg" will return a
439 (possibly "NULL") SV * for the argument passed to the layer.
440
441 Where a layer opens or takes ownership of a file descriptor, that
442 layer is responsible for getting the file descriptor's close-on-
443 exec flag into the correct state. The flag should be clear for a
444 file descriptor numbered less than or equal to "PL_maxsysfd", and
445 set for any file descriptor numbered higher. For thread safety,
446 when a layer opens a new file descriptor it should if possible open
447 it with the close-on-exec flag initially set.
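
A minimal sketch of that rule using POSIX "fcntl()" follows. The
function name "fix_cloexec" and its "max_sys_fd" parameter (standing in
for "PL_maxsysfd") are assumptions made for the example, not part of
the PerlIO API.

```c
#include <fcntl.h>

/* Put fd's close-on-exec flag into the state described above: clear
 * for descriptors up to max_sys_fd, set for anything higher.
 * Returns 0 on success, -1 on failure (errno is set by fcntl). */
static int fix_cloexec(int fd, int max_sys_fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (fd <= max_sys_fd)
        flags &= ~FD_CLOEXEC;
    else
        flags |= FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}
```

Adjusting the flag after the fact leaves a window in which another
thread could fork and exec, which is why opening the descriptor with
the flag already set is preferred where the platform allows it.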
448
449 The mode string is an ""fopen()"-like" string which would match the
450 regular expression "/^[I#]?[rwa]\+?[bt]?$/".
451
452 The 'I' prefix is used during creation of "stdin".."stderr" via
453 special "PerlIO_fdopen" calls; the '#' prefix means that this is
454 "sysopen" and that imode and perm should be passed to
455 "PerlLIO_open3"; 'r' means read, 'w' means write and 'a' means
456 append. The '+' suffix means that both reading and
457 writing/appending are permitted. The 'b' suffix means file should
458 be binary, and 't' means it is text. (Almost all layers should do
459 the IO in binary mode, and ignore the b/t bits. The ":crlf" layer
460 should be pushed to handle the distinction.)
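
The mode grammar can be checked without a regular expression engine.
The helper below is a hypothetical sketch ("mode_is_valid" is not a
PerlIO function) that accepts the strings matched by the pattern above;
unlike Perl's "$", it does not tolerate a trailing newline.

```c
#include <string.h>

/* Return 1 if mode has the form [I#]? [rwa] \+? [bt]?, else 0. */
static int mode_is_valid(const char *mode)
{
    if (*mode == 'I' || *mode == '#')
        mode++;                         /* optional prefix */
    if (*mode == '\0' || !strchr("rwa", *mode))
        return 0;                       /* mandatory r, w or a */
    mode++;
    if (*mode == '+')
        mode++;                         /* optional update suffix */
    if (*mode == 'b' || *mode == 't')
        mode++;                         /* optional binary/text suffix */
    return *mode == '\0';               /* nothing may follow */
}
```

For example "#w+b" and "Ia" are accepted, while "rw" and "r+bt" are
rejected.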
461
462 If old is not "NULL" then this is a "PerlIO_reopen". Perl itself
463 does not use this (yet?) and semantics are a little vague.
464
If fd is not negative then it is the numeric file descriptor fd,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to "PerlIO_fdopen". In this
case narg will be zero. The file descriptor may have the
close-on-exec flag either set or clear; it is the responsibility of
the layer that takes ownership of it to get the flag into the
correct state.

If narg is greater than zero then it gives the number of arguments
passed to "open"; it will be 1 if, for example, "PerlIO_open" was
called. In simple cases SvPV_nolen(*args) is the pathname to open.
476
477 If a layer provides "Open()" it should normally call the "Open()"
478 method of next layer down (if any) and then push itself on top if
479 that succeeds. "PerlIOBase_open" is provided to do exactly that,
480 so in most cases you don't have to write your own "Open()" method.
481 If this method is not defined, other layers may have difficulty
482 pushing themselves on top of it during open.
483
If "PerlIO_push" was performed and open has failed, the layer must
"PerlIO_pop" itself; otherwise it stays on the stack and may cause
bad problems.
487
488 Returns "NULL" on failure.
489
490 Binmode
491 IV (*Binmode)(pTHX_ PerlIO *f);
492
493 Optional. Used when ":raw" layer is pushed (explicitly or as a
494 result of binmode(FH)). If not present layer will be popped. If
495 present should configure layer as binary (or pop itself) and return
496 0. If it returns -1 for error "binmode" will fail with layer still
497 on the stack.
498
499 Getarg
500 SV * (*Getarg)(pTHX_ PerlIO *f,
501 CLONE_PARAMS *param, int flags);
502
503 Optional. If present should return an SV * representing the string
504 argument passed to the layer when it was pushed. e.g.
505 ":encoding(ascii)" would return an SvPV with value "ascii". (param
506 and flags arguments can be ignored in most cases)
507
508 "Dup" uses "Getarg" to retrieve the argument originally passed to
509 "Pushed", so you must implement this function if your layer has an
510 extra argument to "Pushed" and will ever be "Dup"ed.
511
512 Fileno
513 IV (*Fileno)(pTHX_ PerlIO *f);
514
515 Returns the Unix/Posix numeric file descriptor for the handle.
516 Normally "PerlIOBase_fileno()" (which just asks next layer down)
517 will suffice for this.
518
519 Returns -1 on error, which is considered to include the case where
520 the layer cannot provide such a file descriptor.
521
522 Dup
523 PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
524 CLONE_PARAMS *param, int flags);
525
526 XXX: Needs more docs.
527
528 Used as part of the "clone" process when a thread is spawned (in
529 which case param will be non-NULL) and when a stream is being
530 duplicated via '&' in the "open".
531
532 Similar to "Open", returns PerlIO* on success, "NULL" on failure.
533
534 Read
535 SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
536
537 Basic read operation.
538
539 Typically will call "Fill" and manipulate pointers (possibly via
540 the API). "PerlIOBuf_read()" may be suitable for derived classes
541 which provide "fast gets" methods.
542
543 Returns actual bytes read, or -1 on an error.
544
545 Unread
546 SSize_t (*Unread)(pTHX_ PerlIO *f,
547 const void *vbuf, Size_t count);
548
549 A superset of stdio's "ungetc()". Should arrange for future reads
550 to see the bytes in "vbuf". If there is no obviously better
551 implementation then "PerlIOBase_unread()" provides the function by
552 pushing a "fake" "pending" layer above the calling layer.
553
554 Returns the number of unread chars.
555
556 Write
    SSize_t (*Write)(pTHX_ PerlIO *f,
                     const void *vbuf, Size_t count);
558
559 Basic write operation.
560
561 Returns bytes written or -1 on an error.
562
563 Seek
564 IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
565
566 Position the file pointer. Should normally call its own "Flush"
567 method and then the "Seek" method of next layer down.
568
569 Returns 0 on success, -1 on failure.
570
571 Tell
572 Off_t (*Tell)(pTHX_ PerlIO *f);
573
Return the file pointer. May be based on the layer's cached notion
of the position to avoid overhead.
576
577 Returns -1 on failure to get the file pointer.
578
579 Close
580 IV (*Close)(pTHX_ PerlIO *f);
581
582 Close the stream. Should normally call "PerlIOBase_close()" to
583 flush itself and close layers below, and then deallocate any data
584 structures (buffers, translation tables, ...) not held directly in
585 the data structure.
586
587 Returns 0 on success, -1 on failure.
588
589 Flush
590 IV (*Flush)(pTHX_ PerlIO *f);
591
592 Should make stream's state consistent with layers below. That is,
593 any buffered write data should be written, and file position of
594 lower layers adjusted for data read from below but not actually
595 consumed. (Should perhaps "Unread()" such data to the lower
596 layer.)
597
598 Returns 0 on success, -1 on failure.
599
600 Fill
601 IV (*Fill)(pTHX_ PerlIO *f);
602
603 The buffer for this layer should be filled (for read) from layer
604 below. When you "subclass" PerlIOBuf layer, you want to use its
605 _read method and to supply your own fill method, which fills the
606 PerlIOBuf's buffer.
607
608 Returns 0 on success, -1 on failure.
609
610 Eof
611 IV (*Eof)(pTHX_ PerlIO *f);
612
613 Return end-of-file indicator. "PerlIOBase_eof()" is normally
614 sufficient.
615
Returns 1 on end-of-file, 0 if not end-of-file, -1 on error.
617
618 Error
619 IV (*Error)(pTHX_ PerlIO *f);
620
621 Return error indicator. "PerlIOBase_error()" is normally
622 sufficient.
623
624 Returns 1 if there is an error (usually when "PERLIO_F_ERROR" is
625 set), 0 otherwise.
626
627 Clearerr
628 void (*Clearerr)(pTHX_ PerlIO *f);
629
630 Clear end-of-file and error indicators. Should call
631 "PerlIOBase_clearerr()" to set the "PERLIO_F_XXXXX" flags, which
632 may suffice.
633
634 Setlinebuf
635 void (*Setlinebuf)(pTHX_ PerlIO *f);
636
637 Mark the stream as line buffered. "PerlIOBase_setlinebuf()" sets
638 the PERLIO_F_LINEBUF flag and is normally sufficient.
639
640 Get_base
641 STDCHAR * (*Get_base)(pTHX_ PerlIO *f);
642
643 Allocate (if not already done so) the read buffer for this layer
644 and return pointer to it. Return NULL on failure.
645
646 Get_bufsiz
647 Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);
648
649 Return the number of bytes that last "Fill()" put in the buffer.
650
651 Get_ptr
652 STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);
653
654 Return the current read pointer relative to this layer's buffer.
655
656 Get_cnt
657 SSize_t (*Get_cnt)(pTHX_ PerlIO *f);
658
659 Return the number of bytes left to be read in the current buffer.
660
661 Set_ptrcnt
662 void (*Set_ptrcnt)(pTHX_ PerlIO *f,
663 STDCHAR *ptr, SSize_t cnt);
664
665 Adjust the read pointer and count of bytes to match "ptr" and/or
666 "cnt". The application (or layer above) must ensure they are
667 consistent. (Checking is allowed by the paranoid.)
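
How a caller might use the snooping functions to pull a line straight
out of a layer's buffer is sketched below. "Buf", the lower-case
accessors and "take_line" are hypothetical names for this model; the
real methods take a "PerlIO *" and use "STDCHAR" and "SSize_t".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical read buffer with the state the snooping calls expose. */
typedef struct {
    char  *base;   /* start of buffer (cf. Get_base) */
    char  *ptr;    /* next unread byte (cf. Get_ptr) */
    size_t cnt;    /* unread bytes from ptr (cf. Get_cnt) */
} Buf;

static char  *get_ptr(Buf *b) { return b->ptr; }
static size_t get_cnt(Buf *b) { return b->cnt; }

static void set_ptrcnt(Buf *b, char *ptr, size_t cnt)
{
    b->ptr = ptr;
    b->cnt = cnt;
}

/* Copy bytes up to and including the next '\n' (or all buffered bytes
 * if there is none) into out, at most outsz of them, then tell the
 * layer how much was consumed.  Returns the number of bytes copied. */
static size_t take_line(Buf *b, char *out, size_t outsz)
{
    char  *p   = get_ptr(b);
    size_t cnt = get_cnt(b);
    char  *nl  = memchr(p, '\n', cnt);
    size_t n   = nl ? (size_t)(nl - p) + 1 : cnt;

    if (n > outsz)
        n = outsz;
    memcpy(out, p, n);
    set_ptrcnt(b, p + n, cnt - n);      /* keep ptr and cnt consistent */
    return n;
}
```

The caller advances the pointer and reduces the count by the same
amount, which is the consistency requirement stated for "Set_ptrcnt".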
668
669 Utilities
670 To ask for the next layer down use PerlIONext(PerlIO *f).
671
672 To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All this
673 does is really just to check that the pointer is non-NULL and that the
674 pointer behind that is non-NULL.)
675
676 PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
677 the "PerlIOl*" pointer.
678
PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the callback from the functions of the layer f (just by the name
of the IO function, like "Read") with the args, or if there is no such
callback, calls the base version of the callback with the same args, or
if f is invalid, sets errno to EBADF and returns failure.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the callback of the functions of the layer f with the args, or if there
is no such callback, sets errno to EINVAL and returns failure. If f is
invalid, it sets errno to EBADF and returns failure.

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the callback of the functions of the layer f with the args, or if there
is no such callback, calls the base version of the callback with the
same args, or if f is invalid, sets errno to EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
callback of the functions of the layer f with the args, or if there is
no such callback, sets errno to EINVAL. If f is invalid, it sets
errno to EBADF.
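
The two dispatch patterns can be modelled with ordinary functions. The
real helpers are macros in perliol.h; everything named below ("Layer",
"Funcs", "tell_or_base", "tell_or_fail") is a hypothetical
simplification with a single "Tell" slot.

```c
#include <errno.h>
#include <stddef.h>

typedef struct Layer Layer;
typedef struct { long (*Tell)(Layer **f); } Funcs;   /* one-slot table */
struct Layer { Layer *next; const Funcs *tab; long pos; };

/* Same check as described for PerlIOValid: both pointers non-NULL. */
static int handle_valid(Layer **f) { return f && *f; }

/* Base-class version of the method. */
static long base_tell(Layer **f) { return (*f)->pos; }

/* "or_Base": use the layer's method, else the base version;
 * an invalid handle sets EBADF and returns the failure value. */
static long tell_or_base(Layer **f)
{
    if (!handle_valid(f)) {
        errno = EBADF;
        return -1;
    }
    if ((*f)->tab && (*f)->tab->Tell)
        return (*f)->tab->Tell(f);
    return base_tell(f);
}

/* "or_fail": use the layer's method, else fail with EINVAL;
 * an invalid handle sets EBADF and returns the failure value. */
static long tell_or_fail(Layer **f)
{
    if (!handle_valid(f)) {
        errno = EBADF;
        return -1;
    }
    if ((*f)->tab && (*f)->tab->Tell)
        return (*f)->tab->Tell(f);
    errno = EINVAL;
    return -1;
}
```

Which pattern a method uses decides whether a NULL slot means "inherit
the base behaviour" or "this operation is unsupported".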
701
702 Implementing PerlIO Layers
703 If you find the implementation document unclear or not sufficient, look
704 at the existing PerlIO layer implementations, which include:
705
706 • C implementations
707
708 The perlio.c and perliol.h in the Perl core implement the "unix",
709 "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending" layers,
710 and also the "mmap" and "win32" layers if applicable. (The "win32"
711 is currently unfinished and unused, to see what is used instead in
712 Win32, see "Querying the layers of filehandles" in PerlIO .)
713
714 PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.
715
716 PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.
717
718 • Perl implementations
719
720 PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on
721 CPAN.
722
723 If you are creating a PerlIO layer, you may want to be lazy, in other
724 words, implement only the methods that interest you. The other methods
725 you can either replace with the "blank" methods
726
727 PerlIOBase_noop_ok
728 PerlIOBase_noop_fail
729
730 (which do nothing, and return zero and -1, respectively) or for certain
731 methods you may assume a default behaviour by using a NULL method. The
732 Open method looks for help in the 'parent' layer. The following table
733 summarizes the behaviour:
734
    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

    FAILURE     Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
                and return -1 (for numeric return values) or NULL (for
                pointers)
    INHERITED   Inherited from the layer below
    SUCCESS     Return 0 (for numeric return values) or a pointer
767
768 Core Layers
769 The file "perlio.c" provides the following layers:
770
771 "unix"
772 A basic non-buffered layer which calls Unix/POSIX "read()",
773 "write()", "lseek()", "close()". No buffering. Even on platforms
774 that distinguish between O_TEXT and O_BINARY this layer is always
775 O_BINARY.
776
777 "perlio"
778 A very complete generic buffering layer which provides the whole of
779 PerlIO API. It is also intended to be used as a "base class" for
780 other layers. (For example its "Read()" method is implemented in
781 terms of the "Get_cnt()"/"Get_ptr()"/"Set_ptrcnt()" methods).
782
"perlio" over "unix" provides a complete replacement for stdio as
seen via PerlIO API. This is the default for USE_PERLIO on systems
whose stdio does not permit perl's "fast gets" access, and which do
not distinguish between "O_TEXT" and "O_BINARY".

"stdio"
A layer which provides the PerlIO API via the layer scheme, but
implements it by calling system's stdio. This is (currently) the
default on systems whose stdio provides sufficient access to allow
perl's "fast gets" access and which do not distinguish between
"O_TEXT" and "O_BINARY".
794
795 "crlf"
796 A layer derived using "perlio" as a base class. It provides
797 Win32-like "\n" to CR,LF translation. Can either be applied above
798 "perlio" or serve as the buffer layer itself. "crlf" over "unix" is
799 the default if system distinguishes between "O_TEXT" and "O_BINARY"
800 opens. (At some point "unix" will be replaced by a "native" Win32
801 IO layer on that platform, as Win32's read/write layer has various
802 drawbacks.) The "crlf" layer is a reasonable model for a layer
803 which transforms data in some way.
804
805 "mmap"
806 If Configure detects "mmap()" functions this layer is provided
807 (with "perlio" as a "base") which does "read" operations by
808 mmap()ing the file. Performance improvement is marginal on modern
809 systems, so it is mainly there as a proof of concept. It is likely
810 to be unbundled from the core at some point. The "mmap" layer is a
811 reasonable model for a minimalist "derived" layer.
812
813 "pending"
814 An "internal" derivative of "perlio" which can be used to provide
815 Unread() function for layers which have no buffer or cannot be
816 bothered. (Basically this layer's "Fill()" pops itself off the
817 stack and so resumes reading from layer below.)
818
819 "raw"
820 A dummy layer which never exists on the layer stack. Instead when
821 "pushed" it actually pops the stack removing itself, it then calls
822 Binmode function table entry on all the layers in the stack -
823 normally this (via PerlIOBase_binmode) removes any layers which do
824 not have "PERLIO_K_RAW" bit set. Layers can modify that behaviour
825 by defining their own Binmode entry.
826
827 "utf8"
828 Another dummy layer. When pushed it pops itself and sets the
829 "PERLIO_F_UTF8" flag on the layer which was (and now is once more)
830 the top of the stack.
831
832 In addition perlio.c also provides a number of "PerlIOBase_xxxx()"
833 functions which are intended to be used in the table slots of classes
834 which do not need to do anything special for a particular method.
835
836 Extension Layers
837 Layers can be made available by extension modules. When an unknown
838 layer is encountered the PerlIO code will perform the equivalent of :
839
840 use PerlIO 'layer';
841
842 Where layer is the unknown layer. PerlIO.pm will then attempt to:
843
844 require PerlIO::layer;
845
846 If after that process the layer is still not defined then the "open"
847 will fail.
848
849 The following extension layers are bundled with perl:
850
851 ":encoding"
    use PerlIO::encoding;
853
854 makes this layer available, although PerlIO.pm "knows" where to
855 find it. It is an example of a layer which takes an argument as it
856 is called thus:
857
858 open( $fh, "<:encoding(iso-8859-7)", $pathname );
859
860 ":scalar"
861 Provides support for reading data from and writing data to a
862 scalar.
863
864 open( $fh, "+<:scalar", \$scalar );
865
866 When a handle is so opened, then reads get bytes from the string
867 value of $scalar, and writes change the value. In both cases the
868 position in $scalar starts as zero but can be altered via "seek",
869 and determined via "tell".
870
871 Please note that this layer is implied when calling open() thus:
872
873 open( $fh, "+<", \$scalar );
874
875 ":via"
876 Provided to allow layers to be implemented as Perl code. For
877 instance:
878
879 use PerlIO::via::StripHTML;
880 open( my $fh, "<:via(StripHTML)", "index.html" );
881
882 See PerlIO::via for details.
883
TODO
Things that need to be done to improve this document.
886
• Explain how to make a valid fh without going through open() (i.e.
apply a layer). For example if the file is not opened through perl,
but we want to get back a fh, like it was opened by Perl.

How PerlIO_apply_layera fits in, where its docs are, was it made
public?
893
894 Currently the example could be something like this:
895
    PerlIO *foo_to_PerlIO(pTHX_ const char *mode, ...)
    {
        /* mode is "w", "r", etc */
        const char *layers = ":APR";    /* the layer name */
        PerlIOAPR  *st;
        PerlIO     *f = PerlIO_allocate(aTHX);

        if (!f) {
            return NULL;
        }

        /* PerlIO_apply_layers() returns 0 on success */
        if (PerlIO_apply_layers(aTHX_ f, mode, layers) != 0) {
            return NULL;
        }

        st = PerlIOSelf(f, PerlIOAPR);
        /* fill in the st struct, as in _open(); "file" stands for
         * whatever the elided arguments supply */
        st->file = file;
        PerlIOBase(f)->flags |= PERLIO_F_OPEN;

        return f;
    }
917
918 • fix/add the documentation in places marked as XXX.
919
920 • The handling of errors by the layer is not specified. e.g. when $!
921 should be set explicitly, when the error handling should be just
922 delegated to the top layer.
923
924 Probably give some hints on using SETERRNO() or pointers to where
925 they can be found.
926
927 • I think it would help to give some concrete examples to make it
928 easier to understand the API. Of course I agree that the API has to
929 be concise, but since there is no second document that is more of a
930 guide, I think that it'd make it easier to start with the doc which
931 is an API, but has examples in it in places where things are
932 unclear, to a person who is not a PerlIO guru (yet).
933
934
935
perl v5.36.3                      2023-11-30                        PERLIOL(1)