perlxstypemap(1)

PERLXSTYPEMAP(1)       Perl Programmers Reference Guide       PERLXSTYPEMAP(1)

NAME

       perlxstypemap - Perl XS C/Perl type mapping

DESCRIPTION

       The more you think about interfacing between two languages, the more
       you'll realize that the majority of programmer effort has to go into
       converting between the data structures that are native to either of the
       languages involved.  This trumps other matters such as differing calling
       conventions because the problem space is so much greater.  There are
       simply more ways to shove data into memory than there are ways to
       implement a function call.

       Perl XS' attempt at a solution to this is the concept of typemaps.  At
       an abstract level, a Perl XS typemap is nothing but a recipe for
       converting from a certain Perl data structure to a certain C data
       structure and vice versa.  Since there can be C types that are
       sufficiently similar to one another to warrant converting with the same
       logic, XS typemaps are represented by a unique identifier, henceforth
       called an XS type in this document.  You can then tell the XS compiler
       that multiple C types are to be mapped with the same XS typemap.

       In your XS code, when you define an argument with a C type or when you
       are using a "CODE:" and an "OUTPUT:" section together with a C return
       type of your XSUB, it'll be the typemapping mechanism that makes this
       easy.

   Anatomy of a typemap
       In more practical terms, the typemap is a collection of code fragments
       which are used by the xsubpp compiler to map C function parameters and
       values to Perl values.  The typemap file may consist of three sections
       labelled "TYPEMAP", "INPUT", and "OUTPUT".  An unlabelled initial
       section is assumed to be a "TYPEMAP" section.  The INPUT section tells
       the compiler how to translate Perl values into variables of certain C
       types.  The OUTPUT section tells the compiler how to translate the
       values from certain C types into values Perl can understand.  The
       TYPEMAP section tells the compiler which of the INPUT and OUTPUT code
       fragments should be used to map a given C type to a Perl value.  The
       section labels "TYPEMAP", "INPUT", or "OUTPUT" must begin in the first
       column on a line by themselves, and must be in uppercase.

       Each type of section can appear an arbitrary number of times and does
       not have to appear at all.  For example, a typemap may commonly lack
       "INPUT" and "OUTPUT" sections if all it needs to do is associate
       additional C types with core XS types like T_PTROBJ.  Lines that start
       with a hash "#" are considered comments and ignored in the "TYPEMAP"
       section, but are considered significant in "INPUT" and "OUTPUT". Blank
       lines are generally ignored.

       Traditionally, typemaps needed to be written to a separate file,
       conventionally called "typemap" in a CPAN distribution.  With
       ExtUtils::ParseXS (the XS compiler) version 3.12 or better which comes
       with perl 5.16, typemaps can also be embedded directly into XS code
       using a HERE-doc like syntax:

         TYPEMAP: <<HERE
         ...
         HERE

       where "HERE" can be replaced by other identifiers like with normal Perl
       HERE-docs.  All details below about the typemap textual format remain
       valid.

       The "TYPEMAP" section should contain one pair of C type and XS type per
       line as follows.  An example from the core typemap file:

         TYPEMAP
         # all variants of char* are handled by the T_PV typemap
         char *          T_PV
         const char *    T_PV
         unsigned char * T_PV
         ...

       The "INPUT" and "OUTPUT" sections have identical formats, that is, each
       unindented line starts a new in- or output map respectively.  A new in-
       or output map must start with the name of the XS type to map on a line
       by itself, followed by the code that implements it indented on the
       following lines. Example:

         INPUT
         T_PV
           $var = ($type)SvPV_nolen($arg)
         T_PTR
           $var = INT2PTR($type,SvIV($arg))

       We'll get to the meaning of those Perlish-looking variables in a little
       bit.

       Finally, here's an example of the full typemap file for mapping C
       strings of the "char *" type to Perl scalars/strings:

         TYPEMAP
         char *  T_PV

         INPUT
         T_PV
           $var = ($type)SvPV_nolen($arg)

         OUTPUT
         T_PV
           sv_setpv((SV*)$arg, $var);

       Here's a more complicated example: suppose that you wanted "struct
       netconfig" to be blessed into the class "Net::Config".  One way to do
       this is to use underscores (_) to separate package names, as follows:

         typedef struct netconfig * Net_Config;

       And then provide a typemap entry "T_PTROBJ_SPECIAL" that maps
       underscores to double-colons (::), and declare "Net_Config" to be of
       that type:

         TYPEMAP
         Net_Config      T_PTROBJ_SPECIAL

         INPUT
         T_PTROBJ_SPECIAL
           if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
             IV tmp = SvIV((SV*)SvRV($arg));
             $var = INT2PTR($type, tmp);
           }
           else
             croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")

         OUTPUT
         T_PTROBJ_SPECIAL
           sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
                        (void*)$var);

       The INPUT and OUTPUT sections replace underscores with double-colons
       on the fly, giving the desired effect.  This example demonstrates some
       of the power and versatility of the typemap facility.
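
       The name rewriting performed by the Perl fragment "s/_/::/g" in the
       typemap above can be pictured in plain C.  The following is only a
       sketch for illustration: the helper name "underscores_to_colons" is
       hypothetical, it is not part of perl or xsubpp, and the real
       substitution happens in Perl when xsubpp expands the typemap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of perl or xsubpp): copies src into dst,
 * expanding every '_' to "::", as s/_/::/g does in the typemap above.
 * Returns the length written (excluding the NUL terminator), or
 * (size_t)-1 if dst is too small. */
size_t underscores_to_colons(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        size_t need = (*src == '_') ? 2 : 1;

        /* Always leave room for the terminating NUL. */
        if (n + need + 1 > dstsize)
            return (size_t)-1;
        if (*src == '_') {
            dst[n++] = ':';
            dst[n++] = ':';
        } else {
            dst[n++] = *src;
        }
    }
    if (n + 1 > dstsize)        /* only reachable for an empty src */
        return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

       Called with "Net_Config" and a large enough buffer, the helper
       produces "Net::Config", which is the class name the typemap passes to
       sv_derived_from() and sv_setref_pv().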

       The "INT2PTR" macro (defined in perl.h) casts an integer to a pointer
       of a given type, taking care of the possible different size of integers
       and pointers.  There are also "PTR2IV", "PTR2UV", "PTR2NV" macros, to
       map the other way, which may be useful in OUTPUT sections.
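
       To illustrate the idea behind these macros without depending on
       perl.h, here is a minimal standalone sketch.  The "SKETCH_" macros and
       the "widget" type are hypothetical stand-ins that assume the platform
       provides "intptr_t"; real XS code should use "INT2PTR" and "PTR2IV"
       themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only.  Real XS code should use
 * INT2PTR and PTR2IV from perl.h.  Going through intptr_t and void *
 * keeps the integer wide enough to hold a pointer. */
#define SKETCH_PTR2INT(p)        ((intptr_t)(void *)(p))
#define SKETCH_INT2PTR(type, i)  ((type)(void *)(intptr_t)(i))

typedef struct { int id; } widget;      /* hypothetical C struct */

/* Round-trips a pointer through an integer, the way a T_PTR style typemap
 * stores a pointer in a scalar's integer slot and reads it back later. */
widget *roundtrip_widget(widget *w)
{
    intptr_t as_int = SKETCH_PTR2INT(w);

    return SKETCH_INT2PTR(widget *, as_int);
}
```

       The pointer that comes back compares equal to the one that went in,
       which is the property an INPUT/OUTPUT typemap pair relies on.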

   The Role of the typemap File in Your Distribution
       The default typemap in the lib/ExtUtils directory of the Perl source
       contains many useful types which can be used by Perl extensions.  Some
       extensions define additional typemaps which they keep in their own
       directory.  These additional typemaps may reference INPUT and OUTPUT
       maps in the main typemap.  The xsubpp compiler will allow the
       extension's own typemap to override any mappings which are in the
       default typemap.  Instead of using an additional typemap file, typemaps
       may be embedded verbatim in XS with a heredoc-like syntax.  See the
       documentation on the "TYPEMAP:" XS keyword.

       For CPAN distributions, you can assume that the XS types defined by the
       perl core are already available. Additionally, the core typemap has
       default XS types for a large number of C types.  For example, if you
       simply return a "char *" from your XSUB, the core typemap will have
       this C type associated with the T_PV XS type.  That means your C string
       will be copied into the PV (pointer value) slot of a new scalar that
       will be returned from your XSUB to Perl.

       If you're developing a CPAN distribution using XS, you may add your own
       file called typemap to the distribution.  That file may contain
       typemaps that either map types that are specific to your code or that
       override the core typemap file's mappings for common C types.

   Sharing typemaps Between CPAN Distributions
       Starting with ExtUtils::ParseXS version 3.13_01 (comes with perl 5.16
       and better), it is rather easy to share typemap code between multiple
       CPAN distributions. The general idea is to share it as a module that
       offers a certain API and have the dependent modules declare that as a
       build-time requirement and import the typemap into the XS. An example
       of such a typemap-sharing module on CPAN is
       "ExtUtils::Typemaps::Basic". Two steps to getting that module's
       typemaps available in your code:

       ·   Declare "ExtUtils::Typemaps::Basic" as a build-time dependency in
           "Makefile.PL" (use "BUILD_REQUIRES"), or in your "Build.PL" (use
           "build_requires").

       ·   Include the following line in the XS section of your XS file:
           (don't break the line)

             INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd
                              -e "print embeddable_typemap(q{Basic})"

   Writing typemap Entries
       Each INPUT or OUTPUT typemap entry is a double-quoted Perl string that
       will be evaluated in the presence of certain variables to get the final
       C code for mapping a certain C type.

       This means that you can embed Perl code in your typemap (C) code using
       constructs such as "${ perl code that evaluates to scalar reference
       here }". A common use case is to generate error messages that refer to
       the true function name even when using the ALIAS XS feature:

         ${ $ALIAS ? \q[GvNAME(CvGV(cv))] : \qq[\"$pname\"] }

       For many typemap examples, refer to the core typemap file that can be
       found in the perl source tree at lib/ExtUtils/typemap.

       The Perl variables that are available for interpolation into typemaps
       are the following:

       ·   $var - the name of the input or output variable, e.g. RETVAL for
           return values.

       ·   $type - the raw C type of the parameter, any ":" replaced with "_".
           e.g. for a type of "Foo::Bar", $type is "Foo__Bar"

       ·   $ntype - the supplied type with "*" replaced with "Ptr".  e.g. for
           a type of "Foo*", $ntype is "FooPtr"

       ·   $arg - the stack entry, that the parameter is input from or output
           to, e.g. ST(0)

       ·   $argoff - the argument stack offset of the argument.  i.e. 0 for the
           first argument, etc.

       ·   $pname - the full name of the XSUB, including the "PACKAGE"
           name, with any "PREFIX" stripped.  This is the non-ALIAS name.

       ·   $Package - the package specified by the most recent "PACKAGE"
           keyword.

       ·   $ALIAS - non-zero if the current XSUB has any aliases declared with
           "ALIAS".
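
       The $ntype normalization in the list above ("*" replaced with "Ptr")
       can be approximated in C as follows.  This is a sketch under
       assumptions, not the xsubpp implementation: the function names are
       hypothetical, and dropping the whitespace in front of "*" is inferred
       from names such as "XS_pack_foo_tPtr" used later in this document.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Appends s to dst (current length *n); returns 0 if it does not fit. */
static int sketch_append(char *dst, size_t dstsize, size_t *n, const char *s)
{
    size_t len = strlen(s);

    if (*n + len + 1 > dstsize)
        return 0;
    memcpy(dst + *n, s, len);
    *n += len;
    dst[*n] = '\0';
    return 1;
}

/* Hypothetical approximation of the $ntype rule: every '*' becomes "Ptr"
 * and whitespace in front of a '*' is dropped, so "foo_t *" yields
 * "foo_tPtr".  Other runs of whitespace collapse to a single space.
 * Returns 1 on success, 0 if dst is too small. */
int normalize_ntype(const char *ctype, char *dst, size_t dstsize)
{
    size_t n = 0;
    int pending_space = 0;

    assert(ctype != NULL && dst != NULL && dstsize > 0);
    dst[0] = '\0';
    for (; *ctype != '\0'; ctype++) {
        if (isspace((unsigned char)*ctype)) {
            pending_space = 1;          /* decide later whether to keep it */
            continue;
        }
        if (*ctype == '*') {
            pending_space = 0;          /* whitespace before '*' is dropped */
            if (!sketch_append(dst, dstsize, &n, "Ptr"))
                return 0;
        } else {
            char one[2] = { *ctype, '\0' };

            if (pending_space && n > 0 && !sketch_append(dst, dstsize, &n, " "))
                return 0;
            pending_space = 0;
            if (!sketch_append(dst, dstsize, &n, one))
                return 0;
        }
    }
    return 1;
}
```

       With this sketch, "Foo*" becomes "FooPtr" and "foo_t **" becomes
       "foo_tPtrPtr", matching the names that appear in the T_PACKED and
       T_PACKEDARRAY descriptions.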

   Full Listing of Core Typemaps
       Each C type is represented by an entry in the typemap file that is
       responsible for converting perl variables (SV, AV, HV, CV, etc.)  to
       and from that type. The following sections list all XS types that come
       with perl by default.

       T_SV
           This simply passes the C representation of the Perl variable (an
           SV*) in and out of the XS layer. This can be used if the C code
           wants to deal directly with the Perl variable.

       T_SVREF
           Used to pass in and return a reference to an SV.

           Note that this typemap does not decrement the reference count when
           returning the reference to an SV*.  See also:
           T_SVREF_REFCOUNT_FIXED

       T_SVREF_REFCOUNT_FIXED
           Used to pass in and return a reference to an SV.  This is a fixed
           variant of T_SVREF that decrements the refcount appropriately when
           returning a reference to an SV*. Introduced in perl 5.15.4.

       T_AVREF
           From the perl level this is a reference to a perl array.  From the
           C level this is a pointer to an AV.

           Note that this typemap does not decrement the reference count when
           returning an AV*. See also: T_AVREF_REFCOUNT_FIXED

       T_AVREF_REFCOUNT_FIXED
           From the perl level this is a reference to a perl array.  From the
           C level this is a pointer to an AV. This is a fixed variant of
           T_AVREF that decrements the refcount appropriately when returning
           an AV*. Introduced in perl 5.15.4.

       T_HVREF
           From the perl level this is a reference to a perl hash.  From the C
           level this is a pointer to an HV.

           Note that this typemap does not decrement the reference count when
           returning an HV*. See also: T_HVREF_REFCOUNT_FIXED

       T_HVREF_REFCOUNT_FIXED
           From the perl level this is a reference to a perl hash.  From the C
           level this is a pointer to an HV. This is a fixed variant of
           T_HVREF that decrements the refcount appropriately when returning
           an HV*. Introduced in perl 5.15.4.

       T_CVREF
           From the perl level this is a reference to a perl subroutine (e.g.
           $sub = sub { 1 };). From the C level this is a pointer to a CV.

           Note that this typemap does not decrement the reference count when
           returning a CV*. See also: T_CVREF_REFCOUNT_FIXED

       T_CVREF_REFCOUNT_FIXED
           From the perl level this is a reference to a perl subroutine (e.g.
           $sub = sub { 1 };). From the C level this is a pointer to a CV.

           This is a fixed variant of T_CVREF that decrements the refcount
           appropriately when returning a CV*. Introduced in perl 5.15.4.

       T_SYSRET
           The T_SYSRET typemap is used to process return values from system
           calls.  It is only meaningful when passing values from C to perl
           (there is no concept of passing a system return value from Perl to
           C).

           System calls return -1 on error (setting "errno" with the reason)
           and (usually) 0 on success. If the return value is -1 this typemap
           returns "undef". If the return value is not -1, this typemap
           translates a 0 (perl false) to "0 but true" (which is perl true) or
           returns the value itself, to indicate that the command succeeded.

           The POSIX module makes extensive use of this type.
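
           The decision table described above can be written out as a small
           standalone C function.  This is only a sketch of the mapping: the
           function name is hypothetical, a NULL return stands in for Perl's
           "undef", and the real typemap sets an SV rather than returning a
           C string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration of the T_SYSRET rules.  Returns NULL where the
 * typemap would return undef, "0 but true" for a zero return code, and the
 * decimal value (formatted into buf) for anything else. */
const char *sysret_to_perl_string(int rc, char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);
    if (rc == -1)
        return NULL;            /* failure: Perl would see undef */
    if (rc == 0)
        return "0 but true";    /* success, numerically 0 but true in Perl */
    snprintf(buf, bufsize, "%d", rc);
    return buf;                 /* any other value is passed through */
}
```

           A Perl caller can therefore test the result for truth to detect
           failure and still use it as a number when the call succeeded.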

       T_UV
           An unsigned integer.

       T_IV
           A signed integer. This is cast to the required integer type when
           passed to C and converted to an IV when passed back to Perl.

       T_INT
           A signed integer. This typemap converts the Perl value to a native
           integer type (the "int" type on the current platform). When
           returning the value to perl it is processed in the same way as for
           T_IV.

           Its behaviour is identical to using an "int" type in XS with T_IV.

       T_ENUM
           An enum value. Used to transfer an enum component from C. There is
           no reason to pass an enum value to C since it is stored as an IV
           inside perl.

       T_BOOL
           A boolean type. This can be used to pass true and false values to
           and from C.

       T_U_INT
           This is for unsigned integers. It is equivalent to using T_UV but
           explicitly casts the variable to type "unsigned int".  The default
           type for "unsigned int" is T_UV.

       T_SHORT
           Short integers. This is equivalent to T_IV but explicitly casts the
           return to type "short". The default typemap for "short" is T_IV.

       T_U_SHORT
           Unsigned short integers. This is equivalent to T_UV but explicitly
           casts the return to type "unsigned short". The default typemap for
           "unsigned short" is T_UV.

           T_U_SHORT is used for type "U16" in the standard typemap.

       T_LONG
           Long integers. This is equivalent to T_IV but explicitly casts the
           return to type "long". The default typemap for "long" is T_IV.

       T_U_LONG
           Unsigned long integers. This is equivalent to T_UV but explicitly
           casts the return to type "unsigned long". The default typemap for
           "unsigned long" is T_UV.

           T_U_LONG is used for type "U32" in the standard typemap.

       T_CHAR
           Single 8-bit characters.

       T_U_CHAR
           An unsigned byte.

       T_FLOAT
           A floating point number. This typemap guarantees to return a
           variable cast to a "float".

       T_NV
           A Perl floating point number. Similar to T_IV and T_UV in that the
           return type is cast to the requested numeric type rather than to a
           specific type.

       T_DOUBLE
           A double precision floating point number. This typemap guarantees
           to return a variable cast to a "double".

       T_PV
           A string (char *).

       T_PTR
           A memory address (pointer). Typically associated with a "void *"
           type.

       T_PTRREF
           Similar to T_PTR except that the pointer is stored in a scalar and
           the reference to that scalar is returned to the caller. This can be
           used to hide the actual pointer value from the programmer since it
           is usually not required directly from within perl.

           The typemap checks that a scalar reference is passed from perl to
           XS.

       T_PTROBJ
           Similar to T_PTRREF except that the reference is blessed into a
           class.  This allows the pointer to be used as an object. Most
           commonly used to deal with C structs. The typemap checks that the
           perl object passed into the XS routine is of the correct class (or
           part of a subclass).

           The pointer is blessed into a class that is derived from the name
           of the type of the pointer but with all '*' in the name replaced
           with 'Ptr'.

           For "DESTROY" XSUBs only, a T_PTROBJ is optimized to a T_PTRREF.
           This means the class check is skipped.

       T_REF_IV_REF
           NOT YET

       T_REF_IV_PTR
           Similar to T_PTROBJ in that the pointer is blessed into a scalar
           object.  The difference is that when the object is passed back into
           XS it must be of the correct type (inheritance is not supported)
           while T_PTROBJ supports inheritance.

           The pointer is blessed into a class that is derived from the name
           of the type of the pointer but with all '*' in the name replaced
           with 'Ptr'.

           For "DESTROY" XSUBs only, a T_REF_IV_PTR is optimized to a
           T_PTRREF. This means the class check is skipped.

       T_PTRDESC
           NOT YET

       T_REFREF
           Similar to T_PTRREF, except the pointer stored in the referenced
           scalar is dereferenced and copied to the output variable. This
           means that T_REFREF is to T_PTRREF as T_OPAQUE is to T_OPAQUEPTR.
           All clear?

           Only the INPUT part of this is implemented (Perl to XSUB) and there
           are no known users in core or on CPAN.

       T_REFOBJ
           Like T_REFREF, except it does strict type checking (inheritance is
           not supported).

           For "DESTROY" XSUBs only, a T_REFOBJ is optimized to a T_REFREF.
           This means the class check is skipped.

       T_OPAQUEPTR
           This can be used to store bytes in the string component of the SV.
           Here the representation of the data is irrelevant to perl and the
           bytes themselves are just stored in the SV. It is assumed that the
           C variable is a pointer (the bytes are copied from that memory
           location).  If the pointer is pointing to something that is
           represented by 8 bytes then those 8 bytes are stored in the SV (and
           length() will report a value of 8). This entry is similar to
           T_OPAQUE.

           In principle the unpack() command can be used to convert the bytes
           back to a number (if the underlying type is known to be a number).

           This entry can be used to store a C structure (the number of bytes
           to be copied is calculated using the C "sizeof" operator) and can
           be used as an alternative to T_PTRREF without having to worry about
           a memory leak (since Perl will clean up the SV).
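
           The byte-copying behaviour described above can be mimicked in
           standalone C.  In this sketch a plain byte buffer stands in for
           the string component of the SV, and the "sample_t" struct and the
           "opaque_" function names are hypothetical; the real typemap
           operates on an SV.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int a; double b; } sample_t;   /* hypothetical C struct */

/* Copies the raw bytes of *in into buf, as the OUTPUT side would copy
 * sizeof(sample_t) bytes into the SV's string buffer.  Returns the number
 * of bytes stored, or 0 if buf is too small. */
size_t opaque_store(const sample_t *in, unsigned char *buf, size_t bufsize)
{
    assert(in != NULL && buf != NULL);
    if (bufsize < sizeof *in)
        return 0;
    memcpy(buf, in, sizeof *in);
    return sizeof *in;
}

/* Rebuilds a struct from previously stored bytes.  A length mismatch is
 * rejected, because the bytes are opaque and carry no type information. */
int opaque_load(const unsigned char *buf, size_t len, sample_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len != sizeof *out)
        return 0;
    memcpy(out, buf, sizeof *out);
    return 1;
}
```

           The stored length equals "sizeof(sample_t)", which is what
           length() would report on the Perl side for such a scalar.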

       T_OPAQUE
           This can be used to store data from non-pointer types in the string
           part of an SV. It is similar to T_OPAQUEPTR except that the typemap
           retrieves the pointer directly rather than assuming it is being
           supplied. For example, if an integer is imported into Perl using
           T_OPAQUE rather than T_IV the underlying bytes representing the
           integer will be stored in the SV but the actual integer value will
           not be available, i.e. the data is opaque to perl.

           The data may be retrieved using the "unpack" function if the
           underlying type of the byte stream is known.

           T_OPAQUE supports input and output of simple types.  T_OPAQUEPTR
           can be used to pass these bytes back into C if a pointer is
           acceptable.

       Implicit array
           xsubpp supports a special syntax for returning packed C arrays to
           perl. If the XS return type is given as

             array(type, nelem)

           xsubpp will copy the contents of "nelem * sizeof(type)" bytes from
           RETVAL to an SV and push it onto the stack. This is only really
           useful if the number of items to be returned is known at compile
           time and you don't mind having a string of bytes in your SV.  Use
           T_ARRAY to push a variable number of arguments onto the return
           stack (they won't be packed as a single string though).

           This is similar to using T_OPAQUEPTR but can be used to process
           more than one element.

       T_PACKED
           Calls user-supplied functions for conversion. For "OUTPUT" (XSUB to
           Perl), a function named "XS_pack_$ntype" is called with the output
           Perl scalar and the C variable to convert from.  $ntype is the
           normalized C type that is to be mapped to Perl. Normalized means
           that all "*" are replaced by the string "Ptr". The return value of
           the function is ignored.

           Conversely for "INPUT" (Perl to XSUB) mapping, the function named
           "XS_unpack_$ntype" is called with the input Perl scalar as argument
           and the return value is cast to the mapped C type and assigned to
           the output C variable.

           An example conversion function for a typemapped struct "foo_t *"
           might be:

             static void
             XS_pack_foo_tPtr(SV *out, foo_t *in)
             {
               dTHX; /* alas, signature does not include pTHX_ */
               HV* hash = newHV();
               hv_stores(hash, "int_member", newSViv(in->int_member));
               hv_stores(hash, "float_member", newSVnv(in->float_member));
               /* ... */

               /* mortalize as thy stack is not refcounted */
               sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash)));
             }

           The conversion from Perl to C is left as an exercise to the reader,
           but the prototype would be:

             static foo_t *
             XS_unpack_foo_tPtr(SV *in);

           Instead of an actual C function that has to fetch the thread
           context using "dTHX", you can define macros of the same name and
           avoid the overhead. Also, keep in mind to possibly free the memory
           allocated by "XS_unpack_foo_tPtr".

       T_PACKEDARRAY
           T_PACKEDARRAY is similar to T_PACKED. In fact, the "INPUT" (Perl to
           XSUB) typemap is identical, but the "OUTPUT" typemap passes an
           additional argument to the "XS_pack_$ntype" function. This third
           parameter indicates the number of elements in the output so that
           the function can handle C arrays sanely. The variable needs to be
           declared by the user and must have the name "count_$ntype" where
           $ntype is the normalized C type name as explained above. For the
           example above and "foo_t **", the signature of the function would
           be:

             static void
             XS_pack_foo_tPtrPtr(SV *out, foo_t **in, UV count_foo_tPtrPtr);

           The type of the third parameter is arbitrary as far as the typemap
           is concerned. It just has to be in line with the declared variable.

           Of course, unless you know the number of elements in the "sometype
           **" C array, within your XSUB, the return value from "foo_t **
           XS_unpack_foo_tPtrPtr(...)" will be hard to decipher.  Since the
           details are all up to the XS author (the typemap user), there are
           several solutions, none of which is particularly elegant.  The most
           commonly seen solution has been to allocate memory for N+1 pointers
           and assign "NULL" to the (N+1)th to facilitate iteration.

           Alternatively, using a customized typemap for your purposes in the
           first place is probably preferable.
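
           The N+1 pointer convention mentioned above can be sketched in
           standalone C as follows.  The reduced "foo_t" definition and the
           "_terminated" function names are hypothetical and only serve the
           illustration; a real "XS_unpack_foo_tPtrPtr" would fill the
           elements from a Perl data structure instead.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int int_member; } foo_t;   /* hypothetical, reduced struct */

/* Frees every element up to the NULL sentinel, then the list itself. */
void free_terminated(foo_t **list)
{
    if (list == NULL)
        return;
    for (size_t i = 0; list[i] != NULL; i++)
        free(list[i]);
    free(list);
}

/* Allocates n elements plus one extra slot that holds the NULL sentinel. */
foo_t **alloc_terminated(size_t n)
{
    foo_t **list = malloc((n + 1) * sizeof *list);

    if (list == NULL)
        return NULL;
    for (size_t i = 0; i <= n; i++)
        list[i] = NULL;                 /* list[n] stays NULL: the sentinel */
    for (size_t i = 0; i < n; i++) {
        list[i] = malloc(sizeof **list);
        if (list[i] == NULL) {
            free_terminated(list);      /* frees what was allocated so far */
            return NULL;
        }
        list[i]->int_member = (int)i;
    }
    return list;
}

/* Recovers the element count by walking to the sentinel. */
size_t count_terminated(foo_t *const *list)
{
    size_t n = 0;

    assert(list != NULL);
    while (list[n] != NULL)
        n++;
    return n;
}
```

           The XSUB that receives such a list can call a counting function
           like the one above instead of needing a separate length argument.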

       T_DATAUNIT
           NOT YET

       T_CALLBACK
           NOT YET

       T_ARRAY
           This is used to convert the perl argument list to a C array and for
           pushing the contents of a C array onto the perl argument stack.

           The usual calling signature is

             @out = array_func( @in );

           Any number of arguments can occur in the list before the array but
           the input and output arrays must be the last elements in the list.

           When used to pass a perl list to C the XS writer must provide a
           function (named after the array type but with 'Ptr' substituted for
           '*') to allocate the memory required to hold the list. A pointer
           should be returned. It is up to the XS writer to free the memory on
           exit from the function. The variable "ix_$var" is set to the number
           of elements in the new array.

           When returning a C array to Perl the XS writer must provide an
           integer variable called "size_$var" containing the number of
           elements in the array. This is used to determine how many elements
           should be pushed onto the return argument stack. This is not
           required on input since Perl knows how many arguments are on the
           stack when the routine is called. Ordinarily this variable would be
           called "size_RETVAL".

           Additionally, the type of each element is determined from the type
           of the array. If the array uses type "intArray *" xsubpp will
           automatically work out that it contains variables of type "int" and
           use that typemap entry to perform the copy of each element. All
           pointer '*' and 'Array' tags are removed from the name to determine
           the subtype.

       T_STDIO
           This is used for passing perl filehandles to and from C using "FILE
           *" structures.

       T_INOUT
           This is used for passing perl filehandles to and from C using
           "PerlIO *" structures. The file handle can be used for reading and
           writing. This corresponds to the "+<" mode, see also T_IN and
           T_OUT.

           See perliol for more information on the Perl IO abstraction layer.
           Perl must have been built with "-Duseperlio".

           There is no check to assert that the filehandle passed from Perl to
           C was created with the right "open()" mode.

           Hint: The perlxstut tutorial covers the T_INOUT, T_IN, and T_OUT XS
           types nicely.

       T_IN
           Same as T_INOUT, but the filehandle that is returned from C to Perl
           can only be used for reading (mode "<").

       T_OUT
           Same as T_INOUT, but the filehandle that is returned from C to Perl
           is set to use the open mode "+>".



perl v5.26.3                      2018-03-01                  PERLXSTYPEMAP(1)