PERLXSTYPEMAP(1)       Perl Programmers Reference Guide       PERLXSTYPEMAP(1)

NAME
perlxstypemap - Perl XS C/Perl type mapping

DESCRIPTION
The more you think about interfacing between two languages, the more
you'll realize that the majority of programmer effort has to go into
converting between the data structures that are native to either of the
languages involved. This trumps other matters such as differing calling
conventions because the problem space is so much greater. There are
simply more ways to shove data into memory than there are ways to
implement a function call.

Perl XS' attempt at a solution to this is the concept of typemaps. At
an abstract level, a Perl XS typemap is nothing but a recipe for
converting from a certain Perl data structure to a certain C data
structure and vice versa. Since there can be C types that are
sufficiently similar to warrant converting with the same logic, XS
typemaps are represented by a unique identifier, henceforth called an
XS type in this document. You can then tell the XS compiler that
multiple C types are to be mapped with the same XS typemap.

In your XS code, when you define an argument with a C type or when you
are using a "CODE:" and an "OUTPUT:" section together with a C return
type of your XSUB, it'll be the typemapping mechanism that makes this
easy.

Anatomy of a typemap
In more practical terms, the typemap is a collection of code fragments
which are used by the xsubpp compiler to map C function parameters and
values to Perl values. The typemap file may consist of three sections
labelled "TYPEMAP", "INPUT", and "OUTPUT". An unlabelled initial
section is assumed to be a "TYPEMAP" section. The INPUT section tells
the compiler how to translate Perl values into variables of certain C
types. The OUTPUT section tells the compiler how to translate the
values from certain C types into values Perl can understand. The
TYPEMAP section tells the compiler which of the INPUT and OUTPUT code
fragments should be used to map a given C type to a Perl value. The
section labels "TYPEMAP", "INPUT", or "OUTPUT" must begin in the first
column on a line by themselves, and must be in uppercase.

Each type of section can appear an arbitrary number of times and does
not have to appear at all. For example, a typemap may commonly lack
"INPUT" and "OUTPUT" sections if all it needs to do is associate
additional C types with core XS types like T_PTROBJ. Lines that start
with a hash "#" are considered comments and ignored in the "TYPEMAP"
section, but are considered significant in "INPUT" and "OUTPUT". Blank
lines are generally ignored.
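
The rule for section labels can be sketched in plain C. This is only an
illustration, not how xsubpp actually parses typemaps: the names
typemap_section and section_label are made up for this sketch, and
tolerating trailing whitespace after the label is an assumption that the
text above does not spell out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; not part of xsubpp or ExtUtils::Typemaps. */
enum typemap_section { SEC_NONE, SEC_TYPEMAP, SEC_INPUT, SEC_OUTPUT };

/* Returns the section that a line introduces, or SEC_NONE for any other
 * line. Assumption: trailing whitespace (including the newline) is
 * tolerated. */
static enum typemap_section section_label(const char *line)
{
    static const struct {
        const char *name;
        enum typemap_section section;
    } labels[] = {
        { "TYPEMAP", SEC_TYPEMAP },
        { "INPUT",   SEC_INPUT   },
        { "OUTPUT",  SEC_OUTPUT  },
    };
    size_t len;

    assert(line != NULL);
    len = strlen(line);

    /* Ignore trailing whitespace so that "INPUT\n" still counts. */
    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;

    /* The label must start in the first column and be alone on the line.
     * The comparison is case-sensitive, so "input" is not a label. */
    for (size_t i = 0; i < sizeof labels / sizeof labels[0]; i++) {
        if (strlen(labels[i].name) == len &&
            strncmp(line, labels[i].name, len) == 0)
            return labels[i].section;
    }
    return SEC_NONE;
}
```

With this sketch, an indented " INPUT" or a lowercase "input" is treated
as ordinary content rather than as the start of a new section.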

Traditionally, typemaps needed to be written to a separate file,
conventionally called "typemap" in a CPAN distribution. With
ExtUtils::ParseXS (the XS compiler) version 3.12 or better, which comes
with perl 5.16, typemaps can also be embedded directly into XS code
using a HERE-doc like syntax:

    TYPEMAP: <<HERE
    ...
    HERE

where "HERE" can be replaced by other identifiers like with normal Perl
HERE-docs. All details below about the typemap textual format remain
valid.

The "TYPEMAP" section should contain one pair of C type and XS type per
line as follows. An example from the core typemap file:

    TYPEMAP
    # all variants of char* are handled by the T_PV typemap
    char *          T_PV
    const char *    T_PV
    unsigned char * T_PV
    ...

The "INPUT" and "OUTPUT" sections have identical formats, that is, each
unindented line starts a new in- or output map respectively. A new in-
or output map must start with the name of the XS type to map on a line
by itself, followed by the code that implements it indented on the
following lines. Example:

    INPUT
    T_PV
        $var = ($type)SvPV_nolen($arg)
    T_PTR
        $var = INT2PTR($type,SvIV($arg))

We'll get to the meaning of those Perlish-looking variables in a little
bit.

Finally, here's an example of the full typemap file for mapping C
strings of the "char *" type to Perl scalars/strings:

    TYPEMAP
    char *  T_PV

    INPUT
    T_PV
        $var = ($type)SvPV_nolen($arg)

    OUTPUT
    T_PV
        sv_setpv((SV*)$arg, $var);

Here's a more complicated example: suppose that you wanted "struct
netconfig" to be blessed into the class "Net::Config". One way to do
this is to use underscores (_) to separate package names, as follows:

    typedef struct netconfig * Net_Config;

And then provide a typemap entry "T_PTROBJ_SPECIAL" that maps
underscores to double-colons (::), and declare "Net_Config" to be of
that type:

    TYPEMAP
    Net_Config      T_PTROBJ_SPECIAL

    INPUT
    T_PTROBJ_SPECIAL
      if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
        IV tmp = SvIV((SV*)SvRV($arg));
        $var = INT2PTR($type, tmp);
      }
      else
        croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")

    OUTPUT
    T_PTROBJ_SPECIAL
      sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
                   (void*)$var);

The INPUT and OUTPUT sections replace underscores with double-colons
on the fly, giving the desired effect. This example demonstrates some
of the power and versatility of the typemap facility.
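
For readers more at home in C than in Perl, the substitution that the
embedded "s/_/::/g" performs at xsubpp time can be sketched as follows.
This is purely illustrative: the typemap does the work in Perl while the
C code is being generated, and the helper name underscores_to_colons is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies ntype into out, turning each '_' into "::".
 * Returns 0 on success, -1 if out (capacity outlen) is too small. */
static int underscores_to_colons(const char *ntype, char *out, size_t outlen)
{
    size_t n = 0;

    assert(ntype != NULL && out != NULL);
    for (const char *p = ntype; *p != '\0'; p++) {
        size_t need = (*p == '_') ? 2 : 1;

        if (n + need >= outlen)
            return -1;              /* keep room for the terminating NUL */
        if (*p == '_') {
            memcpy(out + n, "::", 2);
            n += 2;
        } else {
            out[n++] = *p;
        }
    }
    if (n >= outlen)
        return -1;
    out[n] = '\0';
    return 0;
}
```

Applied to "Net_Config" this yields "Net::Config", which is the class
name the example typemap passes to sv_derived_from and sv_setref_pv.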

The "INT2PTR" macro (defined in perl.h) casts an integer to a pointer
of a given type, taking care of the possible different size of integers
and pointers. There are also "PTR2IV", "PTR2UV", "PTR2NV" macros, to
map the other way, which may be useful in OUTPUT sections.
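
The round trip these macros provide can be sketched without the perl
headers. The definitions below are not those from perl.h. They are
stand-ins with made-up names (sketch_iv, sketch_ptr2iv, SKETCH_INT2PTR)
that assume intptr_t is available and only show the idea of going
through an integer that is wide enough to hold a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for perl's IV: an integer wide enough to hold a
 * pointer on this platform. */
typedef intptr_t sketch_iv;

/* Rough analogue of PTR2IV: pointer to integer via an explicit cast. */
static sketch_iv sketch_ptr2iv(const void *p)
{
    /* The whole point is that the integer must not be narrower. */
    assert(sizeof(sketch_iv) >= sizeof(void *));
    return (sketch_iv)(intptr_t)p;
}

/* Rough analogue of INT2PTR(type, iv): integer back to a typed pointer. */
#define SKETCH_INT2PTR(type, iv) ((type)(intptr_t)(iv))
```

In a real typemap you would use the macros from perl.h, as the T_PTR
example earlier does with INT2PTR($type,SvIV($arg)).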

The Role of the typemap File in Your Distribution
The default typemap in the lib/ExtUtils directory of the Perl source
contains many useful types which can be used by Perl extensions. Some
extensions define additional typemaps which they keep in their own
directory. These additional typemaps may reference INPUT and OUTPUT
maps in the main typemap. The xsubpp compiler will allow the
extension's own typemap to override any mappings which are in the
default typemap. Instead of using an additional typemap file, typemaps
may be embedded verbatim in XS with a heredoc-like syntax. See the
documentation on the "TYPEMAP:" XS keyword.

For CPAN distributions, you can assume that the XS types defined by the
perl core are already available. Additionally, the core typemap has
default XS types for a large number of C types. For example, if you
simply return a "char *" from your XSUB, the core typemap will have
this C type associated with the T_PV XS type. That means your C string
will be copied into the PV (pointer value) slot of a new scalar that
will be returned from your XSUB to Perl.

If you're developing a CPAN distribution using XS, you may add your own
file called typemap to the distribution. That file may contain
typemaps that either map types that are specific to your code or that
override the core typemap file's mappings for common C types.

Sharing typemaps Between CPAN Distributions
Starting with ExtUtils::ParseXS version 3.13_01 (comes with perl 5.16
and better), it is rather easy to share typemap code between multiple
CPAN distributions. The general idea is to share it as a module that
offers a certain API and have the dependent modules declare that as a
build-time requirement and import the typemap into the XS. An example
of such a typemap-sharing module on CPAN is
"ExtUtils::Typemaps::Basic". Two steps to getting that module's
typemaps available in your code:

·   Declare "ExtUtils::Typemaps::Basic" as a build-time dependency in
    "Makefile.PL" (use "BUILD_REQUIRES"), or in your "Build.PL" (use
    "build_requires").

·   Include the following line in the XS section of your XS file:
    (don't break the line)

      INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd -e "print embeddable_typemap(q{Basic})"

Writing typemap Entries
Each INPUT or OUTPUT typemap entry is a double-quoted Perl string that
will be evaluated in the presence of certain variables to get the final
C code for mapping a certain C type.

This means that you can embed Perl code in your typemap (C) code using
constructs such as "${ perl code that evaluates to scalar reference
here }". A common use case is to generate error messages that refer to
the true function name even when using the ALIAS XS feature:

    ${ $ALIAS ? \q[GvNAME(CvGV(cv))] : \qq[\"$pname\"] }

For many typemap examples, refer to the core typemap file that can be
found in the perl source tree at lib/ExtUtils/typemap.

The Perl variables that are available for interpolation into typemaps
are the following:

·   $var - the name of the input or output variable, e.g. RETVAL for
    return values.

·   $type - the raw C type of the parameter, any ":" replaced with "_".

·   $ntype - the supplied type with "*" replaced with "Ptr". e.g. for
    a type of "Foo::Bar", $ntype is "Foo::Bar"

·   $arg - the stack entry that the parameter is input from or output
    to, e.g. ST(0)

·   $argoff - the argument stack offset of the argument, i.e. 0 for the
    first argument, etc.

·   $pname - the full name of the XSUB, including the "PACKAGE"
    name, with any "PREFIX" stripped. This is the non-ALIAS name.

·   $Package - the package specified by the most recent "PACKAGE"
    keyword.

·   $ALIAS - non-zero if the current XSUB has any aliases declared with
    "ALIAS".
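
To make the $ntype rule concrete, here is a small C sketch of the
normalization described in the list above. It is not the code xsubpp
runs (that is Perl), and the function name normalize_ntype is invented
for this example. Dropping the whitespace directly in front of a "*" is
an assumption, inferred from the "foo_t *" to "foo_tPtr" naming used in
the T_PACKED examples later in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the normalized form of ctype into out.
 * Every '*' becomes "Ptr"; whitespace directly before a '*' is dropped
 * (assumption, see the lead-in). Colons are left untouched.
 * Returns 0 on success, -1 if out (capacity outlen) is too small. */
static int normalize_ntype(const char *ctype, char *out, size_t outlen)
{
    size_t n = 0;

    assert(ctype != NULL && out != NULL);
    for (const char *p = ctype; *p != '\0'; p++) {
        if (*p == ' ' || *p == '\t') {
            const char *q = p;

            while (*q == ' ' || *q == '\t')
                q++;
            if (*q == '*') {
                p = q - 1;          /* the loop increment lands on the '*' */
                continue;
            }
        }
        if (*p == '*') {
            if (n + 3 >= outlen)
                return -1;          /* keep room for the terminating NUL */
            memcpy(out + n, "Ptr", 3);
            n += 3;
        } else {
            if (n + 1 >= outlen)
                return -1;
            out[n++] = *p;
        }
    }
    if (n >= outlen)
        return -1;
    out[n] = '\0';
    return 0;
}
```

Under these assumptions "foo_t *" becomes "foo_tPtr" and "foo_t **"
becomes "foo_tPtrPtr", while "Foo::Bar" is returned unchanged, matching
the $ntype example in the list.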

Full Listing of Core Typemaps
Each C type is represented by an entry in the typemap file that is
responsible for converting perl variables (SV, AV, HV, CV, etc.) to
and from that type. The following sections list all XS types that come
with perl by default.

T_SV
    This simply passes the C representation of the Perl variable (an
    SV*) in and out of the XS layer. This can be used if the C code
    wants to deal directly with the Perl variable.

T_SVREF
    Used to pass in and return a reference to an SV.

    Note that this typemap does not decrement the reference count when
    returning the reference to an SV*. See also:
    T_SVREF_REFCOUNT_FIXED

T_SVREF_REFCOUNT_FIXED
    Used to pass in and return a reference to an SV. This is a fixed
    variant of T_SVREF that decrements the refcount appropriately when
    returning a reference to an SV*. Introduced in perl 5.15.4.

T_AVREF
    From the perl level this is a reference to a perl array. From the
    C level this is a pointer to an AV.

    Note that this typemap does not decrement the reference count when
    returning an AV*. See also: T_AVREF_REFCOUNT_FIXED

T_AVREF_REFCOUNT_FIXED
    From the perl level this is a reference to a perl array. From the
    C level this is a pointer to an AV. This is a fixed variant of
    T_AVREF that decrements the refcount appropriately when returning
    an AV*. Introduced in perl 5.15.4.

T_HVREF
    From the perl level this is a reference to a perl hash. From the C
    level this is a pointer to an HV.

    Note that this typemap does not decrement the reference count when
    returning an HV*. See also: T_HVREF_REFCOUNT_FIXED

T_HVREF_REFCOUNT_FIXED
    From the perl level this is a reference to a perl hash. From the C
    level this is a pointer to an HV. This is a fixed variant of
    T_HVREF that decrements the refcount appropriately when returning
    an HV*. Introduced in perl 5.15.4.

T_CVREF
    From the perl level this is a reference to a perl subroutine (e.g.
    $sub = sub { 1 };). From the C level this is a pointer to a CV.

    Note that this typemap does not decrement the reference count when
    returning a CV*. See also: T_CVREF_REFCOUNT_FIXED

T_CVREF_REFCOUNT_FIXED
    From the perl level this is a reference to a perl subroutine (e.g.
    $sub = sub { 1 };). From the C level this is a pointer to a CV.

    This is a fixed variant of T_CVREF that decrements the refcount
    appropriately when returning a CV*. Introduced in perl 5.15.4.

T_SYSRET
    The T_SYSRET typemap is used to process return values from system
    calls. It is only meaningful when passing values from C to perl
    (there is no concept of passing a system return value from Perl to
    C).

    System calls return -1 on error (setting errno with the reason) and
    (usually) 0 on success. If the return value is -1 this typemap
    returns "undef". If the return value is not -1, this typemap
    translates a 0 (perl false) to "0 but true" (which is perl true) or
    returns the value itself, to indicate that the command succeeded.

    The POSIX module makes extensive use of this type.
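
The decision rule described for T_SYSRET can be written down as a small
standalone C sketch. It deliberately avoids the perl API: the type
sysret_result and the function map_sysret are hypothetical names that
merely classify a return value the way the text above describes.

```c
#include <assert.h>
#include <stddef.h>

/* What the Perl caller would see; hypothetical names for this sketch. */
enum sysret_kind {
    SYSRET_UNDEF,           /* the call failed, Perl gets undef       */
    SYSRET_ZERO_BUT_TRUE,   /* success with 0, Perl gets "0 but true" */
    SYSRET_VALUE            /* any other value is passed through      */
};

struct sysret_result {
    enum sysret_kind kind;
    long value;             /* meaningful only for SYSRET_VALUE */
};

/* Classifies a system call return value according to the rule above. */
static void map_sysret(long rv, struct sysret_result *out)
{
    assert(out != NULL);
    out->value = 0;
    if (rv == -1) {
        out->kind = SYSRET_UNDEF;
    } else if (rv == 0) {
        out->kind = SYSRET_ZERO_BUT_TRUE;
    } else {
        out->kind = SYSRET_VALUE;
        out->value = rv;
    }
}
```

On the Perl side this is what allows the familiar idiom of testing the
result for truth and consulting $! only when it is undefined.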

T_UV
    An unsigned integer.

T_IV
    A signed integer. This is cast to the required integer type when
    passed to C and converted to an IV when passed back to Perl.

T_INT
    A signed integer. This typemap converts the Perl value to a native
    integer type (the "int" type on the current platform). When
    returning the value to perl it is processed in the same way as for
    T_IV.

    Its behaviour is identical to using an "int" type in XS with T_IV.

T_ENUM
    An enum value. Used to transfer an enum component from C. There is
    no reason to pass an enum value to C since it is stored as an IV
    inside perl.

T_BOOL
    A boolean type. This can be used to pass true and false values to
    and from C.

T_U_INT
    This is for unsigned integers. It is equivalent to using T_UV but
    explicitly casts the variable to type "unsigned int". The default
    type for "unsigned int" is T_UV.

T_SHORT
    Short integers. This is equivalent to T_IV but explicitly casts the
    return to type "short". The default typemap for "short" is T_IV.

T_U_SHORT
    Unsigned short integers. This is equivalent to T_UV but explicitly
    casts the return to type "unsigned short". The default typemap for
    "unsigned short" is T_UV.

    T_U_SHORT is used for type "U16" in the standard typemap.

T_LONG
    Long integers. This is equivalent to T_IV but explicitly casts the
    return to type "long". The default typemap for "long" is T_IV.

T_U_LONG
    Unsigned long integers. This is equivalent to T_UV but explicitly
    casts the return to type "unsigned long". The default typemap for
    "unsigned long" is T_UV.

    T_U_LONG is used for type "U32" in the standard typemap.

T_CHAR
    Single 8-bit characters.

T_U_CHAR
    An unsigned byte.

T_FLOAT
    A floating point number. This typemap guarantees to return a
    variable cast to a "float".

T_NV
    A Perl floating point number. Similar to T_IV and T_UV in that the
    return type is cast to the requested numeric type rather than to a
    specific type.

T_DOUBLE
    A double precision floating point number. This typemap guarantees
    to return a variable cast to a "double".

T_PV
    A string (char *).

T_PTR
    A memory address (pointer). Typically associated with a "void *"
    type.

T_PTRREF
    Similar to T_PTR except that the pointer is stored in a scalar and
    the reference to that scalar is returned to the caller. This can be
    used to hide the actual pointer value from the programmer since it
    is usually not required directly from within perl.

    The typemap checks that a scalar reference is passed from perl to
    XS.

T_PTROBJ
    Similar to T_PTRREF except that the reference is blessed into a
    class. This allows the pointer to be used as an object. Most
    commonly used to deal with C structs. The typemap checks that the
    perl object passed into the XS routine is of the correct class (or
    part of a subclass).

    The pointer is blessed into a class that is derived from the name
    of the type of the pointer but with all '*' in the name replaced
    with 'Ptr'.

T_REF_IV_REF
    NOT YET

T_REF_IV_PTR
    Similar to T_PTROBJ in that the pointer is blessed into a scalar
    object. The difference is that when the object is passed back into
    XS it must be of the correct type (inheritance is not supported).

    The pointer is blessed into a class that is derived from the name
    of the type of the pointer but with all '*' in the name replaced
    with 'Ptr'.

T_PTRDESC
    NOT YET

T_REFREF
    Similar to T_PTRREF, except the pointer stored in the referenced
    scalar is dereferenced and copied to the output variable. This
    means that T_REFREF is to T_PTRREF as T_OPAQUE is to T_OPAQUEPTR.
    All clear?

    Only the INPUT part of this is implemented (Perl to XSUB) and there
    are no known users in core or on CPAN.

T_REFOBJ
    NOT YET

T_OPAQUEPTR
    This can be used to store bytes in the string component of the SV.
    Here the representation of the data is irrelevant to perl and the
    bytes themselves are just stored in the SV. It is assumed that the
    C variable is a pointer (the bytes are copied from that memory
    location). If the pointer is pointing to something that is
    represented by 8 bytes then those 8 bytes are stored in the SV (and
    length() will report a value of 8). This entry is similar to
    T_OPAQUE.

    In principle the unpack() command can be used to convert the bytes
    back to a number (if the underlying type is known to be a number).

    This entry can be used to store a C structure (the number of bytes
    to be copied is calculated using the C "sizeof" operator) and can
    be used as an alternative to T_PTRREF without having to worry about
    a memory leak (since Perl will clean up the SV).
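
The byte-copy idea behind T_OPAQUEPTR can be illustrated in plain C. In
this sketch a caller-supplied byte buffer stands in for the string
component of the SV. The struct point type and the helpers opaque_store
and opaque_load are hypothetical, and no perl API is involved. The real
typemap works on the SV's own buffer rather than on a separate array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C struct that an XSUB hands around by pointer. */
struct point {
    int x;
    int y;
};

/* C to Perl direction: copy the pointed-to bytes into buf, which stands
 * in for the SV's string buffer. Returns the number of bytes stored, or
 * 0 if buf is too small. */
static size_t opaque_store(const struct point *in, unsigned char *buf,
                           size_t buflen)
{
    assert(in != NULL && buf != NULL);
    if (buflen < sizeof *in)
        return 0;
    memcpy(buf, in, sizeof *in);    /* sizeof decides how much is copied */
    return sizeof *in;
}

/* Perl to C direction: recover the struct from the stored bytes. Copying
 * sidesteps alignment concerns in this standalone sketch.
 * Returns 0 on success, -1 if the byte count does not match. */
static int opaque_load(const unsigned char *buf, size_t len,
                       struct point *out)
{
    assert(buf != NULL && out != NULL);
    if (len != sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

The stored length equals sizeof(struct point), which corresponds to the
value length() would report for the scalar on the Perl side.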

T_OPAQUE
    This can be used to store data from non-pointer types in the string
    part of an SV. It is similar to T_OPAQUEPTR except that the typemap
    retrieves the pointer directly rather than assuming it is being
    supplied. For example, if an integer is imported into Perl using
    T_OPAQUE rather than T_IV the underlying bytes representing the
    integer will be stored in the SV but the actual integer value will
    not be available, i.e. the data is opaque to perl.

    The data may be retrieved using the "unpack" function if the
    underlying type of the byte stream is known.

    T_OPAQUE supports input and output of simple types. T_OPAQUEPTR
    can be used to pass these bytes back into C if a pointer is
    acceptable.

Implicit array
    xsubpp supports a special syntax for returning packed C arrays to
    perl. If the XS return type is given as

        array(type, nelem)

    xsubpp will copy the contents of "nelem * sizeof(type)" bytes from
    RETVAL to an SV and push it onto the stack. This is only really
    useful if the number of items to be returned is known at compile
    time and you don't mind having a string of bytes in your SV. Use
    T_ARRAY to push a variable number of arguments onto the return
    stack (they won't be packed as a single string though).

    This is similar to using T_OPAQUEPTR but can be used to process
    more than one element.

T_PACKED
    Calls user-supplied functions for conversion. For "OUTPUT" (XSUB to
    Perl), a function named "XS_pack_$ntype" is called with the output
    Perl scalar and the C variable to convert from. $ntype is the
    normalized C type that is to be mapped to Perl. Normalized means
    that all "*" are replaced by the string "Ptr". The return value of
    the function is ignored.

    Conversely for "INPUT" (Perl to XSUB) mapping, the function named
    "XS_unpack_$ntype" is called with the input Perl scalar as argument
    and the return value is cast to the mapped C type and assigned to
    the output C variable.

    An example conversion function for a typemapped struct "foo_t *"
    might be:

        static void
        XS_pack_foo_tPtr(SV *out, foo_t *in)
        {
          dTHX; /* alas, signature does not include pTHX_ */
          HV* hash = newHV();
          hv_stores(hash, "int_member", newSViv(in->int_member));
          hv_stores(hash, "float_member", newSVnv(in->float_member));
          /* ... */

          /* mortalize as thy stack is not refcounted */
          sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash)));
        }

    The conversion from Perl to C is left as an exercise to the reader,
    but the prototype would be:

        static foo_t *
        XS_unpack_foo_tPtr(SV *in);

    Instead of an actual C function that has to fetch the thread
    context using "dTHX", you can define macros of the same name and
    avoid the overhead. Also, remember that you may need to free the
    memory allocated by "XS_unpack_foo_tPtr".

T_PACKEDARRAY
    T_PACKEDARRAY is similar to T_PACKED. In fact, the "INPUT" (Perl to
    XSUB) typemap is identical, but the "OUTPUT" typemap passes an
    additional argument to the "XS_pack_$ntype" function. This third
    parameter indicates the number of elements in the output so that
    the function can handle C arrays sanely. The variable needs to be
    declared by the user and must have the name "count_$ntype" where
    $ntype is the normalized C type name as explained above. For the
    example above and a C type of "foo_t **", the signature of the
    function would be:

        static void
        XS_pack_foo_tPtrPtr(SV *out, foo_t **in, UV count_foo_tPtrPtr);

    The type of the third parameter is arbitrary as far as the typemap
    is concerned. It just has to be in line with the declared variable.

    Of course, unless you know the number of elements in the "sometype
    **" C array, within your XSUB, the return value from "foo_t **
    XS_unpack_foo_tPtrPtr(...)" will be hard to decipher. Since the
    details are all up to the XS author (the typemap user), there are
    several solutions, none of which is particularly elegant. The most
    commonly seen solution has been to allocate memory for N+1 pointers
    and assign "NULL" to the (N+1)th to facilitate iteration.

    Alternatively, using a customized typemap for your purposes in the
    first place is probably preferable.
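
The N+1 pointer convention mentioned above might look like the following
in plain C. This is a sketch under stated assumptions: foo_t mirrors the
struct from the T_PACKED example, the helper names alloc_foo_array,
count_foo_array and free_foo_array are hypothetical, and a real
XS_unpack_foo_tPtrPtr would fill the elements from a Perl array instead
of from the loop index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the struct used in the T_PACKED example; members assumed. */
typedef struct {
    int    int_member;
    double float_member;
} foo_t;

/* Frees every element up to the terminating NULL, then the array. */
static void free_foo_array(foo_t **arr)
{
    if (arr == NULL)
        return;
    for (size_t i = 0; arr[i] != NULL; i++)
        free(arr[i]);
    free(arr);
}

/* Allocates n elements plus one extra slot holding the NULL sentinel. */
static foo_t **alloc_foo_array(size_t n)
{
    foo_t **arr = malloc((n + 1) * sizeof *arr);

    if (arr == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        arr[i] = malloc(sizeof **arr);
        if (arr[i] == NULL) {       /* arr[i] is NULL, so it ends the list */
            free_foo_array(arr);
            return NULL;
        }
        arr[i]->int_member = (int)i;
        arr[i]->float_member = 0.0;
    }
    arr[n] = NULL;                  /* the (N+1)th pointer */
    return arr;
}

/* Recovers the element count by walking to the sentinel. */
static size_t count_foo_array(foo_t *const *arr)
{
    size_t n = 0;

    assert(arr != NULL);
    while (arr[n] != NULL)
        n++;
    return n;
}
```

The sentinel lets the XSUB iterate and free the array without a separate
length variable, at the cost of not being able to store NULL elements.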

T_DATAUNIT
    NOT YET

T_CALLBACK
    NOT YET

T_ARRAY
    This is used to convert the perl argument list to a C array and for
    pushing the contents of a C array onto the perl argument stack.

    The usual calling signature is

        @out = array_func( @in );

    Any number of arguments can occur in the list before the array but
    the input and output arrays must be the last elements in the list.

    When used to pass a perl list to C the XS writer must provide a
    function (named after the array type but with 'Ptr' substituted for
    '*') to allocate the memory required to hold the list. A pointer
    should be returned. It is up to the XS writer to free the memory on
    exit from the function. The variable "ix_$var" is set to the number
    of elements in the new array.

    When returning a C array to Perl the XS writer must provide an
    integer variable called "size_$var" containing the number of
    elements in the array. This is used to determine how many elements
    should be pushed onto the return argument stack. This is not
    required on input since Perl knows how many arguments are on the
    stack when the routine is called. Ordinarily this variable would be
    called "size_RETVAL".

    Additionally, the type of each element is determined from the type
    of the array. If the array uses type "intArray *" xsubpp will
    automatically work out that it contains variables of type "int" and
    use that typemap entry to perform the copy of each element. All
    pointer '*' and 'Array' tags are removed from the name to determine
    the subtype.
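
The subtype rule can be sketched in C as follows. xsubpp performs this
step in Perl and may differ in detail. The function name array_subtype
is hypothetical, and trimming the whitespace left behind after the tags
are removed is an assumption made so that "intArray *" comes out as
plain "int".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: derives the element type from an array type name
 * by dropping every '*' and every "Array" tag, then trimming trailing
 * whitespace. Returns 0 on success, -1 if out (capacity outlen) is too
 * small. */
static int array_subtype(const char *ctype, char *out, size_t outlen)
{
    size_t n = 0;
    const char *p;

    assert(ctype != NULL && out != NULL);
    if (outlen == 0)
        return -1;

    p = ctype;
    while (*p != '\0') {
        if (*p == '*') {                        /* drop pointer markers */
            p++;
            continue;
        }
        if (strncmp(p, "Array", 5) == 0) {      /* drop the 'Array' tag */
            p += 5;
            continue;
        }
        if (n + 1 >= outlen)
            return -1;              /* keep room for the terminating NUL */
        out[n++] = *p++;
    }

    /* Trim whitespace left over, e.g. the blank in "intArray *". */
    while (n > 0 && (out[n - 1] == ' ' || out[n - 1] == '\t'))
        n--;
    out[n] = '\0';
    return 0;
}
```

The resulting name ("int" for "intArray *") is the C type whose typemap
entry is then used to copy each individual element.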

T_STDIO
    This is used for passing perl filehandles to and from C using "FILE
    *" structures.

T_INOUT
    This is used for passing perl filehandles to and from C using
    "PerlIO *" structures. The file handle can be used for reading and
    writing. This corresponds to the "+<" mode, see also T_IN and
    T_OUT.

    See perliol for more information on the Perl IO abstraction layer.
    Perl must have been built with "-Duseperlio".

    There is no check to assert that the filehandle passed from Perl to
    C was created with the right "open()" mode.

    Hint: The perlxstut tutorial covers the T_INOUT, T_IN, and T_OUT XS
    types nicely.

T_IN
    Same as T_INOUT, but the filehandle that is returned from C to Perl
    can only be used for reading (mode "<").

T_OUT
    Same as T_INOUT, but the filehandle that is returned from C to Perl
    is set to use the open mode "+>".



perl v5.16.3                      2013-03-04                  PERLXSTYPEMAP(1)