1PP(1) User Contributed Perl Documentation PP(1)
2
3
4
6 PDL::PP - Generate PDL routines from concise descriptions
7
9 e.g.
10
11 pp_def(
12 'sumover',
13 Pars => 'a(n); [o]b();',
14 Code => q{
15 double tmp=0;
16 loop(n) %{
17 tmp += $a();
18 %}
19 $b() = tmp;
20 },
21 );
22
23 pp_done();
24
26 Here is a quick reference list of the functions provided by PDL::PP.
27
28 pp_add_boot
29 Add code to the BOOT section of generated XS file
30
31 pp_add_exported
32 Add functions to the list of exported functions
33
34 pp_add_isa
35 Add entries to the @ISA list
36
37 pp_addbegin
38 Sets code to be added at the top of the generate .pm file
39
40 pp_addhdr
41 Add code and includes to C section of the generated XS file
42
43 pp_addpm
44 Add code to the generated .pm file
45
46 pp_addxs
47 Add extra XS code to the generated XS file
48
49 pp_beginwrap
50 Add BEGIN-block wrapping to code for the generated .pm file
51
52 pp_bless
53 Sets the package to which the XS code is added (default is PDL)
54
55 pp_boundscheck
56 Control state of PDL bounds checking activity
57
58 pp_core_importList
59 Specify what is imported from PDL::Core
60
61 pp_def
62 Define a new PDL function
63
64 pp_deprecate_module
65 Add runtime and POD warnings about a module being deprecated
66
67 pp_done
68 Mark the end of PDL::PP definitions in the file
69
70 pp_export_nothing
71 Clear out the export list for your generated module
72
73 pp_line_numbers
74 Add line number information to simplify debugging of PDL::PP code
75
76 pp_setversion
77 Set the version for .pm and .xs files
78
80 For an alternate introduction to PDL::PP, see Practical Magick with C,
81 PDL, and PDL::PP -- a guide to compiled add-ons for PDL
82 <https://arxiv.org/abs/1702.07753>.
83
84 Why do we need PP? Several reasons: firstly, we want to be able to
85 generate subroutine code for each of the PDL datatypes (PDL_Byte,
86 PDL_Short, etc). AUTOMATICALLY. Secondly, when referring to slices of
87 PDL arrays in Perl (e.g. "$x->slice('0:10:2,:')" or other things such
88 as transposes) it is nice to be able to do this transparently and to be
89 able to do this 'in-place' - i.e, not to have to make a memory copy of
90 the section. PP handles all the necessary element and offset arithmetic
91 for you. There are also the notions of threading (repeated calling of
92 the same routine for multiple slices, see PDL::Indexing) and dataflow
93 (see PDL::Dataflow) which use of PP allows.
94
95 In much of what follows we will assume familiarity of the reader with
96 the concepts of implicit and explicit threading and index manipulations
97 within PDL. If you have not yet heard of these concepts or are not very
98 comfortable with them it is time to check PDL::Indexing.
99
100 As you may appreciate from its name PDL::PP is a Pre-Processor, i.e.
101 it expands code via substitutions to make real C-code. Technically, the
102 output is XS code (see perlxs) but that is very close to C.
103
104 So how do you use PP? Well for the most part you just write ordinary C
105 code except for special PP constructs which take the form:
106
107 $something(something else)
108
109 or:
110
111 PPfunction %{
112 <stuff>
113 %}
114
115 The most important PP construct is the form "$array()". Consider the
116 very simple PP function to sum the elements of a 1D vector (in fact
117 this is very similar to the actual code used by 'sumover'):
118
119 pp_def('sumit',
120 Pars => 'a(n); [o]b();',
121 Code => q{
122 double tmp;
123 tmp = 0;
124 loop(n) %{
125 tmp += $a();
126 %}
127 $b() = tmp;
128 }
129 );
130
131 What's going on? The "Pars =>" line is very important for PP - it
132 specifies all the arguments and their dimensionality. We call this the
133 signature of the PP function (compare also the explanations in
134 PDL::Indexing). In this case the routine takes a 1-D function as input
135 and returns a 0-D scalar as output. The "$a()" PP construct is used to
136 access elements of the array a(n) for you - PP fills in all the
137 required C code.
138
139 You will notice that we are using the "q{}" single-quote operator. This
140 is not an accident. You generally want to use single quotes to denote
141 your PP Code sections. PDL::PP uses "$var()" for its parsing and if you
142 don't use single quotes, Perl will try to interpolate "$var()". Also,
143 using the single quote "q" operator with curly braces makes it look
144 like you are creating a code block, which is What You Mean. (Perl is
145 smart enough to look for nested curly braces and not close the quote
146 until it finds the matching curly brace, so it's safe to have nested
147 blocks.) Under other circumstances, such as when you're stitching
148 together a Code block using string concatenations, it's often easiest
149 to use real single quotes as
150
151 Code => 'something'.$interpolatable.'somethingelse;'
152
153 In the simple case here where all elements are accessed the PP
154 construct "loop(n) %{ ... %}" is used to loop over all elements in
155 dimension "n". Note this feature of PP: ALL DIMENSIONS ARE SPECIFIED
156 BY NAME.
157
158 This is made clearer if we avoid the PP loop() construct and write the
159 loop explicitly using conventional C:
160
161 pp_def('sumit',
162 Pars => 'a(n); [o]b();',
163 Code => q{
164 PDL_Indx i,n_size;
165 double tmp;
166 n_size = $SIZE(n);
167 tmp = 0;
168 for(i=0; i<n_size; i++) {
169 tmp += $a(n=>i);
170 }
171 $b() = tmp;
172 },
173 );
174
175 which does the same as before, but is more long-winded. You can see to
176 get element "i" of a() we say "$a(n=>i)" - we are specifying the
177 dimension by name "n". In 2D we might say:
178
179 Pars=>'a(m,n);',
180 ...
181 tmp += $a(m=>i,n=>j);
182 ...
183
184 The syntax "m=>i" borrows from Perl hashes, which are in fact used in
185 the implementation of PP. One could also say "$a(n=>j,m=>i)" as order
186 is not important.
187
188 You can also see in the above example the use of another PP construct -
189 $SIZE(n) to get the length of the dimension "n".
190
191 It should, however, be noted that you shouldn't write an explicit
192 C-loop when you could have used the PP "loop" construct since PDL::PP
193 checks automatically the loop limits for you, usage of "loop" makes the
194 code more concise, etc. But there are certainly situations where you
195 need explicit control of the loop and now you know how to do it ;).
196
197 To revisit 'Why PP?' - the above code for sumit() will be generated for
198 each data-type. It will operate on slices of arrays 'in-place'. It will
199 thread automatically - e.g. if a 2D array is given it will be called
200 repeatedly for each 1D row (again check PDL::Indexing for the details
201 of threading). And then b() will be a 1D array of sums of each row.
202 We could call it with $x->xchg(0,1) to sum the columns instead. And
203 Dataflow tracing etc. will be available.
204
205 You can see PP saves the programmer from writing a lot of needlessly
206 repetitive C-code -- in our opinion this is one of the best features of
207 PDL making writing new C subroutines for PDL an amazingly concise
208 exercise. A second reason is the ability to make PP expand your concise
209 code definitions into different C code based on the needs of the
210 computer architecture in question. Imagine for example you are lucky to
211 have a supercomputer at your hands; in that case you want PDL::PP
212 certainly to generate code that takes advantage of the
213 vectorising/parallel computing features of your machine (this a project
214 for the future). In any case, the bottom line is that your unchanged
215 code should still expand to working XS code even if the internals of
216 PDL changed.
217
218 Also, because you are generating the code in an actual Perl script,
219 there are many fun things that you can do. Let's say that you need to
220 write both sumit (as above) and multit. With a little bit of
221 creativity, we can do
222
223 for({Name => 'sumit', Init => '0', Op => '+='},
224 {Name => 'multit', Init => '1', Op => '*='}) {
225 pp_def($_->{Name},
226 Pars => 'a(n); [o]b();',
227 Code => '
228 double tmp;
229 tmp = '.$_->{Init}.';
230 loop(n) %{
231 tmp '.$_->{Op}.' $a();
232 %}
233 $b() = tmp;
234 ');
235 }
236
237 which defines both the functions easily. Now, if you later need to
238 change the signature or dimensionality or whatever, you only need to
239 change one place in your code. Yeah, sure, your editor does have 'cut
240 and paste' and 'search and replace' but it's still less bothersome and
241 definitely more difficult to forget just one place and have strange
242 bugs creep in. Also, adding 'orit' (bitwise or) later is a one-liner.
243
244 And remember, you really have Perl's full abilities with you - you can
245 very easily read any input file and make routines from the information
246 in that file. For simple cases like the above, the author (Tjl)
247 currently favors the hash syntax like the above - it's not too much
248 more characters than the corresponding array syntax but much easier to
249 understand and change.
250
251 We should mention here also the ability to get the pointer to the
252 beginning of the data in memory - a prerequisite for interfacing PDL to
253 some libraries. This is handled with the "$P(var)" directive, see
254 below.
255
256 When starting work on a new pp_def'ined function, if you make a
257 mistake, you will usually find a pile of compiler errors indicating
258 line numbers in the generated XS file. If you know how to read XS files
259 (or if you want to learn the hard way), you could open the generated XS
260 file and search for the line number with the error. However, a recent
261 addition to PDL::PP helps report the correct line number of your
262 errors: "pp_line_numbers". Working with the original summit example, if
263 you had a mis-spelling of tmp in your code, you could change the
264 (erroneous) code to something like this and the compiler would give you
265 much more useful information:
266
267 pp_def('sumit',
268 Pars => 'a(n); [o]b();',
269 Code => pp_line_numbers(__LINE__, q{
270 double tmp;
271 tmp = 0;
272 loop(n) %{
273 tmp += $a();
274 %}
275 $b() = rmp;
276 })
277 );
278
279 For the above situation, my compiler tells me:
280
281 ...
282 test.pd:15: error: 'rmp' undeclared (first use in this function)
283 ...
284
285 In my example script (called test.pd), line 15 is exactly the line at
286 which I made my typo: "rmp" instead of "tmp".
287
288 So, after this quick overview of the general flavour of programming PDL
289 routines using PDL::PP let's summarise in which circumstances you
290 should actually use this preprocessor/precompiler. You should use
291 PDL::PP if you want to
292
293 • interface PDL to some external library
294
295 • write some algorithm that would be slow if coded in Perl (this is
296 not as often as you think; take a look at threading and dataflow
297 first).
298
299 • be a PDL developer (and even then it's not obligatory)
300
302 Because of its architecture, PDL::PP can be both flexible and easy to
303 use on the one hand, yet exuberantly complicated at the same time.
304 Currently, part of the problem is that error messages are not very
305 informative and if something goes wrong, you'd better know what you are
306 doing and be able to hack your way through the internals (or be able to
307 figure out by trial and error what is wrong with your args to
308 "pp_def"). Although work is being done to produce better warnings, do
309 not be afraid to send your questions to the mailing list if you run
310 into trouble.
311
313 Now that you have some idea how to use "pp_def" to define new PDL
314 functions it is time to explain the general syntax of "pp_def".
315 "pp_def" takes as arguments first the name of the function you are
316 defining and then a hash list that can contain various keys.
317
318 Based on these keys PP generates XS code and a .pm file. The function
319 "pp_done" (see example in the SYNOPSIS) is used to tell PDL::PP that
320 there are no more definitions in this file and it is time to generate
321 the .xs and
322 .pm file.
323
324 As a consequence, there may be several pp_def() calls inside a file (by
325 convention files with PP code have the extension .pd or .pp) but
326 generally only one pp_done().
327
328 There are two main different types of usage of pp_def(), the 'data
329 operation' and 'slice operation' prototypes.
330
331 The 'data operation' is used to take some data, mangle it and output
332 some other data; this includes for example the '+' operation, matrix
333 inverse, sumover etc and all the examples we have talked about in this
334 document so far. Implicit and explicit threading and the creation of
335 the result are taken care of automatically in those operations. You can
336 even do dataflow with "sumit", "sumover", etc (don't be dismayed if you
337 don't understand the concept of dataflow in PDL very well yet; it is
338 still very much experimental).
339
340 The 'slice operation' is a different kind of operation: in a slice
341 operation, you are not changing any data, you are defining
342 correspondences between different elements of two piddles (examples
343 include the index manipulation/slicing function definitions in the file
344 slices.pd that is part of the PDL distribution; but beware, this is not
345 introductory level stuff).
346
347 If PDL was compiled with support for bad values (i.e. "WITH_BADVAL =>
348 1"), then additional keys are required for "pp_def", as explained
349 below.
350
351 If you are just interested in communicating with some external library
352 (for example some linear algebra/matrix library), you'll usually want
353 the 'data operation' so we are going to discuss that first.
354
356 A simple example
357 In the data operation, you must know what dimensions of data you need.
358 First, an example with scalars:
359
360 pp_def('add',
361 Pars => 'a(); b(); [o]c();',
362 Code => '$c() = $a() + $b();'
363 );
364
365 That looks a little strange but let's dissect it. The first line is
366 easy: we're defining a routine with the name 'add'. The second line
367 simply declares our parameters and the parentheses mean that they are
368 scalars. We call the string that defines our parameters and their
369 dimensionality the signature of that function. For its relevance with
370 regard to threading and index manipulations check the PDL::Indexing man
371 page.
372
373 The third line is the actual operation. You need to use the dollar
374 signs and parentheses to refer to your parameters (this will probably
375 change at some point in the future, once a good syntax is found).
376
377 These lines are all that is necessary to actually define the function
378 for PDL (well, actually it isn't; you additionally need to write a
379 Makefile.PL (see below) and build the module (something like 'perl
380 Makefile.PL; make'); but let's ignore that for the moment). So now you
381 can do
382
383 use MyModule;
384 $x = pdl 2,3,4;
385 $y = pdl 5;
386
387 $c = add($x,$y);
388 # or
389 add($x,$y,($c=null)); # Alternative form, useful if $c has been
390 # preset to something big, not useful here.
391
392 and have threading work correctly (the result is $c == [7 8 9]).
393
394 The Pars section: the signature of a PP function
395 Seeing the above example code you will most probably ask: what is this
396 strange "$c=null" syntax in the second call to our new "add" function?
397 If you take another look at the definition of "add" you will notice
398 that the third argument "c" is flagged with the qualifier "[o]" which
399 tells PDL::PP that this is an output argument. So the above call to add
400 means 'create a new $c from scratch with correct dimensions' - "null"
401 is a special token for 'empty piddle' (you might ask why we haven't
402 used the value "undef" to flag this instead of the PDL specific "null";
403 we are currently thinking about it ;).
404
405 [This should be explained in some other section of the manual as
406 well!!] The reason for having this syntax as an alternative is that if
407 you have really huge piddles, you can do
408
409 $c = PDL->null;
410 for(some long loop) {
411 # munge a,b
412 add($x,$y,$c);
413 # munge c, put something back to x,y
414 }
415
416 and avoid allocating and deallocating $c each time. It is allocated
417 once at the first add() and thereafter the memory stays until $c is
418 destroyed.
419
420 If you just say
421
422 $c = add($x,$y);
423
424 the code generated by PP will automatically fill in "$c=null" and
425 return the result. If you want to learn more about the reasons why
426 PDL::PP supports this style where output arguments are given as last
427 arguments check the PDL::Indexing man page.
428
429 "[o]" is not the only qualifier a pdl argument can have in the
430 signature. Another important qualifier is the "[t]" option which flags
431 a pdl as temporary. What does that mean? You tell PDL::PP that this
432 pdl is only used for temporary results in the course of the calculation
433 and you are not interested in its value after the computation has been
434 completed. But why should PDL::PP want to know about this in the first
435 place? The reason is closely related to the concepts of pdl auto
436 creation (you heard about that above) and implicit threading. If you
437 use implicit threading the dimensionality of automatically created pdls
438 is actually larger than that specified in the signature. With "[o]"
439 flagged pdls will be created so that they have the additional
440 dimensions as required by the number of implicit thread dimensions.
441 When creating a temporary pdl, however, it will always only be made big
442 enough so that it can hold the result for one iteration in a thread
443 loop, i.e. as large as required by the signature. So less memory is
444 wasted when you flag a pdl as temporary. Secondly, you can use output
445 auto creation with temporary pdls even when you are using explicit
446 threading which is forbidden for normal output pdls flagged with "[o]"
447 (see PDL::Indexing).
448
449 Here is an example where we use the [t] qualifier. We define the
450 function "callf" that calls a C routine "f" which needs a temporary
451 array of the same size and type as the array "a" (sorry about the
452 forward reference for $P; it's a pointer access, see below) :
453
454 pp_def('callf',
455 Pars => 'a(n); [t] tmp(n); [o] b()',
456 Code => 'PDL_Indx ns = $SIZE(n);
457 f($P(a),$P(b),$P(tmp),ns);
458 '
459 );
460
461 Argument dimensions and the signature
462 Now we have just talked about dimensions of pdls and the signature. How
463 are they related? Let's say that we want to add a scalar + the index
464 number to a vector:
465
466 pp_def('add2',
467 Pars => 'a(n); b(); [o]c(n);',
468 Code => 'loop(n) %{
469 $c() = $a() + $b() + n;
470 %}'
471 );
472
473 There are several points to notice here: first, the "Pars" argument now
474 contains the n arguments to show that we have a single dimensions in a
475 and c. It is important to note that dimensions are actual entities that
476 are accessed by name so this declares a and c to have the same first
477 dimensions. In most PP definitions the size of named dimensions will be
478 set from the respective dimensions of non-output pdls (those with no
479 "[o]" flag) but sometimes you might want to set the size of a named
480 dimension explicitly through an integer parameter. See below in the
481 description of the "OtherPars" section how that works.
482
483 Constant argument dimensions in the signature
484 Suppose you want an output piddle to be created automatically and you
485 know that on every call its dimension will have the same size (say 9)
486 regardless of the dimensions of the input piddles. In this case you use
487 the following syntax in the Pars section to specify the size of the
488 dimension:
489
490 ' [o] y(n=9); '
491
492 As expected, extra dimensions required by threading will be created if
493 necessary. If you need to assign a named dimension according to a more
494 complicated formula (than a constant) you must use the "RedoDimsCode"
495 key described below.
496
497 Type conversions and the signature
498 The signature also determines the type conversions that will be
499 performed when a PP function is invoked. So what happens when we invoke
500 one of our previously defined functions with pdls of different type,
501 e.g.
502
503 add2($x,$y,($ret=null));
504
505 where $x is of type "PDL_Float" and $y of type "PDL_Short"? With the
506 signature as shown in the definition of "add2" above the datatype of
507 the operation (as determined at runtime) is that of the pdl with the
508 'highest' type (sequence is byte < short < ushort < long < float <
509 double). In the add2 example the datatype of the operation is float ($x
510 has that datatype). All pdl arguments are then type converted to that
511 datatype (they are not converted inplace but a copy with the right type
512 is created if a pdl argument doesn't have the type of the operation).
513 Null pdls don't contribute a type in the determination of the type of
514 the operation. However, they will be created with the datatype of the
515 operation; here, for example, $ret will be of type float. You should be
516 aware of these rules when calling PP functions with pdls of different
517 types to take the additional storage and runtime requirements into
518 account.
519
520 These type conversions are correct for most functions you normally
521 define with "pp_def". However, there are certain cases where slightly
522 modified type conversion behaviour is desired. For these cases
523 additional qualifiers in the signature can be used to specify the
524 desired properties with regard to type conversion. These qualifiers can
525 be combined with those we have encountered already (the creation
526 qualifiers "[o]" and "[t]"). Let's go through the list of qualifiers
527 that change type conversion behaviour.
528
529 The most important is the "indx" qualifier which comes in handy when a
530 pdl argument represents indices into another pdl. Let's take a look at
531 an example from "PDL::Ufunc":
532
533 pp_def('maximum_ind',
534 Pars => 'a(n); indx [o] b()',
535 Code => '$GENERIC() cur;
536 PDL_Indx curind;
537 loop(n) %{
538 if (!n || $a() > cur) {cur = $a(); curind = n;}
539 %}
540 $b() = curind;',
541 );
542
543 The function "maximum_ind" finds the index of the largest element of a
544 vector. If you look at the signature you notice that the output
545 argument "b" has been declared with the additional "indx" qualifier.
546 This has the following consequences for type conversions: regardless of
547 the type of the input pdl "a" the output pdl "b" will be of type
548 "PDL_Indx" which makes sense since "b" will represent an index into
549 "a".
550
551 Note that 'curind' is declared as type "PDL_Indx" and not "indx".
552 While most datatype declarations in the 'Pars' section use the same
553 name as the underlying C type, "indx" is a type which is sufficient to
554 handle PDL indexing operations. For 32-bit installs, it can be a
555 32-bit integer type. For 64-bit installs, it will be a 64-bit integer
556 type.
557
558 Furthermore, if you call the function with an existing output pdl "b"
559 its type will not influence the datatype of the operation (see above).
560 Hence, even if "a" is of a smaller type than "b" it will not be
561 converted to match the type of "b" but stays untouched, which saves
562 memory and CPU cycles and is the right thing to do when "b" represents
563 indices. Also note that you can use the 'indx' qualifier together with
564 other qualifiers (the "[o]" and "[t]" qualifiers). Order is significant
565 -- type qualifiers precede creation qualifiers ("[o]" and "[t]").
566
567 The above example also demonstrates typical usage of the "$GENERIC()"
568 macro. It expands to the current type in a so called generic loop.
569 What is a generic loop? As you already heard a PP function has a
570 runtime datatype as determined by the type of the pdl arguments it has
571 been invoked with. The PP generated XS code for this function
572 therefore contains a switch like "switch (type) {case PDL_Byte: ...
573 case PDL_Double: ...}" that selects a case based on the runtime
574 datatype of the function (it's called a type ``loop'' because there is
575 a loop in PP code that generates the cases). In any case your code is
576 inserted once for each PDL type into this switch statement. The
577 "$GENERIC()" macro just expands to the respective type in each copy of
578 your parsed code in this "switch" statement, e.g., in the "case
579 PDL_Byte" section "cur" will expand to "PDL_Byte" and so on for the
580 other case statements. I guess you realise that this is a useful macro
581 to hold values of pdls in some code.
582
583 There are a couple of other qualifiers with similar effects as "indx".
584 For your convenience there are the "float" and "double" qualifiers with
585 analogous consequences on type conversions as "indx". Let's assume you
586 have a very large array for which you want to compute row and column
587 sums with an equivalent of the "sumover" function. However, with the
588 normal definition of "sumover" you might run into problems when your
589 data is, e.g. of type short. A call like
590
591 sumover($large_pdl,($sums = null));
592
593 will result in $sums be of type short and is therefore prone to
594 overflow errors if $large_pdl is a very large array. On the other hand
595 calling
596
597 @dims = $large_pdl->dims; shift @dims;
598 sumover($large_pdl,($sums = zeroes(double,@dims)));
599
600 is not a good alternative either. Now we don't have overflow problems
601 with $sums but at the expense of a type conversion of $large_pdl to
602 double, something bad if this is really a large pdl. That's where
603 "double" comes in handy:
604
605 pp_def('sumoverd',
606 Pars => 'a(n); double [o] b()',
607 Code => 'double tmp=0;
608 loop(n) %{ tmp += a(); %}
609 $b() = tmp;',
610 );
611
612 This gets us around the type conversion and overflow problems. Again,
613 analogous to the "indx" qualifier "double" results in "b" always being
614 of type double regardless of the type of "a" without leading to a type
615 conversion of "a" as a side effect.
616
617 Finally, there are the "type+" qualifiers where type is one of "int" or
618 "float". What shall that mean. Let's illustrate the "int+" qualifier
619 with the actual definition of sumover:
620
621 pp_def('sumover',
622 Pars => 'a(n); int+ [o] b()',
623 Code => '$GENERIC(b) tmp=0;
624 loop(n) %{ tmp += a(); %}
625 $b() = tmp;',
626 );
627
628 As we had already seen for the "int", "float" and "double" qualifiers,
629 a pdl marked with a "type+" qualifier does not influence the datatype
630 of the pdl operation. Its meaning is "make this pdl at least of type
631 "type" or higher, as required by the type of the operation". In the
632 sumover example this means that when you call the function with an "a"
633 of type PDL_Short the output pdl will be of type PDL_Long (just as
634 would have been the case with the "int" qualifier). This again tries to
635 avoid overflow problems when using small datatypes (e.g. byte images).
636 However, when the datatype of the operation is higher than the type
637 specified in the "type+" qualifier "b" will be created with the
638 datatype of the operation, e.g. when "a" is of type double then "b"
639 will be double as well. We hope you agree that this is sensible
640 behaviour for "sumover". It should be obvious how the "float+"
641 qualifier works by analogy. It may become necessary to be able to
642 specify a set of alternative types for the parameters. However, this
643 will probably not be implemented until someone comes up with a
644 reasonable use for it.
645
646 Note that we now had to specify the $GENERIC macro with the name of the
647 pdl to derive the type from that argument. Why is that? If you
648 carefully followed our explanations you will have realised that in some
649 cases "b" will have a different type than the type of the operation.
650 Calling the '$GENERIC' macro with "b" as argument makes sure that the
651 type will always the same as that of "b" in that part of the generic
652 loop.
653
654 This is about all there is to say about the "Pars" section in a
655 "pp_def" call. You should remember that this section defines the
656 signature of a PP defined function, you can use several options to
657 qualify certain arguments as output and temporary args and all
658 dimensions that you can later refer to in the "Code" section are
659 defined by name.
660
661 It is important that you understand the meaning of the signature since
662 in the latest PDL versions you can use it to define threaded functions
663 from within Perl, i.e. what we call Perl level threading. Please check
664 PDL::Indexing for details.
665
666 The Code section
667 The "Code" section contains the actual XS code that will be in the
668 innermost part of a thread loop (if you don't know what a thread loop
669 is then you still haven't read PDL::Indexing; do it now ;) after any PP
670 macros (like $GENERIC) and PP functions have been expanded (like the
671 "loop" function we are going to explain next).
672
673 Let's quickly reiterate the "sumover" example:
674
675 pp_def('sumover',
676 Pars => 'a(n); int+ [o] b()',
677 Code => '$GENERIC(b) tmp=0;
678 loop(n) %{ tmp += a(); %}
679 $b() = tmp;',
680 );
681
682 The "loop" construct in the "Code" section also refers to the dimension
683 name so you don't need to specify any limits: the loop is correctly
684 sized and everything is done for you, again.
685
686 Next, there is the surprising fact that "$a()" and "$b()" do not
687 contain the index. This is not necessary because we're looping over n
688 and both variables know which dimensions they have so they
689 automatically know they're being looped over.
690
691 This feature comes in very handy in many places and makes for much
692 shorter code. Of course, there are times when you want to circumvent
693 this; here is a function which make a matrix symmetric and serves as an
694 example of how to code explicit looping:
695
696 pp_def('symm',
697 Pars => 'a(n,n); [o]c(n,n);',
698 Code => 'loop(n) %{
699 int n2;
700 for(n2=n; n2<$SIZE(n); n2++) {
701 $c(n0 => n, n1 => n2) =
702 $c(n0 => n2, n1 => n) =
703 $a(n0 => n, n1 => n2);
704 }
705 %}
706 '
707 );
708
709 Let's dissect what is happening. Firstly, what is this function
710 supposed to do? From its signature you see that it takes a 2D matrix
711 with equal numbers of columns and rows and outputs a matrix of the same
712 size. From a given input matrix $a it computes a symmetric output
713 matrix $c (symmetric in the matrix sense that A^T = A where ^T means
714 matrix transpose, or in PDL parlance $c == $c->xchg(0,1)). It does this
715 by using only the values on and below the diagonal of $a. In the output
716 matrix $c all values on and below the diagonal are the same as those in
717 $a while those above the diagonal are a mirror image of those below the
718 diagonal (above and below are here interpreted in the way that PDL
719 prints 2D pdls). If this explanation still sounds a bit strange just go
720 ahead, make a little file into which you write this definition, build
721 the new PDL extension (see section on Makefiles for PP code) and try it
722 out with a couple of examples.
723
724 Having explained what the function is supposed to do there are a couple
725 of points worth noting from the syntactical point of view. First, we
726 get the size of the dimension named "n" again by using the $SIZE macro.
727 Second, there are suddenly these funny "n0" and "n1" index names in the
728 code though the signature defines only the dimension "n". Why this? The
729 reason becomes clear when you note that both the first and second
730 dimension of $a and $b are named "n" in the signature of "symm". This
731 tells PDL::PP that the first and second dimension of these arguments
732 should have the same size. Otherwise the generated function will raise
733 a runtime error. However, now in an access to $a and $c PDL::PP cannot
734 figure out which index "n" refers to any more just from the name of the
735 index. Therefore, the indices with equal dimension names get numbered
736 from left to right starting at 0, e.g. in the above example "n0" refers
737 to the first dimension of $a and $c, "n1" to the second and so on.
738
739 In all examples so far, we have only used the "Pars" and "Code" members
740 of the hash that was passed to "pp_def". There are certainly other keys
741 that are recognised by PDL::PP and we will hear about some of them in
742 the course of this document. Find a (non-exhaustive) list of keys in
743 Appendix A. A list of macros and PPfunctions (we have only encountered
744 some of those in the examples above yet) that are expanded in values of
745 the hash argument to "pp_def" is summarised in Appendix B.
746
747 At this point, it might be appropriate to mention that PDL::PP is not a
748 completely static, well designed set of routines (as Tuomas puts it:
749 "stop thinking of PP as a set of routines carved in stone") but rather
750 a collection of things that the PDL::PP author (Tuomas J. Lukka)
751 considered he would have to write often into his PDL extension
752 routines. PP tries to be expandable so that in the future, as new needs
753 arise, new common code can be abstracted back into it. If you want to
754 learn more on why you might want to change PDL::PP and how to do it
755 check the section on PDL::PP internals.
756
757 Handling bad values
758 If you do not have bad-value support compiled into PDL you can ignore
759 this section and the related keys: "BadCode", "HandleBad", ... (try
760 printing out the value of $PDL::Bad::Status - if it equals 0 then move
761 straight on).
762
763 There are several keys and macros used when writing code to handle bad
764 values. The first one is the "HandleBad" key:
765
766 HandleBad => 0
767 This flags a pp-routine as NOT handling bad values. If this routine
768 is sent piddles with their "badflag" set, then a warning message is
769 printed to STDOUT and the piddles are processed as if the value
770 used to represent bad values is a valid number. The "badflag" value
771 is not propagated to the output piddles.
772
773 An example of when this is used is for FFT routines, which
774 generally do not have a way of ignoring part of the data.
775
776 HandleBad => 1
777 This causes PDL::PP to write extra code that ensures the BadCode
778 section is used, and that the "$ISBAD()" macro (and its brethren)
779 work.
780
781 HandleBad is not given
782 If any of the input piddles have their "badflag" set, then the
783 output piddles will have their "badflag" set, but any supplied
784 BadCode is ignored.
785
786 The value of "HandleBad" is used to define the contents of the "BadDoc"
787 key, if it is not given.
788
789 To handle bad values, code must be written somewhat differently; for
790 instance,
791
792 $c() = $a() + $b();
793
794 becomes something like
795
796 if ( $a() != BADVAL && $b() != BADVAL ) {
797 $c() = $a() + $b();
798 } else {
799 $c() = BADVAL;
800 }
801
802 However, we only want the second version if bad values are present in
803 the input piddles (and that bad-value support is wanted!) - otherwise
804 we actually want the original code. This is where the "BadCode" key
805 comes in; you use it to specify the code to execute if bad values may
806 be present, and PP uses both it and the "Code" section to create
807 something like:
808
809 if ( bad_values_are_present ) {
810 fancy_threadloop_stuff {
811 BadCode
812 }
813 } else {
814 fancy_threadloop_stuff {
815 Code
816 }
817 }
818
819 This approach means that there is virtually no overhead when bad values
820 are not present (i.e. the badflag routine returns 0).
821
822 The C preprocessor symbol "PDL_BAD_CODE" is defined when the bad code
823 is compiled, so that you can reduce the amount of code you write. The
824 BadCode section can use the same macros and looping constructs as the
825 Code section. However, it wouldn't be much use without the following
826 additional macros:
827
828 $ISBAD(var)
829 To check whether a piddle's value is bad, use the $ISBAD macro:
830
831 if ( $ISBAD(a()) ) { printf("a() is bad\n"); }
832
833 You can also access given elements of a piddle:
834
835 if ( $ISBAD(a(n=>l)) ) { printf("element %d of a() is bad\n", l); }
836
837 $ISGOOD(var)
838 This is the opposite of the $ISBAD macro.
839
840 $SETBAD(var)
841 For when you want to set an element of a piddle bad.
842
843 $ISBADVAR(c_var,pdl)
844 If you have cached the value of a piddle "$a()" into a c-variable
845 ("foo" say), then to check whether it is bad, use
846 "$ISBADVAR(foo,a)".
847
848 $ISGOODVAR(c_var,pdl)
849 As above, but this time checking that the cached value isn't bad.
850
851 $SETBADVAR(c_var,pdl)
852 To copy the bad value for a piddle into a c variable, use
853 "$SETBADVAR(foo,a)".
854
855 TODO: mention "$PPISBAD()" etc macros.
856
857 Using these macros, the above code could be specified as:
858
859 Code => '$c() = $a() + $b();',
860 BadCode => '
861 if ( $ISBAD(a()) || $ISBAD(b()) ) {
862 $SETBAD(c());
863 } else {
864 $c() = $a() + $b();
865 }',
866
867 Since this is Perl, TMTOWTDI, so you could also write:
868
869 BadCode => '
870 if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
871 $c() = $a() + $b();
872 } else {
873 $SETBAD(c());
874 }',
875
876 You can reduce code repition using the C "PDL_BAD_CODE" macro, using
877 the same code for both of the "Code" and "BadCode" sections:
878
879 #ifdef PDL_BAD_CODE
880 if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
881 #endif PDL_BAD_CODE
882
883 $c() = $a() + $b();
884
885 #ifdef PDL_BAD_CODE
886 } else {
887 $SETBAD(c());
888 }
889 #endif PDL_BAD_CODE
890
891 If you want access to the value of the badflag for a given piddle, you
892 can use the PDL STATE macros:
893
894 $ISPDLSTATEBAD(pdl)
895 $ISPDLSTATEGOOD(pdl)
896 $SETPDLSTATEBAD(pdl)
897 $SETPDLSTATEGOOD(pdl)
898
899 TODO: mention the "FindBadStatusCode" and "CopyBadStatusCode" options
900 to "pp_def", as well as the "BadDoc" key.
901
902 Interfacing your own/library functions using PP
903 Now, consider the following: you have your own C function (that may in
904 fact be part of some library you want to interface to PDL) which takes
905 as arguments two pointers to vectors of double:
906
907 void myfunc(int n,double *v1,double *v2);
908
909 The correct way of defining the PDL function is
910
911 pp_def('myfunc',
912 Pars => 'a(n); [o]b(n);',
913 GenericTypes => ['D'],
914 Code => 'myfunc($SIZE(n),$P(a),$P(b));'
915 );
916
917 The "$P("par")" syntax returns a pointer to the first element and the
918 other elements are guaranteed to lie after that.
919
920 Notice that here it is possible to make many mistakes. First, $SIZE(n)
921 must be used instead of "n". Second, you shouldn't put any loops in
922 this code. Third, here we encounter a new hash key recognised by
923 PDL::PP : the "GenericTypes" declaration tells PDL::PP to ONLY GENERATE
924 THE TYPELOOP FOP THE LIST OF TYPES SPECIFIED. In this case "double".
925 This has two advantages. Firstly the size of the compiled code is
926 reduced vastly, secondly if non-double arguments are passed to
927 "myfunc()" PDL will automatically convert them to double before passing
928 to the external C routine and convert them back afterwards.
929
930 One can also use "Pars" to qualify the types of individual arguments.
931 Thus one could also write this as:
932
933 pp_def('myfunc',
934 Pars => 'double a(n); double [o]b(n);',
935 Code => 'myfunc($SIZE(n),$P(a),$P(b));'
936 );
937
938 The type specification in "Pars" exempts the argument from variation in
939 the typeloop - rather it is automatically converted to and from the
940 type specified. This is obviously useful in a more general example,
941 e.g.:
942
943 void myfunc(int n,float *v1,long *v2);
944
945 pp_def('myfunc',
946 Pars => 'float a(n); long [o]b(n);',
947 GenericTypes => ['F'],
948 Code => 'myfunc($SIZE(n),$P(a),$P(b));'
949 );
950
951 Note we still use "GenericTypes" to reduce the size of the type loop,
952 obviously PP could in principle spot this and do it automatically
953 though the code has yet to attain that level of sophistication!
954
955 Finally note when types are converted automatically one MUST use the
956 "[o]" qualifier for output variables or you hard-won changes will get
957 optimised away by PP!
958
959 If you interface a large library you can automate the interfacing even
960 further. Perl can help you again(!) in doing this. In many libraries
961 you have certain calling conventions. This can be exploited. In short,
962 you can write a little parser (which is really not difficult in Perl)
963 that then generates the calls to "pp_def" from parsed descriptions of
964 the functions in that library. For an example, please check the Slatec
965 interface in the "Lib" tree of the PDL distribution. If you want to
966 check (during debugging) which calls to PP functions your Perl code
967 generated a little helper package comes in handy which replaces the PP
968 functions by identically named ones that dump their arguments to
969 stdout.
970
971 Just say
972
973 perl -MPDL::PP::Dump myfile.pd
974
975 to see the calls to "pp_def" and friends. Try it with ops.pd and
976 slatec.pd. If you're interested (or want to enhance it), the source is
977 in Basic/Gen/PP/Dump.pm
978
979 Other macros and functions in the Code section
980 Macros: So far we have encountered the $SIZE, $GENERIC and $P macros.
981 Now we are going to quickly explain the other macros that are expanded
982 in the "Code" section of PDL::PP along with examples of their usage.
983
984 $T The $T macro is used for type switches. This is very useful when you
985 have to use different external (e.g. library) functions depending on
986 the input type of arguments. The general syntax is
987
988 $Ttypeletters(type_alternatives)
989
990 where "typeletters" is a permutation of a subset of the letters
991 "BSULFD" which stand for Byte, Short, Ushort, etc. and
992 "type_alternatives" are the expansions when the type of the PP
993 operation is equal to that indicated by the respective letter. Let's
994 illustrate this incomprehensible description by an example. Assuming
995 you have two C functions with prototypes
996
997 void float_func(float *in, float *out);
998 void double_func(double *in, double *out);
999
1000 which do basically the same thing but one accepts float and the
1001 other double pointers. You could interface them to PDL by defining a
1002 generic function "foofunc" (which will call the correct function
1003 depending on the type of the transformation):
1004
1005 pp_def('foofunc',
1006 Pars => ' a(n); [o] b();',
1007 Code => ' $TFD(float_func,double_func) ($P(a),$P(b));'
1008 GenericTypes => [qw(F D)],
1009 );
1010
1011 Please note that you can't say
1012
1013 Code => ' $TFD(float,double)_func ($P(a),$P(b));'
1014
1015 since the $T macro expands with trailing spaces, analogously to C
1016 preprocessor macros. The slightly longer form illustrated above is
1017 correct. If you really want brevity, you can of course do
1018
1019 '$TBSULFD('.(join ',',map {"long_identifier_name_$_"}
1020 qw/byt short unseigned lounge flotte dubble/).');'
1021
1022 $PP
1023 The $PP macro is used for a so called physical pointer access. The
1024 physical refers to some internal optimisations of PDL (for those who
1025 are familiar with the PDL core we are talking about the vaffine
1026 optimisations). This macro is mainly for internal use and you
1027 shouldn't need to use it in any of your normal code.
1028
1029 $COMP (and the "OtherPars" section)
1030 The $COMP macro is used to access non-pdl values in the code
1031 section. Its name is derived from the implementation of
1032 transformations in PDL. The variables you can refer to using $COMP
1033 are members of the ``compiled'' structure that represents the PDL
1034 transformation in question but does not yet contain any information
1035 about dimensions (for further details check PDL::Internals).
1036 However, you can treat $COMP just as a black box without knowing
1037 anything about the implementation of transformations in PDL. So when
1038 would you use this macro? Its main usage is to access values of
1039 arguments that are declared in the "OtherPars" section of a "pp_def"
1040 definition. But then you haven't heard about the "OtherPars" key
1041 yet?! Let's have another example that illustrates typical usage of
1042 both new features:
1043
1044 pp_def('pnmout',
1045 Pars => 'a(m)',
1046 OtherPars => "char* fd",
1047 GenericTypes => [qw(B U S L)],
1048 Code => 'PerlIO *fp;
1049 IO *io;
1050
1051 io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
1052 if (!io || !(fp = IoIFP(io)))
1053 croak("Can\'t figure out FP");
1054
1055 if (PerlIO_write(fp,$P(a),len) != len)
1056 croak("Error writing pnm file");
1057 ');
1058
1059 This function is used to write data from a pdl to a file. The file
1060 descriptor is passed as a string into this function. This parameter
1061 does not go into the "Pars" section since it cannot be usefully
1062 treated like a pdl but rather into the aptly named "OtherPars"
1063 section. Parameters in the "OtherPars" section follow those in the
1064 "Pars" section when invoking the function, i.e.
1065
1066 open FILE,">out.dat" or die "couldn't open out.dat";
1067 pnmout($pdl,'FILE');
1068
1069 When you want to access this parameter inside the code section you
1070 have to tell PP by using the $COMP macro, i.e. you write "$COMP(fd)"
1071 as in the example. Otherwise PP wouldn't know that the "fd" you are
1072 referring to is the same as that specified in the "OtherPars"
1073 section.
1074
1075 Another use for the "OtherPars" section is to set a named dimension
1076 in the signature. Let's have an example how that is done:
1077
1078 pp_def('setdim',
1079 Pars => '[o] a(n)',
1080 OtherPars => 'int ns => n',
1081 Code => 'loop(n) %{ $a() = n; %}',
1082 );
1083
1084 This says that the named dimension "n" will be initialised from the
1085 value of the other parameter "ns" which is of integer type (I guess
1086 you have realised that we use the "CType From => named_dim" syntax).
1087 Now you can call this function in the usual way:
1088
1089 setdim(($x=null),5);
1090 print $x;
1091 [ 0 1 2 3 4 ]
1092
1093 Admittedly this function is not very useful but it demonstrates how
1094 it works. If you call the function with an existing pdl and you
1095 don't need to explicitly specify the size of "n" since PDL::PP can
1096 figure it out from the dimensions of the non-null pdl. In that case
1097 you just give the dimension parameter as "-1":
1098
1099 $x = hist($y);
1100 setdim($x,-1);
1101
1102 That should do it.
1103
1104 The only PP function that we have used in the examples so far is
1105 "loop". Additionally, there are currently two other functions which
1106 are recognised in the "Code" section:
1107
1108 threadloop
1109 As we heard above the signature of a PP defined function defines the
1110 dimensions of all the pdl arguments involved in a primitive
1111 operation. However, you often call the functions that you defined
1112 with PP with pdls that have more dimensions than those specified in
1113 the signature. In this case the primitive operation is performed on
1114 all subslices of appropriate dimensionality in what is called a
1115 thread loop (see also overview above and PDL::Indexing). Assuming you
1116 have some notion of this concept you will probably appreciate that
1117 the operation specified in the code section should be optimised since
1118 this is the tightest loop inside a thread loop. However, if you
1119 revisit the example where we define the "pnmout" function, you will
1120 quickly realise that looking up the "IO" file descriptor in the inner
1121 thread loop is not very efficient when writing a pdl with many rows.
1122 A better approach would be to look up the "IO" descriptor once
1123 outside the thread loop and use its value then inside the tightest
1124 thread loop. This is exactly where the "threadloop" function comes in
1125 handy. Here is an improved definition of "pnmout" which uses this
1126 function:
1127
1128 pp_def('pnmout',
1129 Pars => 'a(m)',
1130 OtherPars => "char* fd",
1131 GenericTypes => [qw(B U S L)],
1132 Code => 'PerlIO *fp;
1133 IO *io;
1134 int len;
1135
1136 io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
1137 if (!io || !(fp = IoIFP(io)))
1138 croak("Can\'t figure out FP");
1139
1140 len = $SIZE(m) * sizeof($GENERIC());
1141
1142 threadloop %{
1143 if (PerlIO_write(fp,$P(a),len) != len)
1144 croak("Error writing pnm file");
1145 %}
1146 ');
1147
1148 This works as follows. Normally the C code you write inside the
1149 "Code" section is placed inside a thread loop (i.e. PP generates the
1150 appropriate wrapping XS code around it). However, when you explicitly
1151 use the "threadloop" function, PDL::PP recognises this and doesn't
1152 wrap your code with an additional thread loop. This has the effect
1153 that code you write outside the thread loop is only executed once per
1154 transformation and just the code with in the surrounding "%{ ... %}"
1155 pair is placed within the tightest thread loop. This also comes in
1156 handy when you want to perform a decision (or any other code,
1157 especially CPU intensive code) only once per thread, i.e.
1158
1159 pp_addhdr('
1160 #define RAW 0
1161 #define ASCII 1
1162 ');
1163 pp_def('do_raworascii',
1164 Pars => 'a(); b(); [o]c()',
1165 OtherPars => 'int mode',
1166 Code => ' switch ($COMP(mode)) {
1167 case RAW:
1168 threadloop %{
1169 /* do raw stuff */
1170 %}
1171 break;
1172 case ASCII:
1173 threadloop %{
1174 /* do ASCII stuff */
1175 %}
1176 break;
1177 default:
1178 croak("unknown mode");
1179 }'
1180 );
1181
1182 types
1183 The types function works similar to the $T macro. However, with the
1184 "types" function the code in the following block (delimited by "%{"
1185 and "%}" as usual) is executed for all those cases in which the
1186 datatype of the operation is any of the types represented by the
1187 letters in the argument to "type", e.g.
1188
1189 Code => '...
1190
1191 types(BSUL) %{
1192 /* do integer type operation */
1193 %}
1194 types(FD) %{
1195 /* do floating point operation */
1196 %}
1197 ...'
1198
1199 The RedoDimsCode Section
1200 The "RedoDimsCode" key is an optional key that is used to compute
1201 dimensions of piddles at runtime in case the standard rules for
1202 computing dimensions from the signature are not sufficient. The
1203 contents of the "RedoDimsCode" entry is interpreted in the same way
1204 that the Code section is interpreted-- i.e., PP macros are expanded and
1205 the result is interpreted as C code. The purpose of the code is to set
1206 the size of some dimensions that appear in the signature. Storage
1207 allocation and threadloops and so forth will be set up as if the
1208 computed dimension had appeared in the signature. In your code, you
1209 first compute the desired size of a named dimension in the signature
1210 according to your needs and then assign that value to it via the
1211 $SIZE() macro.
1212
1213 As an example, consider the following situation. You are interfacing an
1214 external library routine that requires an temporary array for workspace
1215 to be passed as an argument. Two input data arrays that are passed are
1216 p(m) and x(n). The output data array is y(n). The routine requires a
1217 workspace array with a length of n+m*m, and you'd like the storage
1218 created automatically just like it would be for any piddle flagged with
1219 [t] or [o]. What you'd like is to say something like
1220
1221 pp_def( "myexternalfunc",
1222 Pars => " p(m); x(n); [o] y; [t] work(n+m*m); ", ...
1223
1224 but that won't work, because PP can't interpret expressions with
1225 arithmetic in the signature. Instead you write
1226
1227 pp_def(
1228 "myexternalfunc",
1229 Pars => ' p(m); x(n); [o] y(); [t] work(wn); ',
1230 RedoDimsCode => '
1231 PDL_Indx im = $PDL(p)->dims[0];
1232 PDL_Indx in = $PDL(x)->dims[0];
1233 PDL_Indx min = in + im * im;
1234 PDL_Indx inw = $PDL(work)->dims[0];
1235 $SIZE(wn) = inw >= min ? inw : min;
1236 ',
1237 Code => '
1238 externalfunc( $P(p), $P(x), $SIZE(m), $SIZE(n), $P(work) );
1239 '
1240 );
1241
1242 This code works as follows: The macro $PDL(p) expands to a pointer to
1243 the pdl struct for the piddle p. You don't want a pointer to the data
1244 ( ie $P ) in this case, because you want to access the methods for the
1245 piddle on the C level. You get the first dimension of each of the
1246 piddles and store them in integers. Then you compute the minimum length
1247 the work array can be. If the user sent a piddle "work" with sufficient
1248 storage, then leave it alone. If the user sent, say a null pdl, or no
1249 pdl at all, then the size of wn will be zero and you reset it to the
1250 minimum value. Before the code in the Code section is executed PP will
1251 create the proper storage for "work" if it does not exist. Note that
1252 you only took the first dimension of "p" and "x" because the user may
1253 have sent piddles with extra threading dimensions. Of course, the
1254 temporary piddle "work" (note the [t] flag) should not be given any
1255 thread dimensions anyway.
1256
1257 You can also use "RedoDimsCode" to set the dimension of a piddle
1258 flagged with [o]. In this case you set the dimensions for the named
1259 dimension in the signature using $SIZE() as in the preceding example.
1260 However, because the piddle is flagged with [o] instead of [t],
1261 threading dimensions will be added if required just as if the size of
1262 the dimension were computed from the signature according to the usual
1263 rules. Here is an example from PDL::Math
1264
1265 pp_def("polyroots",
1266 Pars => 'cr(n); ci(n); [o]rr(m); [o]ri(m);',
1267 RedoDimsCode => 'PDL_Indx sn = $PDL(cr)->dims[0]; $SIZE(m) = sn-1;',
1268
1269 The input piddles are the real and imaginary parts of complex
1270 coefficients of a polynomial. The output piddles are real and imaginary
1271 parts of the roots. There are "n" roots to an "n"th order polynomial
1272 and such a polynomial has "n+1" coefficients (the zeoreth through the
1273 "n"th). In this example, threading will work correctly. That is, the
1274 first dimension of the output piddle with have its dimension adjusted,
1275 but other threading dimensions will be assigned just as if there were
1276 no "RedoDimsCode".
1277
1278 Typemap handling in the "OtherPars" section
1279 The "OtherPars" section discussed above is very often absolutely
1280 crucial when you interface external libraries with PDL. However in many
1281 cases the external libraries either use derived types or pointers of
1282 various types.
1283
1284 The standard way to handle this in Perl is to use a "typemap" file.
1285 This is discussed in some detail in perlxs in the standard Perl
1286 documentation. In PP the functionality is very similar, so you can
1287 create a "typemap" file in the directory where your PP file resides and
1288 when it is built it is automatically read in to figure out the
1289 appropriate translation between the C type and Perl's built-in type.
1290
1291 That said, there are a couple of important differences from the general
1292 handling of types in XS. The first, and probably most important, is
1293 that at the moment pointers to types are not allowed in the "OtherPars"
1294 section. To get around this limitation you must use the "IV" type
1295 (thanks to Judd Taylor for pointing out that this is necessary for
1296 portability).
1297
1298 It is probably best to illustrate this with a couple of code-snippets:
1299
1300 For instance the "gsl_spline_init" function has the following C
1301 declaration:
1302
1303 int gsl_spline_init(gsl_spline * spline,
1304 const double xa[], const double ya[], size_t size);
1305
1306 Clearly the "xa" and "ya" arrays are candidates for being passed in as
1307 piddles and the "size" argument is just the length of these piddles so
1308 that can be handled by the "$SIZE()" macro in PP. The problem is the
1309 pointer to the "gsl_spline" type. The natural solution would be to
1310 write an "OtherPars" declaration of the form
1311
1312 OtherPars => 'gsl_spline *spl'
1313
1314 and write a short "typemap" file which handled this type. This does not
1315 work at present however! So what you have to do is to go around the
1316 problem slightly (and in some ways this is easier too!):
1317
1318 The solution is to declare "spline" in the "OtherPars" section using an
1319 "Integer Value", "IV". This hides the nature of the variable from PP
1320 and you then need to (well to avoid compiler warnings at least!)
1321 perform a type cast when you use the variable in your code. Thus
1322 "OtherPars" should take the form:
1323
1324 OtherPars => 'IV spl'
1325
1326 and when you use it in the code you will write
1327
1328 INT2PTR(gsl_spline *, $COMP(spl))
1329
1330 where the Perl API macro "INT2PTR" has been used to handle the pointer
1331 cast to avoid compiler warnings and problems for machines with mixed
1332 32bit and 64bit Perl configurations. Putting this together as Andres
1333 Jordan has done (with the modification using "IV" by Judd Taylor) in
1334 the "gsl_interp.pd" in the distribution source you get:
1335
1336 pp_def('init_meat',
1337 Pars => 'double x(n); double y(n);',
1338 OtherPars => 'IV spl',
1339 Code =>'
1340 gsl_spline_init,( INT2PTR(gsl_spline *, $COMP(spl)), $P(x),$P(y),$SIZE(n)));'
1341 );
1342
1343 where I have removed a macro wrapper call, but that would obscure the
1344 discussion.
1345
1346 The other minor difference as compared to the standard typemap handling
1347 in Perl, is that the user cannot specify non-standard typemap locations
1348 or typemap filenames using the "TYPEMAPS" option in MakeMaker... Thus
1349 you can only use a file called "typemap" and/or the "IV" trick above.
1350
1351 Other useful PP keys in data operation definitions
1352 You have already heard about the "OtherPars" key. Currently, there are
1353 not many other keys for a data operation that will be useful in normal
1354 (whatever that is) PP programming. In fact, it would be interesting to
1355 hear about a case where you think you need more than what is provided
1356 at the moment. Please speak up on one of the PDL mailing lists. Most
1357 other keys recognised by "pp_def" are only really useful for what we
1358 call slice operations (see also above).
1359
1360 One thing that is strongly being planned is variable number of
1361 arguments, which will be a little tricky.
1362
1363 An incomplete list of the available keys:
1364
1365 Inplace
1366 Setting this key marks the routine as working inplace - ie the
1367 input and output piddles are the same. An example is
1368 "$x->inplace->sqrt()" (or "sqrt(inplace($x))").
1369
1370 Inplace => 1
1371 Use when the routine is a unary function, such as "sqrt".
1372
1373 Inplace => ['a']
1374 If there are more than one input piddles, specify the name of
1375 the one that can be changed inplace using an array reference.
1376
1377 Inplace => ['a','b']
1378 If there are more than one output piddle, specify the name of
1379 the input piddle and output piddle in a 2-element array
1380 reference. This probably isn't needed, but left in for
1381 completeness.
1382
1383 If bad values are being used, care must be taken to ensure the
1384 propagation of the badflag when inplace is being used; consider
1385 this excerpt from Basic/Bad/bad.pd:
1386
1387 pp_def('replacebad',HandleBad => 1,
1388 Pars => 'a(); [o]b();',
1389 OtherPars => 'double newval',
1390 Inplace => 1,
1391 CopyBadStatusCode =>
1392 '/* propagate badflag if inplace AND it has changed */
1393 if ( a == b && $ISPDLSTATEBAD(a) )
1394 PDL->propagate_badflag( b, 0 );
1395
1396 /* always make sure the output is "good" */
1397 $SETPDLSTATEGOOD(b);
1398 ',
1399 ...
1400
1401 Since this routine removes all bad values, then the output piddle
1402 had its bad flag cleared. If run inplace (so "a == b"), then we
1403 have to tell all the children of "a" that the bad flag has been
1404 cleared (to save time we make sure that we call
1405 "PDL->propagate_badgflag" only if the input piddle had its bad flag
1406 set).
1407
1408 NOTE: one idea is that the documentation for the routine could be
1409 automatically flagged to indicate that it can be executed inplace,
1410 ie something similar to how "HandleBad" sets "BadDoc" if it's not
1411 supplied (it's not an ideal solution).
1412
1413 Other PDL::PP functions to support concise package definition
1414 So far, we have described the "pp_def" and "pp_done" functions. PDL::PP
1415 exports a few other functions to aid you in writing concise PDL
1416 extension package definitions.
1417
1418 pp_addhdr
1419
1420 Often when you interface library functions as in the above example you
1421 have to include additional C include files. Since the XS file is
1422 generated by PP we need some means to make PP insert the appropriate
1423 include directives in the right place into the generated XS file. To
1424 this end there is the "pp_addhdr" function. This is also the function
1425 to use when you want to define some C functions for internal use by
1426 some of the XS functions (which are mostly functions defined by
1427 "pp_def"). By including these functions here you make sure that
1428 PDL::PP inserts your code before the point where the actual XS module
1429 section begins and will therefore be left untouched by xsubpp (cf.
1430 perlxs and perlxstut man pages).
1431
1432 A typical call would be
1433
1434 pp_addhdr('
1435 #include <unistd.h> /* we need defs of XXXX */
1436 #include "libprotos.h" /* prototypes of library functions */
1437 #include "mylocaldecs.h" /* Local decs */
1438
1439 static void do_the real_work(PDL_Byte * in, PDL_Byte * out, int n)
1440 {
1441 /* do some calculations with the data */
1442 }
1443 ');
1444
1445 This ensures that all the constants and prototypes you need will be
1446 properly included and that you can use the internal functions defined
1447 here in the "pp_def"s, e.g.:
1448
1449 pp_def('barfoo',
1450 Pars => ' a(n); [o] b(n)',
1451 GenericTypes => ['B'],
1452 Code => ' PDL_Indx ns = $SIZE(n);
1453 do_the_real_work($P(a),$P(b),ns);
1454 ',
1455 );
1456
1457 pp_addpm
1458
1459 In many cases the actual PP code (meaning the arguments to "pp_def"
1460 calls) is only part of the package you are currently implementing.
1461 Often there is additional Perl code and XS code you would normally have
1462 written into the pm and XS files which are now automatically generated
1463 by PP. So how to get this stuff into those dynamically generated files?
1464 Fortunately, there are a couple of functions, generally called
1465 "pp_addXXX" that assist you in doing this.
1466
1467 Let's assume you have additional Perl code that should go into the
1468 generated pm-file. This is easily achieved with the "pp_addpm" command:
1469
1470 pp_addpm(<<'EOD');
1471
1472 =head1 NAME
1473
1474 PDL::Lib::Mylib -- a PDL interface to the Mylib library
1475
1476 =head1 DESCRIPTION
1477
1478 This package implements an interface to the Mylib package with full
1479 threading and indexing support (see L<PDL::Indexing>).
1480
1481 =cut
1482
1483 use PGPLOT;
1484
1485 =head2 use_myfunc
1486 this function applies the myfunc operation to all the
1487 elements of the input pdl regardless of dimensions
1488 and returns the sum of the result
1489 =cut
1490
1491 sub use_myfunc {
1492 my $pdl = shift;
1493
1494 myfunc($pdl->clump(-1),($res=null));
1495
1496 return $res->sum;
1497 }
1498
1499 EOD
1500
1501 pp_add_exported
1502
1503 You have probably got the idea. In some cases you also want to export
1504 your additional functions. To avoid getting into trouble with PP which
1505 also messes around with the @EXPORT array you just tell PP to add your
1506 functions to the list of exported functions:
1507
1508 pp_add_exported('use_myfunc gethynx');
1509
1510 pp_add_isa
1511
1512 The "pp_add_isa" command works like the the "pp_add_exported" function.
1513 The arguments to "pp_add_isa" are added the @ISA list, e.g.
1514
1515 pp_add_isa(' Some::Other::Class ');
1516
1517 pp_bless
1518
1519 If your pp_def routines are to be used as object methods use "pp_bless"
1520 to specify the package (i.e. class) to which your pp_defed methods will
1521 be added. For example, "pp_bless('PDL::MyClass')". The default is "PDL"
1522 if this is omitted.
1523
1524 pp_addxs
1525
1526 Sometimes you want to add extra XS code of your own (that is generally
1527 not involved with any threading/indexing issues but supplies some other
1528 functionality you want to access from the Perl side) to the generated
1529 XS file, for example
1530
1531 pp_addxs('','
1532
1533 # Determine endianness of machine
1534
1535 int
1536 isbigendian()
1537 CODE:
1538 unsigned short i;
1539 PDL_Byte *b;
1540
1541 i = 42; b = (PDL_Byte*) (void*) &i;
1542
1543 if (*b == 42)
1544 RETVAL = 0;
1545 else if (*(b+1) == 42)
1546 RETVAL = 1;
1547 else
1548 croak("Impossible - machine is neither big nor little endian!!\n");
1549 OUTPUT:
1550 RETVAL
1551 ');
1552
1553 Especially "pp_add_exported" and "pp_addxs" should be used with care.
1554 PP uses PDL::Exporter, hence letting PP export your function means that
1555 they get added to the standard list of function exported by default
1556 (the list defined by the export tag ``:Func''). If you use "pp_addxs"
1557 you shouldn't try to do anything that involves threading or indexing
1558 directly. PP is much better at generating the appropriate code from
1559 your definitions.
1560
1561 pp_add_boot
1562
1563 Finally, you may want to add some code to the BOOT section of the XS
1564 file (if you don't know what that is check perlxs). This is easily done
1565 with the "pp_add_boot" command:
1566
1567 pp_add_boot(<<EOB);
1568 descrip = mylib_initialize(KEEP_OPEN);
1569
1570 if (descrip == NULL)
1571 croak("Can't initialize library");
1572
1573 GlobalStruc->descrip = descrip;
1574 GlobalStruc->maxfiles = 200;
1575 EOB
1576
1577 pp_export_nothing
1578
1579 By default, PP.pm puts all subs defined using the pp_def function into
1580 the output .pm file's EXPORT list. This can create problems if you are
1581 creating a subclassed object where you don't want any methods exported.
1582 (i.e. the methods will only be called using the $object->method
1583 syntax).
1584
1585 For these cases you can call pp_export_nothing() to clear out the
1586 export list. Example (At the end of the .pd file):
1587
1588 pp_export_nothing();
1589 pp_done();
1590
1591 pp_core_importList
1592
1593 By default, PP.pm puts the 'use Core;' line into the output .pm file.
1594 This imports Core's exported names into the current namespace, which
1595 can create problems if you are over-riding one of Core's methods in the
1596 current file. You end up getting messages like "Warning: sub sumover
1597 redefined in file subclass.pm" when running the program.
1598
1599 For these cases the pp_core_importList can be used to change what is
1600 imported from Core.pm. For example:
1601
1602 pp_core_importList('()')
1603
1604 This would result in
1605
1606 use Core();
1607
1608 being generated in the output .pm file. This would result in no names
1609 being imported from Core.pm. Similarly, calling
1610
1611 pp_core_importList(' qw/ barf /')
1612
1613 would result in
1614
1615 use Core qw/ barf/;
1616
1617 being generated in the output .pm file. This would result in just
1618 'barf' being imported from Core.pm.
1619
1620 pp_setversion
1621
1622 I am pretty sure that this allows you to simultaneously set the .pm and
1623 .xs files' versions, thus avoiding unnecessary version-skew between the
1624 two. To use this, simply have the following line at some point in your
1625 .pd file:
1626
1627 pp_setversion('0.0.3');
1628
1629 However, don't use this if you use Module::Build::PDL. See that
1630 module's documentation for details.
1631
1632 pp_deprecate_module
1633
1634 If a particular module is deemed obsolete, this function can be used to
1635 mark it as deprecated. This has the effect of emitting a warning when a
1636 user tries to "use" the module. The generated POD for this module also
1637 carries a deprecation notice. The replacement module can be passed as
1638 an argument like this:
1639
1640 pp_deprecate_module( infavor => "PDL::NewNonDeprecatedModule" );
1641
1642 Note that function affects only the runtime warning and the POD.
1643
1645 Let's say that you have a function in your module called PDL::foo that
1646 uses the PP function "bar_pp" to do the heavy lifting. But you don't
1647 want to advertise that "bar_pp" exists. To do this, you must move your
1648 PP function to the top of your module file, then call
1649
1650 pp_export_nothing()
1651
1652 to clear the "EXPORT" list. To ensure that no documentation (even the
1653 default PP docs) is generated, set
1654
1655 Doc => undef
1656
1657 and to prevent the function from being added to the symbol table, set
1658
1659 PMFunc => ''
1660
1661 in your pp_def declaration (see Image2D.pd for an example). This will
1662 effectively make your PP function "private." However, it is always
1663 accessible via PDL::bar_pp due to Perl's module design. But making it
1664 private will cause the user to go very far out of his or her way to use
1665 it, so he or she shoulders the consequences!
1666
1668 The slice operation section of this manual is provided using dataflow
1669 and lazy evaluation: when you need it, ask Tjl to write it. a delivery
1670 in a week from when I receive the email is 95% probable and two week
1671 delivery is 99% probable.
1672
1673 And anyway, the slice operations require a much more intimate knowledge
1674 of PDL internals than the data operations. Furthermore, the complexity
1675 of the issues involved is considerably higher than that in the average
1676 data operation. If you would like to convince yourself of this fact
1677 take a look at the Basic/Slices/slices.pd file in the PDL distribution
1678 :-). Nevertheless, functions generated using the slice operations are
1679 at the heart of the index manipulation and dataflow capabilities of
1680 PDL.
1681
1682 Also, there are a lot of dirty issues with virtual piddles and vaffines
1683 which we shall entirely skip here.
1684
1685 Slices and bad values
1686 Slice operations need to be able to handle bad values (if support is
1687 compiled into PDL). The easiest thing to do is look at
1688 Basic/Slices/slices.pd to see how this works.
1689
1690 Along with "BadCode", there are also the "BadBackCode" and
1691 "BadRedoDimsCode" keys for "pp_def". However, any "EquivCPOffsCode"
1692 should not need changing, since any changes are absorbed into the
1693 definition of the "$EQUIVCPOFFS()" macro (i.e. it is handled
1694 automatically by PDL::PP).
1695
1696 A few notes on writing a slicing routine...
1697 The following few paragraphs describe writing of a new slicing routine
1698 ('range'); any errors are CED's. (--CED 26-Aug-2002)
1699
1701 For printing warning messages or aborting/dieing, you can call "warn"
1702 or "barf" from PP code. However, you should be aware that these calls
1703 have been redefined using C preprocessor macros to "PDL->barf" and
1704 "PDL->warn". These redefinitions are in place to keep you from
1705 inadvertently calling perl's "warn" or "barf" directly, which can cause
1706 segfaults during pthreading (i.e. processor multi-threading).
1707
1708 PDL's own versions of "barf" and "warn" will queue-up warning or barf
1709 messages until after pthreading is completed, and then call the perl
1710 versions of these routines.
1711
1712 See PDL::ParallelCPU for more information on pthreading.
1713
1715 The PDL "Core" structure, defined in Basic/Core/pdlcore.h.PL, contains
1716 pointers to a number of routines that may be useful to you. The
1717 majority of these routines deal with manipulating piddles, but some are
1718 more general:
1719
1720 PDL->qsort_B( PDL_Byte *xx, PDL_Indx a, PDL_Indx b )
1721 Sort the array "xx" between the indices "a" and "b". There are
1722 also versions for the other PDL datatypes, with postfix "_S", "_U",
1723 "_L", "_N", "_Q", "_F", and "_D". Any module using this must
1724 ensure that "PDL::Ufunc" is loaded.
1725
1726 PDL->qsort_ind_B( PDL_Byte *xx, PDL_Indx *ix, PDL_Indx a, PDL_Indx b )
1727 As for "PDL->qsort_B", but this time sorting the indices rather
1728 than the data.
1729
1730 The routine "med2d" in Lib/Image2D/image2d.pd shows how such routines
1731 are used.
1732
1734 If you are going to generate a package from your PP file (typical file
1735 extensions are ".pd" or ".pp" for the files containing PP code) it is
1736 easiest and safest to leave generation of the appropriate commands to
1737 the Makefile. In the following we will outline the typical format of a
1738 Perl Makefile to automatically build and install your package from a
1739 description in a PP file. Most of the rules to build the xs, pm and
1740 other required files from the PP file are already predefined in the
1741 PDL::Core::Dev package. We just have to tell MakeMaker to use it.
1742
1743 In most cases you can define your Makefile like
1744
1745 # Makefile.PL for a package defined by PP code.
1746
1747 use PDL::Core::Dev; # Pick up development utilities
1748 use ExtUtils::MakeMaker;
1749
1750 $package = ["mylib.pd",Mylib,PDL::Lib::Mylib];
1751 %hash = pdlpp_stdargs($package);
1752 $hash{OBJECT} .= ' additional_Ccode$(OBJ_EXT) ';
1753 $hash{clean}->{FILES} .= ' todelete_Ccode$(OBJ_EXT) ';
1754 $hash{'VERSION_FROM'} = 'mylib.pd';
1755 WriteMakefile(%hash);
1756
1757 sub MY::postamble { pdlpp_postamble($package); }
1758
1759 Here, the list in $package is: first: PP source file name, then the
1760 prefix for the produced files and finally the whole package name. You
1761 can modify the hash in whatever way you like but it would be reasonable
1762 to stay within some limits so that your package will continue to work
1763 with later versions of PDL.
1764
1765 If you don't want to use prepackaged arguments, here is a generic
1766 Makefile.PL that you can adapt for your own needs:
1767
1768 # Makefile.PL for a package defined by PP code.
1769
1770 use PDL::Core::Dev; # Pick up development utilities
1771 use ExtUtils::MakeMaker;
1772
1773 WriteMakefile(
1774 'NAME' => 'PDL::Lib::Mylib',
1775 'VERSION_FROM' => 'mylib.pd',
1776 'TYPEMAPS' => [&PDL_TYPEMAP()],
1777 'OBJECT' => 'mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)',
1778 'PM' => { 'Mylib.pm' => '$(INST_LIBDIR)/Mylib.pm'},
1779 'INC' => &PDL_INCLUDE(), # add include dirs as required by your lib
1780 'LIBS' => [''], # add link directives as necessary
1781 'clean' => {'FILES' =>
1782 'Mylib.pm Mylib.xs Mylib$(OBJ_EXT)
1783 additional_Ccode$(OBJ_EXT)'},
1784 );
1785
1786 # Add genpp rule; this will invoke PDL::PP on our PP file
1787 # the argument is an array reference where the array has three string elements:
1788 # arg1: name of the source file that contains the PP code
1789 # arg2: basename of the xs and pm files to be generated
1790 # arg3: name of the package that is to be generated
1791 sub MY::postamble { pdlpp_postamble(["mylib.pd",Mylib,PDL::Lib::Mylib]); }
1792
1793 To make life even easier PDL::Core::Dev defines the function
1794 "pdlpp_stdargs" that returns a hash with default values that can be
1795 passed (either directly or after appropriate modification) to a call to
1796 WriteMakefile. Currently, "pdlpp_stdargs" returns a hash where the
1797 keys are filled in as follows:
1798
1799 (
1800 'NAME' => $mod,
1801 'TYPEMAPS' => [&PDL_TYPEMAP()],
1802 'OBJECT' => "$pref\$(OBJ_EXT)",
1803 PM => {"$pref.pm" => "\$(INST_LIBDIR)/$pref.pm"},
1804 MAN3PODS => {"$src" => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)"},
1805 'INC' => &PDL_INCLUDE(),
1806 'LIBS' => [''],
1807 'clean' => {'FILES' => "$pref.xs $pref.pm $pref\$(OBJ_EXT)"},
1808 )
1809
1810 Here, $src is the name of the source file with PP code, $pref the
1811 prefix for the generated .pm and .xs files and $mod the name of the
1812 extension module to generate.
1813
1815 The internals of the current version consist of a large table which
1816 gives the rules according to which things are translated and the subs
1817 which implement these rules.
1818
1819 Later on, it would be good to make the table modifiable by the user so
1820 that different things may be tried.
1821
1822 [Meta comment: here will hopefully be more in the future; currently,
1823 your best bet will be to read the source code :-( or ask on the list
1824 (try the latter first) ]
1825
1827 Unless otherwise specified, the arguments are strings. Keys marked with
1828 (bad) are only used if bad-value support is compiled into PDL.
1829
1830 Pars
1831 define the signature of your function
1832
1833 OtherPars
1834 arguments which are not pdls. Default: nothing. This is a semi-
1835 colon separated list of arguments, e.g., "OtherPars=>'int k; double
1836 value; char* fd'". See $COMP(x) and also the same entry in Appendix
1837 B.
1838
1839 Code
1840 the actual code that implements the functionality; several PP
1841 macros and PP functions are recognised in the string value
1842
1843 HandleBad (bad)
1844 If set to 1, the routine is assumed to support bad values and the
1845 code in the BadCode key is used if bad values are present; it also
1846 sets things up so that the "$ISBAD()" etc macros can be used. If
1847 set to 0, cause the routine to print a warning if any of the input
1848 piddles have their bad flag set.
1849
1850 BadCode (bad)
1851 Give the code to be used if bad values may be present in the input
1852 piddles. Only used if "HandleBad => 1".
1853
1854 GenericTypes
1855 An array reference. The array may contain any subset of the one-
1856 character strings `B', `S', `U', `L', `Q', `F' and `D', which
1857 specify which types your operation will accept. The meaning of each
1858 type is:
1859
1860 B - signed byte (i.e. signed char)
1861 S - signed short (two-byte integer)
1862 U - unsigned short
1863 L - signed long (four-byte integer, int on 32 bit systems)
1864 N - signed integer for indexing piddle elements (platform & Perl-dependent size)
1865 Q - signed long long (eight byte integer)
1866 F - float
1867 D - double
1868
1869 This is very useful (and important!) when interfacing an external
1870 library. Default: [qw/B S U L N Q F D/]
1871
1872 Inplace
1873 Mark a function as being able to work inplace.
1874
1875 Inplace => 1 if Pars => 'a(); [o]b();'
1876 Inplace => ['a'] if Pars => 'a(); b(); [o]c();'
1877 Inplace => ['a','b'] if Pars => 'a(); b(); [o]c(); [o]d();'
1878
1879 If bad values are being used, care must be taken to ensure the
1880 propagation of the badflag when inplace is being used; for instance
1881 see the code for "replacebad" in Basic/Bad/bad.pd.
1882
1883 Doc Used to specify a documentation string in Pod format. See PDL::Doc
1884 for information on PDL documentation conventions. Note: in the
1885 special case where the PP 'Doc' string is one line this is
1886 implicitly used for the quick reference AND the documentation!
1887
1888 If the Doc field is omitted PP will generate default documentation
1889 (after all it knows about the Signature).
1890
1891 If you really want the function NOT to be documented in any way at
1892 this point (e.g. for an internal routine, or because you are doing
1893 it elsewhere in the code) explicitly specify "Doc=>undef".
1894
1895 BadDoc (bad)
1896 Contains the text returned by the "badinfo" command (in "perldl")
1897 or the "-b" switch to the "pdldoc" shell script. In many cases, you
1898 will not need to specify this, since the information can be
1899 automatically created by PDL::PP. However, as befits computer-
1900 generated text, it's rather stilted; it may be much better to do it
1901 yourself!
1902
1903 NoPthread
1904 Optional flag to indicate the PDL function should not use processor
1905 threads (i.e. pthreads or POSIX threads) to split up work across
1906 multiple CPU cores. This option is typically set to 1 if the
1907 underlying PDL function is not threadsafe. If this option isn't
1908 present, then the function is assumed to be threadsafe. This option
1909 only applies if PDL has been compiled with POSIX threads enabled.
1910
1911 PMCode
1912 PDL functions allow you to pass in a piddle into which you want the
1913 output saved. This is handy because you can allocate an output
1914 piddle once and reuse it many times; the alternative would be for
1915 PDL to create a new piddle each time, which may waste compute
1916 cycles or, more likely, RAM. This added flexibility comes at the
1917 cost of more complexity: PDL::PP has to write functions that are
1918 smart enough to count the arguments passed to it and create new
1919 piddles on the fly, but only if you want them.
1920
1921 PDL::PP is smart enough to do that, but there are restrictions on
1922 argument order and the like. If you want a more flexible function,
1923 you can write your own Perl-side wrapper and specify it in the
1924 PMCode key. The string that you supply must (should) define a Perl
1925 function with a name that matches what you gave to pp_def in the
1926 first place. When you wish to eventually invoke the PP-generated
1927 function, you will need to supply all piddles in the exact order
1928 specified in the signature: output piddles are not optional, and
1929 the PP-generated function will not return anything. The obfuscated
1930 name that you will call is _<funcname>_int.
1931
1932 I believe this documentation needs further clarification, but this
1933 will have to do. :-(
1934
1935 PMFunc
1936 When pp_def generates functions, it typically defines them in the
1937 PDL package. Then, in the .pm file that it generates for your
1938 module, it typically adds a line that essentially copies that
1939 function into your current package's symbol table with code that
1940 looks like this:
1941
1942 *func_name = \&PDL::func_name;
1943
1944 It's a little bit smarter than that (it knows when to wrap that
1945 sort of thing in a BEGIN block, for example, and if you specified
1946 something different for pp_bless), but that's the gist of it. If
1947 you don't care to import the function into your current package's
1948 symbol table, you can specify
1949
1950 PMFunc => '',
1951
1952 PMFunc has no other side-effects, so you could use it to insert
1953 arbitrary Perl code into your module if you like. However, you
1954 should use pp_addpm if you want to add Perl code to your module.
1955
1957 Macros
1958 Macros labeled by (bad) are only used if bad-value support is compiled
1959 into PDL.
1960
1961 $variablename_from_sig()
1962 access a pdl (by its name) that was specified in the signature
1963
1964 $COMP(x)
1965 access a value in the private data structure of this
1966 transformation (mainly used to use an argument that is specified
1967 in the "OtherPars" section)
1968
1969 $SIZE(n)
1970 replaced at runtime by the actual size of a named dimension (as
1971 specified in the signature)
1972
1973 $GENERIC()
1974 replaced by the C type that is equal to the runtime type of the
1975 operation
1976
1977 $P(a) a pointer access to the PDL named "a" in the signature. Useful
1978 for interfacing to C functions
1979
1980 $PP(a) a physical pointer access to pdl "a"; mainly for internal use
1981
1982 $TXXX(Alternative,Alternative)
1983 expansion alternatives according to runtime type of operation,
1984 where XXX is some string that is matched by "/[BSULNQFD+]/".
1985
1986 $PDL(a)
1987 return a pointer to the pdl data structure (pdl *) of piddle "a"
1988
1989 $ISBAD(a()) (bad)
1990 returns true if the value stored in "a()" equals the bad value
1991 for this piddle. Requires "HandleBad" being set to 1.
1992
1993 $ISGOOD(a()) (bad)
1994 returns true if the value stored in "a()" does not equal the bad
1995 value for this piddle. Requires "HandleBad" being set to 1.
1996
1997 $SETBAD(a()) (bad)
1998 Sets "a()" to equal the bad value for this piddle. Requires
1999 "HandleBad" being set to 1.
2000
2001 functions
2002 "loop(DIMS) %{ ... %}"
2003 loop over named dimensions; limits are generated automatically by PP
2004
2005 "threadloop %{ ... %}"
2006 enclose following code in a thread loop
2007
2008 "types(TYPES) %{ ... %}"
2009 execute following code if type of operation is any of "TYPES"
2010
2012 A number of functions are imported when you "use PDL::PP". These
2013 include functions that control the generated C or XS code, functions
2014 that control the generated Perl code, and functions that manipulate the
2015 packages and symbol tables into which the code is created.
2016
2017 Generating C and XS Code
2018 PDL::PP's main purpose is to make it easy for you to wrap the threading
2019 engine around your own C code, but you can do some other things, too.
2020
2021 pp_def
2022 Used to wrap the threading engine around your C code. Virtually all
2023 of this document discusses the use of pp_def.
2024
2025 pp_done
2026 Indicates you are done with PDL::PP and that it should generate its
2027 .xs and .pm files based upon the other pp_* functions that you have
2028 called. This function takes no arguments.
2029
2030 pp_addxs
2031 This lets you add XS code to your .xs file. This is useful if you
2032 want to create Perl-accessible functions that invoke C code but
2033 cannot or should not invoke the threading engine. XS is the
2034 standard means by which you wrap Perl-accessible C code. You can
2035 learn more at perlxs.
2036
2037 pp_add_boot
2038 This function adds whatever string you pass to the XS BOOT section.
2039 The BOOT section is C code that gets called by Perl when your
2040 module is loaded and is useful for automatic initialization. You
2041 can learn more about XS and the BOOT section at perlxs.
2042
2043 pp_addhdr
2044 Adds pure-C code to your XS file. XS files are structured such that
2045 pure C code must come before XS specifications. This allows you to
2046 specify such C code.
2047
2048 pp_boundscheck
2049 PDL normally checks the bounds of your accesses before making them.
2050 You can turn that on or off at runtime by setting
2051 MyPackage::set_boundscheck. This function allows you to remove that
2052 runtime flexibility and never do bounds checking. It also returns
2053 the current boundschecking status if called without any argumens.
2054
2055 NOTE: I have not found anything about bounds checking in other
2056 documentation. That needs to be addressed.
2057
2058 Generating Perl Code
2059 Many functions imported when you use PDL::PP allow you to modify the
2060 contents of the generated .pm file. In addition to pp_def and pp_done,
2061 the role of these functions is primarily to add code to various parts
2062 of your generated .pm file.
2063
2064 pp_addpm
2065 Adds Perl code to the generated .pm file. PDL::PP actually keeps
2066 track of three different sections of generated code: the Top, the
2067 Middle, and the Bottom. You can add Perl code to the Middle section
2068 using the one-argument form, where the argument is the Perl code
2069 you want to supply. In the two-argument form, the first argument is
2070 an anonymous hash with only one key that specifies where to put the
2071 second argument, which is the string that you want to add to the
2072 .pm file. The hash is one of these three:
2073
2074 {At => 'Top'}
2075 {At => 'Middle'}
2076 {At => 'Bot'}
2077
2078 For example:
2079
2080 pp_addpm({At => 'Bot'}, <<POD);
2081
2082 =head1 Some documentation
2083
2084 I know I'm typing this in the middle of my file, but it'll go at
2085 the bottom.
2086
2087 =cut
2088
2089 POD
2090
2091 Warning: If, in the middle of your .pd file, you put documentation
2092 meant for the bottom of your pod, you will thoroughly confuse CPAN.
2093 On the other hand, if in the middle of your .pd file, you add some
2094 Perl code destined for the bottom or top of your .pm file, you only
2095 have yourself to confuse. :-)
2096
2097 pp_beginwrap
2098 Adds BEGIN-block wrapping. Certain declarations can be wrapped in
2099 BEGIN blocks, though the default behavior is to have no such
2100 wrapping.
2101
2102 pp_addbegin
2103 Sets code to be added to the top of your .pm file, even above code
2104 that you specify with "pp_addpm({At => 'Top'}, ...)". Unlike
2105 pp_addpm, calling this overwrites whatever was there before.
2106 Generally, you probably shouldn't use it.
2107
2108 Tracking Line Numbers
2109 When you get compile errors, either from your C-like code or your Perl
2110 code, it can help to make those errors back to the line numbers in the
2111 source file at which the error occurred.
2112
2113 pp_line_numbers
2114 Takes a line number and a (usually long) string of code. The line
2115 number should indicate the line at which the quote begins. This is
2116 usually Perl's "__LINE__" literal, unless you are using heredocs,
2117 in which case it is "__LINE__ + 1". The returned string has #line
2118 directives interspersed to help the compiler report errors on the
2119 proper line.
2120
2121 Modifying the Symbol Table and Export Behavior
2122 PDL::PP usually exports all functions generated using pp_def, and
2123 usually installs them into the PDL symbol table. However, you can
2124 modify this behavior with these functions.
2125
2126 pp_bless
2127 Sets the package (symbol table) to which the XS code is added. The
2128 default is PDL, which is generally what you want. If you use the
2129 default blessing and you create a function myfunc, then you can do
2130 the following:
2131
2132 $piddle->myfunc(<args>);
2133 PDL::myfunc($piddle, <args>);
2134
2135 On the other hand, if you bless your functions into another
2136 package, you cannot invoke them as PDL methods, and must invoke
2137 them as:
2138
2139 MyPackage::myfunc($piddle, <args>);
2140
2141 Of course, you could always use the PMFunc key to add your function
2142 to the PDL symbol table, but why do that?
2143
2144 pp_add_isa
2145 Adds to the list of modules from which your module inherits. The
2146 default list is
2147
2148 qw(PDL::Exporter DynaLoader)
2149
2150 pp_core_importlist
2151 At the top of your generated .pm file is a line that looks like
2152 this:
2153
2154 use PDL::Core;
2155
2156 You can modify that by specifying a string to pp_core_importlist.
2157 For example,
2158
2159 pp_core_importlist('::Blarg');
2160
2161 will result in
2162
2163 use PDL::Core::Blarg;
2164
2165 You can use this, for example, to add a list of symbols to import
2166 from PDL::Core. For example:
2167
2168 pp_core_importlist(" ':Internal'");
2169
2170 will lead to the following use statement:
2171
2172 use PDL::Core ':Internal';
2173
2174 pp_setversion
2175 Sets your module's version. The version must be consistent between
2176 the .xs and the .pm file, and is used to ensure that your Perl's
2177 libraries do not suffer from version skew.
2178
2179 pp_add_exported
2180 Adds to the export list whatever names you give it. Functions
2181 created using pp_def are automatically added to the list. This
2182 function is useful if you define any Perl functions using pp_addpm
2183 or pp_addxs that you want exported as well.
2184
2185 pp_export_nothing
2186 This resets the list of exported symbols to nothing. This is
2187 probably better called "pp_export_clear", since you can add
2188 exported symbols after calling "pp_export_nothing". When called
2189 just before calling pp_done, this ensures that your module does not
2190 export anything, for example, if you only want programmers to use
2191 your functions as methods.
2192
2194 PDL
2195
2196 For the concepts of threading and slicing check PDL::Indexing.
2197
2198 PDL::Internals
2199
2200 PDL::BadValues for information on bad values
2201
2202 perlxs, perlxstut
2203
2204 Practical Magick with C, PDL, and PDL::PP -- a guide to compiled add-
2205 ons for PDL <https://arxiv.org/abs/1702.07753>
2206
2208 Almost everything having to do with "Slice operation". This includes
2209 much of the following (each entry is followed by a guess/description of
2210 where it is used or defined):
2211
2212 MACROS
2213 $CDIM()
2214
2215 $CHILD()
2216 PDL::PP::Rule::Substitute::Usual
2217
2218 $CHILD_P()
2219 PDL::PP::Rule::Substitute::Usual
2220
2221 $CHILD_PTR()
2222 PDL::PP::Rule::Substitute::Usual
2223
2224 $COPYDIMS()
2225
2226 $COPYINDS()
2227
2228 $CROAK()
2229 PDL::PP::Rule::Substitute::dosubst_private()
2230
2231 $DOCOMPDIMS()
2232 Used in slices.pd, defined where?
2233
2234 $DOPRIVDIMS()
2235 Used in slices.pd, defined where?
2236 Code comes from PDL::PP::CType::get_malloc, which is called by
2237 PDL::PP::CType::get_copy, which is called by PDL::PP::CopyOtherPars,
2238 PDL::PP::NT2Copies__, and PDL::PP::make_incsize_copy. But none of
2239 those three at first glance seem to have anything to do with
2240 $DOPRIVDIMS
2241
2242 $EQUIVCPOFFS()
2243
2244 $EQUIVCPTRUNC()
2245
2246 $PARENT()
2247 PDL::PP::Rule::Substitute::Usual
2248
2249 $PARENT_P()
2250 PDL::PP::Rule::Substitute::Usual
2251
2252 $PARENT_PTR()
2253 PDL::PP::Rule::Substitute::Usual
2254
2255 $PDIM()
2256
2257 $PRIV()
2258 PDL::PP::Rule::Substitute::dosubst_private()
2259
2260 $RESIZE()
2261
2262 $SETDELTATHREADIDS()
2263 PDL::PP::Rule::MakeComp
2264
2265 $SETDIMS()
2266 PDL::PP::Rule::MakeComp
2267
2268 $SETNDIMS()
2269 PDL::PP::Rule::MakeComp
2270
2271 $SETREVERSIBLE()
2272 PDL::PP::Rule::Substitute::dosubst_private()
2273
2274 Keys
2275 AffinePriv
2276
2277 BackCode
2278
2279 BadBackCode
2280
2281 CallCopy
2282
2283 Comp (related to $COMP()?)
2284
2285 DefaultFlow
2286
2287 EquivCDimExpr
2288
2289 EquivCPOffsCode
2290
2291 EquivDimCheck
2292
2293 EquivPDimExpr
2294
2295 FTypes (see comment in this POD's source file between NoPthread and
2296 PMCode.)
2297
2298 GlobalNew
2299
2300 Identity
2301
2302 MakeComp
2303
2304 NoPdlThread
2305
2306 P2Child
2307
2308 ParentInds
2309
2310 Priv
2311
2312 ReadDataFuncName
2313
2314 RedoDims (related to RedoDimsCode ?)
2315
2316 Reversible
2317
2318 WriteBckDataFuncName
2319
2320 XCHGOnly
2321
2323 Although PDL::PP is quite flexible and thoroughly used, there are
2324 surely bugs. First amongst them: this documentation needs a thorough
2325 revision.
2326
2328 Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), Karl
2329 Glaazebrook (kgb@aaocbn1.aao.GOV.AU) and Christian Soeller
2330 (c.soeller@auckland.ac.nz). All rights reserved. Documentation updates
2331 Copyright(C) 2011 David Mertens (dcmertens.perl@gmail.com). This
2332 documentation is licensed under the same terms as Perl itself.
2333
2334
2335
2336perl v5.32.1 2021-02-15 PP(1)