PDL::PP(1)

PP(1)                 User Contributed Perl Documentation                PP(1)

NAME

       PDL::PP - Generate PDL routines from concise descriptions

SYNOPSIS

       e.g.

               pp_def(
                       'sumover',
                       Pars => 'a(n); [o]b();',
                       Code => q{
                               double tmp=0;
                               loop(n) %{
                                       tmp += $a();
                               %}
                               $b() = tmp;
                       },
               );

               pp_done();

FUNCTIONS

       Here is a quick reference list of the functions provided by PDL::PP.

   pp_add_boot
       Add code to the BOOT section of the generated XS file

   pp_add_exported
       Add functions to the list of exported functions

   pp_add_isa
       Add entries to the @ISA list

   pp_addbegin
       Sets code to be added at the top of the generated .pm file

   pp_addhdr
       Add code and includes to the C section of the generated XS file

   pp_addpm
       Add code to the generated .pm file

   pp_addxs
       Add extra XS code to the generated XS file

   pp_beginwrap
       Add BEGIN-block wrapping to code for the generated .pm file

   pp_bless
       Sets the package to which the XS code is added (default is PDL)

   pp_boundscheck
       Control state of PDL bounds checking activity

   pp_core_importList
       Specify what is imported from PDL::Core

   pp_def
       Define a new PDL function

   pp_deprecate_module
       Add runtime and POD warnings about a module being deprecated

   pp_done
       Mark the end of PDL::PP definitions in the file

   pp_export_nothing
       Clear out the export list for your generated module

   pp_line_numbers
       Add line number information to simplify debugging of PDL::PP code

OVERVIEW

       For an alternate introduction to PDL::PP, see Practical Magick with C,
       PDL, and PDL::PP -- a guide to compiled add-ons for PDL
       <https://arxiv.org/abs/1702.07753>.

       Why do we need PP? There are several reasons. Firstly, we want to be
       able to generate subroutine code for each of the PDL datatypes
       (PDL_Byte, PDL_Short, etc.) automatically. Secondly, when referring to
       slices of PDL arrays in Perl (e.g. "$x->slice('0:10:2,:')" or other
       things such as transposes) it is nice to be able to do this
       transparently and 'in-place', i.e. without having to make a memory copy
       of the section. PP handles all the necessary element and offset
       arithmetic for you. Finally, using PP gives your routine threading
       (repeated calling of the same routine for multiple slices, see
       PDL::Indexing) and dataflow (see PDL::Dataflow).

       In much of what follows we will assume that the reader is familiar with
       the concepts of implicit and explicit threading and index manipulations
       within PDL. If you have not yet heard of these concepts or are not very
       comfortable with them it is time to check PDL::Indexing.

       As you may appreciate from its name, PDL::PP is a Pre-Processor, i.e.
       it expands code via substitutions to make real C code. Technically, the
       output is XS code (see perlxs) but that is very close to C.

       So how do you use PP? Well, for the most part you just write ordinary C
       code except for special PP constructs which take the form:

          $something(something else)

       or:

          PPfunction %{
            <stuff>
          %}

       The most important PP construct is the form "$array()". Consider the
       very simple PP function to sum the elements of a 1D vector (in fact
       this is very similar to the actual code used by 'sumover'):

          pp_def('sumit',
              Pars => 'a(n);  [o]b();',
              Code => q{
                  double tmp;
                  tmp = 0;
                  loop(n) %{
                      tmp += $a();
                  %}
                  $b() = tmp;
              }
          );

       What's going on? The "Pars =>" line is very important for PP: it
       specifies all the arguments and their dimensionality. We call this the
       signature of the PP function (compare also the explanations in
       PDL::Indexing).  In this case the routine takes a 1-D vector as input
       and returns a 0-D scalar as output.  The "$a()" PP construct is used to
       access elements of the array a(n) for you; PP fills in all the
       required C code.
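
       To picture the kind of element and offset arithmetic that stands
       behind "$a()", here is a minimal sketch in plain C of summing a
       strided 1-D view without copying it. It is not the code PDL::PP
       generates: the "view1d" struct and the "sum_view" function are
       hypothetical names invented for this illustration, and the data is
       assumed to be double. In this picture a slice such as '0:10:2'
       corresponds to offset 0, stride 2 and 6 elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a 1-D view into a larger buffer; these
 * names are illustrative and are not PDL's internal structures. */
typedef struct {
    const double *data;   /* start of the parent buffer */
    ptrdiff_t offset;     /* index of the view's first element */
    ptrdiff_t stride;     /* step between consecutive elements */
    ptrdiff_t n;          /* number of elements in the view */
} view1d;

/* Sum the elements of the view in place: the role played by
 * loop(n) %{ tmp += $a(); %} in the PP code above. */
double sum_view(view1d a)
{
    double tmp = 0;
    assert(a.n >= 0);
    for (ptrdiff_t i = 0; i < a.n; i++)
        tmp += a.data[a.offset + i * a.stride];
    return tmp;
}
```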
135
       You will notice that we are using the "q{}" single-quote operator. This
       is not an accident. You generally want to use single quotes to denote
       your PP Code sections. PDL::PP uses "$var()" for its parsing and if you
       don't use single quotes, Perl will try to interpolate "$var()". Also,
       using the single quote "q" operator with curly braces makes it look
       like you are creating a code block, which is What You Mean. (Perl is
       smart enough to look for nested curly braces and not close the quote
       until it finds the matching curly brace, so it's safe to have nested
       blocks.) Under other circumstances, such as when you're stitching
       together a Code block using string concatenations, it's often easiest
       to use real single quotes as

        Code => 'something'.$interpolatable.'somethingelse;'

       In the simple case here where all elements are accessed the PP
       construct "loop(n) %{ ... %}" is used to loop over all elements in
       dimension "n".  Note this feature of PP: ALL DIMENSIONS ARE SPECIFIED
       BY NAME.

       This is made clearer if we avoid the PP loop() construct and write the
       loop explicitly using conventional C:

          pp_def('sumit',
              Pars => 'a(n);  [o]b();',
              Code => q{
                  PDL_Indx i,n_size;
                  double tmp;
                  n_size = $SIZE(n);
                  tmp = 0;
                  for(i=0; i<n_size; i++) {
                      tmp += $a(n=>i);
                  }
                  $b() = tmp;
              },
          );

       which does the same as before, but is more long-winded.  You can see
       that to get element "i" of a() we say "$a(n=>i)" - we are specifying
       the dimension by name "n". In 2D we might say:

          Pars=>'a(m,n);',
             ...
             tmp += $a(m=>i,n=>j);
             ...

       The syntax "m=>i" borrows from Perl hashes, which are in fact used in
       the implementation of PP. One could also say "$a(n=>j,m=>i)" as order
       is not important.
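
       One way to see why the order of the named indices cannot matter is
       to pair each index with the stride of the dimension it names. The
       following plain C sketch does that for a 2-D array. It is an
       illustration only: "view2d", "at" and "sum_all" are hypothetical
       names, the data is assumed to be double, and nothing here is taken
       from the code PDL::PP emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-D view with one size and one stride per named dimension. */
typedef struct {
    const double *data;
    ptrdiff_t m_size, n_size;       /* sizes of dimensions m and n */
    ptrdiff_t m_stride, n_stride;   /* steps along m and n */
} view2d;

/* Analogue of $a(m=>i, n=>j): each index is multiplied by the stride of
 * the dimension it names, so swapping the two terms changes nothing. */
double at(view2d a, ptrdiff_t i, ptrdiff_t j)
{
    assert(i >= 0 && i < a.m_size && j >= 0 && j < a.n_size);
    return a.data[i * a.m_stride + j * a.n_stride];
}

/* Sum every element with two explicit loops, one per named dimension. */
double sum_all(view2d a)
{
    double tmp = 0;
    for (ptrdiff_t j = 0; j < a.n_size; j++)
        for (ptrdiff_t i = 0; i < a.m_size; i++)
            tmp += at(a, i, j);
    return tmp;
}
```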
184
       You can also see in the above example the use of another PP construct,
       $SIZE(n), to get the length of the dimension "n".

       Note, however, that you shouldn't write an explicit C loop when you
       could have used the PP "loop" construct: PDL::PP automatically checks
       the loop limits for you, and using "loop" makes the code more concise,
       among other things. But there are certainly situations where you need
       explicit control of the loop and now you know how to do it ;).

       To revisit 'Why PP?' - the above code for sumit() will be generated for
       each data-type. It will operate on slices of arrays 'in-place'. It will
       thread automatically - e.g. if a 2D array is given it will be called
       repeatedly for each 1D row (again check PDL::Indexing for the details
       of threading).  And then b() will be a 1D array of sums of each row.
       We could call it with $x->transpose to sum the columns instead.  And
       Dataflow tracing etc. will be available.
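
       The row-by-row behaviour described above can be pictured with a
       small plain C sketch: a kernel that sums one 1-D slice, and an outer
       loop that calls it once per slice. This is only a model of the idea,
       not PDL's thread loop; the function names and the stride parameters
       are assumptions made for the example, and the data is taken to be
       double. Swapping the two strides plays the part of the transpose.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel for the signature a(n); [o]b(): sums n elements that lie
 * stride apart in memory. */
static double sumit_kernel(const double *a, ptrdiff_t n, ptrdiff_t stride)
{
    double tmp = 0;
    for (ptrdiff_t i = 0; i < n; i++)
        tmp += a[i * stride];
    return tmp;
}

/* Hypothetical thread loop: run the kernel once per slice along the
 * extra dimension and store one result per slice in b. */
void sumit_threaded(const double *a,
                    ptrdiff_t n, ptrdiff_t n_stride,
                    ptrdiff_t nslices, ptrdiff_t slice_stride,
                    double *b)
{
    assert(n >= 0 && nslices >= 0);
    for (ptrdiff_t t = 0; t < nslices; t++)
        b[t] = sumit_kernel(a + t * slice_stride, n, n_stride);
}
```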
201
       You can see PP saves the programmer from writing a lot of needlessly
       repetitive C code -- in our opinion this is one of the best features of
       PDL, making writing new C subroutines for PDL an amazingly concise
       exercise. A second reason is the ability to make PP expand your concise
       code definitions into different C code based on the needs of the
       computer architecture in question. Imagine for example you are lucky
       enough to have a supercomputer at your hands; in that case you
       certainly want PDL::PP to generate code that takes advantage of the
       vectorising/parallel computing features of your machine (this is a
       project for the future). In any case, the bottom line is that your
       unchanged code should still expand to working XS code even if the
       internals of PDL change.

       Also, because you are generating the code in an actual Perl script,
       there are many fun things that you can do. Let's say that you need to
       write both sumit (as above) and multit. With a little bit of
       creativity, we can do

          for({Name => 'sumit', Init => '0', Op => '+='},
              {Name => 'multit', Init => '1', Op => '*='}) {
                  pp_def($_->{Name},
                          Pars => 'a(n);  [o]b();',
                          Code => '
                               double tmp;
                               tmp = '.$_->{Init}.';
                               loop(n) %{
                                 tmp '.$_->{Op}.' $a();
                               %}
                               $b() = tmp;
                  ');
          }

       which defines both the functions easily. Now, if you later need to
       change the signature or dimensionality or whatever, you only need to
       change one place in your code.  Yeah, sure, your editor does have 'cut
       and paste' and 'search and replace' but it's still less bothersome and
       definitely more difficult to forget just one place and have strange
       bugs creep in.  Also, adding 'orit' (bitwise or) later is a one-liner.

       And remember, you really have Perl's full abilities with you - you can
       very easily read any input file and make routines from the information
       in that file. For simple cases like the above, the author (Tjl)
       currently favors the hash syntax like the above - it's not many more
       characters than the corresponding array syntax but much easier to
       understand and change.

       We should mention here also the ability to get the pointer to the
       beginning of the data in memory - a prerequisite for interfacing PDL to
       some libraries. This is handled with the "$P(var)" directive, see
       below.

       When starting work on a new pp_def'ined function, if you make a
       mistake, you will usually find a pile of compiler errors indicating
       line numbers in the generated XS file. If you know how to read XS files
       (or if you want to learn the hard way), you could open the generated XS
       file and search for the line number with the error. However, a recent
       addition to PDL::PP helps report the correct line number of your
       errors: "pp_line_numbers". Working with the original sumit example, if
       you had a mis-spelling of tmp in your code, you could change the
       (erroneous) code to something like this and the compiler would give you
       much more useful information:

          pp_def('sumit',
              Pars => 'a(n);  [o]b();',
              Code => pp_line_numbers(__LINE__, q{
                  double tmp;
                  tmp = 0;
                  loop(n) %{
                      tmp += $a();
                  %}
                  $b() = rmp;
              })
          );

       For the above situation, my compiler tells me:

        ...
        test.pd:15: error: 'rmp' undeclared (first use in this function)
        ...

       In my example script (called test.pd), line 15 is exactly the line at
       which I made my typo: "rmp" instead of "tmp".

       So, after this quick overview of the general flavour of programming PDL
       routines using PDL::PP let's summarise in which circumstances you
       should actually use this preprocessor/precompiler. You should use
       PDL::PP if you want to

       •  interface PDL to some external library

       •  write some algorithm that would be slow if coded in Perl (this is
          not as often as you think; take a look at threading and dataflow
          first).

       •  be a PDL developer (and even then it's not obligatory)

WARNING

       Because of its architecture, PDL::PP can be both flexible and easy to
       use on the one hand, yet exuberantly complicated at the same time.
       Currently, part of the problem is that error messages are not very
       informative and if something goes wrong, you'd better know what you are
       doing and be able to hack your way through the internals (or be able to
       figure out by trial and error what is wrong with your args to
       "pp_def"). Although work is being done to produce better warnings, do
       not be afraid to send your questions to the mailing list if you run
       into trouble.

DESCRIPTION

       Now that you have some idea how to use "pp_def" to define new PDL
       functions it is time to explain the general syntax of "pp_def".
       "pp_def" takes as arguments first the name of the function you are
       defining and then a hash list that can contain various keys.

       Based on these keys PP generates XS code and a .pm file. The function
       "pp_done" (see example in the SYNOPSIS) is used to tell PDL::PP that
       there are no more definitions in this file and it is time to generate
       the .xs and .pm files.

       As a consequence, there may be several pp_def() calls inside a file (by
       convention files with PP code have the extension .pd or .pp) but
       generally only one pp_done().

       There are two main types of usage of pp_def(), the 'data operation'
       and 'slice operation' prototypes.

       The 'data operation' is used to take some data, mangle it and output
       some other data; this includes for example the '+' operation, matrix
       inverse, sumover etc and all the examples we have talked about in this
       document so far. Implicit and explicit threading and the creation of
       the result are taken care of automatically in those operations. You can
       even do dataflow with "sumit", "sumover", etc (don't be dismayed if you
       don't understand the concept of dataflow in PDL very well yet; it is
       still very much experimental).

       The 'slice operation' is a different kind of operation: in a slice
       operation, you are not changing any data, you are defining
       correspondences between different elements of two ndarrays (examples
       include the index manipulation/slicing function definitions in the file
       slices.pd that is part of the PDL distribution; but beware, this is not
       introductory level stuff).

       To support bad values, additional keys are required for "pp_def", as
       explained below.

       If you are just interested in communicating with some external library
       (for example some linear algebra/matrix library), you'll usually want
       the 'data operation' so we are going to discuss that first.

Data operation

   A simple example
       In the data operation, you must know what dimensions of data you need.
       First, an example with scalars:

               pp_def('add',
                       Pars => 'a(); b(); [o]c();',
                       Code => '$c() = $a() + $b();'
               );

       That looks a little strange but let's dissect it. The first line is
       easy: we're defining a routine with the name 'add'.  The second line
       simply declares our parameters and the parentheses mean that they are
       scalars. We call the string that defines our parameters and their
       dimensionality the signature of that function. For its relevance with
       regard to threading and index manipulations check the PDL::Indexing man
       page.

       The third line is the actual operation. You need to use the dollar
       signs and parentheses to refer to your parameters (this will probably
       change at some point in the future, once a good syntax is found).

       These lines are all that is necessary to actually define the function
       for PDL (well, actually it isn't; you additionally need to write a
       Makefile.PL (see below) and build the module (something like 'perl
       Makefile.PL; make'); but let's ignore that for the moment). So now you
       can do

               use MyModule;
               $x = pdl 2,3,4;
               $y = pdl 5;

               $c = add($x,$y);
               # or
               add($x,$y,($c=null)); # Alternative form, useful if $c has been
                                     # preset to something big, not useful here.

       and have threading work correctly (the result is $c == [7 8 9]).
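
       As a rough model of how the scalar $y ends up paired with every
       element of $x, the following plain C sketch gives each operand its
       own stride and uses a stride of 0 for the scalar. The function name
       "add_threaded" and its parameters are hypothetical, the data is
       assumed to be double, and the real mechanism in PDL is more general
       than this.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise c = a + b over count elements. A stride of 0 makes an
 * operand repeat its single value, which is how the scalar is matched
 * against every element of the vector in this sketch. */
void add_threaded(const double *a, ptrdiff_t a_stride,
                  const double *b, ptrdiff_t b_stride,
                  double *c, ptrdiff_t count)
{
    assert(count >= 0);
    for (ptrdiff_t t = 0; t < count; t++)
        c[t] = a[t * a_stride] + b[t * b_stride];
}
```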
389
   The Pars section: the signature of a PP function
       Seeing the above example code you will most probably ask: what is this
       strange "$c=null" syntax in the second call to our new "add" function?
       If you take another look at the definition of "add" you will notice
       that the third argument "c" is flagged with the qualifier "[o]" which
       tells PDL::PP that this is an output argument. So the above call to add
       means 'create a new $c from scratch with correct dimensions' - "null"
       is a special token for 'empty ndarray' (you might ask why we haven't
       used the value "undef" to flag this instead of the PDL specific "null";
       we are currently thinking about it ;).

       [This should be explained in some other section of the manual as
       well!!]  The reason for having this syntax as an alternative is that if
       you have really huge ndarrays, you can do

               $c = PDL->null;
               for(some long loop) {
                       # munge a,b
                       add($x,$y,$c);
                       # munge c, put something back to x,y
               }

       and avoid allocating and deallocating $c each time. It is allocated
       once at the first add() and thereafter the memory stays until $c is
       destroyed.

       If you just say

         $c =  add($x,$y);

       the code generated by PP will automatically fill in "$c=null" and
       return the result. If you want to learn more about the reasons why
       PDL::PP supports this style where output arguments are given as last
       arguments check the PDL::Indexing man page.

       "[o]" is not the only qualifier a pdl argument can have in the
       signature.  Another important qualifier is the "[t]" option which flags
       a pdl as temporary.  What does that mean? You tell PDL::PP that this
       pdl is only used for temporary results in the course of the calculation
       and you are not interested in its value after the computation has been
       completed. But why should PDL::PP want to know about this in the first
       place?  The reason is closely related to the concepts of pdl auto
       creation (you heard about that above) and implicit threading. If you
       use implicit threading the dimensionality of automatically created pdls
       is actually larger than that specified in the signature. Pdls flagged
       with "[o]" will be created so that they have the additional
       dimensions as required by the number of implicit thread dimensions.
       When creating a temporary pdl, however, it will always only be made big
       enough so that it can hold the result for one iteration in a thread
       loop, i.e. as large as required by the signature.  So less memory is
       wasted when you flag a pdl as temporary. Secondly, you can use output
       auto creation with temporary pdls even when you are using explicit
       threading which is forbidden for normal output pdls flagged with "[o]"
       (see PDL::Indexing).

       Here is an example where we use the [t] qualifier. We define the
       function "callf" that calls a C routine "f" which needs a temporary
       array of the same size and type as the array "a" (sorry about the
       forward reference for $P; it's a pointer access, see below):

         pp_def('callf',
               Pars => 'a(n); [t] tmp(n); [o] b()',
               Code => 'PDL_Indx ns = $SIZE(n);
                        f($P(a),$P(b),$P(tmp),ns);
                       '
         );
456
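       The document does not say what "f" computes, so the following is a
       purely hypothetical example of a C routine with the calling
       convention used by "callf": it writes the median of "a" to "b" and
       needs a scratch array as long as "a" so that "a" itself is left
       untouched. It assumes double data and uses "ptrdiff_t" as a stand-in
       for "PDL_Indx". Because the scratch array only ever holds one
       slice's worth of data, a buffer of "ns" elements is enough however
       many times the routine is called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine matching the call f(a, b, tmp, ns): writes the
 * median of a[0..ns-1] to *b, using tmp as scratch space so that a is
 * not modified. Assumes double data and ns >= 1. */
void f(const double *a, double *b, double *tmp, ptrdiff_t ns)
{
    assert(ns >= 1);
    /* Build a sorted copy of a in tmp by insertion sort. */
    for (ptrdiff_t i = 0; i < ns; i++) {
        double v = a[i];
        ptrdiff_t j = i;
        while (j > 0 && tmp[j - 1] > v) {
            tmp[j] = tmp[j - 1];
            j--;
        }
        tmp[j] = v;
    }
    /* Middle element, or the mean of the two middle elements. */
    *b = (ns % 2) ? tmp[ns / 2]
                  : 0.5 * (tmp[ns / 2 - 1] + tmp[ns / 2]);
}
```
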
       Another possible qualifier is "[phys]". If given, this means the pdl
       will have "make_physical" in PDL::Core called on it.

       Additionally, if it has a specified dimension "d" that has value 1, "d"
       will not magically be grown if "d" is larger in another pdl with
       specified dimension "d", and instead an exception will be thrown. E.g.:

         pp_def('callf',
               Pars => 'a(n); [phys] b(n); [o] c()',
               # ...
         );

       If "a" has a leading dimension of 2 and "b" of 3, an exception will
       always be thrown. If "b" has a leading dimension of 1, it would be
       silently repeated as if it were 2 were it not a "phys" parameter;
       because it is one, an exception is thrown in that case too.

   Argument dimensions and the signature
       Now we have just talked about dimensions of pdls and the signature. How
       are they related? Let's say that we want to add a scalar + the index
       number to a vector:

               pp_def('add2',
                       Pars => 'a(n); b(); [o]c(n);',
                       Code => 'loop(n) %{
                                       $c() = $a() + $b() + n;
                                %}'
               );

       There are several points to notice here: first, the "Pars" argument now
       contains the dimension name "n" to show that a and c each have a single
       dimension. It is important to note that dimensions are actual entities
       that are accessed by name, so this declares a and c to have the same
       first dimension. In most PP definitions the size of named dimensions
       will be set from the respective dimensions of non-output pdls (those
       with no "[o]" flag) but sometimes you might want to set the size of a
       named dimension explicitly through an integer parameter. See below in
       the description of the "OtherPars" section how that works.
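
       Written out by hand for one data type, "add2" amounts to something
       like the plain C below. This is a sketch for illustration, not
       generated code: it assumes double data, contiguous storage, and
       "ptrdiff_t" as a stand-in for "PDL_Indx". It shows two things from
       the PP version: inside "loop(n)" the name "n" is the running index,
       and a and c share the one size that "n" stands for.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written analogue of add2 for double data: a and c both have
 * n_size elements, b is a scalar, and n is the loop index. */
void add2(const double *a, double b, double *c, ptrdiff_t n_size)
{
    assert(n_size >= 0);
    for (ptrdiff_t n = 0; n < n_size; n++)
        c[n] = a[n] + b + (double)n;
}
```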
494
   Constant argument dimensions in the signature
       Suppose you want an output ndarray to be created automatically and you
       know that on every call its dimension will have the same size (say 9)
       regardless of the dimensions of the input ndarrays. In this case you
       use the following syntax in the Pars section to specify the size of the
       dimension:

           ' [o] y(n=9); '

       As expected, extra dimensions required by threading will be created if
       necessary. If you need to assign a named dimension according to a more
       complicated formula (than a constant) you must use the "RedoDimsCode"
       key described below.

   Type conversions and the signature
       The signature also determines the type conversions that will be
       performed when a PP function is invoked. So what happens when we invoke
       one of our previously defined functions with pdls of different type,
       e.g.

         add2($x,$y,($ret=null));

       where $x is of type "PDL_Float" and $y of type "PDL_Short"? With the
       signature as shown in the definition of "add2" above the datatype of
       the operation (as determined at runtime) is that of the pdl with the
       'highest' type (sequence is byte < short < ushort < long < float <
       double). In the add2 example the datatype of the operation is float ($x
       has that datatype). All pdl arguments are then type converted to that
       datatype (they are not converted inplace but a copy with the right type
       is created if a pdl argument doesn't have the type of the operation).
       Null pdls don't contribute a type in the determination of the type of
       the operation.  However, they will be created with the datatype of the
       operation; here, for example, $ret will be of type float. You should be
       aware of these rules when calling PP functions with pdls of different
       types to take the additional storage and runtime requirements into
       account.
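
       The rule for choosing the datatype of the operation can be
       summarised in a few lines of plain C. The enum below is hypothetical:
       it only reproduces the ordering quoted above and is not PDL's own set
       of type constants, and "operation_type" is a name made up for this
       sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes in the order quoted in the text; T_NULL
 * marks a null pdl, which contributes no type. */
typedef enum {
    T_NULL = -1,
    T_BYTE, T_SHORT, T_USHORT, T_LONG, T_FLOAT, T_DOUBLE
} type_code;

/* Return the 'highest' type among the arguments, skipping null pdls.
 * Returns T_NULL if every argument is null. */
type_code operation_type(const type_code *args, size_t nargs)
{
    type_code best = T_NULL;
    assert(args != NULL || nargs == 0);
    for (size_t i = 0; i < nargs; i++)
        if (args[i] > best)
            best = args[i];
    return best;
}
```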
531
       These type conversions are correct for most functions you normally
       define with "pp_def". However, there are certain cases where slightly
       modified type conversion behaviour is desired. For these cases
       additional qualifiers in the signature can be used to specify the
       desired properties with regard to type conversion. These qualifiers can
       be combined with those we have encountered already (the creation
       qualifiers "[o]" and "[t]"). Let's go through the list of qualifiers
       that change type conversion behaviour.

       The most important is the "indx" qualifier which comes in handy when a
       pdl argument represents indices into another pdl. Let's take a look at
       an example from "PDL::Ufunc":

          pp_def('maximum_ind',
                 Pars => 'a(n); indx [o] b()',
                 Code => '$GENERIC() cur;
                          PDL_Indx curind;
                          loop(n) %{
                           if (!n || $a() > cur) {cur = $a(); curind = n;}
                          %}
                          $b() = curind;',
          );

       The function "maximum_ind" finds the index of the largest element of a
       vector. If you look at the signature you notice that the output
       argument "b" has been declared with the additional "indx" qualifier.
       This has the following consequences for type conversions: regardless of
       the type of the input pdl "a" the output pdl "b" will be of type
       "PDL_Indx" which makes sense since "b" will represent an index into
       "a".

       Note that 'curind' is declared as type "PDL_Indx" and not "indx".
       While most datatype declarations in the 'Pars' section use the same
       name as the underlying C type, "indx" is a type which is sufficient to
       handle PDL indexing operations.  For 32-bit installs, it can be a
       32-bit integer type.  For 64-bit installs, it will be a 64-bit integer
       type.

       Furthermore, if you call the function with an existing output pdl "b"
       its type will not influence the datatype of the operation (see above).
       Hence, even if "a" is of a smaller type than "b" it will not be
       converted to match the type of "b" but stays untouched, which saves
       memory and CPU cycles and is the right thing to do when "b" represents
       indices. Also note that you can use the 'indx' qualifier together with
       other qualifiers (the "[o]" and "[t]" qualifiers). Order is significant
       -- type qualifiers precede creation qualifiers ("[o]" and "[t]").
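
       For one concrete input type, the effect of the "indx" qualifier can
       be pictured in plain C as below: the data is read as short, while the
       result is an index type that does not depend on the data type. The
       function name is hypothetical and "ptrdiff_t" is used here only as a
       stand-in for "PDL_Indx"; the loop body mirrors the "maximum_ind"
       example.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the largest element of a short vector. The return type is
 * an index type regardless of the element type of a. */
ptrdiff_t maximum_ind_short(const short *a, ptrdiff_t n_size)
{
    short cur = 0;
    ptrdiff_t curind = 0;
    assert(n_size >= 1);
    for (ptrdiff_t n = 0; n < n_size; n++) {
        /* !n makes the first element the initial maximum. */
        if (!n || a[n] > cur) { cur = a[n]; curind = n; }
    }
    return curind;
}
```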
578
       The above example also demonstrates typical usage of the "$GENERIC()"
       macro.  It expands to the current type in a so called generic loop.
       What is a generic loop? As you already heard a PP function has a
       runtime datatype as determined by the type of the pdl arguments it has
       been invoked with.  The PP generated XS code for this function
       therefore contains a switch like "switch (type) {case PDL_Byte: ...
       case PDL_Double: ...}" that selects a case based on the runtime
       datatype of the function (it's called a type "loop" because there is
       a loop in PP code that generates the cases).  In any case your code is
       inserted once for each PDL type into this switch statement. The
       "$GENERIC()" macro just expands to the respective type in each copy of
       your parsed code in this "switch" statement, e.g., in the "case
       PDL_Byte" section "$GENERIC() cur" will expand to "PDL_Byte cur" and so
       on for the other case statements. This makes it a useful macro for
       declaring variables that hold values of pdls in your code.
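
       The idea of one copy of your code per type, selected by a switch, can
       be imitated in plain C with a macro. The sketch below is an analogy
       only: the two-member enum, the "SUM_BODY" macro and "sum_generic" are
       invented for this example, and the real generated XS code differs in
       its details.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_SHORT, T_DOUBLE } type_code;   /* hypothetical codes */

/* Stand-in for the user's Code section: GENERIC plays the part of
 * $GENERIC() and is replaced by a concrete C type in each copy. */
#define SUM_BODY(GENERIC)                          \
    do {                                           \
        const GENERIC *a = (const GENERIC *)data;  \
        GENERIC tmp = 0;                           \
        for (size_t i = 0; i < n; i++)             \
            tmp += a[i];                           \
        result = (double)tmp;                      \
    } while (0)

/* One copy of the body per supported type, selected at run time. */
double sum_generic(type_code type, const void *data, size_t n)
{
    double result = 0;
    assert(data != NULL || n == 0);
    switch (type) {
    case T_SHORT:  SUM_BODY(short);  break;
    case T_DOUBLE: SUM_BODY(double); break;
    }
    return result;
}
```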
594
       There are a couple of other qualifiers with similar effects as "indx".
       For your convenience there are the "float" and "double" qualifiers with
       analogous consequences on type conversions as "indx". Let's assume you
       have a very large array for which you want to compute row and column
       sums with an equivalent of the "sumover" function.  However, with the
       normal definition of "sumover" you might run into problems when your
       data is, e.g. of type short. A call like

         sumover($large_pdl,($sums = null));

       will result in $sums being of type short and is therefore prone to
       overflow errors if $large_pdl is a very large array. On the other hand
       calling

         @dims = $large_pdl->dims; shift @dims;
         sumover($large_pdl,($sums = zeroes(double,@dims)));

       is not a good alternative either. Now we don't have overflow problems
       with $sums but at the expense of a type conversion of $large_pdl to
       double, something bad if this is really a large pdl. That's where
       "double" comes in handy:

         pp_def('sumoverd',
                Pars => 'a(n); double [o] b()',
                Code => 'double tmp=0;
                         loop(n) %{ tmp += $a(); %}
                         $b() = tmp;',
         );

       This gets us around the type conversion and overflow problems. Again,
       analogous to the "indx" qualifier "double" results in "b" always being
       of type double regardless of the type of "a" without leading to a type
       conversion of "a" as a side effect.
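
       In plain C terms, "sumoverd" for short input behaves roughly like the
       function below: each element is read in its own type and added to a
       double accumulator, so the input is never converted as a whole and
       the sum can exceed the range of the input type. This is a sketch
       under the assumption of 16-bit short data ("int16_t"); the function
       name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum 16-bit data into a double accumulator, reading each element in
 * place; no double copy of the input is made. */
double sumover_d(const int16_t *a, size_t n)
{
    double tmp = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        tmp += a[i];
    return tmp;
}
```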
628
629       There is also a special type, "real". The others above are all actual
630       PDL/C datatypes, but "real" is a modifier; if the operation type is
631       real, it has no effect; if it is complex, then the parameter will be
632       the real version - so "cdouble" becomes "double", etc.
633
634       There is also the converse, "complex". If the operation is already
635       complex, there is no effect; if not, the output will be promoted to the
636       type's "complexversion" in PDL::Type, which defaults to "cfloat". Note
637       this is controlled both by the PDL::Types data, and the code in
638       PDL::PP.  NB Because this outputs floating-point data, the inputs will
639       by definition be turned into such. Therefore, it only makes sense to
640       have floating-point "GenericTypes" inputs. If you want to default to
641       coercing inputs to "float", give that as the last "GenericTypes" as the
642       generated XS function defaults to the last-given one. Hence (with the
643       "PMCode" and "Doc" omitted):
644
645         pp_def('r2C',
646           GenericTypes=>[reverse qw(F D G C)], # last one is default so here = F
647           Pars => 'r(); complex [o]c()',
648           Code => '$c() = $r();'
649         );
650
651       Finally, there are the "type+" qualifiers, where type is one of "int" or
652       "float". What does that mean? Let's illustrate the "int+" qualifier
653       with the actual definition of sumover:
654
655         pp_def('sumover',
656                Pars => 'a(n); int+ [o] b()',
657                Code => '$GENERIC(b) tmp=0;
658                         loop(n) %{ tmp += $a(); %}
659                         $b() = tmp;',
660         );
661
662       As we had already seen for the "int", "float" and "double" qualifiers,
663       a pdl marked with a "type+" qualifier does not influence the datatype
664       of the pdl operation. Its meaning is "make this pdl at least of type
665       "type" or higher, as required by the type of the operation". In the
666       sumover example this means that when you call the function with an "a"
667       of type PDL_Short the output pdl will be of type PDL_Long (just as
668       would have been the case with the "int" qualifier). This again tries to
669       avoid overflow problems when using small datatypes (e.g. byte images).
670       However, when the datatype of the operation is higher than the type
671       specified in the "type+" qualifier "b" will be created with the
672       datatype of the operation, e.g. when "a" is of type double then "b"
673       will be double as well. We hope you agree that this is sensible
674       behaviour for "sumover". It should be obvious how the "float+"
675       qualifier works by analogy.  It may become necessary to be able to
676       specify a set of alternative types for the parameters. However, this
677       will probably not be implemented until someone comes up with a
678       reasonable use for it.
679
680       Note that we now had to specify the $GENERIC macro with the name of the
681       pdl to derive the type from that argument. Why is that? If you
682       carefully followed our explanations you will have realised that in some
683       cases "b" will have a different type than the type of the operation.
684       Calling the '$GENERIC' macro with "b" as argument makes sure that the
685       type will always be the same as that of "b" in that part of the generic
686       loop.
687
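The "at least this type" rule of the "type+" qualifiers can be pictured as taking the larger of two type ranks. The sketch below is an assumption-laden illustration: the enum and the function name are hypothetical, and the ordering only mirrors the examples given in the text (short is promoted to long, double stays double).

```c
/* Hypothetical type ranks, ordered from smallest to largest. */
enum rank { T_BYTE, T_SHORT, T_USHORT, T_LONG, T_FLOAT, T_DOUBLE };

/* "minimum+" qualifier: keep the operation type unless it ranks
   below the stated minimum, in which case use the minimum. */
enum rank at_least(enum rank op_type, enum rank minimum)
{
    return op_type < minimum ? minimum : op_type;
}
```

With "int+" as the qualifier, an operation of type short yields an output of type long, while an operation of type double yields double.
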
688       This is about all there is to say about the "Pars" section in a
689       "pp_def" call. Remember that this section defines the signature of a
690       PP-defined function: you can use several options to qualify certain
691       arguments as output or temporary args, and all dimensions that you
692       can later refer to in the "Code" section are defined here by
693       name.
694
695       It is important that you understand the meaning of the signature since
696       in the latest PDL versions you can use it to define threaded functions
697       from within Perl, i.e. what we call Perl level threading. Please check
698       PDL::Indexing for details.
699
700   The Code section
701       The "Code" section contains the actual XS code that will be in the
702       innermost part of a thread loop (if you don't know what a thread loop
703       is then you still haven't read PDL::Indexing; do it now ;) after any PP
704       macros (like $GENERIC) and PP functions have been expanded (like the
705       "loop" function we are going to explain next).
706
707       Let's quickly reiterate the "sumover" example:
708
709         pp_def('sumover',
710                Pars => 'a(n); int+ [o] b()',
711                Code => '$GENERIC(b) tmp=0;
712                         loop(n) %{ tmp += $a(); %}
713                         $b() = tmp;',
714         );
715
716       The "loop" construct in the "Code" section also refers to the dimension
717       name so you don't need to specify any limits: the loop is correctly
718       sized and everything is done for you, again.
719
720       Next, there is the surprising fact that "$a()" and "$b()" do not
721       contain the index. This is not necessary because we're looping over n
722       and both variables know which dimensions they have so they
723       automatically know they're being looped over.
724
725       This feature comes in very handy in many places and makes for much
726       shorter code. Of course, there are times when you want to circumvent
727       this; here is a function which makes a matrix symmetric and serves as an
728       example of how to code explicit looping:
729
730               pp_def('symm',
731                       Pars => 'a(n,n); [o]c(n,n);',
732                       Code => 'loop(n) %{
733                                       int n2;
734                                       for(n2=n; n2<$SIZE(n); n2++) {
735                                               $c(n0 => n, n1 => n2) =
736                                               $c(n0 => n2, n1 => n) =
737                                                $a(n0 => n, n1 => n2);
738                                       }
739                               %}
740                       '
741               );
742
743       Let's dissect what is happening. Firstly, what is this function
744       supposed to do? From its signature you see that it takes a 2D matrix
745       with equal numbers of columns and rows and outputs a matrix of the same
746       size. From a given input matrix $a it computes a symmetric output
747       matrix $c (symmetric in the matrix sense that A^T = A where ^T means
748       matrix transpose, or in PDL parlance $c == $c->transpose). It does this
749       by using only the values on and below the diagonal of $a. In the output
750       matrix $c all values on and below the diagonal are the same as those in
751       $a while those above the diagonal are a mirror image of those below the
752       diagonal (above and below are here interpreted in the way that PDL
753       prints 2D pdls). If this explanation still sounds a bit strange just go
754       ahead, make a little file into which you write this definition, build
755       the new PDL extension (see section on Makefiles for PP code) and try it
756       out with a couple of examples.
757
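To see what the explicit loop in "symm" does, here is a plain C sketch over flat arrays. It is not the code PP generates. It assumes the first dimension varies fastest, so element (n0 => i, n1 => j) sits at offset i + j*n, and the function name is chosen only for illustration.

```c
#include <stddef.h>

/* Mirror the elements a(i, j) with j >= i across the diagonal of c.
   Element (i, j) of an n x n array lives at offset i + j*n. */
void symm(const double *a, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i; j < n; j++) {
            c[i + j * n] = c[j + i * n] = a[i + j * n];
        }
    }
}
```

For a 2x2 input stored as {1, 2, 3, 4}, the element at offset 2 is copied to offsets 1 and 2, so the output is {1, 3, 3, 4}.
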
758       Having explained what the function is supposed to do there are a couple
759       of points worth noting from the syntactical point of view. First, we
760       get the size of the dimension named "n" again by using the $SIZE macro.
761       Second, there are suddenly these funny "n0" and "n1" index names in the
762       code though the signature defines only the dimension "n". Why this? The
763       reason becomes clear when you note that both the first and second
764       dimension of $a and $c are named "n" in the signature of "symm". This
765       tells PDL::PP that the first and second dimension of these arguments
766       should have the same size. Otherwise the generated function will raise
767       a runtime error.  However, now in an access to $a and $c PDL::PP cannot
768       figure out which index "n" refers to any more just from the name of the
769       index.  Therefore, the indices with equal dimension names get numbered
770       from left to right starting at 0, e.g. in the above example "n0" refers
771       to the first dimension of $a and $c, "n1" to the second and so on.
772
773       In all examples so far, we have only used the "Pars" and "Code" members
774       of the hash that was passed to "pp_def". There are certainly other keys
775       that are recognised by PDL::PP and we will hear about some of them in
776       the course of this document. Find a (non-exhaustive) list of keys in
777       Appendix A.  A list of macros and PP functions (we have encountered
778       only some of those in the examples so far) that are expanded in values of
779       the hash argument to "pp_def" is summarised in Appendix B.
780
781       At this point, it might be appropriate to mention that PDL::PP is not a
782       completely static, well designed set of routines (as Tuomas puts it:
783       "stop thinking of PP as a set of routines carved in stone") but rather
784       a collection of things that the PDL::PP author (Tuomas J. Lukka)
785       considered he would have to write often into his PDL extension
786       routines. PP tries to be expandable so that in the future, as new needs
787       arise, new common code can be abstracted back into it. If you want to
788       learn more on why you might want to change PDL::PP and how to do it
789       check the section on PDL::PP internals.
790
791   Handling bad values
792       There are several keys and macros used when writing code to handle bad
793       values. The first one is the "HandleBad" key:
794
795       HandleBad => 0
796           This flags a pp-routine as NOT handling bad values. If this routine
797           is sent ndarrays with their "badflag" set, then a warning message
798           is printed to STDOUT and the ndarrays are processed as if the value
799           used to represent bad values is a valid number. The "badflag" value
800           is not propagated to the output ndarrays.
801
802           An example of when this is used is for FFT routines, which
803           generally do not have a way of ignoring part of the data.
804
805       HandleBad => 1
806           This causes PDL::PP to write extra code that ensures the BadCode
807           section is used, and that the "$ISBAD()" macro (and its brethren)
808           work.
809
810       HandleBad is not given
811           If any of the input ndarrays have their "badflag" set, then the
812           output ndarrays will have their "badflag" set, but any supplied
813           BadCode is ignored.
814
815       The value of "HandleBad" is used to define the contents of the "BadDoc"
816       key, if it is not given.
817
818       To handle bad values, code must be written somewhat differently; for
819       instance,
820
821        $c() = $a() + $b();
822
823       becomes something like
824
825        if ( $a() != BADVAL && $b() != BADVAL ) {
826           $c() = $a() + $b();
827        } else {
828           $c() = BADVAL;
829        }
830
831       However, we only want the second version if bad values are present in
832       the input ndarrays (and that bad-value support is wanted!) - otherwise
833       we actually want the original code. This is where the "BadCode" key
834       comes in; you use it to specify the code to execute if bad values may
835       be present, and PP uses both it and the "Code" section to create
836       something like:
837
838        if ( bad_values_are_present ) {
839           fancy_threadloop_stuff {
840              BadCode
841           }
842        } else {
843           fancy_threadloop_stuff {
844              Code
845           }
846        }
847
848       This approach means that there is virtually no overhead when bad values
849       are not present (i.e. the badflag routine returns 0).
850
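The two-branch layout can be mimicked in plain C. This is a sketch under stated assumptions: "BADVAL" is a hypothetical sentinel, the "badflag" argument stands in for the check that PP performs on the input ndarrays, and the function names are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical sentinel standing in for the bad value of this type. */
#define BADVAL (-9999.0)

/* Plays the role of the Code section: no bad-value checks at all. */
static void add_plain(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Plays the role of the BadCode section: check every element. */
static void add_bad_aware(const double *a, const double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != BADVAL && b[i] != BADVAL)
            c[i] = a[i] + b[i];
        else
            c[i] = BADVAL;
    }
}

/* The flag is tested once, outside the loops, so the plain path
   pays nothing per element for bad-value support. */
void add(const double *a, const double *b, double *c, size_t n, int badflag)
{
    if (badflag)
        add_bad_aware(a, b, c, n);
    else
        add_plain(a, b, c, n);
}
```

The point of the design is that the decision is made once per call rather than once per element.
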
851       The C preprocessor symbol "PDL_BAD_CODE" is defined when the bad code
852       is compiled, so that you can reduce the amount of code you write.  The
853       BadCode section can use the same macros and looping constructs as the
854       Code section.  However, it wouldn't be much use without the following
855       additional macros:
856
857       $ISBAD(var)
858           To check whether an ndarray's value is bad, use the $ISBAD macro:
859
860            if ( $ISBAD(a()) ) { printf("a() is bad\n"); }
861
862           You can also access given elements of an ndarray:
863
864            if ( $ISBAD(a(n=>l)) ) { printf("element %d of a() is bad\n", l); }
865
866       $ISGOOD(var)
867           This is the opposite of the $ISBAD macro.
868
869       $SETBAD(var)
870           For when you want to set an element of an ndarray bad.
871
872       $ISBADVAR(c_var,pdl)
873           If you have cached the value of an ndarray "$a()" into a c-variable
874           ("foo" say), then to check whether it is bad, use
875           "$ISBADVAR(foo,a)".
876
877       $ISGOODVAR(c_var,pdl)
878           As above, but this time checking that the cached value isn't bad.
879
880       $SETBADVAR(c_var,pdl)
881           To copy the bad value for an ndarray into a c variable, use
882           "$SETBADVAR(foo,a)".
883
884       TODO: mention "$PPISBAD()" etc macros.
885
886       Using these macros, the above code could be specified as:
887
888        Code => '$c() = $a() + $b();',
889        BadCode => '
890           if ( $ISBAD(a()) || $ISBAD(b()) ) {
891              $SETBAD(c());
892           } else {
893              $c() = $a() + $b();
894           }',
895
896       Since this is Perl, TMTOWTDI, so you could also write:
897
898        BadCode => '
899           if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
900              $c() = $a() + $b();
901           } else {
902              $SETBAD(c());
903           }',
904
905       You can reduce code repetition using the C "PDL_BAD_CODE" macro, using
906       the same code for both of the "Code" and "BadCode" sections:
907
908           #ifdef PDL_BAD_CODE
909           if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
910           #endif /* PDL_BAD_CODE */
911
912              $c() = $a() + $b();
913
914           #ifdef PDL_BAD_CODE
915           } else {
916              $SETBAD(c());
917           }
918           #endif /* PDL_BAD_CODE */
919
920       If you want access to the value of the badflag for a given ndarray, you
921       can use the PDL STATE macros:
922
923       $ISPDLSTATEBAD(pdl)
924       $ISPDLSTATEGOOD(pdl)
925       $SETPDLSTATEBAD(pdl)
926       $SETPDLSTATEGOOD(pdl)
927
928       TODO: mention the "FindBadStatusCode" and "CopyBadStatusCode" options
929       to "pp_def", as well as the "BadDoc" key.
930
931   Interfacing your own/library functions using PP
932       Now, consider the following: you have your own C function (that may in
933       fact be part of some library you want to interface to PDL) which takes
934       as arguments two pointers to vectors of double:
935
936               void myfunc(int n,double *v1,double *v2);
937
938       The correct way of defining the PDL function is
939
940               pp_def('myfunc',
941                       Pars => 'a(n); [o]b(n);',
942                       GenericTypes => ['D'],
943                       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
944               );
945
946       The "$P("par")" syntax returns a pointer to the first element and the
947       other elements are guaranteed to lie after that.
948
949       Notice that here it is possible to make many mistakes. First, $SIZE(n)
950       must be used instead of "n". Second, you shouldn't put any loops in
951       this code. Third, here we encounter a new hash key recognised by
952       PDL::PP : the "GenericTypes" declaration tells PDL::PP to ONLY GENERATE
953       THE TYPELOOP FOR THE LIST OF TYPES SPECIFIED. In this case "double".
954       This has two advantages. Firstly the size of the compiled code is
955       reduced vastly, secondly if non-double arguments are passed to
956       "myfunc()" PDL will automatically convert them to double before passing
957       to the external C routine and convert them back afterwards.
958
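The convert, call, convert-back sequence described above can be pictured with a hand-written C wrapper. This is only a sketch of the idea, not the code PDL produces: "myfunc" here is a stand-in body (it doubles each element), and "myfunc_short" is a hypothetical name.

```c
#include <stdlib.h>

/* Stand-in for the library routine: writes 2*v1[i] into v2[i]. */
void myfunc(int n, double *v1, double *v2)
{
    for (int i = 0; i < n; i++)
        v2[i] = 2.0 * v1[i];
}

/* Call the double-only routine on short data: widen the input,
   call, then narrow the result. Returns 0 on success, -1 if the
   temporary buffers could not be allocated. */
int myfunc_short(int n, const short *in, short *out)
{
    double *v1 = malloc((size_t)n * sizeof *v1);
    double *v2 = malloc((size_t)n * sizeof *v2);
    int ok = (v1 != NULL && v2 != NULL);

    if (ok) {
        for (int i = 0; i < n; i++)
            v1[i] = in[i];
        myfunc(n, v1, v2);
        for (int i = 0; i < n; i++)
            out[i] = (short)v2[i];
    }
    free(v1);
    free(v2);
    return ok ? 0 : -1;
}
```

The temporary double copies are the price of restricting the type loop; for large data that cost is worth keeping in mind.
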
959       One can also use "Pars" to qualify the types of individual arguments.
960       Thus one could also write this as:
961
962               pp_def('myfunc',
963                       Pars => 'double a(n); double [o]b(n);',
964                       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
965               );
966
967       The type specification in "Pars" exempts the argument from variation in
968       the typeloop - rather it is automatically converted to and from the
969       type specified. This is obviously useful in a more general example,
970       e.g.:
971
972               void myfunc(int n,float *v1,long *v2);
973
974               pp_def('myfunc',
975                       Pars => 'float a(n); long [o]b(n);',
976                       GenericTypes => ['F'],
977                       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
978               );
979
980       Note we still use "GenericTypes" to reduce the size of the type loop.
981       Obviously PP could in principle spot this and do it automatically,
982       though the code has yet to attain that level of sophistication!
983
984       Finally, note that when types are converted automatically one MUST use the
985       "[o]" qualifier for output variables or your hard-won changes will get
986       optimised away by PP!
987
988       If you interface a large library you can automate the interfacing even
989       further. Perl can help you again(!) in doing this. In many libraries
990       you have certain calling conventions. This can be exploited. In short,
991       you can write a little parser (which is really not difficult in Perl)
992       that then generates the calls to "pp_def" from parsed descriptions of
993       the functions in that library. For an example, please check the Slatec
994       interface in the "Lib" tree of the PDL distribution. If you want to
995       check (during debugging) which calls to PP functions your Perl code
996       generated, a little helper package comes in handy: it replaces the PP
997       functions with identically named ones that dump their arguments to
998       stdout.
999
1000       Just say
1001
1002          perl -MPDL::PP::Dump myfile.pd
1003
1004       to see the calls to "pp_def" and friends. Try it with ops.pd and
1005       slatec.pd. If you're interested (or want to enhance it), the source is
1006       in Basic/Gen/PP/Dump.pm
1007
1008   Other macros and functions in the Code section
1009       Macros: So far we have encountered the $SIZE, $GENERIC and $P macros.
1010       Now we are going to quickly explain the other macros that are expanded
1011       in the "Code" section of PDL::PP along with examples of their usage.
1012
1013       $T The $T macro is used for type switches. This is very useful when you
1014          have to use different external (e.g. library) functions depending on
1015          the input type of arguments. The general syntax is
1016
1017                  $Ttypeletters(type_alternatives)
1018
1019          where "typeletters" is a permutation of a subset of the letters
1020          "BSULNQFD" which stand for Byte, Short, Ushort, etc. and
1021          "type_alternatives" are the expansions when the type of the PP
1022          operation is equal to that indicated by the respective letter. Let's
1023          illustrate this incomprehensible description by an example. Assuming
1024          you have two C functions with prototypes
1025
1026            void float_func(float *in, float *out);
1027            void double_func(double *in, double *out);
1028
1029          which do basically the same thing but one accepts float and the
1030          other double pointers. You could interface them to PDL by defining a
1031          generic function "foofunc" (which will call the correct function
1032          depending on the type of the transformation):
1033
1034            pp_def('foofunc',
1035                  Pars => ' a(n); [o] b();',
1036                  Code => ' $TFD(float,double)_func ($P(a),$P(b));',
1037                  GenericTypes => [qw(F D)],
1038            );
1039
1040          There is a limitation that the comma-separated values cannot have
1041          parentheses.
1042
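The effect of "$TFD(float,double)_func" can be pictured as a switch on the type of the operation. The sketch below is hand-written C, not PP output: the enum tags and the dispatcher name are hypothetical, and the two library functions are given trivial stand-in bodies (each adds one to its input).

```c
/* Hypothetical tags for the type of the operation. */
enum optype { OP_FLOAT, OP_DOUBLE };

/* Stand-ins for the two library routines. */
void float_func(float *in, float *out)    { *out = *in + 1.0f; }
void double_func(double *in, double *out) { *out = *in + 1.0; }

/* One branch per letter in the $T list: each branch names the
   function chosen for that type. */
void foofunc(enum optype t, void *in, void *out)
{
    switch (t) {
    case OP_FLOAT:
        float_func(in, out);
        break;
    case OP_DOUBLE:
        double_func(in, out);
        break;
    }
}
```

In the PP version no runtime switch is written by hand; the macro picks the matching alternative in each copy of the code that the type loop produces.
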
1043       $PP
1044          The $PP macro is used for a so called physical pointer access. The
1045          physical refers to some internal optimisations of PDL (for those who
1046          are familiar with the PDL core we are talking about the vaffine
1047          optimisations). This macro is mainly for internal use and you
1048          shouldn't need to use it in any of your normal code.
1049
1050       $COMP (and the "OtherPars" section)
1051          The $COMP macro is used to access non-pdl values in the code
1052          section. Its name is derived from the implementation of
1053          transformations in PDL. The variables you can refer to using $COMP
1054          are members of the ``compiled'' structure that represents the PDL
1055          transformation in question but does not yet contain any information
1056          about dimensions (for further details check PDL::Internals).
1057          However, you can treat $COMP just as a black box without knowing
1058          anything about the implementation of transformations in PDL. So when
1059          would you use this macro? Its main usage is to access values of
1060          arguments that are declared in the "OtherPars" section of a "pp_def"
1061          definition. But then you haven't heard about the "OtherPars" key
1062          yet?!  Let's have another example that illustrates typical usage of
1063          both new features:
1064
1065            pp_def('pnmout',
1066                  Pars => 'a(m)',
1067                  OtherPars => "char* fd",
1068                  GenericTypes => [qw(B U S L)],
1069                  Code => 'PerlIO *fp;
1070                           IO *io;
1071                           int len = $SIZE(m) * sizeof($GENERIC());
1072                           io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
1073                           if (!io || !(fp = IoIFP(io)))
1074                                  croak("Can\'t figure out FP");
1075
1076                           if (PerlIO_write(fp,$P(a),len) != len)
1077                                          croak("Error writing pnm file");
1078            ');
1079
1080          This function is used to write data from a pdl to a file. The file
1081          descriptor is passed as a string into this function. This parameter
1082          does not go into the "Pars" section since it cannot be usefully
1083          treated like a pdl but rather into the aptly named "OtherPars"
1084          section. Parameters in the "OtherPars" section follow those in the
1085          "Pars" section when invoking the function, i.e.
1086
1087             open FILE,">out.dat" or die "couldn't open out.dat";
1088             pnmout($pdl,'FILE');
1089
1090          When you want to access this parameter inside the code section you
1091          have to tell PP by using the $COMP macro, i.e. you write "$COMP(fd)"
1092          as in the example. Otherwise PP wouldn't know that the "fd" you are
1093          referring to is the same as that specified in the "OtherPars"
1094          section.
1095
1096          Another use for the "OtherPars" section is to set a named dimension
1097          in the signature. Let's have an example how that is done:
1098
1099            pp_def('setdim',
1100                  Pars => '[o] a(n)',
1101                  OtherPars => 'int ns => n',
1102                  Code => 'loop(n) %{ $a() = n; %}',
1103            );
1104
1105          This says that the named dimension "n" will be initialised from the
1106          value of the other parameter "ns" which is of integer type (I guess
1107          you have realised that we use the "CType From => named_dim" syntax).
1108          Now you can call this function in the usual way:
1109
1110            setdim(($x=null),5);
1111            print $x;
1112              [ 0 1 2 3 4 ]
1113
1114          Admittedly this function is not very useful but it demonstrates how
1115          it works. If you call the function with an existing pdl, you
1116          don't need to explicitly specify the size of "n" since PDL::PP can
1117          figure it out from the dimensions of the non-null pdl. In that case
1118          you just give the dimension parameter as "-1":
1119
1120            $x = hist($y);
1121            setdim($x,-1);
1122
1123          That should do it.
1124
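The "setdim" behaviour described above, including the "-1" convention, can be sketched in plain C. The names are hypothetical and "long" stands in for the index type; this mirrors only what the text states, not PDL's internal dimension handling.

```c
/* Size to use for n: the caller's value, or the length of the
   existing array when the caller passes -1. */
long resolve_n(long ns, long existing)
{
    return ns == -1 ? existing : ns;
}

/* Body of the example: store each index into its own slot. */
void setdim(long *a, long n)
{
    for (long i = 0; i < n; i++)
        a[i] = i;
}
```

Resolving an explicit 5 against an empty array gives 5; resolving -1 against an existing array of length 7 gives 7.
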
1125       The only PP function that we have used in the examples so far is
1126       "loop".  Additionally, there are currently two other functions which
1127       are recognised in the "Code" section:
1128
1129       threadloop
1130         As we heard above the signature of a PP defined function defines the
1131         dimensions of all the pdl arguments involved in a primitive
1132         operation.  However, you often call the functions that you defined
1133         with PP with pdls that have more dimensions than those specified in
1134         the signature. In this case the primitive operation is performed on
1135         all subslices of appropriate dimensionality in what is called a
1136         thread loop (see also overview above and PDL::Indexing). Assuming you
1137         have some notion of this concept you will probably appreciate that
1138         the operation specified in the code section should be optimised since
1139         this is the tightest loop inside a thread loop.  However, if you
1140         revisit the example where we define the "pnmout" function, you will
1141         quickly realise that looking up the "IO" file descriptor in the inner
1142         thread loop is not very efficient when writing a pdl with many rows.
1143         A better approach would be to look up the "IO" descriptor once
1144         outside the thread loop and use its value then inside the tightest
1145         thread loop. This is exactly where the "threadloop" function comes in
1146         handy. Here is an improved definition of "pnmout" which uses this
1147         function:
1148
1149           pp_def('pnmout',
1150                 Pars => 'a(m)',
1151                 OtherPars => "char* fd",
1152                 GenericTypes => [qw(B U S L)],
1153                 Code => 'PerlIO *fp;
1154                          IO *io;
1155                          int len;
1156
1157                        io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
1158                          if (!io || !(fp = IoIFP(io)))
1159                                 croak("Can\'t figure out FP");
1160
1161                          len = $SIZE(m) * sizeof($GENERIC());
1162
1163                          threadloop %{
1164                             if (PerlIO_write(fp,$P(a),len) != len)
1165                                         croak("Error writing pnm file");
1166                          %}
1167           ');
1168
1169         This works as follows. Normally the C code you write inside the
1170         "Code" section is placed inside a thread loop (i.e. PP generates the
1171         appropriate wrapping XS code around it). However, when you explicitly
1172         use the "threadloop" function, PDL::PP recognises this and doesn't
1173         wrap your code with an additional thread loop. This has the effect
1174         that code you write outside the thread loop is only executed once per
1175         transformation and just the code within the surrounding "%{ ... %}"
1176         pair is placed within the tightest thread loop. This also comes in
1177         handy when you want to perform a decision (or any other code,
1178         especially CPU intensive code) only once per thread, i.e.
1179
1180           pp_addhdr('
1181             #define RAW 0
1182             #define ASCII 1
1183           ');
1184           pp_def('do_raworascii',
1185                  Pars => 'a(); b(); [o]c()',
1186                  OtherPars => 'int mode',
1187                Code => ' switch ($COMP(mode)) {
1188                             case RAW:
1189                                 threadloop %{
1190                                     /* do raw stuff */
1191                                 %}
1192                                 break;
1193                             case ASCII:
1194                                 threadloop %{
1195                                     /* do ASCII stuff */
1196                                 %}
1197                                 break;
1198                             default:
1199                                 croak("unknown mode");
1200                            }'
1201            );
1202
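The benefit of an explicit "threadloop" is ordinary loop hoisting. The plain C sketch below counts how often a setup step runs in each arrangement. Everything in it is hypothetical: "lookup_handle" merely stands in for an expensive step such as the filehandle lookup in "pnmout".

```c
#include <stddef.h>

/* Number of times the setup step has run. */
size_t lookups = 0;

/* Stand-in for an expensive setup step, e.g. a handle lookup. */
static int lookup_handle(const char *name)
{
    lookups++;
    return name[0];
}

/* Setup repeated for every row: the situation when all of the code
   sits inside the implicit thread loop. */
long rows_naive(const char *name, size_t nrows)
{
    long total = 0;
    for (size_t row = 0; row < nrows; row++) {
        int h = lookup_handle(name);
        total += h;
    }
    return total;
}

/* Setup done once, before the loop: the situation when the per-row
   work is wrapped in an explicit threadloop. */
long rows_hoisted(const char *name, size_t nrows)
{
    int h = lookup_handle(name);
    long total = 0;
    for (size_t row = 0; row < nrows; row++)
        total += h;
    return total;
}
```

Both functions compute the same total; for 100 rows the first performs 100 lookups and the second performs one.
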
1203       types
1204         The types function works similarly to the $T macro. However, with the
1205         "types" function the code in the following block (delimited by "%{"
1206         and "%}" as usual) is executed for all those cases in which the
1207         datatype of the operation is any of the types represented by the
1208         letters in the argument to "types", e.g.
1209
1210              Code => '...
1211
1212                      types(BSUL) %{
1213                          /* do integer type operation */
1214                      %}
1215                      types(FD) %{
1216                          /* do floating point operation */
1217                      %}
1218                      ...'
1219
1220         You are encouraged to use this idiom (from PDL::Math) in order to
1221         minimise effort needed to make your code work with new types:
1222
1223           use PDL::Types qw(types);
1224           my @Rtypes = grep $_->real, types();
1225           my @Ctypes = grep !$_->real, types();
1226           # ...
1227             my $got_complex = PDL::Core::Dev::got_complex_version($name, 2);
1228             my $complex_bit = join "\n",
1229               map 'types('.$_->ppsym.') %{$'.$c.'() = c'.$name.$_->floatsuffix.'($'.$x.'(),$'.$y.'());%}',
1230               @Ctypes;
1231             my $real_bit = join "\n",
1232               map 'types('.$_->ppsym.') %{$'.$c.'() = '.$name.'($'.$x.'(),$'.$y.'());%}',
1233               @Rtypes;
1234             ($got_complex ? $complex_bit : '') . $real_bit;
1235
1236   The RedoDimsCode Section
1237       The "RedoDimsCode" key is an optional key that is used to compute
1238       dimensions of ndarrays at runtime in case the standard rules for
1239       computing dimensions from the signature are not sufficient. The
1240       contents of the "RedoDimsCode" entry are interpreted in the same way
1241       that the Code section is interpreted-- i.e., PP macros are expanded and
1242       the result is interpreted as C code. The purpose of the code is to set
1243       the size of some dimensions that appear in the signature. Storage
1244       allocation and threadloops and so forth will be set up as if the
1245       computed dimension had appeared in the signature. In your code, you
1246       first compute the desired size of a named dimension in the signature
1247       according to your needs and then assign that value to it via the
1248       $SIZE() macro.
1249
1250       As an example, consider the following situation. You are interfacing an
1251       external library routine that requires a temporary array for workspace
1252       to be passed as an argument. Two input data arrays that are passed are
1253       p(m) and x(n). The output data array is y(n). The routine requires a
1254       workspace array with a length of n+m*m, and you'd like the storage
1255       created automatically just like it would be for any ndarray flagged
1256       with [t] or [o].  What you'd like is to say something like
1257
1258        pp_def( "myexternalfunc",
1259         Pars => " p(m);  x(n);  [o] y; [t] work(n+m*m); ", ...
1260
1261       but that won't work, because PP can't interpret expressions with
1262       arithmetic in the signature. Instead you write
1263
1264         pp_def(
1265             "myexternalfunc",
1266             Pars         => ' p(m);  x(n);  [o] y(); [t] work(wn); ',
1267             RedoDimsCode => '
1268               PDL_Indx im = $PDL(p)->dims[0];
1269               PDL_Indx in = $PDL(x)->dims[0];
1270               PDL_Indx min = in + im * im;
1271               PDL_Indx inw = $PDL(work)->dims[0];
1272               $SIZE(wn) = inw >= min ? inw : min;
1273             ',
1274             Code => '
1275               externalfunc( $P(p), $P(x), $SIZE(m), $SIZE(n), $P(work) );
1276             '
1277         );
1278
1279       This code works as follows: The macro $PDL(p) expands to a pointer to
1280       the pdl struct for the ndarray p.  You don't want a pointer to the data
1281       (i.e. $P) in this case, because you want to access the members of the
1282       ndarray's struct at the C level. You get the first dimension of each of
1283       the ndarrays and store them in PDL_Indx variables. Then you compute the
1284       minimum length the work array can be. If the user sent an ndarray "work"
1285       with sufficient storage, then leave it alone. If the user sent, say, a
1286       null pdl, or no pdl at all, then the size of wn will be zero and you reset
1287       it to the minimum value. Before the code in the Code section is
1288       executed PP will create the proper storage for "work" if it does not
1289       exist. Note that you only took the first dimension of "p" and "x"
1290       because the user may have sent ndarrays with extra threading
1291       dimensions. Of course, the temporary ndarray "work" (note the [t] flag)
1292       should not be given any thread dimensions anyway.
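The size rule used in the RedoDimsCode above can be checked in isolation. What follows is a minimal sketch in plain C, not PP code: "indx_t" is a hypothetical stand-in for PDL_Indx, and "work_size" is a name invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PDL_Indx (an assumption, not the PDL typedef). */
typedef ptrdiff_t indx_t;

/*
 * Mirror of the RedoDimsCode logic: the workspace must hold at least
 * n + m*m elements. A user-supplied workspace that is already large
 * enough keeps its length; anything shorter (including length 0 for a
 * null or missing ndarray) is raised to the minimum.
 */
static indx_t work_size(indx_t m, indx_t n, indx_t supplied)
{
    indx_t min = n + m * m;
    return supplied >= min ? supplied : min;
}
```

With m = 3 and n = 4 the minimum is 13, so a missing workspace gets 13 elements while a supplied workspace of 20 elements is left alone.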
1293
1294       You can also use "RedoDimsCode" to set the dimension of an ndarray
1295       flagged with [o]. In this case you set the dimensions for the named
1296       dimension in the signature using $SIZE() as in the preceding example.
1297       However, because the ndarray is flagged with [o] instead of [t],
1298       threading dimensions will be added if required just as if the size of
1299       the dimension were computed from the signature according to the usual
1300       rules. Here is an example from PDL::Math:
1301
1302        pp_def("polyroots",
1303             Pars => 'cr(n); ci(n); [o]rr(m); [o]ri(m);',
1304             RedoDimsCode => 'PDL_Indx sn = $PDL(cr)->dims[0]; $SIZE(m) = sn-1;',
1305
1306       The input ndarrays are the real and imaginary parts of complex
1307       coefficients of a polynomial. The output ndarrays are real and
1308       imaginary parts of the roots. There are "n" roots to an "n"th order
1309       polynomial and such a polynomial has "n+1" coefficients (the zero-th
1310       through the "n"th). In this example, threading will work correctly.
1311       That is, the first dimension of the output ndarray will have its
1312       dimension adjusted, but other threading dimensions will be assigned
1313       just as if there were no "RedoDimsCode".
1314
1315   Typemap handling in the "OtherPars" section
1316       The "OtherPars" section discussed above is very often absolutely
1317       crucial when you interface external libraries with PDL. However in many
1318       cases the external libraries either use derived types or pointers of
1319       various types.
1320
1321       The standard way to handle this in Perl is to use a "typemap" file.
1322       This is discussed in some detail in perlxs in the standard Perl
1323       documentation. In PP the functionality is very similar, so you can
1324       create a "typemap" file in the directory where your PP file resides;
1325       when the module is built, this file is read automatically to work out
1326       the appropriate translation between the C type and Perl's built-in type.
1327
1328       That said, there are a couple of important differences from the general
1329       handling of types in XS. The first, and probably most important, is
1330       that at the moment pointers to types are not allowed in the "OtherPars"
1331       section. To get around this limitation you must use the "IV" type
1332       (thanks to Judd Taylor for pointing out that this is necessary for
1333       portability).
1334
1335       It is probably best to illustrate this with a couple of code-snippets:
1336
1337       For instance the "gsl_spline_init" function has the following C
1338       declaration:
1339
1340           int  gsl_spline_init(gsl_spline * spline,
1341                 const double xa[], const double ya[], size_t size);
1342
1343       Clearly the "xa" and "ya" arrays are candidates for being passed in as
1344       ndarrays and the "size" argument is just the length of these ndarrays
1345       so that can be handled by the "$SIZE()" macro in PP. The problem is the
1346       pointer to the "gsl_spline" type. The natural solution would be to
1347       write an "OtherPars" declaration of the form
1348
1349           OtherPars => 'gsl_spline *spl'
1350
1351       and write a short "typemap" file that handles this type. This does not
1352       work at present, however! So you have to work around the problem
1353       slightly (and in some ways this is easier too!):
1354
1355       The solution is to declare "spline" in the "OtherPars" section using an
1356       "Integer Value", "IV". This hides the nature of the variable from PP
1357       and you then need to (well to avoid compiler warnings at least!)
1358       perform a type cast when you use the variable in your code. Thus
1359       "OtherPars" should take the form:
1360
1361           OtherPars => 'IV spl'
1362
1363       and when you use it in the code you will write
1364
1365           INT2PTR(gsl_spline *, $COMP(spl))
1366
1367       where the Perl API macro "INT2PTR" has been used to handle the pointer
1368       cast to avoid compiler warnings and problems for machines with mixed
1369       32bit and 64bit Perl configurations.  Putting this together as Andres
1370       Jordan has done (with the modification using "IV" by Judd Taylor) in
1371       the "gsl_interp.pd" in the distribution source you get:
1372
1373            pp_def('init_meat',
1374                   Pars => 'double x(n); double y(n);',
1375                   OtherPars => 'IV spl',
1376                   Code =>'
1377                gsl_spline_init( INT2PTR(gsl_spline *, $COMP(spl)), $P(x), $P(y), $SIZE(n) );'
1378           );
1379
1380       where I have removed a macro wrapper call, since it would only obscure
1381       the discussion.
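The idea behind the "IV" trick is that a pointer can travel through an integer wide enough to hold it and be cast back on the other side. The sketch below shows this round trip in plain C, assuming only <stdint.h>; it does not use the Perl API. "spline_t", "handle_from_ptr", "ptr_from_handle" and "roundtrip_npoints" are hypothetical names, with "intptr_t" playing the role of Perl's IV and the cast back playing the role of INT2PTR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opaque library type standing in for gsl_spline. */
typedef struct { int npoints; } spline_t;

/* Pack: the integer value that would be handed over as an "IV". */
static intptr_t handle_from_ptr(spline_t *p)
{
    return (intptr_t)p;
}

/* Unpack: the role INT2PTR(spline_t *, iv) plays inside PP code. */
static spline_t *ptr_from_handle(intptr_t h)
{
    return (spline_t *)h;
}

/* Round-trip a pointer through the integer handle and read a field back. */
static int roundtrip_npoints(int n)
{
    spline_t s = { n };
    intptr_t h = handle_from_ptr(&s);
    return ptr_from_handle(h)->npoints;
}
```

In real PP code you would use the Perl-provided INT2PTR macro rather than a bare cast, since it picks an integer type that matches the Perl build.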
1382
1383       The other minor difference as compared to the standard typemap handling
1384       in Perl, is that the user cannot specify non-standard typemap locations
1385       or typemap filenames using the "TYPEMAPS" option in MakeMaker... Thus
1386       you can only use a file called "typemap" and/or the "IV" trick above.
1387
1388   Other useful PP keys in data operation definitions
1389       You have already heard about the "OtherPars" key. Currently, there are
1390       not many other keys for a data operation that will be useful in normal
1391       (whatever that is) PP programming. In fact, it would be interesting to
1392       hear about a case where you think you need more than what is provided
1393       at the moment.  Please speak up on one of the PDL mailing lists. Most
1394       other keys recognised by "pp_def" are only really useful for what we
1395       call slice operations (see also above).
1396
1397       One feature that is firmly planned is support for a variable number of
1398       arguments, which will be a little tricky.
1399
1400       An incomplete list of the available keys:
1401
1402       Inplace
1403           Setting this key marks the routine as working inplace, i.e. the
1404           input and output ndarrays are the same. An example is
1405           "$x->inplace->sqrt()" (or "sqrt(inplace($x))").
1406
1407           Inplace => 1
1408               Use when the routine is a unary function, such as "sqrt".
1409
1410           Inplace => ['a']
1411               If there is more than one input ndarray, specify the name of
1412               the one that can be changed inplace using an array reference.
1413
1414           Inplace => ['a','b']
1415               If there is more than one output ndarray, specify the names of
1416               the input ndarray and output ndarray in a 2-element array
1417               reference. This probably isn't needed, but is left in for
1418               completeness.
1419
1420           If bad values are being used, care must be taken to ensure the
1421           propagation of the badflag when inplace is being used; consider
1422           this excerpt from Basic/Bad/bad.pd:
1423
1424             pp_def('replacebad',HandleBad => 1,
1425               Pars => 'a(); [o]b();',
1426               OtherPars => 'double newval',
1427               Inplace => 1,
1428               CopyBadStatusCode =>
1429               '/* propagate badflag if inplace AND it has changed */
1430                if ( a == b && $ISPDLSTATEBAD(a) )
1431                  PDL->propagate_badflag( b, 0 );
1432
1433                /* always make sure the output is "good" */
1434                $SETPDLSTATEGOOD(b);
1435               ',
1436               ...
1437
1438           Since this routine removes all bad values, the output ndarray has
1439           its bad flag cleared. If run inplace (so "a == b"), then we have to
1440           tell all the children of "a" that the bad flag has been cleared (to
1441           save time we make sure that we call "PDL->propagate_badflag" only
1442           if the input ndarray had its bad flag set).
1443
1444           NOTE: one idea is that the documentation for the routine could be
1445           automatically flagged to indicate that it can be executed inplace,
1446           ie something similar to how "HandleBad" sets "BadDoc" if it's not
1447           supplied (it's not an ideal solution).
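A routine is safe to run inplace when it still gives the right answer with the input and output buffers being the same memory. The following plain C sketch illustrates that property for a "replacebad"-like loop. It is a simplification under stated assumptions: bad values are modelled as a caller-supplied sentinel rather than through PDL's "$ISBAD()" machinery, and "replace_sentinel" and "demo_inplace_sum" are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy in[] to out[], substituting newval for every element equal to
 * the sentinel `bad`. Each element is read before it is written, so the
 * function also works when in == out (the inplace case).
 * Returns the number of elements replaced.
 */
static size_t replace_sentinel(const double *in, double *out, size_t n,
                               double bad, double newval)
{
    size_t replaced = 0;
    for (size_t i = 0; i < n; i++) {
        double v = in[i];
        if (v == bad) {
            v = newval;
            replaced++;
        }
        out[i] = v;
    }
    return replaced;
}

/* Demo helper: run the routine inplace on a fixed sample and sum the result. */
static double demo_inplace_sum(void)
{
    double a[4] = { 1.0, -999.0, 3.0, -999.0 };
    replace_sentinel(a, a, 4, -999.0, 0.0);
    return a[0] + a[1] + a[2] + a[3];
}
```

A loop that read neighbouring elements (for example a smoothing filter) would not have this property and should not be marked with "Inplace".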
1448
1449   Other PDL::PP functions to support concise package definition
1450       So far, we have described the "pp_def" and "pp_done" functions. PDL::PP
1451       exports a few other functions to aid you in writing concise PDL
1452       extension package definitions.
1453
1454       pp_addhdr
1455
1456       Often when you interface library functions as in the above example you
1457       have to include additional C include files. Since the XS file is
1458       generated by PP we need some means to make PP insert the appropriate
1459       include directives in the right place into the generated XS file.  To
1460       this end there is the "pp_addhdr" function. This is also the function
1461       to use when you want to define some C functions for internal use by
1462       some of the XS functions (which are mostly functions defined by
1463       "pp_def").  By including these functions here you make sure that
1464       PDL::PP inserts your code before the point where the actual XS module
1465       section begins, so that it is left untouched by xsubpp (cf. the
1466       perlxs and perlxstut man pages).
1467
1468       A typical call would be
1469
1470         pp_addhdr('
1471         #include <unistd.h>       /* we need defs of XXXX */
1472         #include "libprotos.h"    /* prototypes of library functions */
1473         #include "mylocaldecs.h"  /* Local decs */
1474
1475         static void do_the_real_work(PDL_Byte * in, PDL_Byte * out, int n)
1476         {
1477               /* do some calculations with the data */
1478         }
1479         ');
1480
1481       This ensures that all the constants and prototypes you need will be
1482       properly included and that you can use the internal functions defined
1483       here in the "pp_def"s, e.g.:
1484
1485         pp_def('barfoo',
1486                Pars => ' a(n); [o] b(n)',
1487                GenericTypes => ['B'],
1488                Code => ' PDL_Indx ns = $SIZE(n);
1489                          do_the_real_work($P(a),$P(b),ns);
1490                        ',
1491         );
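For concreteness, here is one possible body for a helper of this shape, written as a plain C sketch that compiles without PDL. The calculation (inverting each byte) is invented for illustration, "byte_t" is a hypothetical stand-in for PDL_Byte, and "demo_invert" exists only to exercise the helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PDL_Byte (an assumption for this sketch). */
typedef unsigned char byte_t;

/* Example "real work": write the bitwise-inverted value of each input byte. */
static void do_the_real_work(const byte_t *in, byte_t *out, ptrdiff_t n)
{
    for (ptrdiff_t i = 0; i < n; i++)
        out[i] = (byte_t)(255 - in[i]);
}

/* Demo helper: invert a fixed three-byte sample and return element k. */
static int demo_invert(int k)
{
    const byte_t in[3] = { 0, 100, 255 };
    byte_t out[3];
    do_the_real_work(in, out, 3);
    return out[k];
}
```

Keeping such helpers "static" confines them to the generated XS file, which avoids name clashes with other modules.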
1492
1493       pp_addpm
1494
1495       In many cases the actual PP code (meaning the arguments to "pp_def"
1496       calls) is only part of the package you are currently implementing.
1497       Often there is additional Perl code and XS code you would normally have
1498       written into the pm and XS files which are now automatically generated
1499       by PP. So how do you get this code into those dynamically generated files?
1500       Fortunately, there are a couple of functions, generally called
1501       "pp_addXXX" that assist you in doing this.
1502
1503       Let's assume you have additional Perl code that should go into the
1504       generated pm-file. This is easily achieved with the "pp_addpm" command:
1505
1506          pp_addpm(<<'EOD');
1507
1508          =head1 NAME
1509
1510          PDL::Lib::Mylib -- a PDL interface to the Mylib library
1511
1512          =head1 DESCRIPTION
1513
1514          This package implements an interface to the Mylib package with full
1515          threading and indexing support (see L<PDL::Indexing>).
1516
1517          =cut
1518
1519          use PGPLOT;
1520
1521          =head2 use_myfunc
1522               this function applies the myfunc operation to all the
1523               elements of the input pdl regardless of dimensions
1524               and returns the sum of the result
1525          =cut
1526
1527          sub use_myfunc {
1528               my $pdl = shift;
1529
1530               myfunc($pdl->clump(-1),(my $res=null));
1531
1532               return $res->sum;
1533          }
1534
1535          EOD
1536
1537       pp_add_exported
1538
1539       You have probably got the idea. In some cases you also want to export
1540       your additional functions. To avoid getting into trouble with PP which
1541       also messes around with the @EXPORT array you just tell PP to add your
1542       functions to the list of exported functions:
1543
1544         pp_add_exported('use_myfunc gethynx');
1545
1546       pp_add_isa
1547
1548       The "pp_add_isa" command works like the "pp_add_exported" function.
1549       The arguments to "pp_add_isa" are added to the @ISA list, e.g.
1550
1551         pp_add_isa(' Some::Other::Class ');
1552
1553       pp_bless
1554
1555       If your pp_def routines are to be used as object methods use "pp_bless"
1556       to specify the package (i.e. class) to which your pp_defed methods will
1557       be added. For example, "pp_bless('PDL::MyClass')". The default is "PDL"
1558       if this is omitted.
1559
1560       pp_addxs
1561
1562       Sometimes you want to add extra XS code of your own (that is generally
1563       not involved with any threading/indexing issues but supplies some other
1564       functionality you want to access from the Perl side) to the generated
1565       XS file, for example
1566
1567         pp_addxs('','
1568
1569         # Determine endianness of machine
1570
1571         int
1572         isbigendian()
1573            CODE:
1574              unsigned short i;
1575              PDL_Byte *b;
1576
1577              i = 42; b = (PDL_Byte*) (void*) &i;
1578
1579              if (*b == 42)
1580                 RETVAL = 0;
1581              else if (*(b+1) == 42)
1582                 RETVAL = 1;
1583              else
1584                 croak("Impossible - machine is neither big nor little endian!!\n");
1585              OUTPUT:
1586                RETVAL
1587         ');
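The byte-order test in the XS snippet above can be tried out as ordinary C before it is wrapped in XS. This is a minimal sketch under stated assumptions: "is_big_endian" is a name chosen for illustration, and it returns -1 where the XS version calls "croak". It checks the last byte of the "unsigned short" rather than assuming the type is exactly two bytes wide.

```c
#include <assert.h>

/*
 * Store 42 in an unsigned short and look at where that byte lands.
 * Returns 0 on a little-endian machine, 1 on a big-endian machine,
 * and -1 if neither the first nor the last byte holds the value.
 */
static int is_big_endian(void)
{
    unsigned short i = 42;
    const unsigned char *b = (const unsigned char *)&i;

    if (b[0] == 42)
        return 0;
    if (b[sizeof i - 1] == 42)
        return 1;
    return -1;
}
```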
1588
1589       "pp_add_exported" and "pp_addxs" in particular should be used with care.
1590       PP uses PDL::Exporter, hence letting PP export your functions means that
1591       they get added to the standard list of functions exported by default
1592       (the list defined by the export tag ":Func"). If you use "pp_addxs"
1593       you shouldn't try to do anything that involves threading or indexing
1594       directly. PP is much better at generating the appropriate code from
1595       your definitions.
1596
1597       pp_add_boot
1598
1599       Finally, you may want to add some code to the BOOT section of the XS
1600       file (if you don't know what that is check perlxs). This is easily done
1601       with the "pp_add_boot" command:
1602
1603         pp_add_boot(<<EOB);
1604               descrip = mylib_initialize(KEEP_OPEN);
1605
1606               if (descrip == NULL)
1607                  croak("Can't initialize library");
1608
1609               GlobalStruc->descrip = descrip;
1610               GlobalStruc->maxfiles = 200;
1611         EOB
1612
1613       pp_export_nothing
1614
1615       By default, PP.pm puts all subs defined using the pp_def function into
1616       the output .pm file's EXPORT list. This can create problems if you are
1617       creating a subclassed object where you don't want any methods exported
1618       (i.e. the methods will only be called using the $object->method
1619       syntax).
1620
1621       For these cases you can call pp_export_nothing() to clear out the
1622       export list. Example (At the end of the .pd file):
1623
1624         pp_export_nothing();
1625         pp_done();
1626
1627       pp_core_importList
1628
1629       By default, PP.pm puts the 'use Core;' line into the output .pm file.
1630       This imports Core's exported names into the current namespace, which
1631       can create problems if you are overriding one of Core's methods in the
1632       current file.  You end up getting messages like "Warning: sub sumover
1633       redefined in file subclass.pm" when running the program.
1634
1635       For these cases the pp_core_importList can be used to change what is
1636       imported from Core.pm.  For example:
1637
1638         pp_core_importList('()')
1639
1640       This would result in
1641
1642         use Core();
1643
1644       being generated in the output .pm file. This would result in no names
1645       being imported from Core.pm. Similarly, calling
1646
1647         pp_core_importList(' qw/ barf /')
1648
1649       would result in
1650
1651         use Core qw/ barf /;
1652
1653       being generated in the output .pm file. This would result in just
1654       'barf' being imported from Core.pm.
1655
1656       pp_setversion
1657
1658       Simultaneously set the .pm and .xs files' versions, thus avoiding
1659       unnecessary version-skew between the two. To use this, simply do this
1660       in your .pd file, probably near the top:
1661
1662        our $VERSION = '0.0.3';
1663        pp_setversion($VERSION);
1664
1665        # Then, in your Makefile.PL:
1666        my @package = qw(FFTW3.pd FFTW3 PDL::FFTW3);
1667        my %descriptor = pdlpp_stdargs(\@package);
1668        $descriptor{VERSION_FROM} = 'FFTW3.pd'; # EUMM can parse the format above
1669
1670       However, don't use this if you use Module::Build::PDL. See that
1671       module's documentation for details.
1672
1673       pp_deprecate_module
1674
1675       If a particular module is deemed obsolete, this function can be used to
1676       mark it as deprecated. This has the effect of emitting a warning when a
1677       user tries to "use" the module. The generated POD for this module also
1678       carries a deprecation notice. The replacement module can be passed as
1679       an argument like this:
1680
1681        pp_deprecate_module( infavor => "PDL::NewNonDeprecatedModule" );
1682
1683       Note that this function affects only the runtime warning and the POD.
1684

Making your PP function "private"

1686       Let's say that you have a function in your module called PDL::foo that
1687       uses the PP function "bar_pp" to do the heavy lifting. But you don't
1688       want to advertise that "bar_pp" exists. To do this, you must move your
1689       PP function to the top of your module file, then call
1690
1691        pp_export_nothing()
1692
1693       to clear the "EXPORT" list. To ensure that no documentation (even the
1694       default PP docs) is generated, set
1695
1696        Doc => undef
1697
1698       and to prevent the function from being added to the symbol table, set
1699
1700        PMFunc => ''
1701
1702       in your pp_def declaration (see Image2D.pd for an example). This will
1703       effectively make your PP function "private." However, it is always
1704       accessible via PDL::bar_pp due to Perl's module design. But making it
1705       private will cause the user to go very far out of his or her way to use
1706       it, so he or she shoulders the consequences!
1707

Slice operation

1709       The slice operation section of this manual is provided using dataflow
1710       and lazy evaluation: when you need it, ask Tjl to write it.  A delivery
1711       in a week from when I receive the email is 95% probable and two-week
1712       delivery is 99% probable.
1713
1714       And anyway, the slice operations require a much more intimate knowledge
1715       of PDL internals than the data operations. Furthermore, the complexity
1716       of the issues involved is considerably higher than that in the average
1717       data operation. If you would like to convince yourself of this fact
1718       take a look at the Basic/Slices/slices.pd file in the PDL distribution
1719       :-). Nevertheless, functions generated using the slice operations are
1720       at the heart of the index manipulation and dataflow capabilities of
1721       PDL.
1722
1723       Also, there are a lot of dirty issues with virtual ndarrays and
1724       vaffines which we shall entirely skip here.
1725
1726   Slices and bad values
1727       Slice operations need to be able to handle bad values.  The easiest
1728       thing to do is look at Basic/Slices/slices.pd to see how this works.
1729
1730       Along with "BadCode", there are also the "BadBackCode" and
1731       "BadRedoDimsCode" keys for "pp_def". However, any "EquivCPOffsCode"
1732       should not need changing, since any changes are absorbed into the
1733       definition of the "$EQUIVCPOFFS()" macro (i.e. it is handled
1734       automatically by PDL::PP).
1735
1736   A few notes on writing a slicing routine...
1737       The following few paragraphs describe the writing of a new slicing routine
1738       ('range'); any errors are CED's. (--CED 26-Aug-2002)
1739

Handling of "warn" and "barf" in PP Code

1741       For printing warning messages or aborting/dying, you can call "warn"
1742       or "barf" from PP code.  However, you should be aware that these calls
1743       have been redefined using C preprocessor macros to "PDL->warn" and
1744       "PDL->barf". These redefinitions are in place to keep you from
1745       inadvertently calling Perl's "warn" or "barf" directly, which can cause
1746       segfaults during pthreading (i.e. processor multi-threading).
1747
1748       PDL's own versions of "barf" and "warn" will queue-up warning or barf
1749       messages until after pthreading is completed, and then call the perl
1750       versions of these routines.
1751
1752       See PDL::ParallelCPU for more information on pthreading.
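The queue-then-emit behaviour described above can be pictured with a small plain C sketch. This is an illustration of the general pattern only, not PDL's implementation: the names "deferred_warn" and "flush_warnings" and the fixed-size buffer are assumptions, and the sketch is single-threaded, so a real version would have to guard the queue with a mutex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_MSGS 8   /* illustrative capacity, chosen arbitrarily */
#define MSG_LEN  64  /* illustrative maximum message length */

static char queued[MAX_MSGS][MSG_LEN];
static int  n_queued = 0;

/* Record a warning instead of printing it. Returns 0, or -1 if the queue is full. */
static int deferred_warn(const char *msg)
{
    if (n_queued >= MAX_MSGS)
        return -1;
    strncpy(queued[n_queued], msg, MSG_LEN - 1);
    queued[n_queued][MSG_LEN - 1] = '\0';  /* strncpy may not terminate */
    n_queued++;
    return 0;
}

/* Emit every queued message once the parallel section is over. Returns the count emitted. */
static int flush_warnings(void)
{
    int n = n_queued;
    for (int i = 0; i < n; i++)
        fprintf(stderr, "%s\n", queued[i]);
    n_queued = 0;
    return n;
}
```

Worker code would call "deferred_warn" while the threads are running, and the controlling code would call "flush_warnings" after they have all finished.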
1753

USEFUL ROUTINES

1755       The PDL "Core" structure, defined in Basic/Core/pdlcore.h.PL, contains
1756       pointers to a number of routines that may be useful to you.  The
1757       majority of these routines deal with manipulating ndarrays, but some
1758       are more general:
1759
1760       PDL->qsort_B( PDL_Byte *xx, PDL_Indx a, PDL_Indx b )
1761           Sort the array "xx" between the indices "a" and "b".  There are
1762           also versions for the other PDL datatypes, with postfix "_S", "_U",
1763           "_L", "_N", "_Q", "_F", and "_D".  Any module using this must
1764           ensure that "PDL::Ufunc" is loaded.
1765
1766       PDL->qsort_ind_B( PDL_Byte *xx, PDL_Indx *ix, PDL_Indx a, PDL_Indx b )
1767           As for "PDL->qsort_B", but this time sorting the indices rather
1768           than the data.
1769
1770       The routine "med2d" in Lib/Image2D/image2d.pd shows how such routines
1771       are used.
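To make the difference between sorting data and sorting indices concrete, here is a plain C sketch of an index sort over the inclusive range a..b. It is not PDL's code: it uses a simple insertion sort instead of quicksort, "ptrdiff_t" stands in for PDL_Indx, "unsigned char" stands in for PDL_Byte, and "sort_ind_bytes" and "demo_kth_index" are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Reorder ix[a..b] (inclusive) so that xx[ix[a]] <= ... <= xx[ix[b]].
 * The data array xx is never modified; only the indices move.
 */
static void sort_ind_bytes(const unsigned char *xx, ptrdiff_t *ix,
                           ptrdiff_t a, ptrdiff_t b)
{
    for (ptrdiff_t i = a + 1; i <= b; i++) {
        ptrdiff_t key = ix[i];
        ptrdiff_t j = i - 1;
        /* Shift indices of larger elements one slot to the right. */
        while (j >= a && xx[ix[j]] > xx[key]) {
            ix[j + 1] = ix[j];
            j--;
        }
        ix[j + 1] = key;
    }
}

/* Demo helper: position in the data of the k-th smallest element of a fixed sample. */
static ptrdiff_t demo_kth_index(ptrdiff_t k)
{
    const unsigned char data[5] = { 30, 10, 50, 20, 40 };
    ptrdiff_t ix[5] = { 0, 1, 2, 3, 4 };
    sort_ind_bytes(data, ix, 0, 4);
    return ix[k];
}
```

After the call the index array reads 1, 3, 0, 4, 2, so the smallest value (10) is found at position 1 of the untouched data.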
1772

MAKEFILES FOR PP FILES

1774       If you are going to generate a package from your PP file (typical file
1775       extensions are ".pd" or ".pp" for the files containing PP code) it is
1776       easiest and safest to leave generation of the appropriate commands to
1777       the Makefile. In the following we will outline the typical format of a
1778       Perl Makefile to automatically build and install your package from a
1779       description in a PP file. Most of the rules to build the xs, pm and
1780       other required files from the PP file are already predefined in the
1781       PDL::Core::Dev package. We just have to tell MakeMaker to use it.
1782
1783       In most cases you can define your Makefile like
1784
1785         # Makefile.PL for a package defined by PP code.
1786
1787         use PDL::Core::Dev;            # Pick up development utilities
1788         use ExtUtils::MakeMaker;
1789
1790         $package = ["mylib.pd", "Mylib", "PDL::Lib::Mylib"];
1791         %hash = pdlpp_stdargs($package);
1792         $hash{OBJECT} .= ' additional_Ccode$(OBJ_EXT) ';
1793         $hash{clean}->{FILES} .= ' todelete_Ccode$(OBJ_EXT) ';
1794         WriteMakefile(%hash);
1795
1796         sub MY::postamble { pdlpp_postamble($package); }
1797
1798       Here, the list in $package contains, in order: the PP source file name,
1799       the prefix for the produced files, and the full package name.  You
1800       can modify the hash in whatever way you like but it would be reasonable
1801       to stay within some limits so that your package will continue to work
1802       with later versions of PDL.
1803
1804       If you don't want to use prepackaged arguments, here is a generic
1805       Makefile.PL that you can adapt for your own needs:
1806
1807         # Makefile.PL for a package defined by PP code.
1808
1809         use PDL::Core::Dev;            # Pick up development utilities
1810         use ExtUtils::MakeMaker;
1811
1812         WriteMakefile(
1813          'NAME'       => 'PDL::Lib::Mylib',
1814          'VERSION_FROM'       => 'mylib.pd',
1815          'TYPEMAPS'     => [&PDL_TYPEMAP()],
1816          'OBJECT'       => 'mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)',
1817          'PM'         => { 'Mylib.pm'            => '$(INST_LIBDIR)/Mylib.pm'},
1818          'INC'          => &PDL_INCLUDE(), # add include dirs as required by your lib
1819          'LIBS'         => [''],   # add link directives as necessary
1820          'clean'        => {'FILES'  =>
1821                                 'Mylib.pm Mylib.xs Mylib$(OBJ_EXT)
1822                                 additional_Ccode$(OBJ_EXT)'},
1823         );
1824
1825         # Add genpp rule; this will invoke PDL::PP on our PP file
1826         # the argument is an array reference where the array has three string elements:
1827         #   arg1: name of the source file that contains the PP code
1828         #   arg2: basename of the xs and pm files to be generated
1829         #   arg3: name of the package that is to be generated
1830         sub MY::postamble { pdlpp_postamble(["mylib.pd", "Mylib", "PDL::Lib::Mylib"]); }
1831
1832       To make life even easier PDL::Core::Dev defines the function
1833       "pdlpp_stdargs" that returns a hash with default values that can be
1834       passed (either directly or after appropriate modification) to a call to
1835       WriteMakefile.  Currently, "pdlpp_stdargs" returns a hash where the
1836       keys are filled in as follows:
1837
1838               (
1839                'NAME'         => $mod,
1840                'TYPEMAPS'     => [&PDL_TYPEMAP()],
1841                'OBJECT'       => "$pref\$(OBJ_EXT)",
1842                PM     => {"$pref.pm" => "\$(INST_LIBDIR)/$pref.pm"},
1843                MAN3PODS => {"$src" => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)"},
1844                'INC'          => &PDL_INCLUDE(),
1845                'LIBS'         => [''],
1846                'clean'        => {'FILES'  => "$pref.xs $pref.pm $pref\$(OBJ_EXT)"},
1847               )
1848
1849       Here, $src is the name of the source file with PP code, $pref the
1850       prefix for the generated .pm and .xs files and $mod the name of the
1851       extension module to generate.
1852

INTERNALS

1854       The internals of the current version consist of a large table which
1855       gives the rules according to which things are translated and the subs
1856       which implement these rules.
1857
1858       Later on, it would be good to make the table modifiable by the user so
1859       that different things may be tried.
1860
1861       [Meta comment: here will hopefully be more in the future; currently,
1862       your best bet will be to read the source code :-( or ask on the list
1863       (try the latter first) ]
1864

Appendix A: Some keys recognised by PDL::PP

1866       Unless otherwise specified, the arguments are strings.
1867
1868       Pars
1869           define the signature of your function
1870
1871       OtherPars
1872           arguments which are not pdls. Default: nothing. This is a semi-
1873           colon separated list of arguments, e.g., "OtherPars=>'int k; double
1874           value; char* fd'". See $COMP(x) and also the same entry in Appendix
1875           B.
1876
1877       Code
1878           the actual code that implements the functionality; several PP
1879           macros and PP functions are recognised in the string value
1880
1881       HandleBad
1882           If set to 1, the routine is assumed to support bad values and the
1883           code in the BadCode key is used if bad values are present; it also
1884           sets things up so that the "$ISBAD()" etc macros can be used.  If
1885           set to 0, causes the routine to print a warning if any of the input
1886           ndarrays have their bad flag set.
1887
1888       BadCode
1889           Give the code to be used if bad values may be present in the input
1890           ndarrays.  Only used if "HandleBad => 1".
1891
1892       GenericTypes
1893           An array reference. The array may contain any subset of the one-
1894           character strings given below, which specify which types your
1895           operation will accept. The meaning of each type is:
1896
1897            B - unsigned byte (i.e. unsigned char)
1898            S - signed short (two-byte integer)
1899            U - unsigned short
1900            L - signed long (four-byte integer, int on 32 bit systems)
1901            N - signed integer for indexing ndarray elements (platform & Perl-dependent size)
1902            Q - signed long long (eight byte integer)
1903            F - float
1904            D - double
1905            G - complex float
1906            C - complex double
1907
1908           This is very useful (and important!) when interfacing an external
1909           library.  Default: [qw/B S U L N Q F D/]
1910
1911       Inplace
1912           Mark a function as being able to work inplace.
1913
1914            Inplace => 1          if  Pars => 'a(); [o]b();'
1915            Inplace => ['a']      if  Pars => 'a(); b(); [o]c();'
1916            Inplace => ['a','b']  if  Pars => 'a(); b(); [o]c(); [o]d();'
1917
1918           If bad values are being used, care must be taken to ensure the
1919           propagation of the badflag when inplace is being used; for instance
1920           see the code for "replacebad" in Basic/Bad/bad.pd.
1921
1922       Doc Used to specify a documentation string in Pod format. See PDL::Doc
1923           for information on PDL documentation conventions. Note: in the
1924           special case where the PP 'Doc' string is one line this is
1925           implicitly used for the quick reference AND the documentation!
1926
1927           If the Doc field is omitted PP will generate default documentation
1928           (after all it knows about the Signature).
1929
1930           If you really want the function NOT to be documented in any way at
1931           this point (e.g. for an internal routine, or because you are doing
1932           it elsewhere in the code) explicitly specify "Doc=>undef".
1933
1934       BadDoc
1935           Contains the text returned by the "badinfo" command (in "perldl")
1936           or the "-b" switch to the "pdldoc" shell script. In many cases, you
1937           will not need to specify this, since the information can be
1938           automatically created by PDL::PP. However, as befits computer-
1939           generated text, it's rather stilted; it may be much better to do it
1940           yourself!
1941
1942       NoPthread
1943           Optional flag to indicate the PDL function should not use processor
1944           threads (i.e.  pthreads or POSIX threads) to split up work across
1945           multiple CPU cores. This option is typically set to 1 if the
1946           underlying PDL function is not threadsafe. If this option isn't
1947           present, then the function is assumed to be threadsafe. This option
1948           only applies if PDL has been compiled with POSIX threads enabled.
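
               A sketch of when this flag might be set, assuming a C helper that
               keeps state in a static variable and is therefore unsafe to call
               from several threads at once. The helper "add_to_total" and the
               function name "running_sum_demo" are hypothetical.

```perl
# Hypothetical .pd fragment: the C helper below is not threadsafe because
# every call updates the same static variable.
pp_addhdr(<<'EOC');
static double running_total = 0;
static double add_to_total(double x) { running_total += x; return running_total; }
EOC

pp_def('running_sum_demo',
  Pars         => 'a(); [o]b();',
  GenericTypes => ['D'],         # the helper only handles doubles
  NoPthread    => 1,             # never split this function across pthreads
  Code         => '$b() = add_to_total($a());',
);
```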
1949
1950       PMCode
1951             pp_def('funcname',
1952               Pars => 'a(); [o] b();',
1953               PMCode => 'sub PDL::funcname {
1954                 return PDL::_funcname_int(@_) if @_ == 2; # output arg "b" supplied
1955                 PDL::_funcname_int(@_, my $out = PDL->null);
1956                 $out;
1957               }',
1958               # ...
1959             );
1960
1961           PDL functions allow "[o]" ndarray arguments into which you want the
1962           output saved. This is handy because you can allocate an output
1963           ndarray once and reuse it many times; the alternative would be for
1964           PDL to create a new ndarray each time, which may waste compute
1965           cycles or, more likely, RAM.
1966
1967           PDL functions check the number of arguments they are given, and
1968           call "croak" if given the wrong number. By default (with no
1969           "PMCode" supplied), any output arguments may be omitted, and
1970           PDL::PP provides code that can handle this by creating "null"
1971           objects, passing them to your code, then returning them on the
1972           stack.
1973
1974           If you do supply "PMCode", the rest of PDL::PP assumes it will be a
1975           string that defines a Perl function with the function's name in the
1976           "pp_bless" package ("PDL" by default). As the example implies, the
1977           PP-generated function name will change from "<funcname>" to
1978           "_<funcname>_int". As also shown above, you will need to supply all
1979           ndarrays in the exact order specified in the signature: output
1980           ndarrays are not optional, and the PP-generated function will not
1981           return anything.
1982
1983       PMFunc
1984           When pp_def generates functions, it typically defines them in the
1985           PDL package. Then, in the .pm file that it generates for your
1986           module, it typically adds a line that essentially copies that
1987           function into your current package's symbol table with code that
1988           looks like this:
1989
1990            *func_name = \&PDL::func_name;
1991
1992           It's a little bit smarter than that (it knows when to wrap that
1993           sort of thing in a BEGIN block, for example, and if you specified
1994           something different for pp_bless), but that's the gist of it. If
1995           you don't care to import the function into your current package's
1996           symbol table, you can specify
1997
1998            PMFunc => '',
1999
2000           PMFunc has no other side-effects, so you could use it to insert
2001           arbitrary Perl code into your module if you like. However, you
2002           should use pp_addpm if you want to add Perl code to your module.
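
               The glob assignment shown above is ordinary Perl and can be tried
               without PDL. In this sketch "PDL::Demo" stands in for the package
               holding a PP-generated function and "My::Module" for your own
               package; both names are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for a PP-generated function, so the sketch runs without PDL.
package PDL::Demo;
sub func_name { return "func_name(@_)" }

# What the generated .pm effectively does: alias the sub into the
# current package's symbol table with a glob assignment.
package My::Module;
*func_name = \&PDL::Demo::func_name;

package main;
# Both names now reach the same code.
print My::Module::func_name(1, 2), "\n";
```

               Because the alias is a reference to the same sub, not a copy,
               nothing is duplicated in memory.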
2003

Appendix B: PP macros and functions

2005   Macros
2006       $variablename_from_sig()
2007              access a pdl (by its name) that was specified in the signature
2008
2009       $COMP(x)
2010              access a value in the private data structure of this
2011              transformation (mainly used to use an argument that is specified
2012              in the "OtherPars" section)
2013
2014       $SIZE(n)
2015              replaced at runtime by the actual size of a named dimension (as
2016              specified in the signature)
2017
2018       $GENERIC()
2019              replaced by the C type that is equal to the runtime type of the
2020              operation
2021
2022       $P(a)  a pointer to the data of the PDL named "a" in the signature.
2023              Useful for interfacing to C functions
2024
2025       $PP(a) a physical pointer access to pdl "a"; mainly for internal use
2026
2027       $TXXX(Alternative,Alternative)
2028              expansion alternatives according to runtime type of operation,
2029              where XXX is some string that is matched by "/[BSULNQFD+]/".
2030
2031       $PDL(a)
2032              return a pointer to the pdl data structure (pdl *) of ndarray
2033              "a"
2034
2035       $ISBAD(a())
2036              returns true if the value stored in "a()" equals the bad value
2037              for this ndarray.  Requires "HandleBad" being set to 1.
2038
2039       $ISGOOD(a())
2040              returns true if the value stored in "a()" does not equal the bad
2041              value for this ndarray.  Requires "HandleBad" being set to 1.
2042
2043       $SETBAD(a())
2044              Sets "a()" to equal the bad value for this ndarray.  Requires
2045              "HandleBad" being set to 1.
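
           Several of the macros above can be seen together in one sketch. The
           function name "mean_demo" is hypothetical, and the example assumes
           the "BadCode" key of pp_def, which supplies the variant of "Code"
           used when bad values are present.

```perl
# Hypothetical .pd fragment: mean over dimension n, with a bad-value variant.
pp_def('mean_demo',
  Pars         => 'a(n); [o]b();',
  GenericTypes => [qw(F D)],
  HandleBad    => 1,
  Code         => q{
    $GENERIC() sum = 0;             /* C type matching the runtime datatype */
    loop(n) %{ sum += $a(); %}
    $b() = sum / $SIZE(n);          /* $SIZE(n) is the length of dimension n */
  },
  BadCode      => q{
    $GENERIC() sum = 0;
    PDL_Indx ngood = 0;
    loop(n) %{
      if ( $ISGOOD(a()) ) { sum += $a(); ngood++; }
    %}
    if (ngood) { $b() = sum / ngood; }
    else       { $SETBAD(b()); }    /* no good values: result is bad */
  },
);
```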
2046
2047   Functions
2048       "loop(DIMS) %{ ... %}"
2049          loop over named dimensions; limits are generated automatically by PP
2050
2051       "threadloop %{ ... %}"
2052          enclose following code in a thread loop
2053
2054       "types(TYPES) %{ ... %}"
2055          execute following code if type of operation is any of "TYPES"
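
       The three constructs above can be nested. The following sketch uses a
       hypothetical function "abs_demo" and picks the C library call that
       matches the runtime type. PP normally supplies the thread loop for you;
       writing it explicitly is only needed when some code must sit outside it.

```perl
# Hypothetical .pd fragment combining threadloop, loop and types.
pp_addhdr('#include <math.h>');

pp_def('abs_demo',
  Pars         => 'a(n); [o]b(n);',
  GenericTypes => [qw(F D)],
  Code         => q{
    threadloop %{
      loop(n) %{
        types(F) %{ $b() = fabsf($a()); %}   /* float variant  */
        types(D) %{ $b() = fabs($a());  %}   /* double variant */
      %}
    %}
  },
);
```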
2056

Appendix C: Functions imported by PDL::PP

2058       A number of functions are imported when you "use PDL::PP". These
2059       include functions that control the generated C or XS code, functions
2060       that control the generated Perl code, and functions that manipulate the
2061       packages and symbol tables into which the code is created.
2062
2063   Generating C and XS Code
2064       PDL::PP's main purpose is to make it easy for you to wrap the threading
2065       engine around your own C code, but you can do some other things, too.
2066
2067       pp_def
2068           Used to wrap the threading engine around your C code. Virtually all
2069           of this document discusses the use of pp_def.
2070
2071       pp_done
2072           Indicates you are done with PDL::PP and that it should generate its
2073           .xs and .pm files based upon the other pp_* functions that you have
2074           called.  This function takes no arguments.
2075
2076       pp_addxs
2077           This lets you add XS code to your .xs file. This is useful if you
2078           want to create Perl-accessible functions that invoke C code but
2079           cannot or should not invoke the threading engine. XS is the
2080           standard means by which you wrap Perl-accessible C code. You can
2081           learn more at perlxs.
2082
2083       pp_add_boot
2084           This function adds whatever string you pass to the XS BOOT section.
2085           The BOOT section is C code that gets called by Perl when your
2086           module is loaded and is useful for automatic initialization. You
2087           can learn more about XS and the BOOT section at perlxs.
2088
2089       pp_addhdr
2090           Adds pure-C code to your XS file. XS files are structured such that
2091           pure C code must come before XS specifications. This allows you to
2092           specify such C code.
2093
2094       pp_boundscheck
2095           PDL normally checks the bounds of your accesses before making them.
2096           You can turn that on or off at runtime by calling
2097           MyPackage::set_boundscheck. This function allows you to remove that
2098           runtime flexibility and never do bounds checking. It also returns
2099           the current bounds-checking status if called without any arguments.
2100
2101           NOTE: I have not found anything about bounds checking in other
2102           documentation.  That needs to be addressed.
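
           The code-generation functions in this subsection are usually
           combined in one .pd file. The sketch below is a hypothetical
           fragment: every C and XS name in it ("demo_ready", "demo_scale",
           "demo_is_ready", "scale_demo") is illustrative, and the package in
           which the XS function becomes visible depends on how your module
           is set up.

```perl
# Hypothetical .pd fragment; all C and XS names are illustrative.

# Plain C that must precede the XS declarations.
pp_addhdr(<<'EOH');
static int demo_ready = 0;                       /* set once at load time */
static double demo_scale(double x) { return 2.0 * x; }
EOH

# C code run once when the module is loaded.
pp_add_boot(<<'EOB');
demo_ready = 1;
EOB

# A hand-written XS function that bypasses the threading engine.
pp_addxs(<<'EOXS');

int
demo_is_ready()
  CODE:
    RETVAL = demo_ready;
  OUTPUT:
    RETVAL

EOXS

# A threaded function that calls the C helper declared above.
pp_def('scale_demo',
  Pars => 'a(); [o]b();',
  Code => '$b() = demo_scale($a());',
);

pp_done();
```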
2103
2104   Generating Perl Code
2105       Many functions imported when you use PDL::PP allow you to modify the
2106       contents of the generated .pm file. Apart from pp_def and pp_done,
2107       the role of these functions is primarily to add code to various parts
2108       of your generated .pm file.
2109
2110       pp_addpm
2111           Adds Perl code to the generated .pm file. PDL::PP actually keeps
2112           track of three different sections of generated code: the Top, the
2113           Middle, and the Bottom. You can add Perl code to the Middle section
2114           using the one-argument form, where the argument is the Perl code
2115           you want to supply. In the two-argument form, the first argument is
2116           an anonymous hash with only one key that specifies where to put the
2117           second argument, which is the string that you want to add to the
2118           .pm file. The hash is one of these three:
2119
2120            {At => 'Top'}
2121            {At => 'Middle'}
2122            {At => 'Bot'}
2123
2124           For example:
2125
2126            pp_addpm({At => 'Bot'}, <<POD);
2127
2128            =head1 Some documentation
2129
2130            I know I'm typing this in the middle of my file, but it'll go at
2131            the bottom.
2132
2133            =cut
2134
2135            POD
2136
2137           Warning: If, in the middle of your .pd file, you put documentation
2138           meant for the bottom of your pod, you will thoroughly confuse CPAN.
2139           On the other hand, if in the middle of your .pd file, you add some
2140           Perl code destined for the bottom or top of your .pm file, you only
2141           have yourself to confuse. :-)
2142
2143       pp_beginwrap
2144           Adds BEGIN-block wrapping. Certain declarations can be wrapped in
2145           BEGIN blocks, though the default behavior is to have no such
2146           wrapping.
2147
2148       pp_addbegin
2149           Sets code to be added to the top of your .pm file, even above code
2150           that you specify with "pp_addpm({At => 'Top'}, ...)". Unlike
2151           pp_addpm, calling this overwrites whatever was there before.
2152           Generally, you probably shouldn't use it.
2153
2154   Tracking Line Numbers
2155       When you get compile errors, either from your C-like code or your Perl
2156       code, it can help to map those errors back to the line numbers in the
2157       source file at which the error occurred.
2158
2159       pp_line_numbers
2160           Takes a line number and a (usually long) string of code. The line
2161           number should indicate the line at which the quote begins. This is
2162           usually Perl's "__LINE__" literal, unless you are using heredocs,
2163           in which case it is "__LINE__ + 1". The returned string has #line
2164           directives interspersed to help the compiler report errors on the
2165           proper line.
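
           A sketch of both cases described above, using hypothetical function
           names. In the first call the quoted string starts on the same line
           as "__LINE__"; in the second the heredoc text starts one line later.

```perl
# Hypothetical .pd fragment showing the two ways to pass the line number.
pp_def('add_one_demo',
  Pars => 'a(); [o]b();',
  # The q{...} string begins on this line, so __LINE__ is the right value.
  Code => pp_line_numbers(__LINE__, q{
    $b() = $a() + 1;
  }),
);

pp_def('add_two_demo',
  Pars => 'a(); [o]b();',
  # Heredoc text begins on the line after the opener, hence __LINE__ + 1.
  Code => pp_line_numbers(__LINE__ + 1, <<'EOC'),
    $b() = $a() + 2;
EOC
);
```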
2166
2167   Modifying the Symbol Table and Export Behavior
2168       PDL::PP usually exports all functions generated using pp_def, and
2169       usually installs them into the PDL symbol table. However, you can
2170       modify this behavior with these functions.
2171
2172       pp_bless
2173           Sets the package (symbol table) to which the XS code is added. The
2174           default is PDL, which is generally what you want. If you use the
2175           default blessing and you create a function myfunc, then you can do
2176           the following:
2177
2178            $ndarray->myfunc(<args>);
2179            PDL::myfunc($ndarray, <args>);
2180
2181           On the other hand, if you bless your functions into another
2182           package, you cannot invoke them as PDL methods, and must invoke
2183           them as:
2184
2185            MyPackage::myfunc($ndarray, <args>);
2186
2187           Of course, you could always use the PMFunc key to add your function
2188           to the PDL symbol table, but why do that?
2189
2190       pp_add_isa
2191           Adds to the list of modules from which your module inherits. The
2192           default list is
2193
2194            qw(PDL::Exporter DynaLoader)
2195
2196       pp_core_importList
2197           At the top of your generated .pm file is a line that looks like
2198           this:
2199
2200            use PDL::Core;
2201
2202           You can modify that by specifying a string to pp_core_importList.
2203           For example,
2204
2205            pp_core_importList('::Blarg');
2206
2207           will result in
2208
2209            use PDL::Core::Blarg;
2210
2211           You can also use this to add a list of symbols to import from
2212           PDL::Core. For example:
2213
2214            pp_core_importList(" ':Internal'");
2215
2216           will lead to the following use statement:
2217
2218            use PDL::Core ':Internal';
2215
2216           will lead to the following use statement:
2217
2218            use PDL::Core ':Internal';
2219
2220       pp_setversion
2221           Sets your module's version. The version must be consistent between
2222           the .xs and the .pm file, and is used to ensure that your Perl's
2223           libraries do not suffer from version skew.
2224
2225       pp_add_exported
2226           Adds to the export list whatever names you give it.  Functions
2227           created using pp_def are automatically added to the list. This
2228           function is useful if you define any Perl functions using pp_addpm
2229           or pp_addxs that you want exported as well.
2230
2231       pp_export_nothing
2232           This resets the list of exported symbols to nothing. This is
2233           probably better called "pp_export_clear", since you can add
2234           exported symbols after calling "pp_export_nothing". When called
2235           just before calling pp_done, this ensures that your module does not
2236           export anything, for example, if you only want programmers to use
2237           your functions as methods.
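
           A sketch of how the export functions above might be combined.
           The names are hypothetical. The empty first argument to
           pp_add_exported follows the idiom commonly seen in .pd files and
           is an assumption here; check it against your installed PDL::PP.

```perl
# Hypothetical .pd fragment; function names are illustrative.
pp_def('scale_demo',
  Pars => 'a(); [o]b();',
  Code => '$b() = 2 * $a();',
);

# A pure-Perl helper added to the generated .pm file.
pp_addpm(<<'EOPM');
sub helper_demo { return scale_demo(@_) }
EOPM

# Start from an empty export list, then export only the Perl helper.
# scale_demo stays available as a method: $ndarray->scale_demo.
pp_export_nothing();
pp_add_exported('', 'helper_demo');   # assumed call form, see note above

pp_done();
```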
2238

CURRENTLY UNDOCUMENTED

2254       Almost everything having to do with "Slice operation". This includes
2255       much of the following (each entry is followed by a guess/description of
2256       where it is used or defined):
2257
2258       MACROS
2259          $CDIM()
2260
2261          $CHILD()
2262              PDL::PP::Rule::Substitute::Usual
2263
2264          $CHILD_P()
2265              PDL::PP::Rule::Substitute::Usual
2266
2267          $CHILD_PTR()
2268              PDL::PP::Rule::Substitute::Usual
2269
2270          $COPYDIMS()
2271
2272          $COPYINDS()
2273
2274          $CROAK()
2275              PDL::PP::Rule::Substitute::dosubst_private()
2276
2277          $DOCOMPDIMS()
2278              Used in slices.pd, defined where?
2279
2280          $DOPRIVDIMS()
2281              Used in slices.pd, defined where?
2282              Code comes from PDL::PP::CType::get_malloc, which is called by
2283              PDL::PP::CType::get_copy, which is called by
2284              PDL::PP::CopyOtherPars, PDL::PP::NT2Copies__, and
2285              PDL::PP::make_incsize_copy.  But none of those three at first
2286              glance seem to have anything to do with $DOPRIVDIMS.
2287
2288          $EQUIVCPOFFS()
2289
2290          $EQUIVCPTRUNC()
2291
2292          $PARENT()
2293              PDL::PP::Rule::Substitute::Usual
2294
2295          $PARENT_P()
2296              PDL::PP::Rule::Substitute::Usual
2297
2298          $PARENT_PTR()
2299              PDL::PP::Rule::Substitute::Usual
2300
2301          $PDIM()
2302
2303          $PRIV()
2304              PDL::PP::Rule::Substitute::dosubst_private()
2305
2306          $RESIZE()
2307
2308          $SETDELTATHREADIDS()
2309              PDL::PP::Rule::MakeComp
2310
2311          $SETDIMS()
2312              PDL::PP::Rule::MakeComp
2313
2314          $SETNDIMS()
2315              PDL::PP::Rule::MakeComp
2316
2317          $SETREVERSIBLE()
2318              PDL::PP::Rule::Substitute::dosubst_private()
2319
2320       Keys
2321          AffinePriv
2322
2323          BackCode
2324
2325          BadBackCode
2326
2327          CallCopy
2328
2329          Comp (related to $COMP()?)
2330
2331          DefaultFlow
2332
2333          EquivCDimExpr
2334
2335          EquivCPOffsCode
2336
2337          EquivDimCheck
2338
2339          EquivPDimExpr
2340
2341          FTypes (see comment in this POD's source file between NoPthread and
2342          PMCode.)
2343
2344          GlobalNew
2345
2346          Identity
2347
2348          MakeComp
2349
2350          NoPdlThread
2351
2352          P2Child
2353
2354          ParentInds
2355
2356          Priv
2357
2358          ReadDataFuncName
2359
2360          RedoDims (related to RedoDimsCode ?)
2361
2362          Reversible
2363
2364          WriteBckDataFuncName
2365
2366          XCHGOnly
2367

BUGS

2369       Although PDL::PP is quite flexible and heavily used, there are
2370       surely bugs. First amongst them: this documentation needs a thorough
2371       revision.
2372

AUTHOR

2374       Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), Karl
2375       Glazebrook (kgb@aaocbn1.aao.GOV.AU) and Christian Soeller
2376       (c.soeller@auckland.ac.nz). All rights reserved.  Documentation updates
2377       Copyright(C) 2011 David Mertens (dcmertens.perl@gmail.com). This
2378       documentation is licensed under the same terms as Perl itself.
2379
2380
2381
2382perl v5.34.0                      2021-08-16                             PP(1)