1PP(1) User Contributed Perl Documentation PP(1)
2
3
4
6 PDL::PP - Generate PDL routines from concise descriptions
7
9 e.g.
10
11 pp_def(
12 'sumover',
13 Pars => 'a(n); [o]b();',
14 Code => q{
15 double tmp=0;
16 loop(n) %{
17 tmp += $a();
18 %}
19 $b() = tmp;
20 },
21 );
22
23 pp_done();
24
26 Here is a quick reference list of the functions provided by PDL::PP.
27
28 pp_add_boot
29 Add code to the BOOT section of generated XS file
30
31 pp_add_exported
32 Add functions to the list of exported functions
33
34 pp_add_isa
35 Add entries to the @ISA list
36
37 pp_addbegin
38 Sets code to be added at the top of the generated .pm file
39
40 pp_addhdr
41 Add code and includes to C section of the generated XS file
42
43 pp_addpm
44 Add code to the generated .pm file
45
46 pp_addxs
47 Add extra XS code to the generated XS file
48
49 pp_beginwrap
50 Add BEGIN-block wrapping to code for the generated .pm file
51
52 pp_bless
53 Sets the package to which the XS code is added (default is PDL)
54
55 pp_boundscheck
56 Control state of PDL bounds checking activity
57
58 pp_core_importList
59 Specify what is imported from PDL::Core
60
61 pp_def
62 Define a new PDL function
63
64 pp_deprecate_module
65 Add runtime and POD warnings about a module being deprecated
66
67 pp_done
68 Mark the end of PDL::PP definitions in the file
69
70 pp_export_nothing
71 Clear out the export list for your generated module
72
73 pp_line_numbers
74 Add line number information to simplify debugging of PDL::PP code
75
77 For an alternate introduction to PDL::PP, see Practical Magick with C,
78 PDL, and PDL::PP -- a guide to compiled add-ons for PDL
79 <https://arxiv.org/abs/1702.07753>.
80
81 Why do we need PP? Several reasons: firstly, we want to be able to
82 generate subroutine code for each of the PDL datatypes (PDL_Byte,
83 PDL_Short, etc). AUTOMATICALLY. Secondly, when referring to slices of
84 PDL arrays in Perl (e.g. "$x->slice('0:10:2,:')" or other things such
85 as transposes) it is nice to be able to do this transparently and to be
86 able to do this 'in-place' - i.e., not to have to make a memory copy of
87 the section. PP handles all the necessary element and offset arithmetic
88 for you. There are also the notions of threading (repeated calling of
89 the same routine for multiple slices, see PDL::Indexing) and dataflow
90 (see PDL::Dataflow), both of which the use of PP makes possible.
91
92 In much of what follows we will assume familiarity of the reader with
93 the concepts of implicit and explicit threading and index manipulations
94 within PDL. If you have not yet heard of these concepts or are not very
95 comfortable with them it is time to check PDL::Indexing.
96
97 As you may appreciate from its name PDL::PP is a Pre-Processor, i.e.
98 it expands code via substitutions to make real C-code. Technically, the
99 output is XS code (see perlxs) but that is very close to C.
100
101 So how do you use PP? Well for the most part you just write ordinary C
102 code except for special PP constructs which take the form:
103
104 $something(something else)
105
106 or:
107
108 PPfunction %{
109 <stuff>
110 %}
111
112 The most important PP construct is the form "$array()". Consider the
113 very simple PP function to sum the elements of a 1D vector (in fact
114 this is very similar to the actual code used by 'sumover'):
115
116 pp_def('sumit',
117 Pars => 'a(n); [o]b();',
118 Code => q{
119 double tmp;
120 tmp = 0;
121 loop(n) %{
122 tmp += $a();
123 %}
124 $b() = tmp;
125 }
126 );
127
128 What's going on? The "Pars =>" line is very important for PP - it
129 specifies all the arguments and their dimensionality. We call this the
130 signature of the PP function (compare also the explanations in
131 PDL::Indexing). In this case the routine takes a 1-D array as input
132 and returns a 0-D scalar as output. The "$a()" PP construct is used to
133 access elements of the array a(n) for you - PP fills in all the
134 required C code.
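
As a rough mental model (an illustrative plain-C sketch, not the code PP actually emits), the double-typed copy of this Code section amounts to a strided loop. The names "sumit_double", "n_size" and "inc_n" are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the double-typed copy of sumit's Code.
 * a      : pointer to the first element of the vector
 * n_size : length of dimension n (what $SIZE(n) refers to)
 * inc_n  : step between elements (1 for contiguous data, >1 for slices) */
double sumit_double(const double *a, ptrdiff_t n_size, ptrdiff_t inc_n)
{
    double tmp = 0;
    for (ptrdiff_t n = 0; n < n_size; n++) {   /* loop(n) %{ ... %} */
        tmp += a[n * inc_n];                   /* $a()              */
    }
    return tmp;                                /* $b() = tmp;       */
}
```

Calling it with a step of 2 sums every other element, which is how a slice can be processed without copying it first.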
135
136 You will notice that we are using the "q{}" single-quote operator. This
137 is not an accident. You generally want to use single quotes to denote
138 your PP Code sections. PDL::PP uses "$var()" for its parsing and if you
139 don't use single quotes, Perl will try to interpolate "$var()". Also,
140 using the single quote "q" operator with curly braces makes it look
141 like you are creating a code block, which is What You Mean. (Perl is
142 smart enough to look for nested curly braces and not close the quote
143 until it finds the matching curly brace, so it's safe to have nested
144 blocks.) Under other circumstances, such as when you're stitching
145 together a Code block using string concatenations, it's often easiest
146 to use real single quotes as
147
148 Code => 'something'.$interpolatable.'somethingelse;'
149
150 In the simple case here where all elements are accessed the PP
151 construct "loop(n) %{ ... %}" is used to loop over all elements in
152 dimension "n". Note this feature of PP: ALL DIMENSIONS ARE SPECIFIED
153 BY NAME.
154
155 This is made clearer if we avoid the PP loop() construct and write the
156 loop explicitly using conventional C:
157
158 pp_def('sumit',
159 Pars => 'a(n); [o]b();',
160 Code => q{
161 PDL_Indx i,n_size;
162 double tmp;
163 n_size = $SIZE(n);
164 tmp = 0;
165 for(i=0; i<n_size; i++) {
166 tmp += $a(n=>i);
167 }
168 $b() = tmp;
169 },
170 );
171
172 which does the same as before, but is more long-winded. You can see to
173 get element "i" of a() we say "$a(n=>i)" - we are specifying the
174 dimension by name "n". In 2D we might say:
175
176 Pars=>'a(m,n);',
177 ...
178 tmp += $a(m=>i,n=>j);
179 ...
180
181 The syntax "m=>i" borrows from Perl hashes, which are in fact used in
182 the implementation of PP. One could also say "$a(n=>j,m=>i)" as order
183 is not important.
184
185 You can also see in the above example the use of another PP construct -
186 $SIZE(n) to get the length of the dimension "n".
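
Why the order of the named indices does not matter can be pictured in plain C: each named dimension carries its own step through memory, so an access is an offset computed from the names rather than from argument position. The struct and function below are hypothetical, a sketch under that assumption and not PP's real expansion.

```c
#include <stddef.h>

/* Hypothetical description of a 2-D ndarray a(m,n). */
typedef struct {
    const double *data;
    ptrdiff_t m_size, n_size;   /* $SIZE(m), $SIZE(n)              */
    ptrdiff_t inc_m, inc_n;     /* step per dimension, in elements */
} array2d;

/* Models $a(m=>i, n=>j), which is the same element as $a(n=>j, m=>i). */
double at_mn(const array2d *a, ptrdiff_t i, ptrdiff_t j)
{
    return a->data[i * a->inc_m + j * a->inc_n];
}
```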
187
188 That said, you should not write an explicit C loop where the PP
189 "loop" construct would do: with "loop", PDL::PP checks the loop
190 limits for you automatically and the code is more concise. But there
191 are certainly situations where you need explicit control of the loop,
192 and now you know how to do it ;).
193
194 To revisit 'Why PP?' - the above code for sumit() will be generated for
195 each data-type. It will operate on slices of arrays 'in-place'. It will
196 thread automatically - e.g. if a 2D array is given it will be called
197 repeatedly for each 1D row (again check PDL::Indexing for the details
198 of threading). And then b() will be a 1D array of sums of each row.
199 We could call it with $x->transpose to sum the columns instead. And
200 Dataflow tracing etc. will be available.
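
The threading behaviour described here can be sketched in plain C as an outer loop that calls the 1-D kernel once per slice. This is only an illustration under the assumption of simple strided storage; "sum_slice" and "sumit_threaded" are hypothetical names, and the real thread loop generated by PP is more general.

```c
#include <stddef.h>

/* Sum one 1-D slice: the part written in the Code section. */
double sum_slice(const double *a, ptrdiff_t n_size, ptrdiff_t inc_n)
{
    double tmp = 0;
    for (ptrdiff_t n = 0; n < n_size; n++)
        tmp += a[n * inc_n];
    return tmp;
}

/* Hypothetical thread loop: call the kernel once per slice along the
 * extra dimension t. Swapping (n_size, inc_n) with (t_size, inc_t)
 * is what passing a transposed array amounts to; no data is copied. */
void sumit_threaded(const double *a,
                    ptrdiff_t n_size, ptrdiff_t inc_n,
                    ptrdiff_t t_size, ptrdiff_t inc_t,
                    double *b)
{
    for (ptrdiff_t t = 0; t < t_size; t++)
        b[t] = sum_slice(a + t * inc_t, n_size, inc_n);
}
```

For a 3x2 array stored contiguously, steps (1, 3) give the row sums and steps (3, 1) give the column sums from the same memory.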
201
202 You can see PP saves the programmer from writing a lot of needlessly
203 repetitive C-code -- in our opinion this is one of the best features of
204 PDL making writing new C subroutines for PDL an amazingly concise
205 exercise. A second reason is the ability to make PP expand your concise
206 code definitions into different C code based on the needs of the
207 computer architecture in question. Imagine for example you are lucky to
208 have a supercomputer at your hands; in that case you want PDL::PP
209 certainly to generate code that takes advantage of the
210 vectorising/parallel computing features of your machine (this is a project
211 for the future). In any case, the bottom line is that your unchanged
212 code should still expand to working XS code even if the internals of
213 PDL change.
214
215 Also, because you are generating the code in an actual Perl script,
216 there are many fun things that you can do. Let's say that you need to
217 write both sumit (as above) and multit. With a little bit of
218 creativity, we can do
219
220 for({Name => 'sumit', Init => '0', Op => '+='},
221 {Name => 'multit', Init => '1', Op => '*='}) {
222 pp_def($_->{Name},
223 Pars => 'a(n); [o]b();',
224 Code => '
225 double tmp;
226 tmp = '.$_->{Init}.';
227 loop(n) %{
228 tmp '.$_->{Op}.' $a();
229 %}
230 $b() = tmp;
231 ');
232 }
233
234 which defines both the functions easily. Now, if you later need to
235 change the signature or dimensionality or whatever, you only need to
236 change one place in your code. Yeah, sure, your editor does have 'cut
237 and paste' and 'search and replace' but it's still less bothersome and
238 definitely more difficult to forget just one place and have strange
239 bugs creep in. Also, adding 'orit' (bitwise or) later is a one-liner.
240
241 And remember, you really have Perl's full abilities with you - you can
242 very easily read any input file and make routines from the information
243 in that file. For simple cases like the above, the author (Tjl)
244 currently favors the hash syntax like the above - it takes only a few
245 more characters than the corresponding array syntax but is much easier to
246 understand and change.
247
248 We should mention here also the ability to get the pointer to the
249 beginning of the data in memory - a prerequisite for interfacing PDL to
250 some libraries. This is handled with the "$P(var)" directive, see
251 below.
252
253 When starting work on a new pp_def'ined function, if you make a
254 mistake, you will usually find a pile of compiler errors indicating
255 line numbers in the generated XS file. If you know how to read XS files
256 (or if you want to learn the hard way), you could open the generated XS
257 file and search for the line number with the error. However, a recent
258 addition to PDL::PP helps report the correct line number of your
259 errors: "pp_line_numbers". Working with the original sumit example, if
260 you had a mis-spelling of tmp in your code, you could change the
261 (erroneous) code to something like this and the compiler would give you
262 much more useful information:
263
264 pp_def('sumit',
265 Pars => 'a(n); [o]b();',
266 Code => pp_line_numbers(__LINE__, q{
267 double tmp;
268 tmp = 0;
269 loop(n) %{
270 tmp += $a();
271 %}
272 $b() = rmp;
273 })
274 );
275
276 For the above situation, my compiler tells me:
277
278 ...
279 test.pd:15: error: 'rmp' undeclared (first use in this function)
280 ...
281
282 In my example script (called test.pd), line 15 is exactly the line at
283 which I made my typo: "rmp" instead of "tmp".
284
285 So, after this quick overview of the general flavour of programming PDL
286 routines using PDL::PP let's summarise in which circumstances you
287 should actually use this preprocessor/precompiler. You should use
288 PDL::PP if you want to
289
290 • interface PDL to some external library
291
292 • write some algorithm that would be slow if coded in Perl (this is
293 not as often as you think; take a look at threading and dataflow
294 first).
295
296 • be a PDL developer (and even then it's not obligatory)
297
299 Because of its architecture, PDL::PP can be both flexible and easy to
300 use on the one hand, yet exuberantly complicated at the same time.
301 Currently, part of the problem is that error messages are not very
302 informative and if something goes wrong, you'd better know what you are
303 doing and be able to hack your way through the internals (or be able to
304 figure out by trial and error what is wrong with your args to
305 "pp_def"). Although work is being done to produce better warnings, do
306 not be afraid to send your questions to the mailing list if you run
307 into trouble.
308
310 Now that you have some idea how to use "pp_def" to define new PDL
311 functions it is time to explain the general syntax of "pp_def".
312 "pp_def" takes as arguments first the name of the function you are
313 defining and then a hash list that can contain various keys.
314
315 Based on these keys PP generates XS code and a .pm file. The function
316 "pp_done" (see example in the SYNOPSIS) is used to tell PDL::PP that
317 there are no more definitions in this file and it is time to generate
318 the .xs and .pm files.
320
321 As a consequence, there may be several pp_def() calls inside a file (by
322 convention files with PP code have the extension .pd or .pp) but
323 generally only one pp_done().
324
325 There are two main types of usage of pp_def(), the 'data
326 operation' and 'slice operation' prototypes.
327
328 The 'data operation' is used to take some data, mangle it and output
329 some other data; this includes for example the '+' operation, matrix
330 inverse, sumover etc and all the examples we have talked about in this
331 document so far. Implicit and explicit threading and the creation of
332 the result are taken care of automatically in those operations. You can
333 even do dataflow with "sumit", "sumover", etc (don't be dismayed if you
334 don't understand the concept of dataflow in PDL very well yet; it is
335 still very much experimental).
336
337 The 'slice operation' is a different kind of operation: in a slice
338 operation, you are not changing any data, you are defining
339 correspondences between different elements of two ndarrays (examples
340 include the index manipulation/slicing function definitions in the file
341 slices.pd that is part of the PDL distribution; but beware, this is not
342 introductory level stuff).
343
344 To support bad values, additional keys are required for "pp_def", as
345 explained below.
346
347 If you are just interested in communicating with some external library
348 (for example some linear algebra/matrix library), you'll usually want
349 the 'data operation' so we are going to discuss that first.
350
352 A simple example
353 In the data operation, you must know what dimensions of data you need.
354 First, an example with scalars:
355
356 pp_def('add',
357 Pars => 'a(); b(); [o]c();',
358 Code => '$c() = $a() + $b();'
359 );
360
361 That looks a little strange but let's dissect it. The first line is
362 easy: we're defining a routine with the name 'add'. The second line
363 simply declares our parameters and the parentheses mean that they are
364 scalars. We call the string that defines our parameters and their
365 dimensionality the signature of that function. For its relevance with
366 regard to threading and index manipulations check the PDL::Indexing man
367 page.
368
369 The third line is the actual operation. You need to use the dollar
370 signs and parentheses to refer to your parameters (this will probably
371 change at some point in the future, once a good syntax is found).
372
373 These lines are all that is necessary to actually define the function
374 for PDL (well, actually it isn't; you additionally need to write a
375 Makefile.PL (see below) and build the module (something like 'perl
376 Makefile.PL; make'); but let's ignore that for the moment). So now you
377 can do
378
379 use MyModule;
380 $x = pdl 2,3,4;
381 $y = pdl 5;
382
383 $c = add($x,$y);
384 # or
385 add($x,$y,($c=null)); # Alternative form, useful if $c has been
386 # preset to something big, not useful here.
387
388 and have threading work correctly (the result is $c == [7 8 9]).
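
How a one-element argument is reused across the thread loop can be sketched in plain C by giving it a step of 0. The function name and the step parameters below are hypothetical, and the sketch assumes double data; it is not the code PP generates.

```c
#include <stddef.h>

/* Hypothetical thread loop around the one-line Code of 'add'.
 * An argument with a single element gets step 0, so the same
 * value is reused in every iteration. */
void add_threaded(const double *a, ptrdiff_t inc_a,
                  const double *b, ptrdiff_t inc_b,
                  double *c, ptrdiff_t t_size)
{
    for (ptrdiff_t t = 0; t < t_size; t++)
        c[t] = a[t * inc_a] + b[t * inc_b];   /* $c() = $a() + $b(); */
}
```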
389
390 The Pars section: the signature of a PP function
391 Seeing the above example code you will most probably ask: what is this
392 strange "$c=null" syntax in the second call to our new "add" function?
393 If you take another look at the definition of "add" you will notice
394 that the third argument "c" is flagged with the qualifier "[o]" which
395 tells PDL::PP that this is an output argument. So the above call to add
396 means 'create a new $c from scratch with correct dimensions' - "null"
397 is a special token for 'empty ndarray' (you might ask why we haven't
398 used the value "undef" to flag this instead of the PDL specific "null";
399 we are currently thinking about it ;).
400
401 [This should be explained in some other section of the manual as
402 well!!] The reason for having this syntax as an alternative is that if
403 you have really huge ndarrays, you can do
404
405 $c = PDL->null;
406 for(some long loop) {
407 # munge x,y
408 add($x,$y,$c);
409 # munge c, put something back to x,y
410 }
411
412 and avoid allocating and deallocating $c each time. It is allocated
413 once at the first add() and thereafter the memory stays until $c is
414 destroyed.
415
416 If you just say
417
418 $c = add($x,$y);
419
420 the code generated by PP will automatically fill in "$c=null" and
421 return the result. If you want to learn more about the reasons why
422 PDL::PP supports this style where output arguments are given as last
423 arguments check the PDL::Indexing man page.
424
425 "[o]" is not the only qualifier a pdl argument can have in the
426 signature. Another important qualifier is the "[t]" option which flags
427 a pdl as temporary. What does that mean? You tell PDL::PP that this
428 pdl is only used for temporary results in the course of the calculation
429 and you are not interested in its value after the computation has been
430 completed. But why should PDL::PP want to know about this in the first
431 place? The reason is closely related to the concepts of pdl auto
432 creation (you heard about that above) and implicit threading. If you
433 use implicit threading the dimensionality of automatically created pdls
434 is actually larger than that specified in the signature. Pdls flagged
435 with "[o]" will be created so that they have the additional
436 dimensions as required by the number of implicit thread dimensions.
437 When creating a temporary pdl, however, it will always only be made big
438 enough so that it can hold the result for one iteration in a thread
439 loop, i.e. as large as required by the signature. So less memory is
440 wasted when you flag a pdl as temporary. Secondly, you can use output
441 auto creation with temporary pdls even when you are using explicit
442 threading which is forbidden for normal output pdls flagged with "[o]"
443 (see PDL::Indexing).
444
445 Here is an example where we use the [t] qualifier. We define the
446 function "callf" that calls a C routine "f" which needs a temporary
447 array of the same size and type as the array "a" (sorry about the
448 forward reference for $P; it's a pointer access, see below):
449
450 pp_def('callf',
451 Pars => 'a(n); [t] tmp(n); [o] b()',
452 Code => 'PDL_Indx ns = $SIZE(n);
453 f($P(a),$P(b),$P(tmp),ns);
454 '
455 );
456
457 Another possible qualifier is "[phys]". If given, this means the pdl
458 will have "make_physical" in PDL::Core called on it.
459
460 Additionally, if it has a specified dimension "d" that has value 1, "d"
461 will not magically be grown if "d" is larger in another pdl with
462 specified dimension "d", and instead an exception will be thrown. E.g.:
463
464 pp_def('callf',
465 Pars => 'a(n); [phys] b(n); [o] c()',
466 # ...
467 );
468
469 If "a" has a lead dimension of 2 and "b" of 3, an exception will always
470 be thrown. However, if "b" has a lead dimension of 1, it would be
471 silently repeated as if it were 2, were it not a "phys" parameter.
472
473 Argument dimensions and the signature
474 Now we have just talked about dimensions of pdls and the signature. How
475 are they related? Let's say that we want to add a scalar + the index
476 number to a vector:
477
478 pp_def('add2',
479 Pars => 'a(n); b(); [o]c(n);',
480 Code => 'loop(n) %{
481 $c() = $a() + $b() + n;
482 %}'
483 );
484
485 There are several points to notice here: first, the "Pars" argument now
486 contains the n arguments to show that we have a single dimension in a
487 and c. It is important to note that dimensions are actual entities that
488 are accessed by name so this declares a and c to have the same first
489 dimension. In most PP definitions the size of named dimensions will be
490 set from the respective dimensions of non-output pdls (those with no
491 "[o]" flag) but sometimes you might want to set the size of a named
492 dimension explicitly through an integer parameter. See below in the
493 description of the "OtherPars" section how that works.
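
In plain C, and assuming contiguous double data, the add2 definition amounts to the loop below. Note how the dimension name n serves as the running index inside the loop. The function name is hypothetical and the sketch is not actual PP output.

```c
#include <stddef.h>

/* Sketch of add2 for double data: a(n); b(); [o]c(n).
 * Inside loop(n) the name n doubles as the running index. */
void add2_double(const double *a, double b, double *c, ptrdiff_t n_size)
{
    for (ptrdiff_t n = 0; n < n_size; n++)
        c[n] = a[n] + b + (double)n;   /* $c() = $a() + $b() + n; */
}
```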
494
495 Constant argument dimensions in the signature
496 Suppose you want an output ndarray to be created automatically and you
497 know that on every call its dimension will have the same size (say 9)
498 regardless of the dimensions of the input ndarrays. In this case you
499 use the following syntax in the Pars section to specify the size of the
500 dimension:
501
502 ' [o] y(n=9); '
503
504 As expected, extra dimensions required by threading will be created if
505 necessary. If you need to assign a named dimension according to a more
506 complicated formula (than a constant) you must use the "RedoDimsCode"
507 key described below.
508
509 Type conversions and the signature
510 The signature also determines the type conversions that will be
511 performed when a PP function is invoked. So what happens when we invoke
512 one of our previously defined functions with pdls of different type,
513 e.g.
514
515 add2($x,$y,($ret=null));
516
517 where $x is of type "PDL_Float" and $y of type "PDL_Short"? With the
518 signature as shown in the definition of "add2" above the datatype of
519 the operation (as determined at runtime) is that of the pdl with the
520 'highest' type (sequence is byte < short < ushort < long < float <
521 double). In the add2 example the datatype of the operation is float ($x
522 has that datatype). All pdl arguments are then type converted to that
523 datatype (they are not converted inplace but a copy with the right type
524 is created if a pdl argument doesn't have the type of the operation).
525 Null pdls don't contribute a type in the determination of the type of
526 the operation. However, they will be created with the datatype of the
527 operation; here, for example, $ret will be of type float. You should be
528 aware of these rules when calling PP functions with pdls of different
529 types to take the additional storage and runtime requirements into
530 account.
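
The promotion rule described above can be written down as a small plain-C helper. The enum and the function are hypothetical and cover only the six types listed in the text; this is a sketch of the rule, not PDL's implementation.

```c
#include <stddef.h>

/* Types in the order given in the text; T_NULL marks a null pdl. */
typedef enum { T_NULL = -1, T_BYTE, T_SHORT, T_USHORT,
               T_LONG, T_FLOAT, T_DOUBLE } pdl_type;

/* Hypothetical helper: the operation type is the highest type among
 * the arguments; null pdls do not take part in the decision. */
pdl_type operation_type(const pdl_type *args, size_t nargs)
{
    pdl_type best = T_NULL;
    for (size_t i = 0; i < nargs; i++)
        if (args[i] != T_NULL && args[i] > best)
            best = args[i];
    return best;
}
```

With a float $x, a short $y and a null $ret this yields float, matching the add2 example.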
531
532 These type conversions are correct for most functions you normally
533 define with "pp_def". However, there are certain cases where slightly
534 modified type conversion behaviour is desired. For these cases
535 additional qualifiers in the signature can be used to specify the
536 desired properties with regard to type conversion. These qualifiers can
537 be combined with those we have encountered already (the creation
538 qualifiers "[o]" and "[t]"). Let's go through the list of qualifiers
539 that change type conversion behaviour.
540
541 The most important is the "indx" qualifier which comes in handy when a
542 pdl argument represents indices into another pdl. Let's take a look at
543 an example from "PDL::Ufunc":
544
545 pp_def('maximum_ind',
546 Pars => 'a(n); indx [o] b()',
547 Code => '$GENERIC() cur;
548 PDL_Indx curind;
549 loop(n) %{
550 if (!n || $a() > cur) {cur = $a(); curind = n;}
551 %}
552 $b() = curind;',
553 );
554
555 The function "maximum_ind" finds the index of the largest element of a
556 vector. If you look at the signature you notice that the output
557 argument "b" has been declared with the additional "indx" qualifier.
558 This has the following consequences for type conversions: regardless of
559 the type of the input pdl "a" the output pdl "b" will be of type
560 "PDL_Indx" which makes sense since "b" will represent an index into
561 "a".
562
563 Note that 'curind' is declared as type "PDL_Indx" and not "indx".
564 While most datatype declarations in the 'Pars' section use the same
565 name as the underlying C type, "indx" is a type which is sufficient to
566 handle PDL indexing operations. For 32-bit installs, it can be a
567 32-bit integer type. For 64-bit installs, it will be a 64-bit integer
568 type.
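
For one concrete input type, the body of "maximum_ind" looks roughly like the following plain-C function. It is a sketch only: "ptrdiff_t" stands in for "PDL_Indx", the data is assumed contiguous, and the function name is hypothetical.

```c
#include <stddef.h>

/* Sketch of maximum_ind for short input. The element type follows the
 * input (PP would substitute $GENERIC()), while the index type is fixed;
 * ptrdiff_t stands in for PDL_Indx here. */
ptrdiff_t maximum_ind_short(const short *a, ptrdiff_t n_size)
{
    short cur = 0;
    ptrdiff_t curind = 0;
    for (ptrdiff_t n = 0; n < n_size; n++) {
        if (!n || a[n] > cur) { cur = a[n]; curind = n; }
    }
    return curind;
}
```

Because the comparison is strict, the first of several equal maxima is the one reported.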
569
570 Furthermore, if you call the function with an existing output pdl "b"
571 its type will not influence the datatype of the operation (see above).
572 Hence, even if "a" is of a smaller type than "b" it will not be
573 converted to match the type of "b" but stays untouched, which saves
574 memory and CPU cycles and is the right thing to do when "b" represents
575 indices. Also note that you can use the 'indx' qualifier together with
576 other qualifiers (the "[o]" and "[t]" qualifiers). Order is significant
577 -- type qualifiers precede creation qualifiers ("[o]" and "[t]").
578
579 The above example also demonstrates typical usage of the "$GENERIC()"
580 macro. It expands to the current type in a so called generic loop.
581 What is a generic loop? As you already heard a PP function has a
582 runtime datatype as determined by the type of the pdl arguments it has
583 been invoked with. The PP generated XS code for this function
584 therefore contains a switch like "switch (type) {case PDL_Byte: ...
585 case PDL_Double: ...}" that selects a case based on the runtime
586 datatype of the function (it's called a type "loop" because there is
587 a loop in PP code that generates the cases). In any case your code is
588 inserted once for each PDL type into this switch statement. The
589 "$GENERIC()" macro just expands to the respective type in each copy of
590 your parsed code in this "switch" statement, e.g., in the "case
591 PDL_Byte" section "cur" will be declared as "PDL_Byte" and so on for the
592 other case statements. As you can see, this macro is useful for declaring
593 variables that hold values of pdls in your code.
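
The idea of a generic loop can be imitated in plain C with a macro that is expanded once per case of a type switch. Everything below (the two-member enum, "SUM_BODY", "sum_any") is hypothetical and reduced to two types for brevity; the real generated XS code differs in detail.

```c
#include <stddef.h>

typedef enum { T_BYTE, T_DOUBLE } pdl_type;   /* only two types, for brevity */

/* The body is written once; GENERIC plays the role of $GENERIC(). */
#define SUM_BODY(GENERIC)                                   \
    do {                                                    \
        const GENERIC *a = (const GENERIC *)data;           \
        GENERIC tmp = 0;                                    \
        for (ptrdiff_t n = 0; n < n_size; n++) tmp += a[n]; \
        result = (double)tmp;                               \
    } while (0)

/* Hypothetical dispatcher: one case per supported type. */
double sum_any(pdl_type type, const void *data, ptrdiff_t n_size)
{
    double result = 0;
    switch (type) {
    case T_BYTE:   SUM_BODY(unsigned char); break;
    case T_DOUBLE: SUM_BODY(double);        break;
    }
    return result;
}
```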
594
595 There are a couple of other qualifiers with similar effects as "indx".
596 For your convenience there are the "float" and "double" qualifiers with
597 analogous consequences on type conversions as "indx". Let's assume you
598 have a very large array for which you want to compute row and column
599 sums with an equivalent of the "sumover" function. However, with the
600 normal definition of "sumover" you might run into problems when your
601 data is, e.g. of type short. A call like
602
603 sumover($large_pdl,($sums = null));
604
605 will result in $sums being of type short and is therefore prone to
606 overflow errors if $large_pdl is a very large array. On the other hand
607 calling
608
609 @dims = $large_pdl->dims; shift @dims;
610 sumover($large_pdl,($sums = zeroes(double,@dims)));
611
612 is not a good alternative either. Now we don't have overflow problems
613 with $sums but at the expense of a type conversion of $large_pdl to
614 double, something bad if this is really a large pdl. That's where
615 "double" comes in handy:
616
617 pp_def('sumoverd',
618 Pars => 'a(n); double [o] b()',
619 Code => 'double tmp=0;
620 loop(n) %{ tmp += $a(); %}
621 $b() = tmp;',
622 );
623
624 This gets us around the type conversion and overflow problems. Again,
625 analogous to the "indx" qualifier "double" results in "b" always being
626 of type double regardless of the type of "a" without leading to a type
627 conversion of "a" as a side effect.
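
What the "double" qualifier buys can be seen in a plain-C sketch for short input: the data stays short and only the accumulator and the result are double. The function name is hypothetical and contiguous storage is assumed.

```c
#include <stddef.h>

/* Sketch of sumoverd for short input: the data stays short, only the
 * accumulator and the output are double, so nothing is converted up front. */
double sumoverd_short(const short *a, ptrdiff_t n_size)
{
    double tmp = 0;
    for (ptrdiff_t n = 0; n < n_size; n++)
        tmp += a[n];
    return tmp;
}
```

Three elements of 30000 sum to 90000, a value a 16-bit short could not hold.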
628
629 There is also a special type, "real". The others above are all actual
630 PDL/C datatypes, but "real" is a modifier; if the operation type is
631 real, it has no effect; if it is complex, then the parameter will be
632 the real version - so "cdouble" becomes "double", etc.
633
634 There is also the converse, "complex". If the operation is already
635 complex, there is no effect; if not, the output will be promoted to the
636 type's "complexversion" in PDL::Type, which defaults to "cfloat". Note
637 this is controlled both by the PDL::Types data, and the code in
638 PDL::PP. NB Because this outputs floating-point data, the inputs will
639 by definition be turned into such. Therefore, it only makes sense to
640 have floating-point "GenericTypes" inputs. If you want to default to
641 coercing inputs to "float", give that as the last "GenericTypes" as the
642 generated XS function defaults to the last-given one. Hence (with the
643 "PMCode" and "Doc" omitted):
644
645 pp_def('r2C',
646 GenericTypes=>[reverse qw(F D G C)], # last one is default so here = F
647 Pars => 'r(); complex [o]c()',
648 Code => '$c() = $r();'
649 );
650
651 Finally, there are the "type+" qualifiers where type is one of "int" or
652 "float". What does that mean? Let's illustrate the "int+" qualifier
653 with the actual definition of sumover:
654
655 pp_def('sumover',
656 Pars => 'a(n); int+ [o] b()',
657 Code => '$GENERIC(b) tmp=0;
658 loop(n) %{ tmp += $a(); %}
659 $b() = tmp;',
660 );
661
662 As we had already seen for the "int", "float" and "double" qualifiers,
663 a pdl marked with a "type+" qualifier does not influence the datatype
664 of the pdl operation. Its meaning is "make this pdl at least of type
665 "type" or higher, as required by the type of the operation". In the
666 sumover example this means that when you call the function with an "a"
667 of type PDL_Short the output pdl will be of type PDL_Long (just as
668 would have been the case with the "int" qualifier). This again tries to
669 avoid overflow problems when using small datatypes (e.g. byte images).
670 However, when the datatype of the operation is higher than the type
671 specified in the "type+" qualifier "b" will be created with the
672 datatype of the operation, e.g. when "a" is of type double then "b"
673 will be double as well. We hope you agree that this is sensible
674 behaviour for "sumover". It should be obvious how the "float+"
675 qualifier works by analogy. It may become necessary to be able to
676 specify a set of alternative types for the parameters. However, this
677 will probably not be implemented until someone comes up with a
678 reasonable use for it.
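
The "type+" rule reduces to taking the higher of two types. The helper below is a hypothetical plain-C sketch of that rule, using the type ordering given earlier in this document with "T_LONG" standing for the "int" qualifier.

```c
typedef enum { T_BYTE, T_SHORT, T_USHORT, T_LONG, T_FLOAT, T_DOUBLE } pdl_type;

/* Hypothetical helper for a "type+" qualifier: the output gets at least
 * min_type, or the operation type if that is higher. */
pdl_type plus_qualified_type(pdl_type op_type, pdl_type min_type)
{
    return op_type > min_type ? op_type : min_type;
}
```

For "int+", a short operation yields a long output while a double operation keeps a double output.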
679
680 Note that we now had to specify the $GENERIC macro with the name of the
681 pdl to derive the type from that argument. Why is that? If you
682 carefully followed our explanations you will have realised that in some
683 cases "b" will have a different type than the type of the operation.
684 Calling the '$GENERIC' macro with "b" as argument makes sure that the
685 type will always be the same as that of "b" in that part of the generic
686 loop.
687
688 This is about all there is to say about the "Pars" section in a
689 "pp_def" call. You should remember that this section defines the
690 signature of a PP defined function, you can use several options to
691 qualify certain arguments as output and temporary args and all
692 dimensions that you can later refer to in the "Code" section are
693 defined by name.
694
695 It is important that you understand the meaning of the signature since
696 in the latest PDL versions you can use it to define threaded functions
697 from within Perl, i.e. what we call Perl level threading. Please check
698 PDL::Indexing for details.
699
700 The Code section
701 The "Code" section contains the actual XS code that will be in the
702 innermost part of a thread loop (if you don't know what a thread loop
703 is then you still haven't read PDL::Indexing; do it now ;) after any PP
704 macros (like $GENERIC) and PP functions have been expanded (like the
705 "loop" function we are going to explain next).
706
707 Let's quickly reiterate the "sumover" example:
708
709 pp_def('sumover',
710 Pars => 'a(n); int+ [o] b()',
711 Code => '$GENERIC(b) tmp=0;
712 loop(n) %{ tmp += $a(); %}
713 $b() = tmp;',
714 );
715
716 The "loop" construct in the "Code" section also refers to the dimension
717 name so you don't need to specify any limits: the loop is correctly
718 sized and everything is done for you, again.
719
720 Next, there is the surprising fact that "$a()" and "$b()" do not
721 contain the index. This is not necessary because we're looping over n
722 and both variables know which dimensions they have so they
723 automatically know they're being looped over.
724
725 This feature comes in very handy in many places and makes for much
726 shorter code. Of course, there are times when you want to circumvent
727 this; here is a function which makes a matrix symmetric and serves as an
728 example of how to code explicit looping:
729
730 pp_def('symm',
731 Pars => 'a(n,n); [o]c(n,n);',
732 Code => 'loop(n) %{
733 int n2;
734 for(n2=n; n2<$SIZE(n); n2++) {
735 $c(n0 => n, n1 => n2) =
736 $c(n0 => n2, n1 => n) =
737 $a(n0 => n, n1 => n2);
738 }
739 %}
740 '
741 );
742
743 Let's dissect what is happening. Firstly, what is this function
744 supposed to do? From its signature you see that it takes a 2D matrix
745 with equal numbers of columns and rows and outputs a matrix of the same
746 size. From a given input matrix $a it computes a symmetric output
747 matrix $c (symmetric in the matrix sense that A^T = A where ^T means
748 matrix transpose, or in PDL parlance $c == $c->transpose). It does this
749 by using only the values on and below the diagonal of $a. In the output
750 matrix $c all values on and below the diagonal are the same as those in
751 $a while those above the diagonal are a mirror image of those below the
752 diagonal (above and below are here interpreted in the way that PDL
753 prints 2D pdls). If this explanation still sounds a bit strange just go
754 ahead, make a little file into which you write this definition, build
755 the new PDL extension (see section on Makefiles for PP code) and try it
756 out with a couple of examples.
757
Having explained what the function is supposed to do, there are a
couple of points worth noting from the syntactical point of view.
First, we get the size of the dimension named "n" again by using the
$SIZE macro. Second, there are suddenly these funny "n0" and "n1" index
names in the code though the signature defines only the dimension "n".
Why is this? The reason becomes clear when you note that both the first
and second dimension of $a and $c are named "n" in the signature of
"symm". This tells PDL::PP that the first and second dimension of these
arguments should have the same size. Otherwise the generated function
will raise a runtime error. However, now in an access to $a and $c
PDL::PP cannot figure out which index "n" refers to any more just from
the name of the index. Therefore, the indices with equal dimension
names get numbered from left to right starting at 0, e.g. in the above
example "n0" refers to the first dimension of $a and $c, "n1" to the
second and so on.
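
To make the index bookkeeping concrete, the following plain C sketch
mirrors the "symm" loops on a flat array. It assumes an n x n matrix of
doubles stored with the first index varying fastest, so that element
(i0,i1) lives at offset i0 + i1*n; the name "symm_sketch" is invented
here and this is not what PP emits.

```c
#include <stddef.h>

/* Plain C analogue of symm. Element (i0,i1) is stored at i0 + i1*n. */
void symm_sketch(const double *a, double *c, size_t n)
{
    size_t i, j;
    for (i = 0; i < n; i++) {         /* loop(n) %{ ... %}            */
        for (j = i; j < n; j++) {     /* explicit loop over n2        */
            double v = a[i + j * n];  /* $a(n0 => n,  n1 => n2)       */
            c[i + j * n] = v;         /* $c(n0 => n,  n1 => n2)       */
            c[j + i * n] = v;         /* $c(n0 => n2, n1 => n)        */
        }
    }
}
```

Each element of "a" with second index not smaller than the first is
read once and written to the two mirrored positions of "c".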
772
In all examples so far, we have only used the "Pars" and "Code" members
of the hash that was passed to "pp_def". There are certainly other keys
that are recognised by PDL::PP and we will hear about some of them in
the course of this document. Find a (non-exhaustive) list of keys in
Appendix A. A list of macros and PP functions (we have encountered only
some of those in the examples so far) that are expanded in values of
the hash argument to "pp_def" is summarised in Appendix B.
780
781 At this point, it might be appropriate to mention that PDL::PP is not a
782 completely static, well designed set of routines (as Tuomas puts it:
783 "stop thinking of PP as a set of routines carved in stone") but rather
784 a collection of things that the PDL::PP author (Tuomas J. Lukka)
785 considered he would have to write often into his PDL extension
786 routines. PP tries to be expandable so that in the future, as new needs
787 arise, new common code can be abstracted back into it. If you want to
788 learn more on why you might want to change PDL::PP and how to do it
789 check the section on PDL::PP internals.
790
791 Handling bad values
792 There are several keys and macros used when writing code to handle bad
793 values. The first one is the "HandleBad" key:
794
795 HandleBad => 0
796 This flags a pp-routine as NOT handling bad values. If this routine
797 is sent ndarrays with their "badflag" set, then a warning message
798 is printed to STDOUT and the ndarrays are processed as if the value
799 used to represent bad values is a valid number. The "badflag" value
800 is not propagated to the output ndarrays.
801
802 An example of when this is used is for FFT routines, which
803 generally do not have a way of ignoring part of the data.
804
805 HandleBad => 1
806 This causes PDL::PP to write extra code that ensures the BadCode
807 section is used, and that the "$ISBAD()" macro (and its brethren)
808 work.
809
810 HandleBad is not given
811 If any of the input ndarrays have their "badflag" set, then the
812 output ndarrays will have their "badflag" set, but any supplied
813 BadCode is ignored.
814
815 The value of "HandleBad" is used to define the contents of the "BadDoc"
816 key, if it is not given.
817
818 To handle bad values, code must be written somewhat differently; for
819 instance,
820
821 $c() = $a() + $b();
822
823 becomes something like
824
825 if ( $a() != BADVAL && $b() != BADVAL ) {
826 $c() = $a() + $b();
827 } else {
828 $c() = BADVAL;
829 }
830
831 However, we only want the second version if bad values are present in
832 the input ndarrays (and that bad-value support is wanted!) - otherwise
833 we actually want the original code. This is where the "BadCode" key
834 comes in; you use it to specify the code to execute if bad values may
835 be present, and PP uses both it and the "Code" section to create
836 something like:
837
838 if ( bad_values_are_present ) {
839 fancy_threadloop_stuff {
840 BadCode
841 }
842 } else {
843 fancy_threadloop_stuff {
844 Code
845 }
846 }
847
848 This approach means that there is virtually no overhead when bad values
849 are not present (i.e. the badflag routine returns 0).
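
The two-branch layout can be pictured with an ordinary C function. This
is only a sketch under stated assumptions: the flag is passed in as a
plain int, the bad value is a sentinel that compares equal to itself,
and the name "add_sketch" is made up for this illustration.

```c
#include <stddef.h>

/* One flag test up front selects which loop runs. */
void add_sketch(const double *a, const double *b, double *c,
                size_t n, int bad_values_are_present, double badval)
{
    size_t i;
    if (bad_values_are_present) {
        for (i = 0; i < n; i++) {       /* role of the BadCode section */
            if (a[i] != badval && b[i] != badval)
                c[i] = a[i] + b[i];
            else
                c[i] = badval;
        }
    } else {
        for (i = 0; i < n; i++)         /* role of the Code section */
            c[i] = a[i] + b[i];
    }
}
```

When the flag is clear, the only extra cost compared with the original
code is the single test made before the loop starts.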
850
851 The C preprocessor symbol "PDL_BAD_CODE" is defined when the bad code
852 is compiled, so that you can reduce the amount of code you write. The
853 BadCode section can use the same macros and looping constructs as the
854 Code section. However, it wouldn't be much use without the following
855 additional macros:
856
857 $ISBAD(var)
858 To check whether an ndarray's value is bad, use the $ISBAD macro:
859
860 if ( $ISBAD(a()) ) { printf("a() is bad\n"); }
861
862 You can also access given elements of an ndarray:
863
864 if ( $ISBAD(a(n=>l)) ) { printf("element %d of a() is bad\n", l); }
865
866 $ISGOOD(var)
867 This is the opposite of the $ISBAD macro.
868
869 $SETBAD(var)
870 For when you want to set an element of an ndarray bad.
871
872 $ISBADVAR(c_var,pdl)
873 If you have cached the value of an ndarray "$a()" into a c-variable
874 ("foo" say), then to check whether it is bad, use
875 "$ISBADVAR(foo,a)".
876
877 $ISGOODVAR(c_var,pdl)
878 As above, but this time checking that the cached value isn't bad.
879
880 $SETBADVAR(c_var,pdl)
881 To copy the bad value for an ndarray into a c variable, use
882 "$SETBADVAR(foo,a)".
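
The cached-variable macros are typically used in reductions. The plain
C sketch below shows the pattern they abbreviate: copy the element into
a local variable, test the copy, and only then use it. The function
name, the "ngood" counter and the sentinel-style bad value are
assumptions for illustration; with a NaN bad value a plain "!="
comparison like this one would not work.

```c
#include <stddef.h>

/* Sum the good elements of a and report how many there were. */
double sum_good_sketch(const double *a, size_t n, double badval,
                       size_t *ngood)
{
    double sum = 0;
    size_t i, count = 0;
    for (i = 0; i < n; i++) {
        double foo = a[i];      /* cached copy of the element        */
        if (foo != badval) {    /* role of $ISGOODVAR(foo,a)         */
            sum += foo;
            count++;
        }
    }
    *ngood = count;
    return sum;
}
```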
883
884 TODO: mention "$PPISBAD()" etc macros.
885
886 Using these macros, the above code could be specified as:
887
888 Code => '$c() = $a() + $b();',
889 BadCode => '
890 if ( $ISBAD(a()) || $ISBAD(b()) ) {
891 $SETBAD(c());
892 } else {
893 $c() = $a() + $b();
894 }',
895
896 Since this is Perl, TMTOWTDI, so you could also write:
897
898 BadCode => '
899 if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
900 $c() = $a() + $b();
901 } else {
902 $SETBAD(c());
903 }',
904
905 You can reduce code repetition using the C "PDL_BAD_CODE" macro, using
906 the same code for both of the "Code" and "BadCode" sections:
907
    #ifdef PDL_BAD_CODE
    if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
    #endif /* PDL_BAD_CODE */

    $c() = $a() + $b();

    #ifdef PDL_BAD_CODE
    } else {
      $SETBAD(c());
    }
    #endif /* PDL_BAD_CODE */
919
920 If you want access to the value of the badflag for a given ndarray, you
921 can use the PDL STATE macros:
922
923 $ISPDLSTATEBAD(pdl)
924 $ISPDLSTATEGOOD(pdl)
925 $SETPDLSTATEBAD(pdl)
926 $SETPDLSTATEGOOD(pdl)
927
928 TODO: mention the "FindBadStatusCode" and "CopyBadStatusCode" options
929 to "pp_def", as well as the "BadDoc" key.
930
931 Interfacing your own/library functions using PP
932 Now, consider the following: you have your own C function (that may in
933 fact be part of some library you want to interface to PDL) which takes
934 as arguments two pointers to vectors of double:
935
936 void myfunc(int n,double *v1,double *v2);
937
938 The correct way of defining the PDL function is
939
940 pp_def('myfunc',
941 Pars => 'a(n); [o]b(n);',
942 GenericTypes => ['D'],
943 Code => 'myfunc($SIZE(n),$P(a),$P(b));'
944 );
945
The "$P(par)" syntax returns a pointer to the first element and the
other elements are guaranteed to lie after that.
948
Notice that here it is possible to make many mistakes. First, $SIZE(n)
must be used instead of "n". Second, you shouldn't put any loops in
this code. Third, here we encounter a new hash key recognised by
PDL::PP: the "GenericTypes" declaration tells PDL::PP to ONLY GENERATE
THE TYPELOOP FOR THE LIST OF TYPES SPECIFIED, in this case "double".
This has two advantages. Firstly, the size of the compiled code is
reduced vastly; secondly, if non-double arguments are passed to
"myfunc()", PDL will automatically convert them to double before
passing them to the external C routine and convert them back
afterwards.
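
The convert, call, convert-back sequence can be imagined along the
following lines. This is a conceptual C sketch, not PDL's conversion
machinery: the body given to "myfunc" (doubling each element) is
invented so that the example does something, and "myfunc_on_floats" is
a hypothetical helper name.

```c
#include <stdlib.h>

/* Library routine with the prototype from the text. The body is
 * invented for this sketch: v2[i] = 2 * v1[i]. */
void myfunc(int n, double *v1, double *v2)
{
    int i;
    for (i = 0; i < n; i++)
        v2[i] = 2.0 * v1[i];
}

/* Feed float data to the double-only routine through temporaries.
 * Returns 0 on success, -1 if the temporaries cannot be allocated. */
int myfunc_on_floats(int n, const float *in, float *out)
{
    double *tin, *tout;
    int i;
    if (n <= 0)
        return 0;
    tin = malloc((size_t)n * sizeof *tin);
    tout = malloc((size_t)n * sizeof *tout);
    if (!tin || !tout) {
        free(tin);
        free(tout);
        return -1;
    }
    for (i = 0; i < n; i++)
        tin[i] = in[i];              /* convert input to double      */
    myfunc(n, tin, tout);            /* the routine sees only double */
    for (i = 0; i < n; i++)
        out[i] = (float)tout[i];     /* convert result back to float */
    free(tin);
    free(tout);
    return 0;
}
```

The external routine never needs to know that the caller's data was
not double to begin with.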
958
959 One can also use "Pars" to qualify the types of individual arguments.
960 Thus one could also write this as:
961
962 pp_def('myfunc',
963 Pars => 'double a(n); double [o]b(n);',
964 Code => 'myfunc($SIZE(n),$P(a),$P(b));'
965 );
966
967 The type specification in "Pars" exempts the argument from variation in
968 the typeloop - rather it is automatically converted to and from the
969 type specified. This is obviously useful in a more general example,
970 e.g.:
971
972 void myfunc(int n,float *v1,long *v2);
973
974 pp_def('myfunc',
975 Pars => 'float a(n); long [o]b(n);',
976 GenericTypes => ['F'],
977 Code => 'myfunc($SIZE(n),$P(a),$P(b));'
978 );
979
Note that we still use "GenericTypes" to reduce the size of the type
loop. PP could in principle spot this and do it automatically, but the
code has yet to attain that level of sophistication!

Finally, note that when types are converted automatically one MUST use
the "[o]" qualifier for output variables, or your hard-won changes will
get optimised away by PP!
987
988 If you interface a large library you can automate the interfacing even
989 further. Perl can help you again(!) in doing this. In many libraries
990 you have certain calling conventions. This can be exploited. In short,
991 you can write a little parser (which is really not difficult in Perl)
992 that then generates the calls to "pp_def" from parsed descriptions of
993 the functions in that library. For an example, please check the Slatec
interface in the "Lib" tree of the PDL distribution. If you want to
check (during debugging) which calls to PP functions your Perl code
generated, a little helper package comes in handy: it replaces the PP
functions with identically named ones that dump their arguments to
stdout.
999
1000 Just say
1001
1002 perl -MPDL::PP::Dump myfile.pd
1003
1004 to see the calls to "pp_def" and friends. Try it with ops.pd and
1005 slatec.pd. If you're interested (or want to enhance it), the source is
1006 in Basic/Gen/PP/Dump.pm
1007
1008 Other macros and functions in the Code section
1009 Macros: So far we have encountered the $SIZE, $GENERIC and $P macros.
1010 Now we are going to quickly explain the other macros that are expanded
1011 in the "Code" section of PDL::PP along with examples of their usage.
1012
1013 $T The $T macro is used for type switches. This is very useful when you
1014 have to use different external (e.g. library) functions depending on
1015 the input type of arguments. The general syntax is
1016
1017 $Ttypeletters(type_alternatives)
1018
1019 where "typeletters" is a permutation of a subset of the letters
1020 "BSULNQFD" which stand for Byte, Short, Ushort, etc. and
1021 "type_alternatives" are the expansions when the type of the PP
1022 operation is equal to that indicated by the respective letter. Let's
1023 illustrate this incomprehensible description by an example. Assuming
1024 you have two C functions with prototypes
1025
1026 void float_func(float *in, float *out);
1027 void double_func(double *in, double *out);
1028
1029 which do basically the same thing but one accepts float and the
1030 other double pointers. You could interface them to PDL by defining a
1031 generic function "foofunc" (which will call the correct function
1032 depending on the type of the transformation):
1033
    pp_def('foofunc',
      Pars => ' a(n); [o] b();',
      Code => ' $TFD(float,double)_func ($P(a),$P(b));',
      GenericTypes => [qw(F D)],
    );
1039
1040 There is a limitation that the comma-separated values cannot have
1041 parentheses.
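
One way to picture the effect of "$TFD(float,double)_func" is a switch
over the type of the operation in which each branch calls the matching
function. The C sketch below is an analogy only: the enum, the name
"foofunc_sketch" and the bodies given to "float_func" and "double_func"
are assumptions, not code produced by PP.

```c
/* Hypothetical bodies for the two library functions from the text. */
void float_func(float *in, float *out)    { *out = *in + 1.0f; }
void double_func(double *in, double *out) { *out = *in + 1.0;  }

typedef enum { SKETCH_F, SKETCH_D } sketch_type;

/* Call the function whose name prefix matches the operation type. */
void foofunc_sketch(sketch_type t, void *in, void *out)
{
    switch (t) {
    case SKETCH_F:  /* where $TFD(float,double) yields "float"  */
        float_func((float *)in, (float *)out);
        break;
    case SKETCH_D:  /* where $TFD(float,double) yields "double" */
        double_func((double *)in, (double *)out);
        break;
    }
}
```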
1042
1043 $PP
1044 The $PP macro is used for a so called physical pointer access. The
1045 physical refers to some internal optimisations of PDL (for those who
1046 are familiar with the PDL core we are talking about the vaffine
1047 optimisations). This macro is mainly for internal use and you
1048 shouldn't need to use it in any of your normal code.
1049
1050 $COMP (and the "OtherPars" section)
1051 The $COMP macro is used to access non-pdl values in the code
1052 section. Its name is derived from the implementation of
1053 transformations in PDL. The variables you can refer to using $COMP
1054 are members of the ``compiled'' structure that represents the PDL
1055 transformation in question but does not yet contain any information
1056 about dimensions (for further details check PDL::Internals).
1057 However, you can treat $COMP just as a black box without knowing
1058 anything about the implementation of transformations in PDL. So when
1059 would you use this macro? Its main usage is to access values of
1060 arguments that are declared in the "OtherPars" section of a "pp_def"
1061 definition. But then you haven't heard about the "OtherPars" key
1062 yet?! Let's have another example that illustrates typical usage of
1063 both new features:
1064
    pp_def('pnmout',
      Pars => 'a(m)',
      OtherPars => "char* fd",
      GenericTypes => [qw(B U S L)],
      Code => 'PerlIO *fp;
               IO *io;
               int len;

               io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
               if (!io || !(fp = IoIFP(io)))
                      croak("Can\'t figure out FP");

               /* number of bytes in one row of a */
               len = $SIZE(m) * sizeof($GENERIC());

               if (PerlIO_write(fp,$P(a),len) != len)
                      croak("Error writing pnm file");
      ');
1079
1080 This function is used to write data from a pdl to a file. The file
1081 descriptor is passed as a string into this function. This parameter
1082 does not go into the "Pars" section since it cannot be usefully
1083 treated like a pdl but rather into the aptly named "OtherPars"
1084 section. Parameters in the "OtherPars" section follow those in the
1085 "Pars" section when invoking the function, i.e.
1086
1087 open FILE,">out.dat" or die "couldn't open out.dat";
1088 pnmout($pdl,'FILE');
1089
1090 When you want to access this parameter inside the code section you
1091 have to tell PP by using the $COMP macro, i.e. you write "$COMP(fd)"
1092 as in the example. Otherwise PP wouldn't know that the "fd" you are
1093 referring to is the same as that specified in the "OtherPars"
1094 section.
1095
1096 Another use for the "OtherPars" section is to set a named dimension
1097 in the signature. Let's have an example how that is done:
1098
1099 pp_def('setdim',
1100 Pars => '[o] a(n)',
1101 OtherPars => 'int ns => n',
1102 Code => 'loop(n) %{ $a() = n; %}',
1103 );
1104
1105 This says that the named dimension "n" will be initialised from the
1106 value of the other parameter "ns" which is of integer type (I guess
1107 you have realised that we use the "CType From => named_dim" syntax).
1108 Now you can call this function in the usual way:
1109
1110 setdim(($x=null),5);
1111 print $x;
1112 [ 0 1 2 3 4 ]
1113
Admittedly this function is not very useful but it demonstrates how
it works. If you call the function with an existing pdl, you don't
need to explicitly specify the size of "n" since PDL::PP can figure
it out from the dimensions of the non-null pdl. In that case you just
give the dimension parameter as "-1":
1119
1120 $x = hist($y);
1121 setdim($x,-1);
1122
1123 That should do it.
1124
1125 The only PP function that we have used in the examples so far is
1126 "loop". Additionally, there are currently two other functions which
1127 are recognised in the "Code" section:
1128
1129 threadloop
1130 As we heard above the signature of a PP defined function defines the
1131 dimensions of all the pdl arguments involved in a primitive
1132 operation. However, you often call the functions that you defined
1133 with PP with pdls that have more dimensions than those specified in
1134 the signature. In this case the primitive operation is performed on
1135 all subslices of appropriate dimensionality in what is called a
1136 thread loop (see also overview above and PDL::Indexing). Assuming you
1137 have some notion of this concept you will probably appreciate that
1138 the operation specified in the code section should be optimised since
1139 this is the tightest loop inside a thread loop. However, if you
1140 revisit the example where we define the "pnmout" function, you will
1141 quickly realise that looking up the "IO" file descriptor in the inner
1142 thread loop is not very efficient when writing a pdl with many rows.
1143 A better approach would be to look up the "IO" descriptor once
1144 outside the thread loop and use its value then inside the tightest
1145 thread loop. This is exactly where the "threadloop" function comes in
1146 handy. Here is an improved definition of "pnmout" which uses this
1147 function:
1148
1149 pp_def('pnmout',
1150 Pars => 'a(m)',
1151 OtherPars => "char* fd",
1152 GenericTypes => [qw(B U S L)],
1153 Code => 'PerlIO *fp;
1154 IO *io;
1155 int len;
1156
1157 io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
1158 if (!io || !(fp = IoIFP(io)))
1159 croak("Can\'t figure out FP");
1160
1161 len = $SIZE(m) * sizeof($GENERIC());
1162
1163 threadloop %{
1164 if (PerlIO_write(fp,$P(a),len) != len)
1165 croak("Error writing pnm file");
1166 %}
1167 ');
1168
This works as follows. Normally the C code you write inside the
"Code" section is placed inside a thread loop (i.e. PP generates the
appropriate wrapping XS code around it). However, when you explicitly
use the "threadloop" function, PDL::PP recognises this and doesn't
wrap your code with an additional thread loop. This has the effect
that code you write outside the thread loop is only executed once per
transformation and just the code within the surrounding "%{ ... %}"
pair is placed within the tightest thread loop. This also comes in
handy when you want to perform a decision (or any other code,
especially CPU intensive code) only once per thread, i.e.
1179
1180 pp_addhdr('
1181 #define RAW 0
1182 #define ASCII 1
1183 ');
1184 pp_def('do_raworascii',
1185 Pars => 'a(); b(); [o]c()',
1186 OtherPars => 'int mode',
1187 Code => ' switch ($COMP(mode)) {
1188 case RAW:
1189 threadloop %{
1190 /* do raw stuff */
1191 %}
1192 break;
1193 case ASCII:
1194 threadloop %{
1195 /* do ASCII stuff */
1196 %}
1197 break;
1198 default:
1199 croak("unknown mode");
1200 }'
1201 );
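
Stripped of all PP machinery, the structure of "do_raworascii" is the
familiar one of taking a decision once and then running a loop that
contains no further branching. The C sketch below is an analogy under
simplifying assumptions: it keeps only one input and one output, the
"raw" and "ASCII" operations are invented, and a plain "for" loop
stands in for the thread loop.

```c
#include <stddef.h>

#define RAW   0
#define ASCII 1

/* Returns 0 on success, -1 for an unknown mode.
 * a and c each hold nthreads elements. */
int do_raworascii_sketch(const int *a, int *c, size_t nthreads, int mode)
{
    size_t i;
    switch (mode) {                     /* decided once, outside the loop */
    case RAW:
        for (i = 0; i < nthreads; i++)  /* role of threadloop %{ ... %}  */
            c[i] = a[i];                /* invented "raw" operation      */
        break;
    case ASCII:
        for (i = 0; i < nthreads; i++)
            c[i] = a[i] + '0';          /* invented "ASCII" operation    */
        break;
    default:
        return -1;                      /* the text uses croak() here    */
    }
    return 0;
}
```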
1202
1203 types
The types function works similarly to the $T macro. However, with the
"types" function the code in the following block (delimited by "%{"
and "%}" as usual) is executed for all those cases in which the
datatype of the operation is any of the types represented by the
letters in the argument to "types", e.g.
1209
1210 Code => '...
1211
1212 types(BSUL) %{
1213 /* do integer type operation */
1214 %}
1215 types(FD) %{
1216 /* do floating point operation */
1217 %}
1218 ...'
1219
1220 You are encouraged to use this idiom (from PDL::Math) in order to
1221 minimise effort needed to make your code work with new types:
1222
1223 use PDL::Types qw(types);
1224 my @Rtypes = grep $_->real, types();
1225 my @Ctypes = grep !$_->real, types();
1226 # ...
1227 my $got_complex = PDL::Core::Dev::got_complex_version($name, 2);
1228 my $complex_bit = join "\n",
1229 map 'types('.$_->ppsym.') %{$'.$c.'() = c'.$name.$_->floatsuffix.'($'.$x.'(),$'.$y.'());%}',
1230 @Ctypes;
1231 my $real_bit = join "\n",
1232 map 'types('.$_->ppsym.') %{$'.$c.'() = '.$name.'($'.$x.'(),$'.$y.'());%}',
1233 @Rtypes;
1234 ($got_complex ? $complex_bit : '') . $real_bit;
1235
1236 The RedoDimsCode Section
1237 The "RedoDimsCode" key is an optional key that is used to compute
1238 dimensions of ndarrays at runtime in case the standard rules for
1239 computing dimensions from the signature are not sufficient. The
1240 contents of the "RedoDimsCode" entry is interpreted in the same way
1241 that the Code section is interpreted-- i.e., PP macros are expanded and
1242 the result is interpreted as C code. The purpose of the code is to set
1243 the size of some dimensions that appear in the signature. Storage
1244 allocation and threadloops and so forth will be set up as if the
1245 computed dimension had appeared in the signature. In your code, you
1246 first compute the desired size of a named dimension in the signature
1247 according to your needs and then assign that value to it via the
1248 $SIZE() macro.
1249
As an example, consider the following situation. You are interfacing an
external library routine that requires a temporary array for workspace
to be passed as an argument. Two input data arrays that are passed are
p(m) and x(n). The output data array is y(n). The routine requires a
workspace array with a length of n+m*m, and you'd like the storage
created automatically just like it would be for any ndarray flagged
with [t] or [o]. What you'd like is to say something like
1257
    pp_def( "myexternalfunc",
      Pars => " p(m); x(n); [o] y(n); [t] work(n+m*m); ", ...
1260
1261 but that won't work, because PP can't interpret expressions with
1262 arithmetic in the signature. Instead you write
1263
    pp_def(
        "myexternalfunc",
        Pars => ' p(m); x(n); [o] y(n); [t] work(wn); ',
        RedoDimsCode => '
            PDL_Indx im = $PDL(p)->dims[0];
            PDL_Indx in = $PDL(x)->dims[0];
            PDL_Indx min = in + im * im;
            PDL_Indx inw = $PDL(work)->dims[0];
            $SIZE(wn) = inw >= min ? inw : min;
        ',
        Code => '
            externalfunc( $P(p), $P(x), $SIZE(m), $SIZE(n), $P(work) );
        '
    );
1278
This code works as follows: The macro $PDL(p) expands to a pointer to
the pdl struct for the ndarray p. You don't want a pointer to the data
(i.e. $P) in this case, because you want to access the methods for the
ndarray on the C level. You get the first dimension of each of the
ndarrays and store them in integers. Then you compute the minimum
1284 length the work array can be. If the user sent an ndarray "work" with
1285 sufficient storage, then leave it alone. If the user sent, say a null
1286 pdl, or no pdl at all, then the size of wn will be zero and you reset
1287 it to the minimum value. Before the code in the Code section is
1288 executed PP will create the proper storage for "work" if it does not
1289 exist. Note that you only took the first dimension of "p" and "x"
1290 because the user may have sent ndarrays with extra threading
1291 dimensions. Of course, the temporary ndarray "work" (note the [t] flag)
1292 should not be given any thread dimensions anyway.
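
The size rule applied by this "RedoDimsCode" can be written as a small
stand-alone C function, which may help when checking the arithmetic by
hand. The name "work_size_sketch" is hypothetical and "ptrdiff_t"
merely stands in for "PDL_Indx" here.

```c
#include <stddef.h>

/* m, n  : first dimensions of p and x
 * given : first dimension of the work ndarray supplied by the caller
 *         (0 if none was supplied)
 * Returns the size that would be assigned to $SIZE(wn). */
ptrdiff_t work_size_sketch(ptrdiff_t m, ptrdiff_t n, ptrdiff_t given)
{
    ptrdiff_t min = n + m * m;          /* smallest usable workspace */
    return given >= min ? given : min;  /* keep a big enough one     */
}
```

For m = 3 and n = 4 the minimum is 4 + 3*3 = 13, so a caller-supplied
workspace of length 20 is kept, while a missing one is sized to 13.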
1293
You can also use "RedoDimsCode" to set the dimension of an ndarray
flagged with [o]. In this case you set the dimensions for the named
dimension in the signature using $SIZE() as in the preceding example.
However, because the ndarray is flagged with [o] instead of [t],
threading dimensions will be added if required just as if the size of
the dimension were computed from the signature according to the usual
rules. Here is an example from PDL::Math:
1301
1302 pp_def("polyroots",
1303 Pars => 'cr(n); ci(n); [o]rr(m); [o]ri(m);',
1304 RedoDimsCode => 'PDL_Indx sn = $PDL(cr)->dims[0]; $SIZE(m) = sn-1;',
1305
The input ndarrays are the real and imaginary parts of complex
coefficients of a polynomial. The output ndarrays are real and
imaginary parts of the roots. There are "n" roots to an "n"th order
polynomial and such a polynomial has "n+1" coefficients (the zero-th
through the "n"th). In this example, threading will work correctly.
That is, the first dimension of the output ndarray will have its
dimension adjusted, but other threading dimensions will be assigned
just as if there were no "RedoDimsCode".
1314
1315 Typemap handling in the "OtherPars" section
1316 The "OtherPars" section discussed above is very often absolutely
1317 crucial when you interface external libraries with PDL. However in many
1318 cases the external libraries either use derived types or pointers of
1319 various types.
1320
1321 The standard way to handle this in Perl is to use a "typemap" file.
1322 This is discussed in some detail in perlxs in the standard Perl
1323 documentation. In PP the functionality is very similar, so you can
1324 create a "typemap" file in the directory where your PP file resides and
1325 when it is built it is automatically read in to figure out the
1326 appropriate translation between the C type and Perl's built-in type.
1327
1328 That said, there are a couple of important differences from the general
1329 handling of types in XS. The first, and probably most important, is
1330 that at the moment pointers to types are not allowed in the "OtherPars"
1331 section. To get around this limitation you must use the "IV" type
1332 (thanks to Judd Taylor for pointing out that this is necessary for
1333 portability).
1334
1335 It is probably best to illustrate this with a couple of code-snippets:
1336
1337 For instance the "gsl_spline_init" function has the following C
1338 declaration:
1339
1340 int gsl_spline_init(gsl_spline * spline,
1341 const double xa[], const double ya[], size_t size);
1342
1343 Clearly the "xa" and "ya" arrays are candidates for being passed in as
1344 ndarrays and the "size" argument is just the length of these ndarrays
1345 so that can be handled by the "$SIZE()" macro in PP. The problem is the
1346 pointer to the "gsl_spline" type. The natural solution would be to
1347 write an "OtherPars" declaration of the form
1348
1349 OtherPars => 'gsl_spline *spl'
1350
1351 and write a short "typemap" file which handled this type. This does not
1352 work at present however! So what you have to do is to go around the
1353 problem slightly (and in some ways this is easier too!):
1354
1355 The solution is to declare "spline" in the "OtherPars" section using an
1356 "Integer Value", "IV". This hides the nature of the variable from PP
1357 and you then need to (well to avoid compiler warnings at least!)
1358 perform a type cast when you use the variable in your code. Thus
1359 "OtherPars" should take the form:
1360
1361 OtherPars => 'IV spl'
1362
1363 and when you use it in the code you will write
1364
1365 INT2PTR(gsl_spline *, $COMP(spl))
1366
1367 where the Perl API macro "INT2PTR" has been used to handle the pointer
1368 cast to avoid compiler warnings and problems for machines with mixed
1369 32bit and 64bit Perl configurations. Putting this together as Andres
1370 Jordan has done (with the modification using "IV" by Judd Taylor) in
1371 the "gsl_interp.pd" in the distribution source you get:
1372
    pp_def('init_meat',
      Pars => 'double x(n); double y(n);',
      OtherPars => 'IV spl',
      Code =>'
        gsl_spline_init( INT2PTR(gsl_spline *, $COMP(spl)), $P(x),$P(y),$SIZE(n));'
    );

where I have removed a macro wrapper call, since it would obscure the
discussion.
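
The idea behind the "IV" trick, carrying a pointer through an integer
and casting it back at the point of use, can be tried out in plain C
without Perl. In the sketch below the standard type "intptr_t" stands
in for Perl's "IV", an ordinary cast stands in for "INT2PTR", and the
struct and function names are invented for the illustration.

```c
#include <stdint.h>

/* Hypothetical opaque library object, standing in for gsl_spline. */
typedef struct { double scale; } handle_sketch;

/* Hide the pointer in an integer, as the Perl side would store it. */
intptr_t handle_to_int(handle_sketch *h)
{
    return (intptr_t)h;
}

/* Recover the pointer where it is needed; this cast plays the role
 * of INT2PTR(gsl_spline *, $COMP(spl)) in the text. */
double use_handle(intptr_t spl, double x)
{
    handle_sketch *h = (handle_sketch *)spl;
    return h->scale * x;
}
```

The integer is only a container: nothing between the two casts needs
to know what type the pointer refers to.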
1382
1383 The other minor difference as compared to the standard typemap handling
1384 in Perl, is that the user cannot specify non-standard typemap locations
1385 or typemap filenames using the "TYPEMAPS" option in MakeMaker... Thus
1386 you can only use a file called "typemap" and/or the "IV" trick above.
1387
1388 Other useful PP keys in data operation definitions
1389 You have already heard about the "OtherPars" key. Currently, there are
1390 not many other keys for a data operation that will be useful in normal
1391 (whatever that is) PP programming. In fact, it would be interesting to
1392 hear about a case where you think you need more than what is provided
1393 at the moment. Please speak up on one of the PDL mailing lists. Most
1394 other keys recognised by "pp_def" are only really useful for what we
1395 call slice operations (see also above).
1396
1397 One thing that is strongly being planned is variable number of
1398 arguments, which will be a little tricky.
1399
1400 An incomplete list of the available keys:
1401
1402 Inplace
1403 Setting this key marks the routine as working inplace - ie the
1404 input and output ndarrays are the same. An example is
1405 "$x->inplace->sqrt()" (or "sqrt(inplace($x))").
1406
1407 Inplace => 1
1408 Use when the routine is a unary function, such as "sqrt".
1409
Inplace => ['a']
    If there is more than one input ndarray, specify the name of
    the one that can be changed inplace using an array reference.

Inplace => ['a','b']
    If there is more than one output ndarray, specify the names of
    the input ndarray and output ndarray in a 2-element array
    reference. This probably isn't needed, but is left in for
    completeness.
1419
1420 If bad values are being used, care must be taken to ensure the
1421 propagation of the badflag when inplace is being used; consider
1422 this excerpt from Basic/Bad/bad.pd:
1423
1424 pp_def('replacebad',HandleBad => 1,
1425 Pars => 'a(); [o]b();',
1426 OtherPars => 'double newval',
1427 Inplace => 1,
1428 CopyBadStatusCode =>
1429 '/* propagate badflag if inplace AND it has changed */
1430 if ( a == b && $ISPDLSTATEBAD(a) )
1431 PDL->propagate_badflag( b, 0 );
1432
1433 /* always make sure the output is "good" */
1434 $SETPDLSTATEGOOD(b);
1435 ',
1436 ...
1437
Since this routine removes all bad values, the output ndarray has
its bad flag cleared. If run inplace (so "a == b"), then we have to
tell all the children of "a" that the bad flag has been cleared (to
save time we make sure that we call "PDL->propagate_badflag" only
if the input ndarray had its bad flag set).
1443
1444 NOTE: one idea is that the documentation for the routine could be
1445 automatically flagged to indicate that it can be executed inplace,
1446 ie something similar to how "HandleBad" sets "BadDoc" if it's not
1447 supplied (it's not an ideal solution).
1448
1449 Other PDL::PP functions to support concise package definition
1450 So far, we have described the "pp_def" and "pp_done" functions. PDL::PP
1451 exports a few other functions to aid you in writing concise PDL
1452 extension package definitions.
1453
1454 pp_addhdr
1455
1456 Often when you interface library functions as in the above example you
1457 have to include additional C include files. Since the XS file is
1458 generated by PP we need some means to make PP insert the appropriate
1459 include directives in the right place into the generated XS file. To
1460 this end there is the "pp_addhdr" function. This is also the function
1461 to use when you want to define some C functions for internal use by
1462 some of the XS functions (which are mostly functions defined by
1463 "pp_def"). By including these functions here you make sure that
1464 PDL::PP inserts your code before the point where the actual XS module
1465 section begins and will therefore be left untouched by xsubpp (cf.
1466 perlxs and perlxstut man pages).
1467
1468 A typical call would be
1469
    pp_addhdr('
    #include <unistd.h>        /* we need defs of XXXX */
    #include "libprotos.h"     /* prototypes of library functions */
    #include "mylocaldecs.h"   /* Local decs */

    static void do_the_real_work(PDL_Byte * in, PDL_Byte * out, int n)
    {
          /* do some calculations with the data */
    }
    ');
1480
1481 This ensures that all the constants and prototypes you need will be
1482 properly included and that you can use the internal functions defined
1483 here in the "pp_def"s, e.g.:
1484
1485 pp_def('barfoo',
1486 Pars => ' a(n); [o] b(n)',
1487 GenericTypes => ['B'],
1488 Code => ' PDL_Indx ns = $SIZE(n);
1489 do_the_real_work($P(a),$P(b),ns);
1490 ',
1491 );
1492
1493 pp_addpm
1494
1495 In many cases the actual PP code (meaning the arguments to "pp_def"
1496 calls) is only part of the package you are currently implementing.
1497 Often there is additional Perl code and XS code you would normally have
1498 written into the pm and XS files which are now automatically generated
1499 by PP. So how to get this stuff into those dynamically generated files?
1500 Fortunately, there are a couple of functions, generally called
1501 "pp_addXXX" that assist you in doing this.
1502
1503 Let's assume you have additional Perl code that should go into the
1504 generated pm-file. This is easily achieved with the "pp_addpm" command:
1505
    pp_addpm(<<'EOD');

    =head1 NAME

    PDL::Lib::Mylib -- a PDL interface to the Mylib library

    =head1 DESCRIPTION

    This package implements an interface to the Mylib package with full
    threading and indexing support (see L<PDL::Indexing>).

    =cut

    use PGPLOT;

    =head2 use_myfunc

    This function applies the myfunc operation to all the
    elements of the input pdl regardless of dimensions
    and returns the sum of the result.

    =cut

    sub use_myfunc {
        my $pdl = shift;

        # flatten the input and collect myfunc's output in a fresh ndarray
        my $res = null;
        myfunc($pdl->clump(-1), $res);

        return $res->sum;
    }

    EOD
1536
1537 pp_add_exported
1538
1539 You have probably got the idea. In some cases you also want to export
1540 your additional functions. To avoid getting into trouble with PP which
1541 also messes around with the @EXPORT array you just tell PP to add your
1542 functions to the list of exported functions:
1543
1544 pp_add_exported('use_myfunc gethynx');
1545
1546 pp_add_isa
1547
The "pp_add_isa" command works like the "pp_add_exported" function.
The arguments to "pp_add_isa" are added to the @ISA list, e.g.
1550
1551 pp_add_isa(' Some::Other::Class ');
1552
1553 pp_bless
1554
1555 If your pp_def routines are to be used as object methods use "pp_bless"
1556 to specify the package (i.e. class) to which your pp_defed methods will
1557 be added. For example, "pp_bless('PDL::MyClass')". The default is "PDL"
1558 if this is omitted.
1559
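As a minimal sketch of how this might look in a .pd file, the call is placed before the "pp_def" calls it should affect. The class name PDL::MyClass comes from the text above; the routine name "halve" is hypothetical. The fragment is meant to be processed by PDL::PP and is not runnable on its own:

```perl
# Fragment of a .pd file; assumes PDL::PP is processing it.
pp_bless('PDL::MyClass');          # pp_def routines below land in PDL::MyClass

pp_def('halve',                    # hypothetical routine name
    Pars => 'a(); [o] b();',
    Code => '$b() = $a() / 2;',
);

pp_done();
```

Under these assumptions the routine would be reached as PDL::MyClass::halve($ndarray) or, for objects blessed into PDL::MyClass, as $obj->halve.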
1560 pp_addxs
1561
1562 Sometimes you want to add extra XS code of your own (that is generally
1563 not involved with any threading/indexing issues but supplies some other
1564 functionality you want to access from the Perl side) to the generated
1565 XS file, for example
1566
1567 pp_addxs('','
1568
1569 # Determine endianness of machine
1570
1571 int
1572 isbigendian()
1573 CODE:
1574 unsigned short i;
1575 PDL_Byte *b;
1576
1577 i = 42; b = (PDL_Byte*) (void*) &i;
1578
1579 if (*b == 42)
1580 RETVAL = 0;
1581 else if (*(b+1) == 42)
1582 RETVAL = 1;
1583 else
1584 croak("Impossible - machine is neither big nor little endian!!\n");
1585 OUTPUT:
1586 RETVAL
1587 ');
1588
"pp_add_exported" and "pp_addxs" in particular should be used with care.
PP uses PDL::Exporter, so letting PP export your functions means that
they get added to the standard list of functions exported by default
(the list defined by the export tag ":Func"). If you use "pp_addxs",
you shouldn't try to do anything that involves threading or indexing
directly. PP is much better at generating the appropriate code from
your definitions.
1596
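For comparison, when a check like the isbigendian example is needed only on the Perl side, it can be sketched without XS using Perl's built-in pack and unpack. This is an illustration only (the function name is_big_endian is hypothetical), not something PDL::PP provides:

```perl
use strict;
use warnings;

# Store 42 in a native-order 16-bit integer and inspect its first byte,
# mirroring the logic of the XS isbigendian() example.
sub is_big_endian {
    my $first_byte = unpack 'C', pack('S', 42);
    return $first_byte == 42 ? 0 : 1;   # 42 first means little-endian
}

printf "big-endian: %d\n", is_big_endian();
```

Reaching for "pp_addxs" is more worthwhile when the work genuinely needs C, for example when calling into an external library.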
1597 pp_add_boot
1598
1599 Finally, you may want to add some code to the BOOT section of the XS
1600 file (if you don't know what that is check perlxs). This is easily done
1601 with the "pp_add_boot" command:
1602
    pp_add_boot(<<'EOB');
        descrip = mylib_initialize(KEEP_OPEN);

        if (descrip == NULL)
            croak("Can't initialize library");

        GlobalStruc->descrip = descrip;
        GlobalStruc->maxfiles = 200;
    EOB
1612
1613 pp_export_nothing
1614
By default, PP.pm puts all subs defined using the pp_def function into
the output .pm file's EXPORT list. This can create problems if you are
creating a subclassed object where you don't want any methods exported
(i.e. the methods will only be called using the $object->method
syntax).
1620
1621 For these cases you can call pp_export_nothing() to clear out the
1622 export list. Example (At the end of the .pd file):
1623
1624 pp_export_nothing();
1625 pp_done();
1626
1627 pp_core_importList
1628
By default, PP.pm puts a 'use PDL::Core;' line into the output .pm
file. This imports Core's exported names into the current namespace,
which can create problems if you are overriding one of Core's methods
in the current file. You end up getting messages like "Warning: sub
sumover redefined in file subclass.pm" when running the program.

For these cases pp_core_importList can be used to change what is
imported from Core.pm. For example:

    pp_core_importList('()');

This would result in

    use PDL::Core();

being generated in the output .pm file, so that no names are imported
from Core.pm. Similarly, calling

    pp_core_importList(' qw/ barf /');

would result in

    use PDL::Core qw/ barf /;

being generated in the output .pm file, so that just 'barf' is
imported from Core.pm.
1655
1656 pp_setversion
1657
1658 Simultaneously set the .pm and .xs files' versions, thus avoiding
1659 unnecessary version-skew between the two. To use this, simply do this
1660 in your .pd file, probably near the top:
1661
1662 our $VERSION = '0.0.3';
1663 pp_setversion($VERSION);
1664
1665 # Then, in your Makefile.PL:
1666 my @package = qw(FFTW3.pd FFTW3 PDL::FFTW3);
1667 my %descriptor = pdlpp_stdargs(\@package);
1668 $descriptor{VERSION_FROM} = 'FFTW3.pd'; # EUMM can parse the format above
1669
1670 However, don't use this if you use Module::Build::PDL. See that
1671 module's documentation for details.
1672
1673 pp_deprecate_module
1674
1675 If a particular module is deemed obsolete, this function can be used to
1676 mark it as deprecated. This has the effect of emitting a warning when a
1677 user tries to "use" the module. The generated POD for this module also
1678 carries a deprecation notice. The replacement module can be passed as
1679 an argument like this:
1680
1681 pp_deprecate_module( infavor => "PDL::NewNonDeprecatedModule" );
1682
Note that this function affects only the runtime warning and the POD.
1684
1686 Let's say that you have a function in your module called PDL::foo that
1687 uses the PP function "bar_pp" to do the heavy lifting. But you don't
1688 want to advertise that "bar_pp" exists. To do this, you must move your
1689 PP function to the top of your module file, then call
1690
1691 pp_export_nothing()
1692
1693 to clear the "EXPORT" list. To ensure that no documentation (even the
1694 default PP docs) is generated, set
1695
1696 Doc => undef
1697
1698 and to prevent the function from being added to the symbol table, set
1699
1700 PMFunc => ''
1701
1702 in your pp_def declaration (see Image2D.pd for an example). This will
1703 effectively make your PP function "private." However, it is always
1704 accessible via PDL::bar_pp due to Perl's module design. But making it
1705 private will cause the user to go very far out of his or her way to use
1706 it, so he or she shoulders the consequences!
1707
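Putting these pieces together, a minimal sketch of such a .pd fragment might look as follows. The names foo and bar_pp follow the text; the signature and the doubling code are placeholders chosen for illustration:

```perl
# Fragment of a .pd file; assumes PDL::PP is processing it.

# The "private" worker goes first, so that it is the only entry on the
# export list when that list is cleared.
pp_def('bar_pp',
    Pars   => 'a(); [o] b();',     # placeholder signature
    Doc    => undef,               # no documentation, not even the default
    PMFunc => '',                  # keep it out of the module's symbol table
    Code   => '$b() = 2 * $a();',  # placeholder for the heavy lifting
);

pp_export_nothing();               # clears the export list built so far

# Public wrapper, added as plain Perl to the generated .pm file.
pp_addpm(<<'EOPM');
sub PDL::foo {
    my ($in) = @_;
    return PDL::bar_pp($in);       # still reachable by its full name
}
EOPM
```

Any "pp_def" calls that follow "pp_export_nothing" would be exported as usual, which is why the private routine is defined first.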
The slice operation section of this manual is provided using dataflow
and lazy evaluation: when you need it, ask Tjl to write it. Delivery
within a week of my receiving the email is 95% probable, and delivery
within two weeks is 99% probable.
1713
1714 And anyway, the slice operations require a much more intimate knowledge
1715 of PDL internals than the data operations. Furthermore, the complexity
1716 of the issues involved is considerably higher than that in the average
1717 data operation. If you would like to convince yourself of this fact
1718 take a look at the Basic/Slices/slices.pd file in the PDL distribution
1719 :-). Nevertheless, functions generated using the slice operations are
1720 at the heart of the index manipulation and dataflow capabilities of
1721 PDL.
1722
1723 Also, there are a lot of dirty issues with virtual ndarrays and
1724 vaffines which we shall entirely skip here.
1725
1726 Slices and bad values
1727 Slice operations need to be able to handle bad values. The easiest
1728 thing to do is look at Basic/Slices/slices.pd to see how this works.
1729
1730 Along with "BadCode", there are also the "BadBackCode" and
1731 "BadRedoDimsCode" keys for "pp_def". However, any "EquivCPOffsCode"
1732 should not need changing, since any changes are absorbed into the
1733 definition of the "$EQUIVCPOFFS()" macro (i.e. it is handled
1734 automatically by PDL::PP).
1735
1736 A few notes on writing a slicing routine...
1737 The following few paragraphs describe writing of a new slicing routine
1738 ('range'); any errors are CED's. (--CED 26-Aug-2002)
1739
For printing warning messages or aborting/dying, you can call "warn"
or "barf" from PP code. However, you should be aware that these calls
have been redefined using C preprocessor macros to "PDL->warn" and
"PDL->barf". These redefinitions are in place to keep you from
inadvertently calling Perl's "warn" or "barf" directly, which can cause
segfaults during pthreading (i.e. processor multi-threading).

PDL's own versions of "barf" and "warn" will queue up warning or barf
messages until after pthreading is completed, and then call the Perl
versions of these routines.
1751
1752 See PDL::ParallelCPU for more information on pthreading.
1753
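As an illustration, a "pp_def" fragment that aborts on invalid input might be sketched like this. The routine name reciprocal and its signature are hypothetical; the call to "barf" is the one described above:

```perl
# Fragment of a .pd file; assumes PDL::PP is processing it.
pp_def('reciprocal',               # hypothetical routine name
    Pars         => 'a(); [o] b();',
    GenericTypes => ['F', 'D'],
    Code         => q{
        if ($a() == 0)
            barf("reciprocal: cannot invert zero");  /* PDL's barf, not Perl's */
        $b() = 1 / $a();
    },
);
```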
1755 The PDL "Core" structure, defined in Basic/Core/pdlcore.h.PL, contains
1756 pointers to a number of routines that may be useful to you. The
1757 majority of these routines deal with manipulating ndarrays, but some
1758 are more general:
1759
1760 PDL->qsort_B( PDL_Byte *xx, PDL_Indx a, PDL_Indx b )
1761 Sort the array "xx" between the indices "a" and "b". There are
1762 also versions for the other PDL datatypes, with postfix "_S", "_U",
1763 "_L", "_N", "_Q", "_F", and "_D". Any module using this must
1764 ensure that "PDL::Ufunc" is loaded.
1765
1766 PDL->qsort_ind_B( PDL_Byte *xx, PDL_Indx *ix, PDL_Indx a, PDL_Indx b )
1767 As for "PDL->qsort_B", but this time sorting the indices rather
1768 than the data.
1769
1770 The routine "med2d" in Lib/Image2D/image2d.pd shows how such routines
1771 are used.
1772
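A sketch of how one of these routines might be called from a Code section follows. The routine name sorted_copy is hypothetical. The fragment assumes that the index arguments are inclusive bounds and that the output data for b is stored contiguously along n, so that the data pointer can be handed to the sort directly:

```perl
# Fragment of a .pd file; assumes PDL::PP is processing it.
pp_addpm("use PDL::Ufunc;\n");     # the qsort routines need PDL::Ufunc loaded

pp_def('sorted_copy',              # hypothetical routine name
    Pars         => 'a(n); [o] b(n);',
    GenericTypes => ['D'],         # so the "_D" variant is the right one
    Code         => q{
        loop(n) %{ $b() = $a(); %}                /* copy input to output */
        PDL->qsort_D($P(b), 0, $SIZE(n) - 1);     /* sort the copy in place */
    },
);
```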
1774 If you are going to generate a package from your PP file (typical file
1775 extensions are ".pd" or ".pp" for the files containing PP code) it is
1776 easiest and safest to leave generation of the appropriate commands to
1777 the Makefile. In the following we will outline the typical format of a
1778 Perl Makefile to automatically build and install your package from a
1779 description in a PP file. Most of the rules to build the xs, pm and
1780 other required files from the PP file are already predefined in the
1781 PDL::Core::Dev package. We just have to tell MakeMaker to use it.
1782
1783 In most cases you can define your Makefile like
1784
    # Makefile.PL for a package defined by PP code.

    use strict;
    use warnings;
    use PDL::Core::Dev;            # Pick up development utilities
    use ExtUtils::MakeMaker;

    # [PP source file, prefix for generated files, full package name]
    my $package = ["mylib.pd", "Mylib", "PDL::Lib::Mylib"];
    my %hash = pdlpp_stdargs($package);
    $hash{OBJECT} .= ' additional_Ccode$(OBJ_EXT) ';
    $hash{clean}->{FILES} .= ' todelete_Ccode$(OBJ_EXT) ';
    WriteMakefile(%hash);

    sub MY::postamble { pdlpp_postamble($package); }
1797
Here, the list in $package contains, in order: the PP source file
name, the prefix for the generated files, and the full package name.
You can modify the hash in whatever way you like, but it would be
reasonable to stay within some limits so that your package will
continue to work with later versions of PDL.
1803
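For instance, a package that links against an external C library might adjust the returned hash before calling WriteMakefile. The library name mylib and the /usr/local paths below are placeholders, not values PDL supplies:

```perl
# Sketch of a Makefile.PL; needs PDL::Core::Dev and ExtUtils::MakeMaker.
use strict;
use warnings;
use PDL::Core::Dev;
use ExtUtils::MakeMaker;

my $package = ["mylib.pd", "Mylib", "PDL::Lib::Mylib"];
my %hash    = pdlpp_stdargs($package);

$hash{LIBS}  = ['-L/usr/local/lib -lmylib'];   # placeholder link directives
$hash{INC}  .= ' -I/usr/local/include';        # placeholder include directory

WriteMakefile(%hash);

sub MY::postamble { pdlpp_postamble($package); }
```

Only keys that "pdlpp_stdargs" already fills in are touched here, which keeps the file close to the defaults.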
1804 If you don't want to use prepackaged arguments, here is a generic
1805 Makefile.PL that you can adapt for your own needs:
1806
    # Makefile.PL for a package defined by PP code.

    use PDL::Core::Dev;            # Pick up development utilities
    use ExtUtils::MakeMaker;

    WriteMakefile(
        'NAME'         => 'PDL::Lib::Mylib',
        'VERSION_FROM' => 'mylib.pd',
        'TYPEMAPS'     => [&PDL_TYPEMAP()],
        'OBJECT'       => 'Mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)',
        'PM'           => { 'Mylib.pm' => '$(INST_LIBDIR)/Mylib.pm' },
        'INC'          => &PDL_INCLUDE(), # add include dirs as required by your lib
        'LIBS'         => [''],           # add link directives as necessary
        'clean'        => { 'FILES' =>
            'Mylib.pm Mylib.xs Mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)' },
    );

    # Add genpp rule; this will invoke PDL::PP on our PP file.
    # The argument is an array reference where the array has three string elements:
    #   arg1: name of the source file that contains the PP code
    #   arg2: basename of the xs and pm files to be generated
    #   arg3: name of the package that is to be generated
    sub MY::postamble { pdlpp_postamble(["mylib.pd", "Mylib", "PDL::Lib::Mylib"]); }
1831
1832 To make life even easier PDL::Core::Dev defines the function
1833 "pdlpp_stdargs" that returns a hash with default values that can be
1834 passed (either directly or after appropriate modification) to a call to
1835 WriteMakefile. Currently, "pdlpp_stdargs" returns a hash where the
1836 keys are filled in as follows:
1837
1838 (
1839 'NAME' => $mod,
1840 'TYPEMAPS' => [&PDL_TYPEMAP()],
1841 'OBJECT' => "$pref\$(OBJ_EXT)",
1842 PM => {"$pref.pm" => "\$(INST_LIBDIR)/$pref.pm"},
1843 MAN3PODS => {"$src" => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)"},
1844 'INC' => &PDL_INCLUDE(),
1845 'LIBS' => [''],
1846 'clean' => {'FILES' => "$pref.xs $pref.pm $pref\$(OBJ_EXT)"},
1847 )
1848
1849 Here, $src is the name of the source file with PP code, $pref the
1850 prefix for the generated .pm and .xs files and $mod the name of the
1851 extension module to generate.
1852
1854 The internals of the current version consist of a large table which
1855 gives the rules according to which things are translated and the subs
1856 which implement these rules.
1857
1858 Later on, it would be good to make the table modifiable by the user so
1859 that different things may be tried.
1860
[Meta comment: hopefully there will be more here in the future;
currently, your best bet is to read the source code :-( or ask on the
list (try the latter first).]
1864
1866 Unless otherwise specified, the arguments are strings.
1867
1868 Pars
1869 define the signature of your function
1870
1871 OtherPars
arguments which are not pdls. Default: nothing. This is a
semicolon-separated list of arguments, e.g. "OtherPars=>'int k; double
value; char* fd'". See the "$COMP(x)" entry in Appendix B.
1876
1877 Code
1878 the actual code that implements the functionality; several PP
1879 macros and PP functions are recognised in the string value
1880
1881 HandleBad
If set to 1, the routine is assumed to support bad values and the
code in the BadCode key is used if bad values are present; it also
sets things up so that the "$ISBAD()" etc. macros can be used. If
set to 0, causes the routine to print a warning if any of the input
ndarrays have their bad flag set.
1887
1888 BadCode
1889 Give the code to be used if bad values may be present in the input
1890 ndarrays. Only used if "HandleBad => 1".
1891
1892 GenericTypes
1893 An array reference. The array may contain any subset of the one-
1894 character strings given below, which specify which types your
1895 operation will accept. The meaning of each type is:
1896
    B - unsigned byte (i.e. unsigned char)
    S - signed short (two-byte integer)
    U - unsigned short
    L - signed long (four-byte integer, int on 32 bit systems)
    N - signed integer for indexing ndarray elements (platform & Perl-dependent size)
    Q - signed long long (eight byte integer)
    F - float
    D - double
    G - complex float
    C - complex double
1907
1908 This is very useful (and important!) when interfacing an external
1909 library. Default: [qw/B S U L N Q F D/]
1910
1911 Inplace
1912 Mark a function as being able to work inplace.
1913
1914 Inplace => 1 if Pars => 'a(); [o]b();'
1915 Inplace => ['a'] if Pars => 'a(); b(); [o]c();'
1916 Inplace => ['a','b'] if Pars => 'a(); b(); [o]c(); [o]d();'
1917
1918 If bad values are being used, care must be taken to ensure the
1919 propagation of the badflag when inplace is being used; for instance
1920 see the code for "replacebad" in Basic/Bad/bad.pd.
1921
1922 Doc Used to specify a documentation string in Pod format. See PDL::Doc
1923 for information on PDL documentation conventions. Note: in the
1924 special case where the PP 'Doc' string is one line this is
1925 implicitly used for the quick reference AND the documentation!
1926
1927 If the Doc field is omitted PP will generate default documentation
1928 (after all it knows about the Signature).
1929
1930 If you really want the function NOT to be documented in any way at
1931 this point (e.g. for an internal routine, or because you are doing
1932 it elsewhere in the code) explicitly specify "Doc=>undef".
1933
1934 BadDoc
1935 Contains the text returned by the "badinfo" command (in "perldl")
1936 or the "-b" switch to the "pdldoc" shell script. In many cases, you
1937 will not need to specify this, since the information can be
1938 automatically created by PDL::PP. However, as befits computer-
1939 generated text, it's rather stilted; it may be much better to do it
1940 yourself!
1941
1942 NoPthread
1943 Optional flag to indicate the PDL function should not use processor
1944 threads (i.e. pthreads or POSIX threads) to split up work across
1945 multiple CPU cores. This option is typically set to 1 if the
1946 underlying PDL function is not threadsafe. If this option isn't
1947 present, then the function is assumed to be threadsafe. This option
1948 only applies if PDL has been compiled with POSIX threads enabled.
1949
1950 PMCode
1951 pp_def('funcname',
1952 Pars => 'a(); [o] b();',
1953 PMCode => 'sub PDL::funcname {
1954 return PDL::_funcname_int(@_) if @_ == 2; # output arg "b" supplied
1955 PDL::_funcname_int(@_, my $out = PDL->null);
1956 $out;
1957 }',
1958 # ...
1959 );
1960
1961 PDL functions allow "[o]" ndarray arguments into which you want the
1962 output saved. This is handy because you can allocate an output
1963 ndarray once and reuse it many times; the alternative would be for
1964 PDL to create a new ndarray each time, which may waste compute
1965 cycles or, more likely, RAM.
1966
1967 PDL functions check the number of arguments they are given, and
1968 call "croak" if given the wrong number. By default (with no
1969 "PMCode" supplied), any output arguments may be omitted, and
1970 PDL::PP provides code that can handle this by creating "null"
1971 objects, passing them to your code, then returning them on the
1972 stack.
1973
1974 If you do supply "PMCode", the rest of PDL::PP assumes it will be a
1975 string that defines a Perl function with the function's name in the
1976 "pp_bless" package ("PDL" by default). As the example implies, the
1977 PP-generated function name will change from "<funcname>", to
1978 "_<funcname>_int". As also shown above, you will need to supply all
1979 ndarrays in the exact order specified in the signature: output
1980 ndarrays are not optional, and the PP-generated function will not
1981 return anything.
1982
1983 PMFunc
1984 When pp_def generates functions, it typically defines them in the
1985 PDL package. Then, in the .pm file that it generates for your
1986 module, it typically adds a line that essentially copies that
1987 function into your current package's symbol table with code that
1988 looks like this:
1989
1990 *func_name = \&PDL::func_name;
1991
1992 It's a little bit smarter than that (it knows when to wrap that
1993 sort of thing in a BEGIN block, for example, and if you specified
1994 something different for pp_bless), but that's the gist of it. If
1995 you don't care to import the function into your current package's
1996 symbol table, you can specify
1997
1998 PMFunc => '',
1999
2000 PMFunc has no other side-effects, so you could use it to insert
2001 arbitrary Perl code into your module if you like. However, you
2002 should use pp_addpm if you want to add Perl code to your module.
2003
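To make some of these keys concrete, here is a hedged sketch of two "pp_def" calls as they might appear in a .pd file. The routine names scale_by and sum_good are hypothetical. The first reads a non-ndarray argument through "$COMP()"; the second supplies separate Code and BadCode bodies:

```perl
# Fragment of a .pd file; assumes PDL::PP is processing it.

# OtherPars: a non-ndarray argument, read in C through the COMP macro.
pp_def('scale_by',                 # hypothetical routine name
    Pars      => 'a(); [o] b();',
    OtherPars => 'double factor;',
    Code      => '$b() = $a() * $COMP(factor);',
);

# HandleBad/BadCode: skip bad elements when summing over dimension n.
pp_def('sum_good',                 # hypothetical routine name
    Pars         => 'a(n); [o] b();',
    GenericTypes => ['F', 'D'],
    HandleBad    => 1,
    Code         => q{
        $GENERIC() tmp = 0;
        loop(n) %{ tmp += $a(); %}
        $b() = tmp;
    },
    BadCode      => q{
        $GENERIC() tmp = 0;
        loop(n) %{
            if ($ISGOOD(a())) tmp += $a();   /* ignore bad elements */
        %}
        $b() = tmp;
    },
);
```

A fuller routine would probably also decide what to return when every element is bad; that case is left out of this sketch.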
2005 Macros
2006 $variablename_from_sig()
2007 access a pdl (by its name) that was specified in the signature
2008
2009 $COMP(x)
2010 access a value in the private data structure of this
2011 transformation (mainly used to use an argument that is specified
2012 in the "OtherPars" section)
2013
2014 $SIZE(n)
2015 replaced at runtime by the actual size of a named dimension (as
2016 specified in the signature)
2017
2018 $GENERIC()
2019 replaced by the C type that is equal to the runtime type of the
2020 operation
2021
2022 $P(a) a pointer to the data of the PDL named "a" in the signature.
2023 Useful for interfacing to C functions
2024
2025 $PP(a) a physical pointer access to pdl "a"; mainly for internal use
2026
2027 $TXXX(Alternative,Alternative)
2028 expansion alternatives according to runtime type of operation,
2029 where XXX is some string that is matched by "/[BSULNQFD+]/".
2030
2031 $PDL(a)
2032 return a pointer to the pdl data structure (pdl *) of ndarray
2033 "a"
2034
2035 $ISBAD(a())
2036 returns true if the value stored in "a()" equals the bad value
2037 for this ndarray. Requires "HandleBad" being set to 1.
2038
2039 $ISGOOD(a())
2040 returns true if the value stored in "a()" does not equal the bad
2041 value for this ndarray. Requires "HandleBad" being set to 1.
2042
2043 $SETBAD(a())
2044 Sets "a()" to equal the bad value for this ndarray. Requires
2045 "HandleBad" being set to 1.
2046
2047 functions
2048 "loop(DIMS) %{ ... %}"
2049 loop over named dimensions; limits are generated automatically by PP
2050
2051 "threadloop %{ ... %}"
2052 enclose following code in a thread loop
2053
2054 "types(TYPES) %{ ... %}"
2055 execute following code if type of operation is any of "TYPES"
2056
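The following fragment sketches several of these macros and functions working together. The routine name mean_over is hypothetical, and the code assumes that the dimension n is non-empty:

```perl
# Fragment of a .pd file; assumes PDL::PP is processing it.
pp_def('mean_over',                # hypothetical routine name
    Pars         => 'a(n); [o] b();',
    GenericTypes => ['F', 'D'],
    Code         => q{
        $GENERIC() acc = 0;        /* float or double, matching the data */
        loop(n) %{
            acc += $a();           /* the index along n comes from the loop */
        %}
        $b() = acc / $SIZE(n);     /* divide by the length of dimension n */
    },
);
```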
2058 A number of functions are imported when you "use PDL::PP". These
2059 include functions that control the generated C or XS code, functions
2060 that control the generated Perl code, and functions that manipulate the
2061 packages and symbol tables into which the code is created.
2062
2063 Generating C and XS Code
2064 PDL::PP's main purpose is to make it easy for you to wrap the threading
2065 engine around your own C code, but you can do some other things, too.
2066
2067 pp_def
2068 Used to wrap the threading engine around your C code. Virtually all
2069 of this document discusses the use of pp_def.
2070
2071 pp_done
2072 Indicates you are done with PDL::PP and that it should generate its
2073 .xs and .pm files based upon the other pp_* functions that you have
2074 called. This function takes no arguments.
2075
2076 pp_addxs
2077 This lets you add XS code to your .xs file. This is useful if you
2078 want to create Perl-accessible functions that invoke C code but
2079 cannot or should not invoke the threading engine. XS is the
2080 standard means by which you wrap Perl-accessible C code. You can
2081 learn more at perlxs.
2082
2083 pp_add_boot
2084 This function adds whatever string you pass to the XS BOOT section.
2085 The BOOT section is C code that gets called by Perl when your
2086 module is loaded and is useful for automatic initialization. You
2087 can learn more about XS and the BOOT section at perlxs.
2088
2089 pp_addhdr
2090 Adds pure-C code to your XS file. XS files are structured such that
2091 pure C code must come before XS specifications. This allows you to
2092 specify such C code.
2093
2094 pp_boundscheck
PDL normally checks the bounds of your accesses before making them.
You can turn that on or off at runtime by setting
MyPackage::set_boundscheck. This function allows you to remove that
runtime flexibility and never do bounds checking. It also returns
the current bounds-checking status if called without any arguments.
2100
2101 NOTE: I have not found anything about bounds checking in other
2102 documentation. That needs to be addressed.
2103
2104 Generating Perl Code
2105 Many functions imported when you use PDL::PP allow you to modify the
2106 contents of the generated .pm file. In addition to pp_def and pp_done,
2107 the role of these functions is primarily to add code to various parts
2108 of your generated .pm file.
2109
2110 pp_addpm
2111 Adds Perl code to the generated .pm file. PDL::PP actually keeps
2112 track of three different sections of generated code: the Top, the
2113 Middle, and the Bottom. You can add Perl code to the Middle section
2114 using the one-argument form, where the argument is the Perl code
2115 you want to supply. In the two-argument form, the first argument is
2116 an anonymous hash with only one key that specifies where to put the
2117 second argument, which is the string that you want to add to the
2118 .pm file. The hash is one of these three:
2119
2120 {At => 'Top'}
2121 {At => 'Middle'}
2122 {At => 'Bot'}
2123
2124 For example:
2125
2126 pp_addpm({At => 'Bot'}, <<POD);
2127
2128 =head1 Some documentation
2129
2130 I know I'm typing this in the middle of my file, but it'll go at
2131 the bottom.
2132
2133 =cut
2134
2135 POD
2136
2137 Warning: If, in the middle of your .pd file, you put documentation
2138 meant for the bottom of your pod, you will thoroughly confuse CPAN.
2139 On the other hand, if in the middle of your .pd file, you add some
2140 Perl code destined for the bottom or top of your .pm file, you only
2141 have yourself to confuse. :-)
2142
2143 pp_beginwrap
2144 Adds BEGIN-block wrapping. Certain declarations can be wrapped in
2145 BEGIN blocks, though the default behavior is to have no such
2146 wrapping.
2147
2148 pp_addbegin
2149 Sets code to be added to the top of your .pm file, even above code
2150 that you specify with "pp_addpm({At => 'Top'}, ...)". Unlike
2151 pp_addpm, calling this overwrites whatever was there before.
2152 Generally, you probably shouldn't use it.
2153
2154 Tracking Line Numbers
When you get compile errors, either from your C-like code or your Perl
code, it can help to map those errors back to the line numbers in the
source file at which the error occurred.
2158
2159 pp_line_numbers
2160 Takes a line number and a (usually long) string of code. The line
2161 number should indicate the line at which the quote begins. This is
2162 usually Perl's "__LINE__" literal, unless you are using heredocs,
2163 in which case it is "__LINE__ + 1". The returned string has #line
2164 directives interspersed to help the compiler report errors on the
2165 proper line.
2166
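A typical use might look like the sketch below, where the Code string is passed through "pp_line_numbers" before being handed to "pp_def". The routine name twice is hypothetical:

```perl
# Fragment of a .pd file; assumes PDL::PP is processing it.
pp_def('twice',                    # hypothetical routine name
    Pars => 'a(); [o] b();',
    # __LINE__ is the line on which the q{ quote opens
    Code => pp_line_numbers(__LINE__, q{
        $b() = 2 * $a();
    }),
);
```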
2167 Modifying the Symbol Table and Export Behavior
2168 PDL::PP usually exports all functions generated using pp_def, and
2169 usually installs them into the PDL symbol table. However, you can
2170 modify this behavior with these functions.
2171
2172 pp_bless
2173 Sets the package (symbol table) to which the XS code is added. The
2174 default is PDL, which is generally what you want. If you use the
2175 default blessing and you create a function myfunc, then you can do
2176 the following:
2177
2178 $ndarray->myfunc(<args>);
2179 PDL::myfunc($ndarray, <args>);
2180
2181 On the other hand, if you bless your functions into another
2182 package, you cannot invoke them as PDL methods, and must invoke
2183 them as:
2184
2185 MyPackage::myfunc($ndarray, <args>);
2186
2187 Of course, you could always use the PMFunc key to add your function
2188 to the PDL symbol table, but why do that?
2189
2190 pp_add_isa
2191 Adds to the list of modules from which your module inherits. The
2192 default list is
2193
2194 qw(PDL::Exporter DynaLoader)
2195
pp_core_importList
At the top of your generated .pm file is a line that looks like
this:

    use PDL::Core;

You can modify that by specifying a string to pp_core_importList.
For example,

    pp_core_importList('::Blarg');

will result in

    use PDL::Core::Blarg;

You can use this, for example, to add a list of symbols to import
from PDL::Core. For example:

    pp_core_importList(" ':Internal'");

will lead to the following use statement:

    use PDL::Core ':Internal';
2219
2220 pp_setversion
2221 Sets your module's version. The version must be consistent between
2222 the .xs and the .pm file, and is used to ensure that your Perl's
2223 libraries do not suffer from version skew.
2224
2225 pp_add_exported
2226 Adds to the export list whatever names you give it. Functions
2227 created using pp_def are automatically added to the list. This
2228 function is useful if you define any Perl functions using pp_addpm
2229 or pp_addxs that you want exported as well.
2230
2231 pp_export_nothing
2232 This resets the list of exported symbols to nothing. This is
2233 probably better called "pp_export_clear", since you can add
2234 exported symbols after calling "pp_export_nothing". When called
2235 just before calling pp_done, this ensures that your module does not
2236 export anything, for example, if you only want programmers to use
2237 your functions as methods.
2238
2240 PDL
2241
2242 For the concepts of threading and slicing check PDL::Indexing.
2243
2244 PDL::Internals
2245
2246 PDL::BadValues for information on bad values
2247
2248 perlxs, perlxstut
2249
2250 Practical Magick with C, PDL, and PDL::PP -- a guide to compiled add-
2251 ons for PDL <https://arxiv.org/abs/1702.07753>
2252
2254 Almost everything having to do with "Slice operation". This includes
2255 much of the following (each entry is followed by a guess/description of
2256 where it is used or defined):
2257
2258 MACROS
2259 $CDIM()
2260
2261 $CHILD()
2262 PDL::PP::Rule::Substitute::Usual
2263
2264 $CHILD_P()
2265 PDL::PP::Rule::Substitute::Usual
2266
2267 $CHILD_PTR()
2268 PDL::PP::Rule::Substitute::Usual
2269
2270 $COPYDIMS()
2271
2272 $COPYINDS()
2273
2274 $CROAK()
2275 PDL::PP::Rule::Substitute::dosubst_private()
2276
2277 $DOCOMPDIMS()
2278 Used in slices.pd, defined where?
2279
2280 $DOPRIVDIMS()
2281 Used in slices.pd, defined where?
2282 Code comes from PDL::PP::CType::get_malloc, which is called by
2283 PDL::PP::CType::get_copy, which is called by PDL::PP::CopyOtherPars,
2284 PDL::PP::NT2Copies__, and PDL::PP::make_incsize_copy. But none of
2285 those three at first glance seem to have anything to do with
2286 $DOPRIVDIMS
2287
2288 $EQUIVCPOFFS()
2289
2290 $EQUIVCPTRUNC()
2291
2292 $PARENT()
2293 PDL::PP::Rule::Substitute::Usual
2294
2295 $PARENT_P()
2296 PDL::PP::Rule::Substitute::Usual
2297
2298 $PARENT_PTR()
2299 PDL::PP::Rule::Substitute::Usual
2300
2301 $PDIM()
2302
2303 $PRIV()
2304 PDL::PP::Rule::Substitute::dosubst_private()
2305
2306 $RESIZE()
2307
2308 $SETDELTATHREADIDS()
2309 PDL::PP::Rule::MakeComp
2310
2311 $SETDIMS()
2312 PDL::PP::Rule::MakeComp
2313
2314 $SETNDIMS()
2315 PDL::PP::Rule::MakeComp
2316
2317 $SETREVERSIBLE()
2318 PDL::PP::Rule::Substitute::dosubst_private()
2319
2320 Keys
2321 AffinePriv
2322
2323 BackCode
2324
2325 BadBackCode
2326
2327 CallCopy
2328
2329 Comp (related to $COMP()?)
2330
2331 DefaultFlow
2332
2333 EquivCDimExpr
2334
2335 EquivCPOffsCode
2336
2337 EquivDimCheck
2338
2339 EquivPDimExpr
2340
2341 FTypes (see comment in this POD's source file between NoPthread and
2342 PMCode.)
2343
2344 GlobalNew
2345
2346 Identity
2347
2348 MakeComp
2349
2350 NoPdlThread
2351
2352 P2Child
2353
2354 ParentInds
2355
2356 Priv
2357
2358 ReadDataFuncName
2359
2360 RedoDims (related to RedoDimsCode ?)
2361
2362 Reversible
2363
2364 WriteBckDataFuncName
2365
2366 XCHGOnly
2367
2369 Although PDL::PP is quite flexible and thoroughly used, there are
2370 surely bugs. First amongst them: this documentation needs a thorough
2371 revision.
2372
Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), Karl
Glazebrook (kgb@aaocbn1.aao.GOV.AU) and Christian Soeller
(c.soeller@auckland.ac.nz). All rights reserved. Documentation updates
Copyright(C) 2011 David Mertens (dcmertens.perl@gmail.com). This
documentation is licensed under the same terms as Perl itself.
2379
2380
2381
2382perl v5.34.0 2021-08-16 PP(1)