INDEXING(1)          User Contributed Perl Documentation          INDEXING(1)

NAME

PDL::Indexing - Introduction to indexing and slicing piddles.

OVERVIEW

This man page should serve as a first tutorial on the indexing and
threading features of PDL.

Like all vectorized languages, PDL automates looping over multi-dimensional
data structures ("piddles") using a variant of mathematical vector
notation. The automatic looping is called "threading", in part because
ultimately PDL will implement parallel processing to speed up the loops.

A lot of the flexibility and power of PDL relies on the indexing and
threading features of the Perl extension. Indexing allows access to the
data of a piddle in a very flexible way. Threading provides efficient
vectorization of simple operations.

The values of a piddle are stored compactly as typed values in a single
block of memory, not (as in a normal Perl list-of-lists) as individual Perl
scalars.

In the sections that follow many "methods" are called out -- these are Perl
operators that apply to piddles. From the perldl (or pdl2) shell, you can
find out more about each method by typing "?" followed by the method name.

Dimension lists

A piddle (PDL variable), in general, is an N-dimensional array where N can
be 0 (for a scalar), 1 (e.g. for a sound sample), or higher values for
images and more complex structures. Each dimension of the piddle has a
positive integer size. The "perl" interpreter treats each piddle as a
special type of Perl scalar (a blessed Perl object, actually -- but you
don't have to know that to use them) that can be used anywhere you can put
a normal scalar.

You can access the dimensions of a piddle as a Perl list and otherwise
determine the size of a piddle with several methods. The important ones
are:

    nelem - the total number of elements in a piddle
    ndims - returns the number of dimensions in a piddle
    dims  - returns the dimension list of a piddle as a Perl list
    dim   - returns the size of a particular dimension of a piddle

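A quick perldl session might look like this (the piddle $x is just an
example):

    pdl> $x = zeroes(4,5,2)
    pdl> p $x->nelem           # 40  -- total number of elements
    pdl> p $x->ndims           # 3   -- number of dimensions
    pdl> p join ',', $x->dims  # 4,5,2
    pdl> p $x->dim(1)          # 5   -- size of dimension 1
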
Indexing and Dataflow

PDL maintains a notion of "dataflow" between a piddle and indexed subfields
of that piddle. When you produce an indexed subfield or single element of a
parent piddle, the child and parent remain attached until you manually
disconnect them. This lets you represent the same data different ways
within your code -- for example, you can consider an RGB image
simultaneously as a collection of (R,G,B) values in a 3 x 1000 x 1000
image, and as three separate 1000 x 1000 color planes stored in different
variables. Modifying any of the variables changes the underlying memory,
and the changes are reflected in all representations of the data.

There are two important methods that let you control dataflow connections
between a child and parent piddle:

    copy  - forces an explicit copy of a piddle
    sever - breaks the dataflow connection between a piddle and its
            parents (if any)

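A short sketch of the connection and of "sever" at the perldl prompt (the
variables here are hypothetical):

    pdl> $x = sequence(5)              # [0 1 2 3 4]
    pdl> $y = $x->slice('1:3')         # child, still attached to $x
    pdl> $y .= 0                       # writes through to the parent
    pdl> p $x
    [0 0 0 0 4]
    pdl> $z = $x->slice('1:3')->sever  # same slice, connection broken
    pdl> $z .= 9                       # this time $x stays unchanged
    pdl> p $x
    [0 0 0 0 4]
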
Threading and Dimension Order

Most PDL operations act on the first few dimensions of their piddle
arguments. For example, "sumover" sums all elements along the first
dimension in the list (dimension 0). If you feed in a three-dimensional
piddle, then the first dimension is considered the "active" dimension and
the later dimensions are "thread" dimensions because they are simply looped
over. There are several ways to transpose or re-order the dimension list of
a piddle. Those techniques are very fast since they don't touch the
underlying data, only change the way that PDL accesses the data. The main
dimension ordering functions are:

    mv      - moves a particular dimension somewhere else in the
              dimension list
    xchg    - exchanges two dimensions in the dimension list, leaving
              the rest alone
    reorder - allows wholesale mixing of the dimensions
    clump   - clumps together two or more small dimensions into one
              larger one
    squeeze - eliminates any dimensions of size 1

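Since none of these touch the underlying data, you can check their effect
quickly by printing the resulting dimension lists (again a hypothetical
session):

    pdl> $x = zeroes(3,4,5)
    pdl> p join ',', $x->mv(2,0)->dims        # 5,3,4
    pdl> p join ',', $x->xchg(0,1)->dims      # 4,3,5
    pdl> p join ',', $x->reorder(2,1,0)->dims # 5,4,3
    pdl> p join ',', $x->clump(2)->dims       # 12,5
    pdl> p join ',', zeroes(3,1,4)->squeeze->dims  # 3,4
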
Physical and Dummy Dimensions

[ this section is still to be written; the items below are notes for it ]

    · document Perl level threading

    · threadids

    · update and correct description of slice

    · new functions in slice.pd (affine, lag, splitdim)

    · reworking of paragraph on explicit threading

DESCRIPTION

A lot of the flexibility and power of PDL relies on the indexing and
looping features of the Perl extension. Indexing allows access to the data
of a piddle in a very flexible way. Threading provides efficient implicit
looping functionality (since the loops are implemented as optimized C
code).

Piddles are Perl objects that represent multidimensional arrays and
operations on those. In contrast to simple Perl @x style lists the array
data is compactly stored in a single block of memory, thus taking up a lot
less memory and enabling the use of fast C code to implement operations
(e.g. addition, etc.) on piddles.

piddles can have children

Central to many of the indexing capabilities of PDL is the relation of
"parent" and "child" between piddles. Many of the indexing commands create
a new piddle from an existing piddle. The new piddle is the "child" and the
old one is the "parent". The data of the new piddle is defined by a
transformation that specifies how to generate (compute) its data from the
parent's data. The relation between the child piddle and its parent is
often bidirectional, meaning that changes in the child's data are
propagated back to the parent. (Note: you see, our terminology is already
aiming towards the new dataflow features. The kind of dataflow that is used
by the indexing commands (about which you will learn in a minute) is always
in operation, not only when you have explicitly switched on dataflow in
your piddle by saying "$x->doflow". For further information about dataflow
check the dataflow man page.)

Another way to interpret the piddles created by our indexing commands is to
view them as a kind of intelligent pointer that points back to some portion
or all of its parent's data. Therefore, it is not surprising that the
parent's data (or a portion of it) changes when manipulated through this
"pointer". After these introductory remarks that hopefully prepared you for
what is coming (rather than confuse you too much) we are going to dive
right in and start with a description of the indexing commands and some
typical examples of how they might be used in PDL programs. We will further
illustrate the pointer/dataflow analogies in the context of some of the
examples later on.

There are two different implementations of this "smart pointer"
relationship. The first one, which is a little slower but works for any
transformation, is simply to do the transformation forwards and backwards
as necessary. The other is to consider the child piddle a "virtual" piddle,
which only stores a pointer to the parent and access information, so that
routines which use the child piddle actually directly access the data in
the parent. If the virtual piddle is given to a routine which cannot use
it, PDL transparently physicalizes the virtual piddle before letting the
routine use it.

Currently (1.94_01) all transformations which are "affine" (i.e. where the
indices of a data item in the parent piddle are determined by a linear
transformation, plus a constant, of the indices of the child piddle) result
in virtual piddles. All other indexing routines (e.g. "->index(...)")
result in physical piddles. All routines compiled by PP can accept affine
piddles (except those routines that pass pointers to external library
functions).

Note that whether something is affine or not does not affect the semantics
of what you do in any way: both

    $x->index(...) .= 5;
    $x->slice(...) .= 5;

change the data in $x. The affinity does, however, have a significant
impact on memory usage and performance.

Slicing piddles

Probably the most important application of the concept of parent/child
piddles is the representation of rectangular slices of a physical piddle by
a virtual piddle. Having talked long enough about concepts let's get more
specific. Suppose we are working with a 2D piddle representing a 5x5 image
(it's unusually small so that we can print it without filling several
screens full of digits ;).

    pdl> $im = sequence(5,5)
    pdl> p $im

    [
     [ 0  1  2  3  4]
     [ 5  6  7  8  9]
     [10 11 12 13 14]
     [15 16 17 18 19]
     [20 21 22 23 24]
    ]

    pdl> help vars
    PDL variables in package main::

    Name         Type   Dimension       Flow  State          Mem
    ----------------------------------------------------------------
    $im          Double D [5,5]                P            0.20Kb

[ here it might be appropriate to quickly talk about the "help vars"
command that provides information about piddles in the interactive
"perldl" or "pdl2" shell that comes with PDL. ]

Now suppose we want to create a 1-D piddle that just references one line of
the image, say line 2; or a piddle that represents all even lines of the
image (imagine we have to deal with even and odd frames of an interlaced
image due to some peculiar behaviour of our frame grabber). As another
frequent application of slices we might want to create a piddle that
represents a rectangular region of the image with top and bottom reversed.
All these effects (and many more) can be easily achieved with the powerful
slice function:

    pdl> $line = $im->slice(':,(2)')
    pdl> $even = $im->slice(':,1:-1:2')
    pdl> $area = $im->slice('3:4,3:1')
    pdl> help vars  # or just PDL->vars
    PDL variables in package main::

    Name         Type   Dimension       Flow  State          Mem
    ----------------------------------------------------------------
    $even        Double D [5,2]                -C           0.00Kb
    $im          Double D [5,5]                P            0.20Kb
    $line        Double D [5]                  -C           0.00Kb
    $area        Double D [2,3]                -C           0.00Kb

All three "child" piddles are children of $im or, in the other (largely
equivalent) interpretation, pointers to data of $im. Operations on those
virtual piddles access only those portions of the data specified by the
argument to slice. So we can just print line 2:

    pdl> p $line
    [10 11 12 13 14]

Also note the difference in the "Flow State" of $area above and below:

    pdl> p $area
    pdl> help $area
    This variable is Double D [2,3]            VC           0.00Kb

The following demonstrates that $im and $line really behave as you would
expect from a pointer-like object (or, in the dataflow picture: the changes
in $line's data are propagated back to $im):

    pdl> $im++
    pdl> p $line
    [11 12 13 14 15]
    pdl> $line += 2
    pdl> p $im

    [
     [ 1  2  3  4  5]
     [ 6  7  8  9 10]
     [13 14 15 16 17]
     [16 17 18 19 20]
     [21 22 23 24 25]
    ]

Note how assignment operations on the child virtual piddles change the
parent physical piddle and vice versa (however, the basic "=" assignment
doesn't; use ".=" to obtain that effect -- see below for the reasons). The
virtual child piddles are something like "live links" to the "original"
parent piddle. As previously said, they can be thought of as working
similarly to a C pointer. But in contrast to a C pointer they carry a lot
more information. Firstly, they specify the structure of the data they
represent (the dimensionality of the new piddle) and, secondly, they
specify how to create this structure from the parent's data. The way this
works is buried in the internals of PDL and is not important for you to
know anyway, unless you want to hack the core in the future or would like
to become a PDL guru in general (for a definition of this strange creature
see PDL::Internals).

The previous examples have demonstrated typical usage of the slice
function. Since the slicing functionality is so important here is an
explanation of the syntax for the string argument to slice:

    $vpdl = $x->slice('ind0,ind1...')

where "ind0" specifies what to do with index No 0 of the piddle $x, etc.
Each element of the comma separated list can have one of the following
forms:

':'   Use the whole dimension.

'n'   Use only index "n". The dimension of this index in the resulting
      virtual piddle is 1. An example involving those first two index
      formats:

          pdl> $column = $im->slice('2,:')
          pdl> $row = $im->slice(':,0')
          pdl> p $column

          [
           [ 3]
           [ 8]
           [15]
           [18]
           [23]
          ]

          pdl> p $row

          [
           [1 2 3 4 5]
          ]

          pdl> help $column
          This variable is Double D [1,5]            VC           0.00Kb

          pdl> help $row
          This variable is Double D [5,1]            VC           0.00Kb

'(n)' Use only index "n". This dimension is removed from the resulting
      piddle (relying on the fact that a dimension of size 1 can always be
      removed). The distinction between this case and the previous one
      becomes important in assignments where the left and right hand sides
      have to have appropriate dimensions.

          pdl> $line = $im->slice(':,(0)')
          pdl> help $line
          This variable is Double D [5]              -C           0.00Kb

          pdl> p $line
          [1 2 3 4 5]

      Spot the difference to the previous example?

'n1:n2' or 'n1:n2:n3'
      Take the range of indices from "n1" to "n2", or (second form) take
      the range of indices from "n1" to "n2" with step "n3". An example of
      the use of this format is the previous definition of the sub-image
      composed of even lines:

          pdl> $even = $im->slice(':,1:-1:2')

      This example also demonstrates that negative indices work like they
      do for normal Perl style arrays, by counting backwards from the end
      of the dimension. If "n2" is smaller than "n1" (in the example -1 is
      equivalent to index 4) the elements in the virtual piddle are
      effectively reversed with respect to its parent.

'*[n]'
      Add a dummy dimension. The size of this dimension will be 1 by
      default, or equal to "n" if the optional numerical argument is
      given.

      Now, this is really something a bit strange on first sight. What is
      a dummy dimension? A dummy dimension inserts a dimension where there
      wasn't one before. How is that done? Well, in the case of the new
      dimension having size 1 it can be easily explained by the way in
      which you can identify a vector (with "m" elements) with a "(1,m)"
      or "(m,1)" matrix. The same holds obviously for higher dimensional
      objects. More interesting is the case of a dummy dimension of size
      greater than one (e.g. "slice('*5,:')"). This works in the same way
      as a call to the dummy function creates a new dummy dimension. So
      read on and check its explanation below.

'([n1:n2[:n3]]=i)'
      [Not yet implemented ??????] With an argument like this you make
      generalised diagonals. The diagonal will be dimension no. "i" of the
      new output piddle and (if the optional part in brackets is
      specified) will extend along the range of indices specified of the
      respective parent piddle's dimension. In general an argument like
      this only makes sense if there are other arguments like this in the
      same call to slice. The part in brackets is optional for this type
      of argument. All arguments of this type that specify the same target
      dimension "i" have to relate to the same number of indices in their
      parent dimension. The best way to explain it is probably to give an
      example; here we make a piddle that refers to the elements along the
      space diagonal of its parent piddle (a cube):

          $cube = zeroes(5,5,5);
          $sdiag = $cube->slice('(=0),(=0),(=0)');

      The above command creates a virtual piddle that represents the
      diagonal along the parent's dimensions no. 0, 1 and 2 and makes it
      dimension 0 (the only dimension) of the new piddle. You use the
      extended syntax if the parent dimensions you want to build the
      diagonal from have different sizes or you want to reverse the
      sequence of elements in the diagonal, e.g.

          $rect = zeroes(12,3,5,6,2);
          $vpdl = $rect->slice('2:7,(0:1=1),(4),(5:4=1),(=1)');

      So the elements of $vpdl will then be related to those of its parent
      in a way we can express as:

          vpdl(i,j) = rect(i+2,j,4,5-j,j)    0<=i<6, 0<=j<2

[ work in the new index function: "$y = $x->index($c);" ???? ]

There are different kinds of assignments in PDL

The previous examples have already shown that virtual piddles can be used
to operate on or access portions of data of a parent piddle. They can also
be used as lvalues in assignments (as the use of "++" in some of the
examples above has already demonstrated). For explicit assignments to the
data represented by a virtual piddle you have to use the overloaded ".="
operator (which in this context we call propagated assignment). Why can't
you use the normal assignment operator "="?

Well, you definitely still can use the '=' operator, but it wouldn't do
what you want. This is due to the fact that the '=' operator cannot be
overloaded in the same way as other assignment operators. If we tried to
use '=' to assign data to a portion of a physical piddle through a virtual
piddle we wouldn't achieve the desired effect: the variable representing
the virtual piddle (a reference to a blessed thingy) would after the
assignment just contain the reference to another blessed thingy, which
would behave to future assignments as a "physical" copy of the original
rvalue [this is actually not yet clear and the subject of discussions in
the PDL developers mailing list]. In that sense it would break the
connection of the piddle to the parent [ isn't this behaviour in a sense
the opposite of what happens in dataflow, where ".=" breaks the connection
to the parent? ].

E.g.

    pdl> $line = $im->slice(':,(2)')
    pdl> $line = zeroes(5);
    pdl> $line++;
    pdl> p $im

    [
     [ 1  2  3  4  5]
     [ 6  7  8  9 10]
     [13 14 15 16 17]
     [16 17 18 19 20]
     [21 22 23 24 25]
    ]

    pdl> p $line
    [1 1 1 1 1]

But using ".="

    pdl> $line = $im->slice(':,(2)')
    pdl> $line .= zeroes(5)
    pdl> $line++
    pdl> p $im

    [
     [ 1  2  3  4  5]
     [ 6  7  8  9 10]
     [ 1  1  1  1  1]
     [16 17 18 19 20]
     [21 22 23 24 25]
    ]

    pdl> print $line
    [1 1 1 1 1]

Also, you can substitute

    pdl> $line .= 0;

for the assignment above (the zero is converted to a scalar piddle with no
dimensions, so it can be assigned to any piddle).

A nice feature in recent perl versions is lvalue subroutines (i.e.,
versions 5.6.x and higher, including all perls currently supported by PDL).
They allow one to use the slicing syntax on both sides of the assignment:

    pdl> $im->slice(':,(2)') .= zeroes(5)->xvals->float

Related to the lvalue sub assignment feature is a little trap for the
unwary: recent perls introduced a "feature" which breaks PDL's use of
lvalue subs for slice assignments when running under the perl debugger,
"perl -d". Under the debugger, the above usage gives an error like:
"Can't return a temporary from lvalue subroutine...". So you must use
syntax like this:

    pdl> ($pdl = $im->slice(':,(2)')) .= zeroes(5)->xvals->float

which works both with and without the debugger but is arguably clumsy and
awkward to read.

Note that there can be a problem with assignments like this when lvalue and
rvalue piddles refer to overlapping portions of data in the parent piddle:

    # reverse the elements of line 1 of $x
    ($tmp = $x->slice(':,(1)')) .= $x->slice('-1:0,(1)');

Currently, the parent data on the right side of the assignment is not
copied before the (internal) assignment loop proceeds. Therefore, the
outcome of this assignment will depend on the sequence in which elements
are assigned and almost certainly not do what you wanted. So the semantics
are currently undefined and liable to change at any time. To obtain the
desired behaviour, use

    ($tmp = $x->slice(':,(1)')) .= $x->slice('-1:0,(1)')->copy;

which makes a physical copy of the slice, or

    ($tmp = $x->slice(':,(1)')) .= $x->slice('-1:0,(1)')->sever;

which returns the same slice but severs the connection of the slice to its
parent.

Other functions that manipulate dimensions

Having talked extensively about the slice function it should be noted that
it is not the only PDL indexing function. There are additional indexing
functions which are also useful (especially in the context of threading,
which we will talk about later). Here is a list and some examples of how to
use them.

499 "dummy"
500 inserts a dummy dimension of the size you specify (default 1) at
501 the chosen location. You can't wait to hear how that is achieved?
502 Well, all elements with index "(X,x,Y)" ("0<=x<size_of_dummy_dim")
503 just map to the element with index "(X,Y)" of the parent piddle
504 (where "X" and "Y" refer to the group of indices before and after
505 the location where the dummy dimension was inserted.)
506
507 This example calculates the x coordinate of the centroid of an
508 image (later we will learn that we didn't actually need the dummy
509 dimension thanks to the magic of implicit threading; but using
510 dummy dimensions the code would also work in a thread-less world;
511 though once you have worked with PDL threads you wouldn't want to
512 live without them again).
513
514 # centroid
515 ($xd,$yd) = $im->dims;
516 $xc = sum($im*xvals(zeroes($xd))->dummy(1,$yd))/sum($im);
517
518 Let's explain how that works in a little more detail. First, the
519 product:
520
521 $xvs = xvals(zeroes($xd));
522 print $xvs->dummy(1,$yd); # repeat the line $yd times
523 $prod = $im*xvs->dummy(1,$yd); # form the pixel-wise product with
524 # the repeated line of x-values
525
526 The rest is then summing the results of the pixel-wise product
527 together and normalizing with the sum of all pixel values in the
528 original image thereby calculating the x-coordinate of the "center
529 of mass" of the image (interpreting pixel values as local mass)
530 which is known as the centroid of an image.
531
    Next is a (from the point of view of memory consumption) very cheap
    conversion from grey-scale to RGB, i.e. every pixel now holds a triple
    of values instead of a scalar. The three values in the triple are,
    fortunately, all the same for a grey image, so that our trick works
    well in that it maps all three members of the triple to the same
    source element:

        # a cheap grey-scale to RGB conversion
        $rgb = $grey->dummy(0,3)

    Unfortunately this trick cannot be used to convert your old B/W photos
    to color ones in the way you'd like. :(

    Note that the memory usage of piddles with dummy dimensions is
    especially sensitive to the internal representation. If the piddle can
    be represented as a virtual affine ("vaffine") piddle, only the
    control structures are stored. But if $y in

        $x = zeroes(10000);
        $y = $x->dummy(1,10000);

    is made physical by some routine, you will find that the memory usage
    of your program has suddenly grown by some 800Mb (10000 x 10000
    doubles).

556 "diagonal"
557 replaces two dimensions (which have to be of equal size) by one
558 dimension that references all the elements along the "diagonal"
559 along those two dimensions. Here, we have two examples which should
560 appear familiar to anyone who has ever done some linear algebra.
561 Firstly, make a unity matrix:
562
563 # unity matrix
564 $e = zeroes(float, 3, 3); # make everything zero
565 ($tmp = $e->diagonal(0,1)) .= 1; # set the elements along the diagonal to 1
566 print $e;
567
568 Or the other diagonal:
569
570 ($tmp = $e->slice(':-1:0')->diagonal(0,1)) .= 2;
571 print $e;
572
573 (Did you notice how we used the slice function to revert the
574 sequence of lines before setting the diagonal of the new child,
575 thereby setting the cross diagonal of the parent ?) Or a mapping
576 from the space of diagonal matrices to the field over which the
577 matrices are defined, the trace of a matrix:
578
579 # trace of a matrix
580 $trace = sum($mat->diagonal(0,1)); # sum all the diagonal elements
581
582 "xchg" and "mv"
583 xchg exchanges or "transposes" the two specified dimensions. A
584 straightforward example:
585
586 # transpose a matrix (without explicitly reshuffling data and
587 # making a copy)
588 $prod = $x x $x->xchg(0,1);
589
590 $prod should now be pretty close to the unity matrix if $x is an
591 orthogonal matrix. Often "xchg" will be used in the context of
592 threading but more about that later.
593
594 mv works in a similar fashion. It moves a dimension (specified by
595 its number in the parent) to a new position in the new child
596 piddle:
597
598 $y = $x->mv(4,0); # make the 5th dimension of $x the first in the
599 # new child $y
600
601 The difference between "xchg" and "mv" is that "xchg" only changes
602 the position of two dimensions with each other, whereas "mv"
603 inserts the first dimension to the place of second, moving the
604 other dimensions around accordingly.
605
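    A dimension-list check makes the difference obvious (a hypothetical
    example):

        pdl> $x = zeroes(2,3,4,5)
        pdl> p join ',', $x->xchg(1,3)->dims  # 2,5,4,3 -- dims 1 and 3 swap
        pdl> p join ',', $x->mv(1,3)->dims    # 2,4,5,3 -- dims 2 and 3
                                              # shift down to fill the gap
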
606 "clump"
607 collapses several dimensions into one. Its only argument specifies
608 how many dimensions of the source piddle should be collapsed
609 (starting from the first). An (admittedly unrealistic) example is a
610 3D piddle which holds data from a stack of image files that you
611 have just read in. However, the data from each image really
612 represents a 1D time series and has only been arranged that way
613 because it was digitized with a frame grabber. So to have it again
614 as an array of time sequences you say
615
616 pdl> $seqs = $stack->clump(2)
617 pdl> help vars
618 PDL variables in package main::
619
620 Name Type Dimension Flow State Mem
621 ----------------------------------------------------------------
622 $seqs Double D [8000,50] -C 0.00Kb
623 $stack Double D [100,80,50] P 3.05Mb
624
625 Unrealistic as it may seem, our confocal microscope software writes
626 data (sometimes) this way. But more often you use clump to achieve
627 a certain effect when using implicit or explicit threading.
628
Calls to indexing functions can be chained

As you might have noticed in some of the examples above, calls to the
indexing functions can be nicely chained since all of these functions
return a newly created child object. However, when doing extensive index
manipulations in a chain be sure to keep track of what you are doing, e.g.

    $x->xchg(0,1)->mv(0,4)

moves dimension 1 of $x to position 4, since when the second command is
executed the original dimension 1 has been moved to position 0 of the new
child that calls the "mv" function. I think you get the idea (in spite of
my convoluted explanations).

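You can verify the net effect of such a chain by printing the dimension
list (the sizes below are arbitrary):

    pdl> $x = zeroes(2,3,4,5,6)
    pdl> p join ',', $x->xchg(0,1)->mv(0,4)->dims
    2,4,5,6,3

The size-3 dimension (dimension 1 of the original $x) has indeed ended up
at position 4.
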
Propagated assignments ('.=') and dummy dimensions

A subtlety related to indexing is assignment to piddles containing dummy
dimensions of size greater than 1. These assignments (using ".=") are
forbidden since several elements of the lvalue piddle point to the same
element of the parent. As a consequence the values of those parent elements
would be potentially ambiguous and would depend on the sequence in which
the implementation makes the assignments to elements. Therefore, an
assignment like this:

    $x = pdl [1,2,3];
    $y = $x->dummy(1,4);
    $y .= yvals(zeroes(3,4));

can produce unexpected results and the results are explicitly undefined by
PDL, because when PDL gets parallel computing features, the current result
may well change.

From the point of view of dataflow, the introduction of greater-size-than-
one dummy dimensions is regarded as an irreversible transformation (similar
to the terminology in thermodynamics) which precludes backward propagation
of assignments to a parent (which you had explicitly requested using the
".=" assignment). A similar problem to watch out for occurs in the context
of threading, where sometimes dummy dimensions are created implicitly
during the thread loop (see below).

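If you only need the expanded values, one way around the restriction is to
make a physical copy first; the copy is detached from the parent, so the
ambiguous write-back never happens (a sketch, reusing the piddles from
above):

    $x = pdl [1,2,3];
    $y = $x->dummy(1,4)->copy;  # physical (3,4) piddle, detached from $x
    $y .= yvals(zeroes(3,4));   # fine: $x is not touched
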
Reasons for the parent/child (or "pointer") concept

[ this will have to wait a bit ]

    XXXXX being memory efficient
    XXXXX in the context of threading
    XXXXX very flexible and powerful way of accessing portions of piddle
          data (in a much more general way than sec, etc allow)
    XXXXX efficient implementation
    XXXXX difference to section/at, etc.

How to make things physical again

[ XXXXX fill in later when everything has settled a bit more ]

    ** When needed (xsub routine interfacing C lib function)
    ** How achieved (->physical)
    ** How to test (isphysical (explain how it works currently))
    ** ->copy and ->sever

THREADING

In the previous paragraphs on indexing we have already mentioned the term
occasionally, but now it's really time to talk explicitly about "threading"
with piddles. The term threading has many different meanings in different
fields of computing. Within the framework of PDL it could probably be
loosely defined as an implicit looping facility. It is implicit because you
don't specify anything like enclosing for-loops; rather, the loops are
automatically (or 'magically') generated by PDL based on the dimensions of
the piddles involved. This should give you a first idea why the
index/dimension manipulating functions you have met in the previous
paragraphs are especially important and useful in the context of threading.
The other ingredient for threading (apart from the piddles involved) is a
function that is threading aware (generally, these are PDL::PP compiled
functions) and that the piddles are "threaded" over. So much for the
terminology; now let's try to shed some light on what it all means.

Implicit threading - a first example

There are two slightly different variants of threading. We start with what
we call "implicit threading". Let's pick a practical example that involves
looping of a function over many elements of a piddle. Suppose we have an
RGB image that we want to convert to grey-scale. The RGB image is
represented by a 3-dim piddle "im(3,x,y)" where the first dimension
contains the three color components of each pixel and "x" and "y" are
width and height of the image, respectively. Next we need to specify how
to convert a color-triple at a given pixel into a grey-value (to be a
realistic example it should represent the relative intensity with which
our color insensitive eye cells would detect that color, to achieve what
we would call a natural conversion from color to grey-scale). An
approximation that works quite well is to compute the grey intensity from
each RGB triplet (r,g,b) as a weighted sum

    grey-value = 77/256*r + 150/256*g + 29/256*b
               = inner(pdl(77,150,29)/256, pdl(r,g,b))

where the last form indicates that we can write this as an inner product
of the 3-vector comprising the weights for red, green and blue components
with the 3-vector containing the color components. Traditionally, we might
have written a function like the following to process the whole image:

    my @dims=$im->dims;
    # here normally check that first dim has correct size (3), etc
    $grey=zeroes(@dims[1,2]); # make the piddle for the resulting grey image
    $w = pdl [77,150,29] / 256; # the vector of weights
    for ($j=0;$j<$dims[2];$j++) {
       for ($i=0;$i<$dims[1];$i++) {
           # compute the pixel value
           $tmp = inner($w,$im->slice(":,($i),($j)"));
           set($grey,$i,$j,$tmp); # and set it in the grey-scale image
       }
    }

Now we write the same using threading (noting that "inner" is a threading
aware function defined in the PDL::Primitive package):

    $grey = inner($im,pdl(77,150,29)/256);

We have ended up with a one-liner that automatically creates the piddle
$grey with the right number and size of dimensions and performs the loops
automatically (these loops are implemented as fast C code in the internals
of PDL). Well, we still owe you an explanation of how this 'magic' is
achieved.

How does the example work?

The first thing to note is that every function that is threading aware
(these are, without exception, functions compiled from concise
descriptions by PDL::PP, later just called PP-functions) expects a defined
(minimum) number of dimensions (we call them core dimensions) from each of
its piddle arguments. The inner function expects two one-dimensional
(input) parameters from which it calculates a zero-dimensional (output)
parameter. We write that symbolically as "inner((n),(n),[o]())" and call
it "inner"'s signature, where n represents the size of that dimension. n
being equal in the first and second parameter means that those dimensions
have to be of equal size in any call. As a different example take the
outer product, which takes two 1D vectors to generate a 2D matrix,
symbolically written as "outer((n),(m),[o](n,m))". The "[o]" in both
examples indicates that this (here third) argument is an output argument.
In the latter example the dimensions of the first and second argument
don't have to agree, but you see how they determine the size of the two
dimensions of the output piddle.

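A quick check of both signatures at the perldl prompt:

    pdl> p inner(sequence(3), sequence(3))   # 0*0 + 1*1 + 2*2
    5
    pdl> p join ',', outer(sequence(3), sequence(4))->dims
    3,4
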
Here is the point where threading finally enters the game. If you call
PP-functions with piddles that have more than the required core
dimensions, the first dimensions of the piddle arguments are used as the
core dimensions and the additional extra dimensions are threaded over. Let
us demonstrate this first with our example above:

    $grey = inner($im,$w); # w is the weight vector from above

In this case $w is 1D and so supplies just the core dimension; $im is 3D,
more specifically "(3,x,y)". The first dimension (of size 3) is the
required core dimension that matches (as required by inner) the first (and
only) dimension of $w. The second dimension is the first thread dimension
(of size "x") and the third is here the second thread dimension (of size
"y"). The output piddle is automatically created (the functional form is
equivalent to passing an output piddle set to "null"; see below). The
output dimensions are obtained by appending the loop dimensions (here
"(x,y)") to the core output dimensions (here 0D) to yield the final
dimensions of the auto-created piddle (here "0D+2D=2D" to yield a 2D
output of size "(x,y)").

So the above command calls the core functionality that computes the inner
product of two 1D vectors "x*y" times with $w and all 1D slices of the
form "(':,(i),(j)')" of $im, and sets the respective elements of the
output piddle "$grey(i,j)" to the result of each computation. We could
write that symbolically as

    $grey(0,0) = f($w,$im(:,(0),(0)))
    $grey(1,0) = f($w,$im(:,(1),(0)))
    .
    .
    .
    $grey(x-2,y-1) = f($w,$im(:,(x-2),(y-1)))
    $grey(x-1,y-1) = f($w,$im(:,(x-1),(y-1)))

But this is done automatically by PDL without writing any explicit Perl
loops. We see that the command really creates an output piddle with the
right dimensions and sets the elements indeed to the result of the
computation for each pixel of the input image.

When even more piddles and extra dimensions are involved things get a bit
more complicated. We will first give the general rules for how the thread
dimensions depend on the dimensions of the input piddles, enabling you to
figure out the dimensionality of an auto-created output piddle (for any
given set of input piddles and core dimensions of the PP-function in
question). The general rules will most likely appear a bit confusing on
first sight, so we'll set out to illustrate the usage with a set of
further examples (which will hopefully also demonstrate that there are
indeed many practical situations where threading comes in extremely
handy).

A call for coding discipline

Before we point out the other technical details of threading, please note
this call for programming discipline when using threading:

In order to preserve human readability, PLEASE comment any nontrivial
expression in your code involving threading. Most importantly, for any
subroutine, include information at the beginning about what you expect the
dimensions to represent (or ranges of dimensions).

As a warning, look at this undocumented function and try to guess what
might be going on:

    sub lookup {
      my ($im,$palette) = @_;
      my $res;
      index($palette->xchg(0,1),
            $im->long->dummy(0,($palette->dims)[0]),
            ($res=null));
      return $res;
    }

Would you agree that it might be difficult to figure out the expected
dimensions, the purpose of the routine, etc? (If you want to find out what
this piece of code does, see below.)

How to figure out the loop dimensions

There are a couple of rules that allow you to figure out the number and
size of loop dimensions (and whether the sizes of your input piddles
comply with the threading rules). In the following, the dimensions of any
piddle argument are broken down into two groups: core dimensions (as
defined by the PP-function, see Appendix B for a list of PDL primitives)
and extra dimensions, which comprise all remaining dimensions of that
piddle. For example, calling a function "func" with the signature
"func((n,m),[o](n))" with a piddle "$x(2,4,7,1,3)" as "func($x,($o =
null))" results in the semantic splitting of $x's dimensions into: core
dimensions "(2,4)" and extra dimensions "(7,1,3)".

R0  Core dimensions are identified with the first N dimensions of the
    respective piddle argument (and are required). Any further dimensions
    are extra dimensions and are used to determine the loop dimensions.

R1  The number of (implicit) loop dimensions is equal to the maximal
    number of extra dimensions taken over the set of piddle arguments.

R2  The size of each of the loop dimensions is derived from the size of
    the respective dimensions of the piddle arguments. The size of a loop
    dimension is given by the maximal size found in any of the piddles
    having this extra dimension.

R3  For all piddles that have a given extra dimension the size must be
    equal to the size of the loop dimension (as determined by the
    previous rule) or 1; otherwise you raise a runtime exception. If the
    size of the extra dimension in a piddle is one, it is implicitly
    treated as a dummy dimension of size equal to that loop dim size when
    performing the thread loop.

R4  If a piddle doesn't have a loop dimension, in the thread loop this
    piddle is treated as if having a dummy dimension of size equal to the
    size of that loop dimension.

R5  If output auto-creation is used (by setting the relevant piddle to
    "PDL->null" before invocation) the number of dimensions of the
    created piddle is equal to the sum of the number of core output
    dimensions and the number of loop dimensions. The sizes of the core
    output dimensions are derived from the relevant dimensions of the
    input piddles (as specified in the function definition) and the sizes
    of the other dimensions are equal to the sizes of the loop dimensions
    they are derived from. The automatically created piddle will be
    physical (unless dataflow is in operation).

In this context, note that you can run into the problem with assignment to
piddles containing greater-than-one dummy dimensions (see above). Although
your output piddle(s) didn't contain any dummy dimensions in the first
place, they may end up with implicitly created dummy dimensions according
to R4.

As an example, suppose we have a (here unspecified) PP-function with the
signature:

    func((m,n),(m,n,o),(m),[o](m,o))

and you call it with 3 piddles "$x(5,3,10,11)", "$y(5,3,2,10,1,12)", and
"$z(5,1,11,12)" as

    func($x,$y,$z,($d=null))

then the number of loop dimensions is 3 (by "R0+R1" from $y and $z) with
sizes "(10,11,12)" (by R2); the two output core dimensions are "(5,2)"
(from the signature of func), resulting in a 5-dimensional output piddle
$d of size "(5,2,10,11,12)" (see R5); the automatically created $d is
derived from "($x,$y,$z)" in a way that can be expressed in pdl
pseudo-code as

    $d(:,:,i,j,k) .= func($x(:,:,i,j),$y(:,:,:,i,0,k),$z(:,0,j,k))
       with 0<=i<10, 0<=j<11, 0<=k<12

If we analyze the color to grey-scale conversion again with these rules in
mind we note another great advantage of implicit threading. We can call
the conversion with a piddle representing a pixel ("im(3)"), a line of rgb
pixels ("im(3,x)"), a proper color image ("im(3,x,y)") or a whole stack of
RGB images ("im(3,x,y,z)"). As long as $im is of the form "(3,...)" the
automatically created output piddle will contain the right number of
dimensions and contain the intensity data as we expect it, since the loops
have been implicitly performed thanks to implicit threading. You can
easily convince yourself of this: calling with a color pixel, $grey is 0D;
with a line it turns out 1D ("grey(x)"); with an image we get "grey(x,y)";
and finally we get a converted image stack "grey(x,y,z)".

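You can check this shape rule directly at the perldl prompt (zero-filled
piddles stand in for real image data here):

    pdl> $w = pdl(77,150,29)/256
    pdl> p inner(zeroes(3), $w)->ndims                   # 0 -- one pixel
    pdl> p join ',', inner(zeroes(3,100), $w)->dims      # 100
    pdl> p join ',', inner(zeroes(3,100,50), $w)->dims   # 100,50
    pdl> p join ',', inner(zeroes(3,100,50,7), $w)->dims # 100,50,7
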
Let's fill these general rules with some more life by going through a
couple of further examples. The reader may try to figure out equivalent
formulations with explicit for-looping and compare the flexibility of
those routines using implicit threading to the explicit formulation.
Furthermore, especially when using several thread dimensions, it is a
useful exercise to check the relative speed by doing some benchmark tests
(which we still have to do).

First in the row is a slightly reworked centroid example, now coded with
threading in mind.

    # threaded mult to calculate centroid coords, works for stacks as well
    $xc = sumover(($im*xvals(($im->dims)[0]))->clump(2)) /
          sumover($im->clump(2));

Let's analyze what's going on step by step. First the product:

    $prod = $im*xvals(zeroes(($im->dims)[0]))

This will actually work for $im being one, two, three, and higher
dimensional. If $im is one-dimensional it's just an ordinary product (in
the sense that every element of $im is multiplied with the respective
element of "xvals(...)"); if $im has more dimensions, further threading is
done by adding appropriate dummy dimensions to "xvals(...)" according to
R4. More importantly, the two sumover operations show a first example of
how to make use of the dimension manipulating commands. A quick look at
sumover's signature will remind you that it will only "gobble up" the
first dimension of a given input piddle. But what if we want to really
compute the sum over all elements of the first two dimensions? Well,
nothing keeps us from passing a virtual piddle into sumover, which in this
case is formed by clumping the first two dimensions of the "parent piddle"
into one. From the point of view of the parent piddle the sum is now
computed over the first two dimensions, just as we wanted, though sumover
has just done the job as specified by its signature. Got it?

Another little finesse of writing the code like that: we intentionally
used "sumover($pdl->clump(2))" instead of "sum($pdl)" so that we can
either pass just an image "(x,y)" or a stack of images "(x,y,t)" into this
routine and get either just one x-coordinate or a vector of x-coordinates
(of size t) in return.

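A quick shape check of that property (with random data standing in for
real images):

    pdl> $im    = random(10,10)    # one image
    pdl> $stack = random(10,10,4)  # a stack of four images
    pdl> p sumover($im->clump(2))->ndims              # 0 -- a single sum
    pdl> p join ',', sumover($stack->clump(2))->dims  # 4 -- one per image
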
Another set of common operations are what one could call "projection
operations". These operations take an N-D piddle as input and return an
(N-1)-D "projected" piddle. These operations are often performed with
functions like sumover, prodover, minimum and maximum. Using images as
examples again, we might want to calculate the maximum pixel value for
each line of an image or image stack. We know how to do that:

    # maxima of lines (as function of line number and time)
    maximum($stack,($ret=null));

But what if you want to calculate maxima per column, when implicit
threading always applies the core functionality to the first dimension
and threads over all others? How can we arrange for the core functionality
to be applied to the second dimension instead, with threading done over
the others? Can you guess it? Yes, we make a virtual piddle that has the
second dimension of the "parent piddle" as its first dimension, using the
"mv" command.

    # maxima of columns (as function of column number and time)
    maximum($stack->mv(1,0),($ret=null));

and calculating all the sums of sub-slices over the third dimension is now
almost too easy.

    # sums of pixels in time (assuming time is the third dim)
    sumover($stack->mv(2,0),($ret=null));

Finally, if you want to apply the operation to all elements (like max over
all elements or sum over all elements) regardless of the dimensions of the
piddle in question, "clump" comes in handy. As an example look at the
definition of "sum" (as defined in "Ufunc.pm"):

    sub sum {
      my ($x) = @_;
      PDL::Ufunc::sumover($x->clump(-1),(my $tmp=null));
      return $tmp->at(); # return a Perl number, not a 0D piddle
    }

We have already mentioned that all basic operations support threading and
assignment is no exception. So here are a couple of threaded assignments:

    pdl> $im   = zeroes(byte, 10,20)
    pdl> $im2  = zeroes(byte, 10,20)
    pdl> $line = exp(-rvals(10)**2/9)
    # threaded assignment
    pdl> $im .= $line # set every line of $im to $line
    pdl> $im2 .= 5    # set every element of $im2 to 5

By now you probably see how it works and what it does, don't you?

To finish the examples in this paragraph, here is a function to create an
RGB image from what is called a palette image. The palette image consists
of two parts: an image of indices into a color lookup table and the color
lookup table itself. [ describe how it works ] We are going to use a
PP-function we haven't encountered yet in the previous examples. It is the
aptly named index function, signature "((n),(),[o]())" (see Appendix B),
with the core functionality that "index(pdl (0,2,4,5),2,($ret=null))" will
return the element with index 2 of the first input piddle. In this case,
$ret will contain the value 4. So here is the example:

    # a threaded index lookup to generate an RGB, or RGBA or YMCK image
    # from a palette image (represented by a lookup table $palette and
    # a color-index image $im)
    # you can say just dummy(0) since the rules of threading make it fit
    pdl> index($palette->xchg(0,1),
               $im->long->dummy(0,($palette->dims)[0]),
               ($res=null));

Let's go through it and explain the steps involved. Assuming we are
dealing with an RGB lookup-table, $palette is of size "(3,x)". First we
exchange the dimensions of the palette so that looping is done over the
first dimension of $palette (of size 3, which represents the r, g, and b
components). Now looking at $im, we add a dummy dimension of size equal to
the number of components (in the case we are discussing here we could have
just used the number 3, since we have 3 color components). We can use a
dummy dimension since for the red, green and blue color components we use
the same index from the original image; e.g. assuming a certain pixel of
$im had the value 4, then the lookup should produce the triple

    [palette(0,4),palette(1,4),palette(2,4)]

for the new red, green and blue components of the output image. Hopefully
by now you have some sort of idea what the above piece of code is supposed
to do (it is often actually quite complicated to describe in detail how a
piece of threading code works; just go ahead and experiment a bit to get a
better feeling for it).

If you have read the threading rules carefully, then you might have
noticed that we didn't have to explicitly state the size of the dummy
dimension that we created for $im; when we create it with size 1 (the
default) the rules of threading make it automatically fit to the desired
size (by rule R3; in our example the size would be 3, assuming a palette
of size "(3,x)"). Since situations like this do occur often in practice
this is actually why rule R3 has been introduced (the part that makes
dimensions of size 1 fit to the thread loop dim size). So we can just say

    pdl> index($palette->xchg(0,1),$im->long->dummy(0),($res=null));

Again, you can convince yourself that this routine will create the right
output if called with a pixel ($im is 0D), a line ($im is 1D), an image
($im is 2D), ..., an RGB lookup table (palette is "(3,x)") and an RGBA
lookup table (palette is "(4,x)", see e.g. OpenGL). This flexibility is
achieved by the rules of threading, which are made to do the right thing
in most situations.

To wrap it all up once again, the general idea is as follows. If you want
to achieve looping over certain dimensions and have the core functionality
applied to another specified set of dimensions, you use the dimension
manipulating commands to create a (or several) virtual piddle(s) so that
from the point of view of the parent piddle(s) you get what you want
(always having the signature of the function in question and R1-R5 in
mind!). Easy, isn't it?

Output auto-creation and PP-function calling conventions

At this point we have to divert to some technical detail that has to do
with the general calling conventions of PP-functions and the automatic
creation of output arguments. Basically, there are two ways of invoking
PDL routines, namely

    $result = func($x,$y);

and

    func($x,$y,$result);

If you are only using implicit threading then the output variable can be
automatically created by PDL. You flag that to the PP-function by setting
the output argument to a special kind of piddle that is returned from a
call to the function "PDL->null", which returns an essentially "empty"
piddle (for those interested in details, there is a flag in the C pdl
structure for this). The dimensions of the created piddle are determined
by the rules of implicit threading: the first dimensions are the core
output dimensions, to which the threading dimensions are appended (which
are in turn determined by the dimensions of the input piddles as described
above). So you can say

    func($x,$y,($result=PDL->null));

or

    $result = func($x,$y)

which are exactly equivalent.

Be warned that you cannot use output auto-creation when using explicit
threading (for reasons explained in the following section on explicit
threading, the second variant of threading).

1122 In "tight" loops you probably want to avoid the implicit creation of a
1123 temporary piddle in each step of the loop that comes along with the
1124 "functional" style but rather say
1125
1126 # create output piddle of appropriate size only at first invocation
1127 $result = null;
1128 for (0...$n) {
1129 func($x,$y,$result); # in all but the first invocation $result
1130 func2($y); # is defined and has the right size to
1131 # take the output provided $y's dims don't change
1132 twiddle($result,$x); # do something from $result to $x for iteration
1133 }
1134
1135 The take-home message of this section once more: be aware of the
1136 limitation on output creation when using explicit threading.
1137
Explicit threading

Having so far only talked about the first flavour of threading, it is now
about time to introduce the second variant. Instead of shuffling around
dimensions all the time and relying on the rules of implicit threading to
get it all right, you sometimes might want to specify in a more explicit
way how to perform the thread loop. It is probably not too surprising that
this variant of the game is called explicit threading. Now, before we
create the wrong impression: it is not either implicit or explicit; the
two flavours do mix. But more about that later.

The two most used functions with explicit threading are thread and
unthread. We start with an example that illustrates typical usage of the
former:

    [ # ** this is the worst possible example to start with ]
    # but can be used to show that $mat += $line is different from
    # $mat->thread(0) += $line
    # explicit threading to add a vector to each column of a matrix
    pdl> $mat  = zeroes(4,3)
    pdl> $line = pdl (3.1416,2,-2)
    pdl> ($tmp = $mat->thread(0)) += $line

In this example, "$mat->thread(0)" tells PDL that you want its first
dimension (dimension 0) to be threaded over explicitly, leading to a
thread loop that can be expressed as

    for (i=0; i<4; i++) {     # explicit thread dimension
       for (j=0; j<3; j++) {  # core dimension of the addition
          mat(i,j) += line(j);
       }
    }

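For contrast, a plain "$mat += $line" would not do the same thing:
implicit threading would try to match $line's only dimension (size 3)
against $mat's first dimension (size 4) and raise a runtime error. The
explicit version above gives

    pdl> p $mat
    [
     [3.1416 3.1416 3.1416 3.1416]
     [     2      2      2      2]
     [    -2     -2     -2     -2]
    ]
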
1171 "thread" takes a list of numbers as arguments which explicitly specify
1172 which dimensions to thread over first. With the introduction of
1173 explicit threading the dimensions of a piddle are conceptually split
1174 into three different groups the latter two of which we have already
1175 encountered: thread dimensions, core dimensions and extra dimensions.
1176
1177 Conceptually, it is best to think of those dimensions of a piddle that
1178 have been specified in a call to "thread" as being taken away from the
1179 set of normal dimensions and put on a separate stack. So assuming we
1180 have a piddle "x(4,7,2,8)" saying
1181
1182 $y = $x->thread(2,1)
1183
1184 creates a new virtual piddle of dimension "y(4,8)" (which we call the
1185 remaining dims) that also has 2 thread dimensions of size "(2,7)". For
1186 the purposes of this document we write that symbolically as
1187 "y(4,8){2,7}". An important difference to the previous examples where
1188 only implicit threading was used is the fact that the core dimensions
1189 are matched against the remaining dimensions which are not necessarily
1190 the first dimensions of the piddle. We will now specify how the
1191 presence of thread dimensions changes the rules R1-R5 for thread loops
1192 (which apply to the special case where none of the piddle arguments has
1193 any thread dimensions).
1194
       T0  Core dimensions are matched against the first n remaining
           dimensions of the piddle argument (note the difference from
           R1). Any further remaining dimensions are extra dimensions
           and are used to determine the implicit loop dimensions.

       T1a The number of implicit loop dimensions is equal to the
           maximal number of extra dimensions taken over the set of
           piddle arguments.

       T1b The number of explicit loop dimensions is equal to the
           maximal number of thread dimensions taken over the set of
           piddle arguments.

       T1c The total number of loop dimensions is equal to the sum of
           explicit loop dimensions and implicit loop dimensions. In the
           thread loop, explicit loop dimensions are threaded over
           first, followed by implicit loop dimensions.

       T2  The size of each of the loop dimensions is derived from the
           size of the respective dimensions of the piddle arguments. It
           is given by the maximal size found in any of the piddles
           having this thread dimension (for explicit loop dimensions)
           or extra dimension (for implicit loop dimensions).

       T3  This rule applies to any explicit loop dimension as well as
           any implicit loop dimension. For all piddles that have a
           given thread/extra dimension, the size must be equal to the
           size of the respective explicit/implicit loop dimension or 1;
           otherwise a runtime exception is raised. If the size of a
           thread/extra dimension of a piddle is one, it is implicitly
           treated as a dummy dimension of size equal to that of the
           explicit/implicit loop dimension.

       T4  If a piddle doesn't have a thread/extra dimension that
           corresponds to an explicit/implicit loop dimension, in the
           thread loop this piddle is treated as if having a dummy
           dimension of size equal to the size of that loop dimension.

       T4a All piddles that do have thread dimensions must have the same
           number of thread dimensions.

       T5  Output auto-creation cannot be used if any of the piddle
           arguments has any thread dimensions. Otherwise R5 applies.

       The same restrictions apply with regard to implicit dummy
       dimensions (created by application of T4) as already mentioned in
       the section on implicit threading: if any of the output piddles
       has an (explicitly specified or implicitly created) greater-than-
       one dummy dimension, a runtime exception will be raised.

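       As a quick illustration of T3 at work (note that "+=" rather than
       "+" must be used here; with thread dimensions present, output
       auto-creation is forbidden by T5), here is a hedged sketch of a
       call that dies at runtime:

        pdl> $p = zeroes(4,3)
        pdl> $q = ones(5)
        pdl> ($t = $p->thread(0)) += $q  # remaining dim (3) vs (5): error by T3
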
       Let us demonstrate these rules at work in a generic case. Suppose
       we have a (here unspecified) PP-function with the signature:

        func((m,n),(m),(),[o](m))

       and you call it with 3 piddles "x(5,3,10,11)", "y(3,5,10,1,12)"
       and "c(10)", and an output piddle "d(3,11,5,10,12)" (which here
       cannot be automatically created), as

        func($x->thread(1,3),$y->thread(0,3),$c,$d->thread(0,1))

       From the signature of func and the above call the piddles split
       into the following groups of core, extra and thread dimensions
       (written in the form "pdl(core dims){thread dims}[extra dims]"):

        x(5,10){3,11}[]  y(5){3,1}[10,12]  c(){}[10]  d(5){3,11}[10,12]

       With this to help us along (it is in general helpful to write the
       arguments down like this when you start playing with threading and
       want to keep track of what is going on), we further deduce that
       the number of explicit loop dimensions is 2 (by T1b, from $x and
       $y) with sizes "(3,11)" (by T2); that there are 2 implicit loop
       dimensions (by T1a, from $y and $d) of size "(10,12)" (by T2); and
       that the elements of $d are computed from the input piddles in a
       way that can be expressed in pdl pseudo-code as

        for (l=0; l<12; l++)
          for (k=0; k<10; k++)
            for (j=0; j<11; j++)
              for (i=0; i<3; i++)
                d(i,j,:,k,l) = func(x(:,i,:,j), y(i,:,k,0,l), c(k))

       (The fixed index 0 in "y(i,:,k,0,l)" is the effect of treating
       $y's size-1 thread dimension as a dummy dimension with respect to
       loop index j, as prescribed by T3.)

       Ugh, this example was really not easy in terms of bookkeeping. It
       serves mostly as an example of how to figure out what's going on
       when you encounter a complicated-looking expression. But now it is
       really time to show that threading is useful by giving some more
       of our so-called "practical" examples.

       [ The following examples will need some additional explanations in
       the future. For the moment please try to live with the comments in
       the code fragments. ]

       Example 1:

        *** inverse of matrix represented by eigvecs and eigvals
        ** given a symmetrical matrix M = A^T x diag(lambda_i) x A
        ** => inverse M^-1 = A^T x diag(1/lambda_i) x A
        ** first $tmp = diag(1/lambda_i)*A
        ** then A^T * $tmp by threaded inner product
        # index handling so that matrices print correctly under pdl
        $inv .= $evecs*0;  # just copy to get appropriately sized output
        $tmp .= $evecs;    # initialise, no back-propagation
        ($tmp2 = $tmp->thread(0)) /= $evals;  # threaded division
        # and now a matrix multiplication in disguise
        PDL::Primitive::inner($evecs->xchg(0,1)->thread(-1,1),
                              $tmp->thread(0,-1),
                              $inv->thread(0,1));
        # alternative for matrix mult using implicit threading,
        # first xchg only for transpose
        PDL::Primitive::inner($evecs->xchg(0,1)->dummy(1),
                              $tmp->xchg(0,1)->dummy(2),
                              ($inv=null));

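       To convince yourself that $inv really is the inverse, you can
       multiply it back onto the original matrix using PDL's overloaded
       "x" operator (matrix multiplication); a minimal check, assuming
       the original matrix is still around in a variable $m (a name not
       used above):

        print $inv x $m;  # should print something numerically close
                          # to the identity matrix
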
       Example 2:

        # outer product by threaded multiplication
        # stress that we need to do it with an explicit call to my_biop1
        # when using explicit threading
        $res = zeroes(($x->dims)[0], ($y->dims)[0]);
        my_biop1($x->thread(0,-1), $y->thread(-1,0), $res->thread(0,1), "*");
        # similar thing by implicit threading with auto-created piddle
        $res = $x->dummy(1) * $y->dummy(0);

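       For this particular operation neither version is needed in
       practice, since PDL already ships a threading-aware primitive for
       it; a one-line alternative using the built-in "outer" (whose
       signature is listed below):

        $res = outer($x, $y);  # same (n,m) outer product
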
       Example 3:

        # use thread and unthread to shuffle a number of dimensions
        # in one go without lots of calls to ->xchg and ->mv;
        # just try it out and compare the child piddle with its parent
        $trans = $x->thread(4,1,0,3,2)->unthread;

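       To check what such a shuffle does, compare the dimension lists of
       parent and child; a small sketch with made-up sizes, assuming
       "unthread" with no argument inserts the thread dimensions at
       position 0:

        pdl> $x = sequence(1,2,3,4,5)
        pdl> $trans = $x->thread(4,1,0,3,2)->unthread
        pdl> print join(',', $trans->dims)  # 5,2,1,4,3
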
       Example 4:

        # calculate a couple of bounding boxes
        # $bb will hold BB as [xmin,xmax],[ymin,ymax],[zmin,zmax]
        # we use again thread and unthread to shuffle dimensions around
        pdl> $bb = zeroes(double, 2, 3);
        pdl> minimum($vertices->thread(0)->clump->unthread(1), $bb->slice('(0),:'));
        pdl> maximum($vertices->thread(0)->clump->unthread(1), $bb->slice('(1),:'));

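       A concrete run with three made-up 3D points (so $vertices has
       dims (3,3), with the coordinate index running fastest) might look
       like this:

        pdl> $vertices = pdl([[0,0,0],[1,2,3],[-1,5,2]])
        pdl> $bb = zeroes(double, 2, 3)
        pdl> minimum($vertices->thread(0)->clump->unthread(1), $bb->slice('(0),:'))
        pdl> maximum($vertices->thread(0)->clump->unthread(1), $bb->slice('(1),:'))
        pdl> print $bb  # [[-1 1] [0 5] [0 3]]: [min,max] for x, y and z
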
       Example 5:

        # calculate a self-ratioed (i.e. self-normalized) sequence of images
        # uses explicit threading and an implicitly threaded division
        $stack = read_image_stack();
        # calculate the (per pixel) average of the first $n+1 images
        $aver = zeroes(($stack->dims)[0,1]);  # make the output piddle
        sumover($stack->slice(":,:,0:$n")->thread(0,1), $aver);
        $aver /= ($n+1);
        $stack /= $aver;  # normalize the stack by a threaded division
        # alternatively, calculate $aver with implicit threading and
        # output auto-creation
        sumover($stack->slice(":,:,0:$n")->mv(2,0), ($aver=null));
        $aver /= ($n+1);

   Implicit versus explicit threading
       In this section we illustrate when explicit threading is
       preferable to implicit threading and vice versa. But then again,
       this is probably not the best way of putting the case since you
       already know: the two flavours do mix. So it's more about how to
       get the best of both worlds and, anyway, in the best of Perl
       traditions: TIMTOWTDI!

       [ Sorry, this still has to be filled in in a later release; either
       refer to the above examples or choose some new ones ]

       Finally, this may be a good place to justify all the technical
       detail we have been going on about for a couple of pages: why
       threading?

       Well, code that uses threading should be (considerably) faster
       than code that uses explicit for-loops (or similar Perl
       constructs) to achieve the same functionality. Especially on
       supercomputers (with vector computing facilities/parallel
       processing), PDL threading will be implemented in a way that takes
       advantage of the additional facilities of these machines.
       Furthermore, it is a conceptually simple construct (though
       technical details might get involved at times) and can greatly
       reduce the syntactical complexity of PDL code (but keep the
       admonition for documentation in mind). Once you are comfortable
       with the threading way of thinking (and coding), it shouldn't be
       too difficult to understand code that somebody else has written
       (provided they give you an idea of the expected input dimensions,
       etc.). As a general tip to increase the performance of your code:
       if you have to introduce a loop into your code, try to reformulate
       the problem so that you can use threading to perform the loop (as
       with anything there are exceptions to this rule of thumb; but the
       authors of this document tend to think that these are rare
       cases ;).

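       As an illustration of that tip, here is a hedged sketch (with a
       hypothetical image piddle $img) of the same row sums computed with
       an explicit Perl loop and with threading; in the threaded form the
       loop over rows happens inside compiled code:

        # explicit Perl loop over rows: slow
        $rowsum = zeroes($img->dim(1));
        for my $i (0 .. $img->dim(1)-1) {
            $rowsum->set($i, $img->slice(":,($i)")->sum);
        }
        # threaded equivalent: dim 0 is summed over, dim 1 is threaded
        $rowsum = $img->sumover;
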
   An easy way to define functions that are aware of indexing and threading
   (and the universe and everything)
       PDL::PP is part of the PDL distribution. It is used to generate
       functions that are aware of indexing and threading rules from very
       concise descriptions. It can be useful for you if you want to
       write your own functions or if you want to interface functions
       from an external library so that they support indexing and
       threading (and maybe dataflow as well; see PDL::Dataflow). For
       further details check PDL::PP.

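       As a taste of what such a concise description looks like, here is
       a minimal sketch of a pp_def call defining a sumover-like function
       (the function name is made up; see PDL::PP for the real details of
       the build setup):

        # in a .pd file processed by PDL::PP
        pp_def('my_sumover',
            Pars => 'a(n); [o]b();',  # the signature; threading comes for free
            Code => '$GENERIC() tmp = 0;
                     loop(n) %{ tmp += $a(); %}
                     $b() = tmp;',
        );
        pp_done();
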
   Affine transformations - a special class of simple and powerful
   transformations
       [ This is also something to be added in future releases. Do we
       already have the general make_affine routine in PDL? It is
       possible that we will reference another appropriate man page from
       here ]

   Signatures of standard PDL::PP compiled functions
       A selection of signatures of PDL primitives to show how many
       dimensions PP compiled functions gobble up (and therefore you can
       figure out what will be threaded over). Most of these functions
       are the basic ones defined in "primitive.pd".

        # functions in primitive.pd
        #
        sumover        ((n),[o]())
        prodover       ((n),[o]())
        axisvalues     ((n))                           inplace
        inner          ((n),(n),[o]())
        outer          ((n),(m),[o](n,m))
        innerwt        ((n),(n),(n),[o]())
        inner2         ((m),(m,n),(n),[o]())
        inner2t        ((j,n),(n,m),(m,k),[o]())
        index          ((n),(),[o]())
        minimum        ((n),[o]())
        maximum        ((n),[o]())
        wstat          ((n),(n),(),[o],())
        assgn          ((),())

        # basic operations
        binary operations  ((),(),[o]())
        unary operations   ((),[o]())

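       As an example of reading these signatures: "inner" consumes one
       dimension of size n from each of its two inputs, so any further
       dimensions are threaded over. A small sketch:

        pdl> $x = sequence(3,4)        # one extra dim of size 4
        pdl> $y = ones(3)              # no extra dims
        pdl> inner($x, $y, ($r=null))  # core (n)=(3), threaded 4 times
        pdl> print $r                  # [3 12 21 30]
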
COPYRIGHT
       Copyright (C) 1997 Christian Soeller (c.soeller@auckland.ac.nz) &
       Tuomas J. Lukka (lukka@fas.harvard.edu). All rights reserved.
       Although destined for release as a man page with the standard PDL
       distribution, it is not public domain. Permission is granted to
       freely distribute verbatim copies of this document provided that
       no modifications outside of formatting be made, and that this
       notice remain intact. You are permitted and encouraged to use its
       code and derivatives thereof in your own source code, for fun or
       for profit, as you see fit.


perl v5.32.0                      2020-09-17                      INDEXING(1)