INTERNALS(1)          User Contributed Perl Documentation         INTERNALS(1)

NAME

       PDL::Internals - description of some aspects of the current internals

DESCRIPTION

   Intro
       This document explains various aspects of the current implementation of
       PDL. If you just want to use PDL for something, you definitely do not
       need to read this. Even if you want to interface your C routines to PDL
       or create new PDL::PP functions, you do not need to read this man page
       (though it may be informative). This document is primarily intended for
       people interested in debugging or changing the internals of PDL. To
       read this, a good understanding of the C language, of programming and
       data structures in general, and some understanding of Perl are
       required. If you read through this document, understand all of it, can
       point to what any part of it refers to in the PDL core sources, and
       additionally struggle to understand PDL::PP, you will be awarded the
       title "PDL Guru" (of course, the current version of this document is
       so incomplete that this is next to impossible from just these notes).

       Warning: If it seems that this document has gotten out of date, please
       inform the PDL porters email list (pdl-devel@lists.sourceforge.net).
       This may well happen.

   Piddles
       The pdl data object is generally an opaque scalar reference into a pdl
       structure in memory. Alternatively, it may be a hash reference with the
       "PDL" field containing the scalar reference (this makes overloading
       piddles easy, see PDL::Objects). You can easily find out at the Perl
       level which type of piddle you are dealing with. The example code below
       demonstrates how to do it:

          # check if this is a piddle
          die "not a piddle" unless UNIVERSAL::isa($pdl, 'PDL');
          # is it a scalar ref or a hash ref?
          if (UNIVERSAL::isa($pdl, "HASH")) {
            die "not a valid PDL" unless exists $pdl->{PDL} &&
               UNIVERSAL::isa($pdl->{PDL},'PDL');
            print "This is a hash reference,",
               " the PDL field contains the scalar ref\n";
          } else {
            print "This is a scalar ref that points to address $$pdl in memory\n";
          }

       The scalar reference points to the numeric address of a C structure of
       type "pdl" which is defined in pdl.h. The mapping between the object at
       the Perl level and the C structure containing the actual data and
       structural information that makes up a piddle is done by the PDL
       typemap. The functions used in the PDL typemap are defined pretty much
       at the top of the file pdlcore.h. So what does the structure look like?

               struct pdl {
                  unsigned long magicno; /* Always stores PDL_MAGICNO as a sanity check */
                    /* This is first so most pointer accesses to wrong type are caught */
                  int state;        /* What's in this pdl */

                  pdl_trans *trans; /* Opaque pointer to internals of transformation from
                                       parent */

                  pdl_vaffine *vafftrans;

                  void*    sv;      /* (optional) pointer back to original sv.
                                         ALWAYS check for non-null before use.
                                         We cannot inc refcnt on this one or we'd
                                         never get destroyed */

                  void *datasv;        /* Pointer to SV containing data. Refcnt inced */
                  void *data;            /* Null: no data alloced for this one */
                  PDL_Indx nvals;           /* How many values allocated */
                  int datatype;
                  PDL_Indx   *dims;      /* Array of data dimensions */
                  PDL_Indx   *dimincs;   /* Array of data default increments */
                  short    ndims;     /* Number of data dimensions */

                  unsigned char *threadids;  /* Starting index of the thread index set n */
                  unsigned char nthreadids;

                  pdl_children children;

                  PDL_Indx   def_dims[PDL_NDIMS];   /* Preallocated space for efficiency */
                  PDL_Indx   def_dimincs[PDL_NDIMS];   /* Preallocated space for efficiency */
                  unsigned char def_threadids[PDL_NTHREADIDS];

                  struct pdl_magic *magic;

                  void *hdrsv; /* "header", settable from outside */
               };

       This is quite a structure for just storing some data in - what is going
       on?

       Data storage
            We are going to start with some of the simpler members: first of
            all, there is the member

                    void *datasv;

            which is really a pointer to a Perl SV structure ("SV *"). The SV
            is expected to represent a string in which the data of the piddle
            is stored in a tightly packed form. This pointer counts as a
            reference to the SV, so the reference count was incremented when
            the "SV *" was placed here (this reference count business has to
            do with Perl's garbage collection mechanism -- don't worry if
            this doesn't mean much to you). This pointer is allowed to have
            the value "NULL", which means that there is no actual Perl SV for
            this data - for instance, the data might be allocated by a "mmap"
            operation. Note that the use of an SV* was purely for convenience:
            it allows easy transformation of packed data from files into
            piddles. Other implementations are not excluded.

            The actual pointer to data is stored in the member

                    void *data;

            which contains a pointer to a memory area with space for

                    PDL_Indx nvals;

            data items of the data type of this piddle.  PDL_Indx is either
            'long' or 'long long' depending on whether your perl is 64bit or
            not.

            The data type of the data is stored in the variable

                    int datatype;

            The values for this member are given in the enum "pdl_datatypes"
            (see pdl.h). Currently we have byte, short, unsigned short, long,
            index (either long or long long), long long, float and double
            types, see also PDL::Types.

       Dimensions
            The number of dimensions in the piddle is given by the member

                    short ndims;

            which shows how many entries there are in the arrays

                    PDL_Indx   *dims;
                    PDL_Indx   *dimincs;

            These arrays are intimately related: "dims" gives the sizes of the
            dimensions and "dimincs" is always calculated by the code

                    PDL_Indx inc = 1;
                    for(i=0; i<it->ndims; i++) {
                            it->dimincs[i] = inc; inc *= it->dims[i];
                    }

            in the routine "pdl_resize_defaultincs" in "pdlapi.c".  What this
            means is that the dimincs can be used to calculate the offset by
            code like

                    PDL_Indx offs = 0;
                    for(i=0; i<it->ndims; i++) {
                            offs += it->dimincs[i] * index[i];
                    }

            but this is not always the right thing to do, at least without
            checking for certain things first: for instance, piddles that
            share data through a virtual affine transformation (described
            below) are interpreted with their own increments and offset.
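
            The two loops above can be tried outside PDL with a small
            self-contained sketch. The names "Indx", "default_incs" and
            "flat_offset" are made up for this illustration and are not part
            of pdl.h; the bounds check is an addition, not something the
            quoted PDL code does.

```c
#include <assert.h>

typedef long long Indx;   /* assumption: stand-in for PDL_Indx */

/* Fill incs[] with the default increments for dims[]: the first
 * dimension varies fastest, as in the loop quoted above. */
static void default_incs(const Indx *dims, Indx *incs, int ndims) {
    Indx inc = 1;
    for (int i = 0; i < ndims; i++) {
        incs[i] = inc;
        inc *= dims[i];
    }
}

/* Flat offset of the element at index[], with a bounds check. */
static Indx flat_offset(const Indx *dims, const Indx *incs,
                        const Indx *index, int ndims) {
    Indx offs = 0;
    for (int i = 0; i < ndims; i++) {
        assert(index[i] >= 0 && index[i] < dims[i]);
        offs += incs[i] * index[i];
    }
    return offs;
}
```

            For dims (3,4,2) this gives increments (1,3,12), so the element at
            index (2,1,1) sits at offset 2*1 + 1*3 + 1*12 = 17.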

       Default storage
            Since the vast majority of piddles don't have more than 6
            dimensions, it is more efficient to have default storage for the
            dimensions and dimincs inside the PDL struct.

                    PDL_Indx   def_dims[PDL_NDIMS];
                    PDL_Indx   def_dimincs[PDL_NDIMS];

            The "dims" and "dimincs" may be set to point to the beginning of
            these arrays if "ndims" is smaller than or equal to the
            compile-time constant "PDL_NDIMS". This is important to note when
            freeing a piddle struct.  The same applies for the threadids:

                    unsigned char def_threadids[PDL_NTHREADIDS];
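
            Why this matters when freeing can be seen in a minimal sketch of
            the same pattern. Everything here ("dim_holder", "TOY_NDIMS",
            "dim_holder_set", "dim_holder_release") is hypothetical and only
            mimics the idea; it is not the code PDL uses.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NDIMS 6            /* assumption: stands in for PDL_NDIMS */
typedef long long Indx;        /* assumption: stand-in for PDL_Indx */

typedef struct {
    Indx *dims;                /* points at def_dims or at heap memory */
    Indx  def_dims[TOY_NDIMS]; /* preallocated space for the common case */
    int   ndims;
} dim_holder;

/* Choose storage for ndims dimensions; returns 0 on success, -1 on failure. */
static int dim_holder_set(dim_holder *p, int ndims) {
    assert(p != NULL && ndims >= 0);
    if (p->dims != NULL && p->dims != p->def_dims)
        free(p->dims);                     /* drop earlier heap storage */
    p->dims = p->def_dims;
    p->ndims = 0;
    if (ndims > TOY_NDIMS) {
        Indx *heap = malloc((size_t)ndims * sizeof *heap);
        if (heap == NULL)
            return -1;
        p->dims = heap;
    }
    p->ndims = ndims;
    return 0;
}

/* Release storage: only heap memory may be passed to free(). */
static void dim_holder_release(dim_holder *p) {
    if (p->dims != p->def_dims)
        free(p->dims);
    p->dims = NULL;
    p->ndims = 0;
}
```

            The comparison against "def_dims" in the release function is the
            point: calling free() on the embedded array would be an error.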

       Magic
            It is possible to attach magic to piddles, much like Perl's own
            magic mechanism. If the member pointer

                       struct pdl_magic *magic;

            is nonzero, the PDL has some magic attached to it. The
            implementation of magic can be gleaned from the file pdlmagic.c in
            the distribution.

       State
            One of the first members of the structure is

                    int state;

            The possible flags and their meanings are given in "pdl.h".  These
            are mainly used to implement the lazy evaluation mechanism and
            keep track of piddles in these operations.

       Transformations and virtual affine transformations
            As you should already know, piddles often carry information about
            where they come from. For example, the code

                    $y = $x->slice("2:5");
                    $y .= 1;

            will alter $x. So $y and $x know that they are connected via a
            "slice"-transformation. This information is stored in the members

                    pdl_trans *trans;
                    pdl_vaffine *vafftrans;

            Both $x (the parent) and $y (the child) store this information
            about the transformation in appropriate slots of the "pdl"
            structure.

            "pdl_trans" and "pdl_vaffine" are structures that we will look at
            in more detail below.

       The Perl SVs
            When piddles are referred to through Perl SVs, we store an
            additional reference to the SV in the member

                    void*    sv;

            in order to be able to return a reference to the user when they
            want to inspect the transformation structure on the Perl side.

            Also, we store an opaque

                    void *hdrsv;

            which is just for use by the user to hook up arbitrary data with
            this piddle.  This one is generally manipulated through sethdr
            and gethdr calls.

   Smart references and transformations: slicing and dicing
       Smart references and most other fundamental functions operating on
       piddles are implemented via transformations (as mentioned above) which
       are represented by the type "pdl_trans" in PDL.

       A transformation links input and output piddles and contains all the
       infrastructure that defines how:

       •   output piddles are obtained from input piddles;

       •   changes in smartly linked output piddles (e.g. the child of a
           sliced parent piddle) flow back to the input piddle in
           transformations where this is supported (the most often used
           example being "slice" here);

       •   datatype and size of output piddles that need to be created are
           obtained.

       In general, executing a PDL function on a group of piddles results in
       creation of a transformation of the requested type that links all input
       and output arguments (at least those that are piddles). In PDL
       functions that support data flow between input and output args (e.g.
       "slice", "index") this transformation links parent (input) and child
       (output) piddles permanently until either the link is explicitly broken
       by user request ("sever" at the Perl level) or all parents and children
       have been destroyed. In those cases the transformation is
       lazy-evaluated, i.e. only executed when piddle values are actually
       accessed.

       In non-flowing functions, for example addition ("+") and inner products
       ("inner"), the transformation is installed just as in flowing functions
       but then the transformation is immediately executed and destroyed
       (breaking the link between input and output args) before the function
       returns.

       It should be noted that the close link between input and output args of
       a flowing function (like slice) requires that piddle objects that are
       linked in such a way be kept alive beyond the point where they have
       gone out of scope from the point of view of Perl:

         $x = zeroes(20);
         $y = $x->slice('2:4');
         undef $x;    # last reference to $x is now destroyed

       Although $x should now be destroyed according to Perl's rules, the
       underlying "pdl" structure must actually only be freed when $y also
       goes out of scope (since it still references internally some of $x's
       data). This example demonstrates that such a dataflow paradigm between
       PDL objects necessitates a special destruction algorithm that takes the
       links between piddles into account and couples the lifespan of those
       objects. The non-trivial algorithm is implemented in the function
       "pdl_destroy" in pdlapi.c. In fact, most of the code in pdlapi.c and
       pdlfamily.c is concerned with making sure that piddles ("pdl *"s) are
       created, updated and freed at the right times depending on interactions
       with other piddles via PDL transformations (remember, "pdl_trans").

   Accessing children and parents of a piddle
       When piddles are dynamically linked via transformations as suggested
       above, input and output piddles are referred to as parents and
       children, respectively.

       An example of processing the children of a piddle is provided by the
       "badflag" method of PDL::Bad (only available if you have compiled PDL
       with the "WITH_BADVAL" option set to 1, but still useful as an
       example!).

       Consider the following situation:

        pdl> $x = rvals(7,7,{Centre=>[3,4]});
        pdl> $y = $x->slice('2:4,3:5');
        pdl> ? vars
        PDL variables in package main::

        Name         Type   Dimension       Flow  State          Mem
        ----------------------------------------------------------------
        $x           Double D [7,7]                P            0.38Kb
        $y           Double D [3,3]                -C           0.00Kb

       Now, if I suddenly decide that $x should be flagged as possibly
       containing bad values, using

        pdl> $x->badflag(1)

       then I want the state of $y - its child - to be changed as well (since
       it will either share or inherit some of $x's data and so also be bad),
       so that I get a 'B' in the State field:

        pdl> ? vars
        PDL variables in package main::

        Name         Type   Dimension       Flow  State          Mem
        ----------------------------------------------------------------
        $x           Double D [7,7]                PB           0.38Kb
        $y           Double D [3,3]                -CB          0.00Kb

       This bit of magic is performed by the "propagate_badflag" function,
       which is listed below:

        /* newval = 1 means set flag, 0 means clear it */
        /* thanks to Christian Soeller for this */

        void propagate_badflag( pdl *it, int newval ) {
           PDL_DECL_CHILDLOOP(it)
           PDL_START_CHILDLOOP(it)
           {
               pdl_trans *trans = PDL_CHILDLOOP_THISCHILD(it);
               int i;
               for( i = trans->vtable->nparents;
                    i < trans->vtable->npdls;
                    i++ ) {
                   pdl *child = trans->pdls[i];

                   if ( newval ) child->state |=  PDL_BADVAL;
                   else          child->state &= ~PDL_BADVAL;

                   /* make sure we propagate to grandchildren, etc */
                   propagate_badflag( child, newval );

               } /* for: i */
           }
           PDL_END_CHILDLOOP(it)
        } /* propagate_badflag */

       Given a piddle ("pdl *it"), the routine loops through each "pdl_trans"
       structure, where access to this structure is provided by the
       "PDL_CHILDLOOP_THISCHILD" macro.  The children of the piddle are stored
       in the "pdls" array, after the parents, hence the loop from "i =
       ...nparents" to "i = ...npdls - 1".  Once we have the pointer to the
       child piddle, we can do what we want to it (here we change the value of
       the "state" variable, but the details are unimportant).  What is
       important is that we call "propagate_badflag" on this piddle, to ensure
       we loop through its children. This recursion ensures we get to all the
       offspring of a particular piddle.

       Access to parents is similar, with the "for" loop replaced by:

               for( i = 0;
                    i < trans->vtable->nparents;
                    i++ ) {
                  /* do stuff with parent #i: trans->pdls[i] */
               }
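
       The traversal can also be modelled without the PDL headers. The sketch
       below replaces the "PDL_*_CHILDLOOP" macros with a plain array of
       transformations and uses made-up names throughout ("node_pdl",
       "node_trans", "NODE_BADVAL", "node_propagate_badflag"); the real
       "pdl_children" member is organised differently, so treat this only as
       an illustration of the parents-then-children layout and the recursion.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_BADVAL 0x0400     /* assumption: illustrative bit, not PDL's value */
#define NODE_MAX    4          /* assumption: small fixed capacity for the toy */

struct node_trans;

typedef struct node_pdl {
    int state;
    struct node_trans *children[NODE_MAX]; /* trans in which this pdl is a parent */
    int nchildren;
} node_pdl;

typedef struct node_trans {
    int nparents;              /* pdls[0 .. nparents-1] are parents */
    int npdls;                 /* pdls[nparents .. npdls-1] are children */
    node_pdl *pdls[NODE_MAX];
} node_trans;

/* newval = 1 sets the flag on all descendants of it, 0 clears it. */
static void node_propagate_badflag(node_pdl *it, int newval) {
    assert(it != NULL);
    for (int t = 0; t < it->nchildren; t++) {
        node_trans *trans = it->children[t];
        for (int i = trans->nparents; i < trans->npdls; i++) {
            node_pdl *child = trans->pdls[i];
            if (newval) child->state |=  NODE_BADVAL;
            else        child->state &= ~NODE_BADVAL;
            node_propagate_badflag(child, newval);  /* grandchildren, etc */
        }
    }
}
```

       As in the listing above, the function touches only the descendants;
       the flag of the piddle it is called on is left to the caller.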

   What's in a transformation ("pdl_trans")
       All transformations are implemented as structures

         struct XXX_trans {
               int magicno; /* to detect memory overwrites */
               short flags; /* state of the trans */
               pdl_transvtable *vtable;   /* the all important vtable */
               void (*freeproc)(struct pdl_trans *);  /* Call to free this trans
                       (in case we had to malloc some stuff for this trans) */
               pdl *pdls[NP]; /* The pdls involved in the transformation */
               int __datatype; /* the type of the transformation */
               /* in general more members,
                * depending on the actual transformation (slice, add, etc)
                */
         };

       The transformation identifies all "pdl"s involved in the trans

         pdl *pdls[NP];

       with "NP" depending on the number of piddle args of the particular
       trans. It records a state

         short flags;

       and the datatype

         int __datatype;

       of the trans (to which all piddles must be converted unless they are
       explicitly typed; PDL functions created with PDL::PP make sure that
       these conversions are done as necessary). Most important is the pointer
       to the vtable (virtual table) that contains the actual functionality:

        pdl_transvtable *vtable;

       The vtable structure in turn looks something like (slightly simplified
       from pdl.h for clarity)

         typedef struct pdl_transvtable {
               pdl_transtype transtype;
               int flags;
               int nparents;   /* number of parent pdls (input) */
               int npdls;      /* total number of pdls: parents, then children */
               char *per_pdl_flags;  /* optimization flags */
               void (*redodims)(pdl_trans *tr);  /* figure out dims of children */
               void (*readdata)(pdl_trans *tr);  /* flow parents to children  */
               void (*writebackdata)(pdl_trans *tr); /* flow backwards */
               void (*freetrans)(pdl_trans *tr); /* Free both the contents of the
                                               trans and the trans itself */
               pdl_trans *(*copy)(pdl_trans *tr); /* Full copy */
               int structsize;
               char *name; /* For debuggers, mostly */
         } pdl_transvtable;

       We focus on the callback functions:

               void (*redodims)(pdl_trans *tr);

       "redodims" works out the dimensions of piddles that need to be
       created. It is called from within the API function that should be
       called to ensure that the dimensions of a piddle are accessible
       (pdlapi.c):

          void pdl_make_physdims(pdl *it)

       "readdata" and "writebackdata" are responsible for the actual
       computations of the child data from the parents or parent data from
       those of the children, respectively (the dataflow aspect).  The PDL
       core makes sure that these are called as needed when piddle data is
       accessed (lazy-evaluation). The general API function to ensure that a
       piddle is up-to-date is

         void pdl_make_physvaffine(pdl *it)

       which should be called before accessing piddle data from XS/C (see
       Core.xs for some examples).

       "freetrans" frees dynamically allocated memory associated with the
       trans as needed and "copy" can copy the transformation.  Again,
       functions built with PDL::PP make sure that copying and freeing via
       these callbacks happen at the right times. (If they fail to do that we
       have got a memory leak -- this has happened in the past ;).

       The transformation and vtable code is hardly ever written by hand but
       rather generated by PDL::PP from concise descriptions.
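
       Although real vtables are generated, a hand-written toy version may
       help to show how dispatch through the function pointers works. The
       sketch below is not PDL code: "mini_trans", "mini_vtable" and the
       "double it" transformation are invented for the example, and each
       piddle is reduced to a single double.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mini_trans mini_trans;

typedef struct {
    int nparents;                           /* number of inputs */
    int npdls;                              /* inputs plus outputs */
    void (*readdata)(mini_trans *tr);       /* flow parent to child */
    void (*writebackdata)(mini_trans *tr);  /* flow child back to parent */
    const char *name;
} mini_vtable;

struct mini_trans {
    const mini_vtable *vtable;
    double *parent;                         /* toy: one value per "piddle" */
    double *child;
};

/* A transformation whose child is twice its parent. */
static void dbl_readdata(mini_trans *tr)  { *tr->child  = 2.0 * *tr->parent; }
static void dbl_writeback(mini_trans *tr) { *tr->parent = *tr->child / 2.0; }

static const mini_vtable dbl_vtable = {
    1, 2, dbl_readdata, dbl_writeback, "double"
};

/* Generic code sees only the vtable, never the concrete functions. */
static void mini_update_child(mini_trans *tr) {
    assert(tr != NULL && tr->vtable != NULL);
    tr->vtable->readdata(tr);
}

static void mini_update_parent(mini_trans *tr) {
    assert(tr != NULL && tr->vtable != NULL);
    tr->vtable->writebackdata(tr);
}
```

       The two "mini_update_*" helpers play the role of the core: they can
       drive any transformation whose vtable fills in the same two slots.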

       Certain types of transformations can be optimized very efficiently
       obviating the need for explicit "readdata" and "writebackdata" methods.
       Those transformations are called pdl_vaffine. Most dimension
       manipulating functions (e.g., "slice", "xchg") belong to this class.

       The basic trick is that parent and child of such a transformation work
       on the same (shared) block of data which they just choose to interpret
       differently (by using different "dims", "dimincs" and "offs" on the
       same data, compare the "pdl" structure above).  Each operation on a
       piddle sharing data with another one in this way therefore flows
       automatically from child to parent and back -- after all they are
       reading and writing the same block of memory. This is currently not
       Perl thread safe -- no big loss since the whole PDL core is not
       reentrant (Perl threading "!=" PDL threading!).
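
       The shared-block idea can be sketched in a few lines for the
       one-dimensional case. "view1d", "view_slice" and "view_assign" are
       hypothetical names, not part of the PDL API; the sketch mirrors the
       earlier Perl example ($y = $x->slice("2:5"); $y .= 1;).

```c
#include <assert.h>

typedef struct {
    double   *data;   /* shared block; the parent owns it */
    long long offs;   /* index of this view's first element in data */
    long long inc;    /* step between consecutive elements */
    long long n;      /* number of elements seen through this view */
} view1d;

/* Child view covering parent elements first..last (inclusive). */
static view1d view_slice(view1d parent, long long first, long long last) {
    assert(first >= 0 && first <= last && last < parent.n);
    view1d child = parent;              /* same data pointer, same inc */
    child.offs = parent.offs + first * parent.inc;
    child.n = last - first + 1;
    return child;
}

/* Assign one value to every element of a view. */
static void view_assign(view1d v, double value) {
    for (long long i = 0; i < v.n; i++)
        v.data[v.offs + i * v.inc] = value;
}
```

       Assigning through the child changes the parent's elements 2 to 5
       directly, because no data was ever copied.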

   Signatures: threading over elementary operations
       Most of the functionality of PDL threading (automatic iteration of
       elementary operations over multi-dim piddles) is implemented in the
       file pdlthread.c.

       The PDL::PP generated functions (in particular the "readdata" and
       "writebackdata" callbacks) use this infrastructure to make sure that
       the fundamental operation implemented by the trans is performed in
       agreement with PDL's threading semantics.

   Defining new PDL functions -- Glue code generation
       Please see PDL::PP and the examples in the PDL distribution.
       Implementation and syntax are currently far from perfect but it does a
       good job!

   The Core struct
       As discussed in PDL::API, PDL uses a pointer to a structure to allow
       PDL modules access to its core routines. The definition of this
       structure (the "Core" struct) is in pdlcore.h (created by pdlcore.h.PL
       in Basic/Core) and looks something like

        /* Structure to hold pointers to core PDL routines so as to be used by
         * many modules
         */
        struct Core {
           I32    Version;
           pdl*   (*SvPDLV)      ( SV*  );
           void   (*SetSV_PDL)   ( SV *sv, pdl *it );
        #if defined(PDL_clean_namespace) || defined(PDL_OLD_API)
           pdl*   (*new)      ( );     /* make it work with gimp-perl */
        #else
           pdl*   (*pdlnew)      ( );  /* renamed because of C++ clash */
        #endif
           pdl*   (*tmp)         ( );
           pdl*   (*create)      (int type);
           void   (*destroy)     (pdl *it);
           ...
        };
        typedef struct Core Core;

       The first field of the structure ("Version") is used to ensure
       consistency between modules at run time; the following code is placed
       in the BOOT section of the generated xs code:

        if (PDL->Version != PDL_CORE_VERSION)
          Perl_croak(aTHX_ "Foo needs to be recompiled against the newly installed PDL");
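
       The pattern (a table of function pointers guarded by a version number)
       can be reduced to a toy that compiles without Perl. "ToyCore",
       "TOY_CORE_VERSION", "toy_add" and "toy_boot" are invented for this
       sketch; the real check croaks rather than returning NULL.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CORE_VERSION 7     /* assumption: arbitrary value for the sketch */

typedef struct {
    int Version;               /* checked before any pointer is used */
    int (*add)(int, int);      /* stands in for the real core routines */
} ToyCore;

static int toy_add(int a, int b) { return a + b; }

/* The single table instance that the "core" would export. */
static ToyCore toy_core = { TOY_CORE_VERSION, toy_add };

/* What a client's boot code does: accept the table only if versions match. */
static ToyCore *toy_boot(ToyCore *core, int compiled_against) {
    assert(core != NULL);
    return core->Version == compiled_against ? core : NULL;
}
```

       A client compiled against an older layout is rejected up front instead
       of calling through pointers whose positions may have shifted.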

       If you add a new field to the Core struct you should:

       •    discuss it on the pdl porters email list
            (pdl-devel@lists.sourceforge.net) [with the possibility of making
            your changes to a separate branch of the CVS tree if it's a change
            that will take time to complete]

       •    increase by 1 the value of the $pdl_core_version variable in
            pdlcore.h.PL. This sets the value of the "PDL_CORE_VERSION" C
            macro used to populate the Version field

       •    add documentation (e.g. to PDL::API) if it's a "useful" function
            for external module writers (as well as ensuring the code is as
            well documented as the rest of PDL ;)

BUGS

       This description is far from perfect. If you need more details or
       something is still unclear please ask on the pdl-devel mailing list
       (pdl-devel@lists.sourceforge.net).

AUTHOR

       Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), 2000 Doug
       Burke (djburke@cpan.org), 2002 Christian Soeller & Doug Burke, 2013
       Chris Marshall.

       Redistribution in the same form is allowed but reprinting requires a
       permission from the author.



perl v5.32.1                      2021-02-15                      INTERNALS(1)