DATAFLOW(1) User Contributed Perl Documentation DATAFLOW(1)

NAME
PDL::Dataflow -- description of the dataflow implementation and
philosophy

SYNOPSIS
    pdl> $x = zeroes(10);
    pdl> $y = $x->slice("2:4:2");
    pdl> $y ++;
    pdl> print $x;
    [0 0 1 0 1 0 0 0 0 0]

DESCRIPTION
As of 2.079, this is now a description of the current implementation,
together with some design thoughts from its original author, Tuomas
Lukka.

Two-directional dataflow (which implements "->slice()" etc.) is fully
functional, as shown in the SYNOPSIS. One-way dataflow is implemented,
but with restrictions.

Just about any function which returns some subset of the values in some
ndarray will make a binding. In the SYNOPSIS, $y has effectively become
a window onto some sub-elements of $x. You can also define your own
routines that do different types of subsets. If you don't want $y to be
a window onto $x, you must do

    $y = $x->slice("some parts")->sever;

The "sever" destroys the "slice" transform, thereby turning off all
dataflow between the two ndarrays.
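
As a minimal sketch of the difference (the variable names $window and
$detached are only illustrative), the following script compares a bound
slice with a severed one, using only "zeroes", "slice" and "sever" as
described above:

```perl
use strict;
use warnings;
use PDL;

my $x = zeroes(10);

# A plain slice stays bound to its parent: writes flow back into $x.
my $window = $x->slice("2:4:2");

# A severed slice keeps its values but has no transform linking it to $x.
my $detached = $x->slice("2:4:2")->sever;

$window   += 1;    # flows back: elements 2 and 4 of $x become 1
$detached += 5;    # stays local: $x is not touched

print "x        = $x\n";
print "window   = $window\n";
print "detached = $detached\n";
```

Only the update made through $window should show up in $x; the update
to $detached stays in its own data.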

Type conversions
Type conversions work thanks to a two-way flowing transform that
implements them, particularly for supplied outputs of the "wrong" type
for the given transform:

    pdl> $a_bad = pdl double, '[1 BAD 3]';
    pdl> $b_float = zeroes float, 3;
    pdl> $a_bad->assgn($b_float); # could be written as $b_float .= $a_bad
    pdl> p $b_float->badflag;
    1
    pdl> p $b_float;
    [1 BAD 3]

You need to explicitly turn on one-way dataflow on an ndarray to
activate it for non-flowing operations, so

    pdl> $x = pdl 2,3,4;
    pdl> $x->doflow;
    pdl> $y = $x * 2;
    pdl> print $y;
    [4 6 8]
    pdl> $x->set(0,5);
    pdl> print $y;
    [10 6 8]

It is not possible to turn on backwards dataflow (such as is used by
"slice"-type operations), because there is no general way for PDL (or
maths, in fact) to know how to reverse most operations - consider "$z =
$x * $y", then adding one to $z.
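
The point about irreversibility can be sketched with ordinary,
non-flowing arithmetic (the values here are arbitrary): after adding one
to the product, there is no unique pair of inputs that it could be
mapped back to, and PDL leaves the inputs alone.

```perl
use strict;
use warnings;
use PDL;

my $x = pdl(2, 3, 4);
my $y = pdl(5, 6, 7);

my $z = $x * $y;    # element-wise product: 10, 18, 28
$z += 1;            # now 11, 19, 29: no unique way to split this
                    # back into an ($x, $y) pair

# The inputs are unchanged; nothing flows backwards through "*".
print "x = $x\n";
print "y = $y\n";
print "z = $z\n";
```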

Consider the following code:

    $u = sequence(3,3); $u->doflow;
    $v = ones(3,3); $v->doflow;
    $w = $u + $v; $w->doflow; # must turn on for each
    $y = $w + 1; $y->doflow;
    $x = $w->diagonal(0,1);
    $x += 50;
    $z = $w + 2;

What do $y and $z contain now?

    pdl> p $y
    [
     [52  3  4]
     [ 5 56  7]
     [ 8  9 60]
    ]
    pdl> p $z
    [
     [53  4  5]
     [ 6 57  8]
     [ 9 10 61]
    ]

What about when $u is changed and a recalculation is triggered? A
problem arises, in that PDL currently (as of 2.079) disallows (see
pdlapi.c), for normal transforms, output ndarrays with flow, or output
ndarrays with any parent with dataflow. So "$u++" throws an exception.
But it is currently possible to use "set", which is a sort of
micro-transform that calls (in the C API) "PDL.set" to mutate the data,
then "PDL.changed" to trigger flow updates:

    pdl> $u->set(1,1,90)
    pdl> p $y
    [
     [ 2  3  4]
     [ 5 92  7]
     [ 8  9 10]
    ]

You'll notice that while the setting of "1,1" (the middle) of $u
updated $y, the changes to $y that resulted from adding 50 to the
diagonal (via $x, and two-way flow) got lost. This is one-way flow.

In one-way flow context like the above, with:

    pdl> $y = $x * 2;

nothing will have been calculated at this point. Even the memory for
the contents of $y has not been allocated. Only the command

    pdl> print $y

will actually cause $y to be calculated. This is important to bear in
mind when doing performance measurements and benchmarks as well as when
tracking errors.

There is an explanation for this behaviour: it may save cycles but more
importantly, imagine the following:

    pdl> $x = pdl 2,3,4; $x->doflow;
    pdl> $y = pdl 5,6,7; $y->doflow;
    pdl> $c = $x + $y;
    pdl> $x->setdims([4]);
    pdl> $y->setdims([4]);
    pdl> print $c;

Now, if $c were evaluated between the two resizes, an error condition
of incompatible sizes would occur.

What happens in the current version is that resizing $x raises a flag
in $c, "PDL_PARENTDIMSCHANGED", and resizing $y just raises the same
flag again. When $c is next evaluated, the flags are checked and it is
found that a recalculation is needed.

Of course, lazy evaluation can sometimes make debugging more painful
because errors may occur somewhere where you'd not expect them.

This is one of the more intricate concepts of dataflow. In order to
make dataflow work like you'd expect, a rather strange concept must be
introduced: families. Let us make a diagram of the one-way flow example
- it uses a hypergraph because the transforms (with "+") are connectors
between ndarrays (with "*"):

           u*   *v
             \ /
              +(plus)
              |
       1*     *w
         \   /|\
          \ / | \
     (plus)+  |  +(diagonal)
           |  |  |
          y*  |  *x
              |
              | *1
              |/
              +(plus)
              |
             z*

This is what PDL actually has in memory after the first three lines.
When $x is changed, $w changes due to "diagonal" being a two-way
operation.

If you want flow from $w, you opt in using "$w->doflow" (as shown in
this scenario). If you don't want it, don't enable it. If you have it
but want to stop it, call "$ndarray->sever". That will destroy the
ndarray's "trans_parent" (here, a node marked with "+"), and as you can
visually tell, will stop changes flowing thereafter. If you want to
leave the flow operating, but get a copy of the ndarray at that point,
use "$ndarray->copy" - it will have the same data at that moment, but
have no flow relationships.
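
The following sketch contrasts "copy" and "sever" on a two-way "slice"
binding rather than on the one-way scenario in the diagram; the names
$view and $snapshot are illustrative only.

```perl
use strict;
use warnings;
use PDL;

my $w    = sequence(5);         # values 0 1 2 3 4
my $view = $w->slice("1:3");    # bound to $w through a slice transform

# copy: same data at this moment, but no flow relationships.
my $snapshot = $view->copy;

$view += 10;                    # flows back into $w
print "w after first update:  $w\n";
print "snapshot:              $snapshot\n";

# sever: destroy the transform; $view keeps its current values.
$view->sever;
$view += 100;                   # no longer reaches $w
print "w after sever:         $w\n";
print "view after sever:      $view\n";
```

The snapshot should keep the values it had when it was taken, and the
second update should appear only in $view.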

There is the start of a mechanism to bind events onto changed data,
intended to allow this to work:

    pdl> $x = pdl 2,3,4
    pdl> $y = $x + 1;
    pdl> $c = $y * 2;
    pdl> $c->bind( sub { print "A now: $x, C now: $c\n" } )
    pdl> PDL::dowhenidle();
    A now: [2,3,4], C now: [6 8 10]
    pdl> $x->set(0,1);
    pdl> $x->set(1,1);
    pdl> PDL::dowhenidle();
    A now: [1,1,4], C now: [4 4 10]

This hooks into PDL's "magic" which resembles Perl's, but does not
currently operate.

There would be many kinds of uses for this feature: self-updating
charts, for instance. It is not yet fully clear whether it would be
most useful to queue up changes (useful for doing asynchronously, e.g.
when idle), or to activate things immediately.

In the 2022 era of both GPUs and multiple cores, it is a pity that
Perl's dominant model remains single-threaded on CPU, but PDL can use
multi-cores for CPU processing (albeit controlled in a single-threaded
style) - see PDL::ParallelCPU. It is planned that PDL will gain the
ability to use GPUs, and there might be a way to hook that up albeit
probably with an event loop to "subscribe" to GPU events.

PDL implements nearly everything (except for XS oddities like "set")
using transforms which connect ndarrays. This includes data
transformations like addition, "slicing" to access/operate on subsets,
and data-type conversions (which have two-way dataflow, see "Type
conversions").

This does not currently include a resizing transformation, and
"setdims" mutates its input. This is intended to change.

AUTHOR
Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu). Same terms
as the rest of PDL.

perl v5.38.0 2023-07-21 DATAFLOW(1)