PARALLELCPU(1)        User Contributed Perl Documentation       PARALLELCPU(1)

NAME

       PDL::ParallelCPU - Parallel Processor MultiThreading Support in PDL
       (Experimental)

DESCRIPTION

       PDL has experimental support for splitting numerical processing
       across multiple parallel processor threads (pthreads), controlled
       with the set_autopthread_targ and set_autopthread_size functions.
       By taking advantage of multi-core and/or multi-processor machines,
       this can improve processing performance (by a factor of 2-4 or more
       in most cases).

SYNOPSIS

         use PDL;

         # Set target of 4 parallel pthreads to create, with a lower limit of
         #  5Meg elements for splitting processing into parallel pthreads.
         set_autopthread_targ(4);
         set_autopthread_size(5);

         $x = zeroes(5000,5000); # Create 25Meg element array

         $y = $x + 5; # Processing will be split up into multiple pthreads

         # Get the actual number of pthreads for the last
         #  processing operation.
         $actualPthreads = get_autopthread_actual();

         # Or compare these to see CPU usage (first one only 1 pthread, second one 10)
         # in the PDL shell:
         $x = ones(10,1000,10000); set_autopthread_targ(1); $y = sin($x)*cos($x); p get_autopthread_actual;
         $x = ones(10,1000,10000); set_autopthread_targ(10); $y = sin($x)*cos($x); p get_autopthread_actual;

Terminology

       The use of the term threading can be confusing with PDL, because it can
       refer to PDL threading, as defined in the PDL::Threading docs, or to
       processor multi-threading.

       To reduce confusion with the existing PDL threading terminology, this
       document uses pthreading to refer to processor multi-threading, which
       is the use of multiple processor threads to split up numerical
       processing into parallel operations.

Functions that control PDL PThreads

       This is a brief listing and description of the PDL pthreading
       functions; see the PDL::Core docs for detailed information.

       set_autopthread_targ
            Set the target number of processor threads (pthreads) for
            multi-threaded processing. Setting the target to 0 means that no
            pthreading will occur.

            See PDL::Core for details.

       set_autopthread_size
            Set the minimum size (in Meg-elements, i.e. units of 2**20
            elements) that the largest PDL involved in a function must have
            before auto-pthreading is performed. For small PDLs, it probably
            isn't worth starting multiple pthreads, so this function defines
            the threshold below which auto-pthreading won't be attempted.

            See PDL::Core for details.

       get_autopthread_actual
            Get the actual number of pthreads executed for the last PDL
            processing function.

            See PDL::get_autopthread_actual for details.

Global Control of PDL PThreading using Environment Variables

       PDL PThreading can be turned on globally, without modifying existing
       code, by setting the environment variables PDL_AUTOPTHREAD_TARG and
       PDL_AUTOPTHREAD_SIZE before running a PDL script. These environment
       variables are checked when PDL starts up, and the set_autopthread_targ
       and set_autopthread_size functions are called with the variables'
       values.

       For example, if the environment variable PDL_AUTOPTHREAD_TARG is set
       to 3 and PDL_AUTOPTHREAD_SIZE is set to 10, then any PDL script will
       run as if the following lines were at the top of the file:

        set_autopthread_targ(3);
        set_autopthread_size(10);
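
       The startup check described above can be modelled in plain Perl. This
       is only a sketch under stated assumptions: the two setter subs below
       are stand-ins that just record their argument (so the example runs
       without PDL), apply_autopthread_env is a hypothetical helper, and
       skipping a setter when its variable is undefined is an assumption
       rather than something the text states.

```perl
use strict;
use warnings;

# Stand-ins for the real PDL functions, so this sketch runs without PDL.
# They only record the value they were called with.
my %applied;
sub set_autopthread_targ { $applied{targ} = shift }
sub set_autopthread_size { $applied{size} = shift }

# Hypothetical helper modelling the startup check: each setter is called
# only if its environment variable is defined (an assumption of this sketch).
sub apply_autopthread_env {
    my ($env) = @_;
    set_autopthread_targ($env->{PDL_AUTOPTHREAD_TARG})
        if defined $env->{PDL_AUTOPTHREAD_TARG};
    set_autopthread_size($env->{PDL_AUTOPTHREAD_SIZE})
        if defined $env->{PDL_AUTOPTHREAD_SIZE};
    return;
}

# Equivalent of running a script with PDL_AUTOPTHREAD_TARG=3 and
# PDL_AUTOPTHREAD_SIZE=10 set in the environment.
apply_autopthread_env({ PDL_AUTOPTHREAD_TARG => 3, PDL_AUTOPTHREAD_SIZE => 10 });
print "targ=$applied{targ} size=$applied{size}\n";
```

       Run as written, this prints "targ=3 size=10". In a real script the
       values would come from %ENV and the setters from PDL itself.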

How It Works

       The auto-pthreading process works by analyzing threaded array
       dimensions in PDL operations and splitting up processing based on the
       thread dimension sizes and the desired number of pthreads (i.e. the
       pthread target, or pthread_targ). The offsets and increments that PDL
       uses to step through the data in memory are modified for each pthread
       so that each one sees a different set of data when performing
       processing.

       Example

        $x = sequence(20,4,3); # Small 3-D Array, size 20,4,3

        # Setup auto-pthreading:
        set_autopthread_targ(2); # Target of 2 pthreads
        set_autopthread_size(0); # Zero so that the small PDLs in this example will be pthreaded

        # This will be split up into 2 pthreads
        $c = maximum($x);

       For the above example, the maximum function has a signature of "(a(n);
       [o]c())", which means that the first dimension of $x (size 20) is a
       core dimension of the maximum function. The other dimensions of $x
       (sizes 4,3) are threaded dimensions (i.e. they will be threaded over
       in the maximum function).

       The auto-pthreading algorithm examines the threaded dims of size (4,3)
       and picks the size-4 dimension, since it is evenly divisible by the
       pthread target of 2. The processing of the maximum function is then
       split into two pthreads on the size-4 dimension, with dim indexes 0,2
       processed by one pthread and dim indexes 1,3 processed by the other
       pthread.
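
       The index assignment in the example above can be sketched in plain
       Perl. This only illustrates the interleaved split described in the
       text (indexes 0,2 to one pthread and 1,3 to the other); it is not
       PDL's internal code, and split_dim_indexes is a hypothetical helper
       name introduced for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: list the indexes along one threaded dimension that
# each pthread would handle, assuming an interleaved (round-robin) split.
sub split_dim_indexes {
    my ($dim_size, $n_pthreads) = @_;
    die "dimension size must be evenly divisible by the pthread count\n"
        if $dim_size % $n_pthreads;
    my @per_pthread;
    for my $index (0 .. $dim_size - 1) {
        push @{ $per_pthread[ $index % $n_pthreads ] }, $index;
    }
    return @per_pthread;
}

# The size-4 threaded dimension from the example, split over 2 pthreads.
my @split = split_dim_indexes(4, 2);
printf "pthread %d: %s\n", $_, join(',', @{ $split[$_] }) for 0 .. $#split;
```

       For the size-4 dimension and 2 pthreads this prints "pthread 0: 0,2"
       and "pthread 1: 1,3", matching the split described above.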

Limitations

   Must have POSIX Threads Enabled
       Auto-PThreading only works if your PDL installation was compiled with
       POSIX threads enabled. This is normally the case if you are running on
       Linux or other Unix variants.

   Non-Threadsafe Code
       Not all the libraries that PDL interfaces to are thread-safe, i.e. they
       aren't written to operate in a multi-threaded environment without
       crashing or causing side effects. Some examples in the PDL core are
       the fft function and the pnmout functions.

       To operate properly with these types of functions, the PPCode flag
       NoPthread has been introduced to mark a function as not being
       pthread-safe. See the PDL::PP docs for details.

   Size of PDL Dimensions and PThread Target
       Due to the way a PDL is split up for operation using multiple pthreads,
       the size of a dimension must be evenly divisible by the pthread target.
       For example, if a PDL has threaded dimension sizes of (4,3,3) and the
       pthread target has been set to 2, then the first threaded dimension
       (size 4) will be picked to be split up into two pthreads of size 2 and
       2. However, if the threaded dimension sizes are (3,3,3) and the
       pthread target is still 2, then pthreading won't occur, because no
       threaded dimensions are divisible by 2.

       The algorithm that picks the actual number of pthreads has some smarts
       (but could probably be improved) to adjust down from the pthread
       target to get a number of pthreads that can evenly divide one of the
       threaded dimensions. For example, if a PDL has threaded dimension
       sizes of (9,2,2) and the pthread target is 4, the algorithm will see
       that no dimension is divisible by 4, then adjust the target down to
       3, resulting in splitting up the first threaded dimension (size 9)
       into 3 pthreads.
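
       The adjust-down behaviour described above can be sketched in plain
       Perl. This is a simplified model written for illustration, not PDL's
       actual implementation; pick_pthread_split is a hypothetical helper
       name, and the rule of taking the first dimension that fits is an
       assumption chosen to reproduce the three cases in the text.

```perl
use strict;
use warnings;

# Hypothetical helper, not PDL's actual implementation: starting at the
# pthread target, step the candidate count down until it evenly divides
# one of the threaded dimension sizes. Returns (dimension index, count),
# or an empty list when no count of 2 or more fits.
sub pick_pthread_split {
    my ($target, @thread_dims) = @_;
    for (my $count = $target; $count >= 2; $count--) {
        for my $i (0 .. $#thread_dims) {
            return ($i, $count) if $thread_dims[$i] % $count == 0;
        }
    }
    return;
}

# The three cases discussed above: [target, threaded dimension sizes...].
for my $case ([2, 4, 3, 3], [2, 3, 3, 3], [4, 9, 2, 2]) {
    my ($target, @dims) = @$case;
    my ($dim, $count) = pick_pthread_split($target, @dims);
    printf "target %d, dims (%s): %s\n", $target, join(',', @dims),
        defined $dim ? "split dim $dim into $count pthreads" : "no pthreading";
}
```

       Under these assumptions the sketch reports a 2-way split of the first
       dimension for (4,3,3), no pthreading for (3,3,3), and a 3-way split of
       the first dimension for (9,2,2) with a target of 4.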

   Speed improvement might be less than you expect.
       If you have an 8-core machine and call set_autopthread_targ with 8 to
       generate 8 parallel pthreads, you probably won't get an 8X improvement
       in speed, due to memory bandwidth issues. Even though you have 8
       separate CPUs crunching away on data, you will have (for most common
       machine architectures) common RAM that now becomes your bottleneck. For
       simple calculations (e.g. simple additions) you can run into a
       performance limit at about 4 pthreads. For more complex calculations
       the limit will be higher.

COPYRIGHT

       Copyright 2011 John Cerney. You can distribute and/or modify this
       document under the same terms as the current Perl license.

       See: http://dev.perl.org/licenses/



perl v5.34.0                      2021-08-16                    PARALLELCPU(1)