HPL_pdlaswp01T(3)          HPL Library Functions          HPL_pdlaswp01T(3)

HPL_pdlaswp01T - Broadcast a column panel L and swap the row panel U.

#include "hpl.h"

void HPL_pdlaswp01T( HPL_T_panel * PBCST, int * IFLAG, HPL_T_panel *
PANEL, const int NN );

HPL_pdlaswp01T applies the NB row interchanges to NN columns of the
trailing submatrix and broadcasts a column panel.
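
For orientation, the effect of applying NB row interchanges to NN
columns can be shown serially. This is a minimal sketch and not the
distributed code in HPL: it assumes column-major storage with leading
dimension lda and a LAPACK-style pivot vector in which row i is
swapped with row ipiv[i], in order. The name apply_row_swaps and all
of its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Serial illustration only (hypothetical helper, not an HPL routine).
 * Swaps row i with row ipiv[i], for i = 0 .. nb-1 in order, across the
 * first nn columns of the column-major matrix a (leading dimension lda).
 * Pivot indices are assumed to be 0-based and smaller than lda. */
void apply_row_swaps(double *a, int lda, int nb, int nn, const int *ipiv)
{
    assert(a != NULL && ipiv != NULL);
    assert(lda >= 1 && nb >= 0 && nn >= 0);

    for (int i = 0; i < nb; ++i) {
        int p = ipiv[i];
        assert(p >= 0 && p < lda);
        if (p == i) {
            continue;               /* nothing to exchange for this row */
        }
        for (int j = 0; j < nn; ++j) {
            size_t col = (size_t)j * (size_t)lda;
            double tmp = a[col + (size_t)i];
            a[col + (size_t)i] = a[col + (size_t)p];
            a[col + (size_t)p] = tmp;
        }
    }
}
```

In HPL the rows involved live on different process rows, so the
exchange is carried out with messages rather than local copies.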

A "Spread then roll" algorithm performs the swap :: broadcast of the
row panel U at once, resulting in a minimal communication volume and
a "very good" use of the connectivity if available. With P process
rows and assuming bi-directional links, the running time of this
function can be approximated by:

   (log_2(P)+(P-1)) * lat + K * NB * LocQ(N) / bdwth

where NB is the number of rows of the row panel U, N is the global
number of columns being updated, and lat and bdwth are the latency
and bandwidth of the network for double precision real words. K is a
constant in (2,3] that depends on the achieved bandwidth during a
simultaneous message exchange between two processes. An empirical
optimistic value of K is typically 2.4.
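
The estimate above can be evaluated with a few lines of C. This is a
sketch under stated assumptions rather than anything provided by HPL:
the function name and parameters are hypothetical, lat is taken in
seconds, bdwth in double precision words per second, and locq_n is
assumed to stand for LocQ(N), the number of columns held locally.
log_2(P) is rounded up to a whole number of steps, which matches the
formula exactly only when P is a power of two.

```c
#include <assert.h>

/* Smallest s such that 2^s >= p (assumption: log_2(P) is rounded up). */
static int ceil_log2(int p)
{
    int steps = 0;
    long reach = 1;
    while (reach < (long)p) {
        reach *= 2;
        ++steps;
    }
    return steps;
}

/* Hypothetical helper evaluating
 *   (log_2(P) + (P-1)) * lat + K * NB * LocQ(N) / bdwth
 * p      : number of process rows
 * nb     : number of rows of the row panel U
 * locq_n : local number of columns being updated (assumed meaning of LocQ(N))
 * lat    : network latency in seconds (assumed unit)
 * bdwth  : bandwidth in double precision words per second (assumed unit)
 * k      : constant in (2,3]; the text quotes 2.4 as an optimistic value */
double spread_roll_time_estimate(int p, int nb, int locq_n,
                                 double lat, double bdwth, double k)
{
    assert(p >= 1 && nb >= 0 && locq_n >= 0);
    assert(lat >= 0.0 && bdwth > 0.0);

    double latency_term = (double)(ceil_log2(p) + (p - 1)) * lat;
    double volume_term  = k * (double)nb * (double)locq_n / bdwth;
    return latency_term + volume_term;
}
```

For example, with P = 4 the latency term counts 2 + 3 = 5 message
start-ups, while the volume term is independent of P.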

PBCST   (local input/output)          HPL_T_panel *
        On entry, PBCST points to the data structure containing the
        panel (to be broadcast) information.

IFLAG   (local input/output)          int *
        On entry, IFLAG indicates whether or not the broadcast has
        already been completed. If not, probing will occur, and the
        outcome will be contained in IFLAG on exit.

PANEL   (local input/output)          HPL_T_panel *
        On entry, PANEL points to the data structure containing the
        panel information.

NN      (local input)                 const int
        On entry, NN specifies the local number of columns of the
        trailing submatrix to be swapped and broadcast starting at
        the current position. NN must be at least zero.
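
The IFLAG contract described above (check whether the broadcast is
already complete, probe if it is not, and report the outcome on exit)
can be pictured with the following sketch. Every name in it is a
hypothetical stand-in and none is an HPL identifier: BCAST_PENDING and
BCAST_DONE model the two states, fake_bcast models a pending broadcast
that is assumed to finish after a fixed number of probes, and
swap_step models one call that follows the contract.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state values; HPL's own constants may differ. */
enum { BCAST_PENDING = 0, BCAST_DONE = 1 };

/* Hypothetical stand-in for a broadcast still in flight. */
typedef struct {
    int probes_left;   /* probes needed before the broadcast completes */
} fake_bcast;

/* Make one unit of progress and report the state through *iflag. */
void probe_broadcast(fake_bcast *b, int *iflag)
{
    assert(b != NULL && iflag != NULL);
    if (b->probes_left > 0) {
        b->probes_left--;
    }
    *iflag = (b->probes_left == 0) ? BCAST_DONE : BCAST_PENDING;
}

/* One call following the documented contract: probe only when the
 * broadcast has not already completed, and leave the outcome in *iflag. */
void swap_step(fake_bcast *b, int *iflag)
{
    assert(iflag != NULL);
    if (*iflag != BCAST_DONE) {
        probe_broadcast(b, iflag);
    }
    /* The row swap itself would be performed here. */
}
```

A caller would pass the same flag to successive calls and stop
probing once it reports completion.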

HPL_pdgesv (3), HPL_pdgesvK2 (3), HPL_pdupdateNT (3),
HPL_pdupdateTT (3), HPL_pipid (3), HPL_plindx1 (3), HPL_plindx10 (3),
HPL_spreadT (3), HPL_equil (3), HPL_rollT (3), HPL_dlaswp10N (3),
HPL_dlaswp01T (3), HPL_dlaswp06T (3).

HPL 2.2                    February 24, 2016              HPL_pdlaswp01T(3)