HPL_pdfact(3) HPL Library Functions HPL_pdfact(3)

NAME
HPL_pdfact - recursive panel factorization.

SYNOPSIS
#include "hpl.h"

void HPL_pdfact( HPL_T_panel * PANEL );

DESCRIPTION
HPL_pdfact recursively factorizes a 1-dimensional panel of columns.
The RPFACT function pointer specifies the recursive algorithm to be
used: Crout, Left-looking, or Right-looking. NBMIN sets the recursive
stopping criterion in terms of the number of columns in the panel,
and NDIV specifies the number of subpanels into which each panel is
divided; usually a value of 2 is chosen. Finally, PFACT is a function
pointer specifying the non-recursive algorithm to be used on at most
NBMIN columns; here too one can choose between Crout, Left-looking,
and Right-looking. Empirical tests seem to indicate that values of 4
or 8 for NBMIN give the best results.
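
The interplay of NBMIN and NDIV can be illustrated with a small
stand-alone sketch. It is not the HPL implementation: the function
name split_panel and the even-split rule are assumptions made for
illustration only, and the library may size its subpanels differently.
The sketch merely shows a panel of n columns being divided into ndiv
pieces until each piece holds at most nbmin columns, which is the
point where a non-recursive kernel such as PFACT would take over.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of HPL: recursively split the n columns
 * starting at column j0 into ndiv subpanels until a piece has at most
 * nbmin columns. Returns the number of leaf panels, i.e. how many times
 * a non-recursive kernel would be invoked. */
static int split_panel(int j0, int n, int nbmin, int ndiv, int depth)
{
    assert(n >= 1 && nbmin >= 1 && ndiv >= 2);

    if (n <= nbmin) {
        /* Stopping criterion reached: print the leaf, indented by depth. */
        printf("%*sleaf: columns [%d, %d)\n", 2 * depth, "", j0, j0 + n);
        return 1;
    }

    int leaves = 0;
    int remaining = n;
    int parts = ndiv;
    while (remaining > 0) {
        /* Assumed rule: spread the remaining columns evenly over the
         * remaining parts, rounding up. */
        int w = (remaining + parts - 1) / parts;
        leaves += split_panel(j0, w, nbmin, ndiv, depth + 1);
        j0 += w;
        remaining -= w;
        if (parts > 1)
            parts--;
    }
    return leaves;
}
```

Under these assumptions, split_panel(0, 64, 8, 2, 0) halves a
64-column panel three times and reaches 8 leaves of 8 columns each.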

Bi-directional exchange is used to perform the swap::broadcast
operations at once for one column in the panel. This results in
fewer, slightly larger messages than usual. On P processes and
assuming bi-directional links, the running time of this function can
be approximated by (when N is equal to N0):

    N0 * log_2( P ) * ( lat + ( 2*N0 + 4 ) / bdwth ) +
    N0^2 * ( M - N0/3 ) * gam2-3

where M is the local number of rows of the panel, lat and bdwth are
the latency and bandwidth of the network for double precision real
words, and gam2-3 is an estimate of the Level 2 and Level 3 BLAS
rate of execution. The recursive algorithm does indeed come close to
achieving Level 3 BLAS performance in the panel factorization. On
many modern machines, however, this operation is latency bound, so
its cost can be estimated by the latency portion alone,
N0 * log_2(P) * lat. Mono-directional links will double this
communication cost.
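
The estimate above can be evaluated with a few lines of C. This is a
sketch under stated assumptions, not code from the library: the
function names are hypothetical, P is assumed to be a power of two so
that log_2(P) is an integer, and the units (lat in seconds, bdwth in
double precision words per second, gam2_3 in seconds per floating
point operation) are chosen here only so that the terms add up to a
time.

```c
#include <assert.h>

/* Hypothetical helper: exact log_2 of a power of two. */
static int log2_pow2(int p)
{
    assert(p > 0 && (p & (p - 1)) == 0);
    int lg = 0;
    while (p > 1) {
        p >>= 1;
        lg++;
    }
    return lg;
}

/* Hypothetical helper: evaluates
 *   N0 * log_2(P) * (lat + (2*N0 + 4) / bdwth) + N0^2 * (M - N0/3) * gam2_3
 * The communication term is doubled when the links are not
 * bi-directional (bidir == 0). */
static double pdfact_time_estimate(int n0, int m, int p, double lat,
                                   double bdwth, double gam2_3, int bidir)
{
    double n = (double)n0;
    double comm = n * log2_pow2(p) * (lat + (2.0 * n + 4.0) / bdwth);
    double comp = n * n * ((double)m - n / 3.0) * gam2_3;
    return (bidir ? comm : 2.0 * comm) + comp;
}

/* Latency-only portion, N0 * log_2(P) * lat, for the latency-bound case. */
static double pdfact_latency_term(int n0, int p, double lat)
{
    return (double)n0 * log2_pow2(p) * lat;
}
```

Comparing pdfact_latency_term with the full estimate for measured
values of lat, bdwth and gam2_3 gives a quick indication of whether a
given machine is in the latency-bound regime described above.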

ARGUMENTS
PANEL   (local input/output)          HPL_T_panel *
        On entry, PANEL points to the data structure containing the
        panel information.

SEE ALSO
HPL_dlocmax(3), HPL_dlocswpN(3), HPL_dlocswpT(3), HPL_pdmxswp(3),
HPL_pdpancrN(3), HPL_pdpancrT(3), HPL_pdpanllN(3), HPL_pdpanllT(3),
HPL_pdpanrlN(3), HPL_pdpanrlT(3), HPL_pdrpancrN(3), HPL_pdrpancrT(3),
HPL_pdrpanllN(3), HPL_pdrpanllT(3), HPL_pdrpanrlN(3),
HPL_pdrpanrlT(3).


HPL 2.2 February 24, 2016 HPL_pdfact(3)