HPL_pdfact(3)                HPL Library Functions               HPL_pdfact(3)

NAME

       HPL_pdfact - recursive panel factorization.

SYNOPSIS

       #include "hpl.h"

       void HPL_pdfact( HPL_T_panel * PANEL );

DESCRIPTION

       HPL_pdfact recursively factorizes a 1-dimensional panel of columns.
       The RPFACT function pointer specifies the recursive algorithm to be
       used: Crout, left-looking or right-looking.  NBMIN sets the stopping
       criterion of the recursion in terms of the number of columns in the
       panel, and NDIV specifies the number of subpanels each panel should
       be divided into; usually a value of 2 is chosen.  Finally, PFACT is
       a function pointer specifying the non-recursive algorithm to be used
       on at most NBMIN columns.  Here too, one can choose between Crout,
       left-looking and right-looking.  Empirical tests seem to indicate
       that values of 4 or 8 for NBMIN give the best results.
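
       The interplay of NBMIN and NDIV can be illustrated with a small
       sketch.  The function below is hypothetical and is not taken from the
       HPL sources: it assumes each panel is cut into NDIV subpanels of
       roughly equal width (rounded up), and that recursion stops as soon as
       a subpanel has at most NBMIN columns.  HPL itself may round subpanel
       widths differently.  The sketch only counts how many subpanels would
       be handed to the non-recursive PFACT algorithm.

```c
#include <assert.h>

/*
 * Hypothetical illustration, NOT HPL source code.
 * Counts the leaf subpanels (those passed to the non-recursive
 * factorization) when a panel of n columns is split recursively.
 * Assumed rule: split into ndiv pieces of width ceil(n / ndiv),
 * stop once a piece has at most nbmin columns.
 */
int count_leaf_panels(int n, int nbmin, int ndiv)
{
    assert(n >= 0 && nbmin >= 1 && ndiv >= 2);

    if (n <= nbmin) {
        /* Small enough: one non-recursive factorization (none if empty). */
        return n > 0 ? 1 : 0;
    }

    int width = (n + ndiv - 1) / ndiv;  /* ceil(n / ndiv), always < n here */
    int leaves = 0;

    for (int done = 0; done < n; done += width) {
        /* The last piece may be narrower than the others. */
        int cols = (n - done < width) ? (n - done) : width;
        leaves += count_leaf_panels(cols, nbmin, ndiv);
    }
    return leaves;
}
```

       Under these assumptions, a panel of 64 columns with NBMIN = 8 and
       NDIV = 2 is halved three times and ends in 8 leaf subpanels of 8
       columns each; with NBMIN = 4 the recursion goes one level deeper.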

       Bi-directional exchange is used to perform the swap::broadcast
       operations at once for one column in the panel.  This results in a
       lower number of slightly larger messages than usual.  On P processes
       and assuming bi-directional links, the running time of this function
       can be approximated by (when N is equal to N0):

          N0 * log_2( P ) * ( lat + ( 2*N0 + 4 ) / bdwth ) +
          N0^2 * ( M - N0/3 ) * gam2-3

       where M is the local number of rows of the panel, lat and bdwth are
       the latency and bandwidth of the network for double precision real
       words, and gam2-3 is an estimate of the Level 2 and Level 3 BLAS
       rate of execution.  The recursive algorithm indeed makes it possible
       to almost achieve Level 3 BLAS performance in the panel
       factorization.  On a large number of modern machines, this operation
       is however latency bound, meaning that its cost can be estimated by
       the latency portion alone, N0 * log_2(P) * lat.  Mono-directional
       links will double this communication cost.
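
       The estimate above can be evaluated with a few lines of C.  This is
       a sketch under stated assumptions, not part of HPL: the function
       names are hypothetical, log_2( P ) is rounded up to a whole number
       of exchange steps so that no math library is needed, and gam2-3 is
       interpreted as the time per floating-point operation so that the
       second term has units of time.

```c
#include <assert.h>

/* Hypothetical helper: smallest k such that 2^k >= p (assumed rounding). */
int ceil_log2(int p)
{
    assert(p >= 1);
    int steps = 0;
    long long reach = 1;
    while (reach < p) {
        reach *= 2;
        steps++;
    }
    return steps;
}

/*
 * Hypothetical helper evaluating the cost model quoted in the text:
 *   N0 * log_2(P) * (lat + (2*N0 + 4) / bdwth) + N0^2 * (M - N0/3) * gam2-3
 * lat   : network latency per message (time units)
 * bdwth : bandwidth in double precision words per time unit
 * gam23 : assumed time per floating-point operation (Level 2/3 BLAS)
 */
double pdfact_time_estimate(int n0, int m, int p,
                            double lat, double bdwth, double gam23)
{
    assert(bdwth > 0.0);
    double comm = n0 * ceil_log2(p) * (lat + (2.0 * n0 + 4.0) / bdwth);
    double comp = (double)n0 * n0 * (m - n0 / 3.0) * gam23;
    return comm + comp;
}

/* Latency-only estimate N0 * log_2(P) * lat for latency-bound machines. */
double pdfact_latency_estimate(int n0, int p, double lat)
{
    return n0 * ceil_log2(p) * lat;
}
```

       For example, with N0 = 3, M = 10, P = 4, lat = 1, bdwth = 2 and
       gam2-3 = 0.5 (arbitrary units chosen for easy arithmetic), the
       communication term is 3 * 2 * ( 1 + 10/2 ) = 36 and the computation
       term is 9 * ( 10 - 1 ) * 0.5 = 40.5, for a total of 76.5; the
       latency-only estimate is 3 * 2 * 1 = 6.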

ARGUMENTS

       PANEL   (local input/output)    HPL_T_panel *
               On entry,  PANEL  points to the data structure  containing  the
               panel information.

SEE ALSO

       HPL_dlocmax (3),  HPL_dlocswpN (3),  HPL_dlocswpT (3), HPL_pdmxswp (3),
       HPL_pdpancrN (3), HPL_pdpancrT (3), HPL_pdpanllN (3), HPL_pdpanllT (3),
       HPL_pdpanrlN (3),   HPL_pdpanrlT (3),   HPL_pdrpancrN (3),
       HPL_pdrpancrT (3),  HPL_pdrpanllN (3),  HPL_pdrpanllT (3),
       HPL_pdrpanrlN (3),  HPL_pdrpanrlT (3).
HPL 2.1                        October 26, 2012                  HPL_pdfact(3)