HPL_plindx0(3)               HPL Library Functions              HPL_plindx0(3)

NAME

       HPL_plindx0 - Compute local swapping index arrays.

SYNOPSIS

       #include "hpl.h"

       void  HPL_plindx0(  HPL_T_panel * PANEL, const int K, int * IPID, int *
       LINDXA, int * LINDXAU, int * LLEN );

DESCRIPTION

       HPL_plindx0 computes two local arrays LINDXA and LINDXAU containing
       the local source and final destination positions resulting from the
       application of row interchanges.

       On entry, the array IPID of length K is such that the row of global
       index IPID(i) should be mapped onto the row of global index
       IPID(i+1). Let IA be the global index of the first row to be swapped.
       For k in [0..K/2), the row of global index IPID(2*k) should be mapped
       onto the row of global index IPID(2*k+1). The question, then, is to
       determine which rows should ultimately be part of U.
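
       As an illustration only (this sketch is not part of the HPL sources),
       the C fragment below walks the (source, destination) pairs stored in
       IPID and tests whether each destination falls within the block of N
       rows that forms U; the names classify_pairs, ia and n are assumptions
       of the example.

          /* Illustration only: classify the K/2 (src, dst) pairs of IPID;
           * ia and n stand for IA and N.                                  */
          static void classify_pairs( const int K, const int * IPID,
                                      const int ia, const int n )
          {
             int i;

             for( i = 0; i < K; i += 2 )
             {
                if( IPID[i+1] - ia < n )
                { /* row IPID[i] belongs to U, in row IPID[i+1] - ia of U */ }
                else
                { /* row IPID[i] goes back into A, in global row IPID[i+1] */ }
             }
          }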

       First, some rows of the process ICURROW may be swapped locally. One
       of these rows belongs to U, the other one belongs to my local piece
       of A. The other rows of the current block are swapped with remote
       rows and are thus not part of U. These rows, however, should be sent
       along and grabbed by the other processes as we progress in the
       exchange phase.

       So, assume that I am ICURROW and consider a row of global index
       IPID(2*i) that I own. If I own IPID(2*i+1) as well and IPID(2*i+1) -
       IA is less than N, this row is locally swapped and should be copied
       into U at the position IPID(2*i+1) - IA. No row will be exchanged for
       this one. If IPID(2*i+1) - IA is greater than or equal to N, then the
       row IPID(2*i) should be locally copied into my local piece of A at
       the position corresponding to the row of global index IPID(2*i+1).

       If the process ICURROW does not own IPID(2*i+1), then row IPID(2*i)
       is to be swapped away and, strictly speaking, does not belong to U,
       but to a remote piece of A. Since this process will, however, send
       the array U, this row is copied into U exactly where the row
       IPID(2*i+1) should go. For this, we search IPID for k1 such that
       IPID(2*k1) is equal to IPID(2*i+1); row IPID(2*i) is then to be
       copied into U at the position IPID(2*k1+1) - IA.
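
       A minimal C sketch of that search (illustration only, not the HPL
       implementation; the name find_k1 is an assumption of the example)
       could read:

          /* Find k1 such that IPID[2*k1] equals a given global row index
           * ig; the caller passes ig = IPID[2*i+1].                       */
          static int find_k1( const int * IPID, const int ig )
          {
             int k1 = 0;
             while( IPID[2*k1] != ig ) k1++;
             return( k1 );
          }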

       It is thus important to put the rows that go into U, i.e., those such
       that IPID(2*i+1) - IA is less than N, at the beginning of the array
       IPID. By doing so, U is formed and the local copy is performed in
       just one sweep.

       Two lists LINDXA and LINDXAU are built. LINDXA contains the local
       indexes of the rows I have that should be copied. LINDXAU contains
       the local destination information: if LINDXAU(k) >= 0, row LINDXA(k)
       of A is to be copied into U at position LINDXAU(k). Otherwise, row
       LINDXA(k) of A should be locally copied into A(-LINDXAU(k),:). In the
       process ICURROW, the initial packing algorithm proceeds as follows.

         for all entries in IPID,
62            if IPID(2*i) is in ICURROW,
63               if IPID(2*i+1) is in ICURROW,
64                  if( IPID(2*i+1) - IA < N )
65                   save corresponding local position
66                   of this row (LINDXA);
67                   save local position (LINDXAU) in U
68                   where this row goes;
69                   [copy row IPID(2*i) in U at position
70                   IPID(2*i+1)-IA; ];
71                  else
72                   save corresponding local position of
73                   this row (LINDXA);
74                   save local position (-LINDXAU) in A
75                   where this row goes;
76                   [copy row IPID(2*i) in my piece of A
77                   at IPID(2*i+1);]
78                  end if
79               else
80                  find k1 such that IPID(2*k1) = IPID(2*i+1);
81                  copy row IPID(2*i) in U at position
82                  IPID(2*k1+1)-IA;
83                  save corresponding local position of this
84                  row (LINDXA);
85                  save local position (LINDXAU) in U where
86                  this row goes;
87               end if
88            end if
89         end for
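
       The hedged C sketch below restates this packing loop. It is an
       illustration only, not the HPL implementation: in_curr_row() and
       glob2loc() are hypothetical stand-ins for an ownership test and a
       global-to-local row index conversion, and ia, n play the roles of IA
       and N.

          /* Illustration only -- not the HPL source. */
          extern int in_curr_row( int ig );   /* does ICURROW own row ig ? */
          extern int glob2loc( int ig );      /* global to local row index */

          static void pack_in_icurrow( const int K, const int * IPID,
                                       const int ia, const int n,
                                       int * LINDXA, int * LINDXAU )
          {
             int i, k1, nk = 0;

             for( i = 0; i < K; i += 2 )
             {
                if( !in_curr_row( IPID[i] ) ) continue;

                if( in_curr_row( IPID[i+1] ) )
                {
                   LINDXA[nk] = glob2loc( IPID[i] );
                   if( IPID[i+1] - ia < n )   /* row goes into U           */
                      LINDXAU[nk] =  IPID[i+1] - ia;
                   else                       /* row stays in my piece of A */
                      LINDXAU[nk] = -glob2loc( IPID[i+1] );
                }
                else
                {  /* row swapped away: place it where its image goes in U */
                   for( k1 = 0; IPID[2*k1] != IPID[i+1]; k1++ );
                   LINDXA [nk] = glob2loc( IPID[i] );
                   LINDXAU[nk] = IPID[2*k1+1] - ia;
                }
                nk++;
             }
          }

       The bracketed row copies shown in the pseudo-code (into U or into my
       piece of A) are omitted from the sketch for brevity.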

       Second, if I am not the current row process ICURROW, all source rows
       in IPID that I own are part of U. Indeed, they are swapped with one
       row of the current block of rows, and the main factorization
       algorithm proceeds one row after another. The processes different
       from ICURROW should exchange and accumulate those rows until they
       receive some data previously owned by the process ICURROW.

       In processes different from ICURROW, the initial packing algorithm
       proceeds as follows. Consider a row of global index IPID(2*i) that I
       own. When I receive the data previously owned by ICURROW, i.e., U,
       row IPID(2*i) should replace the row in U at position IPID(2*i+1)-IA,
       and this particular row of U should first be copied into my piece of
       A, at A(il,:), where il is the local row index corresponding to
       IPID(2*i). Now, initially, this row will be packed into workspace,
       say as the kth row of that work array. The following algorithm sets
       LINDXAU(k) to IPID(2*i+1)-IA, that is, the position in U where the
       row should be copied. LINDXA(k) stores the local index in A where
       this row of U should be copied, i.e., il.

         for all entries in IPID,
111            if IPID(2*i) is not in ICURROW,
112               copy row IPID(2*i) in work array;
113               save corresponding local position
114               of this row (LINDXA);
115               save position (LINDXAU) in U where
116               this row should be copied;
117            end if
118         end for
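
       As before, a hedged C sketch of this loop (illustration only, not the
       HPL implementation; my_row() and glob2loc() are hypothetical helpers
       for the local ownership test and the global-to-local row index
       conversion) could look as follows.

          /* Illustration only -- not the HPL source. */
          extern int my_row( int ig );     /* do I own global row ig ?     */
          extern int glob2loc( int ig );   /* global to local row index    */

          static void pack_not_icurrow( const int K, const int * IPID,
                                        const int ia,
                                        int * LINDXA, int * LINDXAU )
          {
             int i, nk = 0;

             for( i = 0; i < K; i += 2 )
             {
                if( !my_row( IPID[i] ) ) continue;
                /* the nk-th packed row replaces row IPID[i+1] - ia of the
                 * incoming U ...                                          */
                LINDXAU[nk] = IPID[i+1] - ia;
                /* ... and that row of U is first copied back into my piece
                 * of A at local row index glob2loc( IPID[i] ).            */
                LINDXA [nk] = glob2loc( IPID[i] );
                nk++;
             }
          }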

       While we are at it, we also globally figure out how many rows every
       process has. This is necessary because it would be rather cumbersome
       to figure it out on the fly during the bi-directional exchange phase.
       This information is kept in the array LLEN of size NPROW. Also note
       that the arrays LINDXA and LINDXAU are of maximal length 2*N.
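
       One plausible way of accumulating such counts is sketched below; this
       is an illustration only, owner_of() is a hypothetical helper mapping
       a global row index to the process row (in 0..NPROW-1) that owns it,
       and the exact counting rule used by HPL_plindx0 may differ.

          /* Illustration only -- not the HPL source. */
          extern int owner_of( int ig );  /* process row owning global row ig */

          static void count_rows( const int K, const int * IPID,
                                  const int nprow, int * LLEN )
          {
             int i;

             for( i = 0; i < nprow; i++ ) LLEN[i] = 0;
             for( i = 0; i < K; i += 2 )  LLEN[owner_of( IPID[i] )]++;
          }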

ARGUMENTS

       PANEL   (local input/output)    HPL_T_panel *
               On entry, PANEL points to the data structure containing the
               panel information.

       K       (global input)          const int
               On entry, K specifies the number of entries in IPID. K is at
               least 2*N, and at most 4*N.

       IPID    (global input)          int *
               On entry, IPID is an array of length K. The first K entries
               of that array contain the source and final destination
               positions resulting from the application of the interchanges.

       LINDXA  (local output)          int *
               On entry, LINDXA is an array of dimension 2*N. On exit, this
               array contains the local indexes of the rows of A I have that
               should be copied into U.

       LINDXAU (local output)          int *
               On entry, LINDXAU is an array of dimension 2*N. On exit, this
               array contains the local destination information encoded as
               follows. If LINDXAU(k) >= 0, row LINDXA(k) of A is to be
               copied into U at position LINDXAU(k). Otherwise, row
               LINDXA(k) of A should be locally copied into A(-LINDXAU(k),:).

       LLEN    (global output)         int *
               On entry, LLEN is an array of length NPROW. On exit, it
               contains how many rows every process has.

SEE ALSO

       HPL_pdlaswp00N (3), HPL_pdlaswp00T (3), HPL_pdlaswp01N (3),
       HPL_pdlaswp01T (3).



HPL 2.1                        October 26, 2012                 HPL_plindx0(3)