intro_blas1(l)

1INTRO_BLAS1(l)                                                  INTRO_BLAS1(l)
2
3
4

NAME

6       INTRO_BLAS1  -  Introduction  to  vector-vector linear algebra (matrix)
7       subprograms
8

DESCRIPTION

10       The Level 1 BLAS perform basic vector-vector operations.  The following
11       three types of vector-vector operations are available:
12
13       Routines  for scaling, copying, swapping, and computing linear combina‐
14       tion of vectors.
15
16       Routines for computing dot products between vectors and various  vector
17       norms.
18
19       Routines for generating or applying plane or modified plane rotations.
20
21       The  Basic  Linear Algebra Subprograms (BLAS) were developed to enhance
22       the portability of published linear algebra codes.  Because these  sub‐
23       programs  are  portable,  modular, self-documenting, and efficient, you
24       can incorporate them into your programs.
25
26       To realize the full power of the BLAS you must understand the following
27       three subjects:
28
29       - FORTRAN storage of arrays
30
31       - FORTRAN array argument association
32
33       - BLAS indexing conventions
34
35   FORTRAN storage of arrays
36       Arrays  in  FORTRAN  are stored in column major order.  This means that
37       the eariler indexes in an array declaration toggle first.  Consider the
38       following specifications:
39
40       DIMENSION A(N1,N2),B(N3)
41       EQUIVALENCE (A,B)
42
43       where  N3  =  N1  * N2.  Then A(I,J) is associated with the same memory
44       location as B(K) where
45
46       K = I + (J-1) * N1
47
48       This means that successive elements of a column of A  are  adjacent  in
49       memory,  while successive elements of a row of A are stored with a dif‐
50       ference of N1 storage units between them.  Remember that the size of  a
51       storage unit depends on the data type.
52
53   FORTRAN array argument association
54       When  a FORTRAN subprogram is called with an array element as argument,
55       the value is not passed.  Instead, the subprogram receives the  address
56       in memory of the element.  Consider the following code segment:
57
58       M=11
59       N=13
60       REAL A(M,N)
61       COL = 3
62       CALL SUBR (A(1,COL),M)
63        .
64        .
65        .
66       SUBROUTINE SUBR (X,N)
67       REAL X(N)
68        .
69        .
70        .
71
72       In  this example, the subroutine SUBR is given the address of the first
73       element of the third column of A.  Because it treats that argument as a
74       one-dimensional  array, successive elements X(1), X(2), ..., occupy the
75       same memory locations as successive elements of the third column of  A:
76       that  is,  A(1,3), A(2,3), ....  Hence, the entire third column of A is
77       available to the subprogram.
78
79   BLAS Indexing Conventions
80       The rest of this section describes the  topics  of  manipulating  array
81       sections, dealing with stride arguments, and handling backward storage.
82
83       A vector description in BLAS is defined by three quantities:
84
85       -  array or starting element within an array, for instance the variable
86       X or X(I,J)
87
88       - vector length or number of elements, for instance the variable N
89
90       - the increment, sometimes called the stride, that defines  the  number
91       of  storage  units between successive vector elements, for instance the
92       variable INCX.
93
94       The notation for describing a BLAS vector in calling a BLAS  subroutine
95       is  the triad (N,X,INCX).  A few very brief examples follow.  If X is a
96       one dimensional array of length  N,  then  (N,X,1)  represents  forward
97       storage  of  X i.e. X(1), X(2), ..., X(N) and (N,X,-1) represents back‐
98       ward storage of X i.e. X(N), X(N-1), ..., X(1).  If A  is  an  M  by  N
99       array,  then  (M,A(1,J),1)  represents column J and (N,A(I,1),M) repre‐
100       sents row I.  Finally, if an M by N matrix is  embedded  in  the  upper
101       left-hand  corner  of  an array B of size LDB by NMAX, then column J is
102       (M,B(1,J),1) and row I is (N,B(I,1),LDB).  More specific  details  fol‐
103       low.
104
105   Forward Storage
106       As  an  example of the BLAS vector declaration using the above, suppose
107       that X represents a declared real array.  Let N be  the  vector  length
108       and  let  INCX  be the increment.  Suppose that a logical vector x with
109       components x(i), i = 1, 2,..., N, is to be stored in X.  If INCX >=  0,
110       then  x(i)  is stored in X(1 + (I-1) * INCX).  This is known as forward
111       array storage starting at X(1) with stride equal to INCX,  ending  with
112       X(1 - (N-1) * INCX).  Thus, if N = 4 and INCX = 2, the logical vector x
113       with components x(1), x(2), x(3), and x(4) are stored in memory in  the
114       array elements X(1), X(3), X(5), and X(7), respectively.
115
116       This  method  of  indexing,  using a starting element, a number of ele‐
117       ments, and a stride, is especially useful for accessing one-dimensional
118       vectors in multidimensional arrays.  For instance, if A is defined as
119
120       REAL A(M,N)
121
122       Then  to  access  the  2nd  row  of A, one uses forward storage with an
123       stride of M.  Thus a BLAS routine call with
124
125       X=A(2,1)
126
127       and increment/stride of
128
129       INCX=M
130
131       will access A(2,i) for i = 1,2,...,N.  To access the third column of  A
132       in a BLAS routine call with
133
134       X=A(1,3)
135
136       and increment/stride of
137
138       INCX=1
139
140       This  approach also works with multidimensional arrays.  As an example,
141       if A is defined as
142
143       REAL A(M,N,P)
144
145       to access the P elements of A at row 3 and column 4 one  could  call  a
146       BLAS routine with starting address X of
147
148       X=A(3,4,1)
149
150       and increment/stride of
151
152       INCX=M*N
153
154   Backward Storage
155       Some  BLAS routines permit backward storage of vectors, which is speci‐
156       fied by using a negative increment INCX.  If INCX <  0,  then  x(i)  is
157       stored  "backwards" in X.  Specifically x(i) is stored in X(1 + (N-I) *
158       |INCX|) or equivalently in X(1 - (N-I) * INCX).  This is  called  back‐
159       ward  storage  starting  from  X(1 - (N-1) * INCX) with stride equal to
160       INCX, ending with X(1).  Thus, if N = 4 and INCX = -2, the logical vec‐
161       tor  components x(1), x(2), x(3), and x(4) are stored in the array ele‐
162       ments X(7), X(5), X(3), and X(1), respectively.
163
164       Note: INCX = 0 is permitted by some BLAS routines and is not  permitted
165       by  others.   When  it  is allowed, it means that logical vector x is a
166       vector of length N, all whose components are equal the value of X(1).
167
168   Further Stride Examples
169       The following examples illustrate how to  use  increment  arguments  to
170       perform  different operations with the same subprogram.  These examples
171       use the BLAS function SDOT, with the following declarations:
172
173       INTEGER*4 N,INCX,INCY
174       REAL*4 SDOT,S,X(1+(N-1)*|INCX|),Y(1+(N-1)*|INCY|)
175       S = SDOT (N, X,INCX, Y,INCY)
176
177       This  sets  S  to  the  dot  product  of  the  vectors  (N,X,INCX)  and
178       (N,Y,INCY).
179
180       Example  1:  Compute  the  dot  product  T  =  X(1)*Y(1)  + X(2)*Y(2) +
181       X(3)*Y(3) + X(4)*Y(4):
182
183       REAL*4 SDOT,T,X(4),Y(4)
184       T = SDOT (4, X,1, Y,1)
185
186       Example 2:  Compute  the  convolution  T  =  X(1)*Y(4)  +  X(2)*Y(3)  +
187       X(3)*Y(2) + X(4)*Y(1):
188
189       REAL*4 SDOT,T,X(4),Y(4)
190       T = SDOT (4, X,1, Y,-1)
191
192       Example  3:  Compute the dot product Y(2) = A(2,1)*X(1) + A(2,2)*X(2) +
193       A(2,3)*X(3), which is the dot product of the second row of an  M  by  3
194       matrix A, stored in a 10 by 3 array, with a 3-element vector X:
195
196       INTEGER*4 N,LDA
197       PARAMETER (LDA = 10)
198       REAL*4 SDOT,A(LDA,3),X(3),Y(LDA)
199       N = 3
200       Y(2) = SDOT (N, A(2,1),LDA, X,1)
201
202   BLAS Data Types
203       The following data types are used in the BLAS routines:
204
205       -  REAL: Fortran "real" data type, 32-bit floating point; these routine
206       names begin with S.
207
208       - COMPLEX: Fortran "complex"  data  type,  two  32-bit  floating  point
209       reals; these routine names begin with C.
210
211       - DOUBLE PRECISION: Fortran "double precision" data type, 64-bit float‐
212       ing point; these routine names begin with D.
213
214       - DOUBLE COMPLEX: Fortran "double complex" data type, two 64-bit float‐
215       ing point doubles; these routine names begin with Z.
216
217   BLAS Naming Conventions
218       The  following  table  describes  the naming conventions for these rou‐
219       tines:
220
221       -------------------------------------------------------------
222                                                         64-bit
223                                                         complex
224                             64-bit real                 (double
225                             (double       32-bit        complex
226               32-bit real   precision)    complex       precision)
227       -------------------------------------------------------------
228       form:   Sname         Dname        Cname         Zname
229       example:SAXPY         DAXPY        CAXPY         ZAXPY
230       -------------------------------------------------------------
231
232   FORTRAN type declaration for functions
233       Always declare the data type of external functions.  Declaring the data
234       type  of  the complex Level 1 BLAS functions is particularily important
235       because, based on the first letter of their names and the Fortran  data
236       typing rules, the default implied data type would be REAL.
237                                                                 Fortran  type
238       declarations for function names follow:
239
240       Type                  Function Name
241
242       REAL                  SASUM, SCASUM, SCNRM2, SDOT, SNRM2, SSUM
243
244       COMPLEX               CDOTC, CDOTU, CSUM
245
246       DOUBLE PRECISION      DASUM, DZASUM, DDOT, DNRM2, DZNRM2, DSUM
247
248       DOUBLE COMPLEX        ZDOTC, ZDOTU, ZSUM
249
250       INTEGER               ISAMAX, IDAMAX, ICAMAX, IZAMAX, ISAMIN, IDAMIN,
251                             ISMAX, IDMAX, ISMIN, IDMIN
252
253   Summary Table of Level 1 BLAS Routines
254       The following table contains the purpose, operation, and name  of  each
255       Level  1  BLAS  routine.   The  first routine name listed in each table
256       block is the name of the manual page that  contains  documentation  for
257       any  routines listed in that block.  The routines marked with an aster‐
258       isk (*) are extensions to the standard set of Level  1  BLAS  routines.
259       For  the  complete details about each operation, see the individual man
260       pages.  Note: functions marked with an asterisk [*] are  extensions  to
261       the  standard  set  of Level 1 BLAS routines that may not be present on
262       all systems.
263
264       The man(1) command can find a man page online by either the real,  com‐
265       plex, double precision, or double complex name.
266
267       --------------------------------------------------------------
268       Purpose                  Operation
269       --------------------------------------------------------------
270       Sums the absolute                          n             SASUM
271       values of the elements   sasum <- ||x|| = Sum |x |       DASUM
272       of a real vector (also                 1  i=1   i
273       called the l1 norm)
274
275       Sums the absolute        scasum <- ||Real[x]||  +        SCASUM
276       values of the real and                        1          DZASUM
277       imaginary parts of the            ||Imag[x]||  =
278       elements of a complex                        1
279       vector                             n
280                                         Sum |Real[x ]| +
281                                         i=1        i
282
283                                          n
284                                         Sum |Imag[x ]|
285                                         i=1        i
286
287       Adds a scalar multiple   y <- alpha*x + beta*y           SAXPBY*
288       of a real or complex                                     DAXPBY*
289       vector to a scalar                                       CAXPBY*
290       multiple of another                                      ZAXPBY*
291       vector
292
293       Adds a scalar multiple   y <- alpha*x + y                SAXPY
294       of a real or complex                                     DAXPY
295       vector to another                                        CAXPY
296       vector                                                   ZAXPY
297
298       Copies a real or         y <- x                          SCOPY
299       complex vector into                                      DCOPY
300       another vector                                           CCOPY
301                                                                ZCOPY
302
303       Computes a dot product            T       n              SDOT
304       of two real or complex   sdot <- x  y  = Sum x y         DDOT
305       vectors                                  i=1  i i
306
307                                          H      n  _           CDOTC
308                                cdotc <- x  y = Sum x y         ZDOTC
309                                                i=1  i i
310
311                                          T       n             CDOTU
312                                cdotu <- x  y  = Sum x y        ZDOTU
313                                                 i=1  i i
314
315       Computes the Hadamard    z(i):=alpha x(i) y(i) + beta    SHAD*
316       product of two vectors   z(i)                            DHAD*
317                                                                CHAD*
318                                                                ZHAD*
319
320       Computes the Euclidean   snrm2 <- ||x|| =                SNRM2
321       norm (also called l2                   2                 DNRM2
322       norm) of a real or             n     2
323       complex vector           sqrt(Sum (x  )
324                                     i=1   i
325
326                                scnrm2 <- ||x|| =               SCNRM2
327                                               2                DZNRM2
328                                      n   _
329                                sqrt(Sum (x  x )
330                                     i=1   i  i
331
332
333       Applies a real plane                                     CSROT*
334       rotation to a pair of                                    ZDROT*
335       complex vectors
336
337       Applies an orthogonal                                    SROT
338       plane rotation                                           DROT
339
340       Constructs a Givens                                      SROTG
341       plane rotation                                           DROTG
342                                                                CROTG*
343                                                                ZROTG*
344
345       Applies a modified                                       SROTM
346       Givens plane rotation                                    DROTM
347
348       Constructs a modified                                    SROTMG
349       Givens plane rotation                                    DROTMG
350
351       Scales a real or         x <- alpha x                    SSCAL
352       complex vector                                           DSCAL
353                                                                CSCAL
354                                                                ZSCAL
355                                                                CSSCAL
356                                                                ZDSCAL
357
358       Sums the elements of a           n                       SSUM*
359       real or complex vector   sum <- Sum x                    DSUM*
360                                       i=1  i                   CSUM*
361                                                                ZSUM*
362
363       Swaps two real or two    x <-> y                         SSWAP
364       complex vectors                                          DSWAP
365                                                                CSWAP
366                                                                ZSWAP
367
368       Searches a vector for    isamax <- MAX |x |              ISAMAX
369       the first occurrence of                  j               IDAMAX
370       the maximum absolute                                     ICAMAX
371       value                                                    IZAMAX
372
373       Searches a vector for    isamin <- MIN |x |              ISAMIN*
374       the first occurrence of                  j               IDAMIN*
375       the minimum absolute
376       value
377
378       Searches a vector for    ismax <- MAX  x                 ISMAX*
379       the first occurrence of                 j                IDMAX*
380       the minimum absolute
381       value
382
383       Searches a vector for    ismin <- MIN x                  ISMIN*
384       the first occurrence of                j                 IDMIN*
385       the minimum absolute
386       value
387       --------------------------------------------------------------
388
389       In addition to the mathematical functions defined above, several search
390       functions are a part of Level 1 BLAS; these functions are listed below:
391
392       ISAMAX, ICAMAX, ISAMIN*, ISMAX*, ISMIN*
393       IDAMAX  IZAMAX, IDAMIN*, IDMAX*, IDMIN*
394

TO DO

396       Many of the stared functions have not been implemented yet  in  a  free
397       software.
398

REFERENCES

403       Lawson, C., Hanson, R., Kincaid, D., and Krogh, F., "Basic Linear Alge‐
404       bra Subprograms for Fortran Usage," ACM  Transactions  on  Mathematical
405       Software, 5 (1979), pp. 308 - 325.
406

IMPLEMENTATION

408       See  the individual man pages for implementation details and full argu‐
409       ment listings
410

AUTHOR

412       John L. Weatherwax
413
414
415
416                                 12 August 05                   INTRO_BLAS1(l)

NAME

DESCRIPTION

TO DO

SEE ALSO

REFERENCES

IMPLEMENTATION

AUTHOR