1INTRO_BLAS1(l) INTRO_BLAS1(l)
2
3
4
6 INTRO_BLAS1 - Introduction to vector-vector linear algebra (matrix)
7 subprograms
8
10 The Level 1 BLAS perform basic vector-vector operations. The following
11 three types of vector-vector operations are available:
12
13 Routines for scaling, copying, swapping, and computing linear combina‐
14 tion of vectors.
15
16 Routines for computing dot products between vectors and various vector
17 norms.
18
19 Routines for generating or applying plane or modified plane rotations.
20
21 The Basic Linear Algebra Subprograms (BLAS) were developed to enhance
22 the portability of published linear algebra codes. Because these sub‐
23 programs are portable, modular, self-documenting, and efficient, you
24 can incorporate them into your programs.
25
26 To realize the full power of the BLAS you must understand the following
27 three subjects:
28
29 - FORTRAN storage of arrays
30
31 - FORTRAN array argument association
32
33 - BLAS indexing conventions
34
35 FORTRAN storage of arrays
36 Arrays in FORTRAN are stored in column major order. This means that
37 the eariler indexes in an array declaration toggle first. Consider the
38 following specifications:
39
40 DIMENSION A(N1,N2),B(N3)
41 EQUIVALENCE (A,B)
42
43 where N3 = N1 * N2. Then A(I,J) is associated with the same memory
44 location as B(K) where
45
46 K = I + (J-1) * N1
47
48 This means that successive elements of a column of A are adjacent in
49 memory, while successive elements of a row of A are stored with a dif‐
50 ference of N1 storage units between them. Remember that the size of a
51 storage unit depends on the data type.
52
53 FORTRAN array argument association
54 When a FORTRAN subprogram is called with an array element as argument,
55 the value is not passed. Instead, the subprogram receives the address
56 in memory of the element. Consider the following code segment:
57
58 M=11
59 N=13
60 REAL A(M,N)
61 COL = 3
62 CALL SUBR (A(1,COL),M)
63 .
64 .
65 .
66 SUBROUTINE SUBR (X,N)
67 REAL X(N)
68 .
69 .
70 .
71
72 In this example, the subroutine SUBR is given the address of the first
73 element of the third column of A. Because it treats that argument as a
74 one-dimensional array, successive elements X(1), X(2), ..., occupy the
75 same memory locations as successive elements of the third column of A:
76 that is, A(1,3), A(2,3), .... Hence, the entire third column of A is
77 available to the subprogram.
78
79 BLAS Indexing Conventions
80 The rest of this section describes the topics of manipulating array
81 sections, dealing with stride arguments, and handling backward storage.
82
83 A vector description in BLAS is defined by three quantities:
84
85 - array or starting element within an array, for instance the variable
86 X or X(I,J)
87
88 - vector length or number of elements, for instance the variable N
89
90 - the increment, sometimes called the stride, that defines the number
91 of storage units between successive vector elements, for instance the
92 variable INCX.
93
94 The notation for describing a BLAS vector in calling a BLAS subroutine
95 is the triad (N,X,INCX). A few very brief examples follow. If X is a
96 one dimensional array of length N, then (N,X,1) represents forward
97 storage of X i.e. X(1), X(2), ..., X(N) and (N,X,-1) represents back‐
98 ward storage of X i.e. X(N), X(N-1), ..., X(1). If A is an M by N
99 array, then (M,A(1,J),1) represents column J and (N,A(I,1),M) repre‐
100 sents row I. Finally, if an M by N matrix is embedded in the upper
101 left-hand corner of an array B of size LDB by NMAX, then column J is
102 (M,B(1,J),1) and row I is (N,B(I,1),LDB). More specific details fol‐
103 low.
104
105 Forward Storage
106 As an example of the BLAS vector declaration using the above, suppose
107 that X represents a declared real array. Let N be the vector length
108 and let INCX be the increment. Suppose that a logical vector x with
109 components x(i), i = 1, 2,..., N, is to be stored in X. If INCX >= 0,
110 then x(i) is stored in X(1 + (I-1) * INCX). This is known as forward
111 array storage starting at X(1) with stride equal to INCX, ending with
112 X(1 - (N-1) * INCX). Thus, if N = 4 and INCX = 2, the logical vector x
113 with components x(1), x(2), x(3), and x(4) are stored in memory in the
114 array elements X(1), X(3), X(5), and X(7), respectively.
115
116 This method of indexing, using a starting element, a number of ele‐
117 ments, and a stride, is especially useful for accessing one-dimensional
118 vectors in multidimensional arrays. For instance, if A is defined as
119
120 REAL A(M,N)
121
122 Then to access the 2nd row of A, one uses forward storage with an
123 stride of M. Thus a BLAS routine call with
124
125 X=A(2,1)
126
127 and increment/stride of
128
129 INCX=M
130
131 will access A(2,i) for i = 1,2,...,N. To access the third column of A
132 in a BLAS routine call with
133
134 X=A(1,3)
135
136 and increment/stride of
137
138 INCX=1
139
140 This approach also works with multidimensional arrays. As an example,
141 if A is defined as
142
143 REAL A(M,N,P)
144
145 to access the P elements of A at row 3 and column 4 one could call a
146 BLAS routine with starting address X of
147
148 X=A(3,4,1)
149
150 and increment/stride of
151
152 INCX=M*N
153
154 Backward Storage
155 Some BLAS routines permit backward storage of vectors, which is speci‐
156 fied by using a negative increment INCX. If INCX < 0, then x(i) is
157 stored "backwards" in X. Specifically x(i) is stored in X(1 + (N-I) *
158 |INCX|) or equivalently in X(1 - (N-I) * INCX). This is called back‐
159 ward storage starting from X(1 - (N-1) * INCX) with stride equal to
160 INCX, ending with X(1). Thus, if N = 4 and INCX = -2, the logical vec‐
161 tor components x(1), x(2), x(3), and x(4) are stored in the array ele‐
162 ments X(7), X(5), X(3), and X(1), respectively.
163
164 Note: INCX = 0 is permitted by some BLAS routines and is not permitted
165 by others. When it is allowed, it means that logical vector x is a
166 vector of length N, all whose components are equal the value of X(1).
167
168 Further Stride Examples
169 The following examples illustrate how to use increment arguments to
170 perform different operations with the same subprogram. These examples
171 use the BLAS function SDOT, with the following declarations:
172
173 INTEGER*4 N,INCX,INCY
174 REAL*4 SDOT,S,X(1+(N-1)*|INCX|),Y(1+(N-1)*|INCY|)
175 S = SDOT (N, X,INCX, Y,INCY)
176
177 This sets S to the dot product of the vectors (N,X,INCX) and
178 (N,Y,INCY).
179
180 Example 1: Compute the dot product T = X(1)*Y(1) + X(2)*Y(2) +
181 X(3)*Y(3) + X(4)*Y(4):
182
183 REAL*4 SDOT,T,X(4),Y(4)
184 T = SDOT (4, X,1, Y,1)
185
186 Example 2: Compute the convolution T = X(1)*Y(4) + X(2)*Y(3) +
187 X(3)*Y(2) + X(4)*Y(1):
188
189 REAL*4 SDOT,T,X(4),Y(4)
190 T = SDOT (4, X,1, Y,-1)
191
192 Example 3: Compute the dot product Y(2) = A(2,1)*X(1) + A(2,2)*X(2) +
193 A(2,3)*X(3), which is the dot product of the second row of an M by 3
194 matrix A, stored in a 10 by 3 array, with a 3-element vector X:
195
196 INTEGER*4 N,LDA
197 PARAMETER (LDA = 10)
198 REAL*4 SDOT,A(LDA,3),X(3),Y(LDA)
199 N = 3
200 Y(2) = SDOT (N, A(2,1),LDA, X,1)
201
202 BLAS Data Types
203 The following data types are used in the BLAS routines:
204
205 - REAL: Fortran "real" data type, 32-bit floating point; these routine
206 names begin with S.
207
208 - COMPLEX: Fortran "complex" data type, two 32-bit floating point
209 reals; these routine names begin with C.
210
211 - DOUBLE PRECISION: Fortran "double precision" data type, 64-bit float‐
212 ing point; these routine names begin with D.
213
214 - DOUBLE COMPLEX: Fortran "double complex" data type, two 64-bit float‐
215 ing point doubles; these routine names begin with Z.
216
217 BLAS Naming Conventions
218 The following table describes the naming conventions for these rou‐
219 tines:
220
221 -------------------------------------------------------------
222 64-bit
223 complex
224 64-bit real (double
225 (double 32-bit complex
226 32-bit real precision) complex precision)
227 -------------------------------------------------------------
228 form: Sname Dname Cname Zname
229 example:SAXPY DAXPY CAXPY ZAXPY
230 -------------------------------------------------------------
231
232 FORTRAN type declaration for functions
233 Always declare the data type of external functions. Declaring the data
234 type of the complex Level 1 BLAS functions is particularily important
235 because, based on the first letter of their names and the Fortran data
236 typing rules, the default implied data type would be REAL.
237 Fortran type
238 declarations for function names follow:
239
240 Type Function Name
241
242 REAL SASUM, SCASUM, SCNRM2, SDOT, SNRM2, SSUM
243
244 COMPLEX CDOTC, CDOTU, CSUM
245
246 DOUBLE PRECISION DASUM, DZASUM, DDOT, DNRM2, DZNRM2, DSUM
247
248 DOUBLE COMPLEX ZDOTC, ZDOTU, ZSUM
249
250 INTEGER ISAMAX, IDAMAX, ICAMAX, IZAMAX, ISAMIN, IDAMIN,
251 ISMAX, IDMAX, ISMIN, IDMIN
252
253 Summary Table of Level 1 BLAS Routines
254 The following table contains the purpose, operation, and name of each
255 Level 1 BLAS routine. The first routine name listed in each table
256 block is the name of the manual page that contains documentation for
257 any routines listed in that block. The routines marked with an aster‐
258 isk (*) are extensions to the standard set of Level 1 BLAS routines.
259 For the complete details about each operation, see the individual man
260 pages. Note: functions marked with an asterisk [*] are extensions to
261 the standard set of Level 1 BLAS routines that may not be present on
262 all systems.
263
264 The man(1) command can find a man page online by either the real, com‐
265 plex, double precision, or double complex name.
266
267 --------------------------------------------------------------
268 Purpose Operation
269 --------------------------------------------------------------
270 Sums the absolute n SASUM
271 values of the elements sasum <- ||x|| = Sum |x | DASUM
272 of a real vector (also 1 i=1 i
273 called the l1 norm)
274
275 Sums the absolute scasum <- ||Real[x]|| + SCASUM
276 values of the real and 1 DZASUM
277 imaginary parts of the ||Imag[x]|| =
278 elements of a complex 1
279 vector n
280 Sum |Real[x ]| +
281 i=1 i
282
283 n
284 Sum |Imag[x ]|
285 i=1 i
286
287 Adds a scalar multiple y <- alpha*x + beta*y SAXPBY*
288 of a real or complex DAXPBY*
289 vector to a scalar CAXPBY*
290 multiple of another ZAXPBY*
291 vector
292
293 Adds a scalar multiple y <- alpha*x + y SAXPY
294 of a real or complex DAXPY
295 vector to another CAXPY
296 vector ZAXPY
297
298 Copies a real or y <- x SCOPY
299 complex vector into DCOPY
300 another vector CCOPY
301 ZCOPY
302
303 Computes a dot product T n SDOT
304 of two real or complex sdot <- x y = Sum x y DDOT
305 vectors i=1 i i
306
307 H n _ CDOTC
308 cdotc <- x y = Sum x y ZDOTC
309 i=1 i i
310
311 T n CDOTU
312 cdotu <- x y = Sum x y ZDOTU
313 i=1 i i
314
315 Computes the Hadamard z(i):=alpha x(i) y(i) + beta SHAD*
316 product of two vectors z(i) DHAD*
317 CHAD*
318 ZHAD*
319
320 Computes the Euclidean snrm2 <- ||x|| = SNRM2
321 norm (also called l2 2 DNRM2
322 norm) of a real or n 2
323 complex vector sqrt(Sum (x )
324 i=1 i
325
326 scnrm2 <- ||x|| = SCNRM2
327 2 DZNRM2
328 n _
329 sqrt(Sum (x x )
330 i=1 i i
331
332
333 Applies a real plane CSROT*
334 rotation to a pair of ZDROT*
335 complex vectors
336
337 Applies an orthogonal SROT
338 plane rotation DROT
339
340 Constructs a Givens SROTG
341 plane rotation DROTG
342 CROTG*
343 ZROTG*
344
345 Applies a modified SROTM
346 Givens plane rotation DROTM
347
348 Constructs a modified SROTMG
349 Givens plane rotation DROTMG
350
351 Scales a real or x <- alpha x SSCAL
352 complex vector DSCAL
353 CSCAL
354 ZSCAL
355 CSSCAL
356 ZDSCAL
357
358 Sums the elements of a n SSUM*
359 real or complex vector sum <- Sum x DSUM*
360 i=1 i CSUM*
361 ZSUM*
362
363 Swaps two real or two x <-> y SSWAP
364 complex vectors DSWAP
365 CSWAP
366 ZSWAP
367
368 Searches a vector for isamax <- MAX |x | ISAMAX
369 the first occurrence of j IDAMAX
370 the maximum absolute ICAMAX
371 value IZAMAX
372
373 Searches a vector for isamin <- MIN |x | ISAMIN*
374 the first occurrence of j IDAMIN*
375 the minimum absolute
376 value
377
378 Searches a vector for ismax <- MAX x ISMAX*
379 the first occurrence of j IDMAX*
380 the minimum absolute
381 value
382
383 Searches a vector for ismin <- MIN x ISMIN*
384 the first occurrence of j IDMIN*
385 the minimum absolute
386 value
387 --------------------------------------------------------------
388
389 In addition to the mathematical functions defined above, several search
390 functions are a part of Level 1 BLAS; these functions are listed below:
391
392 ISAMAX, ICAMAX, ISAMIN*, ISMAX*, ISMIN*
393 IDAMAX IZAMAX, IDAMIN*, IDMAX*, IDMIN*
394
396 Many of the stared functions have not been implemented yet in a free
397 software.
398
400 intro_blas2(1), intro_blas3(1)
401
403 Lawson, C., Hanson, R., Kincaid, D., and Krogh, F., "Basic Linear Alge‐
404 bra Subprograms for Fortran Usage," ACM Transactions on Mathematical
405 Software, 5 (1979), pp. 308 - 325.
406
408 See the individual man pages for implementation details and full argu‐
409 ment listings
410
412 John L. Weatherwax
413
414
415
416 12 August 05 INTRO_BLAS1(l)