LAMTRACE(1)                      LAM COMMANDS                      LAMTRACE(1)

NAME
       lamtrace - Unload LAM trace data.

SYNTAX
       lamtrace [-hkvR] [-mpi] [-l <listno>] [-f <#secs>] [<filename>]
                [<nodes>] [<processes>]

OPTIONS
       -h             Print useful information on this command.

       -k             Copy and do not remove trace data.

       -v             Be verbose.

       -R             Delete all trace data from the specified nodes.

       -l <listno>    Unload only from the given list number.

       -mpi           Unload trace data for an MPI application.

       -f <#secs>     Signal target processes to flush trace data to the
                      daemon, then wait <#secs> before unloading.

       <filename>     Place trace data into this file (default: def.lamtr).

DESCRIPTION
       The -t option of mpirun(1) and loadgo(1) allows the application to
       generate execution traces.  These traces are first stored in a buffer
       within each application process.  When the buffer is full, or when the
       application terminates, the runtime buffer is flushed to the trace
       daemon (a structural component within the LAM daemon).  The trace
       daemon, in turn, collects data only up to a pre-compiled limit.
       Beyond this limit, the oldest traces are discarded in favor of the
       newer traces.

       After an application has finished, the record of its execution is
       stored in the trace daemons of each node that was running the
       application.  The lamtrace command can be used to retrieve these
       traces and store them in one file for display by a performance
       visualization tool, such as xmpi(1).  If the application was started
       by xmpi(1), lamtrace is not normally needed as the equivalent
       functionality is invoked with a button.

       Incomplete trace data can be unloaded while the application is
       running.  The output file must not exist prior to invoking lamtrace.
       This is a good situation to use the -k option, which preserves the
       trace daemon's contents after unloading.  Each unload will then get
       the entire run's trace data up to the present time.
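
       A minimal shell sketch of this snapshot pattern follows.  The helper
       name snapshot_cmd and the <base>.<n>.lamtr naming scheme are
       illustrative assumptions, not part of LAM; only the -k option and the
       output file argument come from this page.  The helper prints the
       command instead of running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM): print the lamtrace command for
# snapshot number $2, written to a fresh file named <base>.<n>.lamtr.
snapshot_cmd() {
    base=$1
    n=$2
    out="${base}.${n}.lamtr"
    # lamtrace requires that the output file not already exist.
    if [ -e "$out" ]; then
        echo "snapshot_cmd: $out already exists" >&2
        return 1
    fi
    # -k copies the trace data and leaves it in the trace daemon, so a
    # later snapshot still includes the earlier traces.
    echo "lamtrace -k $out"
}

# Prints "lamtrace -k run.1.lamtr" when run.1.lamtr does not exist yet.
snapshot_cmd run 1
```

       Using a new sequence number for every snapshot sidesteps the
       requirement that the output file be absent.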

       A running process is likely to be holding the most recent trace data
       in an internal buffer.  A standard LAM signal, LAM_SIGTRACE (see
       doom(1)), causes trace enabled processes to flush the internal trace
       buffer to the daemon.  The -f option tells lamtrace to send this
       signal to all target processes before unloading trace data.  A race
       condition develops between the target process storing trace data to
       the daemon and the unloading procedure.  Resolving it is left to the
       user, who supplies a delay parameter after -f.
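
       As a sketch of how -f might be combined with -k for an in-progress
       unload, the hypothetical helper below (flush_unload_cmd is not part
       of LAM) prints the command line.  The restriction of the delay to a
       whole number of seconds and the 2 second value in the usage line are
       assumptions; a suitable delay depends on how long the target
       processes need to flush.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM): print a lamtrace command that
# flushes process buffers, waits $1 seconds, then unloads into file $2.
flush_unload_cmd() {
    secs=$1
    out=$2
    # Assumption: accept only a non-negative whole number of seconds.
    case $secs in
        ''|*[!0-9]*)
            echo "flush_unload_cmd: bad delay '$secs'" >&2
            return 1
            ;;
    esac
    # -f sends the flush signal and waits; -k keeps the daemon's copy.
    echo "lamtrace -k -f $secs $out"
}

flush_unload_cmd 2 partial.lamtr    # prints: lamtrace -k -f 2 partial.lamtr
```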

       Trace data are organized by node, process identifier and list number.
       A process can store traces on any node, although the local node is
       the obvious, least intrusive choice.  The process can identify itself
       in any meaningful way (getpid(2) is a good idea).  The list number is
       also chosen by the process.  These values may be set by an
       instrumented library, such as libmpi(3), or directly by the
       application with lam_rtrstore(2).  Unloading flexibility follows that
       of storing, with the -l option selecting the list number and standard
       LAM command line mnemonics selecting nodes and processes.

       Dropping old traces when a pre-compiled volume limit is reached only
       happens for positive list numbers.  Traces in negatively numbered
       lists will be collected until the underlying system runs out of
       memory.  Do not use negative list numbers for high volume trace data.

       If no process selection is given on the command line, trace data will
       be unloaded for all processes on each specified node.

       LAM, its trace daemon and lamtrace are all unaware of the format and
       meaning of traces.

       The -R option does not unload trace data.  It causes the target trace
       daemons to free the memory occupied by trace data in the given list.
       If all lists are specified (no -l option), the trace daemon is
       effectively reset to its state after initiating LAM.
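
       The two forms of -R described above can be sketched as follows.  The
       helper reset_cmd is hypothetical and only prints the command; the
       node mnemonic n30 and list number 5 are placeholder values.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM): print a lamtrace -R command.
# $1 is a node mnemonic such as n30; $2 is an optional list number.
reset_cmd() {
    node=$1
    list=${2:-}
    if [ -n "$list" ]; then
        # With -l, only the given list is freed on the target node.
        echo "lamtrace -R -l $list $node"
    else
        # Without -l, all lists on the target node are freed.
        echo "lamtrace -R $node"
    fi
}

reset_cmd n30 5    # prints: lamtrace -R -l 5 n30
reset_cmd n30      # prints: lamtrace -R n30
```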

   Unloading MPI Trace Data
       A special capability, selected by the -mpi option, exists to search
       for and unload only the trace data generated by an MPI application.
       For this purpose, lamtrace is aware of the particular reserved list
       numbers that libmpi(3) uses to store traces.  It begins by searching
       all specified nodes and processes (the whole LAM multicomputer, if
       nothing is specified) for a special trace generated by process rank 0
       in MPI_COMM_WORLD of an MPI application.  This special trace contains
       the node and process identifiers of all processes in that
       MPI_COMM_WORLD communicator.  lamtrace then uses the node / process
       information to collect all trace data generated by libmpi(3).

       If multiple world communicators exist within LAM's trace daemons, the
       first one found is used.  Multiple worlds may be present due to
       multiple concurrent applications, trace data from a previous run not
       removed (either with lamtrace or lamclean(1)), or an application that
       spawns processes.  A particular world communicator can be located by
       providing precise node and process location to lamtrace.

       The -mpi option is not compatible with the -l option.

EXAMPLES
       lamtrace -v -mpi mytraces
           Unload trace data into the file "mytraces" from the first MPI
           application found in a search of the entire LAM multicomputer.
           Report on important steps as they are done.

       lamtrace n30 -l 5 p21367
           Unload trace data from list 5 of process ID 21367 on node 30.
           Operate silently.

       lamtrace -mpi n30 p21367
           Unload trace data from the MPI application world group whose
           process rank 0 has PID 21367 and is/was running on node 30.

BUGS
       Since trace data can be unloaded during an application's execution,
       there should be a way to incrementally append to an output file.
       This is a bit tricky with -mpi, but it can be done.

FILES
       def.lamtr      default output file

SEE ALSO
       mpirun(1), loadgo(1), lam_rtrstore(2), lamclean(1), libmpi(3), xmpi(1)

LAM 7.1.2                         March, 2006                      LAMTRACE(1)