PCP-ATOP(1) General Commands Manual PCP-ATOP(1)

6 pcp-atop - Advanced System and Process Monitor
7
9 Interactive Usage:
10
pcp [pcp options] atop [-aAcCdDfFgGmMnNopRsuvxy1] [-L linelen]
[-Plabel[,label]...] [interval [samples]]
13
14 Writing and reading PCP archive folios:
15
16 pcp atop -w folio [-a] [-S] [interval [samples]]
17 pcp atop -r folio [-AcCdDfFgGmMnNopRsuvxy1] [-b hh:mm] [-e hh:mm] [-L
18 linelen] [-Plabel[,label]...] [interval [samples]]
19
21 The program pcp-atop is an interactive monitor to view various aspects
22 of load on a system. It shows the occupation of the most critical
23 hardware resources (from a performance point of view) on system level,
24 i.e. cpu, memory, disk and network.
25 It also shows which processes are responsible for the indicated load
26 with respect to cpu and memory load on process level. Disk load is
27 shown per process if "storage accounting" is active in the kernel.
28
29 Every interval (default: 10 seconds) information is shown about the
30 resource occupation on system level (cpu, memory, disks and network
31 layers), followed by a list of processes which have been active during
32 the last interval (note that all processes that were unchanged during
33 the last interval are not shown, unless the key 'a' has been pressed or
34 unless sorting on memory occupation is done). If the list of active
35 processes does not entirely fit on the screen, only the top of the list
36 is shown (sorted in order of activity).
37 The intervals are repeated till the number of samples (specified as
38 command argument) is reached, or till the key 'q' is pressed in inter‐
39 active mode.
40
41 When invoked via the pcp(1) command, the PCPIntro(1) options -h/--host,
42 -a/--archive, -O/--origin, -s/--samples, -t/--interval, -Z/--timezone
43 and several other pcp options become indirectly available. The long
44 option form of these is directly available. Additionally, the --hot‐
45 proc option can be used to request the per-process PCP metrics be used
46 instead of the default proc metrics from pmdaproc(1).
47
48 When pcp-atop is started, it checks whether the standard output channel
49 is connected to a screen, or to a file/pipe. In the first case it pro‐
50 duces screen control codes (via the ncurses library) and behaves inter‐
51 actively; in the second case it produces flat ASCII-output.
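
The screen-versus-pipe check described above can be illustrated with a
short Python sketch. This is not pcp-atop's own code; the function name
output_mode is hypothetical, and isatty() merely stands in for whatever
test the program performs on its standard output.

```python
import io
import sys


def output_mode(stream=None) -> str:
    """Return 'interactive' for a terminal, 'flat' for a file or pipe."""
    if stream is None:
        stream = sys.stdout
    return "interactive" if stream.isatty() else "flat"


if __name__ == "__main__":
    # Depends on how the script is started: terminal versus "| cat".
    print(output_mode())
    # An in-memory buffer is never a terminal, so this is always 'flat'.
    print(output_mode(io.StringIO()))
```

Running the sketch directly on a terminal and again with its output
piped through another command shows the two cases.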
52
53 In interactive mode, the output of pcp-atop scales dynamically to the
54 current dimensions of the screen/window.
55 If the window is resized horizontally, columns will be added or removed
56 automatically. For this purpose, every column has a particular weight.
57 The columns with the highest weights that fit within the current width
58 will be shown.
59 If the window is resized vertically, lines of the process/thread list
60 will be added or removed automatically.
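
One way to picture the weight-based column selection is the greedy
sketch below. The exact algorithm is not spelled out in this page, so
the rule used here (consider columns in descending weight order, skip
any that no longer fit, display the chosen ones in their original
order) is an assumption, as are the column names, widths and weights.

```python
from typing import List, NamedTuple


class Column(NamedTuple):
    name: str
    width: int   # characters needed, including the separating space
    weight: int  # higher weight = more important


def select_columns(columns: List[Column], screen_width: int) -> List[str]:
    """Pick the highest-weight columns that fit within screen_width."""
    chosen = set()
    remaining = screen_width
    # Consider the most important columns first.
    for col in sorted(columns, key=lambda c: c.weight, reverse=True):
        if col.width <= remaining:
            chosen.add(col.name)
            remaining -= col.width
    # Keep the original left-to-right order on screen.
    return [c.name for c in columns if c.name in chosen]


# Hypothetical layout, for illustration only.
LAYOUT = [
    Column("PID", 6, 10),
    Column("SYSCPU", 7, 7),
    Column("USRCPU", 7, 8),
    Column("VGROW", 7, 3),
    Column("CMD", 15, 10),
]

if __name__ == "__main__":
    print(select_columns(LAYOUT, 40))  # wider window: more columns
    print(select_columns(LAYOUT, 30))  # narrower window: fewer columns
```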
61
Furthermore, in interactive mode the output of pcp-atop can be
controlled by pressing particular keys. It is also possible to specify
such a key as a flag on the command line. In that case pcp-atop starts
in the indicated mode; this mode can be modified again interactively.
Specifying a key as a flag is especially useful when running pcp-atop
with output to a pipe or file (non-interactively). These flags are the
same as the keys that can be pressed in interactive mode (see section
INTERACTIVE COMMANDS).
Additional flags are available to support storage of pcp-atop data in
PCP archive format (see section PCP DATA STORAGE).
72
COLORS
For the resource consumption on system level, pcp-atop uses colors to
indicate that a critical occupation percentage has been (almost)
reached. A critical occupation percentage means that it is likely that
this load causes a noticeable negative performance influence for
applications using this resource. The critical percentage depends on
the type of resource: e.g. the performance influence of a disk with a
busy percentage of 80% might be more noticeable for applications/users
than a CPU with a busy percentage of 90%.
Currently pcp-atop uses the following default values to calculate a
weighted percentage per resource:
84
85 Processor
86 A busy percentage of 90% or higher is considered `critical'.
87
88 Disk
89 A busy percentage of 70% or higher is considered `critical'.
90
91 Network
92 A busy percentage of 90% or higher for the load of an interface is
93 considered `critical'.
94
Memory
An occupation percentage of 90% is considered `critical'. Notice
that this occupation percentage is the accumulated memory
consumption of the kernel (including slab) and all processes; the
memory for the page cache (`cache' and `buff' in the MEM-line) and
the reclaimable part of the slab (`slrec') is not included!
If the number of pages swapped out (`swout' in the PAG-line) is
larger than 10 per second, the memory resource is considered
`critical'. A value of at least 1 per second is considered
`almost critical'.
If the committed virtual memory exceeds the limit (`vmcom' and
`vmlim' in the SWP-line), the SWP-line is colored due to
overcommitting the system.
108
109 Swap
110 An occupation percentage of 80% is considered `critical' because
111 swap space might be completely exhausted in the near future; it is
112 not critical from a performance point-of-view.
113
114 These default values can be modified in the configuration file (see
115 separate man-page of pcp-atoprc(5)).
116
When a resource exceeds its critical occupation percentage, the
relevant values in the screen line are colored red by default.
When a resource exceeds (by default) 80% of its critical percentage (so
it is almost critical), the relevant values in the screen line are
colored cyan by default. This `almost critical percentage' (one value
for all resources) can be modified in the configuration file (see
separate man-page of pcp-atoprc(5)).
The default colors red and cyan can be modified in the configuration
file as well (see separate man-page of pcp-atoprc(5)).
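
The default thresholds and the `almost critical' rule can be summarized
in a small Python sketch. It only restates the defaults listed above;
the names CRITICAL_PCT, classify and classify_swapout are hypothetical,
and treating a value exactly on a boundary as already reached is an
assumption.

```python
# Default critical percentages per resource, as listed above.
CRITICAL_PCT = {"cpu": 90, "disk": 70, "network": 90, "memory": 90, "swap": 80}

# 'Almost critical' starts at this share of the critical percentage.
ALMOST_PCT = 80

# Default colors for the two highlighted states.
COLOR = {"critical": "red", "almost critical": "cyan"}


def classify(resource: str, busy_pct: float) -> str:
    """Classify a busy/occupation percentage for one resource."""
    critical = CRITICAL_PCT[resource]
    if busy_pct >= critical:
        return "critical"
    if busy_pct >= critical * ALMOST_PCT / 100:
        return "almost critical"
    return "normal"


def classify_swapout(swouts_per_sec: float) -> str:
    """Classify memory pressure from the swap-out rate (pages/second)."""
    if swouts_per_sec > 10:
        return "critical"
    if swouts_per_sec >= 1:
        return "almost critical"
    return "normal"


if __name__ == "__main__":
    for resource, pct in (("disk", 75), ("cpu", 75), ("cpu", 50)):
        state = classify(resource, pct)
        print(resource, pct, state, COLOR.get(state, "no color"))
```

Note that the same 75% is `critical' for a disk but only `almost
critical' for a CPU, because the two resources have different critical
percentages.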
126
127 With the key 'x' (or flag -x), the use of colors can be suppressed.
128
GPU STATISTICS GATHERING
130 GPU statistics can be gathered by pmdanvidia(1) which is a separate
131 data collection daemon process. It gathers cumulative utilization
132 counters of every Nvidia GPU in the system, as well as utilization
133 counters of every process that uses a GPU. When pcp-atop notices that
134 the daemon is active, it reads these GPU utilization counters with
135 every interval.
136
A description of the utilization counters can be found in the section
OUTPUT DESCRIPTION.
139
INTERACTIVE COMMANDS
141 When running pcp-atop interactively (no output redirection), keys can
142 be pressed to control the output. In general, lower case keys can be
143 used to show other information for the active processes and upper case
144 keys can be used to influence the sort order of the active
145 process/thread list.
146
147 g Show generic output (default).
148
149 Per process the following fields are shown in case of a window-
150 width of 80 positions: process-id, cpu consumption during the last
151 interval in system and user mode, the virtual and resident memory
152 growth of the process.
153
154 The subsequent columns depend on the used kernel:
155 When the kernel supports "storage accounting" (>= 2.6.20), the
156 data transfer for read/write on disk, the status and exit code are
157 shown for each process. When the kernel does not support "storage
158 accounting", the username, number of threads in the thread group,
159 the status and exit code are shown.
160 The last columns contain the state, the occupation percentage for
161 the chosen resource (default: cpu) and the process name.
162
163 When more than 80 positions are available, other information is
164 added.
165
166 m Show memory related output.
167
168 Per process the following fields are shown in case of a window-
169 width of 80 positions: process-id, minor and major memory faults,
170 size of virtual shared text, total virtual process size, total
171 resident process size, virtual and resident growth during last
172 interval, memory occupation percentage and process name.
173
174 When more than 80 positions are available, other information is
175 added.
176
177 For memory consumption, always all processes are shown (also the
178 processes that were not active during the interval).
179
180 d Show disk-related output.
181
182 When "storage accounting" is active in the kernel, the following
183 fields are shown: process-id, amount of data read from disk,
184 amount of data written to disk, amount of data that was written
185 but has been withdrawn again (WCANCL), disk occupation percentage
186 and process name.
187
188 s Show scheduling characteristics.
189
190 Per process the following fields are shown in case of a window-
191 width of 80 positions: process-id, number of threads in state
192 'running' (R), number of threads in state 'interruptible sleeping'
193 (S), number of threads in state 'uninterruptible sleeping' (D),
194 scheduling policy (normal timesharing, realtime round-robin, real‐
195 time fifo), nice value, priority, realtime priority, current pro‐
196 cessor, status, exit code, state, the occupation percentage for
197 the chosen resource and the process name.
198
199 When more than 80 positions are available, other information is
200 added.
201
202 v Show various process characteristics.
203
204 Per process the following fields are shown in case of a window-
205 width of 80 positions: process-id, user name and group, start date
206 and time, status (e.g. exit code if the process has finished),
207 state, the occupation percentage for the chosen resource and the
208 process name.
209
210 When more than 80 positions are available, other information is
211 added.
212
213 c Show the command line of the process.
214
215 Per process the following fields are shown: process-id, the occu‐
216 pation percentage for the chosen resource and the command line
217 including arguments.
218
219 e Show GPU utilization.
220
221 Per process at least the following fields are shown: process-id,
222 range of GPU numbers on which the process currently runs, GPU busy
223 percentage on all GPUs, memory busy percentage (i.e. read and
224 write accesses on memory) on all GPUs, memory occupation at the
225 moment of the sample, average memory occupation during the sample,
226 and GPU percentage.
227
228 When the pmdanvidia daemon does not run with root privileges, the
229 GPU busy percentage and the memory busy percentage are not avail‐
230 able on process level. In that case, the GPU percentage on
231 process level reflects the GPU memory occupation instead of the
232 GPU busy percentage (which is preferred).
233
234 o Show the user-defined line of the process.
235
236 In the configuration file the keyword ownprocline can be specified
237 with the description of a user-defined output-line.
238 Refer to the man-page of pcp-atoprc(5) for a detailed description.
239
240 y Show the individual threads within a process (toggle).
241
242 Single-threaded processes are still shown as one line.
243 For multi-threaded processes, one line represents the process
244 while additional lines show the activity per individual thread (in
245 a different color). Depending on the option 'a' (all or active
246 toggle), all threads are shown or only the threads that were
247 active during the last interval.
248 Whether this key is active or not can be seen in the header line.
249
250 u Show the process activity accumulated per user.
251
252 Per user the following fields are shown: number of processes
253 active or terminated during last interval (or in total if combined
254 with command `a'), accumulated cpu consumption during last inter‐
255 val in system and user mode, the current virtual and resident mem‐
256 ory space consumed by active processes (or all processes of the
257 user if combined with command `a').
258 When "storage accounting" is active in the kernel, the accumulated
259 read and write throughput on disk is shown. When the pmdabcc(1)
260 module `netproc' has been installed, the number of receive and
261 send network calls are shown.
262 The last columns contain the accumulated occupation percentage for
263 the chosen resource (default: cpu) and the user name.
264
p Show the process activity accumulated per program (i.e. process
name).

Per program the following fields are shown: number of processes
active or terminated during last interval (or in total if combined
with command `a'), accumulated cpu consumption during last
interval in system and user mode, the current virtual and resident
memory space consumed by active processes (or all processes of the
program if combined with command `a').
When "storage accounting" is active in the kernel, the accumulated
read and write throughput on disk is shown. When the pmdabcc(1)
module `netproc' has been installed, the number of receive and
send network calls are shown.
The last columns contain the accumulated occupation percentage for
the chosen resource (default: cpu) and the program name.
280
j Show the process activity accumulated per Docker container.

Per container the following fields are shown: number of processes
active or terminated during last interval (or in total if combined
with command `a'), accumulated cpu consumption during last
interval in system and user mode, the current virtual and resident
memory space consumed by active processes (or all processes of the
container if combined with command `a').
When "storage accounting" is active in the kernel, the accumulated
read and write throughput on disk is shown. When the pmdabcc(1)
module `netproc' has been installed, the number of receive and
send network calls are shown.
The last columns contain the accumulated occupation percentage for
the chosen resource (default: cpu) and the Docker container id
(CID).
296
297 C Sort the current list in the order of cpu consumption (default).
298 The one-but-last column changes to ``CPU''.
299
E Sort the current list in the order of GPU utilization (preferred,
but only applicable when the pmdanvidia daemon runs under root
privileges) or in the order of GPU memory occupation. The
one-but-last column changes to ``GPU''.
304
305 M Sort the current list in the order of resident memory consumption.
306 The one-but-last column changes to ``MEM''. In case of sorting on
307 memory, the full process list will be shown (not only the active
308 processes).
309
310 D Sort the current list in the order of disk accesses issued. The
311 one-but-last column changes to ``DSK''.
312
313 N Sort the current list in the order of network bandwidth (received
314 and transmitted). The one-but-last column changes to ``NET''.
315
316 A Sort the current list automatically in the order of the most busy
317 system resource during this interval. The one-but-last column
318 shows either ``ACPU'', ``AMEM'', ``ADSK'' or ``ANET'' (the preced‐
319 ing 'A' indicates automatic sorting-order). The most busy
320 resource is determined by comparing the weighted busy-percentages
321 of the system resources, as described earlier in the section COL‐
322 ORS.
323 This option remains valid until another sorting-order is explic‐
324 itly selected again.
325 A sorting-order for disk is only possible when "storage account‐
326 ing" is active. A sorting-order for network is only possible when
327 the pmdabcc(1) module `netproc' has been installed.
328
329 Miscellaneous interactive commands:
330
331 ? Request for help information (also the key 'h' can be pressed).
332
333 V Request for version information (version number and date).
334
335 R Gather and calculate the proportional set size of processes (tog‐
336 gle). Gathering of all values that are needed to calculate the
337 PSIZE of a process is a relatively time-consuming task, so this
338 key should only be active when analyzing the resident memory con‐
339 sumption of processes.
340
341 x Suppress colors to highlight critical resources (toggle).
342 Whether this key is active or not can be seen in the header line.
343
344 z The pause key can be used to freeze the current situation in order
345 to investigate the output on the screen. While pcp-atop is
346 paused, the keys described above can be pressed to show other
347 information about the current list of processes. Whenever the
348 pause key is pressed again, pcp-atop will continue with the next
349 sample.
350
351 i Modify the interval timer (default: 10 seconds). If an interval
352 timer of 0 is entered, the interval timer is switched off. In
353 that case a new sample can only be triggered manually by pressing
354 the key 't'.
355
t Trigger a new sample manually. This key can be pressed if the
current sample should be finished before the timer has expired,
or if no timer is set at all (interval timer defined as 0). In
the latter case pcp-atop can be used as a stopwatch to measure the
load being caused by a particular application transaction, without
knowing in advance how many seconds this transaction will last.
362
363 When viewing the contents of an archive folio, this key can be
364 used to show the next sample from the folio.
365
366 T When viewing the contents of an archive folio, this key can be
367 used to show the previous sample from the folio.
368
369 b When viewing the contents of an archive folio, this key can be
370 used to move to a certain timestamp within the file (either for‐
371 ward or backward).
372
373 r Reset all counters to zero to see the system and process activity
374 since boot again.
375
376 When viewing the contents of an archive, this key can be used to
377 rewind to the beginning of the file again.
378
379 U Specify a search string for specific user names as a regular
380 expression. From now on, only (active) processes will be shown
381 from a user which matches the regular expression. The system sta‐
382 tistics are still system wide. If the Enter-key is pressed with‐
383 out specifying a name, (active) processes of all users will be
384 shown again.
385 Whether this key is active or not can be seen in the header line.
386
387 I Specify a list with one or more PIDs to be selected. From now on,
388 only processes will be shown with a PID which matches one of the
389 given list. The system statistics are still system wide. If the
390 Enter-key is pressed without specifying a PID, all (active) pro‐
391 cesses will be shown again.
392 Whether this key is active or not can be seen in the header line.
393
394 P Specify a search string for specific process names as a regular
395 expression. From now on, only processes will be shown with a name
396 which matches the regular expression. The system statistics are
397 still system wide. If the Enter-key is pressed without specifying
398 a name, all (active) processes will be shown again.
399 Whether this key is active or not can be seen in the header line.
400
401 / Specify a specific command line search string as a regular expres‐
402 sion. From now on, only processes will be shown with a command
403 line which matches the regular expression. The system statistics
404 are still system wide. If the Enter-key is pressed without speci‐
405 fying a string, all (active) processes will be shown again.
406 Whether this key is active or not can be seen in the header line.
407
408 J Specify a Docker container id of 12 (hexadecimal) characters.
409 From now on, only processes will be shown that run in that spe‐
410 cific Docker container (CID). The system statistics are still
411 system wide. If the Enter-key is pressed without specifying a
412 container id, all (active) processes will be shown again.
413 Whether this key is active or not can be seen in the header line.
414
S Specify search strings for specific logical volume names, specific
disk names and specific network interface names. All search
strings are interpreted as regular expressions. From now on,
only those system resources are shown that match the corresponding
regular expression. If the Enter-key is pressed without
specifying a search string, all (active) system resources of that
type will be shown again.
Whether this key is active or not can be seen in the header line.
423
424 a The `all/active' key can be used to toggle between only show‐
425 ing/accumulating the processes that were active during the last
426 interval (default) or showing/accumulating all processes.
427 Whether this key is active or not can be seen in the header line.
428
G By default, pcp-atop shows/accumulates the processes that are
alive and the processes that have exited during the last interval.
With this key (toggle), showing/accumulating the processes that
have exited can be suppressed.
Whether this key is active or not can be seen in the header line.
434
435 f Show a fixed (maximum) number of header lines for system resources
436 (toggle). By default only the lines are shown about system
437 resources (CPUs, paging, logical volumes, disks, network inter‐
438 faces) that really have been active during the last interval.
439 With this key you can force pcp-atop to show lines of inactive
440 resources as well.
441 Whether this key is active or not can be seen in the header line.
442
443 F Suppress sorting of system resources (toggle). By default system
444 resources (CPUs, logical volumes, disks, network interfaces) are
445 sorted on utilization.
446 Whether this key is active or not can be seen in the header line.
447
448 1 Show relevant counters as an average per second (in the format
449 `..../s') instead of as a total during the interval (toggle).
450 Whether this key is active or not can be seen in the header line.
451
l Limit the number of system level lines for the counters per-cpu,
the active disks and the network interfaces. By default lines are
shown of all CPUs, disks and network interfaces which have been
active during the last interval. Limiting these lines can be
useful on systems with a huge number of CPUs, disks or interfaces
in order to be able to run pcp-atop on a screen/window with e.g.
only 24 lines.
For all mentioned resources the maximum number of lines can be
specified interactively. When using the flag -l the maximum number
of per-cpu lines is set to 0, the maximum number of disk lines to
5 and the maximum number of interface lines to 3. These values
can be modified again in interactive mode.
464
465 k Send a signal to an active process (a.k.a. kill a process).
466
467 q Quit the program.
468
469 PgDn Show the next page of the process/thread list.
470 With the arrow-down key the list can be scrolled downwards with
471 single lines.
472
473 ^F Show the next page of the process/thread list (forward).
474 With the arrow-down key the list can be scrolled downwards with
475 single lines.
476
477 PgUp Show the previous page of the process/thread list.
478 With the arrow-up key the list can be scrolled upwards with single
479 lines.
480
481 ^B Show the previous page of the process/thread list (backward).
482 With the arrow-up key the list can be scrolled upwards with single
483 lines.
484
485 ^L Redraw the screen.
486
PCP DATA STORAGE
488 In order to store system and process level statistics for long-term
489 analysis (e.g. to check the system load and the active processes run‐
490 ning yesterday between 3:00 and 4:00 PM), pcp-atop can store the system
491 and process level statistics in the PCP archive format, as an archive
492 folio (see mkaf(1)).
493 All information about processes and threads is stored in the archive.
494 The interval (default: 10 seconds) and number of samples (default:
495 infinite) can be passed as last arguments. Instead of the number of
496 samples, the flag -S can be used to indicate that pcp-atop should fin‐
497 ish anyhow before midnight.
498
A PCP archive can be read and visualized again with the -r option. The
argument is a comma-separated list of names, each of which may be the
base name of an archive or the name of a directory containing one or
more archives. If no argument is specified, the file
$PCP_LOG_DIR/pmlogger/HOST/YYYYMMDD is opened for input (where YYYYMMDD
are digits representing the current date, and HOST is the hostname of
the machine being logged). If a filename is specified in the format
YYYYMMDD (representing any valid date), the file
$PCP_LOG_DIR/pmlogger/HOST/YYYYMMDD is opened. If a filename with the
symbolic name y is specified, yesterday's daily logfile is opened (this
can be repeated so 'yyyy' indicates the logfile of four days ago).
The samples from the file can be viewed interactively by using the key
't' to show the next sample, the key 'T' to show the previous sample,
the key 'b' to branch to a particular time or the key 'r' to rewind to
the beginning of the file.
When output is redirected to a file or pipe, pcp-atop prints all
samples in plain ASCII. The default line length is 80 characters in
that case; with the flag -L followed by an alternate line length, more
(or fewer) columns will be shown.
With the flag -b (begin time) and/or -e (end time) followed by a time
argument of the form HH:MM, a certain time period within the archive
can be selected.
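
The naming rules for the -r argument can be restated as a small Python
sketch. It handles a single name from the comma-separated list and is
not pcp-atop's own code; the function name resolve_archive is
hypothetical, and the log directory and host values passed in the usage
lines are example values, not defaults taken from this page.

```python
import re
from datetime import date, timedelta
from typing import Optional


def resolve_archive(arg: Optional[str], host: str, log_dir: str,
                    today: date) -> str:
    """Map one -r argument to the archive it refers to."""
    if not arg:
        # No argument: the daily log of the current date.
        stamp = today.strftime("%Y%m%d")
    elif re.fullmatch(r"y+", arg):
        # 'y' is yesterday; every extra 'y' goes back one more day.
        stamp = (today - timedelta(days=len(arg))).strftime("%Y%m%d")
    elif re.fullmatch(r"\d{8}", arg):
        # Explicit YYYYMMDD date.
        stamp = arg
    else:
        # Anything else is an archive base name or a directory name.
        return arg
    return f"{log_dir}/pmlogger/{host}/{stamp}"


if __name__ == "__main__":
    # Example values only: host 'web01' and log directory '/var/log/pcp'.
    for name in (None, "y", "yyyy", "20231224", "myarchive"):
        print(name, "->",
              resolve_archive(name, "web01", "/var/log/pcp", date.today()))
```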
521
OUTPUT DESCRIPTION
The first sample shows the system level activity since boot (the
elapsed time in the header shows the time since boot). Note that
particular counters could have reached their maximum value (several
times) and started from zero again, so do not rely on these figures.
527
For every sample pcp-atop first shows the lines related to system level
activity. If a particular system resource has not been used during the
interval, the entire line related to this resource is suppressed. So
the number of system level lines may vary for each sample.
After that a list is shown of processes which have been active during
the last interval. This list is by default sorted on cpu consumption,
but this order can be changed with the keys described previously.
536
537 If values have to be shown by pcp-atop which do not fit in the column
538 width, another format is used. If e.g. a cpu-consumption of 233216 mil‐
539 liseconds should be shown in a column width of 4 positions, it is shown
540 as `233s' (in seconds). For large memory figures, another unit is cho‐
541 sen if the value does not fit (Mb instead of Kb, Gb instead of Mb, Tb
542 instead of Gb, ...). For other values, a kind of exponent notation is
543 used (value 123456789 shown in a column of 5 positions gives 123e6).
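
The two worked examples above (233216 milliseconds shown as `233s', and
123456789 shown as 123e6) can be reproduced with the following Python
sketch. It is an illustration only: the function names are
hypothetical, truncation rather than rounding is an assumption, and
only non-negative integers are handled.

```python
def cpu_seconds(msec: int, width: int) -> str:
    """Show a cpu time in whole seconds with an 's' suffix."""
    text = f"{msec // 1000}s"
    if len(text) > width:
        raise ValueError("value does not fit in seconds either")
    return text


def exp_notation(value: int, width: int) -> str:
    """Show value as <leading digits>e<exponent> within width positions."""
    digits = str(value)
    if len(digits) <= width:
        return digits  # fits as-is, no conversion needed
    # Try a one-digit exponent first, then a two-digit exponent.
    for exp_len in (1, 2):
        keep = width - 1 - exp_len  # positions left for leading digits
        if keep < 1:
            break
        exponent = len(digits) - keep  # number of digits dropped
        if len(str(exponent)) == exp_len:
            return f"{digits[:keep]}e{exponent}"
    raise ValueError("column too narrow for this value")


if __name__ == "__main__":
    print(cpu_seconds(233216, 4))
    print(exp_notation(123456789, 5))
```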
544
546 The system level information consists of the following output lines:
547
548 PRC Process and thread level totals.
549 This line contains the total cpu time consumed in system mode
550 (`sys') and in user mode (`user'), the total number of processes
551 present at this moment (`#proc'), the total number of threads
552 present at this moment in state `running' (`#trun'), `sleeping
553 interruptible' (`#tslpi') and `sleeping uninterruptible'
554 (`#tslpu'), the number of zombie processes (`#zombie'), the number
555 of clone system calls (`clones'), and the number of processes that
556 ended during the interval (`#exit') when process accounting is
557 used. Instead of `#exit` the last column may indicate that process
558 accounting could not be activated (`no procacct`).
559 If the screen-width does not allow all of these counters, only a
560 relevant subset is shown.
561
562 CPU CPU utilization.
563 At least one line is shown for the total occupation of all CPUs
564 together.
565 In case of a multi-processor system, an additional line is shown
566 for every individual processor (with `cpu' in lower case), sorted
567 on activity. Inactive CPUs will not be shown by default. The
568 lines showing the per-cpu occupation contain the cpu number in the
569 field combined with the wait percentage.
570
Every line contains the percentage of cpu time spent in kernel
mode by all active processes (`sys'), the percentage of cpu time
consumed in user mode (`user') for all active processes (including
processes running with a nice value larger than zero), the
percentage of cpu time spent for interrupt handling (`irq') including
softirq, the percentage of unused cpu time while no processes were
waiting for disk I/O (`idle'), and the percentage of unused cpu
time while at least one process was waiting for disk I/O (`wait').
In case of per-cpu occupation, the cpu number and the wait
percentage (`w') for that cpu are shown. The number of lines showing
the per-cpu occupation can be limited.
582
583 For virtual machines, the steal-percentage (`steal') shows the
584 percentage of cpu time stolen by other virtual machines running on
585 the same hardware.
586 For physical machines hosting one or more virtual machines, the
587 guest-percentage (`guest') shows the percentage of cpu time used
588 by the virtual machines. Notice that this percentage overlaps the
589 user-percentage!
590
When PMC performance monitoring counters are supported by the CPU
and the kernel (and pmdaperfevent(1) runs with root privileges),
the number of instructions per CPU cycle (`ipc') is shown. The
first sample always shows the value 'initial', because the
counters are just activated at the moment that pcp-atop is started.
When the CPU busy percentage is high and the IPC is less than 1.0,
it is likely that the CPU is frequently waiting for memory access
during instruction execution (larger CPU caches or faster memory
might be helpful to improve performance). When the CPU busy
percentage is high and the IPC is greater than 1.0, it is likely that
the CPU is instruction-bound (more/faster cores might be helpful
to improve performance).
Furthermore, per CPU the effective number of cycles (`cycl') is
shown. This value can reach the current CPU frequency if such a CPU
is 100% busy. When an idle CPU is halted, the number of effective
cycles can be (considerably) lower than the current frequency.
Notice that the average instructions per cycle and number of
cycles are shown in the CPU line for all CPUs.
See also:
http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
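
The rule of thumb for reading `ipc' together with the CPU busy
percentage can be written down as a short Python sketch. The text does
not say what counts as a `high' busy percentage, so the busy_high
parameter (80 here) is purely an assumption, and the function name
ipc_hint is hypothetical.

```python
def ipc_hint(busy_pct: float, ipc: float, busy_high: float = 80.0) -> str:
    """Interpret IPC for a busy CPU; busy_high is an assumed cut-off."""
    if busy_pct < busy_high:
        return "cpu not heavily loaded"
    if ipc < 1.0:
        # Busy but retiring few instructions: stalled on memory access.
        return "likely memory-bound"
    if ipc > 1.0:
        # Busy and retiring many instructions per cycle.
        return "likely instruction-bound"
    return "inconclusive"


if __name__ == "__main__":
    print(ipc_hint(95, 0.6))
    print(ipc_hint(95, 1.8))
    print(ipc_hint(20, 0.6))
```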
611
612 In case of frequency scaling, all previously mentioned CPU per‐
613 centages are relative to the used scaling of the CPU during the
614 interval. If a CPU has been active for e.g. 50% in user mode dur‐
615 ing the interval while the frequency scaling of that CPU was 40%,
616 only 20% of the full capacity of the CPU has been used in user
617 mode.
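
The arithmetic behind this example is a plain product of the two
percentages, as the Python sketch below shows (the function name is
hypothetical):

```python
def full_capacity_pct(busy_pct: float, scaling_pct: float) -> float:
    """Share of the CPU's full capacity used, given frequency scaling."""
    return busy_pct * scaling_pct / 100.0


if __name__ == "__main__":
    # 50% busy in user mode at 40% frequency scaling.
    print(full_capacity_pct(50, 40))  # 20.0
```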
618
619 If the screen-width does not allow all of these counters, only a
620 relevant subset is shown.
621
622 CPL CPU load information.
623 This line contains the load average figures reflecting the number
624 of threads that are available to run on a CPU (i.e. part of the
625 runqueue) or that are waiting for disk I/O. These figures are
626 averaged over 1 (`avg1'), 5 (`avg5') and 15 (`avg15') minutes.
627 Furthermore the number of context switches (`csw'), the number of
628 serviced interrupts (`intr') and the number of available CPUs are
629 shown.
630
631 If the screen-width does not allow all of these counters, only a
632 relevant subset is shown.
633
634 GPU GPU utilization (Nvidia).
635 Read the section GPU STATISTICS GATHERING in this document to find
636 the details about the activation of the pmdanvidia daemon.
637
638 In the first column of every line, the bus-id (last nine charac‐
639 ters) and the GPU number are shown. The subsequent columns show
640 the percentage of time that one or more kernels were executing on
641 the GPU (`gpubusy'), the percentage of time that global (device)
642 memory was being read or written (`membusy'), the occupation per‐
643 centage of memory (`memocc'), the total memory (`total'), the mem‐
644 ory being in use at the moment of the sample (`used'), the average
645 memory being in use during the sample time (`usavg'), the number
646 of processes being active on the GPU at the moment of the sample
647 (`#proc'), and the type of GPU.
648
649 If the screen-width does not allow all of these counters, only a
650 relevant subset is shown.
651 The number of lines showing the GPUs can be limited.
652
653 MEM Memory occupation.
654 This line contains the total amount of physical memory (`tot'),
655 the amount of memory which is currently free (`free'), the amount
656 of memory in use as page cache including the total resident shared
657 memory (`cache'), the amount of memory within the page cache that
658 has to be flushed to disk (`dirty'), the amount of memory used for
659 filesystem meta data (`buff'), the amount of memory being used for
660 kernel mallocs (`slab'), the amount of slab memory that is
661 reclaimable (`slrec'), the resident size of shared memory
662 including tmpfs (`shmem'), the resident size of shared memory
663 (`shrss'), the amount of shared memory that is currently swapped
664 (`shswp'), the amount of memory that is currently claimed by
665 VMware's balloon driver (`vmbal'), the amount of memory that is
666 claimed for huge pages (`hptot'), and the amount of huge page
667 memory that is really in use (`hpuse').
668
669 If the screen-width does not allow all of these counters, only a
670 relevant subset is shown.
671
672 SWP Swap occupation and overcommit info.
673 This line contains the total amount of swap space on disk (`tot')
674 and the amount of free swap space (`free').
675 Furthermore the committed virtual memory space (`vmcom') and the
676 maximum limit of the committed space (`vmlim', which is by default
677 swap size plus 50% of memory size) is shown. The committed space
678 is the reserved virtual space for all allocations of private mem‐
679 ory space for processes. The kernel only verifies whether the
680 committed space exceeds the limit if strict overcommit handling is
681 configured (vm.overcommit_memory is 2).
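
As a rough sketch of the default limit described above (swap size plus 50% of memory size), the calculation could look as follows. The function names and the ratio_pct parameter are illustrative assumptions; the kernel's own accounting has further adjustments (for example for huge pages) that are ignored here.

```python
GIB = 1024 ** 3


def default_commit_limit(swap_bytes: int, mem_bytes: int, ratio_pct: int = 50) -> int:
    """Approximate `vmlim': swap size plus ratio_pct percent of memory size.

    ratio_pct is an assumed tunable; 50 matches the default in the text.
    """
    return swap_bytes + mem_bytes * ratio_pct // 100


def exceeds_limit(vmcom_bytes: int, vmlim_bytes: int) -> bool:
    """True when the committed space is above the limit."""
    return vmcom_bytes > vmlim_bytes


limit = default_commit_limit(swap_bytes=2 * GIB, mem_bytes=8 * GIB)
print(limit // GIB, exceeds_limit(7 * GIB, limit))
```

Keep in mind that the kernel only enforces this limit with strict overcommit handling, so a committed space above the limit is not an error by itself.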
682
683 PAG Paging frequency.
684 This line contains the number of scanned pages (`scan') due to the
685 fact that free memory drops below a particular threshold and the
686 number of times that the kernel tries to reclaim pages due to an
687 urgent need (`stall').
688 Also the number of memory pages the system read from swap space
689 (`swin') and the number of memory pages the system wrote to swap
690 space (`swout') are shown.
691
692 PSI Pressure Stall Information.
693 This line contains three percentages per category: average pres‐
694 sure percentage over the last 10, 60 and 300 seconds (separated by
695 slashes).
696 The categories are: CPU for 'some' (`cs'), memory for 'some'
697 (`ms'), memory for 'full' (`mf'), I/O for 'some' (`is'), and I/O
698 for 'full' (`if').
699
700 LVM/MDD/DSK
701 Logical volume/multiple device/disk utilization.
702 Per active unit one line is produced, sorted on unit activity.
703 Such line shows the name (e.g. VolGroup00-lvtmp for a logical vol‐
704 ume or sda for a hard disk), the busy percentage i.e. the portion
705 of time that the unit was busy handling requests (`busy'), the
706 number of read requests issued (`read'), the number of write
707 requests issued (`write'), the number of KiBytes per read
708 (`KiB/r'), the number of KiBytes per write (`KiB/w'), the number
709 of MiBytes per second throughput for reads (`MBr/s'), the number
710 of MiBytes per second throughput for writes (`MBw/s'), the average
711 queue depth (`avq') and the average number of milliseconds needed
712 by a request (`avio') for seek, latency and data transfer.
713 If the screen-width does not allow all of these counters, only a
714 relevant subset is shown.
715
716 The number of lines showing the units can be limited per class
717 (LVM, MDD or DSK) with the 'l' key or statically (see separate
718 man-page of pcp-atoprc(5)). By specifying the value 0 for a par‐
719 ticular class, no lines will be shown any more for that class.
720
721 NFM Network Filesystem (NFS) mount at the client side.
722 For each NFS-mounted filesystem, a line is shown that contains the
723 mounted server directory, the name of the server (`srv'), the
724 total number of bytes physically read from the server (`read') and
725 the total number of bytes physically written to the server
726 (`write'). Data transfer is subdivided in the number of bytes
727 read via normal read() system calls (`nread'), the number of bytes
728 written via normal write() system calls (`nwrit'), the number of
729 bytes read via direct I/O (`dread'), the number of bytes written
730 via direct I/O (`dwrit'), the number of bytes read via memory
731 mapped I/O pages (`mread'), and the number of bytes written via
732 memory mapped I/O pages (`mwrit').
733
734 NFC Network Filesystem (NFS) client side counters.
735 This line contains the number of RPC calls issued by local
736 processes (`rpc'), the number of read RPC calls (`read') and write
737 RPC calls (`rpwrite') issued to the NFS server, the number of RPC
738 calls being retransmitted (`retxmit') and the number of
739 authorization refreshes (`autref').
740
741 NFS Network Filesystem (NFS) server side counters.
742 This line contains the number of RPC calls received from NFS
743 clients (`rpc'), the number of read RPC calls received (`cread'),
744 the number of write RPC calls received (`cwrit'), the number of
745 Megabytes/second returned to read requests by clients (`MBcr/s'),
746 the number of Megabytes/second passed in write requests by clients
747 (`MBcw/s'), the number of network requests handled via TCP
748 (`nettcp'), the number of network requests handled via UDP
749 (`netudp'), the number of reply cache hits (`rchits'), the number
750 of reply cache misses (`rcmiss') and the number of uncached
751 requests (`rcnoca'). Furthermore some error counters indicating
752 the number of requests with a bad format (`badfmt') or a bad
753 authorization (`badaut'), and a counter indicating the number of
754 bad clients (`badcln').
755
756 NET Network utilization (TCP/IP).
757 One line is shown for activity of the transport layer (TCP and
758 UDP), one line for the IP layer and one line per active interface.
759 For the transport layer, counters are shown concerning the number
760 of received TCP segments including those received in error
761 (`tcpi'), the number of transmitted TCP segments excluding those
762 containing only retransmitted octets (`tcpo'), the number of UDP
763 datagrams received (`udpi'), the number of UDP datagrams transmit‐
764 ted (`udpo'), the number of active TCP opens (`tcpao'), the number
765 of passive TCP opens (`tcppo'), the number of TCP output retrans‐
766 missions (`tcprs'), the number of TCP input errors (`tcpie'), the
767 number of TCP output resets (`tcpor'), the number of UDP no ports
768 (`udpnp'), and the number of UDP input errors (`udpie').
769 If the screen-width does not allow all of these counters, only a
770 relevant subset is shown.
771 These counters are related to IPv4 and IPv6 combined.
772
773 For the IP layer, counters are shown concerning the number of IP
774 datagrams received from interfaces, including those received in
775 error (`ipi'), the number of IP datagrams that local higher-layer
776 protocols offered for transmission (`ipo'), the number of received
777 IP datagrams which were forwarded to other interfaces (`ipfrw'),
778 the number of IP datagrams which were delivered to local higher-
779 layer protocols (`deliv'), the number of received ICMP datagrams
780 (`icmpi'), and the number of transmitted ICMP datagrams (`icmpo').
781 If the screen-width does not allow all of these counters, only a
782 relevant subset is shown.
783 These counters are related to IPv4 and IPv6 combined.
784
785 For every active network interface one line is shown, sorted on
786 the interface activity. Such line shows the name of the interface
787 and its busy percentage in the first column. The busy percentage
788 for half duplex is determined by comparing the interface speed
789 with the number of bits transmitted and received per second; for
790 full duplex the interface speed is compared with the highest of
791 either the transmitted or the received bits. When the interface
792 speed can not be determined (e.g. for the loopback interface),
793 `---' is shown instead of the percentage.
794 Furthermore the number of received packets (`pcki'), the number of
795 transmitted packets (`pcko'), the line speed of the interface
796 (`sp'), the effective amount of bits received per second (`si'),
797 the effective amount of bits transmitted per second (`so'), the
798 number of collisions (`coll'), the number of received multicast
799 packets (`mlti'), the number of errors while receiving a packet
800 (`erri'), the number of errors while transmitting a packet
801 (`erro'), the number of received packets dropped (`drpi'), and the
802 number of transmitted packets dropped (`drpo').
803 If the screen-width does not allow all of these counters, only a
804 relevant subset is shown.
805 The number of lines showing the network interfaces can be limited.
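
The busy percentage rule for network interfaces described above can be sketched as follows. This is a minimal illustration under the stated rule (sum of both directions for half duplex, the higher direction for full duplex); the function and parameter names are hypothetical and do not come from the pcp-atop sources.

```python
from typing import Optional


def interface_busy_pct(rx_bps: int, tx_bps: int, speed_bps: int,
                       full_duplex: bool) -> Optional[float]:
    """Busy percentage of an interface, or None when the speed is unknown.

    None corresponds to the `---' shown for e.g. the loopback interface.
    """
    if speed_bps <= 0:
        return None
    # Full duplex: both directions have their own capacity, so only the
    # busiest direction counts. Half duplex: both share one capacity.
    used = max(rx_bps, tx_bps) if full_duplex else rx_bps + tx_bps
    return 100.0 * used / speed_bps


print(interface_busy_pct(300_000_000, 100_000_000, 1_000_000_000, True))
print(interface_busy_pct(300_000_000, 100_000_000, 1_000_000_000, False))
```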
806
807 IFB Infiniband utilization.
808 For every active Infiniband port one line is shown, sorted on
809 activity. Such line shows the name of the port and its busy per‐
810 centage in the first column. The busy percentage is determined by
811 taking the highest of either the transmitted or the received bits
812 during the interval, multiplying that value by the number of lanes
813 and comparing it against the maximum port speed.
814 Furthermore the number of received packets divided by the number
815 of lanes (`pcki'), the number of transmitted packets divided by
816 the number of lanes (`pcko'), the maximum line speed (`sp'), the
817 effective amount of bits received per second (`si'), the effective
818 amount of bits transmitted per second (`so'), and the number of
819 lanes (`lanes').
820 If the screen-width does not allow all of these counters, only a
821 relevant subset is shown.
822 The number of lines showing the Infiniband ports can be limited.
823
825 Following the system level information, the processes are shown
826 whose resource utilization has changed during the last interval.
827 These processes might have used cpu time or issued disk or network
828 requests. However a process is also shown if part of it has been paged
829 out due to lack of memory (while the process itself was in sleep
830 state).
831
832 Per process the following fields may be shown (in alphabetical order),
833 depending on the current output mode as described in the section INTER‐
834 ACTIVE COMMANDS and depending on the current width of your window:
835
836 AVGRSZ The average size of one read-action on disk.
837
838 AVGWSZ The average size of one write-action on disk.
839
840 CID Container ID (Docker) of 12 hexadecimal digits, referring to
841 the container in which the process/thread is running. If a
842 process has been started and finished during the last inter‐
843 val, a `?' is shown because the container ID is not part of
844 the standard process accounting record.
845
846 CMD The name of the process. This name can be surrounded by
847 "less/greater than" signs (`<name>') which means that the
848 process has finished during the last interval.
849 Behind the abbreviation `CMD' in the header line, the current
850 page number and the total number of pages of the
851 process/thread list are shown.
852
853 COMMAND-LINE
854 The full command line of the process (including arguments). If
855 the length of the command line exceeds the length of the
856 screen line, the arrow keys -> and <- can be used for horizon‐
857 tal scroll.
858 Behind the label `COMMAND-LINE' in the header line, the current
859 page number and the total number of pages of the
860 process/thread list are shown.
861
862 CPU The occupation percentage of this process related to the
863 available capacity for this resource on system level.
864
865 CPUNR The identification of the CPU the (main) thread is running on
866 or has recently been running on.
867
868 CTID Container ID (OpenVZ). If a process has been started and fin‐
869 ished during the last interval, a `?' is shown because the
870 container ID is not part of the standard process accounting
871 record.
872
873 DSK The occupation percentage of this process related to the total
874 load that is produced by all processes (i.e. total disk
875 accesses by all processes during the last interval).
876 This information is shown when per process "storage account‐
877 ing" is active in the kernel.
878
879 EGID Effective group-id under which this process executes.
880
881 ENDATE Date that the process has been finished. If the process is
882 still running, this field shows `active'.
883
884 ENTIME Time that the process has been finished. If the process is
885 still running, this field shows `active'.
886
887 ENVID Virtual environment identifier (OpenVZ only).
888
889 EUID Effective user-id under which this process executes.
890
891 EXC The exit code of a terminated process (second position of col‐
892 umn `ST' is E) or the fatal signal number (second position of
893 column `ST' is S or C).
894
895 FSGID Filesystem group-id under which this process executes.
896
897 FSUID Filesystem user-id under which this process executes.
898
899 GPU When the pmdanvidia daemon does not run with root privileges,
900 the GPU percentage reflects the GPU memory occupation percent‐
901 age (memory of all GPUs is 100%).
902 When the pmdanvidia daemon runs with root privileges, the GPU
903 percentage reflects the GPU busy percentage.
904
905 GPUBUSY Busy percentage on all GPUs (one GPU is 100%).
906 When the pmdanvidia daemon does not run with root privileges,
907 this value is not available.
908
909 GPUNUMS Comma-separated list of GPUs used by the process during the
910 interval. When the comma-separated list exceeds the width of
911 the column, a hexadecimal value is shown.
912
913 MAJFLT The number of page faults issued by this process that have
914 been solved by creating/loading the requested memory page.
915
916 MEM The occupation percentage of this process related to the
917 available capacity for this resource on system level.
918
919 MEMAVG Average memory occupation during the interval on all used
920 GPUs.
921
922 MEMBUSY Busy percentage of memory on all GPUs (one GPU is 100%), i.e.
923 the time needed for read and write accesses on memory.
924 When the pmdanvidia daemon does not run with root privileges,
925 this value is not available.
926
927 MEMNOW Memory occupation at the moment of the sample on all used
928 GPUs.
929
930 MINFLT The number of page faults issued by this process that have
931 been solved by reclaiming the requested memory page from the
932 free list of pages.
933
934 NET The occupation percentage of this process related to the total
935 load that is produced by all processes (i.e. consumed network
936 bandwidth of all processes during the last interval).
937 This information will only be shown when the pmdabcc(1) module
938 `netproc' has been installed.
939
940 NICE The more or less static priority that can be given to a
941 process on a scale from -20 (high priority) to +19 (low prior‐
942 ity).
943
944 NPROCS The number of active and terminated processes accumulated for
945 this user or program.
946
947 PID Process-id.
948
949 POLI The policies 'norm' (normal, which is SCHED_OTHER), 'btch'
950 (batch) and 'idle' refer to timesharing processes. The poli‐
951 cies 'fifo' (SCHED_FIFO) and 'rr' (round robin, which is
952 SCHED_RR) refer to realtime processes.
953
954 PPID Parent process-id.
955
956 PRI The process' priority ranges from 0 (highest priority) to 139
957 (lowest priority). Priority 0 to 99 are used for realtime
958 processes (fixed priority independent of their behavior) and
959 priority 100 to 139 for timesharing processes (variable prior‐
960 ity depending on their recent CPU consumption and the nice
961 value).
962
963 PSIZE The proportional memory size of this process (or user).
964 Every process shares resident memory with other processes.
965 E.g. when a particular program is started several times, the
966 code pages (text) are only loaded once in memory and shared by
967 all incarnations. Also the code of shared libraries is shared
968 by all processes using that shared library, as well as shared
969 memory and memory-mapped files. For the PSIZE calculation of
970 a process, the resident memory of a process that is shared
971 with other processes is divided by the number of sharers.
972 This means, that every process is accounted for a proportional
973 part of that memory. Accumulating the PSIZE values of all
974 processes in the system gives a reliable impression of the
975 total resident memory consumed by all processes.
976 Since gathering of all values that are needed to calculate the
977 PSIZE is a relatively time-consuming task, the 'R' key (or
978 '-R' flag) should be active. Gathering these values also
979 requires superuser privileges (otherwise '?K' is shown in the
980 output).
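
The PSIZE idea described above (shared resident memory divided by the number of sharers) can be illustrated with a small sketch. The data layout is an assumption made for this example only; it is not how pcp-atop obtains or stores these values.

```python
from typing import Iterable, Tuple


def proportional_size(private_kib: float,
                      shared_regions: Iterable[Tuple[float, int]]) -> float:
    """Proportional memory size of one process in KiB.

    shared_regions holds (resident_kib, number_of_sharers) pairs; every
    sharer is accounted for an equal part of each shared region.
    """
    return private_kib + sum(size / sharers for size, sharers in shared_regions)


# 1000 KiB private, a 600 KiB library shared by 3 processes and a
# 400 KiB shared memory segment shared by 4 processes.
print(proportional_size(1000, [(600, 3), (400, 4)]))
```

Summing this value over all processes counts every shared region exactly once, which is why the accumulated PSIZE gives a reliable total.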
981
982 RDDSK When the kernel maintains standard io statistics (>= 2.6.20):
983 The read data transfer issued physically on disk (so reading
984 from the disk cache is not accounted for).
985 Unfortunately, the kernel aggregates the data transfer of a
986 process to the data transfer of its parent process when termi‐
987 nating, so you might see transfers for (parent) processes like
988 cron, bash or init, that are not really issued by them.
989
990 RGID The real group-id under which the process executes.
991
992 RGROW The amount of resident memory that the process has grown dur‐
993 ing the last interval. A resident growth can be caused by
994 touching memory pages which were not physically created/loaded
995 before (load-on-demand). Note that a resident growth can also
996 be negative e.g. when part of the process is paged out due to
997 lack of memory or when the process frees dynamically allocated
998 memory. For a process which started during the last interval,
999 the resident growth reflects the total resident size of the
1000 process at that moment.
1001
1002 RSIZE The total resident memory usage consumed by this process (or
1003 user). Notice that the RSIZE of a process includes all resi‐
1004 dent memory used by that process, even if certain memory parts
1005 are shared with other processes (see also the explanation of
1006 PSIZE).
1007
1008 RTPR Realtime priority according to the POSIX standard. Value can be
1009 0 for a timesharing process (policy 'norm', 'btch' or 'idle')
1010 or ranges from 1 (lowest) till 99 (highest) for a realtime
1011 process (policy 'rr' or 'fifo').
1012
1013 RUID The real user-id under which the process executes.
1014
1015 S The current state of the (main) thread: `R' for running (cur‐
1016 rently processing or in the runqueue), `S' for sleeping inter‐
1017 ruptible (wait for an event to occur), `D' for sleeping non-
1018 interruptible, `Z' for zombie (waiting to be synchronized with
1019 its parent process), `T' for stopped (suspended or traced),
1020 `W' for swapping, and `E' (exit) for processes which have fin‐
1021 ished during the last interval.
1022
1023 SGID The saved group-id of the process.
1024
1025 ST The status of a process.
1026 The first position indicates if the process has been started
1027 during the last interval (the value N means 'new process').
1028
1029 The second position indicates if the process has been finished
1030 during the last interval.
1031 The value E means 'exit' on the process' own initiative; the
1032 exit code is displayed in the column `EXC'.
1033 The value S means that the process has been terminated
1034 involuntarily by a signal; the signal number is displayed in the
1035 column `EXC'.
1036 The value C means that the process has been terminated
1037 involuntarily by a signal, producing a core dump in its current
1038 directory; the signal number is displayed in the column `EXC'.
1039
1040 STDATE The start date of the process.
1041
1042 STTIME The start time of the process.
1043
1044 SUID The saved user-id of the process.
1045
1046 SWAPSZ The swap space consumed by this process (or user).
1047
1048 SYSCPU CPU time consumption of this process in system mode (kernel
1049 mode), usually due to system call handling.
1050
1051 TCPRASZ The average size of a received TCP buffer in bytes. This
1052 information will only be shown when the BCC PMDA is active and
1053 the `netproc' module is enabled.
1054
1055 TCPRCV The number of tcp_recvmsg()/tcp_cleanup_rbuf() calls from this
1056 process. This information will only be shown when the BCC
1057 PMDA is active and the `netproc' module is enabled.
1058
1059 TCPSASZ The average size of a TCP buffer requested to be transmitted
1060 in bytes. This information will only be shown when the BCC
1061 PMDA is active and the `netproc' module is enabled.
1062
1063 TCPSND The number of tcp_sendmsg() calls from this process. This
1064 information will only be shown when the BCC PMDA is active and
1065 the `netproc' module is enabled.
1066
1067 THR Total number of threads within this process. All related
1068 threads are contained in a thread group, represented by pcp-
1069 atop as one line or as a separate line when the 'y' key (or -y
1070 flag) is active.
1071
1072 TID Thread-id. All threads within a process run with the same PID
1073 but with a different TID. This value is shown for individual
1074 threads in multi-threaded processes (when using the key 'y').
1075
1076 TRUN Number of threads within this process that are in the state
1077 'running' (R).
1078
1079 TSLPI Number of threads within this process that are in the state
1080 'interruptible sleeping' (S).
1081
1082 TSLPU Number of threads within this process that are in the state
1083 'uninterruptible sleeping' (D).
1084
1085 UDPRASZ The average size of a received UDP buffer in bytes. This
1086 information will only be shown when the BCC PMDA is active and
1087 the `netproc' module is enabled.
1088
1089 UDPRCV The number of udp_recvmsg()/skb_consume_udp() calls from this
1090 process. This information will only be shown when the BCC
1091 PMDA is active and the `netproc' module is enabled.
1092
1093 UDPSASZ The average size of a UDP buffer requested to be transmitted
1094 in bytes. This information will only be shown when the BCC
1095 PMDA is active and the `netproc' module is enabled.
1096
1097 UDPSND The number of udp_sendmsg() calls from this process. This
1098 information will only be shown when the BCC PMDA is active and
1099 the `netproc' module is enabled.
1100
1101 USRCPU CPU time consumption of this process in user mode, due to
1102 processing its own program text.
1103
1104 VDATA The virtual memory size of the private data used by this
1105 process (including heap and shared library data).
1106
1107 VGROW The amount of virtual memory that the process has grown during
1108 the last interval. A virtual growth can be caused by e.g.
1109 issuing a malloc() or attaching a shared memory segment.
1110 Note that a virtual growth can also be negative by e.g.
1111 issuing a free() or detaching a shared memory segment. For a
1112 process which started during the last interval, the virtual
1113 growth reflects the total virtual size of the process at that
1114 moment.
1115
1116 VPID Virtual process-id (within an OpenVZ container). If a process
1117 has been started and finished during the last interval, a `?'
1118 is shown because the virtual process-id is not part of the
1119 standard process accounting record.
1120
1121 VSIZE The total virtual memory usage consumed by this process (or
1122 user).
1123
1124 VSLIBS The virtual memory size of the (shared) text of all shared
1125 libraries used by this process.
1126
1127 VSTACK The virtual memory size of the (private) stack used by this
1128 process.
1129
1130 VSTEXT The virtual memory size of the (shared) text of the executable
1131 program.
1132
1133 WRDSK When the kernel maintains standard io statistics (>= 2.6.20):
1134 The write data transfer issued physically on disk (so writing
1135 to the disk cache is not accounted for). This counter is
1136 maintained for the application process that writes its data to
1137 the cache (assuming that this data is physically transferred
1138 to disk later on). Notice that disk I/O needed for swapping
1139 is not taken into account.
1140 Unfortunately, the kernel aggregates the data transfer of a
1141 process to the data transfer of its parent process when termi‐
1142 nating, so you might see transfers for (parent) processes like
1143 cron, bash or init, that are not really issued by them.
1144
1145 WCANCL When the kernel maintains standard io statistics (>= 2.6.20):
1146 The write data transfer previously accounted for this process
1147 or another process that has been cancelled. Suppose that a
1148 process writes new data to a file and that data is removed
1149 again before the cache buffers have been flushed to disk.
1150 Then the original process shows the written data as WRDSK,
1151 while the process that removes/truncates the file shows the
1152 unflushed removed data as WCANCL.
1153
1155 With the flag -P followed by a list of one or more labels (comma-sepa‐
1156 rated), parseable output is produced for each sample. The labels that
1157 can be specified for system-level statistics correspond to the labels
1158 (first word of each line) that can be found in the interactive output:
1159 "CPU", "cpu", "CPL", "GPU", "MEM", "SWP", "PAG", "PSI", "LVM", "MDD",
1160 "DSK", "NFM", "NFC", "NFS", "NET" and "IFB".
1161 For process-level statistics special labels are introduced: "PRG" (gen‐
1162 eral), "PRC" (cpu), "PRE" (GPU), "PRM" (memory), "PRD" (disk, only if
1163 "storage accounting" is active).
1164 With the label "ALL", all system and process level statistics are
1165 shown.
1166
1167 For every interval all requested lines are shown whereafter pcp-atop
1168 shows a line just containing the label "SEP" as a separator before the
1169 lines for the next sample are generated.
1170 When a sample contains the values since boot, pcp-atop shows a line
1171 just containing the label "RESET" before the lines for this sample are
1172 generated.
1173
1174 The first part of each output-line consists of the following six
1175 fields: label (the name of the label), host (the name of this machine),
1176 epoch (the time of this interval as number of seconds since 1-1-1970),
1177 date (date of this interval in format YYYY/MM/DD), time (time of this
1178 interval in format HH:MM:SS), and interval (number of seconds elapsed
1179 for this interval).
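
A minimal sketch of how a script might split one line of parseable output into the six common fields and the label-specific remainder is shown below. The sample line is invented for illustration, and the whitespace split is a simplifying assumption: it is not safe for labels such as "PRG" whose bracketed names may contain spaces.

```python
from typing import List, NamedTuple, Optional


class Sample(NamedTuple):
    label: str
    host: str
    epoch: int        # seconds since 1-1-1970
    date: str         # YYYY/MM/DD
    time: str         # HH:MM:SS
    interval: int     # seconds elapsed for this interval
    fields: List[str]  # label-specific fields, still as strings


def parse_line(line: str) -> Optional[Sample]:
    """Parse one output line; return None for blank, SEP and RESET lines."""
    parts = line.split()
    if not parts or (len(parts) == 1 and parts[0] in ("SEP", "RESET")):
        return None
    if len(parts) < 6:
        raise ValueError("not a parseable output line: %r" % line)
    label, host, epoch, date, time_of_day, interval = parts[:6]
    return Sample(label, host, int(epoch), date, time_of_day,
                  int(interval), parts[6:])


# Hypothetical CPL line, made up for this example.
sample = parse_line("CPL myhost 1700000000 2023/11/14 22:13:20 10 "
                    "4 0.50 0.40 0.30 12345 6789")
print(sample.label, sample.interval, sample.fields)
```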
1180
1181 The subsequent fields of each output-line depend on the label:
1182
1183 CPU Subsequent fields: total number of clock-ticks per second for
1184 this machine, number of processors, consumption for all CPUs
1185 in system mode (clock-ticks), consumption for all CPUs in user
1186 mode (clock-ticks), consumption for all CPUs in user mode for
1187 niced processes (clock-ticks), consumption for all CPUs in
1188 idle mode (clock-ticks), consumption for all CPUs in wait mode
1189 (clock-ticks), consumption for all CPUs in irq mode (clock-
1190 ticks), consumption for all CPUs in softirq mode (clock-
1191 ticks), consumption for all CPUs in steal mode (clock-ticks),
1192 consumption for all CPUs in guest mode (clock-ticks) overlap‐
1193 ping user mode, frequency of all CPUs and frequency percentage
1194 of all CPUs.
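
The clock-tick counters of the "CPU" label can be turned into percentages along the following lines. This sketch assumes that the counters cover only the reported interval and that 100% stands for one fully used CPU; the function name is hypothetical.

```python
def cpu_mode_pct(ticks: int, hertz: int, interval: int) -> float:
    """Consumption in one mode as a percentage of a single CPU.

    ticks    -- clock-ticks consumed in that mode (assumed per interval)
    hertz    -- clock-ticks per second for this machine
    interval -- number of seconds elapsed for this interval
    With n processors the sum over all modes can reach n * 100.
    """
    if hertz <= 0 or interval <= 0:
        raise ValueError("hertz and interval must be positive")
    return 100.0 * ticks / (hertz * interval)


# 500 ticks of system time at 100 ticks/second over a 10 second interval.
print(cpu_mode_pct(500, 100, 10))
```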
1195
1196 cpu Subsequent fields: total number of clock-ticks per second for
1197 this machine, processor-number, consumption for this CPU in
1198 system mode (clock-ticks), consumption for this CPU in user
1199 mode (clock-ticks), consumption for this CPU in user mode for
1200 niced processes (clock-ticks), consumption for this CPU in
1201 idle mode (clock-ticks), consumption for this CPU in wait mode
1202 (clock-ticks), consumption for this CPU in irq mode (clock-
1203 ticks), consumption for this CPU in softirq mode (clock-
1204 ticks), consumption for this CPU in steal mode (clock-ticks),
1205 consumption for this CPU in guest mode (clock-ticks)
1206 overlapping user mode, frequency of this CPU, frequency percentage
1207 of this CPU, instructions executed by this CPU and cycles for this
1208 CPU.
1209
1210 CPL Subsequent fields: number of processors, load average for last
1211 minute, load average for last five minutes, load average for
1212 last fifteen minutes, number of context-switches, and number
1213 of device interrupts.
1214
1215 GPU Subsequent fields: GPU number, bus-id string, type of GPU
1216 string, GPU busy percentage during last second (-1 if not
1217 available), memory busy percentage during last second (-1 if
1218 not available), total memory size (KiB), used memory (KiB) at
1219 this moment, number of samples taken during interval, cumula‐
1220 tive GPU busy percentage during the interval (to be divided by
1221 the number of samples for the average busy percentage, -1 if
1222 not available), cumulative memory busy percentage during the
1223 interval (to be divided by the number of samples for the aver‐
1224 age busy percentage, -1 if not available), and cumulative mem‐
1225 ory occupation during the interval (to be divided by the num‐
1226 ber of samples for the average occupation).
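
The cumulative GPU fields have to be divided by the number of samples, as described above. A small sketch of that step, with -1 treated as "not available", could look like this (the function name is an assumption for illustration):

```python
from typing import Optional


def average_over_samples(cumulative: float, samples: int) -> Optional[float]:
    """Average of a cumulative GPU field, or None when it is unavailable.

    A cumulative value of -1 means the figure is not available; a sample
    count of zero leaves nothing to average.
    """
    if cumulative < 0 or samples <= 0:
        return None
    return cumulative / samples


print(average_over_samples(240, 10))   # cumulative busy percentage
print(average_over_samples(-1, 10))    # not available
```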
1227
1228 MEM Subsequent fields: page size for this machine (in bytes), size
1229 of physical memory (pages), size of free memory (pages), size
1230 of page cache (pages), size of buffer cache (pages), size of
1231 slab (pages), dirty pages in cache (pages), reclaimable part
1232 of slab (pages), total size of vmware's balloon pages (pages),
1233 total size of shared memory (pages), size of resident shared
1234 memory (pages), size of swapped shared memory (pages), huge
1235 page size (in bytes), total size of huge pages (huge pages),
1236 and size of free huge pages (huge pages).
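
Since most "MEM" fields are expressed in pages, a consumer usually converts them with the page size given in the first field. The helpers below are a hedged sketch of that conversion; their names are hypothetical.

```python
MIB = 1024 * 1024


def pages_to_mib(pages: int, page_size: int) -> float:
    """Convert a page count into MiB, given the page size in bytes."""
    return pages * page_size / MIB


def huge_pages_in_use(total_huge: int, free_huge: int) -> int:
    """Huge pages really in use: total minus free (both in huge pages)."""
    return total_huge - free_huge


# 262144 pages of 4096 bytes; 64 huge pages of which 16 are free.
print(pages_to_mib(262144, 4096), huge_pages_in_use(64, 16))
```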
1237
1238 SWP Subsequent fields: page size for this machine (in bytes), size
1239 of swap (pages), size of free swap (pages), 0 (future use),
1240 size of committed space (pages), and limit for committed space
1241 (pages).
1242
1243 PAG Subsequent fields: page size for this machine (in bytes), num‐
1244 ber of page scans, number of allocstalls, 0 (future use), num‐
1245 ber of swapins, and number of swapouts.
1246
1247 PSI Subsequent fields: PSI statistics present on this system (n or
1248 y), CPU some avg10, CPU some avg60, CPU some avg300, CPU some
1249 accumulated microseconds during interval, memory some avg10,
1250 memory some avg60, memory some avg300, memory some accumulated
1251 microseconds during interval, memory full avg10, memory full
1252 avg60, memory full avg300, memory full accumulated microsec‐
1253 onds during interval, I/O some avg10, I/O some avg60, I/O some
1254 avg300, I/O some accumulated microseconds during interval, I/O
1255 full avg10, I/O full avg60, I/O full avg300, and I/O full
1256 accumulated microseconds during interval.
1257
1258 LVM/MDD/DSK
1259 For every logical volume/multiple device/hard disk one line is
1260 shown.
1261 Subsequent fields: name, number of milliseconds spent for I/O,
1262 number of reads issued, number of sectors transferred for
1263 reads, number of writes issued, and number of sectors trans‐
1264 ferred for write.
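
The disk fields can be related back to the interactive `busy' and `KiB/r' columns roughly as sketched below. Two assumptions are made here and should be verified for your system: the counters cover only the reported interval, and a sector is 512 bytes.

```python
def disk_busy_pct(io_ms: int, interval_s: int) -> float:
    """Portion of the interval that the unit was busy handling requests."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return min(100.0, 100.0 * io_ms / (interval_s * 1000.0))


def kib_per_request(sectors: int, requests: int, sector_bytes: int = 512) -> float:
    """Average transfer size per request in KiB (0.0 without requests).

    sector_bytes=512 is an assumption, not a value taken from the output.
    """
    if requests == 0:
        return 0.0
    return sectors * sector_bytes / 1024.0 / requests


# 2500 ms of I/O in a 10 second interval; 100 reads of 1600 sectors in total.
print(disk_busy_pct(2500, 10), kib_per_request(1600, 100))
```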
1265
1266 NFM Subsequent fields: mounted NFS filesystem, total number of
1267 bytes read, total number of bytes written, number of bytes
1268 read by normal system calls, number of bytes written by normal
1269 system calls, number of bytes read by direct I/O, number of
1270 bytes written by direct I/O, number of pages read by memory-
1271 mapped I/O, and number of pages written by memory-mapped I/O.
1272
1273 NFC Subsequent fields: number of transmitted RPCs, number of
1274 transmitted read RPCs, number of transmitted write RPCs, num‐
1275 ber of RPC retransmissions, and number of authorization
1276 refreshes.
1277
1278 NFS Subsequent fields: number of handled RPCs, number of received
1279 read RPCs, number of received write RPCs, number of bytes read
1280 by clients, number of bytes written by clients, number of RPCs
1281 with bad format, number of RPCs with bad authorization, number
1282 of RPCs from bad client, total number of handled network
1283 requests, number of handled network requests via TCP, number
1284 of handled network requests via UDP, number of handled TCP
1285 connections, number of hits on reply cache, number of misses
1286 on reply cache, and number of uncached requests.
1287
1288 NET First one line is produced for the upper layers of the TCP/IP
1289 stack.
1290 Subsequent fields: the literal word "upper", number of packets
1291 received by TCP, number of packets transmitted by TCP, number
1292 of packets received by UDP, number of packets transmitted by
1293 UDP, number of packets received by IP, number of packets
1294 transmitted by IP, number of packets delivered to higher lay‐
1295 ers by IP, and number of packets forwarded by IP.
1296
1297 Next one line is shown for every interface.
1298 Subsequent fields: name of the interface, number of packets
1299 received by the interface, number of bytes received by the
1300 interface, number of packets transmitted by the interface,
1301 number of bytes transmitted by the interface, interface speed,
1302 and duplex mode (0=half, 1=full).
1303
1304 IFB Subsequent fields: name of the InfiniBand interface, port num‐
1305 ber, number of lanes, maximum rate (Mbps), number of bytes
1306 received, number of bytes transmitted, number of packets
1307 received, and number of packets transmitted.
1308
1309 PRG For every process one line is shown.
1310 Subsequent fields: PID (unique ID of task), name (between
1311 brackets), state, real uid, real gid, TGID (group number of
1312 related tasks/threads), total number of threads, exit code (in
1313 case of fatal signal: signal number + 256), start time
1314 (epoch), full command line (between brackets), PPID, number of
1315 threads in state 'running' (R), number of threads in state
1316 'interruptible sleeping' (S), number of threads in state
1317 'uninterruptible sleeping' (D), effective uid, effective gid,
1318 saved uid, saved gid, filesystem uid, filesystem gid, elapsed
1319 time (hertz), is_process (y/n), OpenVZ virtual pid (VPID),
1320 OpenVZ container id (CTID) and Docker container id (CID).
1321
1322 PRC For every process one line is shown.
1323 Subsequent fields: PID, name (between brackets), state, total
1324 number of clock-ticks per second for this machine, CPU-con‐
1325 sumption in user mode (clockticks), CPU-consumption in system
1326 mode (clockticks), nice value, priority, realtime priority,
1327 scheduling policy, current CPU, sleep average, TGID (group
1328 number of related tasks/threads) and is_process (y/n).
1329
1330 PRE For every process one line is shown.
1331 Subsequent fields: PID, name (between brackets), process
1332 state, GPU state (A for active, E for exited, N for no GPU
1333 user), number of GPUs used by this process, bitlist reflecting
1334 used GPUs, GPU busy percentage during interval, memory busy
1335 percentage during interval, memory occupation (KiB) at this
1336 moment, cumulative memory occupation (KiB) during interval, and
1337 number of samples taken during interval.
1338
1339 PRM For every process one line is shown.
1340 Subsequent fields: PID, name (between brackets), state, page
1341 size for this machine (in bytes), virtual memory size
1342 (Kbytes), resident memory size (Kbytes), shared text memory
1343 size (Kbytes), virtual memory growth (Kbytes), resident memory
1344 growth (Kbytes), number of minor page faults, number of major
1345 page faults, virtual library exec size (Kbytes), virtual data
1346 size (Kbytes), virtual stack size (Kbytes), swap space used
1347 (Kbytes), TGID (group number of related tasks/threads),
1348 is_process (y/n) and proportional set size (Kbytes) if the 'R'
1349 option is specified.
1350
1351 PRD For every process one line is shown.
1352 Subsequent fields: PID, name (between brackets), state, obso‐
1353 leted kernel patch installed ('n'), standard io statistics
1354 used ('y' or 'n'), number of reads on disk, cumulative number
1355 of sectors read, number of writes on disk, cumulative number
1356 of sectors written, cancelled number of written sectors, TGID
1357 (group number of related tasks/threads) and is_process (y/n).
1358 If the standard I/O statistics (kernel >= 2.6.20) are not used,
1359 the disk I/O counters per process are not relevant. The counters
1360 'number of reads on disk' and 'number of writes on disk' are
1361 obsolete in any case.
1362
1363 PRN For every process one line is shown.
1364 Subsequent fields: PID, name (between brackets), state, pmd‐
1365 abcc(1) module `netproc' loaded ('y' or 'n'), number of
1366 tcp_sendmsg() calls, cumulative size of TCP buffers requested
1367 to be transmitted, number of tcp_recvmsg()/tcp_cleanup_rbuf()
1368 calls, cumulative size of TCP buffers received, number of
1369 udp_sendmsg() calls, cumulative size of UDP buffers requested
1370 to be transmitted, number of udp_recvmsg()/skb_consume_udp()
1371 calls, cumulative size of UDP buffers received, number of
1372 raw packets transmitted (obsolete, always 0), number of raw
1373 packets received (obsolete, always 0), TGID (group number of
1374 related tasks/threads) and is_process (y/n).
1375
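The field lists above can be turned into a small parser. The following is only a sketch: it assumes that every parseable line starts with six generic fields (label, host, epoch, date, time, interval), that fields are separated by single spaces, and that a field "between brackets" is wrapped in "(" and ")" and may contain spaces. The helper names and the sample lines are invented for illustration, and a name containing unbalanced parentheses would defeat this simple splitter.

```python
# Assumption: each parseable line starts with these six generic fields.
GENERIC = ("label", "host", "epoch", "date", "time", "interval")


def split_fields(line):
    """Split on spaces, but keep a "(...)" group together as one field."""
    fields, token, depth = [], [], 0
    for ch in line.strip():
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch == " " and depth == 0:
            if token:
                fields.append("".join(token))
                token = []
        else:
            token.append(ch)
    if token:
        fields.append("".join(token))
    return fields


def parse_line(line):
    """Return the generic fields by name plus the label-specific fields."""
    fields = split_fields(line)
    record = dict(zip(GENERIC, fields[:6]))
    record["fields"] = fields[6:]
    return record


def dsk_busy_percent(record):
    """LVM/MDD/DSK: share of the interval spent on I/O (field 2 is ms)."""
    ms_io = int(record["fields"][1])
    return 100.0 * ms_io / (int(record["interval"]) * 1000.0)


def prc_cpu_percent(record):
    """PRC: (user + system clockticks) relative to ticks in the interval."""
    f = record["fields"]  # PID, name, state, ticks/second, user, system, ...
    hz, user, system = int(f[3]), int(f[4]), int(f[5])
    return 100.0 * (user + system) / (hz * int(record["interval"]))


def fatal_signal(exit_code):
    """PRG: an exit code above 256 encodes 'signal number + 256'."""
    return exit_code - 256 if exit_code > 256 else None


# Invented sample lines, not real pcp-atop output.
dsk = parse_line(
    "DSK myhost 1700000000 2023/11/14 22:13:20 10 sda 2500 100 8000 50 4000")
prc = parse_line(
    "PRC myhost 1700000000 2023/11/14 22:13:20 10 "
    "1234 (my app) S 100 150 50 0 120 0 0 3 0 1234 y")
print(dsk_busy_percent(dsk))   # 25.0
print(prc_cpu_percent(prc))    # 20.0
print(fatal_signal(256 + 9))   # 9
```

The percentages are only meaningful if the counters cover the sample interval, which this sketch assumes. For PRG lines, which carry two bracketed fields (name and full command line), the same splitter applies under the same caveat.
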
1377 Sending the SIGUSR1 signal to pcp-atop forces a new sample, even if
1378 the current timer interval has not yet expired. The behavior is
1379 similar to pressing the `t` key in an interactive session.
1380
1381 By sending the SIGUSR2 signal to pcp-atop a final sample will be forced
1382 after which pcp-atop will terminate.
1383
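As an illustration of the two signals described above, the following sketch sends them from a Python script. The function names are hypothetical, and the caller is assumed to already know the process ID of the running pcp-atop instance and to have permission to signal it.

```python
import os
import signal


def force_sample(pid):
    """Ask the pcp-atop process with this PID to take a sample now."""
    os.kill(pid, signal.SIGUSR1)


def final_sample_and_stop(pid):
    """Ask pcp-atop to take one last sample and then terminate."""
    os.kill(pid, signal.SIGUSR2)
```

A monitoring script could call force_sample() right before and right after an event of interest so that the event falls into its own short interval.
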
1385 To monitor the current system load interactively with an interval of 5
1386 seconds:
1387
1388 pcp atop 5
1389
1390 To monitor the system load and write it to a file (in plain ASCII) with
1391 an interval of one minute during half an hour with active processes
1392 sorted on memory consumption:
1393
1394 pcp atop -M 60 30 > /log/pcp-atop.mem
1395
1396 Store information about the system and process activity in a PCP ar‐
1397 chive folio with an interval of ten minutes during an hour:
1398
1399 pcp atop -w /tmp/pcp-atop 600 6
1400
1401 View the contents of this file interactively:
1402
1403 pcp atop -r /tmp/pcp-atop
1404
1405 View the processor and disk utilization of this file in parseable for‐
1406 mat:
1407
1408 pcp atop -PCPU,DSK -r /tmp/pcp-atop.folio
1409
1410 View the contents of today's standard logfile interactively:
1411
1412 pcp atop -r
1413
1414 View the contents of the standard logfile of the day before yesterday
1415 interactively:
1416
1417 pcp atop -r yy
1418
1419 View the contents of the standard logfile of 2014, June 7 from 02:00 PM
1420 onwards interactively:
1421
1422 pcp atop -r 20140607 -b 14:00
1423
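The interval and samples arguments in the examples above multiply to the total recording time (600 seconds times 6 samples is one hour). The sketch below builds the same command lines programmatically. The helper names are hypothetical, and only options that appear in the examples above (-w, -r, -b) are used.

```python
import math
from datetime import datetime


def samples_for(duration_seconds, interval_seconds):
    """Number of samples needed to cover the duration at this interval."""
    return math.ceil(duration_seconds / interval_seconds)


def record_command(folio, interval_seconds, duration_seconds):
    """Argument list for writing a folio for the given duration."""
    samples = samples_for(duration_seconds, interval_seconds)
    return ["pcp", "atop", "-w", folio, str(interval_seconds), str(samples)]


def replay_command(day, begin=None):
    """Argument list for viewing the standard logfile of a given day."""
    cmd = ["pcp", "atop", "-r", day.strftime("%Y%m%d")]
    if begin is not None:
        cmd += ["-b", begin.strftime("%H:%M")]
    return cmd


print(record_command("/tmp/pcp-atop", 600, 3600))
print(replay_command(datetime(2014, 6, 7), datetime(2014, 6, 7, 14, 0)))
```

The returned lists can be passed to subprocess.run() unchanged, which avoids shell quoting issues with folio paths.
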
1425 pcp-atop is based on the source code of the atop(1) command from
1426 https://atoptool.nl, maintained by Gerlof Langeveld
1427 (gerlof.langeveld@atoptool.nl), and aims to be command line and output
1428 compatible with it as much as possible. Some features of that atop
1429 command are not available in pcp-atop.
1430
1431 Some features of pcp-atop (such as reporting on the Apache HTTP daemon,
1432 InfiniBand, NFS client mounts, hardware event counts, GPU statistics
1433 and per-process TCP and UDP statistics) are only activated if the
1434 corresponding PCP metrics are available. Refer to the documentation for
1435 pmdaapache(1), pmdainfiniband(1), pmdanfsclient(1), pmdanvidia(1),
1436 pmdaperfevent(1) and pmdabcc(1) for further details on activating these
1437 metrics.
1438
1439 The semantics of the per-process network statistics deviate slightly
1440 from the atop(1) tool: instead of the number of TCP/UDP packets
1441 sent/received (which may be inaccurate due to TCP segmentation off‐
1442 load), pcp-atop shows the number of tcp_sendmsg()/udp_sendmsg()/etc.
1443 kernel calls per process.
1444
1446 /etc/atoprc
1447 Configuration file containing system-wide default values. See
1448 pcp-atoprc(5).
1449
1450 ~/.atoprc
1451 Configuration file containing personal default values. See
1452 pcp-atoprc(5).
1453
1455 Environment variables with the prefix PCP_ are used to parameterize the
1456 file and directory names used by PCP. On each installation, the file
1457 /etc/pcp.conf contains the local values for these variables. The
1458 $PCP_CONF variable may be used to specify an alternative configuration
1459 file, as described in pcp.conf(5).
1460
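A script that needs the same directory names can look them up the way the paragraph above describes. This is a rough sketch that assumes the configuration file consists of NAME=value lines with "#" comments; pcp.conf(5) is the authoritative description of the format. The function names and the sample values are invented for illustration.

```python
import os


def pcp_conf_path(environ=None):
    """Honor $PCP_CONF if set, else fall back to /etc/pcp.conf."""
    environ = os.environ if environ is None else environ
    return environ.get("PCP_CONF", "/etc/pcp.conf")


def parse_pcp_conf(lines):
    """Collect PCP_* settings from NAME=value lines (assumed format)."""
    values = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.startswith("PCP_"):
            values[name] = value
    return values


# Invented sample content, not copied from a real installation.
sample = ["# example only", "PCP_EXAMPLE_DIR=/some/dir", "", "OTHER=ignored"]
print(pcp_conf_path({"PCP_CONF": "/tmp/alt.conf"}))
print(parse_pcp_conf(sample))
```

To read the real file, pass an open file object: parse_pcp_conf(open(pcp_conf_path())).
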
1461 For environment variables affecting PCP tools, see pmGetOptions(3).
1462
1464 PCPIntro(1), pcp(1), pcp-atopsar(1), pmdaapache(1), pmdabcc(1), pmdain‐
1465 finiband(1), pmdanfsclient(1), pmdanvidia(1), pmdaproc(1), mkaf(1),
1466 pmlogger(1), pmlogger_daily(1) and pcp-atoprc(5).
1467
1468
1469
1470Performance Co-Pilot PCP PCP-ATOP(1)