1PCP-ATOP(1) General Commands Manual PCP-ATOP(1)
2
3
4
6 pcp-atop - Advanced System and Process Monitor
7
9 Interactive Usage:
10
11 pcp [pcp options] atop [-aAcCdDfFgGmMnNopRsuvxyY1] [-L linelen] [-Pla‐
12 bel[,label]...] [interval [samples]]
13
14 Writing and reading PCP archive folios:
15
16 pcp atop -w folio [-a] [-S] [interval [samples]]
17 pcp atop -r folio [-AcCdDfFgGmMnNopRsuvxy1] [-b [yy-mm-dd] hh:mm] [-e
18 [yy-mm-dd] hh:mm] [-L linelen] [-Plabel[,label]...] [interval [samples]]
19
21 The program pcp-atop is an interactive monitor to view various aspects
22 of load on a system. It shows the occupation of the most critical
23 hardware resources (from a performance point of view) on system level,
24 i.e. cpu, memory, disk and network.
25 It also shows which processes are responsible for the indicated load
26 with respect to cpu and memory load on process level. Disk load is
27 shown per process if "storage accounting" is active in the kernel.
28
29 Every interval (default: 10 seconds) information is shown about the re‐
30 source occupation on system level (cpu, memory, disks and network lay‐
31 ers), followed by a list of processes which have been active during the
32 last interval (note that all processes that were unchanged during the
33 last interval are not shown, unless the key 'a' has been pressed or un‐
34 less sorting on memory occupation is done). If the list of active pro‐
35 cesses does not entirely fit on the screen, only the top of the list is
36 shown (sorted in order of activity).
37 The intervals are repeated till the number of samples (specified as
38 command argument) is reached, or till the key 'q' is pressed in inter‐
39 active mode.
40
41 When invoked via the pcp(1) command, the PCPIntro(1) options -h/--host,
42 -a/--archive, -O/--origin, -s/--samples, -t/--interval, -Z/--timezone
43 and several other pcp options become indirectly available. The long
44 option form of these is directly available. Additionally, the --hot‐
45 proc option can be used to request the per-process PCP metrics be used
46 instead of the default proc metrics from pmdaproc(1).
47
48 When pcp-atop is started, it checks whether the standard output channel
49 is connected to a screen, or to a file/pipe. In the first case it pro‐
50 duces screen control codes (via the ncurses library) and behaves inter‐
51 actively; in the second case it produces flat ASCII-output.
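The check described above can be illustrated with a minimal sketch. It is not the pcp-atop implementation; the function name choose_output_mode and the mode labels are assumptions made for this example only.

```python
import io
import sys


def choose_output_mode(stream) -> str:
    """Return 'interactive' for a terminal and 'flat' for a file or pipe."""
    # isatty() is True only when the stream is attached to a terminal device.
    return "interactive" if stream.isatty() else "flat"


# An in-memory stream stands in for redirected output here.
print(choose_output_mode(io.StringIO()))
# The result for the real standard output depends on how the script is run.
print(choose_output_mode(sys.stdout))
```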
52
53 In interactive mode, the output of pcp-atop scales dynamically to the
54 current dimensions of the screen/window.
55 If the window is resized horizontally, columns will be added or removed
56 automatically. For this purpose, every column has a particular weight.
57 The columns with the highest weights that fit within the current width
58 will be shown.
59 If the window is resized vertically, lines of the process/thread list
60 will be added or removed automatically.
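The weight-based column selection can be sketched as follows. The column names, widths and weights are hypothetical, and the greedy strategy (stop at the first column that no longer fits) is an assumption; the text above only states that the highest-weight columns that fit are shown.

```python
from typing import List, NamedTuple


class Column(NamedTuple):
    name: str    # hypothetical column label
    width: int   # hypothetical number of characters needed
    weight: int  # a higher weight means a more important column


def columns_that_fit(columns: List[Column], screen_width: int) -> List[str]:
    """Pick columns in descending weight order while they still fit."""
    chosen = []
    used = 0
    for col in sorted(columns, key=lambda c: c.weight, reverse=True):
        needed = col.width + (1 if chosen else 0)  # one separating space
        if used + needed > screen_width:
            break  # assumption: stop at the first column that does not fit
        chosen.append(col)
        used += needed
    # Report the survivors in their original left-to-right layout order.
    return [c.name for c in columns if c in chosen]


LAYOUT = [
    Column("PID", 5, 10),
    Column("SYSCPU", 6, 7),
    Column("USRCPU", 6, 7),
    Column("VGROW", 6, 3),
    Column("CMD", 14, 9),
]

print(columns_that_fit(LAYOUT, 30))
print(columns_that_fit(LAYOUT, 80))
```

Widening the window from 30 to 80 positions adds the lower-weight columns back, which mirrors the horizontal resize behaviour described above.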
61
62 Furthermore, in interactive mode the output of pcp-atop can be con‐
63 trolled by pressing particular keys. However, it is also possible to
64 specify such a key as a flag on the command line. In that case pcp-atop
65 switches to the indicated mode beforehand; this mode can be modified
66 again interactively. Specifying such a key as a flag is especially useful
67 when running pcp-atop with output to a pipe or file (non-interac‐
68 tively). These flags are the same as the keys that can be pressed in
69 interactive mode (see section INTERACTIVE COMMANDS).
70 Additional flags are available to support storage of pcp-atop data in
71 PCP archive format (see section PCP DATA STORAGE).
72
73 COLORS
74 For the resource consumption on system level, pcp-atop uses colors to
75 indicate that a critical occupation percentage has been (almost)
76 reached. A critical occupation percentage means that it is likely that
77 this load causes a noticeable negative performance influence for appli‐
78 cations using this resource. The critical percentage depends on the
79 type of resource: e.g. the performance influence of a disk with a busy
80 percentage of 80% might be more noticeable for applications/users than a
81 CPU with a busy percentage of 90%.
82 Currently pcp-atop uses the following default values to calculate a
83 weighted percentage per resource:
84
85 Processor
86 A busy percentage of 90% or higher is considered `critical'.
87
88 Disk
89 A busy percentage of 70% or higher is considered `critical'.
90
91 Network
92 A busy percentage of 90% or higher for the load of an interface is
93 considered `critical'.
94
95 Memory
96 An occupation percentage of 90% is considered `critical'. Notice
97 that this occupation percentage is the accumulated memory consump‐
98 tion of the kernel (including slab) and all processes; the memory
99 for the page cache (`cache' and `buff' in the MEM-line) and the
100 reclaimable part of the slab (`slrec') is not included!
101 If the number of pages swapped out (`swout' in the PAG-line) is
102 larger than 10 per second, the memory resource is considered
103 `critical'. A value of at least 1 per second is considered `al‐
104 most critical'.
105 If the committed virtual memory exceeds the limit (`vmcom' and
106 `vmlim' in the SWP-line), the SWP-line is colored due to overcom‐
107 mitting the system.
108
109 Swap
110 An occupation percentage of 80% is considered `critical' because
111 swap space might be completely exhausted in the near future; it is
112 not critical from a performance point-of-view.
113
114 These default values can be modified in the configuration file (see
115 separate man-page of pcp-atoprc(5)).
116
117 When a resource exceeds its critical occupation percentage, the rele‐
118 vant values in the screen line are colored red by default.
119 When a resource exceeds 80% (by default) of its critical percentage (so
120 it is almost critical), the relevant values in the screen line are
121 colored cyan by default. This `almost critical percentage' (one value
122 for all resources) can be modified in the configuration file (see sepa‐
123 rate man-page of pcp-atoprc(5)).
124 The default colors red and cyan can be modified in the configuration
125 file as well (see separate man-page of pcp-atoprc(5)).
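The default thresholds above can be summarized in a short sketch. The function names and the handling of values exactly on a boundary are assumptions; only the percentages, the swap-out rates and the default colors come from the text.

```python
# Default critical percentages as listed above.
CRITICAL_PERCENT = {"cpu": 90, "disk": 70, "network": 90, "memory": 90, "swap": 80}
# Default 'almost critical percentage', relative to the critical value.
ALMOST_CRITICAL = 80


def color_for(resource, busy_percent, almost_critical=ALMOST_CRITICAL):
    """Return the default color for a resource at the given busy percentage."""
    critical = CRITICAL_PERCENT[resource]
    if busy_percent >= critical:
        return "red"
    # Assumption: reaching the almost-critical level already counts.
    if busy_percent >= critical * almost_critical / 100:
        return "cyan"
    return "none"


def memory_color_for_swapouts(swout_per_second):
    """Apply the swap-out rule for the memory resource."""
    if swout_per_second > 10:
        return "red"
    if swout_per_second >= 1:
        return "cyan"
    return "none"


print(color_for("disk", 75), color_for("cpu", 75), color_for("cpu", 50))
```

With these defaults a disk that is 75% busy is already critical, while a CPU at 75% is only almost critical, because 80% of its critical value of 90% is 72%.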
126
127 With the key 'x' (or flag -x), the use of colors can be suppressed.
128
129 GPU STATISTICS GATHERING
130 GPU statistics can be gathered by pmdanvidia(1) which is a separate
131 data collection daemon process. It gathers cumulative utilization
132 counters of every Nvidia GPU in the system, as well as utilization
133 counters of every process that uses a GPU. When pcp-atop notices that
134 the daemon is active, it reads these GPU utilization counters with ev‐
135 ery interval.
136
137 A description of the utilization counters can be found in the section
138 OUTPUT DESCRIPTION.
139
140 INTERACTIVE COMMANDS
141 When running pcp-atop interactively (no output redirection), keys can
142 be pressed to control the output. In general, lower case keys can be
143 used to show other information for the active processes and upper case
144 keys can be used to influence the sort order of the active
145 process/thread list.
146
147 g Show generic output (default).
148
149 Per process the following fields are shown in case of a window-
150 width of 80 positions: process-id, cpu consumption during the last
151 interval in system and user mode, the virtual and resident memory
152 growth of the process.
153
154 The subsequent columns depend on the used kernel:
155 When the kernel supports "storage accounting" (>= 2.6.20), the
156 data transfer for read/write on disk, the status and exit code are
157 shown for each process. When the kernel does not support "storage
158 accounting", the username, number of threads in the thread group,
159 the status and exit code are shown.
160 The last columns contain the state, the occupation percentage for
161 the chosen resource (default: cpu) and the process name.
162
163 When more than 80 positions are available, other information is
164 added.
165
166 m Show memory related output.
167
168 Per process the following fields are shown in case of a window-
169 width of 80 positions: process-id, minor and major memory faults,
170 size of virtual shared text, total virtual process size, total
171 resident process size, virtual and resident growth during last in‐
172 terval, memory occupation percentage and process name.
173
174 When more than 80 positions are available, other information is
175 added.
176
177 For memory consumption, always all processes are shown (also the
178 processes that were not active during the interval).
179
180 d Show disk-related output.
181
182 When "storage accounting" is active in the kernel, the following
183 fields are shown: process-id, amount of data read from disk,
184 amount of data written to disk, amount of data that was written
185 but has been withdrawn again (WCANCL), disk occupation percentage
186 and process name.
187
188 s Show scheduling characteristics.
189
190 Per process the following fields are shown in case of a window-
191 width of 80 positions: process-id, number of threads in state
192 'running' (R), number of threads in state 'interruptible sleeping'
193 (S), number of threads in state 'uninterruptible sleeping' (D),
194 scheduling policy (normal timesharing, realtime round-robin, real‐
195 time fifo), nice value, priority, realtime priority, current pro‐
196 cessor, status, exit code, state, the occupation percentage for
197 the chosen resource and the process name.
198
199 When more than 80 positions are available, other information is
200 added.
201
202 v Show various process characteristics.
203
204 Per process the following fields are shown in case of a window-
205 width of 80 positions: process-id, user name and group, start date
206 and time, status (e.g. exit code if the process has finished),
207 state, the occupation percentage for the chosen resource and the
208 process name.
209
210 When more than 80 positions are available, other information is
211 added.
212
213 c Show the command line of the process.
214
215 Per process the following fields are shown: process-id, the occu‐
216 pation percentage for the chosen resource and the command line in‐
217 cluding arguments.
218
219 e Show GPU utilization.
220
221 Per process at least the following fields are shown: process-id,
222 range of GPU numbers on which the process currently runs, GPU busy
223 percentage on all GPUs, memory busy percentage (i.e. read and
224 write accesses on memory) on all GPUs, memory occupation at the
225 moment of the sample, average memory occupation during the sample,
226 and GPU percentage.
227
228 When the pmdanvidia daemon does not run with root privileges, the
229 GPU busy percentage and the memory busy percentage are not avail‐
230 able on process level. In that case, the GPU percentage on
231 process level reflects the GPU memory occupation instead of the
232 GPU busy percentage (which is preferred).
233
234 o Show the user-defined line of the process.
235
236 In the configuration file the keyword ownprocline can be specified
237 with the description of a user-defined output-line.
238 Refer to the man-page of pcp-atoprc(5) for a detailed description.
239
240 y Show the individual threads within a process (toggle).
241
242 Single-threaded processes are still shown as one line.
243 For multi-threaded processes, one line represents the process
244 while additional lines show the activity per individual thread (in
245 a different color). Depending on the option 'a' (all or active
246 toggle), all threads are shown or only the threads that were ac‐
247 tive during the last interval. Depending on the option 'Y' (sort
248 threads), the threads per process will be sorted on the chosen
249 sort criterion or not.
250 Whether this key is active or not can be seen in the header line.
251
252 Y Sort the threads per process when combined with option 'y' (tog‐
253 gle).
254
255 u Show the process activity accumulated per user.
256
257 Per user the following fields are shown: number of processes ac‐
258 tive or terminated during last interval (or in total if combined
259 with command `a'), accumulated cpu consumption during last inter‐
260 val in system and user mode, the current virtual and resident mem‐
261 ory space consumed by active processes (or all processes of the
262 user if combined with command `a').
263 When "storage accounting" is active in the kernel, the accumulated
264 read and write throughput on disk is shown. When the pmdabcc(1)
265 module `netproc' has been installed, the number of receive and
266 send network calls are shown.
267 The last columns contain the accumulated occupation percentage for
268 the chosen resource (default: cpu) and the user name.
269
270 p Show the process activity accumulated per program (i.e. process
271 name).
272
273 Per program the following fields are shown: number of processes
274 active or terminated during last interval (or in total if combined
275 with command `a'), accumulated cpu consumption during last inter‐
276 val in system and user mode, the current virtual and resident mem‐
277 ory space consumed by active processes (or all processes of the
278 program if combined with command `a').
279 When "storage accounting" is active in the kernel, the accumulated
280 read and write throughput on disk is shown. When the pmdabcc(1)
281 module `netproc' has been installed, the number of receive and
282 send network calls are shown.
283 The last columns contain the accumulated occupation percentage for
284 the chosen resource (default: cpu) and the program name.
285
286 j Show the process activity accumulated per Docker container.
287
288 Per container the following fields are shown: number of processes
289 active or terminated during last interval (or in total if combined
290 with command `a'), accumulated cpu consumption during last inter‐
291 val in system and user mode, the current virtual and resident mem‐
292 ory space consumed by active processes (or all processes of the
293 container if combined with command `a').
294 When "storage accounting" is active in the kernel, the accumulated
295 read and write throughput on disk is shown. When the pmdabcc(1)
296 module `netproc' has been installed, the number of receive and
297 send network calls are shown.
298 The last columns contain the accumulated occupation percentage for
299 the chosen resource (default: cpu) and the Docker container id
300 (CID).
301
302 C Sort the current list in the order of cpu consumption (default).
303 The one-but-last column changes to ``CPU''.
304
305 E Sort the current list in the order of GPU utilization (preferred,
306 but only applicable when the pmdanvidia daemon runs under root
307 privileges) or the order of GPU memory occupation. The one-but-
308 last column changes to ``GPU''.
309
310 M Sort the current list in the order of resident memory consumption.
311 The one-but-last column changes to ``MEM''. In case of sorting on
312 memory, the full process list will be shown (not only the active
313 processes).
314
315 D Sort the current list in the order of disk accesses issued. The
316 one-but-last column changes to ``DSK''.
317
318 N Sort the current list in the order of network bandwidth (received
319 and transmitted). The one-but-last column changes to ``NET''.
320
321 A Sort the current list automatically in the order of the most busy
322 system resource during this interval. The one-but-last column
323 shows either ``ACPU'', ``AMEM'', ``ADSK'' or ``ANET'' (the preced‐
324 ing 'A' indicates automatic sorting-order). The most busy re‐
325 source is determined by comparing the weighted busy-percentages of
326 the system resources, as described earlier in the section COLORS.
327 This option remains valid until another sorting-order is explic‐
328 itly selected again.
329 A sorting-order for disk is only possible when "storage account‐
330 ing" is active. A sorting-order for network is only possible when
331 the pmdabcc(1) module `netproc' has been installed.
332
333 Miscellaneous interactive commands:
334
335 ? Request for help information (also the key 'h' can be pressed).
336
337 V Request for version information (version number and date).
338
339 R Gather and calculate the proportional set size of processes (tog‐
340 gle). Gathering of all values that are needed to calculate the
341 PSIZE of a process is a very time-consuming task, so this key
342 should only be active when analyzing the resident memory consump‐
343 tion of processes.
344
345 W Get the WCHAN per thread (toggle). Gathering of the WCHAN string
346 per thread is a relatively time-consuming task, so this key should
347 only be made active when analyzing the reason for threads to be in
348 sleep state.
349
350 x Suppress colors to highlight critical resources (toggle).
351 Whether this key is active or not can be seen in the header line.
352
353 z The pause key can be used to freeze the current situation in order
354 to investigate the output on the screen. While pcp-atop is
355 paused, the keys described above can be pressed to show other in‐
356 formation about the current list of processes. Whenever the pause
357 key is pressed again, pcp-atop will continue with the next sample.
358
359 i Modify the interval timer (default: 10 seconds). If an interval
360 timer of 0 is entered, the interval timer is switched off. In
361 that case a new sample can only be triggered manually by pressing
362 the key 't'.
363
364 t Trigger a new sample manually. This key can be pressed if the
365 current sample should be finished before the timer has expired,
366 or if no timer is set at all (interval timer defined as 0). In
367 the latter case pcp-atop can be used as a stopwatch to measure the
368 load being caused by a particular application transaction, without
369 knowing beforehand how many seconds this transaction will last.
370
371 When viewing the contents of an archive folio, this key can be
372 used to show the next sample from the folio.
373
374 T When viewing the contents of an archive folio, this key can be
375 used to show the previous sample from the folio.
376
377 b When viewing the contents of an archive folio, this key can be
378 used to move to a certain timestamp within the file (either for‐
379 ward or backward).
380
381 r Reset all counters to zero to see the system and process activity
382 since boot again.
383
384 When viewing the contents of an archive, this key can be used to
385 rewind to the beginning of the file again.
386
387 U Specify a search string for specific user names as a regular ex‐
388 pression. From now on, only (active) processes will be shown from
389 a user which matches the regular expression. The system statis‐
390 tics are still system wide. If the Enter-key is pressed without
391 specifying a name, (active) processes of all users will be shown
392 again.
393 Whether this key is active or not can be seen in the header line.
394
395 I Specify a list with one or more PIDs to be selected. From now on,
396 only processes will be shown with a PID which matches one of the
397 given list. The system statistics are still system wide. If the
398 Enter-key is pressed without specifying a PID, all (active) pro‐
399 cesses will be shown again.
400 Whether this key is active or not can be seen in the header line.
401
402 P Specify a search string for specific process names as a regular
403 expression. From now on, only processes will be shown with a name
404 which matches the regular expression. The system statistics are
405 still system wide. If the Enter-key is pressed without specifying
406 a name, all (active) processes will be shown again.
407 Whether this key is active or not can be seen in the header line.
408
409 / Specify a specific command line search string as a regular expres‐
410 sion. From now on, only processes will be shown with a command
411 line which matches the regular expression. The system statistics
412 are still system wide. If the Enter-key is pressed without speci‐
413 fying a string, all (active) processes will be shown again.
414 Whether this key is active or not can be seen in the header line.
415
416 J Specify a Docker container id of 12 (hexadecimal) characters.
417 From now on, only processes will be shown that run in that spe‐
418 cific Docker container (CID). The system statistics are still
419 system wide. If the Enter-key is pressed without specifying a
420 container id, all (active) processes will be shown again.
421 Whether this key is active or not can be seen in the header line.
422
423 Q Specify a comma-separated list of process state characters. From
424 now on, only processes will be shown that are in those specific
425 process states. Accepted states are: R (running), S (sleeping), D
426 (disk sleep), T (stopped), t (tracing stop), X (dead), Z (zombie)
427 and P (parked). The system statistics are still system wide. If
428 the Enter-key is pressed without specifying a state, all (active)
429 processes will be shown again.
430 Whether this key is active or not can be seen in the header line.
431
432 S Specify search strings for specific logical volume names, specific
433 disk names and specific network interface names. All search
434 strings are interpreted as regular expressions. From now on,
435 only those system resources are shown that match the corresponding
436 regular expression. If the Enter-key is pressed without specify‐
437 ing a search string, all (active) system resources of that type
438 will be shown again.
439 Whether this key is active or not can be seen in the header line.
440
441 a The `all/active' key can be used to toggle between only show‐
442 ing/accumulating the processes that were active during the last
443 interval (default) or showing/accumulating all processes.
444 Whether this key is active or not can be seen in the header line.
445
446 G By default, pcp-atop shows/accumulates the processes that are
447 alive and the processes that have exited during the last interval.
448 With this key (toggle), showing/accumulating the processes that
449 have exited can be suppressed.
450 Whether this key is active or not can be seen in the header line.
451
452 f Show a fixed (maximum) number of header lines for system resources
453 (toggle). By default only the lines are shown about system re‐
454 sources (CPUs, paging, logical volumes, disks, network interfaces)
455 that really have been active during the last interval. With this
456 key you can force pcp-atop to show lines of inactive resources as
457 well.
458 Whether this key is active or not can be seen in the header line.
459
460 F Suppress sorting of system resources (toggle). By default system
461 resources (CPUs, logical volumes, disks, network interfaces) are
462 sorted on utilization.
463 Whether this key is active or not can be seen in the header line.
464
465 1 Show relevant counters as an average per second (in the format
466 `..../s') instead of as a total during the interval (toggle).
467 Whether this key is active or not can be seen in the header line.
468
469 l Limit the number of system level lines for the counters per-cpu,
470 the active disks and the network interfaces. By default lines are
471 shown of all CPUs, disks and network interfaces which have been
472 active during the last interval. Limiting these lines can be use‐
473 ful on systems with a huge number of CPUs, disks or interfaces in order
474 to be able to run pcp-atop on a screen/window with e.g. only 24
475 lines.
476 For all mentioned resources the maximum number of lines can be
477 specified interactively. When using the flag -l the maximum number
478 of per-cpu lines is set to 0, the maximum number of disk lines to
479 5 and the maximum number of interface lines to 3. These values
480 can be modified again in interactive mode.
481
482 k Send a signal to an active process (a.k.a. kill a process).
483
484 q Quit the program.
485
486 PgDn Show the next page of the process/thread list.
487 With the arrow-down key the list can be scrolled downwards with
488 single lines.
489
490 ^F Show the next page of the process/thread list (forward).
491 With the arrow-down key the list can be scrolled downwards with
492 single lines.
493
494 PgUp Show the previous page of the process/thread list.
495 With the arrow-up key the list can be scrolled upwards with single
496 lines.
497
498 ^B Show the previous page of the process/thread list (backward).
499 With the arrow-up key the list can be scrolled upwards with single
500 lines.
501
502 ^L Redraw the screen.
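The process state selection of the key 'Q' described above can be sketched as follows. This is only an illustration of the documented input format (a comma-separated list of state characters); the function names and the error handling are assumptions, not pcp-atop behaviour.

```python
# State characters accepted by the 'Q' key, as listed above.
ACCEPTED_STATES = {
    "R": "running", "S": "sleeping", "D": "disk sleep", "T": "stopped",
    "t": "tracing stop", "X": "dead", "Z": "zombie", "P": "parked",
}


def parse_state_filter(text: str) -> set:
    """Parse input such as 'R,D'; empty input means that no filter is set."""
    text = text.strip()
    if not text:
        return set()
    states = set()
    for item in text.split(","):
        item = item.strip()
        if item not in ACCEPTED_STATES:
            # Assumption: unknown characters are rejected.
            raise ValueError(f"unknown process state: {item!r}")
        states.add(item)
    return states


def is_shown(process_state: str, selected: set) -> bool:
    """A process is shown when no filter is set or when its state matches."""
    return not selected or process_state in selected


selected = parse_state_filter("R,D")
print(is_shown("D", selected), is_shown("S", selected))
```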
503
504 PCP DATA STORAGE
505 In order to store system and process level statistics for long-term
506 analysis (e.g. to check the system load and the active processes run‐
507 ning yesterday between 3:00 and 4:00 PM), pcp-atop can store the system
508 and process level statistics in the PCP archive format, as an archive
509 folio (see mkaf(1)).
510 All information about processes and threads is stored in the archive.
511 The interval (default: 10 seconds) and number of samples (default: in‐
512 finite) can be passed as last arguments. Instead of the number of sam‐
513 ples, the flag -S can be used to indicate that pcp-atop should finish
514 anyhow before midnight.
515
516 A PCP archive can be read and visualized again with the -r option. The
517 argument is a comma-separated list of names, each of which may be the
518 base name of an archive or the name of a directory containing one or
519 more archives. If no argument is specified, the file $PCP_LOG_DIR/pm‐
520 logger/HOST/YYYYMMDD is opened for input (where YYYYMMDD are digits
521 representing the current date, and HOST is the hostname of the machine
522 being logged). If a filename is specified in the format YYYYMMDD (rep‐
523 resenting any valid date), the file $PCP_LOG_DIR/pmlogger/HOST/YYYYMMDD
524 is opened. If a filename with the symbolic name y is specified, yes‐
525 terday's daily logfile is opened (this can be repeated so 'yyyy' indi‐
526 cates the logfile of four days ago).
527 The samples from the file can be viewed interactively by using the key
528 't' to show the next sample, the key 'T' to show the previous sample,
529 the key 'b' to branch to a particular time or the key 'r' to rewind to
530 the beginning of the file.
531 When output is redirected to a file or pipe, pcp-atop prints all sam‐
532 ples in plain ASCII. The default line length is 80 characters in that
533 case; with the flag -L followed by an alternate line length, more (or
534 less) columns will be shown.
535 With the flag -b (begin time) and/or -e (end time) followed by a time
536 argument of the form [YY-MM-DD] HH:MM, a certain time period within the
537 archive can be selected.
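The way a single -r argument maps to an archive name, as described above, can be sketched as follows. The function resolve_archive and its parameters are hypothetical; the values used for $PCP_LOG_DIR and the host name in the example are assumptions, and the handling of comma-separated lists is left out.

```python
import datetime
import posixpath
import re


def resolve_archive(arg, log_dir, host, today):
    """Map one -r argument to an archive name (illustrative only)."""
    base = posixpath.join(log_dir, "pmlogger", host)
    if not arg:
        # No argument: the daily archive of the current date.
        return posixpath.join(base, today.strftime("%Y%m%d"))
    if re.fullmatch(r"y+", arg):
        # 'y' is yesterday; every additional 'y' goes back one more day.
        day = today - datetime.timedelta(days=len(arg))
        return posixpath.join(base, day.strftime("%Y%m%d"))
    if re.fullmatch(r"\d{8}", arg):
        try:
            datetime.datetime.strptime(arg, "%Y%m%d")  # must be a valid date
            return posixpath.join(base, arg)
        except ValueError:
            pass  # not a date, fall through and use the name as given
    # Anything else: an archive base name or a directory, used as given.
    return arg


today = datetime.date(2024, 3, 5)
for arg in ("", "y", "yyyy", "20231231", "myarchive"):
    print(repr(arg), "->", resolve_archive(arg, "/var/log/pcp", "web01", today))
```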
538
539 OUTPUT DESCRIPTION
540 The first sample shows the system level activity since boot (the
541 elapsed time in the header shows the time since boot). Note that par‐
542 ticular counters could have reached their maximum value (several times)
543 and started from zero again, so do not rely on these figures.
544
545 For every sample pcp-atop first shows the lines related to system level
546 activity. If a particular system resource has not been used during the
547 interval, the entire line related to this resource is suppressed. So
548 the number of system level lines may vary for each sample.
549 After that a list is shown of processes which have been active during
550 the last interval. This list is by default sorted on cpu consumption,
551 but this order can be changed by the keys which are previously de‐
552 scribed.
553
554 If values have to be shown by pcp-atop which do not fit in the column
555 width, another format is used. If e.g. a cpu-consumption of 233216 mil‐
556 liseconds should be shown in a column width of 4 positions, it is shown
557 as `233s' (in seconds). For large memory figures, another unit is cho‐
558 sen if the value does not fit (Mb instead of Kb, Gb instead of Mb, Tb
559 instead of Gb, ...). For other values, a kind of exponent notation is
560 used (value 123456789 shown in a column of 5 positions gives 123e6).
561
563 The system level information consists of the following output lines:
564
565 PRC Process and thread level totals.
566 This line contains the total cpu time consumed in system mode
567 (`sys') and in user mode (`user'), the total number of processes
568 present at this moment (`#proc'), the total number of threads
569 present at this moment in state `running' (`#trun'), `sleeping in‐
570 terruptible' (`#tslpi') and `sleeping uninterruptible' (`#tslpu'),
571 the number of zombie processes (`#zombie'), the number of clone
572 system calls (`clones'), and the number of processes that ended
573 during the interval (`#exit') when process accounting is used. In‐
574 stead of `#exit' the last column may indicate that process ac‐
575 counting could not be activated (`no procacct').
576 If the screen-width does not allow all of these counters, only a
577 relevant subset is shown.
578
579 CPU CPU utilization.
580 At least one line is shown for the total occupation of all CPUs
581 together.
582 In case of a multi-processor system, an additional line is shown
583 for every individual processor (with `cpu' in lower case), sorted
584 on activity. Inactive CPUs will not be shown by default. The
585 lines showing the per-cpu occupation contain the cpu number in the
586 field combined with the wait percentage.
587
588 Every line contains the percentage of cpu time spent in kernel
589 mode by all active processes (`sys'), the percentage of cpu time
590 consumed in user mode (`user') for all active processes (including
591 processes running with a nice value larger than zero), the per‐
592 centage of cpu time spent for interrupt handling (`irq') including
593 softirq, the percentage of unused cpu time while no processes were
594 waiting for disk I/O (`idle'), and the percentage of unused cpu
595 time while at least one process was waiting for disk I/O (`wait').
596 In case of per-cpu occupation, the cpu number and the wait per‐
597 centage (`w') for that cpu. The number of lines showing the per-
598 cpu occupation can be limited.
599
600 For virtual machines, the steal-percentage (`steal') shows the
601 percentage of cpu time stolen by other virtual machines running on
602 the same hardware.
603 For physical machines hosting one or more virtual machines, the
604 guest-percentage (`guest') shows the percentage of cpu time used
605 by the virtual machines. Notice that this percentage overlaps the
606 user-percentage!
607
608 When PMC performance monitoring counters are supported by the CPU
609 and the kernel (and pmdaperfevent(1) runs with root privileges),
610 the number of instructions per CPU cycle (`ipc') is shown. The
611 first sample always shows the value 'initial', because the coun‐
612 ters are just activated at the moment that pcp-atop is started.
613 When the CPU busy percentage is high and the IPC is less than 1.0,
614 it is likely that the CPU is frequently waiting for memory access
615 during instruction execution (larger CPU caches or faster memory
616 might be helpful to improve performance). When the CPU busy per‐
617 centage is high and the IPC is greater than 1.0, it is likely that
618 the CPU is instruction-bound (more/faster cores might be helpful
619 to improve performance).
620 Furthermore, per CPU the effective number of cycles (`cycl') is
621 shown. This value can reach the current CPU frequency if such CPU
622 is 100% busy. When an idle CPU is halted, the number of effective
623 cycles can be (considerably) lower than the current frequency.
624 Notice that the average instructions per cycle and the number of cy‐
625 cles are shown in the CPU line for all CPUs.
626 See also: http://www.brendangregg.com/blog/2017-05-09/cpu-utiliza‐
627 tion-is-wrong.html
628
629 In case of frequency scaling, all previously mentioned CPU per‐
630 centages are relative to the used scaling of the CPU during the
631 interval. If a CPU has been active for e.g. 50% in user mode dur‐
632 ing the interval while the frequency scaling of that CPU was 40%,
633 only 20% of the full capacity of the CPU has been used in user
634 mode.
635
636 If the screen-width does not allow all of these counters, only a
637 relevant subset is shown.
638
639 CPL CPU load information.
640 This line contains the load average figures reflecting the number
641 of threads that are available to run on a CPU (i.e. part of the
642 runqueue) or that are waiting for disk I/O. These figures are av‐
643 eraged over 1 (`avg1'), 5 (`avg5') and 15 (`avg15') minutes.
644 Furthermore the number of context switches (`csw'), the number of
645 serviced interrupts (`intr') and the number of available CPUs are
646 shown.
647
648 If the screen-width does not allow all of these counters, only a
649 relevant subset is shown.
650
651 GPU GPU utilization (Nvidia).
652 Read the section GPU STATISTICS GATHERING in this document to find
653 the details about the activation of the pmdanvidia daemon.
654
655 In the first column of every line, the bus-id (last nine charac‐
656 ters) and the GPU number are shown. The subsequent columns show
657 the percentage of time that one or more kernels were executing on
658 the GPU (`gpubusy'), the percentage of time that global (device)
659 memory was being read or written (`membusy'), the occupation per‐
660 centage of memory (`memocc'), the total memory (`total'), the mem‐
661 ory being in use at the moment of the sample (`used'), the average
662 memory being in use during the sample time (`usavg'), the number
663 of processes being active on the GPU at the moment of the sample
664 (`#proc'), and the type of GPU.
665
666 If the screen-width does not allow all of these counters, only a
667 relevant subset is shown.
668 The number of lines showing the GPUs can be limited.
669
670 MEM Memory occupation.
671 This line contains the total amount of physical memory (`tot'),
672 the amount of memory which is currently free (`free'), the amount
673 of memory in use as page cache including the total resident shared
674 memory (`cache'), the amount of memory within the page cache that
675 has to be flushed to disk (`dirty'), the amount of memory used for
676 filesystem meta data (`buff'), the amount of memory being used for
677 kernel mallocs (`slab'), the amount of slab memory that is re‐
678 claimable (`slrec'), the resident size of shared memory including
679 tmpfs (`shmem'), the resident size of shared memory (`shrss'), the
680 amount of shared memory that is currently swapped (`shswp'), the
681 amount of memory that is currently claimed by VMware's balloon
682 driver (`vmbal'), the amount of memory that is currently claimed
683 by the ARC (cache) of ZFSonlinux (`zfarc'), the amount of memory
684 that is claimed for huge pages (`hptot'), and the amount of huge
685 page memory that is really in use (`hpuse').
686
687 If the screen-width does not allow all of these counters, only a
688 relevant subset is shown.
689
690 SWP Swap occupation and overcommit info.
691 This line contains the total amount of swap space on disk (`tot')
692 and the amount of free swap space (`free'), the size of the swap
693 cache (`swcac'), the total size of compressed storage in zswap
694 (`zpool'), the total size of the compressed pages stored in zswap
695 (`zstor'), the total size of the memory used for KSM (`ksuse',
696 i.e. shared), and the total size of the memory saved (deduped) by
697 KSM (`kssav', i.e. sharing).
698 Furthermore the committed virtual memory space (`vmcom') and the
699 maximum limit of the committed space (`vmlim', which is by default
700 swap size plus 50% of memory size) are shown. The committed space
701 is the reserved virtual space for all allocations of private mem‐
702 ory space for processes. The kernel only verifies whether the
703 committed space exceeds the limit if strict overcommit handling is
704 configured (vm.overcommit_memory is 2).
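
As a rough illustration of the default limit mentioned above (swap size plus 50% of memory size), the following sketch computes `vmlim' from two sizes. The function name and the ratio_pct parameter are assumptions for the example; the kernel's own calculation may take further details into account.

```python
def default_commit_limit(swap_kib, mem_kib, ratio_pct=50):
    """Default limit for committed space: swap plus a share of memory.

    ratio_pct defaults to 50, matching the default described above.
    """
    return swap_kib + mem_kib * ratio_pct // 100


# 2 GiB of swap and 8 GiB of memory, both expressed in KiB
print(default_commit_limit(2 * 1024 * 1024, 8 * 1024 * 1024))
```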
705
706 PAG Paging frequency.
707 This line contains the number of scanned pages (`scan') due to the
708 fact that free memory drops below a particular threshold and the
709 number of times that the kernel tries to reclaim pages due to an ur‐
710 gent need (`stall').
711 Also the number of memory pages the system read from swap space
712 (`swin') and the number of memory pages the system wrote to swap
713 space (`swout') and the number of OOM (out-of-memory) kills
714 (`oomkill') are shown.
715
716 PSI Pressure Stall Information.
717 This line contains percentages about resource pressure related to
718 CPU, memory and I/O. Certain percentages refer to 'some' meaning
719 that some processes/threads were delayed due to resource overload.
720 Other percentages refer to 'full' meaning a loss of overall
721 throughput due to resource overload.
722 The values `cpusome', `memsome', `memfull', `iosome' and `iofull'
723 show the pressure percentage during the entire interval.
724 The values `cs' (cpu some), `ms' (memory some), `mf' (memory
725 full), `is' (I/O some) and `if' (I/O full) each show three per‐
726 centages separated by slashes: pressure percentage over the last
727 10, 60 and 300 seconds.
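
A value such as the one shown for `cs' or `ms' could be split into its three averages as sketched below. The exact formatting of the column (for instance whether a percent sign is present) is an assumption here, so the helper strips an optional trailing `%'.

```python
def parse_psi_triplet(text):
    """Split 'a/b/c' into the 10, 60 and 300 second pressure percentages."""
    avg10, avg60, avg300 = (float(part.rstrip("%")) for part in text.split("/"))
    return {"avg10": avg10, "avg60": avg60, "avg300": avg300}
```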
728
729 LVM/MDD/DSK
730 Logical volume/multiple device/disk utilization.
731 Per active unit one line is produced, sorted on unit activity.
732 Such line shows the name (e.g. VolGroup00-lvtmp for a logical vol‐
733 ume or sda for a hard disk), the busy percentage i.e. the portion
734 of time that the unit was busy handling requests (`busy'), the
735 number of read requests issued (`read'), the number of write re‐
736 quests issued (`write'), the number of KiBytes per read (`KiB/r'),
737 the number of KiBytes per write (`KiB/w'), the number of MiBytes
738 per second throughput for reads (`MBr/s'), the number of MiBytes
739 per second throughput for writes (`MBw/s'), the average queue
740 depth (`avq') and the average number of milliseconds needed by a
741 request (`avio') for seek, latency and data transfer.
742 If the screen-width does not allow all of these counters, only a
743 relevant subset is shown.
744
745 The number of lines showing the units can be limited per class
746 (LVM, MDD or DSK) with the 'l' key or statically (see separate
747 man-page of pcp-atoprc(5)). By specifying the value 0 for a par‐
748 ticular class, no lines will be shown any more for that class.
749
750 NFM Network Filesystem (NFS) mount at the client side.
751 For each NFS-mounted filesystem, a line is shown that contains the
752 mounted server directory, the name of the server (`srv'), the to‐
753 tal number of bytes physically read from the server (`read') and
754 the total number of bytes physically written to the server
755 (`write'). Data transfer is subdivided in the number of bytes
756 read via normal read() system calls (`nread'), the number of bytes
757 written via normal write() system calls (`nwrit'), the number of
758 bytes read via direct I/O (`dread'), the number of bytes written
759 via direct I/O (`dwrit'), the number of bytes read via memory
760 mapped I/O pages (`mread'), and the number of bytes written via
761 memory mapped I/O pages (`mwrit').
762
763 NFC Network Filesystem (NFS) client side counters.
764 This line contains the number of RPC calls issued by local pro‐
765 cesses (`rpc'), the number of read RPC calls (`read') and write
766 RPC calls (`rpwrite') issued to the NFS server, the number of RPC
767 calls being retransmitted (`retxmit') and the number of authoriza‐
768 tion refreshes (`autref').
769
770 NFS Network Filesystem (NFS) server side counters.
771 This line contains the number of RPC calls received from NFS
772 clients (`rpc'), the number of read RPC calls received (`cread'),
773 the number of write RPC calls received (`cwrit'), the number of
774 Megabytes/second returned to read requests by clients (`MBcr/s'),
775 the number of Megabytes/second passed in write requests by clients
776 (`MBcw/s'), the number of network requests handled via TCP
777 (`nettcp'), the number of network requests handled via UDP (`ne‐
778 tudp'), the number of reply cache hits (`rchits'), the number of
779 reply cache misses (`rcmiss') and the number of uncached requests
780 (`rcnoca'). Furthermore some error counters are shown, indicating
781 the number of requests with a bad format (`badfmt') or a bad
782 authorization (`badaut'), and a counter indicating the number of
783 bad clients (`badcln').
784
785 NET Network utilization (TCP/IP).
786 One line is shown for activity of the transport layer (TCP and
787 UDP), one line for the IP layer and one line per active interface.
788 For the transport layer, counters are shown concerning the number
789 of received TCP segments including those received in error
790 (`tcpi'), the number of transmitted TCP segments excluding those
791 containing only retransmitted octets (`tcpo'), the number of UDP
792 datagrams received (`udpi'), the number of UDP datagrams transmit‐
793 ted (`udpo'), the number of active TCP opens (`tcpao'), the number
794 of passive TCP opens (`tcppo'), the number of TCP output retrans‐
795 missions (`tcprs'), the number of TCP input errors (`tcpie'), the
796 number of TCP output resets (`tcpor'), the number of UDP no ports
797 (`udpnp'), and the number of UDP input errors (`udpie').
798 If the screen-width does not allow all of these counters, only a
799 relevant subset is shown.
800 These counters are related to IPv4 and IPv6 combined.
801
802 For the IP layer, counters are shown concerning the number of IP
803 datagrams received from interfaces, including those received in
804 error (`ipi'), the number of IP datagrams that local higher-layer
805 protocols offered for transmission (`ipo'), the number of received
806 IP datagrams which were forwarded to other interfaces (`ipfrw'),
807 the number of IP datagrams which were delivered to local higher-
808 layer protocols (`deliv'), the number of received ICMP datagrams
809 (`icmpi'), and the number of transmitted ICMP datagrams (`icmpo').
810 If the screen-width does not allow all of these counters, only a
811 relevant subset is shown.
812 These counters are related to IPv4 and IPv6 combined.
813
814 For every active network interface one line is shown, sorted on
815 the interface activity. Such line shows the name of the interface
816 and its busy percentage in the first column. The busy percentage
817 for half duplex is determined by comparing the interface speed
818 with the number of bits transmitted and received per second; for
819 full duplex the interface speed is compared with the highest of
820 either the transmitted or the received bits. When the interface
821 speed can not be determined (e.g. for the loopback interface),
822 `---' is shown instead of the percentage.
823 Furthermore the number of received packets (`pcki'), the number of
824 transmitted packets (`pcko'), the line speed of the interface
825 (`sp'), the effective amount of bits received per second (`si'),
826 the effective amount of bits transmitted per second (`so'), the
827 number of collisions (`coll'), the number of received multicast
828 packets (`mlti'), the number of errors while receiving a packet
829 (`erri'), the number of errors while transmitting a packet
830 (`erro'), the number of received packets dropped (`drpi'), and the
831 number of transmitted packets dropped (`drpo').
832 If the screen-width does not allow all of these counters, only a
833 relevant subset is shown.
834 The number of lines showing the network interfaces can be limited.
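
The busy percentage of an interface, as described above, can be approximated as follows. This is a sketch only: the function and parameter names are hypothetical, and the half duplex case is read as the sum of transmitted and received bits.

```python
def interface_busy_pct(rx_bps, tx_bps, speed_bps, full_duplex):
    """Busy percentage of a network interface, or None if unknown.

    rx_bps, tx_bps: bits received and transmitted per second
    speed_bps:      interface speed in bits per second (0 if unknown)
    """
    if speed_bps <= 0:
        # No usable speed (e.g. loopback): shown as `---' in the output.
        return None
    if full_duplex:
        load = max(rx_bps, tx_bps)   # directions do not compete
    else:
        load = rx_bps + tx_bps       # both directions share the medium
    return 100.0 * load / speed_bps
```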
835
836 IFB Infiniband utilization.
837 For every active Infiniband port one line is shown, sorted on ac‐
838 tivity. Such line shows the name of the port and its busy per‐
839 centage in the first column. The busy percentage is determined by
840 taking the highest of either the transmitted or the received bits
841 during the interval, multiplying that value by the number of lanes
842 and comparing it against the maximum port speed.
843 Furthermore the number of received packets divided by the number
844 of lanes (`pcki'), the number of transmitted packets divided by
845 the number of lanes (`pcko'), the maximum line speed (`sp'), the
846 effective amount of bits received per second (`si'), the effective
847 amount of bits transmitted per second (`so'), and the number of
848 lanes (`lanes').
849 If the screen-width does not allow all of these counters, only a
850 relevant subset is shown.
851 The number of lines showing the Infiniband ports can be limited.
852
854 Following the system level information, the processes are shown whose
855 resource utilization has changed during the last interval.
856 These processes might have used cpu time or issued disk or network re‐
857 quests. However a process is also shown if part of it has been paged
858 out due to lack of memory (while the process itself was in sleep
859 state).
860
861 Per process the following fields may be shown (in alphabetical order),
862 depending on the current output mode as described in the section INTER‐
863 ACTIVE COMMANDS and depending on the current width of your window:
864
865 AVGRSZ The average size of one read-action on disk.
866
867 AVGWSZ The average size of one write-action on disk.
868
869 CID Container ID (Docker) of 12 hexadecimal digits, referring to
870 the container in which the process/thread is running. If a
871 process has been started and finished during the last inter‐
872 val, a `?' is shown because the container ID is not part of
873 the standard process accounting record.
874
875 CMD The name of the process. This name can be surrounded by
876 "less/greater than" signs (`<name>') which means that the
877 process has finished during the last interval.
878 Behind the abbreviation `CMD' in the header line, the current
879 page number and the total number of pages of the
880 process/thread list are shown.
881
882 COMMAND-LINE
883 The full command line of the process (including arguments). If
884 the length of the command line exceeds the length of the
885 screen line, the arrow keys -> and <- can be used for horizon‐
886 tal scroll.
887 Behind the word `COMMAND-LINE' in the header line, the current
888 page number and the total number of pages of the
889 process/thread list are shown.
890
891 CPU The occupation percentage of this process related to the
892 available capacity for this resource on system level.
893
894 CPUNR The identification of the CPU the (main) thread is running on
895 or has recently been running on.
896
897 CTID Container ID (OpenVZ). If a process has been started and fin‐
898 ished during the last interval, a `?' is shown because the
899 container ID is not part of the standard process accounting
900 record.
901
902 DSK The occupation percentage of this process related to the total
903 load that is produced by all processes (i.e. total disk ac‐
904 cesses by all processes during the last interval).
905 This information is shown when per process "storage account‐
906 ing" is active in the kernel.
907
908 EGID Effective group-id under which this process executes.
909
910 ENDATE Date that the process has been finished. If the process is
911 still running, this field shows `active'.
912
913 ENTIME Time that the process has been finished. If the process is
914 still running, this field shows `active'.
915
916 ENVID Virtual environment identifier (OpenVZ only).
917
918 EUID Effective user-id under which this process executes.
919
920 EXC The exit code of a terminated process (second position of col‐
921 umn `ST' is E) or the fatal signal number (second position of
922 column `ST' is S or C).
923
924 FSGID Filesystem group-id under which this process executes.
925
926 FSUID Filesystem user-id under which this process executes.
927
928 GPU When the pmdanvidia daemon does not run with root privileges,
929 the GPU percentage reflects the GPU memory occupation percent‐
930 age (memory of all GPUs is 100%).
931 When the pmdanvidia daemon runs with root privileges, the GPU
932 percentage reflects the GPU busy percentage.
933
934 GPUBUSY Busy percentage on all GPUs (one GPU is 100%).
935 When the pmdanvidia daemon does not run with root privileges,
936 this value is not available.
937
938 GPUNUMS Comma-separated list of GPUs used by the process during the
939 interval. When the comma-separated list exceeds the width of
940 the column, a hexadecimal value is shown.
941
942 LOCKSZ The virtual amount of memory being locked (i.e. non-swappable)
943 by this process (or user).
944
945 MAJFLT The number of page faults issued by this process that have
946 been solved by creating/loading the requested memory page.
947
948 MEM The occupation percentage of this process related to the
949 available capacity for this resource on system level.
950
951 MEMAVG Average memory occupation during the interval on all used
952 GPUs.
953
954 MEMBUSY Busy percentage of memory on all GPUs (one GPU is 100%), i.e.
955 the time needed for read and write accesses on memory.
956 When the pmdanvidia daemon does not run with root privileges,
957 this value is not available.
958
959 MEMNOW Memory occupation at the moment of the sample on all used
960 GPUs.
961
962 MINFLT The number of page faults issued by this process that have
963 been solved by reclaiming the requested memory page from the
964 free list of pages.
965
966 NET The occupation percentage of this process related to the total
967 load that is produced by all processes (i.e. consumed network
968 bandwidth of all processes during the last interval).
969 This information will only be shown when the pmdabcc(1) module
970 `netproc' has been installed.
971
972 NICE The more or less static priority that can be given to a
973 process on a scale from -20 (high priority) to +19 (low prior‐
974 ity).
975
976 NPROCS The number of active and terminated processes accumulated for
977 this user or program.
978
979 PID Process-id.
980
981 POLI The policies 'norm' (normal, which is SCHED_OTHER), 'btch'
982 (batch) and 'idle' refer to timesharing processes. The poli‐
983 cies 'fifo' (SCHED_FIFO) and 'rr' (round robin, which is
984 SCHED_RR) refer to realtime processes.
985
986 PPID Parent process-id.
987
988 PRI The process' priority ranges from 0 (highest priority) to 139
989 (lowest priority). Priority 0 to 99 are used for realtime
990 processes (fixed priority independent of their behavior) and
991 priority 100 to 139 for timesharing processes (variable prior‐
992 ity depending on their recent CPU consumption and the nice
993 value).
994
995 PSIZE The proportional memory size of this process (or user).
996 Every process shares resident memory with other processes.
997 E.g. when a particular program is started several times, the
998 code pages (text) are only loaded once in memory and shared by
999 all incarnations. Also the code of shared libraries is shared
1000 by all processes using that shared library, as well as shared
1001 memory and memory-mapped files. For the PSIZE calculation of
1002 a process, the resident memory of a process that is shared
1003 with other processes is divided by the number of sharers.
1004 This means, that every process is accounted for a proportional
1005 part of that memory. Accumulating the PSIZE values of all
1006 processes in the system gives a reliable impression of the to‐
1007 tal resident memory consumed by all processes.
1008 Since gathering of all values that are needed to calculate the
1009 PSIZE is a very time-consuming task, the 'R' key (or '-R'
1010 flag) should be active. Gathering these values also requires
1011 superuser privileges (otherwise '?K' is shown in the output).
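
The PSIZE calculation described above can be illustrated with a short sketch. The function name and the input layout (one private size plus a list of shared regions with their number of sharers) are assumptions chosen for the example, not the way pcp-atop gathers the values.

```python
def proportional_size(private_kib, shared_regions):
    """Proportional memory size of one process in KiB.

    private_kib:    resident memory used only by this process
    shared_regions: iterable of (resident_kib, number_of_sharers) pairs
    """
    # Each shared region is charged to the process in proportion
    # to the number of processes sharing it.
    shared_part = sum(size / sharers for size, sharers in shared_regions)
    return private_kib + shared_part


# 1000 KiB private, 600 KiB shared by 3 processes, 400 KiB shared by 2
print(proportional_size(1000, [(600, 3), (400, 2)]))
```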
1012
1013 RDDSK When the kernel maintains standard io statistics (>= 2.6.20):
1014 The read data transfer issued physically on disk (so reading
1015 from the disk cache is not accounted for).
1016 Unfortunately, the kernel aggregates the data transfer of a
1017 process to the data transfer of its parent process when termi‐
1018 nating, so you might see transfers for (parent) processes like
1019 cron, bash or init, that are not really issued by them.
1020
1021 RDELAY Runqueue delay, i.e. time spent waiting on a runqueue.
1022
1023 RGID The real group-id under which the process executes.
1024
1025 RGROW The amount of resident memory that the process has grown dur‐
1026 ing the last interval. A resident growth can be caused by
1027 touching memory pages which were not physically created/loaded
1028 before (load-on-demand). Note that a resident growth can also
1029 be negative e.g. when part of the process is paged out due to
1030 lack of memory or when the process frees dynamically allocated
1031 memory. For a process which started during the last interval,
1032 the resident growth reflects the total resident size of the
1033 process at that moment.
1034
1035 RSIZE The total resident memory usage consumed by this process (or
1036 user). Notice that the RSIZE of a process includes all resi‐
1037 dent memory used by that process, even if certain memory parts
1038 are shared with other processes (see also the explanation of
1039 PSIZE).
1040
1041 RTPR Realtime priority according to the POSIX standard. Value can be
1042 0 for a timesharing process (policy 'norm', 'btch' or 'idle')
1043 or ranges from 1 (lowest) till 99 (highest) for a realtime
1044 process (policy 'rr' or 'fifo').
1045
1046 RUID The real user-id under which the process executes.
1047
1048 S The current state of the (main) thread: `R' for running (cur‐
1049 rently processing or in the runqueue), `S' for sleeping inter‐
1050 ruptible (wait for an event to occur), `D' for sleeping non-
1051 interruptible, `Z' for zombie (waiting to be synchronized with
1052 its parent process), `T' for stopped (suspended or traced),
1053 `W' for swapping, and `E' (exit) for processes which have fin‐
1054 ished during the last interval.
1055
1056 SGID The saved group-id of the process.
1057
1058 ST The status of a process.
1059 The first position indicates if the process has been started
1060 during the last interval (the value N means 'new process').
1061
1062 The second position indicates if the process has been finished
1063 during the last interval.
1064 The value E means 'exit' on the process' own initiative; the
1065 exit code is displayed in the column `EXC'.
1066 The value S means that the process has been terminated invol‐
1067 untarily by a signal; the signal number is displayed in the
1068 column `EXC'.
1069 The value C means that the process has been terminated invol‐
1070 untarily by a signal, producing a core dump in its current di‐
1071 rectory; the signal number is displayed in the column `EXC'.
1072
1073 STDATE The start date of the process.
1074
1075 STTIME The start time of the process.
1076
1077 SUID The saved user-id of the process.
1078
1079 SWAPSZ The swap space consumed by this process (or user).
1080
1081 SYSCPU CPU time consumption of this process in system mode (kernel
1082 mode), usually due to system call handling.
1083
1084 TCPRASZ The average size of a received TCP buffer in bytes. This in‐
1085 formation will only be shown when the BCC PMDA is active and
1086 the `netproc' module is enabled.
1087
1088 TCPRCV The number of tcp_recvmsg()/tcp_cleanup_rbuf() calls from this
1089 process. This information will only be shown when the BCC
1090 PMDA is active and the `netproc' module is enabled.
1091
1092 TCPSASZ The average size of a TCP buffer requested to be transmitted
1093 in bytes. This information will only be shown when the BCC
1094 PMDA is active and the `netproc' module is enabled.
1095
1096 TCPSND The number of tcp_sendmsg() calls from this process. This in‐
1097 formation will only be shown when the BCC PMDA is active and
1098 the `netproc' module is enabled.
1099
1100 THR Total number of threads within this process. All related
1101 threads are contained in a thread group, represented by pcp-
1102 atop as one line or as a separate line when the 'y' key (or -y
1103 flag) is active.
1104
1105 TID Thread-id. All threads within a process run with the same PID
1106 but with a different TID. This value is shown for individual
1107 threads in multi-threaded processes (when using the key 'y').
1108
1109 TRUN Number of threads within this process that are in the state
1110 'running' (R).
1111
1112 TSLPI Number of threads within this process that are in the state
1113 'interruptible sleeping' (S).
1114
1115 TSLPU Number of threads within this process that are in the state
1116 'uninterruptible sleeping' (D).
1117
1118 UDPRASZ The average size of a received UDP buffer in bytes. This in‐
1119 formation will only be shown when the BCC PMDA is active and
1120 the `netproc' module is enabled.
1121
1122 UDPRCV The number of udp_recvmsg()/skb_consume_udp() calls from this
1123 process. This information will only be shown when the BCC
1124 PMDA is active and the `netproc' module is enabled.
1125
1126 UDPSASZ The average size of a UDP buffer requested to be transmitted
1127 in bytes. This information will only be shown when the BCC
1128 PMDA is active and the `netproc' module is enabled.
1129
1130 UDPSND The number of udp_sendmsg() calls from this process. This in‐
1131 formation will only be shown when the BCC PMDA is active and
1132 the `netproc' module is enabled.
1133
1134 USRCPU CPU time consumption of this process in user mode, due to pro‐
1135 cessing the own program text.
1136
1137 VDATA The virtual memory size of the private data used by this
1138 process (including heap and shared library data).
1139
1140 VGROW The amount of virtual memory that the process has grown during
1141 the last interval. A virtual growth can be caused by e.g. is‐
1142 suing a malloc() or attaching a shared memory segment. Note
1143 that a virtual growth can also be negative by e.g. issuing a
1144 free() or detaching a shared memory segment. For a process
1145 which started during the last interval, the virtual growth re‐
1146 flects the total virtual size of the process at that moment.
1147
1148 VPID Virtual process-id (within an OpenVZ container). If a process
1149 has been started and finished during the last interval, a `?'
1150 is shown because the virtual process-id is not part of the
1151 standard process accounting record.
1152
1153 VSIZE The total virtual memory usage consumed by this process (or
1154 user).
1155
1156 VSLIBS The virtual memory size of the (shared) text of all shared li‐
1157 braries used by this process.
1158
1159 VSTACK The virtual memory size of the (private) stack used by this
1160 process.
1161
1162 VSTEXT The virtual memory size of the (shared) text of the executable
1163 program.
1164
1165 WCHAN Wait channel of thread in sleep state, i.e. the name of the
1166 kernel function in which the thread has been put to sleep.
1167 Since determining the name string of the kernel function is a
1168 relatively time-consuming task, the 'W' key (or '-W' flag)
1169 should be active.
1170
1171 WRDSK When the kernel maintains standard io statistics (>= 2.6.20):
1172 The write data transfer issued physically on disk (so writing
1173 to the disk cache is not accounted for). This counter is
1174 maintained for the application process that writes its data to
1175 the cache (assuming that this data is physically transferred
1176 to disk later on). Notice that disk I/O needed for swapping
1177 is not taken into account.
1178 Unfortunately, the kernel aggregates the data transfer of a
1179 process to the data transfer of its parent process when termi‐
1180 nating, so you might see transfers for (parent) processes like
1181 cron, bash or init, that are not really issued by them.
1182
1183 WCANCL When the kernel maintains standard io statistics (>= 2.6.20):
1184 The write data transfer previously accounted for this process
1185 or another process that has been cancelled. Suppose that a
1186 process writes new data to a file and that data is removed
1187 again before the cache buffers have been flushed to disk.
1188 Then the original process shows the written data as WRDSK,
1189 while the process that removes/truncates the file shows the
1190 unflushed removed data as WCANCL.
1191
1193 With the flag -P followed by a list of one or more labels (comma-sepa‐
1194 rated), parseable output is produced for each sample. The labels that
1195 can be specified for system-level statistics correspond to the labels
1196 (first word of each line) that can be found in the interactive output:
1197 "CPU", "cpu", "CPL", "GPU", "MEM", "SWP", "PAG", "PSI", "LVM", "MDD",
1198 "DSK", "NFM", "NFC", "NFS", "NET" and "IFB".
1199 For process-level statistics special labels are introduced: "PRG" (gen‐
1200 eral), "PRC" (cpu), "PRE" (GPU), "PRM" (memory), "PRD" (disk, only if
1201 "storage accounting" is active).
1202 With the label "ALL", all system and process level statistics are
1203 shown.
1204
1205 For every interval all requested lines are shown whereafter pcp-atop
1206 shows a line just containing the label "SEP" as a separator before the
1207 lines for the next sample are generated.
1208 When a sample contains the values since boot, pcp-atop shows a line
1209 just containing the label "RESET" before the lines for this sample are
1210 generated.
1211
1212 The first part of each output-line consists of the following six
1213 fields: label (the name of the label), host (the name of this machine),
1214 epoch (the time of this interval as number of seconds since 1-1-1970),
1215 date (date of this interval in format YYYY/MM/DD), time (time of this
1216 interval in format HH:MM:SS), and interval (number of seconds elapsed
1217 for this interval).
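
A consumer of the parseable output might split off the six common fields as sketched below. The names are hypothetical and the sample line in the test is made up for illustration; labels whose fields can contain spaces (such as process names) would need more careful handling than a plain whitespace split.

```python
from typing import List, NamedTuple, Optional, Tuple


class CommonFields(NamedTuple):
    label: str
    host: str
    epoch: int      # seconds since 1-1-1970
    date: str       # YYYY/MM/DD
    time: str       # HH:MM:SS
    interval: int   # seconds elapsed for this interval


def split_parseable_line(line: str) -> Optional[Tuple[CommonFields, List[str]]]:
    """Return the six common fields and the label-specific remainder.

    Lines that only carry a label ("SEP" or "RESET") yield None.
    """
    fields = line.split()
    if len(fields) < 6:
        return None
    common = CommonFields(fields[0], fields[1], int(fields[2]),
                          fields[3], fields[4], int(fields[5]))
    return common, fields[6:]
```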
1218
1219 The subsequent fields of each output-line depend on the label:
1220
1221 CPU Subsequent fields: total number of clock-ticks per second for
1222 this machine, number of processors, consumption for all CPUs
1223 in system mode (clock-ticks), consumption for all CPUs in user
1224 mode (clock-ticks), consumption for all CPUs in user mode for
1225 niced processes (clock-ticks), consumption for all CPUs in
1226 idle mode (clock-ticks), consumption for all CPUs in wait mode
1227 (clock-ticks), consumption for all CPUs in irq mode (clock-
1228 ticks), consumption for all CPUs in softirq mode (clock-
1229 ticks), consumption for all CPUs in steal mode (clock-ticks),
1230 consumption for all CPUs in guest mode (clock-ticks) overlap‐
1231 ping user mode, frequency of all CPUs and frequency percentage
1232 of all CPUs.
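
The clock-tick counters of a "CPU" line can be turned into percentages along these lines. The dictionary keys are names chosen for this sketch; guest mode is left out of the total because it overlaps user mode.

```python
MODES = ("sys", "user", "nice", "idle", "wait", "irq", "softirq", "steal")


def cpu_mode_percentages(ticks):
    """Convert per-mode clock-tick counts into percentages of the total.

    ticks: dict mapping every name in MODES to its clock-tick count.
    """
    total = sum(ticks[mode] for mode in MODES)
    if total == 0:
        return {mode: 0.0 for mode in MODES}
    return {mode: 100.0 * ticks[mode] / total for mode in MODES}
```

Multiplying each result by the number of processors gives a scale on which one fully used CPU counts as 100%.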
1233
1234 cpu Subsequent fields: total number of clock-ticks per second for
1235 this machine, processor-number, consumption for this CPU in
1236 system mode (clock-ticks), consumption for this CPU in user
1237 mode (clock-ticks), consumption for this CPU in user mode for
1238 niced processes (clock-ticks), consumption for this CPU in
1239 idle mode (clock-ticks), consumption for this CPU in wait mode
1240 (clock-ticks), consumption for this CPU in irq mode (clock-
1241 ticks), consumption for this CPU in softirq mode (clock-
1242 ticks), consumption for this CPU in steal mode (clock-ticks),
1243 consumption for this CPU in guest mode (clock-ticks) overlap‐
1244 ping user mode, frequency of this CPU, frequency percentage of
1245 this CPU, instructions executed by this CPU and cycles for this
1246 CPU.
1247
1248 CPL Subsequent fields: number of processors, load average for last
1249 minute, load average for last five minutes, load average for
1250 last fifteen minutes, number of context-switches, and number
1251 of device interrupts.
1252
1253 GPU Subsequent fields: GPU number, bus-id string, type of GPU
1254 string, GPU busy percentage during last second (-1 if not
1255 available), memory busy percentage during last second (-1 if
1256 not available), total memory size (KiB), used memory (KiB) at
1257 this moment, number of samples taken during interval, cumula‐
1258 tive GPU busy percentage during the interval (to be divided by
1259 the number of samples for the average busy percentage, -1 if
1260 not available), cumulative memory busy percentage during the
1261 interval (to be divided by the number of samples for the aver‐
1262 age busy percentage, -1 if not available), and cumulative mem‐
1263 ory occupation during the interval (to be divided by the num‐
1264 ber of samples for the average occupation).
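
The cumulative GPU fields are meant to be divided by the number of samples. A minimal sketch, with a hypothetical function name, that also honours the -1 marker for unavailable values:

```python
def interval_average(cumulative, samples):
    """Average of a cumulative GPU counter over the interval.

    Returns None when the value is not available (-1) or when no
    samples were taken during the interval.
    """
    if cumulative == -1 or samples <= 0:
        return None
    return cumulative / samples
```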
1265
1266 MEM Subsequent fields: page size for this machine (in bytes), size
1267 of physical memory (pages), size of free memory (pages), size
1268 of page cache (pages), size of buffer cache (pages), size of
1269 slab (pages), dirty pages in cache (pages), reclaimable part
1270 of slab (pages), total size of vmware's balloon pages (pages),
1271 total size of shared memory (pages), size of resident shared
1272 memory (pages), size of swapped shared memory (pages), huge
1273 page size (in bytes), total size of huge pages (huge pages),
1274 size of free huge pages (huge pages), size of ARC (cache) of
1275 ZFSonlinux (pages), size of sharing pages for KSM (pages), and
1276 size of shared pages for KSM (pages).
1277
1278 SWP Subsequent fields: page size for this machine (in bytes), size
1279 of swap (pages), size of free swap (pages), size of swap cache
1280 (pages), size of committed space (pages), limit for committed
1281 space (pages), size of the swap cache (pages), size of com‐
1282 pressed pages stored in zswap (pages), and total size of com‐
1283 pressed pool in zswap (pages).
1284
1285 PAG Subsequent fields: page size for this machine (in bytes), num‐
1286 ber of page scans, number of allocstalls, 0 (future use), num‐
1287 ber of swapins, number of swapouts, and number of oomkills.
1288
1289 PSI Subsequent fields: PSI statistics present on this system (n or
1290 y), CPU some avg10, CPU some avg60, CPU some avg300, CPU some
1291 accumulated microseconds during interval, memory some avg10,
1292 memory some avg60, memory some avg300, memory some accumulated
1293 microseconds during interval, memory full avg10, memory full
1294 avg60, memory full avg300, memory full accumulated microsec‐
1295 onds during interval, I/O some avg10, I/O some avg60, I/O some
1296 avg300, I/O some accumulated microseconds during interval, I/O
1297 full avg10, I/O full avg60, I/O full avg300, and I/O full ac‐
1298 cumulated microseconds during interval.
1299
1300 LVM/MDD/DSK
1301 For every logical volume/multiple device/hard disk one line is
1302 shown.
1303 Subsequent fields: name, number of milliseconds spent for I/O,
1304 number of reads issued, number of sectors transferred for
1305 reads, number of writes issued, and number of sectors trans‐
1306 ferred for write.
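
The number of milliseconds spent for I/O can be related to the length of the interval to estimate how busy a device was. The Python sketch below assumes that the interval length in seconds comes from the common fields that precede the label-specific fields. The helper name disk_busy_percent is hypothetical.

```python
def disk_busy_percent(io_ms, interval_seconds):
    """Estimate the busy percentage of one LVM/MDD/DSK line.

    io_ms is the 'number of milliseconds spent for I/O' field.
    interval_seconds is assumed to be the interval length in seconds.
    """
    if interval_seconds <= 0:
        return 0.0
    percent = 100.0 * io_ms / (interval_seconds * 1000.0)
    # Cap at 100 in case the counter slightly exceeds the interval.
    return min(100.0, percent)
```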
1307
1308 NFM Subsequent fields: mounted NFS filesystem, total number of
1309 bytes read, total number of bytes written, number of bytes
1310 read by normal system calls, number of bytes written by normal
1311 system calls, number of bytes read by direct I/O, number of
1312 bytes written by direct I/O, number of pages read by memory-
1313 mapped I/O, and number of pages written by memory-mapped I/O.
1314
1315 NFC Subsequent fields: number of transmitted RPCs, number of
1316 transmitted read RPCs, number of transmitted write RPCs, num‐
1317 ber of RPC retransmissions, and number of authorization re‐
1318 freshes.
1319
1320 NFS Subsequent fields: number of handled RPCs, number of received
1321 read RPCs, number of received write RPCs, number of bytes read
1322 by clients, number of bytes written by clients, number of RPCs
1323 with bad format, number of RPCs with bad authorization, number
1324 of RPCs from bad client, total number of handled network re‐
1325 quests, number of handled network requests via TCP, number of
1326 handled network requests via UDP, number of handled TCP con‐
1327 nections, number of hits on reply cache, number of misses on
1328 reply cache, and number of uncached requests.
1329
1330 NET First, one line is produced for the upper layers of the TCP/IP
1331 stack.
1332 Subsequent fields: the literal word "upper", number of packets
1333 received by TCP, number of packets transmitted by TCP, number of
1334 packets received by UDP, number of packets transmitted by UDP,
1335 number of packets received by IP, number of packets transmit‐
1336 ted by IP, number of packets delivered to higher layers by IP,
1337 number of packets forwarded by IP, number of input errors
1338 (UDP), number of noport errors (UDP), number of active opens
1339 (TCP), number of passive opens (TCP), number of passive opens
1340 (TCP), number of established connections at this moment (TCP),
1341 number of retransmitted segments (TCP), number of input errors
1342 (TCP), and number of output resets (TCP).
1343
1344 Next, one line is shown for every interface.
1345 Subsequent fields: name of the interface, number of packets
1346 received by the interface, number of bytes received by the in‐
1347 terface, number of packets transmitted by the interface, num‐
1348 ber of bytes transmitted by the interface, interface speed,
1349 and duplex mode (0=half, 1=full).
1350
1351 IFB Subsequent fields: name of the InfiniBand interface, port num‐
1352 ber, number of lanes, maximum rate (Mbps), number of bytes re‐
1353 ceived, number of bytes transmitted, number of packets re‐
1354 ceived, and number of packets transmitted.
1355
1356 PRG For every process one line is shown.
1357 Subsequent fields: PID (unique ID of task), name (between
1358 brackets), state, real uid, real gid, TGID (group number of
1359 related tasks/threads), total number of threads, exit code (in
1360 case of fatal signal: signal number + 256), start time
1361 (epoch), full command line (between brackets), PPID, number of
1362 threads in state 'running' (R), number of threads in state
1363 'interruptible sleeping' (S), number of threads in state 'un‐
1364 interruptible sleeping' (D), effective uid, effective gid,
1365 saved uid, saved gid, filesystem uid, filesystem gid, elapsed
1366 time (hertz), is_process (y/n), OpenVZ virtual pid (VPID),
1367 OpenVZ container id (CTID), Docker container id (CID), and in‐
1368 dication if the task is newly started during this interval
1369 ('N').
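
The exit code field of a PRG line carries either a regular exit status or, in case of a fatal signal, the signal number plus 256. A small Python sketch that separates the two cases (the helper name decode_exit_code is hypothetical):

```python
def decode_exit_code(code):
    """Split the PRG exit code field into its kind and its value.

    Values of 256 and above mean the task was ended by a fatal signal;
    the signal number is the value minus 256.
    """
    if code >= 256:
        return ("signal", code - 256)
    return ("exit", code)
```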
1370
1371 PRC For every process one line is shown.
1372 Subsequent fields: PID, name (between brackets), state, total
1373 number of clock-ticks per second for this machine, CPU-con‐
1374 sumption in user mode (clockticks), CPU-consumption in system
1375 mode (clockticks), nice value, priority, realtime priority,
1376 scheduling policy, current CPU, sleep average, TGID (group
1377 number of related tasks/threads), is_process (y/n), runqueue
1378 delay in nanoseconds for this thread or for all threads (in
1379 case of process), and wait channel of this thread (between
1380 brackets).
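
Because names are shown between brackets and may contain spaces, splitting a parseable line on whitespace alone is not sufficient. The Python sketch below is one possible approach, not the way pcp-atop itself handles these lines. It assumes that brackets are written as "(text)", that six common fields (label, host, epoch, date, time, interval) precede the label-specific fields, and that the bracketed text does not itself contain a ")" followed by a space. The helper names split_fields and prc_cpu_seconds are hypothetical.

```python
import re

# Assumption: bracketed fields look like "(text)" and may contain spaces.
_TOKEN = re.compile(r"\((.*?)\)(?=\s|$)|(\S+)")

# Assumption: label, host, epoch, date, time and interval come first.
COMMON_FIELDS = 6


def split_fields(line):
    """Split one parseable line, keeping bracketed text as one field."""
    return [
        m.group(1) if m.group(1) is not None else m.group(2)
        for m in _TOKEN.finditer(line)
    ]


def prc_cpu_seconds(line):
    """Return (pid, name, CPU seconds) for one PRC line."""
    fields = split_fields(line)[COMMON_FIELDS:]
    pid, name, _state, hertz, user, system = fields[:6]
    # Clock-ticks divided by clock-ticks per second give seconds.
    return int(pid), name, (int(user) + int(system)) / int(hertz)
```

Only the leading PRC fields are used here. The remaining fields of the line are left untouched.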
1381
1382 PRE For every process one line is shown.
1383 Subsequent fields: PID, name (between brackets), process
1384 state, GPU state (A for active, E for exited, N for no GPU
1385 user), number of GPUs used by this process, bitlist reflecting
1386 used GPUs, GPU busy percentage during interval, memory busy
1387 percentage during interval, memory occupation (KiB) at this
1388 moment, cumulative memory occupation (KiB) during interval, and
1389 number of samples taken during interval.
1390
1391 PRM For every process one line is shown.
1392 Subsequent fields: PID, name (between brackets), state, page
1393 size for this machine (in bytes), virtual memory size
1394 (Kbytes), resident memory size (Kbytes), shared text memory
1395 size (Kbytes), virtual memory growth (Kbytes), resident memory
1396 growth (Kbytes), number of minor page faults, number of major
1397 page faults, virtual library exec size (Kbytes), virtual data
1398 size (Kbytes), virtual stack size (Kbytes), swap space used
1399 (Kbytes), TGID (group number of related tasks/threads),
1400 is_process (y/n), proportional set size (Kbytes) if the 'R' op‐
1401 tion is specified, and virtually locked memory space (Kbytes).
1402
1403 PRD For every process one line is shown.
1404 Subsequent fields: PID, name (between brackets), state, obso‐
1405 leted kernel patch installed ('n'), standard io statistics
1406 used ('y' or 'n'), number of reads on disk, cumulative number
1407 of sectors read, number of writes on disk, cumulative number
1408 of sectors written, cancelled number of written sectors, TGID
1409 (group number of related tasks/threads), obsoleted value
1410 ('n'), and is_process (y/n).
1411 If the standard I/O statistics (kernel >= 2.6.20) are not used,
1412 the disk I/O counters per process are not relevant. The counters
1413 'number of reads on disk' and 'number of writes on disk' are
1414 obsolete in any case.
1415
1416 PRN For every process one line is shown.
1417 Subsequent fields: PID, name (between brackets), state, pmd‐
1418 abcc(1) module `netproc' loaded ('y' or 'n'), number of
1419 tcp_sendmsg() calls, cumulative size of TCP buffers requested
1420 to be transmitted, number of tcp_recvmsg()/tcp_cleanup_rbuf()
1421 calls, cumulative size of TCP buffers received, number of
1422 udp_sendmsg() calls, cumulative size of UDP buffers requested
1423 to be transmitted, number of udp_recvmsg()/skb_consume_udp()
1424 calls, cumulative size of UDP buffers received, number of
1425 raw packets transmitted (obsolete, always 0), number of raw
1426 packets received (obsolete, always 0), TGID (group number of
1427 related tasks/threads) and is_process (y/n).
1428
1430 By sending the SIGUSR1 signal to pcp-atop a new sample will be forced,
1431 even if the current timer interval has not expired yet. The behavior
1432 is similar to pressing the `t` key in an interactive session.
1433
1434 By sending the SIGUSR2 signal to pcp-atop a final sample will be forced
1435 after which pcp-atop will terminate.
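
From a script, both signals can be delivered with the standard os.kill() call. The Python sketch below assumes that the process ID of the running pcp-atop is already known. The helper names force_sample and final_sample_and_stop are hypothetical.

```python
import os
import signal


def force_sample(pid):
    """Ask the pcp-atop process with this PID for an extra sample."""
    os.kill(pid, signal.SIGUSR1)


def final_sample_and_stop(pid):
    """Ask for one final sample, after which pcp-atop terminates."""
    os.kill(pid, signal.SIGUSR2)
```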
1436
1438 To monitor the current system load interactively with an interval of 5
1439 seconds:
1440
1441 pcp atop 5
1442
1443 To monitor the system load and write it to a file (in plain ASCII) with
1444 an interval of one minute during half an hour with active processes
1445 sorted on memory consumption:
1446
1447 pcp atop -M 60 30 > /log/pcp-atop.mem
1448
1449 Store information about the system and process activity in a PCP ar‐
1450 chive folio with an interval of ten minutes during an hour:
1451
1452 pcp atop -w /tmp/pcp-atop 600 6
1453
1454 View the contents of this file interactively:
1455
1456 pcp atop -r /tmp/pcp-atop
1457
1458 View the processor and disk utilization of this file in parseable for‐
1459 mat:
1460
1461 pcp atop -PCPU,DSK -r /tmp/pcp-atop.folio
1462
1463 View the contents of today's standard logfile interactively:
1464
1465 pcp atop -r
1466
1467 View the contents of the standard logfile of the day before yesterday
1468 interactively:
1469
1470 pcp atop -r yy
1471
1472 View the contents of the standard logfile of 2014, June 7 from 02:00 PM
1473 onwards interactively:
1474
1475 pcp atop -r 20140607 -b 14:00
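
When such replay commands are started from a script, it may help to assemble the argument list in one place. The Python sketch below only uses the options shown in the examples above (-r, -P and -b). The helper name build_replay_command is hypothetical.

```python
def build_replay_command(source, labels=(), begin=None):
    """Build an argument list for replaying with 'pcp atop -r'.

    source: folio path or date argument passed to -r
    labels: parseable output labels for -P, for example ("CPU", "DSK")
    begin:  optional start time for -b, for example "14:00"
    """
    argv = ["pcp", "atop"]
    if labels:
        argv.append("-P" + ",".join(labels))
    argv += ["-r", source]
    if begin is not None:
        argv += ["-b", begin]
    return argv
```

The resulting list can be handed to subprocess.run(argv, check=True) on a system where PCP is installed.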
1476
1478 pcp-atop is based on the source code of the atop(1) command from
1479 https://atoptool.nl, maintained by Gerlof Langeveld
1480 (gerlof.langeveld@atoptool.nl), and aims to be command line and output
1481 compatible with it as much as possible. Some features of that atop
1482 command are not available in pcp-atop.
1483
1484 Some features of pcp-atop (such as reporting on the Apache HTTP daemon,
1485 InfiniBand, NFS client mounts, hardware event counts, GPU statistics
1486 and per-process TCP and UDP statistics) are only activated if the cor‐
1487 responding PCP metrics are available. Refer to the documentation for pm‐
1488 daapache(1), pmdainfiniband(1), pmdanfsclient(1), pmdanvidia(1), pm‐
1489 daperfevent(1) and pmdabcc(1) for further details on activating these
1490 metrics.
1491
1492 The semantics of the per-process network statistics deviate slightly
1493 from the atop(1) tool: instead of the number of TCP/UDP packets
1494 sent/received (which may be inaccurate due to TCP segmentation off‐
1495 load), pcp-atop shows the number of tcp_sendmsg()/udp_sendmsg()/etc.
1496 kernel calls per process.
1497
1499 /etc/atoprc
1500 Configuration file containing system-wide default values. See re‐
1501 lated man-page.
1502
1503 ~/.atoprc
1504 Configuration file containing personal default values. See re‐
1505 lated man-page.
1506
1508 Environment variables with the prefix PCP_ are used to parameterize the
1509 file and directory names used by PCP. On each installation, the file
1510 /etc/pcp.conf contains the local values for these variables. The
1511 $PCP_CONF variable may be used to specify an alternative configuration
1512 file, as described in pcp.conf(5).
1513
1514 For environment variables affecting PCP tools, see pmGetOptions(3).
1515
1517 PCPIntro(1), pcp(1), pcp-atopsar(1), pmdaapache(1), pmdabcc(1), pmdain‐
1518 finiband(1), pmdanfsclient(1), pmdanvidia(1), pmdaproc(1), mkaf(1), pm‐
1519 logger(1), pmlogger_daily(1) and pcp-atoprc(5).
1520
1521
1522
1523Performance Co-Pilot PCP PCP-ATOP(1)