ATOP(1) General Commands Manual ATOP(1)

atop - Advanced System & Process Monitor
7
9 Interactive Usage:
10
11 atop [-g|-m|-d|-n|-u|-p|-s|-c|-v|-o|-y|-Y] [-C|-M|-D|-N|-A] [-afFG1xR]
12 [-L linelen] [-Plabel[,label]...] [ interval [ samples ]]
13
14 Writing and reading raw logfiles:
15
16 atop -w rawfile [-a] [-S] [ interval [ samples ]]
17 atop -r [ rawfile ] [-b [YYYYMMDD]hhmm ] [-e [YYYYMMDD]hhmm ]
18 [-g|-m|-d|-n|-u|-p|-s|-c|-v|-o|-y|-Y] [-C|-M|-D|-N|-A] [-fFG1xR] [-L
19 linelen] [-Plabel[,label]...]
20
22 The program atop is an interactive monitor to view the load on a Linux
23 system. It shows the occupation of the most critical hardware
24 resources (from a performance point of view) on system level, i.e. cpu,
25 memory, disk and network.
26 It also shows which processes are responsible for the indicated load
27 with respect to cpu and memory load on process level. Disk load is
28 shown per process if "storage accounting" is active in the kernel.
29 Network load is shown per process if the kernel module `netatop' has
30 been installed.
31
32 Every interval (default: 10 seconds) information is shown about the
33 resource occupation on system level (cpu, memory, disks and network
34 layers), followed by a list of processes which have been active during
35 the last interval (note that all processes that were unchanged during
36 the last interval are not shown, unless the key 'a' has been pressed or
37 unless sorting on memory occupation is done). If the list of active
38 processes does not entirely fit on the screen, only the top of the list
39 is shown (sorted in order of activity).
40 The intervals are repeated till the number of samples (specified as
41 command argument) is reached, or till the key 'q' is pressed in inter‐
42 active mode.
43
44 When atop is started, it checks whether the standard output channel is
45 connected to a screen, or to a file/pipe. In the first case it produces
46 screen control codes (via the ncurses library) and behaves interac‐
47 tively; in the second case it produces flat ASCII-output.
48
49 In interactive mode, the output of atop scales dynamically to the cur‐
50 rent dimensions of the screen/window.
51 If the window is resized horizontally, columns will be added or removed
52 automatically. For this purpose, every column has a particular weight.
53 The columns with the highest weights that fit within the current width
54 will be shown.
55 If the window is resized vertically, lines of the process/thread list
56 will be added or removed automatically.
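
The weight-based column selection described above can be pictured with a
small sketch. The column names, widths and weights are invented for
illustration and are not atop's own tables; the greedy strategy is also
only an assumption about how "the highest weights that fit" could be
realized.

```python
def select_columns(columns, width):
    """Pick the highest-weight columns that fit within `width`.

    `columns` is a list of (name, col_width, weight) tuples in display
    order. All values used here are hypothetical, not atop's own.
    """
    # Consider the columns from highest to lowest weight.
    by_weight = sorted(columns, key=lambda c: c[2], reverse=True)
    chosen, used = set(), 0
    for name, col_width, _weight in by_weight:
        needed = col_width + (1 if chosen else 0)  # one separating space
        if used + needed <= width:
            chosen.add(name)
            used += needed
    # Show the chosen columns in their original display order.
    return [name for name, _w, _wt in columns if name in chosen]


COLUMNS = [
    ("PID", 5, 10), ("SYSCPU", 6, 9), ("USRCPU", 6, 9),
    ("VGROW", 6, 5), ("RGROW", 6, 5), ("CPU", 4, 8), ("CMD", 14, 10),
]
print(select_columns(COLUMNS, 40))
print(select_columns(COLUMNS, 80))
```

In this sketch a narrow window drops the low-weight growth columns first,
while a wide window shows every column.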
57
Furthermore, in interactive mode the output of atop can be controlled by
pressing particular keys. However, it is also possible to specify such
a key as a flag on the command line. In that case atop switches to the
indicated mode beforehand; this mode can be modified again interactively.
Specifying such a key as a flag is especially useful when running
atop with output to a pipe or file (non-interactively). These flags
are the same as the keys that can be pressed in interactive mode (see
section INTERACTIVE COMMANDS).
Additional flags are available to support storage of atop-data in raw
format (see section RAW DATA STORAGE).
68
With every interval, atop reads the kernel administration to obtain
information about all running processes. However, it is likely that
processes have also terminated during the interval. These processes
might have consumed system resources during this interval as well
before they terminated. Therefore, atop tries to read the process
accounting records that contain the accounting information of terminated
processes and reports these processes too. The kernel only writes such a
process accounting record to a file for every process that terminates
when the process accounting mechanism in the kernel is activated.
79
80 There are various ways for atop to get access to the process accounting
81 records (tried in this order):
82
1. When the environment variable ATOPACCT is set, it specifies the
name of the process accounting file. In that case, process
accounting for this file should have been activated beforehand.
Before opening this file for reading, atop drops its root privileges
(if any).
When this environment variable is present but its content is
empty, process accounting will not be used at all.
90
2. This is the preferred way of handling process accounting records!
When the atopacctd daemon is active, it has activated the process
accounting mechanism in the kernel and transfers the original
accounting records to shadow files. In that case, atop drops its
root privileges and opens the current shadow file for reading.
This way is preferred, because the atopacctd daemon maintains full
control of the sizes of the original process accounting file (written
by the kernel) and the shadow files (read by the atop processes).
For further information, refer to the atopacctd man page.
100
101 3. When the atopacctd daemon is not active, atop verifies if the
102 process accounting mechanism has been switched on via the separate
103 psacct package. In that case, the file /var/account/pacct is in use
104 as process accounting file and atop opens this file for reading.
105
4. As a last possibility, atop itself tries to activate the process
accounting mechanism (requires root privileges) using the file
/var/cache/atop.d/atop.acct (to be written by the kernel, to be
read by atop itself). Process accounting remains active as long as
at least one atop process is alive. Whenever the last atop process
stops (either by pressing `q' or by `kill -15'), it deactivates the
process accounting mechanism again. Therefore you should never
terminate atop by `kill -9', because then it has no chance to stop
process accounting. As a result, the accounting file may consume a
lot of disk space after a while.
To prevent the process accounting file from consuming too much disk
space, atop verifies at the end of every sample if the size of the
process accounting file exceeds 200 MiB and if this atop process is
the only one that is currently using the file. In that case the
file is truncated to a size of zero.
121
Notice that root privileges are required to switch on/off process
accounting in the kernel. You can start atop as a root user or
specify setuid-root privileges to the executable file. In the latter
case, atop switches on process accounting and drops the root
privileges again.
If atop does not run with root privileges, it does not show
information about finished processes. It indicates this situation with
the message `no procacct' in the top-right corner (instead
of the counter that shows the number of exited processes).
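
The order in which the four possibilities above are tried can be
summarized in a short sketch. The function name, its parameters and the
returned labels are hypothetical; the boolean arguments stand in for
checks that atop performs itself and that are not modelled here.

```python
def choose_acct_source(environ, atopacctd_active, psacct_active, is_root):
    """Return (method, path) following the documented order of attempts.

    Hypothetical helper: it mirrors the order of the list above, not
    atop's actual source code.
    """
    # 1. An explicitly configured accounting file wins; an empty value
    #    switches process accounting off altogether.
    if "ATOPACCT" in environ:
        path = environ["ATOPACCT"]
        if not path:
            return ("disabled", None)
        return ("environment", path)
    # 2. Preferred: shadow files maintained by the atopacctd daemon.
    if atopacctd_active:
        return ("atopacctd", None)
    # 3. Accounting switched on via the separate psacct package.
    if psacct_active:
        return ("psacct", "/var/account/pacct")
    # 4. Last resort: atop activates accounting itself (root only).
    if is_root:
        return ("own", "/var/cache/atop.d/atop.acct")
    # Without root privileges no finished processes can be shown.
    return ("none", None)


print(choose_acct_source({"ATOPACCT": ""}, True, True, True))
print(choose_acct_source({}, False, True, False))
```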
131
132 When during one interval a lot of processes have finished, atop might
133 grow tremendously in memory when reading all process accounting records
134 at the end of the interval. To avoid such excessive growth, atop will
135 never read more than 50 MiB with process information from the process
136 accounting file per interval (approx. 70000 finished processes). In
137 interactive mode a warning is given whenever processes have been
138 skipped for this reason.
139
For the resource consumption on system level, atop uses colors to
indicate that a critical occupation percentage has been (almost) reached.
A critical occupation percentage means that it is likely that this load
causes a noticeable negative performance influence for applications
using this resource. The critical percentage depends on the type of
resource: e.g. the performance influence of a disk with a busy
percentage of 80% might be more noticeable for applications/users than a
CPU with a busy percentage of 90%.
Currently atop uses the following default values to calculate a
weighted percentage per resource:
151
152 Processor
153 A busy percentage of 90% or higher is considered `critical'.
154
155 Disk
156 A busy percentage of 70% or higher is considered `critical'.
157
158 Network
159 A busy percentage of 90% or higher for the load of an interface is
160 considered `critical'.
161
162 Memory
An occupation percentage of 90% is considered `critical'. Notice
that this occupation percentage is the accumulated memory consumption
of the kernel (including slab) and all processes; the memory
for the page cache (`cache' and `buff' in the MEM-line) and the
reclaimable part of the slab (`slrec') is not included!
If the number of pages swapped out (`swout' in the PAG-line) is
larger than 10 per second, the memory resource is considered
`critical'. A value of at least 1 per second is considered
`almost critical'.
If the committed virtual memory exceeds the limit (`vmcom' and
`vmlim' in the SWP-line), the SWP-line is colored due to
overcommitting the system.
175
176 Swap
177 An occupation percentage of 80% is considered `critical' because
178 swap space might be completely exhausted in the near future; it is
179 not critical from a performance point-of-view.
180
181 These default values can be modified in the configuration file (see
182 separate man-page of atoprc).
183
When a resource exceeds its critical occupation percentage, the
relevant values in the screen line are colored red by default.
When a resource exceeds 80% (by default) of its critical percentage (so
it is almost critical), the relevant values in the screen line are
colored cyan by default. This `almost critical percentage' (one value
for all resources) can be modified in the configuration file (see
separate man-page of atoprc).
The default colors red and cyan can be modified in the configuration
file as well (see separate man-page of atoprc).
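
As an illustration of the default thresholds listed above, the following
sketch maps a busy percentage to a color. The resource labels, the
function name and the use of "greater than or equal" for both limits are
assumptions; the extra memory rules (swap-outs per second, overcommitting)
are not modelled.

```python
# Default critical percentages per resource, as listed above.
CRITICAL = {"cpu": 90, "dsk": 70, "net": 90, "mem": 90, "swp": 80}
# One value for all resources: percentage of the critical value.
ALMOST_CRITICAL_PCT = 80


def color_for(resource, busy_pct):
    """Return 'red', 'cyan' or None for a resource's busy percentage."""
    critical = CRITICAL[resource]
    if busy_pct >= critical:
        return "red"    # critical
    if busy_pct >= critical * ALMOST_CRITICAL_PCT / 100:
        return "cyan"   # almost critical
    return None         # no highlighting


for resource, busy in (("dsk", 60), ("cpu", 60), ("cpu", 95)):
    print(resource, busy, color_for(resource, busy))
```

With these defaults a disk that is 60% busy is already highlighted as
almost critical (80% of 70 is 56), while a CPU that is 60% busy is not
(80% of 90 is 72).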
193
194 With the key 'x' (or flag -x), the use of colors can be suppressed.
195
197 Per-process and per-thread network activity can be measured by the
198 netatop kernel module. You can download this kernel module from the
199 website (mentioned at the end of this manual page) and install it on
200 your system if the kernel version is 2.6.24 or newer.
201 When atop gathers counters for a new interval, it verifies if the
202 netatop module is currently active. If so, atop obtains the relevant
203 network counters from this module and shows the number of sent and
204 received packets per process/thread in the generic screen. Besides,
205 detailed counters can be requested by pressing the `n' key.
206 When the netatopd daemon is running as well, atop also reads the net‐
207 work counters of exited processes that are logged by this daemon (com‐
208 parable with process accounting).
209
More information about the optional netatop kernel module and the
netatopd daemon can be found in the corresponding man pages and on the
website mentioned at the end of this manual page.
213
215 GPU statistics can be gathered by atopgpud which is a separate data
216 collection daemon process. It gathers cumulative utilization counters
217 of every Nvidia GPU in the system, as well as utilization counters of
218 every process that uses a GPU. When atop notices that the daemon is
219 active, it reads these GPU utilization counters with every interval.
220
The atopgpud daemon is written in Python, so a Python interpreter
should be installed on the target system. The Python code of the daemon
is compatible with Python version 2 and version 3. For the gathering
of the statistics, the pynvml module is used by the daemon. Be sure
that this module is installed on the target system before activating
the daemon, by running the following command as root (the command pip
might have to be replaced by pip3 in case of Python 3):
228
229 pip install nvidia-ml-py
230
231 The atopgpud daemon is installed by default as part of the atop pack‐
232 age, but it is not automatically enabled. The daemon can be enabled
233 and started now by running the following commands (as root):
234
235 systemctl enable atopgpu
236 systemctl start atopgpu
237
A description of the utilization counters can be found in the section
OUTPUT DESCRIPTION.
240
242 When running atop interactively (no output redirection), keys can be
243 pressed to control the output. In general, lower case keys can be used
244 to show other information for the active processes and upper case keys
245 can be used to influence the sort order of the active process/thread
246 list.
247
248 g Show generic output (default).
249
250 Per process the following fields are shown in case of a window-
251 width of 80 positions: process-id, cpu consumption during the last
252 interval in system and user mode, the virtual and resident memory
253 growth of the process.
254
255 The subsequent columns depend on the used kernel:
256 When the kernel supports "storage accounting" (>= 2.6.20), the
257 data transfer for read/write on disk, the status and exit code are
258 shown for each process. When the kernel does not support "storage
259 accounting", the username, number of threads in the thread group,
260 the status and exit code are shown.
261 When the kernel module 'netatop' is loaded, the data transfer for
262 send/receive of network packets is shown for each process.
263 The last columns contain the state, the occupation percentage for
264 the chosen resource (default: cpu) and the process name.
265
266 When more than 80 positions are available, other information is
267 added.
268
269 m Show memory related output.
270
271 Per process the following fields are shown in case of a window-
272 width of 80 positions: process-id, minor and major memory faults,
273 size of virtual shared text, total virtual process size, total
274 resident process size, virtual and resident growth during last
275 interval, memory occupation percentage and process name.
276
277 When more than 80 positions are available, other information is
278 added.
279
280 For memory consumption, always all processes are shown (also the
281 processes that were not active during the interval).
282
283 d Show disk-related output.
284
285 When "storage accounting" is active in the kernel, the following
286 fields are shown: process-id, amount of data read from disk,
287 amount of data written to disk, amount of data that was written
288 but has been withdrawn again (WCANCL), disk occupation percentage
289 and process name.
290
291 n Show network related output.
292
293 Per process the following fields are shown in case of a window-
294 width of 80 positions: process-id, thread-id, total bandwidth for
295 received packets, total bandwidth for sent packets, number of
296 received TCP packets with the average size per packet (in bytes),
297 number of sent TCP packets with the average size per packet (in
298 bytes), number of received UDP packets with the average size per
299 packet (in bytes), number of sent UDP packets with the average
300 size per packet (in bytes), the network occupation percentage and
301 process name.
302 This information can only be shown when kernel module `netatop' is
303 installed.
304
305 When more than 80 positions are available, other information is
306 added.
307
308 s Show scheduling characteristics.
309
310 Per process the following fields are shown in case of a window-
311 width of 80 positions: process-id, number of threads in state
312 'running' (R), number of threads in state 'interruptible sleeping'
313 (S), number of threads in state 'uninterruptible sleeping' (D),
314 scheduling policy (normal timesharing, realtime round-robin, real‐
315 time fifo), nice value, priority, realtime priority, current pro‐
316 cessor, status, exit code, state, the occupation percentage for
317 the chosen resource and the process name.
318
319 When more than 80 positions are available, other information is
320 added.
321
322 v Show various process characteristics.
323
324 Per process the following fields are shown in case of a window-
325 width of 80 positions: process-id, user name and group, start date
326 and time, status (e.g. exit code if the process has finished),
327 state, the occupation percentage for the chosen resource and the
328 process name.
329
330 When more than 80 positions are available, other information is
331 added.
332
333 c Show the command line of the process.
334
335 Per process the following fields are shown: process-id, the occu‐
336 pation percentage for the chosen resource and the command line
337 including arguments.
338
339 e Show GPU utilization.
340
341 Per process at least the following fields are shown: process-id,
342 range of GPU numbers on which the process currently runs, GPU busy
343 percentage on all GPUs, memory busy percentage (i.e. read and
344 write accesses on memory) on all GPUs, memory occupation at the
345 moment of the sample, average memory occupation during the sample,
346 and GPU percentage.
347
348 When the atopgpud daemon does not run with root privileges, the
349 GPU busy percentage and the memory busy percentage are not avail‐
350 able on process level. In that case, the GPU percentage on
351 process level reflects the GPU memory occupation instead of the
352 GPU busy percentage (which is preferred).
353
354 o Show the user-defined line of the process.
355
356 In the configuration file the keyword ownprocline can be specified
357 with the description of a user-defined output-line.
358 Refer to the man-page of atoprc for a detailed description.
359
360 y Show the individual threads within a process (toggle).
361
Single-threaded processes are still shown as one line.
For multi-threaded processes, one line represents the process
while additional lines show the activity per individual thread (in
a different color). Depending on the option 'a' (all or active
toggle), all threads are shown or only the threads that were
active during the last interval. Depending on the option 'Y'
(sort threads), the threads per process will be sorted on the chosen
sort criterion or not.
Whether this key is active or not can be seen in the header line.
371
372 Y Sort the threads per process when combined with option 'y' (tog‐
373 gle).
374
375 u Show the process activity accumulated per user.
376
Per user the following fields are shown: number of processes
active or terminated during last interval (or in total if combined
with command `a'), accumulated cpu consumption during last interval
in system and user mode, the current virtual and resident memory
space consumed by active processes (or all processes of the
user if combined with command `a').
When "storage accounting" is active in the kernel, the accumulated
read and write throughput on disk is shown. When the kernel module
`netatop' has been installed, the number of received and sent
network packets is shown.
The last columns contain the accumulated occupation percentage for
the chosen resource (default: cpu) and the user name.
389
390 p Show the process activity accumulated per program (i.e. process
391 name).
392
Per program the following fields are shown: number of processes
active or terminated during last interval (or in total if combined
with command `a'), accumulated cpu consumption during last interval
in system and user mode, the current virtual and resident memory
space consumed by active processes (or all processes of the
program if combined with command `a').
When "storage accounting" is active in the kernel, the accumulated
read and write throughput on disk is shown. When the kernel module
`netatop' has been installed, the number of received and sent
network packets is shown.
The last columns contain the accumulated occupation percentage for
the chosen resource (default: cpu) and the program name.
405
406 j Show the process activity accumulated per Docker container.
407
Per container the following fields are shown: number of processes
active or terminated during last interval (or in total if combined
with command `a'), accumulated cpu consumption during last interval
in system and user mode, the current virtual and resident memory
space consumed by active processes (or all processes of the
container if combined with command `a').
When "storage accounting" is active in the kernel, the accumulated
read and write throughput on disk is shown. When the kernel module
`netatop' has been installed, the number of received and sent
network packets is shown.
The last columns contain the accumulated occupation percentage for
the chosen resource (default: cpu) and the Docker container id
(CID).
421
422 C Sort the current list in the order of cpu consumption (default).
423 The one-but-last column changes to ``CPU''.
424
E Sort the current list in the order of GPU utilization (preferred,
but only applicable when the atopgpud daemon runs under root privileges)
or in the order of GPU memory occupation. The one-but-last
column changes to ``GPU''.
429
430 M Sort the current list in the order of resident memory consumption.
431 The one-but-last column changes to ``MEM''. In case of sorting on
432 memory, the full process list will be shown (not only the active
433 processes).
434
435 D Sort the current list in the order of disk accesses issued. The
436 one-but-last column changes to ``DSK''.
437
438 N Sort the current list in the order of network bandwidth (received
439 and transmitted). The one-but-last column changes to ``NET''.
440
A Sort the current list automatically in the order of the busiest
system resource during this interval. The one-but-last column
shows either ``ACPU'', ``AMEM'', ``ADSK'' or ``ANET'' (the preceding
'A' indicates automatic sorting-order). The busiest
resource is determined by comparing the weighted busy-percentages
of the system resources, as described earlier in the section COLORS.
This option remains valid until another sorting-order is explicitly
selected again.
A sorting-order for disk is only possible when "storage accounting"
is active. A sorting-order for network is only possible when
the kernel module `netatop' is loaded.
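
The automatic sorting-order of key 'A' depends on weighted busy
percentages. The sketch below shows one way such a comparison could work,
assuming that the weighted percentage is the busy percentage expressed
relative to the critical percentage of the resource; this formula and the
function name are assumptions, not taken from atop itself.

```python
# Default critical percentages from the section COLORS.
CRITICAL = {"CPU": 90, "MEM": 90, "DSK": 70, "NET": 90}


def auto_sort_label(busy):
    """Return the column label ('ACPU', 'ADSK', ...) for the busiest resource.

    `busy` maps resource names to busy percentages. The weighting
    (busy * 100 / critical) is an assumed formula for illustration.
    """
    weighted = {res: pct * 100 / CRITICAL[res] for res, pct in busy.items()}
    busiest = max(weighted, key=weighted.get)
    return "A" + busiest


# A disk at 50% outweighs a CPU at 60% because its critical limit is lower.
print(auto_sort_label({"CPU": 60, "DSK": 50}))
print(auto_sort_label({"CPU": 80, "MEM": 40, "DSK": 10, "NET": 5}))
```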
453
454 Miscellaneous interactive commands:
455
456 ? Request for help information (also the key 'h' can be pressed).
457
458 V Request for version information (version number and date).
459
460 R Gather and calculate the proportional set size of processes (tog‐
461 gle). Gathering of all values that are needed to calculate the
462 PSIZE of a process is a very time-consuming task, so this key
463 should only be active when analyzing the resident memory consump‐
464 tion of processes.
465
466 W Get the WCHAN per thread (toggle). Gathering of the WCHAN string
467 per thread is a relatively time-consuming task, so this key should
468 only be made active when analyzing the reason for threads to be in
469 sleep state.
470
471 x Suppress colors to highlight critical resources (toggle).
472 Whether this key is active or not can be seen in the header line.
473
474 z The pause key can be used to freeze the current situation in order
475 to investigate the output on the screen. While atop is paused, the
476 keys described above can be pressed to show other information
477 about the current list of processes. Whenever the pause key is
478 pressed again, atop will continue with a next sample.
479
480 i Modify the interval timer (default: 10 seconds). If an interval
481 timer of 0 is entered, the interval timer is switched off. In that
482 case a new sample can only be triggered manually by pressing the
483 key 't'.
484
t Trigger a new sample manually. This key can be pressed if the current
sample should be finished before the timer has expired, or
if no timer is set at all (interval timer defined as 0). In the
latter case atop can be used as a stopwatch to measure the load
being caused by a particular application transaction, without
knowing beforehand how many seconds this transaction will last.
491
492 When viewing the contents of a raw file this key can be used to
493 show the next sample from the file. This key can also be used when
494 viewing raw data via a pipe.
495
496 T When viewing the contents of a raw file this key can be used to
497 show the previous sample from the file, however not when reading
498 raw data from a pipe.
499
500 b When viewing the contents of a raw file, this key can be used to
501 branch to a certain timestamp within the file either forward or
502 backward. When viewing raw data from a pipe only forward branches
503 are possible.
504
505 r Reset all counters to zero to see the system and process activity
506 since boot again.
507
508 When viewing the contents of a raw file, this key can be used to
509 rewind to the beginning of the file again (except when reading raw
510 data from a pipe).
511
512 U Specify a search string for specific user names as a regular
513 expression. From now on, only (active) processes will be shown
514 from a user which matches the regular expression. The system sta‐
515 tistics are still system wide. If the Enter-key is pressed with‐
516 out specifying a name, (active) processes of all users will be
517 shown again.
518 Whether this key is active or not can be seen in the header line.
519
520 I Specify a list with one or more PIDs to be selected. From now on,
521 only processes will be shown with a PID which matches one of the
522 given list. The system statistics are still system wide. If the
523 Enter-key is pressed without specifying a PID, all (active) pro‐
524 cesses will be shown again.
525 Whether this key is active or not can be seen in the header line.
526
527 P Specify a search string for specific process names as a regular
528 expression. From now on, only processes will be shown with a name
529 which matches the regular expression. The system statistics are
530 still system wide. If the Enter-key is pressed without specifying
531 a name, all (active) processes will be shown again.
532 Whether this key is active or not can be seen in the header line.
533
534 / Specify a specific command line search string as a regular expres‐
535 sion. From now on, only processes will be shown with a command
536 line which matches the regular expression. The system statistics
537 are still system wide. If the Enter-key is pressed without speci‐
538 fying a string, all (active) processes will be shown again.
539 Whether this key is active or not can be seen in the header line.
540
541 J Specify a Docker container id of 12 (hexadecimal) characters.
542 From now on, only processes will be shown that run in that spe‐
543 cific Docker container (CID). The system statistics are still
544 system wide. If the Enter-key is pressed without specifying a
545 container id, all (active) processes will be shown again.
546 Whether this key is active or not can be seen in the header line.
547
S Specify search strings for specific logical volume names, specific
disk names and specific network interface names. All search
strings are interpreted as regular expressions. From now on,
only those system resources are shown that match the corresponding
regular expression. If the Enter-key is pressed without specifying
a search string, all (active) system resources of that type
will be shown again.
Whether this key is active or not can be seen in the header line.
556
557 a The `all/active' key can be used to toggle between only show‐
558 ing/accumulating the processes that were active during the last
559 interval (default) or showing/accumulating all processes.
560 Whether this key is active or not can be seen in the header line.
561
G By default, atop shows/accumulates the processes that are alive
and the processes that have exited during the last interval. With
this key (toggle), showing/accumulating the processes that have
exited can be suppressed.
Whether this key is active or not can be seen in the header line.
567
f Show a fixed (maximum) number of header lines for system resources
(toggle). By default, lines are only shown for the system
resources (CPUs, paging, logical volumes, disks, network interfaces)
that have really been active during the last interval.
With this key you can force atop to show lines of inactive
resources as well.
Whether this key is active or not can be seen in the header line.
575
576 F Suppress sorting of system resources (toggle). By default system
577 resources (CPUs, logical volumes, disks, network interfaces) are
578 sorted on utilization.
579 Whether this key is active or not can be seen in the header line.
580
581 1 Show relevant counters as an average per second (in the format
582 `..../s') instead of as a total during the interval (toggle).
583 Whether this key is active or not can be seen in the header line.
584
l Limit the number of system level lines for the counters per-cpu,
the active disks and the network interfaces. By default lines are
shown of all CPUs, disks and network interfaces which have been
active during the last interval. Limiting these lines can be useful
on systems with a huge number of CPUs, disks or interfaces in order
to be able to run atop on a screen/window with e.g. only 24 lines.
For all mentioned resources the maximum number of lines can be
specified interactively. When using the flag -l the maximum number
of per-cpu lines is set to 0, the maximum number of disk lines to
5 and the maximum number of interface lines to 3. These values
can be modified again in interactive mode.
596
597 k Send a signal to an active process (a.k.a. kill a process).
598
599 q Quit the program.
600
601 PgDn Show the next page of the process/thread list.
602 With the arrow-down key the list can be scrolled downwards with
603 single lines.
604
605 ^F Show the next page of the process/thread list (forward).
606 With the arrow-down key the list can be scrolled downwards with
607 single lines.
608
609 PgUp Show the previous page of the process/thread list.
610 With the arrow-up key the list can be scrolled upwards with single
611 lines.
612
613 ^B Show the previous page of the process/thread list (backward).
614 With the arrow-up key the list can be scrolled upwards with single
615 lines.
616
617 ^L Redraw the screen.
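
The selection keys 'U', 'P' and '/' described above all narrow the
process list with a regular expression. A rough model of that behavior is
sketched here with Python's re module, which is not the regular
expression engine atop uses; the field names, the sample data and the
unanchored matching are assumptions made for illustration.

```python
import re


def filter_processes(processes, user=None, name=None, cmdline=None):
    """Keep the processes that match every regular expression given.

    Each process is a dict with the hypothetical keys 'user', 'name'
    and 'cmdline'. A filter left as None is not applied, comparable to
    pressing Enter without specifying a search string.
    """
    requested = (("user", user), ("name", name), ("cmdline", cmdline))
    checks = [(field, re.compile(rx)) for field, rx in requested if rx]
    return [
        proc for proc in processes
        if all(rx.search(proc[field]) for field, rx in checks)
    ]


PROCS = [
    {"user": "root", "name": "sshd", "cmdline": "/usr/sbin/sshd -D"},
    {"user": "www-data", "name": "nginx", "cmdline": "nginx: worker process"},
    {"user": "alice", "name": "python3", "cmdline": "python3 report.py"},
]
print([p["name"] for p in filter_processes(PROCS, user="^(root|alice)$")])
print([p["name"] for p in filter_processes(PROCS, cmdline=r"\.py$")])
```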
618
620 In order to store system and process level statistics for long-term
621 analysis (e.g. to check the system load and the active processes run‐
622 ning yesterday between 3:00 and 4:00 PM), atop can store the system and
623 process level statistics in compressed binary format in a raw file with
624 the flag -w followed by the filename. If this file already exists and
625 is recognized as a raw data file, atop will append new samples to the
626 file (starting with a sample which reflects the activity since boot);
627 if the file does not exist, it will be created.
628 All information about processes and threads is stored in the raw file.
629 The interval (default: 10 seconds) and number of samples (default:
630 infinite) can be passed as last arguments. Instead of the number of
631 samples, the flag -S can be used to indicate that atop should finish
632 anyhow before midnight.
633
634 A raw file can be read and visualized again with the flag -r followed
635 by the filename. If no filename is specified, the file
636 /var/log/atop/atop_YYYYMMDD is opened for input (where YYYYMMDD are
637 digits representing the current date). If a filename is specified in
638 the format YYYYMMDD (representing any valid date), the file
639 /var/log/atop/atop_YYYYMMDD is opened. If a filename with the symbolic
640 name y is specified, yesterday's daily logfile is opened (this can be
641 repeated so 'yyyy' indicates the logfile of four days ago). If the
642 filename - is used, stdin will be read.
643 The samples from the file can be viewed interactively by using the key
644 't' to show the next sample, the key 'T' to show the previous sample,
645 the key 'b' to branch to a particular time or the key 'r' to rewind to
646 the beginning of the file.
647 When output is redirected to a file or pipe, atop prints all samples in
648 plain ASCII. The default line length is 80 characters in that case;
649 with the flag -L followed by an alternate line length, more (or less)
650 columns will be shown.
651 With the flag -b (begin time) and/or -e (end time) followed by a time
652 argument of the form [YYYYMMDD]hhmm, a certain time period within the
653 raw file can be selected.
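
The filename rules for the flag -r described above can be sketched as follows. This is a minimal illustration, not atop's actual code: the helper name resolve_rawfile and the constant LOG_DIR are hypothetical, and any argument that matches none of the special forms is assumed to be taken as a literal path.

```python
from datetime import date, timedelta
from typing import Optional

LOG_DIR = "/var/log/atop"  # default log directory named in the text


def resolve_rawfile(arg: Optional[str], today: date) -> str:
    """Map the argument of -r to the file that would be opened (sketch)."""
    if arg is None:
        day = today                      # no filename: today's logfile
    elif arg == "-":
        return "-"                       # read the raw data from stdin
    elif set(arg) == {"y"}:
        # y = yesterday, yy = two days ago, yyyy = four days ago, ...
        day = today - timedelta(days=len(arg))
    elif len(arg) == 8 and arg.isdigit():
        # YYYYMMDD: the daily logfile of that date
        day = date(int(arg[:4]), int(arg[4:6]), int(arg[6:]))
    else:
        return arg                       # assumption: a literal path
    return f"{LOG_DIR}/atop_{day:%Y%m%d}"
```

For example, with today being 10 March 2024, the argument yyyy resolves to /var/log/atop/atop_20240306.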
654
655 Every day at midnight atop is restarted to write compressed binary data
656 to the file /var/log/atop/atop_YYYYMMDD with an interval of 10 minutes
657 by default.
658 Furthermore all raw files are removed that are older than 28 days (by
659 default).
660 The mentioned default values can be overruled in the file
661 /etc/default/atop that might contain other values for LOGOPTS (by
662 default without any flag), LOGINTERVAL (in seconds, by default 600),
663 LOGGENERATIONS (in days, by default 28), and LOGPATH (directory in
664 which logfiles are stored).
665
666 Unfortunately, it is not always possible to keep the format of the raw
667 files compatible in newer versions of atop especially when lots of new
668 counters have to be maintained. Therefore, the program atopconvert is
669 installed to convert a raw file created by an older version of atop to
670 a raw file that can be read by a newer version of atop (see the man
671 page of atopconvert for more details).
672
673
675 The first sample shows the system level activity since boot (the
676 elapsed time in the header shows the time since boot). Note that par‐
677 ticular counters could have reached their maximum value (several times)
678 and restarted from zero, so do not rely on these figures.
679
680 For every sample atop first shows the lines related to system level
681 activity. If a particular system resource has not been used during the
682 interval, the entire line related to this resource is suppressed. So
683 the number of system level lines may vary for each sample.
684 After that a list is shown of processes which have been active during
685 the last interval. This list is by default sorted on cpu consumption,
686 but this order can be changed with the keys described
687 previously.
688
689 If values have to be shown by atop which do not fit in the column
690 width, another format is used. If e.g. a cpu-consumption of 233216 mil‐
691 liseconds should be shown in a column width of 4 positions, it is shown
692 as `233s' (in seconds). For large memory figures, another unit is cho‐
693 sen if the value does not fit (Mb instead of Kb, Gb instead of Mb, Tb
694 instead of Gb, ...). For other values, a kind of exponent notation is
695 used (value 123456789 shown in a column of 5 positions gives 123e6).
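
The two fallback formats mentioned above can be reproduced with a short sketch. The function names are hypothetical, and truncation (rather than rounding) of the shortened value is an assumption chosen because it matches both examples in the text.

```python
def fit_exponent(value: int, width: int) -> str:
    """Shorten a non-negative integer to 'mantissa e exponent' form."""
    text = str(value)
    if len(text) <= width:
        return text                      # fits as-is, nothing to do
    # Drop one more trailing digit each round until the result fits.
    for exp in range(1, len(text)):
        candidate = f"{value // 10**exp}e{exp}"
        if len(candidate) <= width:
            return candidate
    raise ValueError("column too narrow for this value")


def as_seconds(millisec: int) -> str:
    """Show a cpu time in whole seconds (truncation is an assumption)."""
    return f"{millisec // 1000}s"
```

With these helpers, fit_exponent(123456789, 5) gives 123e6 and as_seconds(233216) gives 233s, matching the examples in the paragraph above.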
696
698 The system level information consists of the following output lines:
699
700 PRC Process and thread level totals.
701 This line contains the total cpu time consumed in system mode
702 (`sys') and in user mode (`user'), the total number of processes
703 present at this moment (`#proc'), the total number of threads
704 present at this moment in state `running' (`#trun'), `sleeping
705 interruptible' (`#tslpi') and `sleeping uninterruptible'
706 (`#tslpu'), the number of zombie processes (`#zombie'), the number
707 of clone system calls (`clones'), and the number of processes that
708 ended during the interval (`#exit') when process accounting is
709 used. Instead of `#exit' the last column may indicate that process
710 accounting could not be activated (`no procacct').
711 If the screen-width does not allow all of these counters, only a
712 relevant subset is shown.
713
714 CPU CPU utilization.
715 At least one line is shown for the total occupation of all CPUs
716 together.
717 In case of a multi-processor system, an additional line is shown
718 for every individual processor (with `cpu' in lower case), sorted
719 on activity. Inactive CPUs will not be shown by default. The
720 lines showing the per-cpu occupation contain the cpu number in the
721 field combined with the wait percentage.
722
723 Every line contains the percentage of cpu time spent in kernel
724 mode by all active processes (`sys'), the percentage of cpu time
725 consumed in user mode (`user') for all active processes (including
726 processes running with a nice value larger than zero), the per‐
727 centage of cpu time spent for interrupt handling (`irq') including
728 softirq, the percentage of unused cpu time while no processes were
729 waiting for disk I/O (`idle'), and the percentage of unused cpu
730 time while at least one process was waiting for disk I/O (`wait').
731 In case of per-cpu occupation, the cpu number and the wait
732 percentage (`w') for that cpu are shown as well. The number of
733 lines showing the per-cpu occupation can be limited.
734
735 For virtual machines, the steal-percentage (`steal') shows the
736 percentage of cpu time stolen by other virtual machines running on
737 the same hardware.
738 For physical machines hosting one or more virtual machines, the
739 guest-percentage (`guest') shows the percentage of cpu time used
740 by the virtual machines. Notice that this percentage overlaps the
741 user percentage!
742
743 When PMC performance monitoring counters are supported by the CPU
744 and the kernel (and atop runs with root privileges), the number of
745 instructions per CPU cycle (`ipc') is shown. The first sample
746 always shows the value 'initial', because the counters are just
747 activated at the moment that atop is started.
748 When the CPU busy percentage is high and the IPC is less than 1.0,
749 it is likely that the CPU is frequently waiting for memory access
750 during instruction execution (larger CPU caches or faster memory
751 might be helpful to improve performance). When the CPU busy per‐
752 centage is high and the IPC is greater than 1.0, it is likely that
753 the CPU is instruction-bound (more/faster cores might be helpful
754 to improve performance).
755 Furthermore, per CPU the effective number of cycles (`cycl') is
756 shown. This value can reach the current CPU frequency if such CPU
757 is 100% busy. When an idle CPU is halted, the number of effective
758 cycles can be (considerably) lower than the current frequency.
759 Notice that the average instructions per cycle and number of
760 cycles are shown in the CPU line for all CPUs.
761 Beware that reading the cycle counter in virtual machines (guests)
762 might introduce performance delays. Therefore this metric is by
763 default disabled in virtual machines. However, with the keyword
764 'perfevents' in the atoprc file this metric can be explicitly set
765 to 'enable' or 'disable' (see separate man-page of atoprc).
766 See also:
767 http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
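
The rule of thumb for reading `ipc' given above can be written down as a small sketch. The function name and the threshold for a "high" busy percentage are assumptions; the text does not define a number for "high".

```python
def ipc_hint(busy_pct: float, ipc: float, busy_threshold: float = 80.0) -> str:
    """Interpret the IPC value following the rule of thumb in the text.

    busy_threshold is a hypothetical cut-off for 'high' CPU busy.
    """
    if busy_pct < busy_threshold:
        return "cpu not heavily loaded"
    if ipc < 1.0:
        # Busy but few instructions retired per cycle.
        return "likely waiting for memory access"
    if ipc > 1.0:
        return "likely instruction-bound"
    return "inconclusive"
```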
768
769
770 In case of frequency scaling, all previously mentioned CPU per‐
771 centages are relative to the used scaling of the CPU during the
772 interval. If a CPU has been active for e.g. 50% in user mode dur‐
773 ing the interval while the frequency scaling of that CPU was 40%,
774 only 20% of the full capacity of the CPU has been used in user
775 mode.
776 In case that the kernel module `cpufreq_stats' is active (after
777 issuing `modprobe cpufreq_stats'), the average frequency (`avgf')
778 and the average scaling percentage (`avgscal') are shown. Otherwise
779 the current frequency (`curf') and the current scaling percentage
780 (`curscal') are shown at the moment that the sample is taken.
781 Notice that average values for frequency and scaling are shown in
782 the CPU line for every CPU.
783 Frequency scaling statistics are only gathered for systems with
784 at most 8 CPUs, since gathering these values per CPU is very
785 time consuming.
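
The arithmetic behind the frequency scaling example (50% busy at 40% scaling is 20% of full capacity) is shown in this minimal sketch; the function name is hypothetical.

```python
def capacity_used(busy_pct: float, scaling_pct: float) -> float:
    """Percentage of the CPU's full capacity that was really used.

    busy_pct    -- percentage reported by atop (relative to the scaling)
    scaling_pct -- frequency scaling of the CPU during the interval
    """
    return busy_pct * scaling_pct / 100.0
```

Calling capacity_used(50, 40) yields 20.0, the figure used in the text.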
786
787 If the screen-width does not allow all of these counters, only a
788 relevant subset is shown.
789
790 CPL CPU load information.
791 This line contains the load average figures reflecting the number
792 of threads that are available to run on a CPU (i.e. part of the
793 runqueue) or that are waiting for disk I/O. These figures are
794 averaged over 1 (`avg1'), 5 (`avg5') and 15 (`avg15') minutes.
795 Furthermore the number of context switches (`csw'), the number of
796 serviced interrupts (`intr') and the number of available CPUs are
797 shown.
798
799 If the screen-width does not allow all of these counters, only a
800 relevant subset is shown.
801
802 GPU GPU utilization (Nvidia).
803 Read the section GPU STATISTICS GATHERING in this document to find
804 the details about the activation of the atopgpud daemon.
805
806 In the first column of every line, the bus-id (last nine charac‐
807 ters) and the GPU number are shown. The subsequent columns show
808 the percentage of time that one or more kernels were executing on
809 the GPU (`gpubusy'), the percentage of time that global (device)
810 memory was being read or written (`membusy'), the occupation per‐
811 centage of memory (`memocc'), the total memory (`total'), the mem‐
812 ory being in use at the moment of the sample (`used'), the average
813 memory being in use during the sample time (`usavg'), the number
814 of processes being active on the GPU at the moment of the sample
815 (`#proc'), and the type of GPU.
816
817 If the screen-width does not allow all of these counters, only a
818 relevant subset is shown.
819 The number of lines showing the GPUs can be limited.
820
821 MEM Memory occupation.
822 This line contains the total amount of physical memory (`tot'),
823 the amount of memory which is currently free (`free'), the amount
824 of memory in use as page cache including the total resident shared
825 memory (`cache'), the amount of memory within the page cache that
826 has to be flushed to disk (`dirty'), the amount of memory used for
827 filesystem meta data (`buff'), the amount of memory being used for
828 kernel mallocs (`slab'), the amount of slab memory that is
829 reclaimable (`slrec'), the resident size of shared memory
830 including tmpfs (`shmem'), the resident size of shared memory
831 (`shrss'), the amount of shared memory that is currently swapped
832 (`shswp'), the amount of memory that is currently claimed by
833 vmware's balloon driver (`vmbal'), the amount of memory that is
834 currently claimed by the ARC (cache) of ZFSonlinux (`zfarc'), the
835 amount of memory that is claimed for huge pages (`hptot'), and the
836 amount of huge page memory that is really in use (`hpuse').
837
838 If the screen-width does not allow all of these counters, only a
839 relevant subset is shown.
840
841 SWP Swap occupation and overcommit info.
842 This line contains the total amount of swap space on disk (`tot'),
843 the amount of free swap space (`free') and the size of the swap
844 cache (`swcac').
845 Furthermore the committed virtual memory space (`vmcom') and the
846 maximum limit of the committed space (`vmlim', which is by default
847 swap size plus 50% of memory size) are shown. The committed space
848 is the reserved virtual space for all allocations of private mem‐
849 ory space for processes. The kernel only verifies whether the com‐
850 mitted space exceeds the limit if strict overcommit handling is
851 configured (vm.overcommit_memory is 2).
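
The default relation between `vmlim', swap size and memory size can be sketched as below. The function names are hypothetical; the parameter overcommit_ratio is an assumption that stands for the 50% mentioned in the text, and sizes are taken in KiB.

```python
def commit_limit(swap_kib: int, mem_kib: int, overcommit_ratio: int = 50) -> int:
    """Limit of committed space: swap plus a percentage of memory."""
    return swap_kib + mem_kib * overcommit_ratio // 100


def over_limit(vmcom_kib: int, vmlim_kib: int) -> bool:
    """True when the committed space exceeds the limit.

    According to the text this only matters to the kernel when
    vm.overcommit_memory is 2 (strict overcommit handling).
    """
    return vmcom_kib > vmlim_kib
```

For a system with 8 GiB of memory and 2 GiB of swap this gives a limit of 6 GiB (6291456 KiB).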
852
853 PAG Paging frequency.
854 This line contains the number of scanned pages (`scan') due to the
855 fact that free memory drops below a particular threshold and the
856 number of times that the kernel tries to reclaim pages due to an
857 urgent need (`stall').
858 Also the number of memory pages the system read from swap space
859 (`swin') and the number of memory pages the system wrote to swap
860 space (`swout') are shown.
861
862 PSI Pressure Stall Information.
863 This line contains percentages about resource pressure related to
864 CPU, memory and I/O. Certain percentages refer to 'some' meaning
865 that some processes/threads were delayed due to resource overload.
866 Other percentages refer to 'full' meaning a loss of overall
867 throughput due to resource overload.
868 The values `cpusome', `memsome', `memfull', `iosome' and `iofull'
869 show the pressure percentage during the entire interval.
870 The values `cs' (cpu some), `ms' (memory some), `mf' (memory
871 full), `is' (I/O some) and `if' (I/O full) each show three per‐
872 centages separated by slashes: pressure percentage over the last
873 10, 60 and 300 seconds.
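
The 10, 60 and 300 second averages correspond to the avg10, avg60 and avg300 fields that the Linux kernel exposes in the files under /proc/pressure. The sketch below parses text in that kernel format; that atop obtains its figures this way is an assumption, and the function name is hypothetical.

```python
from typing import Dict, Tuple


def parse_pressure(text: str) -> Dict[str, Tuple[float, float, float]]:
    """Parse the contents of a /proc/pressure/{cpu,memory,io} file.

    Returns the (avg10, avg60, avg300) percentages per kind,
    where the kind is 'some' or 'full'.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                     # skip empty lines
        kind = fields[0]
        values = dict(f.split("=", 1) for f in fields[1:])
        result[kind] = (
            float(values["avg10"]),
            float(values["avg60"]),
            float(values["avg300"]),
        )
    return result
```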
874
875 LVM/MDD/DSK
876 Logical volume/multiple device/disk utilization.
877 Per active unit one line is produced, sorted on unit activity.
878 Such line shows the name (e.g. VolGroup00-lvtmp for a logical vol‐
879 ume or sda for a hard disk), the busy percentage i.e. the portion
880 of time that the unit was busy handling requests (`busy'), the
881 number of read requests issued (`read'), the number of write
882 requests issued (`write'), the number of KiBytes per read
883 (`KiB/r'), the number of KiBytes per write (`KiB/w'), the number
884 of MiBytes per second throughput for reads (`MBr/s'), the number
885 of MiBytes per second throughput for writes (`MBw/s'), the average
886 queue depth (`avq') and the average number of milliseconds needed
887 by a request (`avio') for seek, latency and data transfer.
888 If the screen-width does not allow all of these counters, only a
889 relevant subset is shown.
890
891 The number of lines showing the units can be limited per class
892 (LVM, MDD or DSK) with the 'l' key or statically (see separate
893 man-page of atoprc). By specifying the value 0 for a particular
894 class, no lines will be shown any more for that class.
895
896 NFM Network Filesystem (NFS) mount at the client side.
897 For each NFS-mounted filesystem, a line is shown that contains the
898 mounted server directory, the name of the server (`srv'), the
899 total number of bytes physically read from the server (`read') and
900 the total number of bytes physically written to the server
901 (`write'). Data transfer is subdivided in the number of bytes
902 read via normal read() system calls (`nread'), the number of bytes
903 written via normal write() system calls (`nwrit'), the number of
904 bytes read via direct I/O (`dread'), the number of bytes written
905 via direct I/O (`dwrit'), the number of bytes read via memory
906 mapped I/O pages (`mread'), and the number of bytes written via
907 memory mapped I/O pages (`mwrit').
908
909 NFC Network Filesystem (NFS) client side counters.
910 This line contains the number of RPC calls issued by local
911 processes (`rpc'), the number of read RPC calls (`read') and write
912 RPC calls (`rpwrite') issued to the NFS server, the number of RPC
913 calls being retransmitted (`retxmit') and the number of authoriza‐
914 tion refreshes (`autref').
915
916 NFS Network Filesystem (NFS) server side counters.
917 This line contains the number of RPC calls received from NFS
918 clients (`rpc'), the number of read RPC calls received (`cread'),
919 the number of write RPC calls received (`cwrit'), the number of
920 Megabytes/second returned to read requests by clients (`MBcr/s'),
921 the number of Megabytes/second passed in write requests by clients
922 (`MBcw/s'), the number of network requests handled via TCP
923 (`nettcp'), the number of network requests handled via UDP
924 (`netudp'), the number of reply cache hits (`rchits'), the number
925 of reply cache misses (`rcmiss') and the number of uncached
926 requests (`rcnoca'). Furthermore some error counters indicating
927 the number of requests with a bad format (`badfmt') or a bad
928 authorization (`badaut'), and a counter indicating the number of
929 bad clients (`badcln').
930
931 NET Network utilization (TCP/IP).
932 One line is shown for activity of the transport layer (TCP and
933 UDP), one line for the IP layer and one line per active interface.
934 For the transport layer, counters are shown concerning the number
935 of received TCP segments including those received in error
936 (`tcpi'), the number of transmitted TCP segments excluding those
937 containing only retransmitted octets (`tcpo'), the number of UDP
938 datagrams received (`udpi'), the number of UDP datagrams transmit‐
939 ted (`udpo'), the number of active TCP opens (`tcpao'), the number
940 of passive TCP opens (`tcppo'), the number of TCP output retrans‐
941 missions (`tcprs'), the number of TCP input errors (`tcpie'), the
942 number of TCP output resets (`tcpor'), the number of UDP no ports
943 (`udpnp'), and the number of UDP input errors (`udpie').
944 If the screen-width does not allow all of these counters, only a
945 relevant subset is shown.
946 These counters are related to IPv4 and IPv6 combined.
947
948 For the IP layer, counters are shown concerning the number of IP
949 datagrams received from interfaces, including those received in
950 error (`ipi'), the number of IP datagrams that local higher-layer
951 protocols offered for transmission (`ipo'), the number of received
952 IP datagrams which were forwarded to other interfaces (`ipfrw'),
953 the number of IP datagrams which were delivered to local higher-
954 layer protocols (`deliv'), the number of received ICMP datagrams
955 (`icmpi'), and the number of transmitted ICMP datagrams (`icmpo').
956 If the screen-width does not allow all of these counters, only a
957 relevant subset is shown.
958 These counters are related to IPv4 and IPv6 combined.
959
960 For every active network interface one line is shown, sorted on
961 the interface activity. Such line shows the name of the interface
962 and its busy percentage in the first column. The busy percentage
963 for half duplex is determined by comparing the interface speed
964 with the number of bits transmitted and received per second; for
965 full duplex the interface speed is compared with the highest of
966 either the transmitted or the received bits. When the interface
967 speed can not be determined (e.g. for the loopback interface),
968 `---' is shown instead of the percentage.
969 Furthermore the number of received packets (`pcki'), the number of
970 transmitted packets (`pcko'), the line speed of the interface
971 (`sp'), the effective amount of bits received per second (`si'),
972 the effective amount of bits transmitted per second (`so'), the
973 number of collisions (`coll'), the number of received multicast
974 packets (`mlti'), the number of errors while receiving a packet
975 (`erri'), the number of errors while transmitting a packet
976 (`erro'), the number of received packets dropped (`drpi'), and the
977 number of transmitted packets dropped (`drpo').
978 If the screen-width does not allow all of these counters, only a
979 relevant subset is shown.
980 The number of lines showing the network interfaces can be limited.
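
The busy percentage of an interface, as described above, can be sketched as follows. The function name and the exact output formatting are assumptions; only the half/full duplex rule and the `---' placeholder come from the text.

```python
def interface_busy(speed_bps: int, rx_bps: int, tx_bps: int,
                   full_duplex: bool) -> str:
    """Busy percentage of a network interface (illustrative only)."""
    if not speed_bps:
        return "---"                     # speed unknown, e.g. loopback
    if full_duplex:
        load = max(rx_bps, tx_bps)       # highest of both directions
    else:
        load = rx_bps + tx_bps           # both directions share the line
    return f"{100 * load / speed_bps:.0f}%"
```

For a 1 Gbit/s interface receiving 300 Mbit/s and transmitting 100 Mbit/s, this gives 30% for full duplex and 40% for half duplex.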
981
982 IFB Infiniband utilization.
983 For every active Infiniband port one line is shown, sorted on
984 activity. Such line shows the name of the port and its busy per‐
985 centage in the first column. The busy percentage is determined by
986 taking the highest of either the transmitted or the received bits
987 during the interval, multiplying that value by the number of lanes
988 and comparing it against the maximum port speed.
989 Furthermore the number of received packets divided by the number
990 of lanes (`pcki'), the number of transmitted packets divided by
991 the number of lanes (`pcko'), the maximum line speed (`sp'), the
992 effective amount of bits received per second (`si'), the effective
993 amount of bits transmitted per second (`so'), and the number of
994 lanes (`lanes').
995 If the screen-width does not allow all of these counters, only a
996 relevant subset is shown.
997 The number of lines showing the Infiniband ports can be limited.
998
1000 Following the system level information, the processes are shown
1001 whose resource utilization has changed during the last interval.
1002 These processes might have used cpu time or issued disk or network
1003 requests. However a process is also shown if part of it has been paged
1004 out due to lack of memory (while the process itself was in sleep
1005 state).
1006
1007 Per process the following fields may be shown (in alphabetical order),
1008 depending on the current output mode as described in the section INTER‐
1009 ACTIVE COMMANDS and depending on the current width of your window:
1010
1011 AVGRSZ The average size of one read-action on disk.
1012
1013 AVGWSZ The average size of one write-action on disk.
1014
1015 BANDWI Total bandwidth for received TCP and UDP packets consumed by
1016 this process (bits-per-second). This value can be compared
1017 with the value `si' on interface level (used bandwidth per
1018 interface).
1019 This information will only be shown when the kernel module
1020 `netatop' is loaded.
1021
1022 BANDWO Total bandwidth for sent TCP and UDP packets consumed by this
1023 process (bits-per-second). This value can be compared with
1024 the value `so' on interface level (used bandwidth per inter‐
1025 face).
1026 This information will only be shown when the kernel module
1027 `netatop' is loaded.
1028
1029 CID Container ID (Docker) of 12 hexadecimal digits, referring to
1030 the container in which the process/thread is running. If a
1031 process has been started and finished during the last inter‐
1032 val, a `?' is shown because the container ID is not part of
1033 the standard process accounting record.
1034
1035 CMD The name of the process. This name can be surrounded by
1036 "less/greater than" signs (`<name>') which means that the
1037 process has finished during the last interval.
1038 Behind the abbreviation `CMD' in the header line, the current
1039 page number and the total number of pages of the
1040 process/thread list are shown.
1041
1042 COMMAND-LINE
1043 The full command line of the process (including arguments). If
1044 the length of the command line exceeds the length of the
1045 screen line, the arrow keys -> and <- can be used for horizon‐
1046 tal scroll.
1047 Behind the label `COMMAND-LINE' in the header line, the current
1048 page number and the total number of pages of the
1049 process/thread list are shown.
1050
1051 CPU The occupation percentage of this process related to the
1052 available capacity for this resource on system level.
1053
1054 CPUNR The identification of the CPU the (main) thread is running on
1055 or has recently been running on.
1056
1057 CTID Container ID (OpenVZ). If a process has been started and fin‐
1058 ished during the last interval, a `?' is shown because the
1059 container ID is not part of the standard process accounting
1060 record.
1061
1062 DSK The occupation percentage of this process related to the total
1063 load that is produced by all processes (i.e. total disk
1064 accesses by all processes during the last interval).
1065 This information is shown when per process "storage account‐
1066 ing" is active in the kernel.
1067
1068 EGID Effective group-id under which this process executes.
1069
1070 ENDATE Date that the process has been finished. If the process is
1071 still running, this field shows `active'.
1072
1073 ENTIME Time that the process has been finished. If the process is
1074 still running, this field shows `active'.
1075
1076 ENVID Virtual environment identifier (OpenVZ only).
1077
1078 EUID Effective user-id under which this process executes.
1079
1080 EXC The exit code of a terminated process (second position of col‐
1081 umn `ST' is E) or the fatal signal number (second position of
1082 column `ST' is S or C).
1083
1084 FSGID Filesystem group-id under which this process executes.
1085
1086 FSUID Filesystem user-id under which this process executes.
1087
1088 GPU When the atopgpud daemon does not run with root privileges,
1089 the GPU percentage reflects the GPU memory occupation percent‐
1090 age (memory of all GPUs is 100%).
1091 When the atopgpud daemon runs with root privileges, the GPU
1092 percentage reflects the GPU busy percentage.
1093
1094 GPUBUSY Busy percentage on all GPUs (one GPU is 100%).
1095 When the atopgpud daemon does not run with root privileges,
1096 this value is not available.
1097
1098 GPUNUMS Comma-separated list of GPUs used by the process during the
1099 interval. When the comma-separated list exceeds the width of
1100 the column, a hexadecimal value is shown.
1101
1102 LOCKSZ The virtual amount of memory being locked (i.e. non-swappable)
1103 by this process (or user).
1104
1105 MAJFLT The number of page faults issued by this process that have
1106 been solved by creating/loading the requested memory page.
1107
1108 MEM The occupation percentage of this process related to the
1109 available capacity for this resource on system level.
1110
1111 MEMAVG Average memory occupation during the interval on all used
1112 GPUs.
1113
1114 MEMBUSY Busy percentage of memory on all GPUs (one GPU is 100%), i.e.
1115 the time needed for read and write accesses on memory.
1116 When the atopgpud daemon does not run with root privileges,
1117 this value is not available.
1118
1119 MEMNOW Memory occupation at the moment of the sample on all used
1120 GPUs.
1121
1122 MINFLT The number of page faults issued by this process that have
1123 been solved by reclaiming the requested memory page from the
1124 free list of pages.
1125
1126 NET The occupation percentage of this process related to the total
1127 load that is produced by all processes (i.e. consumed network
1128 bandwidth of all processes during the last interval).
1129 This information will only be shown when kernel module
1130 `netatop' is loaded.
1131
1132 NICE The more or less static priority that can be given to a
1133 process on a scale from -20 (high priority) to +19 (low prior‐
1134 ity).
1135
1136 NPROCS The number of active and terminated processes accumulated for
1137 this user or program.
1138
1139 PID Process-id. If a process has been started and finished during
1140 the last interval, a `?' is shown because the process-id is
1141 not part of the standard process accounting record.
1142
1143 POLI The policies 'norm' (normal, which is SCHED_OTHER), 'btch'
1144 (batch) and 'idle' refer to timesharing processes. The poli‐
1145 cies 'fifo' (SCHED_FIFO) and 'rr' (round robin, which is
1146 SCHED_RR) refer to realtime processes.
1147
1148 PPID Parent process-id. If a process has been started and finished
1149 during the last interval, value 0 is shown because the parent
1150 process-id is not part of the standard process accounting
1151 record.
1152
1153 PRI The process' priority ranges from 0 (highest priority) to 139
1154 (lowest priority). Priority 0 to 99 are used for realtime pro‐
1155 cesses (fixed priority independent of their behavior) and pri‐
1156 ority 100 to 139 for timesharing processes (variable priority
1157 depending on their recent CPU consumption and the nice value).
1158
1159 PSIZE The proportional memory size of this process (or user).
1160 Every process shares resident memory with other processes.
1161 E.g. when a particular program is started several times, the
1162 code pages (text) are only loaded once in memory and shared by
1163 all incarnations. Also the code of shared libraries is shared
1164 by all processes using that shared library, as well as shared
1165 memory and memory-mapped files. For the PSIZE calculation of
1166 a process, the resident memory of a process that is shared
1167 with other processes is divided by the number of sharers.
1168 This means that every process is accounted for a proportional
1169 part of that memory. Accumulating the PSIZE values of all pro‐
1170 cesses in the system gives a reliable impression of the total
1171 resident memory consumed by all processes.
1172 Since gathering of all values that are needed to calculate the
1173 PSIZE is a very time-consuming task, the 'R' key (or '-R'
1174 flag) should be active. Gathering these values also requires
1175 superuser privileges (otherwise '?K' is shown in the output).
1176 If a process has finished during the last interval, no value
1177 is shown since the proportional memory size is not part of the
1178 standard process accounting record.
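
The difference between PSIZE and RSIZE can be illustrated with a small calculation. This sketch assumes that the resident memory of a process is given as a list of (resident size, number of sharers) pairs; the function names and the input layout are hypothetical.

```python
from typing import Iterable, Tuple

Region = Tuple[float, int]  # (resident KiB, number of sharing processes)


def proportional_size(regions: Iterable[Region]) -> float:
    """PSIZE-like value: each region is divided by its number of sharers."""
    return sum(resident / sharers for resident, sharers in regions)


def resident_size(regions: Iterable[Region]) -> float:
    """RSIZE-like value: shared regions are counted in full."""
    return sum(resident for resident, _ in regions)
```

A process with 4000 KiB private memory, 6000 KiB shared by three processes and 1000 KiB shared by two has a resident size of 11000 KiB but a proportional size of 6500 KiB.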
1179
1180 RDDSK When the kernel maintains standard io statistics (>= 2.6.20):
1181 The read data transfer issued physically on disk (so reading
1182 from the disk cache is not accounted for).
1183 Unfortunately, the kernel aggregates the data transfer of a
1184 process to the data transfer of its parent process when termi‐
1185 nating, so you might see transfers for (parent) processes like
1186 cron, bash or init, that are not really issued by them.
1187
1188 RDELAY Runqueue delay, i.e. time spent waiting on a runqueue.
1189
1190 RGID The real group-id under which the process executes.
1191
1192 RGROW The amount of resident memory that the process has grown dur‐
1193 ing the last interval. A resident growth can be caused by
1194 touching memory pages which were not physically created/loaded
1195 before (load-on-demand). Note that a resident growth can also
1196 be negative e.g. when part of the process is paged out due to
1197 lack of memory or when the process frees dynamically allocated
1198 memory. For a process which started during the last interval,
1199 the resident growth reflects the total resident size of the
1200 process at that moment.
1201 If a process has finished during the last interval, no value
1202 is shown since resident memory occupation is not part of the
1203 standard process accounting record.
1204
1205 RNET The number of TCP- and UDP packets received by this process.
1206 This information will only be shown when kernel module
1207 `netatop' is installed.
1208 If a process has finished during the last interval, no value
1209 is shown since network counters are not part of the standard
1210 process accounting record.
1211
1212 RSIZE The total resident memory usage consumed by this process (or
1213 user). Notice that the RSIZE of a process includes all resi‐
1214 dent memory used by that process, even if certain memory parts
1215 are shared with other processes (see also the explanation of
1216 PSIZE).
1217 If a process has finished during the last interval, no value
1218 is shown since resident memory occupation is not part of the
1219 standard process accounting record.
1220
1221 RTPR Realtime priority according to the POSIX standard. Value can be
1222 0 for a timesharing process (policy 'norm', 'btch' or 'idle')
1223 or ranges from 1 (lowest) till 99 (highest) for a realtime
1224 process (policy 'rr' or 'fifo').
1225
1226 RUID The real user-id under which the process executes.
1227
1228 S The current state of the (main) thread: `R' for running (cur‐
1229 rently processing or in the runqueue), `S' for sleeping inter‐
1230 ruptible (wait for an event to occur), `D' for sleeping non-
1231 interruptible, `Z' for zombie (waiting to be synchronized with
1232 its parent process), `T' for stopped (suspended or traced),
1233 `W' for swapping, and `E' (exit) for processes which have fin‐
1234 ished during the last interval.
1235
1236 SGID The saved group-id of the process.
1237
1238 SNET The number of TCP and UDP packets transmitted by this process.
1239 This information will only be shown when the kernel module
1240 `netatop' is loaded.
1241
1242 ST The status of a process.
1243 The first position indicates if the process has been started
1244 during the last interval (the value N means 'new process').
1245
1246 The second position indicates if the process has been finished
1247 during the last interval.
1248 The value E means 'exit' on the process' own initiative; the
1249 exit code is displayed in the column `EXC'.
1250 The value S means that the process has been terminated
1251 involuntarily by a signal; the signal number is displayed in
1252 the column `EXC'.
1253 The value C means that the process has been terminated
1254 involuntarily by a signal, producing a core dump in its current
1255 directory; the signal number is displayed in the column `EXC'.
1256
1257 STDATE The start date of the process.
1258
1259 STTIME The start time of the process.
1260
1261 SUID The saved user-id of the process.
1262
1263 SWAPSZ The swap space consumed by this process (or user).
1264
1265 SYSCPU CPU time consumption of this process in system mode (kernel
1266 mode), usually due to system call handling.
1267
1268 TCPRASZ The average size of a received TCP buffer in bytes. This
1269 information will only be shown when the kernel module
1270 `netatop' is loaded.
1271
1272 TCPRCV The number of TCP packets received for this process. This
1273 information will only be shown when the kernel module
1274 `netatop' is loaded.
1275
1276 TCPSASZ The average size of a transmitted TCP buffer in bytes. This
1277 information will only be shown when the kernel module
1278 `netatop' is loaded.
1279
1280 TCPSND The number of TCP packets transmitted for this process. This
1281 information will only be shown when the kernel module
1282 `netatop' is loaded.
1283
1284 THR Total number of threads within this process. All related
1285 threads are contained in a thread group, represented by atop
1286 as one line or as a separate line when the 'y' key (or -y
1287 flag) is active.
1288
1289 On Linux 2.4 systems it is hardly possible to determine which
1290 threads (i.e. processes) are related to the same thread group.
1291 Every thread is represented by atop as a separate line.
1292
1293 TID Thread-id. All threads within a process run with the same PID
1294 but with a different TID. This value is shown for individual
1295 threads in multi-threaded processes (when using the key 'y').
1296
1297 TRUN Number of threads within this process that are in the state
1298 'running' (R).
1299
1300 TSLPI Number of threads within this process that are in the state
1301 'interruptible sleeping' (S).
1302
1303 TSLPU Number of threads within this process that are in the state
1304 'uninterruptible sleeping' (D).
1305
1306 UDPRASZ The average size of a received UDP packet in bytes. This
1307 information will only be shown when the kernel module
1308 `netatop' is loaded.
1309
1310 UDPRCV The number of UDP packets received by this process. This
1311 information will only be shown when the kernel module
1312 `netatop' is loaded.
1313
1314 UDPSASZ The average size of a transmitted UDP packet in bytes. This
1315 information will only be shown when the kernel module
1316 `netatop' is loaded.
1317
1318 UDPSND The number of UDP packets transmitted by this process. This
1319 information will only be shown when the kernel module
1320 `netatop' is loaded.
1321
1322 USRCPU CPU time consumption of this process in user mode, due to
1323 processing its own program text.
1324
1325 VDATA The virtual memory size of the private data used by this
1326 process (including heap and shared library data).
1327
1328 VGROW The amount of virtual memory that the process has grown during
1329 the last interval. A virtual growth can be caused by e.g.
1330 issuing a malloc() or attaching a shared memory segment. Note
1331 that a virtual growth can also be negative by e.g. issuing a
1332 free() or detaching a shared memory segment. For a process
1333 which started during the last interval, the virtual growth
1334 reflects the total virtual size of the process at that moment.
1335 If a process has finished during the last interval, no value
1336 is shown since virtual memory occupation is not part of the
1337 standard process accounting record.
1338
1339 VPID Virtual process-id (within an OpenVZ container). If a process
1340 has been started and finished during the last interval, a `?'
1341 is shown because the virtual process-id is not part of the
1342 standard process accounting record.
1343
1344 VSIZE The total virtual memory usage consumed by this process (or
1345 user).
1346 If a process has finished during the last interval, no value
1347 is shown since virtual memory occupation is not part of the
1348 standard process accounting record.
1349
1350 VSLIBS The virtual memory size of the (shared) text of all shared
1351 libraries used by this process.
1352
1353 VSTACK The virtual memory size of the (private) stack used by this
1354 process.
1355
1356 VSTEXT The virtual memory size of the (shared) text of the executable
1357 program.
1358
1359 WCHAN Wait channel of thread in sleep state, i.e. the name of the
1360 kernel function in which the thread has been put asleep.
1361 Since determining the name string of the kernel function is a
1362 relatively time-consuming task, the 'W' key (or '-W' flag)
1363 should be active.
1364
1365 WRDSK When the kernel maintains standard io statistics (>= 2.6.20):
1366 The write data transfer issued physically on disk (so writing
1367 to the disk cache is not accounted for). This counter is
1368 maintained for the application process that writes its data to
1369 the cache (assuming that this data is physically transferred
1370 to disk later on). Notice that disk I/O needed for swapping is
1371 not taken into account.
1372 Unfortunately, the kernel aggregates the data transfer of a
1373 process to the data transfer of its parent process when termi‐
1374 nating, so you might see transfers for (parent) processes like
1375 cron, bash or init, that are not really issued by them.
1376
1377 WCANCL When the kernel maintains standard io statistics (>= 2.6.20):
1378 The write data transfer previously accounted for this process
1379 or another process that has been cancelled. Suppose that a
1380 process writes new data to a file and that data is removed
1381 again before the cache buffers have been flushed to disk.
1382 Then the original process shows the written data as WRDSK,
1383 while the process that removes/truncates the file shows the
1384 unflushed removed data as WCANCL.
1385
1387 With the flag -P followed by a list of one or more labels (comma-sepa‐
1388 rated), parseable output is produced for each sample. The labels that
1389 can be specified for system-level statistics correspond to the labels
1390 (first word of each line) that can be found in the interactive output:
1391 "CPU", "cpu", "CPL", "GPU", "MEM", "SWP", "PAG", "PSI", "LVM", "MDD",
1392 "DSK", "NFM", "NFC", "NFS", "NET" and "IFB".
1393 For process-level statistics special labels are introduced: "PRG" (gen‐
1394 eral), "PRC" (cpu), "PRE" (GPU), "PRM" (memory), "PRD" (disk, only if
1395 "storage accounting" is active) and "PRN" (network, only if the kernel
1396 module 'netatop' has been installed).
1397 With the label "ALL", all system and process level statistics are
1398 shown.
1399
1400 For every interval all requested lines are shown, after which atop
1401 shows a line containing just the label "SEP" as a separator before the
1402 lines for the next sample are generated.
1403 When a sample contains the values since boot, atop shows a line just
1404 containing the label "RESET" before the lines for this sample are gen‐
1405 erated.
1406
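The "SEP" and "RESET" markers described above can be used to group parseable output into samples. The following is a minimal sketch in Python, not part of atop itself; the function name split_samples is hypothetical, and it assumes that the marker lines contain nothing but the label.

```python
from typing import Iterable, Iterator, List, Tuple


def split_samples(lines: Iterable[str]) -> Iterator[Tuple[bool, List[str]]]:
    """Yield (since_boot, lines) for every sample found in parseable output.

    since_boot is True when the sample was preceded by a "RESET" line.
    """
    since_boot = False
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "RESET":
            # The lines that follow hold the values since boot.
            since_boot = True
        elif line == "SEP":
            # Separator: the sample collected so far is complete.
            if current:
                yield since_boot, current
            since_boot, current = False, []
        elif line:
            current.append(line)
    if current:
        # Trailing lines without a closing "SEP" (e.g. truncated input).
        yield since_boot, current
```

The generator accepts any iterable of lines, for example an open file or the standard output of a child process.
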
1407 The first part of each output-line consists of the following six
1408 fields: label (the name of the label), host (the name of this machine),
1409 epoch (the time of this interval as number of seconds since 1-1-1970),
1410 date (date of this interval in format YYYY/MM/DD), time (time of this
1411 interval in format HH:MM:SS), and interval (number of seconds elapsed
1412 for this interval).
1413
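As an illustration of the six leading fields, the sketch below splits one output line into a header and the label-specific remainder. It is a Python sketch under assumptions: the names Header and parse_header are hypothetical, and the sample line in the usage note is made up for illustration rather than taken from a real system.

```python
from typing import List, NamedTuple, Tuple


class Header(NamedTuple):
    label: str      # name of the label
    host: str       # name of this machine
    epoch: int      # seconds since 1-1-1970
    date: str       # YYYY/MM/DD
    time: str       # HH:MM:SS
    interval: int   # seconds elapsed for this interval


def parse_header(line: str) -> Tuple[Header, List[str]]:
    """Split a parseable line into its six common fields and the rest."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("not a parseable data line: %r" % line)
    header = Header(fields[0], fields[1], int(fields[2]),
                    fields[3], fields[4], int(fields[5]))
    return header, fields[6:]
```

For a made-up line such as "CPL myhost 1609459200 2021/01/01 00:00:00 600 4 0.52 0.48 0.45 123456 7890", the remainder holds the six CPL values. Plain whitespace splitting is only safe for the header itself: fields that are shown between brackets (such as process names) may contain spaces.
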
1414 The subsequent fields of each output-line depend on the label:
1415
1416 CPU Subsequent fields: total number of clock-ticks per second for
1417 this machine, number of processors, consumption for all CPUs
1418 in system mode (clock-ticks), consumption for all CPUs in user
1419 mode (clock-ticks), consumption for all CPUs in user mode for
1420 niced processes (clock-ticks), consumption for all CPUs in
1421 idle mode (clock-ticks), consumption for all CPUs in wait mode
1422 (clock-ticks), consumption for all CPUs in irq mode (clock-
1423 ticks), consumption for all CPUs in softirq mode (clock-
1424 ticks), consumption for all CPUs in steal mode (clock-ticks),
1425 consumption for all CPUs in guest mode (clock-ticks) overlap‐
1426 ping user mode, frequency of all CPUs, frequency percentage of
1427 all CPUs, instructions executed by all CPUs and cycles for all
1428 CPUs.
1429
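The clock-tick counters of a "CPU" line can be turned into a busy percentage. The sketch below is one possible calculation in Python, not the formula used by atop itself; the function name cpu_busy_percent is hypothetical. It expects the label-specific fields (everything after the six common fields) and leaves out the guest counter, because that one overlaps user mode.

```python
from typing import Sequence


def cpu_busy_percent(rest: Sequence[str]) -> float:
    """Busy percentage over all CPUs from the fields of a "CPU" line.

    rest[0] is the number of clock-ticks per second and rest[1] the
    number of processors; the tick counters start at rest[2].
    """
    (sys_, user, nice, idle, wait,
     irq, softirq, steal) = (int(x) for x in rest[2:10])
    # Assumption: steal time counts as busy, wait time as not busy.
    busy = sys_ + user + nice + irq + softirq + steal
    total = busy + idle + wait
    return 100.0 * busy / total if total else 0.0
```

The result ranges from 0 to 100 for the machine as a whole, regardless of the number of processors, because both numerator and denominator are summed over all CPUs.
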
1430 cpu Subsequent fields: total number of clock-ticks per second for
1431 this machine, processor-number, consumption for this CPU in
1432 system mode (clock-ticks), consumption for this CPU in user
1433 mode (clock-ticks), consumption for this CPU in user mode for
1434 niced processes (clock-ticks), consumption for this CPU in
1435 idle mode (clock-ticks), consumption for this CPU in wait mode
1436 (clock-ticks), consumption for this CPU in irq mode (clock-
1437 ticks), consumption for this CPU in softirq mode (clock-
1438 ticks), consumption for this CPU in steal mode (clock-ticks),
1439 consumption for this CPU in guest mode (clock-ticks) overlap‐
1440 ping user mode, frequency of this CPU, frequency percentage of
1441 this CPU, instructions executed by this CPU and cycles for
1442 this CPU.
1443
1444 CPL Subsequent fields: number of processors, load average for last
1445 minute, load average for last five minutes, load average for
1446 last fifteen minutes, number of context-switches, and number
1447 of device interrupts.
1448
1449 GPU Subsequent fields: GPU number, bus-id string, type of GPU
1450 string, GPU busy percentage during last second (-1 if not
1451 available), memory busy percentage during last second (-1 if
1452 not available), total memory size (KiB), used memory (KiB) at
1453 this moment, number of samples taken during interval, cumula‐
1454 tive GPU busy percentage during the interval (to be divided by
1455 the number of samples for the average busy percentage, -1 if
1456 not available), cumulative memory busy percentage during the
1457 interval (to be divided by the number of samples for the aver‐
1458 age busy percentage, -1 if not available), and cumulative mem‐
1459 ory occupation during the interval (to be divided by the num‐
1460 ber of samples for the average occupation).
1461
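The cumulative GPU counters only become meaningful after division by the number of samples. A minimal Python sketch of that division follows; the function name average_over_samples is hypothetical, and returning None for "not available" is a design choice of this example.

```python
from typing import Optional


def average_over_samples(cumulative: int, samples: int) -> Optional[float]:
    """Average of a cumulative GPU counter over the interval.

    Returns None when the counter is not available (-1) or when no
    samples were taken during the interval.
    """
    if cumulative == -1 or samples <= 0:
        return None
    return cumulative / samples
```

For example, a cumulative busy percentage of 250 over 5 samples corresponds to an average of 50 percent.
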
1462 MEM Subsequent fields: page size for this machine (in bytes), size
1463 of physical memory (pages), size of free memory (pages), size
1464 of page cache (pages), size of buffer cache (pages), size of
1465 slab (pages), dirty pages in cache (pages), reclaimable part
1466 of slab (pages), total size of vmware's balloon pages (pages),
1467 total size of shared memory (pages), size of resident shared
1468 memory (pages), size of swapped shared memory (pages), huge
1469 page size (in bytes), total size of huge pages (huge pages),
1470 size of free huge pages (huge pages), and size of ARC (cache)
1471 of ZFSonlinux (pages).
1472
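Because most "MEM" values are expressed in pages, they have to be multiplied by the page size (the first label-specific field) to obtain bytes. The Python sketch below converts the first five page counters to MiB; the names pages_to_mib and mem_summary and the dictionary keys are hypothetical.

```python
from typing import Dict, Sequence


def pages_to_mib(pages: int, page_size: int) -> float:
    """Convert a number of pages to MiB, given the page size in bytes."""
    return pages * page_size / (1024 * 1024)


def mem_summary(rest: Sequence[str]) -> Dict[str, float]:
    """Sizes in MiB for the first five page counters of a "MEM" line."""
    page_size = int(rest[0])
    names = ("physical", "free", "page_cache", "buffer_cache", "slab")
    return {name: pages_to_mib(int(rest[i + 1]), page_size)
            for i, name in enumerate(names)}
```

Note that the huge page counters further on in the line are expressed in huge pages and need the huge page size instead.
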
1473 SWP Subsequent fields: page size for this machine (in bytes), size
1474 of swap (pages), size of free swap (pages), size of swap cache
1475 (pages), size of committed space (pages), and limit for com‐
1476 mitted space (pages).
1477
1478 PAG Subsequent fields: page size for this machine (in bytes), num‐
1479 ber of page scans, number of allocstalls, 0 (future use), num‐
1480 ber of swapins, and number of swapouts.
1481
1482 PSI Subsequent fields: PSI statistics present on this system (n or
1483 y), CPU some avg10, CPU some avg60, CPU some avg300, CPU some
1484 accumulated microseconds during interval, memory some avg10,
1485 memory some avg60, memory some avg300, memory some accumulated
1486 microseconds during interval, memory full avg10, memory full
1487 avg60, memory full avg300, memory full accumulated microsec‐
1488 onds during interval, I/O some avg10, I/O some avg60, I/O some
1489 avg300, I/O some accumulated microseconds during interval, I/O
1490 full avg10, I/O full avg60, I/O full avg300, and I/O full
1491 accumulated microseconds during interval.
1492
1493 LVM/MDD/DSK
1494 For every logical volume/multiple device/hard disk one line is
1495 shown.
1496 Subsequent fields: name, number of milliseconds spent for I/O,
1497 number of reads issued, number of sectors transferred for
1498 reads, number of writes issued, and number of sectors trans‐
1499 ferred for write.
1500
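The number of milliseconds spent for I/O can be related to the interval length (sixth common field) to estimate how busy a disk was. The Python sketch below does so; the function name disk_stats is hypothetical, and the sector size of 512 bytes is an assumption that is not stated in this manual page, so it is passed as a parameter.

```python
from typing import Dict, Sequence, Union


def disk_stats(rest: Sequence[str], interval_s: int,
               sector_bytes: int = 512) -> Dict[str, Union[str, float, int]]:
    """Derive busy percentage and byte counts from an LVM/MDD/DSK line."""
    name = rest[0]
    io_ms, _reads, rsect, _writes, wsect = (int(x) for x in rest[1:6])
    # Fraction of the interval during which I/O was in progress.
    busy = 100.0 * io_ms / (interval_s * 1000) if interval_s > 0 else 0.0
    return {
        "name": name,
        "busy_pct": busy,
        "read_bytes": rsect * sector_bytes,    # assumes sector_bytes per sector
        "write_bytes": wsect * sector_bytes,
    }
```

With 3000 milliseconds of I/O in an interval of 10 seconds, the busy percentage is 30.
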
1501 NFM Subsequent fields: mounted NFS filesystem, total number of
1502 bytes read, total number of bytes written, number of bytes
1503 read by normal system calls, number of bytes written by normal
1504 system calls, number of bytes read by direct I/O, number of
1505 bytes written by direct I/O, number of pages read by memory-
1506 mapped I/O, and number of pages written by memory-mapped I/O.
1507
1508 NFC Subsequent fields: number of transmitted RPCs, number of
1509 transmitted read RPCs, number of transmitted write RPCs, num‐
1510 ber of RPC retransmissions, and number of authorization
1511 refreshes.
1512
1513 NFS Subsequent fields: number of handled RPCs, number of received
1514 read RPCs, number of received write RPCs, number of bytes read
1515 by clients, number of bytes written by clients, number of RPCs
1516 with bad format, number of RPCs with bad authorization, number
1517 of RPCs from bad client, total number of handled network
1518 requests, number of handled network requests via TCP, number
1519 of handled network requests via UDP, number of handled TCP
1520 connections, number of hits on reply cache, number of misses
1521 on reply cache, and number of uncached requests.
1522
1523 NET First, one line is produced for the upper layers of the TCP/IP
1524 stack.
1525 Subsequent fields: the word "upper", number of packets
1526 received by TCP, number of packets transmitted by TCP, number
1527 of packets received by UDP, number of packets transmitted by
1528 UDP, number of packets received by IP, number of packets
1529 transmitted by IP, number of packets delivered to higher lay‐
1530 ers by IP, number of packets forwarded by IP, number of input
1531 errors (UDP), number of noport errors (UDP), number of active
1532 opens (TCP), number of passive opens (TCP),
1533 number of established connections at this moment
1534 (TCP), number of retransmitted segments (TCP), number of input
1535 errors (TCP), and number of output resets (TCP).
1536
1537 Next, one line is shown for every interface.
1538 Subsequent fields: name of the interface, number of packets
1539 received by the interface, number of bytes received by the
1540 interface, number of packets transmitted by the interface,
1541 number of bytes transmitted by the interface, interface speed,
1542 and duplex mode (0=half, 1=full).
1543
1544 IFB Subsequent fields: name of the InfiniBand interface, port num‐
1545 ber, number of lanes, maximum rate (Mbps), number of bytes
1546 received, number of bytes transmitted, number of packets
1547 received, and number of packets transmitted.
1548
1549 PRG For every process one line is shown.
1550 Subsequent fields: PID (unique ID of task), name (between
1551 brackets), state, real uid, real gid, TGID (group number of
1552 related tasks/threads), total number of threads, exit code (in
1553 case of fatal signal: signal number + 256), start time
1554 (epoch), full command line (between brackets), PPID, number of
1555 threads in state 'running' (R), number of threads in state
1556 'interruptible sleeping' (S), number of threads in state
1557 'uninterruptible sleeping' (D), effective uid, effective gid,
1558 saved uid, saved gid, filesystem uid, filesystem gid, elapsed
1559 time (hertz), is_process (y/n), OpenVZ virtual pid (VPID),
1560 OpenVZ container id (CTID), Docker container id (CID), and
1561 indication if the task is newly started during this interval
1562 ('N').
1563
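The exit code field of a "PRG" line combines two cases, since a fatal signal is reported as signal number + 256. A small Python sketch of the decoding follows; the function name decode_exit and the returned tuple layout are hypothetical.

```python
from typing import Tuple


def decode_exit(value: int) -> Tuple[str, int]:
    """Decode the exit code field of a "PRG" line.

    Returns ("signal", signal_number) for a fatal signal and
    ("exit", exit_code) otherwise.
    """
    if value >= 256:
        return "signal", value - 256
    return "exit", value
```

For example, a field value of 265 stands for termination by signal 9.
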
1564 PRC For every process one line is shown.
1565 Subsequent fields: PID, name (between brackets), state, total
1566 number of clock-ticks per second for this machine, CPU-con‐
1567 sumption in user mode (clockticks), CPU-consumption in system
1568 mode (clockticks), nice value, priority, realtime priority,
1569 scheduling policy, current CPU, sleep average, TGID (group
1570 number of related tasks/threads), is_process (y/n), runqueue
1571 delay in nanoseconds for this thread or for all threads (in
1572 case of process), and wait channel of this thread (between
1573 brackets).
1574
1575 PRE For every process one line is shown.
1576 Subsequent fields: PID, name (between brackets), process
1577 state, GPU state (A for active, E for exited, N for no GPU
1578 user), number of GPUs used by this process, bitlist reflecting
1579 used GPUs, GPU busy percentage during interval, memory busy
1580 percentage during interval, memory occupation (KiB) at this
1581 moment, cumulative memory occupation (KiB) during interval, and
1582 number of samples taken during interval.
1583
1584 PRM For every process one line is shown.
1585 Subsequent fields: PID, name (between brackets), state, page
1586 size for this machine (in bytes), virtual memory size
1587 (Kbytes), resident memory size (Kbytes), shared text memory
1588 size (Kbytes), virtual memory growth (Kbytes), resident memory
1589 growth (Kbytes), number of minor page faults, number of major
1590 page faults, virtual library exec size (Kbytes), virtual data
1591 size (Kbytes), virtual stack size (Kbytes), swap space used
1592 (Kbytes), TGID (group number of related tasks/threads),
1593 is_process (y/n), proportional set size (Kbytes) if the 'R'
1594 option is specified, and virtually locked memory space
1595 (Kbytes).
1596
1597 PRD For every process one line is shown.
1598 Subsequent fields: PID, name (between brackets), state, obso‐
1599 leted kernel patch installed ('n'), standard io statistics
1600 used ('y' or 'n'), number of reads on disk, cumulative number
1601 of sectors read, number of writes on disk, cumulative number
1602 of sectors written, cancelled number of written sectors, TGID
1603 (group number of related tasks/threads), obsoleted value
1604 ('n'), and is_process (y/n).
1605 If the standard I/O statistics (>= 2.6.20) are not used, the
1606 disk I/O counters per process are not relevant. The counters
1607 'number of reads on disk' and 'number of writes on disk' are
1608 obsoleted anyhow.
1609
1610 PRN For every process one line is shown.
1611 Subsequent fields: PID, name (between brackets), state, kernel
1612 module 'netatop' loaded ('y' or 'n'), number of TCP-packets
1613 transmitted, cumulative size of TCP-packets transmitted, num‐
1614 ber of TCP-packets received, cumulative size of TCP-packets
1615 received, number of UDP-packets transmitted, cumulative size
1616 of UDP-packets transmitted, number of UDP-packets received,
1617 cumulative size of UDP-packets received, number of raw
1618 packets transmitted (obsolete, always 0), number of raw pack‐
1619 ets received (obsolete, always 0), TGID (group number of
1620 related tasks/threads) and is_process (y/n).
1621 If the kernel module is not active, the network I/O counters
1622 per process are not relevant.
1623
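Average packet sizes comparable to the TCPSASZ, TCPRASZ, UDPSASZ and UDPRASZ columns can be derived from a "PRN" line by dividing each cumulative size by the corresponding packet count. The Python sketch below is an illustration only; the function name prn_averages is hypothetical. It indexes the fields from the end of the line, on the assumption that the line ends with exactly the fields listed above, so that a process name containing spaces does not shift the counters.

```python
from typing import Dict, Optional, Sequence


def prn_averages(fields: Sequence[str]) -> Optional[Dict[str, float]]:
    """Average packet sizes (bytes) from the split fields of a "PRN" line.

    Returns None when the 'netatop' flag is not 'y', because the
    counters are not relevant in that case.
    """
    # Last 13 fields: netatop flag, 8 TCP/UDP counters, 2 raw counters,
    # TGID and is_process.
    tail = fields[-13:]
    if tail[0] != "y":
        return None
    (tcp_snd, tcp_snd_sz, tcp_rcv, tcp_rcv_sz,
     udp_snd, udp_snd_sz, udp_rcv, udp_rcv_sz) = (int(x) for x in tail[1:9])

    def avg(size: int, count: int) -> float:
        return size / count if count else 0.0

    return {
        "tcp_send_avg": avg(tcp_snd_sz, tcp_snd),
        "tcp_recv_avg": avg(tcp_rcv_sz, tcp_rcv),
        "udp_send_avg": avg(udp_snd_sz, udp_snd),
        "udp_recv_avg": avg(udp_rcv_sz, udp_rcv),
    }
```

The input is expected to be the whole line split on whitespace, including the six common fields.
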
1625 By sending the SIGUSR1 signal to atop a new sample will be forced, even
1626 if the current timer interval has not expired yet. The behavior is
1627 similar to pressing the 't' key in an interactive session.
1628
1629 By sending the SIGUSR2 signal to atop a final sample will be forced
1630 after which atop will terminate.
1631
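Both signals can be sent from a script. The following Python sketch wraps os.kill for this purpose; the function name force_sample and its kill parameter (which exists only to make the function testable without a running atop) are hypothetical, and the process-id of atop has to be determined by the caller.

```python
import os
import signal
from typing import Callable


def force_sample(pid: int, final: bool = False,
                 kill: Callable[[int, int], None] = os.kill) -> None:
    """Ask a running atop process to take a sample right now.

    With final=True, SIGUSR2 is sent, so atop takes a final sample
    and terminates; otherwise SIGUSR1 is sent.
    """
    sig = signal.SIGUSR2 if final else signal.SIGUSR1
    kill(pid, sig)
```

Sending a signal requires sufficient privileges with respect to the target process.
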
1633 To monitor the current system load interactively with an interval of 5
1634 seconds:
1635
1636 atop 5
1637
1638 To monitor the system load and write it to a file (in plain ASCII) with
1639 an interval of one minute during half an hour with active processes
1640 sorted on memory consumption:
1641
1642 atop -M 60 30 > /log/atop.mem
1643
1644 Store information about the system and process activity in binary com‐
1645 pressed form to a file with an interval of ten minutes during an hour:
1646
1647 atop -w /tmp/atop.raw 600 6
1648
1649 View the contents of this file interactively:
1650
1651 atop -r /tmp/atop.raw
1652
1653 View the processor and disk utilization of this file in parseable for‐
1654 mat:
1655
1656 atop -PCPU,DSK -r /tmp/atop.raw
1657
1658 View the contents of today's standard logfile interactively:
1659
1660 atop -r
1661
1662 View the contents of the standard logfile of the day before yesterday
1663 interactively:
1664
1665 atop -r yy
1666
1667 View the contents of the standard logfile of June 7, 2014 from 02:00 PM
1668 onwards interactively:
1669
1670 atop -r 20140607 -b 1400
1671
1672 Concatenate all raw log files of January 2020 and generate parseable
1673 output about the CPU utilization:
1674
1675 atopcat /var/log/atop/atop_202001?? | atop -r - -PCPU
1676
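When atop is started from a script, the command lines shown above can be assembled programmatically. The Python sketch below builds the argument list for reading a raw logfile in parseable format, using only the -r, -b, -e and -P flags; the function name parseable_argv is hypothetical.

```python
from typing import List, Optional, Sequence


def parseable_argv(labels: Sequence[str], rawfile: Optional[str] = None,
                   begin: Optional[str] = None,
                   end: Optional[str] = None) -> List[str]:
    """Build an argument list like: atop -r rawfile -b hhmm -PCPU,DSK"""
    if not labels:
        raise ValueError("at least one label is required")
    argv = ["atop", "-r"]
    if rawfile:
        argv.append(rawfile)          # omitted: today's standard logfile
    if begin:
        argv += ["-b", begin]         # [YYYYMMDD]hhmm
    if end:
        argv += ["-e", end]           # [YYYYMMDD]hhmm
    argv.append("-P" + ",".join(labels))
    return argv
```

The resulting list can be passed to subprocess.run() on a system where atop is installed.
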
1678 /var/run/pacct_shadow.d/
1679 Directory containing the process accounting shadow files that are
1680 used by atop when the atopacctd daemon is active.
1681
1682 /var/cache/atop.d/atop.acct
1683 File in which the kernel writes the accounting records when atop
1684 itself has activated the process accounting mechanism.
1685
1686 /etc/atoprc
1687 Configuration file containing system-wide default values. See
1688 related man-page.
1689
1690 ~/.atoprc
1691 Configuration file containing personal default values. See
1692 related man-page.
1693
1694 /etc/default/atop
1695 Configuration file to overrule the settings of atop that runs in
1696 the background to create the daily logfile. This file is created
1697 when atop is installed. The default settings are:
1698
1699 LOGOPTS=""
1700 LOGINTERVAL=600
1701 LOGGENERATIONS=28
1702
1703 /var/log/atop/atop_YYYYMMDD
1704 Raw file, where YYYYMMDD are digits representing the current date.
1705 This name is used by atop running in the background as default
1706 name for the output file, and by atop as default name for the
1707 input file when using the -r flag.
1708 All binary system and process level data in this file has been
1709 stored in compressed format.
1710
1711 /var/run/netatop.log
1712 File that contains the netpertask structs containing the network
1713 counters of exited processes. These structs are written by the
1714 netatopd daemon and read by atop after reading the standard
1715 process accounting records.
1716
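The name of the daily raw logfile follows directly from the date. The Python sketch below composes it; the function name daily_logfile is hypothetical, and the directory defaults to the location listed above.

```python
import datetime
import os


def daily_logfile(day: datetime.date, logdir: str = "/var/log/atop") -> str:
    """Return the default raw logfile name for the given date."""
    return os.path.join(logdir, "atop_" + day.strftime("%Y%m%d"))
```

For June 7, 2014 this yields /var/log/atop/atop_20140607.
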
1718 atopsar(1), atopconvert(1), atopcat(1), atoprc(5), atopacctd(8),
1719 netatop(4), netatopd(8), atopgpud(8), logrotate(8)
1720 https://www.atoptool.nl
1721
1723 Gerlof Langeveld (gerlof.langeveld@atoptool.nl)
1724 JC van Winkel
1725
1726
1727
1728Linux December 2020 ATOP(1)