1COLLECTL(1) Collectl COLLECTL(1)
2
3
4
6 collectl - Collects data that describes the current system status.
7
8
10 Record Mode - read data from live system and write to file or display
11 on terminal
12
13 collectl [-f file] [options]
14
15 Playback Mode - read data from one or more raw data files and display
16 on terminal
17
18 collectl -p file1 [file2 ...] [options]
19
20
22 Record Mode
23
24 In this mode data is taken from a live system and either displayed on
25 the terminal or written to one or more files or a socket.
26
27 --align
28 If the HiRes module is present, collectl sample monitoring will
29 be aligned such that a sample will always be taken at the top of
30 a minute (this does NOT mean the first sample will occur then)
31 so that all instances of collectl running on any systems which
32 have their clocks synchronized will all take samples at the same
33 time. Furthermore, if one is doing process monitoring, those
34 samples will also be taken at the top of the minute and so can
35 delay the start of sampling up to 2 full process monitoring
36 intervals.
37
38 --all
39 Collect summary data for ALL subsystems except slabs, since slab
40 monitoring requires a different monitoring interval. This also
41 means you won't get any detail data which also includes pro‐
42 cesses and environmentals. You can use this switch anywhere
43 -s can be used but not both together. If the system supports
44 lustre and/or interconnect monitoring those statistics will be
45 provided but the warnings produced when they are not available
46 and you try to select them with -s will not be displayed.
47
48 -A, --address address[:port] | server[:port]
49 In the first form, one specifies an address or hostname and
50 optional port. All data is then written to that socket prefaced
51 with the current host name at the named address and port until
52 the socket is closed, at which time collectl will exit.
53
54 In the second form one enters the text "server" and optional
55 port. In this form, collectl runs as a server, waiting for a
56 connection and once established writes data on that socket. The
57 key difference here is if the client exits collectl keeps run‐
58 ning and will again look for a new connection, allowing it to
59 survive client restarts or crashes. If collectl receives the
60 text "exit" from the client, it will shut down.
61
62 The default port is set at 2655 but can be changed - see col‐
63 lectl.conf.
64
65 In both forms, one can additionally request local data logging
66 by specifying a combination of -P and -f. See man collectl-log‐
67 ging for more details.
68
69 -c, --count Samples
70 The number of samples to record. This is one of 3 ways of
71 describing how long collectl should run (see -r and -R). Note
72 that these 3 switches are mutually exclusive.
73
74 -C, --config filename
75 Name/location of the collectl configuration file. If not speci‐
76 fied, collectl searches for collectl.conf first in /etc (the
77 default), then in the same directory the collectl executable is
78 in, and finally the current working directory.
79
80 -D, --daemon
81 Run collectl as a daemon, primarily used when starting as a ser‐
82 vice. One caveat about this mode is you can only run one copy.
83
84 --export file[,options]
85 This requests that collectl does not print anything on the ter‐
86 minal (or send it to a socket) using the standard brief/ver‐
87 bose/plot formats. Instead it executes a perl "require" on the
88 named file, using an extension of ph if not specified. It first
89 looks in the current directory and if not there the directory
90 the executable is in. It then calls the function
91 "file"Init(options) towards the beginning of collectl and again
92 as simply "file"(@options) to generate the exported formatted
93 output. See the document sections on Exporting Custom Output
94 and Logging for more details.
95
96 -f, --filename Filename
97 This is the name of a file to write the output to. See the
98 description of File Naming for further details.
99
100 -F, --flush seconds
101 Flush output buffers after this number of seconds. This is
102 equivalent to issuing kill -s USR1 at the same frequency (but a
103 lot easier!). If 0, a flush will occur every data collection
104 interval.
105
106 --grep pattern
107 The main purpose of this switch is for those users who have dis‐
108 covered there is some data in the raw files that never appears
109 in any display and have taken to displaying it themselves with
110 grep. Unfortunately this method does not include timestamps and
111 so makes it difficult to interpret the results. Even if you
112 include the timestamp from the file it is in UTC and so needs to
113 be translated to be of any real value. This switch does just
114 that and then some.
115
116 Specifically, it allows you to playback a file and instead of
117 processing it normally it simply searches for any entries that
118 match the perl pattern and reports those lines prefaced with
119 time stamps. You can optionally change the time format with the
120 usual -o options and can even select the timeframe with --from
121 and --thru.
122
123 --home
124 Always start the display for the current interval at the top of
125 the screen also known as the home position (non-plot format
126 only). This generates a real-time, continuously refreshing dis‐
127 play when the data fits on a single screen.
128
129 -i, --interval interval[:interval2[:interval3]]
130 This is the sampling interval in seconds. The default is 10
131 seconds when run as a daemon and 1 second otherwise. The
132 process subsystem and slabs (-sY and -sZ) are sampled at the
133 lower rate of interval2. Environmentals (-sE), which only apply
134 to a subset of hardware, are sampled at interval3. Both inter‐
135 val2 and interval3, if specified, must be an even multiple of
136 interval1. The daemon default is -i10:60:300 and all other
137 modes are -i1:60:300. To sample only processes once every 10
138 seconds use -i:10.
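The interval rules above can be sketched in Python. This is only an illustration of the rules as described, not collectl's own code: the function name parse_intervals is hypothetical, whole seconds are assumed, and an empty field (as in -i:10) is assumed to keep its default.

```python
def parse_intervals(spec, daemon=False):
    """Parse 'interval[:interval2[:interval3]]' into three whole-second values."""
    # Defaults quoted in the text: 10:60:300 for a daemon, 1:60:300 otherwise.
    values = [10 if daemon else 1, 60, 300]
    given = [False, False, False]
    fields = spec.split(":")
    if len(fields) > 3:
        raise ValueError("at most three intervals may be given")
    for pos, field in enumerate(fields):
        if field:  # assumption: an empty field keeps the default
            values[pos] = int(field)
            if values[pos] <= 0:
                raise ValueError("intervals must be positive")
            given[pos] = True
    # interval2 and interval3, if specified, must be multiples of interval1.
    for pos in (1, 2):
        if given[pos] and values[pos] % values[0] != 0:
            raise ValueError(f"interval{pos + 1} must be a multiple of interval1")
    return tuple(values)
```

For example, parse_intervals(":10") leaves the first interval at its default and only changes the process interval.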
139
140 -r, --rolllogs time[[,days][,minutes]]
141 When selected, collectl runs indefinitely (or at least until the
142 system reboots). The maximum number of raw and/or plot files
143 that will be retained (older ones are automatically deleted) is
144 controlled by the days field, which defaults to 7 days. The
145 increment field, which is also optional (but is position
146 dependent), specifies the duration of an individual collection
147 file in minutes, the default of which is 1440 or 1 day.
148
149 --quiet
150 Whenever collectl wants to tell the user something, it assigns a
151 category to it such as Informational, Warning, Error or Fatal.
152 When run with -m, all messages are displayed for the user and if
153 logging data to a file with -f, these messages are also sent to
154 a log file which is in the data collection directory and has an
155 extension of "log". However, if -m is not specified Informa‐
156 tional messages (such as collectl starting or stopping) are not
157 reported on the terminal but the other 3 are. Sometimes the
158 warnings can be annoying and one can suppress these with --quiet
159 though they will still be written to the message log in -f. You
160 cannot suppress Error or Fatal errors.
161
162 --rawtoo
163 Only available in conjunction with -P, this switch causes the
164 creation/logging of raw data in addition to plottable data.
165 While this may seem excessive, keep in mind that unlike plot‐
166 table data, raw data can be played back with different switches
167 potentially providing more details. The overhead to write out
168 this additional data is minimal, the only real cost being that
169 of extra disk space.
170
171 -R, --runtime duration
172 Specify the duration of data collection where the duration is a
173 number followed by one of wdhms, indicating how many weeks,
174 days, hours, minutes or seconds the collection is to be taken
175 for.
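As a rough illustration of the duration format, the sketch below converts a value such as 2h into seconds. The function name runtime_seconds is hypothetical, and it assumes a single whole number followed by exactly one unit letter.

```python
import re

# Seconds per unit letter: weeks, days, hours, minutes, seconds.
SECONDS_PER_UNIT = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}

def runtime_seconds(duration):
    """Convert a duration such as '2h' or '90s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([wdhms])", duration)
    if match is None:
        raise ValueError("expected a number followed by one of w, d, h, m, s")
    count, unit = match.groups()
    return int(count) * SECONDS_PER_UNIT[unit]
```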
176
177 --sep separator
178 Specify the plot format separator - default is a space. If this
179 is a numeric field it is interpreted as the decimal value of
180 the associated ASCII character code. Otherwise it is interpreted
181 as the character itself. In other words, "--sep :" sets the
182 separator character to a colon and "--sep 9" sets it to a hori‐
183 zontal tab. "--sep 58" would also set it to a colon.
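The separator rule can be sketched in a few lines of Python. The function name resolve_separator is hypothetical and the code only mirrors the rule as stated here, not collectl's implementation.

```python
def resolve_separator(value):
    """Return the separator character selected by a --sep argument."""
    # A purely numeric argument is taken as a decimal ASCII code.
    if value.isascii() and value.isdigit():
        return chr(int(value))
    # Anything else is used as the separator itself.
    return value
```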
184
185 --ssh
186 This is typically used when starting collectl on another system
187 via ssh or rsh. It causes collectl to "watch" for its parent
188 (who started it locally) to exit at which point it will exit as
189 well. The reason for this switch is that when the remote com‐
190 mand that started collectl exits, collectl's parent will exit
191 as well but NOT collectl, unless --ssh is specified.
192
193 --tworaw
194 The switches -G and --group have been replaced by --tworaw,
195 which is more descriptive of its function. When specified, it
196 tells collectl to treat process and slab data as an entirely
197 separate group of raw files, named with the extension "rawp".
198 These separate files can be played back and processed just like
199 any other collectl raw files and in fact one can even play back
200 both at the same time if that is what is desired. The only real
201 purpose of this switch is that on some systems with many pro‐
202 cesses, it is possible to generate huge raw files (some have
203 been observed to be >250MB!) and while collectl will happily
204 play back/process these files it can take a long time. By using
205 the --tworaw switch one still gets a huge rawp file, but the
206 normal raw file is a much more manageable size and as a result
207 will be faster to process than when all data is combined into
208 the same file.
209
210 Playback Mode
211
212 In this mode, data is read from one or more data files that were gener‐
213 ated in Record Mode.
214
215 --export Filename
216 When playing back a file, use this switch to create an identical
217 raw file differing only in the timeframe being covered, so nat‐
218 urally one must also include --from, --thru or both. Further,
219 since the resultant file will contain the exact same raw data
220 you cannot select a subset using -s. This switch is actually
221 intended for a support function for situations where someone is
222 having problems playing back a file and a subset of the original
223 raw file that covers the problem time has been requested, hope‐
224 fully allowing a significantly smaller file to be posted or emailed.
225
226 --extract filename
227 If specified, rather than actually play back the file specified
228 with -p, ALL raw data between the date ranges is selected and a
229 subset of that raw file created. The rules for how to interpret
230 the filename are the same as used for -f.
231
232 -f, --filename filename
233 If specified, this is the name of a file or directory to write
234 the output to (rather than the terminal). See the description
235 for details on the format of this field. This requires the -P
236 flag as well.
237
238 --from time range
239 Play back data starting with this time, which may optionally
240 include the ending time as well, in the format
241 [date:]time[-[date:]time]. The leading 0 of the hour is
242 optional and the seconds field, if not specified, is assumed to
243 be 0. If no dates are specified the time(s) apply to each file
244 specified by -p. Otherwise the time(s) only apply to the
245 first/last dates and any files between those dates will have all
246 their data reported.
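A minimal Python sketch of parsing the [date:]time[-[date:]time] form follows. It rests on assumptions that this page does not state: dates are taken to be 8 digits (yyyymmdd) and times to be hh:mm with optional :ss. The names parse_point and parse_from are hypothetical.

```python
import re

# Assumed shapes: an optional 8-digit date, then h:mm or hh:mm, then optional :ss.
_POINT = re.compile(r"(?:(\d{8}):)?(\d{1,2}):(\d{2})(?::(\d{2}))?")

def parse_point(text):
    """Return (date or None, hour, minute, second) for one end of the range."""
    match = _POINT.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    date, hour, minute, second = match.groups()
    # A missing seconds field is assumed to be 0, as described above.
    return date, int(hour), int(minute), int(second or 0)

def parse_from(spec):
    """Split a --from argument into a start point and an optional end point."""
    start, _, end = spec.partition("-")
    return parse_point(start), (parse_point(end) if end else None)
```

With this sketch, "8:30" and "08:30:00" describe the same starting time.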
247
248 --passwd filename
249 When reporting usernames associated with a UID, use this file
250 for the mapping. This is particularly important on systems run‐
251 ning NIS where there are no user names in /etc/passwd.
252
253 --pname name
254 By default, collectl uses the file /var/run/collectl.pid to
255 indicate the pid of the running instance of collectl and prevent
256 multiple copies from being run. If you DO want to run a second
257 copy, this switch will cause collectl to change its process name
258 to collectl-name and use that name as the associated pid file as
259 well.
260
261 --offsettime seconds
262 This field originally was used before collectl reported the
263 timezone in the file headers and allowed one to compensate.
264 Since then it is rarely needed except in two possible cases, one
265 in which data on two systems is to be compared and they weren't
266 synchronized with ntp. This allows all the times to be reported
267 as shifted by some number of seconds. The other case (and this
268 is very rare) is when a clock had changed in the middle of a
269 sample and will not be converted correctly. When this happens
270 one may have to play back the samples in pieces and manually set
271 the time offset.
272
273 -p, --playback Filename
274 Read data from the specified playback file(s), noting that one
275 can use wildcards in the filename if quoted (if playing back
276 multiple files to the terminal you probably want to include -m
277 to see the filenames as they are processed). The filename must
278 either end in raw or raw.gz. As an added feature, since people
279 sometimes automate the running of this option and don't want to
280 hard code a date, you can specify the string YESTERDAY or TODAY
281 and they will be replaced in the filename string by the appro‐
282 priate date.
283
284 --procanalyze
285 When specified and there is process data in the raw file, a sum‐
286 mary file will be generated with one entry per unique process
287 containing such things as the total cpu consumed for both user and
288 system, min/max utilization of various memory types, total page
289 faults and several others.
290
291 --slabanalyze
292 When specified and there is slab data in the raw file, a summary
293 file will be generated with one entry per unique slab containing
294 data on physical memory usage by that slab.
295
296 --thru time
297 Time thru which to play back a raw file. See --from for more details.
298
299 Common Switches - both record and playback modes
300
301 -d, --debug debug
302 Control the level of debugging information, not typically used.
303 For details see the source code.
304
305 -h, --help, -x, --helpext, -X, --helpall
306 Display standard, extended help message (which doesn't include
307 the optional displays such as --showoptions, --showsubsys,
308 --showsubopts, --showtopopts) or everything.
309
310 --hr, --headerrepeat num
311 Sets the number of intervals to display data for before repeat‐
312 ing the header. A value of -1 will prevent any headers from being
313 displayed and a value of 0 will cause only a single header to be
314 displayed and never repeated.
315
316 --iosize
317 In brief mode, include iosize with disk, infiniband and network
318 data.
319
320 -l, --limits limit
321 Override one or more default exception limits. If more than one
322 limit they must be separated by hyphens. Current values are:
323
324 SVC:value
325 Report partition activity with Service times >= 30 msec
326
327 IOS:value
328 Report device activity with 10 or more reads or writes
329 per second
330
331 LusKBS:value
332 Report client or OSS activity greater than limit. Only
333 applies to Client Summary or OSS Detail reporting.
334 [default=100000]
335
336 LusReints:value
337 Report MDS activity with Reint greater than limit. Only
338 applies to MDS Summary reporting. [default=1000]
339
340 AND
341 Both the IOS and SVC limits must be reached before a
342 device is reported. This is the default value and is
343 only included for completeness.
344
345 OR
346 Report device activity if either IOS or SVC thresholds
347 are reached.
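The way the limits and the AND/OR keyword combine can be sketched in Python as below. The defaults are the ones quoted in the list above; the function names parse_limits and is_exception and the dictionary layout are hypothetical, and only whole-number values are assumed.

```python
def parse_limits(spec):
    """Parse a hyphen separated -l argument such as 'SVC:50-IOS:20-OR'."""
    # Defaults as quoted in the list above.
    limits = {"SVC": 30, "IOS": 10, "LusKBS": 100000, "LusReints": 1000}
    mode = "AND"  # the default way of combining the IOS and SVC limits
    for item in spec.split("-"):
        if item in ("AND", "OR"):
            mode = item
            continue
        name, _, value = item.partition(":")
        if name not in limits or not value.isdigit():
            raise ValueError(f"unrecognised limit: {item!r}")
        limits[name] = int(value)
    return limits, mode

def is_exception(svc_msec, ios_per_sec, limits, mode):
    """Decide whether a device would be reported as an exception."""
    svc_hit = svc_msec >= limits["SVC"]
    ios_hit = ios_per_sec >= limits["IOS"]
    return (svc_hit or ios_hit) if mode == "OR" else (svc_hit and ios_hit)
```

Under the default AND mode a slow but nearly idle device is not reported; adding OR to the limit string would report it.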
348
349 -L, --lustsvcs [c|m|o][:seconds]
350 This switch limits which services lustre checks for and
351 the frequency of those checks. For more information see
352 the man page collectl-lustre.
353
354 -m, --messages
355 Write status to a monthly log file in the same directory as the
356 output file (requires -f to be specified as well). The name of
357 the file will be collectl-yyyymm.log and will track various mes‐
358 sages that may get generated during every run of collectl.
359
360 -N, --nice
361 Set priority to a nicer one of 10.
362
363 -o, --options Options
364 These apply to the way output is displayed OR written to a plot
365 file. They do not affect the way data is selected for record‐
366 ing. Most of these switches work in both record as well as
367 playback mode. If you're not sure, just try it.
368
369 1
370 Data in plotting format should use 1 decimal point of
371 precision as appropriate.
372
373 2
374 Data in plotting format should use 2 decimal points of
375 precision as appropriate.
376
377 a
378 Always append data to an existing plot file. By default
379 if a plot file exists, the playback file will be skipped
380 as a way of assuring it is associated with a single
381 recorded file. This switch overrides that mechanism
382 allowing multiple recorded files to be processed and writ‐
383 ten to a single plot file.
384
385 A
386 When playing back one or more files to the terminal in
387 brief mode, append the Average and Totals.
388
389 c
390 Always open newly named plot files in create mode, over‐
391 writing any old ones that may already exist. If one
392 processes multiple files for the same day in append mode
393 multiple times, the same data will be appended to the
394 same file multiple times. This assures a new file is
395 created at the start of the processing.
396
397 d
398 For use with terminal output and brief mode. Precede
399 each line with a date/time stamp, the date being in mm/dd
400 format. This option can also be applied to plot format,
401 which will cause the date portion to also be displayed in
402 this format as opposed to D format.
403
404 D
405 For use with terminal output and brief mode. Precede
406 each line with a date/time stamp, the date being in
407 yyyymmdd format.
408
409 g
410 For use with terminal output and brief mode. When dis‐
411 playing values of 1G or greater there is limited preci‐
412 sion for 1 digit values. This option provides a way to
413 display additional digits for more granularity by substi‐
414 tuting a "g" for the decimal point rather than the trail‐
415 ing "G".
416
417 G
418 For use with terminal output and brief mode. This is
419 similar to "g" but preserves the trailing "G" by sacri‐
420 ficing a digit of granularity.
421
422 m
423 Whenever times are reported in plot format, in the normal
424 terminal reporting format at the beginning of each inter‐
425 val or when one of the time reporting options (d, D,
426 T or U) is selected, append the milliseconds to the time.
427
428 n
429 Where appropriate, data such as disk KBs or transfers are
430 normalized to units per second by taking the change in a
431 counter and dividing by the number of seconds in that
432 interval. Normalization can be disabled via this option,
433 the result being the reported values are not divided by
434 the duration of the interval. This can be particularly
435 useful for reporting values that are less than half the
436 sampling interval, which would otherwise be rounded to 0.
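The effect of normalization on small counts can be illustrated with a short Python sketch. The function name reported_value is hypothetical, and rounding to a whole number is an assumption made for illustration; collectl's actual rounding may differ.

```python
def reported_value(delta, interval_seconds, normalize=True):
    """Return the value shown for a counter change over one interval."""
    if normalize:
        # Default behaviour: convert the change to a per-second rate.
        return round(delta / interval_seconds)
    # With the n option the raw change is shown instead.
    return delta
```

A change of 4 over a 10 second interval is 0.4 per second, which displays as 0 when normalized but as 4 when normalization is disabled.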
437
438 T
439 For use with terminal output and brief mode, precedes
440 each line with a time stamp.
441
442 u
443 Create plot files with unique names by including the start‐
444 ing time of a collection in the name. This forces multi‐
445 ple collections taken the same day to be written to mul‐
446 tiple files.
447
448 -U or --utc
449 In plot format only, report timestamps in Coordinated
450 Universal time which is more commonly known as UTC.
451
452 x
453 Report only exception records for selected subsystems.
454 Exception reporting also requires --verbose. Currently
455 this only applies to disk detail and Lustre server infor‐
456 mation so one must select at least -s D, l or L for this
457 to apply. If writing to a detail file, this data will go
458 into a separate file with the extension X appended to the
459 regular detail file name.
460
461 X
462 Report both exceptions as well as all details for
463 selected subsystems, for -s D, l or L only.
464
465 z
466 If the compression library has been installed, all output
467 files will be compressed by default. This switch tells
468 collectl not to compress any plottable files. If col‐
469 lectl tries to compress but cannot because the library
470 hasn't been installed, it will generate a warning which
471 can be suppressed with this switch.
472
473 -P, --plot
474 Generate output in plot format. This format is space separated
475 data which consists of a header (prefaced with a # for easy
476 identification by an analysis program as well as identifying it
477 as a comment for programs, such as gnuplot, which honor that
478 convention). When written to disk, which is the typical way
479 this option is used, summary data elements are written to the
480 tab file and the detail elements written to one or more files,
481 one per detail subsystem. If -f is not specified, all output is
482 sent to the terminal. Output is always one line per sampling
483 interval.
484
485 -s, --subsys subsystem
486 This field controls which subsystem data is to be collected or
487 played back for. The rules for displaying results vary depending
488 on the type of data to be displayed. If you write data for CPUs
489 and DISKs to a raw file and play it back with -sc, you will only
490 see CPU data. If you play it back with -scm you will still only
491 see CPU data since memory data was not collected. However, when
492 used with -P, collectl will always honor the subsystems speci‐
493 fied with this switch so in the previous example you will see
494 CPU data plus memory data of all 0s. To see the current set of
495 default subsystems, which are a subset of this full list, use
496 -h.
497
498 You can also use + or - to add or subtract subsystems to/from
499 the default values. For example, "-s-cdn+N" will remove cpu,
500 disk and network monitoring from the defaults while adding net‐
501 work detail.
502
503 The default is "cdn", which stands for CPU, Disk and Network
504 summary data.
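The + and - handling can be sketched in Python as follows. This only illustrates the behaviour described above; the function name apply_subsys is hypothetical and the default set "cdn" is the one quoted in the text.

```python
def apply_subsys(spec, defaults="cdn"):
    """Return the subsystem letters selected by a -s argument."""
    # Without a leading + or -, the argument replaces the defaults outright.
    if not spec or spec[0] not in "+-":
        return spec
    selected = list(defaults)
    adding = True
    for char in spec:
        if char == "+":
            adding = True
        elif char == "-":
            adding = False
        elif adding:
            if char not in selected:
                selected.append(char)
        elif char in selected:
            selected.remove(char)
    return "".join(selected)
```

Here apply_subsys("-cdn+N") leaves only N, matching the example above, while apply_subsys("+m") adds memory to the defaults.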
505
506 Refer to data definitions on the sourceforge website OR in
507 /usr/share/collectl/doc/collectl-xxx to see complete descrip‐
508 tions of the data returned.
509
510 SUMMARY SUBSYSTEMS
511
512 b - buddy info (memory fragmentation)
513 c - CPU
514 d - Disk
515 f - NFS V3 Data
516 i - Inode and File System
517 j - Interrupts
518 l - Lustre
519 m - Memory
520 n - Networks
521 s - Sockets
522 t - TCP
523 x - Interconnect
524 y - Slabs (system object caches)
525
526 DETAIL SUBSYSTEMS
527
528 This is the set of detail data from which in most cases the cor‐
529 responding summary data is derived. There are currently 2 types
530 that do not have corresponding summary data and those are "Envi‐
531 ronmental" and "Process". So, if one has 3 disks and chooses
532 -sd, one will only see a single total taken across all 3 disks.
533 If one chooses -sD, individual disk totals will be reported but
534 no totals. Choosing -sdD will get you both.
535
536 C - CPU
537 D - Disk
538 E - Environmental data (fan, power, temp), via ipmitool
539 F - NFS Data
540 J - Interrupts
541 L - Lustre OST detail OR client Filesystem detail
542 M - Memory node data, which is also known as numa data
543 N - Networks
544 T - 65 TCP counters only available in plot format
545 X - Interconnect
546 Y - Slabs (system object caches)
547 Z - Processes
548
549 --showheader
550 In collectl mode this command will cause the header that is nor‐
551 mally written to a data file to be displayed on the terminal and
552 collectl then exits. This can be a handy way to get a brief
553 overview of the system configuration.
554
555 --showoptions
556 This command shows only the portion of the help text that
557 describes the -o and --options switches to save the time of wad‐
558 ing through the entire help screen.
559
560 --showcolheaders
561 This command shows the first set of headers that will be printed
562 by collectl and exits. Doesn't really make sense for multi-sec‐
563 tion output like several sets of verbose or detail data. Also
564 note that since it requires one monitoring interval to build up
565 some headers which may be dynamic, it also forces the interval
566 to 0.
567
568 --showsubopts
569 List all the subsystem specific options.
570
571 --showtopopts
572 Show all the different values for the --top type field, which
573 specify the field(s) by which to sort the data
574
575 --showrootslabs
576 This command only works on systems using the new slab allocator
577 and will list the root name (these are those entries in
578 /sys/slab which are not soft links) along with all its alias
579 names. If a name doesn't have an alias, it will not appear in
580 this report.
581
582 --showslabaliases
583 This command only works on systems using the new slab allocator.
584 Like --showrootslabs, it will name a slab and all its aliases
585 but rather than show the root slab name it will show one of the
586 aliases to provide a more meaningful name. If there are any
587 slabs that only have a single (or no) alias they will not be
588 included in this report.
589
590 --showsubopts
591 Similar to --showoptions, this command summarizes just the para‐
592 meters associated with -O and --subopts.
593
594 --showsubsys
595 Yet another way to summarize a portion of the help text, this com‐
596 mand only shows valid subsystems.
597
598 --top [type][,num]
599 Include the top "num" consumers by resource for this interval.
600 The default number is the height of the window if it can be
601 determined otherwise 24, and the default resource is the total
602 cpu time which is taken as the sum of SysT and UsrT. See
603 --showtopopts for a list of other types of data you can sort on.
604
605 This switch can also be used with -s in which case a portion of
606 the window is reserved at the top to fill in the subsystem data,
607 which is currently in verbose mode though a brief format is con‐
608 templated for some time in the future.
609
610 In interactive mode and if not specified, the process monitoring
611 interval will be set to that for other subsystems. The screen
612 will be cleared for each interval resulting in a display similar
613 to the "top" utility. In playback mode the screen will NOT be
614 cleared. You cannot use this switch in "record" mode.
615
616 --umask mask
617 Sets collectl's umask to control output file permissions. Only
618 root can set the umask. See "man umask" for details.
619
620 --utime mask
621 Write periodic micro-timestamps into raw file at different
622 points in time for fine grained measurements of operation times.
623 1 - write timestamps when entering major sections
624 2 - write timestamps for all /proc accesses except for process
625 data
626 4 - write timestamps for /proc data for all processes including
627 threads
628
629 -v
630 Show version and whether or not Compression and/or HiResTime
631 modules have been installed and exit.
632
633 -V
634 Show default parameter and control settings, all of which can be
635 changed in /etc/collectl.conf
636
637 --verbose
638 Display output in verbose mode. This often displays more data
639 than in the default mode. When displaying detail data, verbose
640 mode is forced. Furthermore, if summary data for a single sub‐
641 system is to be displayed in verbose mode, the headers are only
642 repeated occasionally whereas if multiple subsystems are
643 involved each needs their own header.
644
645 -w
646 Display data in wide mode. When displaying data on the terminal,
647 some data is formatted followed by a K, M or G as appropriate.
648 Selecting this switch will cause the full field to be displayed.
649 Note that there is no attempt to align data with the column
650 headings in this mode.
651
652
654 The following options are subsystem specific and typically filter data
655 for collection and/or display as well as affect the output format:
656
657 --dskfilt [^]perl-regx[,perl-regx...]
658 NOTE - this does NOT affect data collection, ALL disk data will
659 always be collected. However, only data for disk names that
660 match the pattern(s) will be included in the summary totals and
661 displayed when details are requested. Alternatively,
662 if you preface the first expression with a caret, all names that
663 match that string will be excluded from the summary totals and
664 detail displays rather than included. If you don't know perl, a
665 partial string will usually work too.
666
667 --dskopts
668 i - display the i/o sizes in brief mode just like with --iosize
669 z - only applies to disk details, do not report any lines with
670 values of all zeros.
671
672 --envopts Environmental Options
673 The default is to display ALL data but the following will cause
674 a subset to be displayed
675
676 f - display fan data
677 p - display current (power) data
678 t - display temperature data
679 C - convert temperature to Celsius if in Fahrenheit
680 F - convert temperature to Fahrenheit if in Celsius
681 M - display each type of data on separate line
682 T - display data truncated to whole integers (some implementa‐
683 tions displayed them with fractional components)
684 9 - any number, will tell ipmitool to read on this device number
685
686 --envfilt regx
687 If specified, this regx is evaluated against each line of data returned
688 by ipmitool and only those that match are retained. All other data is lost.
689
690 --envremap perl-regx,...
691 If specified as a comma separated list of perl regular substitu‐
692 tion expressions without the =~s portion, each expression is
693 applied to each environmental field name, thereby allowing one
694 to rename the column headers. This can be most useful when run‐
695 ning on heterogeneous systems and you want consistent column
696 names.
697
698 --lustopts Lustre Options
699 B - For clients and servers, show buffer stats
700 D - For MDSs and OSTs AND running earlier versions of HPSFS,
701 collect disk block iostats
702 M - For clients, collect metadata
703 O - For OSTs, show detail level stats
704 R - For client, collect readahead stats
705
706 --memopts Memory Options
707 R - show memory values (including swap space) as rates of change
708 as opposed to absolute values. One can also show absolute
709 changes between intervals by including -on.
710
711 --netfilt [^]perl-regx[,perl-regx...]
712 NOTE - this does NOT affect data collection, ALL network data
713 will always be collected. However, only data for network names
714 that match the pattern(s) will be included in the summary totals
715 and displayed when details are requested. Alterna‐
716 tively, if you preface the first expression with a caret, all
717 names that match that string will be excluded from the summary
718 totals and detail displays rather than included. If you don't
719 know perl, a partial string will usually work too.
720
721 --netopts
722 e - include network error counts in brief and explicit error
723 types elsewhere
724 E - only include lines with network errors in them
725 i - include i/o sizes in brief mode
726 w - set width of network device name
727
728 --nfsfilt NFS Filters
729 Specify one or more comma separated filters as a C/S followed by
730 an nfs version number and only those will have data reported on.
731 For example, C2 says to report data on V2 Clients. As a data
732 collection performance optimization, if one or more client fil‐
733 ters are specified, data will actually be collected for all
734 clients as is also done for servers.
735
736 --nfsopts NFS Options
737 z - only display detail lines which have data
738
739 --procfilt Process Filters
740 These filters restrict which processes are selected for collec‐
741 tion/display and replace -Z which is now deprecated. The for‐
742 mat of a filter is a one character type followed by a match
743 string. Multiple filters may be specified if separated by com‐
744 mas.
745
746 c - substring of the command being executed
747 C - any command that starts with the specified string
748 f - full path of the command, including arguments
749 p - pid
750 P - parent pid
751 u - any process owned by this user's UID or in the range
752 specified by uxxx-yyy
753 U - any process owned by this username
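To make the filter format concrete, here is a minimal Python sketch that splits a --procfilt style value into (type, match) pairs. It is not collectl's parser: the function name parse_procfilt is hypothetical, and the handling of the uxxx-yyy range as a pair of integers is an assumption made for illustration.

```python
VALID_TYPES = set("cCfpPuU")  # the one-character types listed above


def parse_procfilt(spec):
    """Split 'cjava,u1000-2000' into [('c', 'java'), ('u', (1000, 2000))].

    Note: splitting on commas means a match string cannot contain one.
    """
    filters = []
    for item in spec.split(","):
        if not item:
            continue
        kind, value = item[0], item[1:]
        if kind not in VALID_TYPES:
            raise ValueError(f"unknown filter type {kind!r} in {item!r}")
        if kind == "u" and "-" in value:
            # Assumed representation of a UID range: (low, high) integers.
            low, high = value.split("-", 1)
            value = (int(low), int(high))
        filters.append((kind, value))
    return filters


if __name__ == "__main__":
    print(parse_procfilt("cjava,u1000-2000,p42"))
```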
754
755 --procopts options
756 These options control the way data is displayed and can also
757 improve data collection performance
758
759 c - include CPU time of children who have exited (same as ps -S)
760 f - use cumulative totals for page faults in process data
761 instead of rates
762 i - show process I/O counters in display instead of default for‐
763 mat
764 m - show breakdown of memory utilization instead of default for‐
765 mat
766 p - never look for new pids or threads during data collection
767 r - show root command name only (no directory) for narrower dis‐
768 play
769 R - show ALL process priorities ('RT' currently displayed if
770 realtime)
771 t - include ALL process threads (increases collection overhead)
772 w - widen display by including whole argument string, with
773 optional max width
774 z - exclude any processes with 0 in sort field (in --top mode)
775
776
777 --slabfilt Slab Filters
778 One can specify a list of slab names separated by commas and
779 only those slabs whose names start with those strings will be
780 listed or summarized.
781
782 --slabopts Slab Options
783 s - exclude any slabs with an allocation of 0
784 S - only show those slabs whose allocations changed since last
785 display
786
787 --xopts
788 i - include i/o sizes in brief mode
789
790
792 The collectl utility is a system monitoring tool that records or dis‐
793 plays specific operating system data for one or more sets of subsys‐
794 tems. Any set of the subsystems, such as CPU, Disks, Memory or Sockets
795 can be included in or excluded from data collection. Data can either
796 be displayed back to the terminal, or stored in either a compressed or
797 uncompressed data file. The data files themselves can either be in raw
798 format (essentially a direct copy from the associated /proc structures)
799 or in a space separated plottable format such that it can be easily
800 plotted using tools such as gnuplot or excel. Data files can be read
801 and manipulated from the command line, or through use of command
802 scripts.
803
804 Upon startup, collectl.conf is read, which sets a number of default
805 parameters and switch values. Collectl searches for this file first in
806 /etc, then in the directory the collectl executable lives in (typically
807 /usr/sbin) and finally the current directory. These locations can be
808 overridden with the -C switch. Unless you're doing something really
809 special, this file need never be touched. The only exception is perhaps
810 when you run collectl as a service and wish to change its default
811 behavior, which is set by the DaemonCommand entry.
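The search order for collectl.conf can be pictured with a short Python sketch. It only models the three default locations named above and ignores the -C override; the function name find_collectl_conf and its parameters are hypothetical and are not part of collectl.

```python
import os


def find_collectl_conf(exe_dir, cwd, exists=os.path.isfile):
    """Return the first collectl.conf found, or None.

    Order follows the text: /etc, then the directory holding the
    collectl executable, then the current directory.
    The `exists` parameter is injectable so the order can be tested
    without touching the real filesystem.
    """
    for directory in ("/etc", exe_dir, cwd):
        candidate = os.path.join(directory, "collectl.conf")
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(find_collectl_conf("/usr/sbin", os.getcwd()))
```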
812
813
815 Thread reporting currently only works with 2.6 kernels.
816
817 The pagesize has been hardcoded for perl 5.6 systems to 4096 for IA32
818 and 16384 for all others. If you are running 5.6 on a system with a
819 different pagesize you will see incorrect SLAB allocation sizes and
820 will need to scale the numbers you're seeing accordingly.
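As a worked example of that scaling, the Python sketch below assumes the reported SLAB allocation sizes are simply a page count multiplied by the hardcoded pagesize, so the correction is a linear rescale. That linearity is an assumption, not something this page states, and the function name rescale_slab_bytes is hypothetical.

```python
def rescale_slab_bytes(reported_bytes, actual_pagesize, assumed_pagesize):
    """Rescale a reported SLAB size from the assumed to the real pagesize.

    Assumption: reported_bytes = pages * assumed_pagesize, so the
    corrected value is pages * actual_pagesize.
    """
    return reported_bytes * actual_pagesize // assumed_pagesize


if __name__ == "__main__":
    # A non-IA32 system (16384 assumed) whose real pagesize is 4096.
    # On Unix, os.sysconf("SC_PAGE_SIZE") reports the real pagesize.
    print(rescale_slab_bytes(65536, actual_pagesize=4096, assumed_pagesize=16384))
```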
821
822 I have recently discovered there is a bug in /proc in that an extra
823 line is occasionally read with the end of the previous buffer! When
824 this occurs a message is written (if -m enabled) and always written to
825 the terminal. Since this happens with a higher frequency with process
826 data I silently ignore those as the output can get pretty noisy. If
827 for any reason this is a problem, be sure to let me know.
828
829 Since collectl has no control over the frequency at which data gets
830 written to /proc, one can get anomalous statistics as collectl is only
831 reporting a snapshot of what is being recorded. For more information
832 see http://collectl.sourceforge.net/TheMath.html.
833
834 At least one network card occasionally generates erroneous network
835 stats. To try to keep the data rational, collectl tries to detect
836 this and, when it does, generates a message that bogus data has been
837 detected.
838
839
841 http://collectl.sourceforge.net OR /opt/hp/collectl/docs
842
843
845 I would like to thank Rob Urban for his creation of the Tru64 Unix col‐
846 lect tool, which collectl is based on.
847
848
850 This program was written by Mark Seger (mjseger@gmail.com).
851 Copyright 2003-2011 Hewlett-Packard Development Company, LP
852 collectl may be copied only under the terms of either the Artistic
853 License or the GNU General Public License, which may be found in the
854 source kit.
855
856
857
858LOCAL APRIL 2003 COLLECTL(1)