trace-cmd-record(1)

TRACE-CMD-RECORD(1)            libtracefs Manual           TRACE-CMD-RECORD(1)

NAME

       trace-cmd-record - record a trace from the Ftrace Linux internal tracer

SYNOPSIS

       trace-cmd record [OPTIONS] [command]

DESCRIPTION

       The trace-cmd(1) record command will set up the Ftrace Linux kernel
       tracer to record the specified plugins or events that happen while the
       command executes. If no command is given, then it will record until the
       user hits Ctrl-C.

       The record command of trace-cmd will set up the Ftrace tracer to start
       tracing the various events or plugins that are given on the command
       line. It will then create a number of tracing processes (one per CPU)
       that will start recording from the kernel ring buffer straight into
       temporary files. When the command is complete (or Ctrl-C is hit) all
       the files will be combined into a trace.dat file that can later be read
       (see trace-cmd-report(1)).

OPTIONS

       -p tracer
           Specify a tracer. Tracers usually do more than just trace an event.
           Common tracers are: function, function_graph, preemptirqsoff,
           irqsoff, preemptoff and wakeup. A tracer must be supported by the
           running kernel. To see a list of available tracers, see
           trace-cmd-list(1).

       -e event
           Specify an event to trace. Various static trace points have been
           added to the Linux kernel. They are grouped by subsystem, so you
           can enable all events of a given subsystem or specify individual
           events to be enabled. The event is of the format
           "subsystem:event-name". You can also specify just the subsystem
           without the ":event-name", or just the event-name without the
           "subsystem:". Using "-e sched_switch" will enable the
           "sched_switch" event, whereas "-e sched" will enable all events
           under the "sched" subsystem.

               The 'event' can also contain glob expressions. That is, "*stat*" will
               select all events (or subsystems) that have the characters "stat" in their
               names.

               The keyword 'all' can be used to enable all events.

       -a
           Every event that is being recorded has its output format file saved
           in the output file to be able to display it later. But if other
           events are enabled in the trace without trace-cmd’s knowledge, the
           formats of those events will not be recorded and trace-cmd report
           will not be able to display them. If this is the case, then specify
           the -a option and the format for all events in the system will be
           saved.

       -T
           Enable a stack trace on each event. For example:

                         <idle>-0     [003] 58549.289091: sched_switch:         kworker/0:1:0 [120] R ==> trace-cmd:2603 [120]
                         <idle>-0     [003] 58549.289092: kernel_stack:         <stack trace>
               => schedule (ffffffff814b260e)
               => cpu_idle (ffffffff8100a38c)
               => start_secondary (ffffffff814ab828)

       --func-stack
           Enable a stack trace on all functions. Note that this is only
           applicable to the "function" plugin tracer, and will only take
           effect if the -l option is used and succeeds in limiting
           functions. If the function tracer is not filtered and the stack
           trace is enabled, you can livelock the machine.

       -f filter
           Specify a filter for the previous event. This must come after a -e.
           This will filter what events get recorded based on the content of
           the event. Filtering is passed to the kernel directly, so what
           filtering is allowed may depend on what version of the kernel you
           have. Basically, it will let you use C notation to check if an
           event should be processed or not.

               ==, >=, <=, >, <, &, |, && and ||

           The above are usually safe to use to compare fields.

       --no-filter
           Do not filter out the trace-cmd threads. By default, the trace-cmd
           threads are filtered out so that events do not trace them. With
           this option, the trace-cmd threads are traced as well.

       -R trigger
           Specify a trigger for the previous event. This must come after a
           -e. This will add the given trigger to the given event. To enable
           only the trigger and not the event itself, place the event after
           the -v option.

               See Documentation/trace/events.txt in the Linux kernel source for more
               information on triggers.

       -v
           This will cause all events specified after it on the command line
           to not be traced. This is useful for selecting a subsystem to be
           traced but to leave out various events. For example:
           "-e sched -v -e '*stat*'" will enable all events in the sched
           subsystem except those that have "stat" in their names.

               Note: the -v option was taken from the way grep(1) inverts the following
               matches.

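           The include-then-exclude behaviour of -e and -v can be mimicked
           with plain shell globs. The sketch below is only an illustration:
           the event list is a hand-written sample (not read from a kernel),
           and the helper names glob_match and select_events are
           hypothetical. It does not call trace-cmd at all.

```shell
#!/bin/sh
# Illustration only: approximates "-e sched -v -e '*stat*'" on sample names.

# Succeed if NAME ($2) matches the shell glob PATTERN ($1).
glob_match() {
    case $2 in
        $1) return 0 ;;   # unquoted, so $1 is treated as a glob pattern
        *)  return 1 ;;
    esac
}

# Print every name that matches INCLUDE ($1) but not EXCLUDE ($2).
select_events() {
    include=$1
    exclude=$2
    shift 2
    for name in "$@"; do
        if glob_match "$include" "$name" && ! glob_match "$exclude" "$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# Sample "subsystem:event-name" strings; 'sched:*' stands in for "-e sched".
select_events 'sched:*' '*stat*' \
    sched:sched_switch sched:sched_wakeup sched:sched_stat_wait irq:irq_handler_entry
```

           With the sample list above, only sched:sched_switch and
           sched:sched_wakeup are printed.
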
       -F
           This will filter only the executable that is given on the command
           line. If no command is given, then trace-cmd will filter on itself
           (which is pretty pointless). Using -F will let you trace only
           events that are caused by the given command.

       -P pid
           Similar to -F but lets you specify a process ID to trace.

       -c
           Used with -F (or with -P, if the kernel supports it) to trace the
           process's children too.

       --user
           Execute the specified command as the given user.

       -C clock
           Set the trace clock to "clock".

               Use trace-cmd(1) list -C to see what clocks are available.

       -o output-file
           By default, trace-cmd record will create a trace.dat file. You can
           specify a different file to write to with the -o option.

       -l function-name
           This will limit the function and function_graph tracers to only
           trace the given function name. More than one -l may be specified on
           the command line to trace more than one function. This supports
           either full regex(3) parsing or basic glob parsing. If the filter
           has only alphanumeric, _, *, ? and . characters, then it will be
           parsed as a basic glob. To force it to be a regex, prefix the
           filter with ^ or append $ to it and it will then be parsed as a
           regex.

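           The glob-versus-regex rule above can be checked before running a
           trace. The helper below is a hypothetical sketch (filter_kind is
           not part of trace-cmd) and assumes that any character outside the
           listed glob set causes the filter to be treated as a regex.

```shell
#!/bin/sh
# Guess how a -l filter would be parsed, following the rule described above.
# Assumption: anything outside [A-Za-z0-9_*?.] means "regex".
filter_kind() {
    case $1 in
        *[!A-Za-z0-9_*?.]*) echo regex ;;   # found a character outside the glob set
        *)                  echo glob ;;    # only glob-safe characters
    esac
}

filter_kind 'sched_*'      # prints glob
filter_kind '^sched_.*$'   # prints regex
```
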
       -g function-name
           This option is for the function_graph plugin. It will graph the
           given function. That is, it will only trace the function and all
           functions that it calls. You can have more than one -g on the
           command line.

       -n function-name
           This has the opposite effect of -l. The function given with the -n
           option will not be traced. The -n option takes precedence: if you
           give the same function to both -n and -l, it will not be traced.

       -d
           Some tracer plugins, such as the latency tracers, enable the
           function tracer by default. This option prevents the function
           tracer from being enabled at start up.

       -D
           The option -d will try to use the function-trace option to disable
           the function tracer (if available), otherwise it defaults to the
           proc file: /proc/sys/kernel/ftrace_enabled, but will not touch it
           if the function-trace option is available. The -D option will
           disable both the ftrace_enabled proc file as well as the
           function-trace option if it exists.

               Note: this disables function tracing for all users, which includes users
               outside of ftrace tracers (stack_tracer, perf, etc).

       -O option
           Ftrace has various options that can be enabled or disabled. This
           allows you to set them. Prepending the text "no" to an option
           disables it. For example: "-O nograph-time" will disable the
           "graph-time" Ftrace option.

       -s interval
           The processes that trace-cmd creates to record from the ring buffer
           need to wake up to do the recording. Setting the interval to zero
           will cause the processes to wake up every time new data is written
           into the buffer. But since Ftrace is recording kernel activity, the
           act of these processes going back to sleep may add new events to
           the ring buffer, which will wake the processes back up. This will
           needlessly add extra data into the ring buffer.

               The interval is given in microseconds. The default is 1000 (1 ms).
               This is the time each recording process will sleep before waking up to
               record any new data that was written to the ring buffer.

       -r priority
           The priority to run the capture threads at. In a busy system the
           trace capturing threads may be starved and events can be lost. This
           increases the priority of those threads to the real time (FIFO)
           priority. But use this option with care; it can also change the
           behaviour of the system being traced.

       -b size
           This sets the ring buffer size to size kilobytes. Because the
           Ftrace ring buffer is per CPU, this size is the size of each per
           CPU ring buffer inside the kernel. Using "-b 10000" on a machine
           with 4 CPUs will make Ftrace have a total buffer size of about
           40 megabytes.

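           The arithmetic behind the "-b 10000" example can be sketched as
           follows. The function name total_buffer_kb is hypothetical, and
           the sketch simply multiplies the per-CPU size by a CPU count that
           the caller supplies.

```shell
#!/bin/sh
# Total ring buffer memory, in kilobytes, for a -b value and a CPU count.
# $1 = per-CPU size in kilobytes (the -b argument), $2 = number of CPUs
total_buffer_kb() {
    echo $(( $1 * $2 ))
}

total_buffer_kb 10000 4    # prints 40000, i.e. about 40 megabytes
```

           On a live system the CPU count could be taken from
           "getconf _NPROCESSORS_ONLN", assuming the online CPU count is the
           figure of interest.
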
       -B buffer-name
           If the kernel supports multiple buffers, this will add a buffer
           with the given name. If the buffer name already exists, that buffer
           is just reset and will not be deleted at the end of record
           execution. If the buffer is created, it will be removed at the end
           of execution (unless -k is set, or the start command was used).

               After a buffer name is stated, all events added after that will be
               associated with that buffer. If no buffer is specified, or an event
               is specified before a buffer name, it will be associated with the
               main (toplevel) buffer.

               trace-cmd record -e sched -B block -e block -B time -e timer sleep 1

               The above will enable all sched events in the main buffer. It will
               then create a 'block' buffer instance and enable all block events within
               that buffer. A 'time' buffer instance is created and all timer events
               will be enabled for that buffer.

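           The positional rule for -B (each -e belongs to the most recent -B,
           or to the main buffer if none came before it) can be traced through
           with a small argument walker. This is only a sketch: buffer_map is
           a hypothetical helper, it looks at -B and -e only, and it does not
           invoke trace-cmd.

```shell
#!/bin/sh
# Print "event -> buffer" for each -e, using the most recent -B seen so far.
buffer_map() {
    buffer=main                      # events before any -B go to the main buffer
    while [ $# -gt 0 ]; do
        case $1 in
            -B) [ $# -ge 2 ] || return 1
                buffer=$2; shift 2 ;;
            -e) [ $# -ge 2 ] || return 1
                printf '%s -> %s\n' "$2" "$buffer"; shift 2 ;;
            *)  shift ;;             # ignore everything else
        esac
    done
}

buffer_map -e sched -B block -e block -B time -e timer sleep 1
```

           For the command line shown above, this prints "sched -> main",
           "block -> block" and "timer -> time".
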
       -m size
           The max size in kilobytes that a per cpu buffer should be. Note,
           due to rounding to page size, the number may not be totally
           correct. Also, this is performed by switching between two buffers
           that are half the given size, thus the output may not be of the
           given size even if much more was written.

               Use this to prevent running out of disk space for long runs.

       -M cpumask
           Set the cpumask to trace. It only affects the last buffer
           instance given. If supplied before any buffer instance, then it
           affects the main buffer. The value supplied must be a hex number.

               trace-cmd record -p function -M c -B events13 -e all -M 5

               If the -M is left out, then the mask stays the same. To enable all
               CPUs, pass in a value of '-1'.

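           A hex cpumask has one bit per CPU, with CPU 0 as the lowest bit.
           The helper below (cpus_to_mask, a hypothetical name) is a minimal
           sketch that builds the hex value from a list of CPU numbers, which
           shows where the "c" and "5" in the example come from.

```shell
#!/bin/sh
# Build a hex cpumask from a list of CPU numbers (CPU 0 is bit 0).
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

cpus_to_mask 2 3    # prints c  (CPUs 2 and 3)
cpus_to_mask 0 2    # prints 5  (CPUs 0 and 2)
```
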
       -k
           By default, when trace-cmd is finished tracing, it will reset the
           buffers and disable all the tracing that it enabled. This option
           keeps trace-cmd from disabling the tracer and resetting the buffer.
           This option is useful for debugging trace-cmd.

               Note: usually trace-cmd will set the "tracing_on" file back to what it
               was before it was called. This option will leave that file set to zero.

       -i
           By default, if an event is listed that trace-cmd does not find, it
           will exit with an error. This option will just ignore events that
           are listed on the command line but are not found on the system.

       -N host:port
           If another machine is running "trace-cmd listen", this option is
           used to have the data sent to that machine with UDP packets.
           Instead of writing to an output file, the data is sent off to a
           remote box. This is ideal for embedded machines with little
           storage, or having a single machine that will keep all the data in
           a single repository.

               Note: This option is not supported with latency tracer plugins:
                 wakeup, wakeup_rt, irqsoff, preemptoff and preemptirqsoff

       -t
           This option is used with -N, when there’s a need to send the live
           data with TCP packets instead of UDP. Although TCP is not nearly as
           fast as UDP, it may be needed if the network is not that reliable,
           the amount of data is not that intensive, and a guarantee is needed
           that all traced information is transferred successfully.

       -q | --quiet
           For use with recording an application. Suppresses normal output
           (except for errors) to allow only the application’s output to be
           displayed.

       --date
           With the --date option, "trace-cmd" will write timestamps into the
           trace buffer after it has finished recording. It will then map the
           timestamp to gettimeofday, which will allow wall time output from
           the timestamps when reading the created trace.dat file.

       --max-graph-depth depth
           Set the maximum depth the function_graph tracer will trace into a
           function. A value of one will only show where userspace enters the
           kernel but not any functions called in the kernel. The default is
           zero, which means no limit.

       --cmdlines-size size
           Set the number of entries the kernel tracing file "saved_cmdlines"
           can contain. This file is a circular buffer which stores the
           mapping between cmdlines and PIDs. If full, it leads to unresolved
           cmdlines ("<...>") within the trace. The kernel default value is
           128.

       --module module
           Filter a module’s name in function tracing. It is equivalent to
           adding :mod:module after all other functions being filtered. If no
           other function filter is listed, then all of the module’s
           functions will be included in the filter.

               '--module snd'  is equivalent to  '-l :mod:snd'

               '--module snd -l "*jack*"' is equivalent to '-l "*jack*:mod:snd"'

               '--module snd -n "*"' is equivalent to '-n :mod:snd'

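           The first two equivalences above amount to appending ":mod:module"
           to the function filter. The helper below (module_filter, a
           hypothetical name) is a sketch of that string composition only; it
           does not cover the '-n "*"' case, which the list above shows
           collapsing to ':mod:snd'.

```shell
#!/bin/sh
# Compose the -l filter string that "--module MODULE [-l FUNC]" stands for.
# $1 = module name, $2 = optional function filter
module_filter() {
    module=$1
    func=${2:-}                       # empty when no function filter is given
    printf '%s:mod:%s\n' "$func" "$module"
}

module_filter snd             # prints :mod:snd
module_filter snd '*jack*'    # prints *jack*:mod:snd
```
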
       --proc-map
           Save the traced process address map into the trace.dat file. The
           traced processes can be specified using the option -P, or as a
           given command.

       --profile
           With the --profile option, "trace-cmd" will enable tracing that can
           be used with the --profile option of trace-cmd-report(1). If a
           tracer (-p) is not set, and function graph depth is supported by
           the kernel, then the function_graph tracer will be enabled with a
           depth of one (only show where userspace enters into the kernel).
           It will also enable various tracepoints with stack tracing such
           that the report can show where tasks have been blocked for the
           longest time.

               See trace-cmd-profile(1) for more details and examples.

       -G
           Set interrupt (soft and hard) events as global (associated to CPU
           instead of tasks). Only works for --profile.

       -H event-hooks
           Add custom event matching to connect any two events together. When
           not used with --profile, it will save the parameter and this will
           be used by trace-cmd report --profile, too. That is:

               trace-cmd record -H hrtimer_expire_entry,hrtimer/hrtimer_expire_exit,hrtimer,sp
               trace-cmd report --profile

               Will profile hrtimer_expire_entry and hrtimer_expire_exit times.

               See trace-cmd-profile(1) for format.

       -S
           (for --profile only) Only enable the tracer or events specified on
           the command line. With this option, the function_graph tracer is
           not enabled, nor are any events (like sched_switch), unless they
           are explicitly specified on the command line (e.g. -p function -e
           sched_switch -e sched_wakeup).

       --ts-offset offset
           Add an offset for the timestamp in the trace.dat file. This will
           add an offset option into the trace.dat file such that a trace-cmd
           report will offset all the timestamps of the events by the given
           offset. The offset is in raw units. That is, if the event
           timestamps are in nanoseconds the offset will also be in
           nanoseconds even if the displayed units are in microseconds.

       --tsync-interval
           Set the loop interval, in ms, for timestamp synchronization with
           guests:

               If a negative number is specified, timestamp synchronization
               is disabled.

               If 0 is specified, no loop is performed; the timestamp offset
               is calculated only twice, at the beginning and at the end of
               the trace.

           Timestamp synchronization with guests works only if there is
           support for VSOCK.

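           The three cases for the interval value can be summarised in a
           small classifier. This is a sketch: tsync_mode is a hypothetical
           helper, and the wording for a positive value assumes that "loop
           interval" means the synchronization is repeated every given number
           of milliseconds.

```shell
#!/bin/sh
# Describe what a --tsync-interval value means, per the description above.
tsync_mode() {
    if [ "$1" -lt 0 ]; then
        echo "disabled"
    elif [ "$1" -eq 0 ]; then
        echo "start and end only"
    else
        echo "every $1 ms"            # assumption: periodic loop at this interval
    fi
}

tsync_mode -1     # prints disabled
tsync_mode 0      # prints start and end only
tsync_mode 100    # prints every 100 ms
```
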
       --tsc2nsec
           Convert the current clock to nanoseconds, using tsc multiplier and
           shift from the Linux kernel’s perf interface. This option does not
           change the trace clock, just assumes that the tsc multiplier and
           shift are applicable for the selected clock. You may use the "-C
           tsc2nsec" clock, if not sure what clock to select.

       --stderr
           Have output go to stderr instead of stdout, but the output of the
           command executed will not be changed. This is useful if you want to
           monitor the output of the command being executed, but not see the
           output from trace-cmd.

       --poll
           Waiting for data to be available on the trace ring-buffers may
           trigger IPIs. This might generate unacceptable trace noise when
           tracing low latency or real time systems. The poll option forces
           trace-cmd to use O_NONBLOCK. Traces are extracted by busy waiting,
           which will hog the CPUs, so only use when really needed.

       --name
           Give a specific name for the current agent being processed. Used
           after -A to give the guest being traced a name. Useful when using
           the vsocket ID instead of the name of the guest.

       --verbose[=level]
           Set the log level. Supported log levels are "none", "critical",
           "error", "warning", "info", "debug", "all" or their identifiers
           "0", "1", "2", "3", "4", "5", "6". Setting the log level to a
           specific value enables all logs from that and all previous levels.
           The level will default to "info" if one is not specified.

               Example: enable all critical, error and warning logs

               trace-cmd record --verbose=warning

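           The cumulative behaviour of the log levels can be sketched with a
           name-to-number lookup. The helpers level_id and enabled_levels are
           hypothetical, and the sketch assumes that the names and the
           identifiers "0" to "6" correspond in the order in which they are
           listed above.

```shell
#!/bin/sh
# Map a log level name to its numeric identifier (assumed positional order).
level_id() {
    case $1 in
        none)     echo 0 ;;
        critical) echo 1 ;;
        error)    echo 2 ;;
        warning)  echo 3 ;;
        info)     echo 4 ;;
        debug)    echo 5 ;;
        all)      echo 6 ;;
        *)        return 1 ;;        # unknown level name
    esac
}

# Print, one per line, the message levels that are logged at the given level.
enabled_levels() {
    max=$(level_id "$1") || return 1
    for name in critical error warning info debug; do
        if [ "$(level_id "$name")" -le "$max" ]; then
            printf '%s\n' "$name"
        fi
    done
}

enabled_levels warning    # prints critical, error and warning on separate lines
```
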
       --file-version
           Desired version of the output file. Supported versions are 6 or 7.

       --compression
           Compression of the trace output file. One of these strings can be
           passed:

               'any'  - auto select the best available compression algorithm

               'none' - do not compress the trace file

               'name' - the name of the desired compression algorithm. Available
               algorithms can be listed with trace-cmd list -c

EXAMPLES

       The basic way to trace all events:

            # trace-cmd record -e all ls > /dev/null
            # trace-cmd report
                  trace-cmd-13541 [003] 106260.693809: filemap_fault: address=0x128122 offset=0xce
                  trace-cmd-13543 [001] 106260.693809: kmalloc: call_site=81128dd4 ptr=0xffff88003dd83800 bytes_req=768 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_ZERO
                         ls-13545 [002] 106260.693809: kfree: call_site=810a7abb ptr=0x0
                         ls-13545 [002] 106260.693818: sys_exit_write:       0x1

       To use the function tracer with sched switch tracing:

            # trace-cmd record -p function -e sched_switch ls > /dev/null
            # trace-cmd report
                         ls-13587 [002] 106467.860310: function: hrtick_start_fair <-- pick_next_task_fair
                         ls-13587 [002] 106467.860313: sched_switch: prev_comm=trace-cmd prev_pid=13587 prev_prio=120 prev_state=R ==> next_comm=trace-cmd next_pid=13583 next_prio=120
                  trace-cmd-13585 [001] 106467.860314: function: native_set_pte_at <-- __do_fault
                  trace-cmd-13586 [003] 106467.860314: function:             up_read <-- do_page_fault
                         ls-13587 [002] 106467.860317: function:             __phys_addr <-- schedule
                  trace-cmd-13585 [001] 106467.860318: function: _raw_spin_unlock <-- __do_fault
                         ls-13587 [002] 106467.860320: function: native_load_sp0 <-- __switch_to
                  trace-cmd-13586 [003] 106467.860322: function: down_read_trylock <-- do_page_fault

       Here is a nice way to find what interrupts have the highest latency:

            # trace-cmd record -p function_graph -e irq_handler_entry -l do_IRQ sleep 10
            # trace-cmd report
                     <idle>-0     [000] 157412.933969: funcgraph_entry:                  |  do_IRQ() {
                     <idle>-0     [000] 157412.933974: irq_handler_entry:    irq=48 name=eth0
                     <idle>-0     [000] 157412.934004: funcgraph_exit:       + 36.358 us |  }
                     <idle>-0     [000] 157413.895004: funcgraph_entry:                  |  do_IRQ() {
                     <idle>-0     [000] 157413.895011: irq_handler_entry:    irq=48 name=eth0
                     <idle>-0     [000] 157413.895026: funcgraph_exit:       + 24.014 us |  }
                     <idle>-0     [000] 157415.891762: funcgraph_entry:                  |  do_IRQ() {
                     <idle>-0     [000] 157415.891769: irq_handler_entry:    irq=48 name=eth0
                     <idle>-0     [000] 157415.891784: funcgraph_exit:       + 22.928 us |  }
                     <idle>-0     [000] 157415.934869: funcgraph_entry:                  |  do_IRQ() {
                     <idle>-0     [000] 157415.934874: irq_handler_entry:    irq=48 name=eth0
                     <idle>-0     [000] 157415.934906: funcgraph_exit:       + 37.512 us |  }
                     <idle>-0     [000] 157417.888373: funcgraph_entry:                  |  do_IRQ() {
                     <idle>-0     [000] 157417.888381: irq_handler_entry:    irq=48 name=eth0
                     <idle>-0     [000] 157417.888398: funcgraph_exit:       + 25.943 us |  }

       An example of the profile:

            # trace-cmd record --profile sleep 1
            # trace-cmd report --profile --comm sleep
           task: sleep-21611
             Event: sched_switch:R (1) Total: 99442 Avg: 99442 Max: 99442 Min:99442
                <stack> 1 total:99442 min:99442 max:99442 avg=99442
                  => ftrace_raw_event_sched_switch (0xffffffff8105f812)
                  => __schedule (0xffffffff8150810a)
                  => preempt_schedule (0xffffffff8150842e)
                  => ___preempt_schedule (0xffffffff81273354)
                  => cpu_stop_queue_work (0xffffffff810b03c5)
                  => stop_one_cpu (0xffffffff810b063b)
                  => sched_exec (0xffffffff8106136d)
                  => do_execve_common.isra.27 (0xffffffff81148c89)
                  => do_execve (0xffffffff811490b0)
                  => SyS_execve (0xffffffff811492c4)
                  => return_to_handler (0xffffffff8150e3c8)
                  => stub_execve (0xffffffff8150c699)
             Event: sched_switch:S (1) Total: 1000506680 Avg: 1000506680 Max: 1000506680 Min:1000506680
                <stack> 1 total:1000506680 min:1000506680 max:1000506680 avg=1000506680
                  => ftrace_raw_event_sched_switch (0xffffffff8105f812)
                  => __schedule (0xffffffff8150810a)
                  => schedule (0xffffffff815084b8)
                  => do_nanosleep (0xffffffff8150b22c)
                  => hrtimer_nanosleep (0xffffffff8108d647)
                  => SyS_nanosleep (0xffffffff8108d72c)
                  => return_to_handler (0xffffffff8150e3c8)
                  => tracesys_phase2 (0xffffffff8150c304)
             Event: sched_wakeup:21611 (1) Total: 30326 Avg: 30326 Max: 30326 Min:30326
                <stack> 1 total:30326 min:30326 max:30326 avg=30326
                  => ftrace_raw_event_sched_wakeup_template (0xffffffff8105f653)
                  => ttwu_do_wakeup (0xffffffff810606eb)
                  => ttwu_do_activate.constprop.124 (0xffffffff810607c8)
                  => try_to_wake_up (0xffffffff8106340a)

AUTHOR

       Written by Steven Rostedt, <rostedt@goodmis.org[1]>

RESOURCES

       https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/

COPYING

       Copyright (C) 2010 Red Hat, Inc. Free use of this software is granted
       under the terms of the GNU Public License (GPL).

NOTES

        1. rostedt@goodmis.org
           mailto:rostedt@goodmis.org

libtracefs                        07/23/2022               TRACE-CMD-RECORD(1)