TRACE-CMD-RECORD(1) libtracefs Manual TRACE-CMD-RECORD(1)

NAME
trace-cmd-record - record a trace from the Ftrace Linux internal tracer

SYNOPSIS
trace-cmd record [OPTIONS] [command]

DESCRIPTION
The trace-cmd(1) record command will set up the Ftrace Linux kernel
tracer to record the specified plugins or events that happen while the
command executes. If no command is given, then it will record until the
user hits Ctrl-C.

The record command of trace-cmd will set up the Ftrace tracer to start
tracing the various events or plugins that are given on the command
line. It will then create a number of tracing processes (one per CPU)
that will start recording from the kernel ring buffer straight into
temporary files. When the command is complete (or Ctrl-C is hit) all
the files will be combined into a trace.dat file that can later be read
(see trace-cmd-report(1)).

OPTIONS
-p tracer
Specify a tracer. Tracers usually do more than just trace an event.
Common tracers are: function, function_graph, preemptirqsoff,
irqsoff, preemptoff and wakeup. A tracer must be supported by the
running kernel. To see a list of available tracers, see
trace-cmd-list(1).

-e event
Specify an event to trace. Various static trace points have been
added to the Linux kernel. They are grouped by subsystem, so you
can enable all events of a given subsystem or only specific
events. The event is of the format
"subsystem:event-name". You can also just specify the subsystem
without the :event-name or the event-name without the "subsystem:".
Using "-e sched_switch" will enable the "sched_switch" event, whereas
"-e sched" will enable all events under the "sched" subsystem.

The 'event' can also contain glob expressions. That is, "*stat*" will
select all events (or subsystems) that have the characters "stat" in their
names.

The keyword 'all' can be used to enable all events.
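
As a rough illustration of how a glob such as "*stat*" selects names, the sketch below applies a pattern with the shell's own case matching. The helper select_events is hypothetical, the event names are only a sample list, and shell globbing is assumed to approximate trace-cmd's matching for simple patterns like this one. trace-cmd-list(1) shows which events actually exist on a given system.

```shell
# Hypothetical helper: print every name that matches the glob in $1.
select_events() {
    pattern=$1
    shift
    for ev in "$@"; do
        case "$ev" in
            $pattern) echo "$ev" ;;   # unquoted so it is treated as a pattern
        esac
    done
}

# Sample event names, for illustration only.
select_events '*stat*' sched_switch sched_stat_wait sched_wakeup sched_stat_sleep
```

On the trace-cmd command line the same pattern would be passed as -e "*stat*", quoted so that the shell does not expand it against file names first.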

-a
Every event that is being recorded has its output format file saved
in the output file to be able to display it later. But if other
events are enabled in the trace without trace-cmd’s knowledge, the
formats of those events will not be recorded and trace-cmd report
will not be able to display them. If this is the case, then specify
the -a option and the format for all events in the system will be
saved.

-T
Enable a stacktrace on each event. For example:

<idle>-0 [003] 58549.289091: sched_switch: kworker/0:1:0 [120] R ==> trace-cmd:2603 [120]
<idle>-0 [003] 58549.289092: kernel_stack: <stack trace>
=> schedule (ffffffff814b260e)
=> cpu_idle (ffffffff8100a38c)
=> start_secondary (ffffffff814ab828)

--func-stack
Enable a stack trace on all functions. Note this is only applicable
for the "function" plugin tracer, and will only take effect if the
-l option is used and succeeds in limiting functions. If the
function tracer is not filtered, and the stack trace is enabled,
you can live lock the machine.

-f filter
Specify a filter for the previous event. This must come after a -e.
This will filter what events get recorded based on the content of
the event. Filtering is passed to the kernel directly so what
filtering is allowed may depend on what version of the kernel you
have. Basically, it will let you use C notation to check if an
event should be processed or not.

==, >=, <=, >, <, &, |, && and ||

The above are usually safe to use to compare fields.
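
Because filter operators such as <, > and && are also shell metacharacters, the filter has to reach trace-cmd as one quoted argument. The sketch below uses a hypothetical dry-run helper, show_args, to print the argument list instead of running anything. The fields prev_prio and next_pid are taken from the sched_switch output shown in the examples of this page; whether the kernel accepts this exact expression is an assumption that depends on the kernel version.

```shell
# Hypothetical dry-run helper: print each argument on its own line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Single quotes keep <, > and && away from the shell.
filter='prev_prio < 100 && next_pid > 0'

# The filter must follow the -e it applies to.
show_args trace-cmd record -e sched_switch -f "$filter" sleep 1
```

If the filter shows up as a single bracketed line, the quoting is right, and removing show_args gives the command to run (as root).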

--no-filter
Do not filter out the trace-cmd threads. By default, the threads
are filtered out to not be traced by events. This option will have
the trace-cmd threads also be traced.

-R trigger
Specify a trigger for the previous event. This must come after a
-e. This will add a given trigger to the given event. To only
enable the trigger and not the event itself, then place the event
after the -v option.

See Documentation/trace/events.txt in the Linux kernel source for more
information on triggers.

-v
This will cause all events specified after it on the command line
to not be traced. This is useful for selecting a subsystem to be
traced but to leave out various events. For example, '-e sched -v
-e "*stat*"' will enable all events in the sched subsystem except
those that have "stat" in their names.

Note: the -v option was taken from the way grep(1) inverts the following
matches.

-F
This will filter only the executable that is given on the command
line. If no command is given, then it will filter itself (pretty
pointless). Using -F will let you trace only events that are caused
by the given command.

-P pid
Similar to -F but lets you specify a process ID to trace.

-c
Used with either -F (or -P if the kernel supports it) to trace the
process' children too.

--user
Execute the specified command as the given user.

-C clock
Set the trace clock to "clock".

Use trace-cmd(1) list -C to see what clocks are available.

-o output-file
By default, trace-cmd record will create a trace.dat file. You can
specify a different file to write to with the -o option.

-l function-name
This will limit the function and function_graph tracers to only
trace the given function name. More than one -l may be specified on
the command line to trace more than one function. This supports
either full regex(3) parsing or basic glob parsing. If the filter
has only alphanumeric, _, *, ? and . characters, then it will be
parsed as a basic glob. To force it to be a regex, prefix the
filter with ^ or end it with $ and it will then be parsed as a
regex.
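
The glob-versus-regex rule can be sketched in a few lines of shell. The function filter_kind is hypothetical and only mirrors the rule as worded in this page (any character outside alphanumerics, _, *, ? and . makes the filter a regex); it is not trace-cmd's actual parser.

```shell
# Hypothetical helper: classify a -l filter by the rule described above.
filter_kind() {
    case "$1" in
        # Any character outside [A-Za-z0-9_*?.] means regex parsing.
        *[!A-Za-z0-9_*?.]*) echo regex ;;
        *) echo glob ;;
    esac
}

filter_kind 'sched_*'       # only glob-safe characters
filter_kind '^sched_.*$'    # ^ and $ force regex parsing
```

Under this rule a plain name such as do_IRQ is also treated as a glob, which matches only that exact name.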

-g function-name
This option is for the function_graph plugin. It will graph the
given function. That is, it will only trace the function and all
functions that it calls. You can have more than one -g on the
command line.

-n function-name
This has the opposite effect of -l. The function given with the -n
option will not be traced. This takes precedence, that is, if you
include the same function for both -n and -l, it will not be
traced.

-d
Some tracer plugins, such as the latency tracers, enable the
function tracer by default. This option prevents the function
tracer from being enabled at start up.

-D
The option -d will try to use the function-trace option to disable
the function tracer (if available), otherwise it defaults to the
proc file: /proc/sys/kernel/ftrace_enabled, but will not touch it
if the function-trace option is available. The -D option will
disable both the ftrace_enabled proc file as well as the
function-trace option if it exists.

Note, this disables function tracing for all users, which includes users
outside of ftrace tracers (stack_tracer, perf, etc).

-O option
Ftrace has various options that can be enabled or disabled. This
allows you to set them. Prepending the text "no" to an option
disables it. For example: "-O nograph-time" will disable the
"graph-time" Ftrace option.

-s interval
The processes that trace-cmd creates to record from the ring buffer
need to wake up to do the recording. Setting the interval to zero
will cause the processes to wake up every time new data is written
into the buffer. But since Ftrace is recording kernel activity, the
act of these processes going back to sleep may add new events to
the ring buffer, which will wake the processes back up. This will
needlessly add extra data into the ring buffer.

The 'interval' metric is microseconds. The default is set to 1000 (1 ms).
This is the time each recording process will sleep before waking up to
record any new data that was written to the ring buffer.

-r priority
The priority to run the capture threads at. In a busy system the
trace capturing threads may be starved and events can be lost. This
increases the priority of those threads to the real time (FIFO)
priority. Use this option with care, as it can also change the
behaviour of the system being traced.

-b size
This sets the ring buffer size to size kilobytes. Because the
Ftrace ring buffer is per CPU, this size is the size of each per
CPU ring buffer inside the kernel. Using "-b 10000" on a machine
with 4 CPUs will make Ftrace have a total buffer size of 40 Megs.
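
The arithmetic behind that figure is simply the per-CPU size multiplied by the CPU count. A minimal sketch, with the CPU count hard-coded as an assumption to match the example above:

```shell
size_kb=10000   # value passed to -b, in kilobytes per CPU
cpus=4          # assumed number of CPUs, as in the example above

total_kb=$(( size_kb * cpus ))
echo "total: ${total_kb} KB (about $(( total_kb / 1000 )) MB)"
```

Working this out before a long recording gives a quick estimate of how much kernel memory the chosen -b value will take.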

-B buffer-name
If the kernel supports multiple buffers, this will add a buffer
with the given name. If the buffer name already exists, that buffer
is just reset and will not be deleted at the end of record
execution. If the buffer is created, it will be removed at the end
of execution (unless -k is set, or the start command was used).

After a buffer name is stated, all events added after that will be
associated with that buffer. If no buffer is specified, or an event
is specified before a buffer name, it will be associated with the
main (toplevel) buffer.

trace-cmd record -e sched -B block -e block -B time -e timer sleep 1

The above will enable all sched events in the main buffer. It will
then create a 'block' buffer instance and enable all block events within
that buffer. A 'time' buffer instance is created and all timer events
will be enabled within that buffer.

-m size
The max size in kilobytes that a per cpu buffer should be. Note,
due to rounding to page size, the number may not be totally
correct. Also, this is performed by switching between two buffers
that are half the given size, thus the output may not be of the
given size even if much more was written.

Use this to prevent running out of disk space for long runs.

-M cpumask
Set the cpumask to trace. It only affects the last buffer
instance given. If supplied before any buffer instance, then it
affects the main buffer. The value supplied must be a hex number.

trace-cmd record -p function -M c -B events13 -e all -M 5

If the -M is left out, then the mask stays the same. To enable all
CPUs, pass in a value of '-1'.
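
The hex value is a bitmask with one bit per CPU, bit 0 being CPU 0. The helper below is a hypothetical sketch (not part of trace-cmd) that builds the mask from a list of CPU numbers, which makes the values in the example easier to read: "c" selects CPUs 2 and 3, and "5" selects CPUs 0 and 2.

```shell
# Hypothetical helper: turn a list of CPU numbers into a hex cpumask.
cpumask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))   # set the bit for this CPU
    done
    printf '%x\n' "$mask"
}

cpumask 2 3   # prints c
cpumask 0 2   # prints 5
```

The result can be passed straight to -M, for instance -M "$(cpumask 0 2)".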

-k
By default, when trace-cmd is finished tracing, it will reset the
buffers and disable all the tracing that it enabled. This option
keeps trace-cmd from disabling the tracer and resetting the buffer.
This option is useful for debugging trace-cmd.

Note: usually trace-cmd will set the "tracing_on" file back to what it
was before it was called. This option will leave that file set to zero.

-i
By default, if an event is listed that trace-cmd does not find, it
will exit with an error. This option will just ignore events that
are listed on the command line but are not found on the system.

-N host:port
If another machine is running "trace-cmd listen", this option is
used to have the data sent to that machine with UDP packets.
Instead of writing to an output file, the data is sent off to a
remote box. This is ideal for embedded machines with little
storage, or having a single machine that will keep all the data in
a single repository.

Note: This option is not supported with latency tracer plugins:
wakeup, wakeup_rt, irqsoff, preemptoff and preemptirqsoff

-V cid:port
If recording on a guest VM and the host is running trace-cmd listen
with the -V option as well, or if this is recording on the host,
and a guest is running trace-cmd listen with the -V option, then
connect to the listener (the same as connecting with the -N option
via the network). This has the same limitations as the -N option
above with respect to latency tracer plugins.

-t
This option is used with -N, when there’s a need to send the live
data with TCP packets instead of UDP. Although TCP is not nearly as
fast as UDP, it may be needed if the network is not that reliable,
the amount of data is not that intensive, and a guarantee is needed
that all traced information is transferred successfully.

-q | --quiet
For use with recording an application. Suppresses normal output
(except for errors) to allow only the application’s output to be
displayed.

--date
With the --date option, "trace-cmd" will write timestamps into the
trace buffer after it has finished recording. It will then map the
timestamp to gettimeofday which will allow wall time output from
the timestamps when reading the created trace.dat file.

--max-graph-depth depth
Set the maximum depth the function_graph tracer will trace into a
function. A value of one will only show where userspace enters the
kernel but not any functions called in the kernel. The default is
zero, which means no limit.

--cmdlines-size size
Set the number of entries the kernel tracing file "saved_cmdlines"
can contain. This file is a circular buffer which stores the
mapping between cmdlines and PIDs. If full, it leads to unresolved
cmdlines ("<...>") within the trace. The kernel default value is
128.

--module module
Filter a module’s name in function tracing. It is equivalent to
adding :mod:module after all other functions being filtered. If no
other function filter is listed, then all of the module’s functions
will be filtered in the filter.

'--module snd' is equivalent to '-l :mod:snd'

'--module snd -l "*jack*"' is equivalent to '-l "*jack*:mod:snd"'

'--module snd -n "*"' is equivalent to '-n :mod:snd'

--proc-map
Save the traced process address map into the trace.dat file. The
traced processes can be specified using the option -P, or as a
given command.

--profile
With the --profile option, "trace-cmd" will enable tracing that can
be used with the trace-cmd-report(1) --profile option. If a tracer -p
is not set, and function graph depth is supported by the kernel,
then the function_graph tracer will be enabled with a depth of one
(only show where userspace enters into the kernel). It will also
enable various tracepoints with stack tracing such that the report
can show where tasks have been blocked for the longest time.

See trace-cmd-profile(1) for more details and examples.

-G
Set interrupt (soft and hard) events as global (associated to CPU
instead of tasks). Only works for --profile.

-H event-hooks
Add custom event matching to connect any two events together. When
not used with --profile, it will save the parameter and this will
be used by trace-cmd report --profile, too. That is:

trace-cmd record -H hrtimer_expire_entry,hrtimer/hrtimer_expire_exit,hrtimer,sp
trace-cmd report --profile

Will profile hrtimer_expire_entry and hrtimer_expire_exit times.

See trace-cmd-profile(1) for format.

-S
(for --profile only) Only enable the tracer or events specified on
the command line. With this option, the function_graph tracer is
not enabled, nor are any events (like sched_switch), unless they
are specifically specified on the command line (i.e. -p function -e
sched_switch -e sched_wakeup)

--temp directory
When trace-cmd is recording the trace, it records the per CPU data
into a separate file for each CPU. At the end of the trace, these
files are concatenated onto the final trace.dat file. If the final
file is on a network file system, it may not be appropriate to copy
these temp files into the same location. --temp can be used to
tell trace-cmd where those temp files should be created.

--ts-offset offset
Add an offset for the timestamp in the trace.dat file. This will
add an offset option into the trace.dat file such that a trace-cmd
report will offset all the timestamps of the events by the given
offset. The offset is in raw units. That is, if the event
timestamps are in nanoseconds the offset will also be in
nanoseconds even if the displayed units are in microseconds.
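
Since the offset is given in raw units, a shift expressed in seconds has to be converted first. A small sketch, assuming the event timestamps of the selected clock are in nanoseconds (the 2 second shift is an arbitrary illustration):

```shell
# Assumption: event timestamps are in nanoseconds.
offset_sec=2
offset_ns=$(( offset_sec * 1000000000 ))

echo "--ts-offset $offset_ns"
```

The printed value is what would be passed on the trace-cmd record command line, even though the report may display times in microseconds.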

--tsync-interval
Set the loop interval, in ms, for timestamps synchronization with
guests. If a negative number is specified, timestamps
synchronization is disabled. If 0 is specified, no loop is
performed: the timestamps offset is calculated only twice, at the
beginning and at the end of the trace. Timestamps synchronization
with guests works only if there is support for VSOCK.

--tsc2nsec
Convert the current clock to nanoseconds, using tsc multiplier and
shift from the Linux kernel’s perf interface. This option does not
change the trace clock, just assumes that the tsc multiplier and
shift are applicable for the selected clock. You may use the "-C
tsc2nsec" clock, if not sure what clock to select.

--stderr
Have output go to stderr instead of stdout, but the output of the
command executed will not be changed. This is useful if you want to
monitor the output of the command being executed, but not see the
output from trace-cmd.

--poll
Waiting for data to be available on the trace ring-buffers may
trigger IPIs. This might generate unacceptable trace noise when
tracing low latency or real time systems. The poll option forces
trace-cmd to use O_NONBLOCK. Traces are extracted by busy waiting,
which will hog the CPUs, so only use when really needed.

--name
Give a specific name for the current agent being processed. Used
after -A to give the guest being traced a name. Useful when using
the vsocket ID instead of a name of the guest.

--verbose[=level]
Set the log level. Supported log levels are "none", "critical",
"error", "warning", "info", "debug", "all" or their identifiers
"0", "1", "2", "3", "4", "5", "6". Setting the log level to a
specific value enables all logs from that and all previous levels.
The level will default to "info" if one is not specified.

Example: enable all critical, error and warning logs

trace-cmd record --verbose=warning

--file-version
Desired version of the output file. Supported versions are 6 or 7.

--compression
Compression of the trace output file. One of these strings can be
passed:

'any' - auto select the best available compression algorithm

'none' - do not compress the trace file

'name' - the name of the desired compression algorithm. Available algorithms can be listed with
trace-cmd list -c

--proxy vsocket
Use a vsocket proxy to reach the agent. Acts the same as -A (for an
agent) but will send the proxy connection to the agent. It is
expected to run on a privileged guest that the host is aware of (as
denoted by the cid in the -P option for the agent).

EXAMPLES
The basic way to trace all events:

# trace-cmd record -e all ls > /dev/null
# trace-cmd report
trace-cmd-13541 [003] 106260.693809: filemap_fault: address=0x128122 offset=0xce
trace-cmd-13543 [001] 106260.693809: kmalloc: call_site=81128dd4 ptr=0xffff88003dd83800 bytes_req=768 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_ZERO
ls-13545 [002] 106260.693809: kfree: call_site=810a7abb ptr=0x0
ls-13545 [002] 106260.693818: sys_exit_write: 0x1

To use the function tracer with sched switch tracing:

# trace-cmd record -p function -e sched_switch ls > /dev/null
# trace-cmd report
ls-13587 [002] 106467.860310: function: hrtick_start_fair <-- pick_next_task_fair
ls-13587 [002] 106467.860313: sched_switch: prev_comm=trace-cmd prev_pid=13587 prev_prio=120 prev_state=R ==> next_comm=trace-cmd next_pid=13583 next_prio=120
trace-cmd-13585 [001] 106467.860314: function: native_set_pte_at <-- __do_fault
trace-cmd-13586 [003] 106467.860314: function: up_read <-- do_page_fault
ls-13587 [002] 106467.860317: function: __phys_addr <-- schedule
trace-cmd-13585 [001] 106467.860318: function: _raw_spin_unlock <-- __do_fault
ls-13587 [002] 106467.860320: function: native_load_sp0 <-- __switch_to
trace-cmd-13586 [003] 106467.860322: function: down_read_trylock <-- do_page_fault

Here is a nice way to find what interrupts have the highest latency:

# trace-cmd record -p function_graph -e irq_handler_entry -l do_IRQ sleep 10
# trace-cmd report
<idle>-0 [000] 157412.933969: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157412.933974: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157412.934004: funcgraph_exit: + 36.358 us | }
<idle>-0 [000] 157413.895004: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157413.895011: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157413.895026: funcgraph_exit: + 24.014 us | }
<idle>-0 [000] 157415.891762: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157415.891769: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157415.891784: funcgraph_exit: + 22.928 us | }
<idle>-0 [000] 157415.934869: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157415.934874: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157415.934906: funcgraph_exit: + 37.512 us | }
<idle>-0 [000] 157417.888373: funcgraph_entry: | do_IRQ() {
<idle>-0 [000] 157417.888381: irq_handler_entry: irq=48 name=eth0
<idle>-0 [000] 157417.888398: funcgraph_exit: + 25.943 us | }

An example of the profile:

# trace-cmd record --profile sleep 1
# trace-cmd report --profile --comm sleep
task: sleep-21611
Event: sched_switch:R (1) Total: 99442 Avg: 99442 Max: 99442 Min:99442
<stack> 1 total:99442 min:99442 max:99442 avg=99442
=> ftrace_raw_event_sched_switch (0xffffffff8105f812)
=> __schedule (0xffffffff8150810a)
=> preempt_schedule (0xffffffff8150842e)
=> ___preempt_schedule (0xffffffff81273354)
=> cpu_stop_queue_work (0xffffffff810b03c5)
=> stop_one_cpu (0xffffffff810b063b)
=> sched_exec (0xffffffff8106136d)
=> do_execve_common.isra.27 (0xffffffff81148c89)
=> do_execve (0xffffffff811490b0)
=> SyS_execve (0xffffffff811492c4)
=> return_to_handler (0xffffffff8150e3c8)
=> stub_execve (0xffffffff8150c699)
Event: sched_switch:S (1) Total: 1000506680 Avg: 1000506680 Max: 1000506680 Min:1000506680
<stack> 1 total:1000506680 min:1000506680 max:1000506680 avg=1000506680
=> ftrace_raw_event_sched_switch (0xffffffff8105f812)
=> __schedule (0xffffffff8150810a)
=> schedule (0xffffffff815084b8)
=> do_nanosleep (0xffffffff8150b22c)
=> hrtimer_nanosleep (0xffffffff8108d647)
=> SyS_nanosleep (0xffffffff8108d72c)
=> return_to_handler (0xffffffff8150e3c8)
=> tracesys_phase2 (0xffffffff8150c304)
Event: sched_wakeup:21611 (1) Total: 30326 Avg: 30326 Max: 30326 Min:30326
<stack> 1 total:30326 min:30326 max:30326 avg=30326
=> ftrace_raw_event_sched_wakeup_template (0xffffffff8105f653)
=> ttwu_do_wakeup (0xffffffff810606eb)
=> ttwu_do_activate.constprop.124 (0xffffffff810607c8)
=> try_to_wake_up (0xffffffff8106340a)

SEE ALSO
trace-cmd(1), trace-cmd-report(1), trace-cmd-start(1),
trace-cmd-stop(1), trace-cmd-extract(1), trace-cmd-reset(1),
trace-cmd-split(1), trace-cmd-list(1), trace-cmd-listen(1),
trace-cmd-profile(1)

AUTHOR
Written by Steven Rostedt, <rostedt@goodmis.org[1]>

RESOURCES
https://git.kernel.org/pub/scm/utils/trace-cmd/trace-cmd.git/

COPYING
Copyright (C) 2010 Red Hat, Inc. Free use of this software is granted
under the terms of the GNU Public License (GPL).

NOTES
1. rostedt@goodmis.org
mailto:rostedt@goodmis.org

libtracefs 10/11/2022 TRACE-CMD-RECORD(1)