PTRACE(2)                  Linux Programmer's Manual                 PTRACE(2)

NAME
ptrace - process trace

SYNOPSIS
#include <sys/ptrace.h>

long ptrace(enum __ptrace_request request, pid_t pid,
            void *addr, void *data);

DESCRIPTION
15 The ptrace() system call provides a means by which one process (the
16 "tracer") may observe and control the execution of another process (the
17 "tracee"), and examine and change the tracee's memory and registers.
18 It is primarily used to implement breakpoint debugging and system call
19 tracing.
20
21 A tracee first needs to be attached to the tracer. Attachment and sub‐
22 sequent commands are per thread: in a multithreaded process, every
23 thread can be individually attached to a (potentially different)
24 tracer, or left not attached and thus not debugged. Therefore,
25 "tracee" always means "(one) thread", never "a (possibly multithreaded)
26 process". Ptrace commands are always sent to a specific tracee using a
27 call of the form
28
29 ptrace(PTRACE_foo, pid, ...)
30
31 where pid is the thread ID of the corresponding Linux thread.
32
33 (Note that in this page, a "multithreaded process" means a thread group
34 consisting of threads created using the clone(2) CLONE_THREAD flag.)
35
36 A process can initiate a trace by calling fork(2) and having the
37 resulting child do a PTRACE_TRACEME, followed (typically) by an
38 execve(2). Alternatively, one process may commence tracing another
39 process using PTRACE_ATTACH or PTRACE_SEIZE.
40
41 While being traced, the tracee will stop each time a signal is deliv‐
42 ered, even if the signal is being ignored. (An exception is SIGKILL,
43 which has its usual effect.) The tracer will be notified at its next
44 call to waitpid(2) (or one of the related "wait" system calls); that
45 call will return a status value containing information that indicates
46 the cause of the stop in the tracee. While the tracee is stopped, the
47 tracer can use various ptrace requests to inspect and modify the
48 tracee. The tracer then causes the tracee to continue, optionally
49 ignoring the delivered signal (or even delivering a different signal
50 instead).
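The stop, inspect, and continue cycle described above can be sketched in C roughly as follows. This is a minimal illustration, not a reference implementation: the function name trace_until_exit() is invented for the example, and the child calls raise(SIGSTOP) instead of execve(2) so that the tracer gets a first stop to work with.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that asks to be traced, forward every
 * signal it receives, and return its exit code (-1 on error or if the
 * child was killed by a signal). */
int trace_until_exit(int child_exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Tracee: request tracing, then stop so the tracer can take over. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(child_exit_code);
    }

    for (;;) {
        int status;
        if (waitpid(pid, &status, __WALL) == -1)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        if (WIFSIGNALED(status))
            return -1;
        if (WIFSTOPPED(status)) {
            int sig = WSTOPSIG(status);
            /* Suppress the initial SIGSTOP; pass anything else through. */
            if (sig == SIGSTOP)
                sig = 0;
            if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) == -1)
                return -1;
        }
    }
}
```

Calling trace_until_exit(7) should return 7 on a system where the caller is permitted to trace its own children.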
51
52 If the PTRACE_O_TRACEEXEC option is not in effect, all successful calls
53 to execve(2) by the traced process will cause it to be sent a SIGTRAP
54 signal, giving the parent a chance to gain control before the new pro‐
55 gram begins execution.
56
57 When the tracer is finished tracing, it can cause the tracee to con‐
58 tinue executing in a normal, untraced mode via PTRACE_DETACH.
59
60 The value of request determines the action to be performed:
61
62 PTRACE_TRACEME
63 Indicate that this process is to be traced by its parent. A
64 process probably shouldn't make this request if its parent isn't
65 expecting to trace it. (pid, addr, and data are ignored.)
66
67 The PTRACE_TRACEME request is used only by the tracee; the
68 remaining requests are used only by the tracer. In the follow‐
69 ing requests, pid specifies the thread ID of the tracee to be
70 acted on. For requests other than PTRACE_ATTACH, PTRACE_SEIZE,
71 PTRACE_INTERRUPT, and PTRACE_KILL, the tracee must be stopped.
72
73 PTRACE_PEEKTEXT, PTRACE_PEEKDATA
74 Read a word at the address addr in the tracee's memory, return‐
75 ing the word as the result of the ptrace() call. Linux does not
76 have separate text and data address spaces, so these two
77 requests are currently equivalent. (data is ignored; but see
78 NOTES.)
79
80 PTRACE_PEEKUSER
81 Read a word at offset addr in the tracee's USER area, which
82 holds the registers and other information about the process (see
83 <sys/user.h>). The word is returned as the result of the
84 ptrace() call. Typically, the offset must be word-aligned,
85 though this might vary by architecture. See NOTES. (data is
86 ignored; but see NOTES.)
87
88 PTRACE_POKETEXT, PTRACE_POKEDATA
89 Copy the word data to the address addr in the tracee's memory.
90 As for PTRACE_PEEKTEXT and PTRACE_PEEKDATA, these two requests
91 are currently equivalent.
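As a hedged illustration of the peek and poke requests, the sketch below reads one word from a stopped child and then overwrites it. The names shared_word and peek_then_poke() are invented for the example. Because -1 is a legitimate word value, the sketch follows the usual convention of clearing errno before a peek and checking it afterward.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One machine word that the child reports back as its exit code. */
static volatile long shared_word = 11;

/* Hypothetical helper: store the word read from the child in *old_value,
 * overwrite it with new_value, and return the child's exit code.
 * Returns -1 on any failure. */
int peek_then_poke(long new_value, long *old_value)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);               /* wait here for the tracer */
        _exit((int)shared_word);      /* report the (possibly poked) word */
    }

    int status;
    if (waitpid(pid, &status, __WALL) == -1 || !WIFSTOPPED(status))
        return -1;

    /* After fork(2), the child has shared_word at the same address. */
    void *addr = (void *)&shared_word;

    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (word == -1 && errno != 0)
        goto fail;
    *old_value = word;

    if (ptrace(PTRACE_POKEDATA, pid, addr, (void *)new_value) == -1)
        goto fail;
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        goto fail;
    if (waitpid(pid, &status, __WALL) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);

fail:
    kill(pid, SIGKILL);
    waitpid(pid, &status, __WALL);
    return -1;
}
```

With the initial value 11 used here, peek_then_poke(42, &old) is expected to return 42 and leave 11 in old.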
92
93 PTRACE_POKEUSER
94 Copy the word data to offset addr in the tracee's USER area. As
95 for PTRACE_PEEKUSER, the offset must typically be word-aligned.
96 In order to maintain the integrity of the kernel, some modifica‐
97 tions to the USER area are disallowed.
98
99 PTRACE_GETREGS, PTRACE_GETFPREGS
100 Copy the tracee's general-purpose or floating-point registers,
101 respectively, to the address data in the tracer. See
102 <sys/user.h> for information on the format of this data. (addr
103 is ignored.) Note that SPARC systems have the meaning of data
104 and addr reversed; that is, data is ignored and the registers
105 are copied to the address addr. PTRACE_GETREGS and PTRACE_GETF‐
106 PREGS are not present on all architectures.
107
108 PTRACE_GETREGSET (since Linux 2.6.34)
109 Read the tracee's registers. addr specifies, in an architec‐
110 ture-dependent way, the type of registers to be read. NT_PRSTA‐
111 TUS (with numerical value 1) usually results in reading of gen‐
112 eral-purpose registers. If the CPU has, for example, floating-
113 point and/or vector registers, they can be retrieved by setting
addr to the corresponding NT_foo constant. data points to a
struct iovec, which describes the destination buffer's location
and length. On return, the kernel modifies the iov_len field of that
structure to indicate the actual number of bytes returned.
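The struct iovec handshake might look like the following sketch. It assumes an architecture on which <sys/user.h> defines struct user_regs_struct (x86 and AArch64 with glibc, for example); other architectures use different structure names. The helper names read_gp_regs() and demo_read_gp_regs() are invented for the example.

```c
#define _GNU_SOURCE
#include <elf.h>          /* NT_PRSTATUS */
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec */
#include <sys/user.h>     /* struct user_regs_struct (assumed available) */
#include <sys/wait.h>
#include <unistd.h>

/* Read the general-purpose registers of a tracee that is in ptrace-stop.
 * Returns the number of bytes the kernel stored, or -1 on error. */
long read_gp_regs(pid_t tracee, struct user_regs_struct *regs)
{
    struct iovec iov = { .iov_base = regs, .iov_len = sizeof(*regs) };

    if (ptrace(PTRACE_GETREGSET, tracee,
               (void *)(long)NT_PRSTATUS, &iov) == -1)
        return -1;
    return (long)iov.iov_len;    /* updated by the kernel */
}

/* Demo: stop a traced child, read its registers once, then kill it. */
long demo_read_gp_regs(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(0);
    }

    int status;
    long n = -1;
    struct user_regs_struct regs;

    if (waitpid(pid, &status, __WALL) == -1)
        return -1;
    if (WIFSTOPPED(status)) {
        n = read_gp_regs(pid, &regs);
        kill(pid, SIGKILL);
        waitpid(pid, &status, __WALL);
    }
    return n;
}
```

The returned length can be smaller than the buffer, so callers should rely on it rather than on sizeof(regs).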
118
119 PTRACE_SETREGS, PTRACE_SETFPREGS
120 Modify the tracee's general-purpose or floating-point registers,
121 respectively, from the address data in the tracer. As for
122 PTRACE_POKEUSER, some general-purpose register modifications may
123 be disallowed. (addr is ignored.) Note that SPARC systems have
124 the meaning of data and addr reversed; that is, data is ignored
125 and the registers are copied from the address addr.
126 PTRACE_SETREGS and PTRACE_SETFPREGS are not present on all
127 architectures.
128
129 PTRACE_SETREGSET (since Linux 2.6.34)
130 Modify the tracee's registers. The meaning of addr and data is
131 analogous to PTRACE_GETREGSET.
132
133 PTRACE_GETSIGINFO (since Linux 2.3.99-pre6)
134 Retrieve information about the signal that caused the stop.
135 Copy a siginfo_t structure (see sigaction(2)) from the tracee to
136 the address data in the tracer. (addr is ignored.)
137
138 PTRACE_SETSIGINFO (since Linux 2.3.99-pre6)
139 Set signal information: copy a siginfo_t structure from the
140 address data in the tracer to the tracee. This will affect only
141 signals that would normally be delivered to the tracee and were
142 caught by the tracer. It may be difficult to tell these normal
143 signals from synthetic signals generated by ptrace() itself.
144 (addr is ignored.)
145
146 PTRACE_PEEKSIGINFO (since Linux 3.10)
147 Retrieve siginfo_t structures without removing signals from a
148 queue. addr points to a ptrace_peeksiginfo_args structure that
149 specifies the ordinal position from which copying of signals
150 should start, and the number of signals to copy. siginfo_t
151 structures are copied into the buffer pointed to by data. The
152 return value contains the number of copied signals (zero indi‐
153 cates that there is no signal corresponding to the specified
ordinal position). Within the returned siginfo structures, the
si_code field includes information (__SI_CHLD, __SI_FAULT, etc.)
that is not otherwise exposed to user space.
157
158 struct ptrace_peeksiginfo_args {
159 u64 off; /* Ordinal position in queue at which
160 to start copying signals */
161 u32 flags; /* PTRACE_PEEKSIGINFO_SHARED or 0 */
162 s32 nr; /* Number of signals to copy */
163 };
164
165 Currently, there is only one flag, PTRACE_PEEKSIGINFO_SHARED,
166 for dumping signals from the process-wide signal queue. If this
167 flag is not set, signals are read from the per-thread queue of
168 the specified thread.
169
170 PTRACE_GETSIGMASK (since Linux 3.11)
171 Place a copy of the mask of blocked signals (see sigprocmask(2))
172 in the buffer pointed to by data, which should be a pointer to a
173 buffer of type sigset_t. The addr argument contains the size of
174 the buffer pointed to by data (i.e., sizeof(sigset_t)).
175
176 PTRACE_SETSIGMASK (since Linux 3.11)
177 Change the mask of blocked signals (see sigprocmask(2)) to the
178 value specified in the buffer pointed to by data, which should
179 be a pointer to a buffer of type sigset_t. The addr argument
180 contains the size of the buffer pointed to by data (i.e.,
181 sizeof(sigset_t)).
182
183 PTRACE_SETOPTIONS (since Linux 2.4.6; see BUGS for caveats)
184 Set ptrace options from data. (addr is ignored.) data is
185 interpreted as a bit mask of options, which are specified by the
186 following flags:
187
188 PTRACE_O_EXITKILL (since Linux 3.8)
189 Send a SIGKILL signal to the tracee if the tracer exits.
190 This option is useful for ptrace jailers that want to
191 ensure that tracees can never escape the tracer's con‐
192 trol.
193
194 PTRACE_O_TRACECLONE (since Linux 2.5.46)
195 Stop the tracee at the next clone(2) and automatically
196 start tracing the newly cloned process, which will start
197 with a SIGSTOP, or PTRACE_EVENT_STOP if PTRACE_SEIZE was
198 used. A waitpid(2) by the tracer will return a status
199 value such that
200
201 status>>8 == (SIGTRAP | (PTRACE_EVENT_CLONE<<8))
202
203 The PID of the new process can be retrieved with
204 PTRACE_GETEVENTMSG.
205
206 This option may not catch clone(2) calls in all cases.
207 If the tracee calls clone(2) with the CLONE_VFORK flag,
208 PTRACE_EVENT_VFORK will be delivered instead if
209 PTRACE_O_TRACEVFORK is set; otherwise if the tracee calls
210 clone(2) with the exit signal set to SIGCHLD,
211 PTRACE_EVENT_FORK will be delivered if PTRACE_O_TRACEFORK
212 is set.
213
214 PTRACE_O_TRACEEXEC (since Linux 2.5.46)
215 Stop the tracee at the next execve(2). A waitpid(2) by
216 the tracer will return a status value such that
217
218 status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))
219
If the execing thread is not a thread group leader, the
thread ID is reset to the thread group leader's ID before
this stop. Since Linux 3.0, the former thread ID can be
retrieved with PTRACE_GETEVENTMSG.
224
225 PTRACE_O_TRACEEXIT (since Linux 2.5.60)
226 Stop the tracee at exit. A waitpid(2) by the tracer will
227 return a status value such that
228
229 status>>8 == (SIGTRAP | (PTRACE_EVENT_EXIT<<8))
230
231 The tracee's exit status can be retrieved with
232 PTRACE_GETEVENTMSG.
233
234 The tracee is stopped early during process exit, when
235 registers are still available, allowing the tracer to see
236 where the exit occurred, whereas the normal exit notifi‐
237 cation is done after the process is finished exiting.
238 Even though context is available, the tracer cannot pre‐
239 vent the exit from happening at this point.
240
241 PTRACE_O_TRACEFORK (since Linux 2.5.46)
242 Stop the tracee at the next fork(2) and automatically
243 start tracing the newly forked process, which will start
244 with a SIGSTOP, or PTRACE_EVENT_STOP if PTRACE_SEIZE was
245 used. A waitpid(2) by the tracer will return a status
246 value such that
247
248 status>>8 == (SIGTRAP | (PTRACE_EVENT_FORK<<8))
249
250 The PID of the new process can be retrieved with
251 PTRACE_GETEVENTMSG.
252
253 PTRACE_O_TRACESYSGOOD (since Linux 2.4.6)
254 When delivering system call traps, set bit 7 in the sig‐
255 nal number (i.e., deliver SIGTRAP|0x80). This makes it
256 easy for the tracer to distinguish normal traps from
257 those caused by a system call.
258
259 PTRACE_O_TRACEVFORK (since Linux 2.5.46)
260 Stop the tracee at the next vfork(2) and automatically
261 start tracing the newly vforked process, which will start
262 with a SIGSTOP, or PTRACE_EVENT_STOP if PTRACE_SEIZE was
263 used. A waitpid(2) by the tracer will return a status
264 value such that
265
266 status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK<<8))
267
268 The PID of the new process can be retrieved with
269 PTRACE_GETEVENTMSG.
270
271 PTRACE_O_TRACEVFORKDONE (since Linux 2.5.60)
272 Stop the tracee at the completion of the next vfork(2).
273 A waitpid(2) by the tracer will return a status value
274 such that
275
276 status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK_DONE<<8))
277
278 The PID of the new process can (since Linux 2.6.18) be
279 retrieved with PTRACE_GETEVENTMSG.
280
281 PTRACE_O_TRACESECCOMP (since Linux 3.5)
282 Stop the tracee when a seccomp(2) SECCOMP_RET_TRACE rule
283 is triggered. A waitpid(2) by the tracer will return a
284 status value such that
285
286 status>>8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP<<8))
287
288 While this triggers a PTRACE_EVENT stop, it is similar to
289 a syscall-enter-stop. For details, see the note on
290 PTRACE_EVENT_SECCOMP below. The seccomp event message
291 data (from the SECCOMP_RET_DATA portion of the seccomp
292 filter rule) can be retrieved with PTRACE_GETEVENTMSG.
293
294 PTRACE_O_SUSPEND_SECCOMP (since Linux 4.3)
295 Suspend the tracee's seccomp protections. This applies
296 regardless of mode, and can be used when the tracee has
297 not yet installed seccomp filters. That is, a valid use
298 case is to suspend a tracee's seccomp protections before
299 they are installed by the tracee, let the tracee install
300 the filters, and then clear this flag when the filters
301 should be resumed. Setting this option requires that the
302 tracer have the CAP_SYS_ADMIN capability, not have any
303 seccomp protections installed, and not have PTRACE_O_SUS‐
304 PEND_SECCOMP set on itself.
305
306 PTRACE_GETEVENTMSG (since Linux 2.5.46)
307 Retrieve a message (as an unsigned long) about the ptrace event
308 that just happened, placing it at the address data in the
309 tracer. For PTRACE_EVENT_EXIT, this is the tracee's exit sta‐
310 tus. For PTRACE_EVENT_FORK, PTRACE_EVENT_VFORK,
311 PTRACE_EVENT_VFORK_DONE, and PTRACE_EVENT_CLONE, this is the PID
312 of the new process. For PTRACE_EVENT_SECCOMP, this is the sec‐
313 comp(2) filter's SECCOMP_RET_DATA associated with the triggered
314 rule. (addr is ignored.)
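To show how an option and PTRACE_GETEVENTMSG fit together, the sketch below enables PTRACE_O_TRACEEXIT and reads the event message at the resulting stop. The helper names are invented for the example. The sketch also assumes that the exit-status message uses the same encoding as a wait status, which is why it is decoded with WEXITSTATUS(); treat that as an assumption to verify on the target kernel.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill and reap a tracee after an unexpected failure. */
static int give_up(pid_t pid)
{
    int status;
    kill(pid, SIGKILL);
    waitpid(pid, &status, __WALL);
    return -1;
}

/* Hypothetical helper: return the exit code observed at the
 * PTRACE_EVENT_EXIT stop, or -1 on failure. */
int exit_code_from_event(int child_exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(child_exit_code);
    }

    int status;
    if (waitpid(pid, &status, __WALL) == -1 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACEEXIT) == -1 ||
        ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        return give_up(pid);

    if (waitpid(pid, &status, __WALL) == -1 || !WIFSTOPPED(status))
        return -1;
    if (status >> 8 != (SIGTRAP | (PTRACE_EVENT_EXIT << 8)))
        return give_up(pid);

    unsigned long msg = 0;
    if (ptrace(PTRACE_GETEVENTMSG, pid, NULL, &msg) == -1)
        return give_up(pid);

    /* The exit cannot be prevented; let it finish and reap the child. */
    ptrace(PTRACE_CONT, pid, NULL, NULL);
    waitpid(pid, &status, __WALL);

    int event_status = (int)msg;   /* assumed to be wait-status encoded */
    return WEXITSTATUS(event_status);
}
```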
315
316 PTRACE_CONT
317 Restart the stopped tracee process. If data is nonzero, it is
318 interpreted as the number of a signal to be delivered to the
319 tracee; otherwise, no signal is delivered. Thus, for example,
320 the tracer can control whether a signal sent to the tracee is
321 delivered or not. (addr is ignored.)
322
323 PTRACE_SYSCALL, PTRACE_SINGLESTEP
324 Restart the stopped tracee as for PTRACE_CONT, but arrange for
325 the tracee to be stopped at the next entry to or exit from a
326 system call, or after execution of a single instruction, respec‐
327 tively. (The tracee will also, as usual, be stopped upon
328 receipt of a signal.) From the tracer's perspective, the tracee
329 will appear to have been stopped by receipt of a SIGTRAP. So,
330 for PTRACE_SYSCALL, for example, the idea is to inspect the
331 arguments to the system call at the first stop, then do another
332 PTRACE_SYSCALL and inspect the return value of the system call
333 at the second stop. The data argument is treated as for
334 PTRACE_CONT. (addr is ignored.)
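A PTRACE_SYSCALL loop can be sketched as follows. To stay architecture-independent, this example only counts syscall-stops instead of decoding registers. It sets PTRACE_O_TRACESYSGOOD so that syscall-stops can be told apart from other SIGTRAP stops. The function names are invented for the example, and the exact count depends on which system calls the C library issues.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Kill and reap a tracee after an unexpected failure. */
static int abandon(pid_t pid)
{
    int status;
    kill(pid, SIGKILL);
    waitpid(pid, &status, __WALL);
    return -1;
}

/* Hypothetical helper: count the syscall-stops (entries plus exits) of a
 * traced child. Returns -1 on error. */
int count_syscall_stops(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        syscall(SYS_getpid);     /* one entry stop and one exit stop */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, __WALL) == -1 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1)
        return abandon(pid);

    int stops = 0, sig = 0;      /* sig 0 suppresses the initial SIGSTOP */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)(long)sig) == -1)
            return abandon(pid);
        if (waitpid(pid, &status, __WALL) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            return stops;        /* tracee exited or was killed */
        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            stops++;             /* syscall-enter-stop or syscall-exit-stop */
            sig = 0;
        } else {
            sig = WSTOPSIG(status);   /* a real signal: pass it on */
        }
    }
}
```

A real tracer would read the system call arguments at each entry stop and the return value at the matching exit stop, using one of the register requests described above.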
335
336 PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP (since Linux 2.6.14)
337 For PTRACE_SYSEMU, continue and stop on entry to the next system
338 call, which will not be executed. See the documentation on
339 syscall-stops below. For PTRACE_SYSEMU_SINGLESTEP, do the same
340 but also singlestep if not a system call. This call is used by
341 programs like User Mode Linux that want to emulate all the
342 tracee's system calls. The data argument is treated as for
343 PTRACE_CONT. The addr argument is ignored. These requests are
344 currently supported only on x86.
345
346 PTRACE_LISTEN (since Linux 3.4)
347 Restart the stopped tracee, but prevent it from executing. The
348 resulting state of the tracee is similar to a process which has
349 been stopped by a SIGSTOP (or other stopping signal). See the
350 "group-stop" subsection for additional information. PTRACE_LIS‐
351 TEN works only on tracees attached by PTRACE_SEIZE.
352
353 PTRACE_KILL
354 Send the tracee a SIGKILL to terminate it. (addr and data are
355 ignored.)
356
357 This operation is deprecated; do not use it! Instead, send a
358 SIGKILL directly using kill(2) or tgkill(2). The problem with
359 PTRACE_KILL is that it requires the tracee to be in signal-
360 delivery-stop, otherwise it may not work (i.e., may complete
361 successfully but won't kill the tracee). By contrast, sending a
362 SIGKILL directly has no such limitation.
363
364 PTRACE_INTERRUPT (since Linux 3.4)
365 Stop a tracee. If the tracee is running or sleeping in kernel
366 space and PTRACE_SYSCALL is in effect, the system call is inter‐
367 rupted and syscall-exit-stop is reported. (The interrupted sys‐
368 tem call is restarted when the tracee is restarted.) If the
369 tracee was already stopped by a signal and PTRACE_LISTEN was
370 sent to it, the tracee stops with PTRACE_EVENT_STOP and WSTOP‐
371 SIG(status) returns the stop signal. If any other ptrace-stop
372 is generated at the same time (for example, if a signal is sent
373 to the tracee), this ptrace-stop happens. If none of the above
374 applies (for example, if the tracee is running in user space),
375 it stops with PTRACE_EVENT_STOP with WSTOPSIG(status) == SIG‐
376 TRAP. PTRACE_INTERRUPT only works on tracees attached by
377 PTRACE_SEIZE.
378
379 PTRACE_ATTACH
380 Attach to the process specified in pid, making it a tracee of
381 the calling process. The tracee is sent a SIGSTOP, but will not
382 necessarily have stopped by the completion of this call; use
383 waitpid(2) to wait for the tracee to stop. See the "Attaching
384 and detaching" subsection for additional information. (addr and
385 data are ignored.)
386
387 Permission to perform a PTRACE_ATTACH is governed by a ptrace
388 access mode PTRACE_MODE_ATTACH_REALCREDS check; see below.
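The attach sequence (attach, wait for the stop, then work with the tracee) might be sketched like this. The function name is invented for the example, and the target is a child forked by the tracer itself, which keeps the example self-contained and is typically permitted by the access checks.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attach to a running child, wait for the stop that
 * the attach triggers, detach again, and return the stop signal seen
 * (expected to be SIGSTOP). Returns -1 on failure. */
int attach_then_detach(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();             /* an ordinary, untraced process */
    }

    int status, sig = -1;

    /* PTRACE_ATTACH returns before the tracee has necessarily stopped,
     * so the tracer must wait for the stop explicitly. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == 0 &&
        waitpid(pid, &status, __WALL) == pid &&
        WIFSTOPPED(status)) {
        sig = WSTOPSIG(status);
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
    }

    /* Clean up the helper child. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return sig;
}
```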
389
390 PTRACE_SEIZE (since Linux 3.4)
391 Attach to the process specified in pid, making it a tracee of
392 the calling process. Unlike PTRACE_ATTACH, PTRACE_SEIZE does
393 not stop the process. Group-stops are reported as
394 PTRACE_EVENT_STOP and WSTOPSIG(status) returns the stop signal.
395 Automatically attached children stop with PTRACE_EVENT_STOP and
396 WSTOPSIG(status) returns SIGTRAP instead of having SIGSTOP sig‐
397 nal delivered to them. execve(2) does not deliver an extra SIG‐
398 TRAP. Only a PTRACE_SEIZEd process can accept PTRACE_INTERRUPT
399 and PTRACE_LISTEN commands. The "seized" behavior just
400 described is inherited by children that are automatically
401 attached using PTRACE_O_TRACEFORK, PTRACE_O_TRACEVFORK, and
402 PTRACE_O_TRACECLONE. addr must be zero. data contains a bit
403 mask of ptrace options to activate immediately.
404
405 Permission to perform a PTRACE_SEIZE is governed by a ptrace
406 access mode PTRACE_MODE_ATTACH_REALCREDS check; see below.
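By way of contrast with PTRACE_ATTACH, the sketch below seizes a child without stopping it and then requests a stop with PTRACE_INTERRUPT. The function name is invented for the example. It passes no options in data and expects the stop to be reported as PTRACE_EVENT_STOP.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: seize a running child, interrupt it, and return
 * the ptrace event number reported for the resulting stop
 * (expected to be PTRACE_EVENT_STOP). Returns -1 on failure. */
int seize_then_interrupt(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();
    }

    int status, event = -1;

    /* addr must be zero; data carries the initial option mask (none). */
    if (ptrace(PTRACE_SEIZE, pid, NULL, NULL) == 0 &&
        ptrace(PTRACE_INTERRUPT, pid, NULL, NULL) == 0 &&
        waitpid(pid, &status, __WALL) == pid &&
        WIFSTOPPED(status))
        event = status >> 16;    /* upper part of status>>8 */

    kill(pid, SIGKILL);
    waitpid(pid, &status, __WALL);
    return event;
}
```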
407
408 PTRACE_SECCOMP_GET_FILTER (since Linux 4.4)
409 This operation allows the tracer to dump the tracee's classic
410 BPF filters.
411
412 addr is an integer specifying the index of the filter to be
413 dumped. The most recently installed filter has the index 0. If
414 addr is greater than the number of installed filters, the opera‐
415 tion fails with the error ENOENT.
416
417 data is either a pointer to a struct sock_filter array that is
418 large enough to store the BPF program, or NULL if the program is
419 not to be stored.
420
421 Upon success, the return value is the number of instructions in
422 the BPF program. If data was NULL, then this return value can
423 be used to correctly size the struct sock_filter array passed in
424 a subsequent call.
425
426 This operation fails with the error EACCES if the caller does
427 not have the CAP_SYS_ADMIN capability or if the caller is in
428 strict or filter seccomp mode. If the filter referred to by
429 addr is not a classic BPF filter, the operation fails with the
430 error EMEDIUMTYPE.
431
432 This operation is available if the kernel was configured with
433 both the CONFIG_SECCOMP_FILTER and the CONFIG_CHECKPOINT_RESTORE
434 options.
435
436 PTRACE_DETACH
437 Restart the stopped tracee as for PTRACE_CONT, but first detach
438 from it. Under Linux, a tracee can be detached in this way
439 regardless of which method was used to initiate tracing. (addr
440 is ignored.)
441
442 PTRACE_GET_THREAD_AREA (since Linux 2.6.0)
443 This operation performs a similar task to get_thread_area(2).
444 It reads the TLS entry in the GDT whose index is given in addr,
445 placing a copy of the entry into the struct user_desc pointed to
446 by data. (By contrast with get_thread_area(2), the entry_number
447 of the struct user_desc is ignored.)
448
449 PTRACE_SET_THREAD_AREA (since Linux 2.6.0)
450 This operation performs a similar task to set_thread_area(2).
451 It sets the TLS entry in the GDT whose index is given in addr,
452 assigning it the data supplied in the struct user_desc pointed
453 to by data. (By contrast with set_thread_area(2), the
454 entry_number of the struct user_desc is ignored; in other words,
455 this ptrace operation can't be used to allocate a free TLS
456 entry.)
457
458 PTRACE_GET_SYSCALL_INFO (since Linux 5.3)
459 Retrieve information about the system call that caused the stop.
The information is placed into the buffer pointed to by the data
461 argument, which should be a pointer to a buffer of type struct
462 ptrace_syscall_info. The addr argument contains the size of the
463 buffer pointed to by the data argument (i.e., sizeof(struct
464 ptrace_syscall_info)). The return value contains the number of
465 bytes available to be written by the kernel. If the size of the
466 data to be written by the kernel exceeds the size specified by
467 the addr argument, the output data is truncated.
468
469 The ptrace_syscall_info structure contains the following fields:
470
471 struct ptrace_syscall_info {
472 __u8 op; /* Type of system call stop */
473 __u32 arch; /* AUDIT_ARCH_* value; see seccomp(2) */
474 __u64 instruction_pointer; /* CPU instruction pointer */
475 __u64 stack_pointer; /* CPU stack pointer */
476 union {
477 struct { /* op == PTRACE_SYSCALL_INFO_ENTRY */
478 __u64 nr; /* System call number */
479 __u64 args[6]; /* System call arguments */
480 } entry;
481 struct { /* op == PTRACE_SYSCALL_INFO_EXIT */
482 __s64 rval; /* System call return value */
483 __u8 is_error; /* System call error flag;
484 Boolean: does rval contain
485 an error value (-ERRCODE) or
486 a nonerror return value? */
487 } exit;
488 struct { /* op == PTRACE_SYSCALL_INFO_SECCOMP */
489 __u64 nr; /* System call number */
490 __u64 args[6]; /* System call arguments */
491 __u32 ret_data; /* SECCOMP_RET_DATA portion
492 of SECCOMP_RET_TRACE
493 return value */
494 } seccomp;
495 };
496 };
497
498 The op, arch, instruction_pointer, and stack_pointer fields are
499 defined for all kinds of ptrace system call stops. The rest of
500 the structure is a union; one should read only those fields that
501 are meaningful for the kind of system call stop specified by the
502 op field.
503
504 The op field has one of the following values (defined in
505 <linux/ptrace.h>) indicating what type of stop occurred and
506 which part of the union is filled:
507
508 PTRACE_SYSCALL_INFO_ENTRY
509 The entry component of the union contains information
510 relating to a system call entry stop.
511
512 PTRACE_SYSCALL_INFO_EXIT
513 The exit component of the union contains information
514 relating to a system call exit stop.
515
PTRACE_SYSCALL_INFO_SECCOMP
The seccomp component of the union contains information
relating to a PTRACE_EVENT_SECCOMP stop.
519
520 PTRACE_SYSCALL_INFO_NONE
521 No component of the union contains relevant information.
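The rule that only the union member selected by op may be read can be sketched as a pair of accessors. The function names are invented for the example. The sketch includes only <linux/ptrace.h> and assumes kernel headers from Linux 5.3 or later.

```c
#include <linux/ptrace.h>   /* struct ptrace_syscall_info and the
                               PTRACE_SYSCALL_INFO_* constants */

/* Return the system call number carried by this stop, or -1 if the stop
 * type does not carry one. */
long long syscall_nr_of(const struct ptrace_syscall_info *info)
{
    switch (info->op) {
    case PTRACE_SYSCALL_INFO_ENTRY:
        return (long long)info->entry.nr;
    case PTRACE_SYSCALL_INFO_SECCOMP:
        return (long long)info->seccomp.nr;
    default:                 /* EXIT and NONE carry no number */
        return -1;
    }
}

/* If this is a system call exit stop, store the return value in *rval
 * and return 1; otherwise leave *rval untouched and return 0. */
int syscall_result_of(const struct ptrace_syscall_info *info, long long *rval)
{
    if (info->op != PTRACE_SYSCALL_INFO_EXIT)
        return 0;
    *rval = (long long)info->exit.rval;
    return 1;
}
```

A tracer would fill the structure with PTRACE_GET_SYSCALL_INFO while the tracee is in a system call stop, then pass it to helpers such as these.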
522
523 Death under ptrace
524 When a (possibly multithreaded) process receives a killing signal (one
525 whose disposition is set to SIG_DFL and whose default action is to kill
526 the process), all threads exit. Tracees report their death to their
527 tracer(s). Notification of this event is delivered via waitpid(2).
528
529 Note that the killing signal will first cause signal-delivery-stop (on
530 one tracee only), and only after it is injected by the tracer (or after
531 it was dispatched to a thread which isn't traced), will death from the
532 signal happen on all tracees within a multithreaded process. (The term
533 "signal-delivery-stop" is explained below.)
534
535 SIGKILL does not generate signal-delivery-stop and therefore the tracer
536 can't suppress it. SIGKILL kills even within system calls (syscall-
537 exit-stop is not generated prior to death by SIGKILL). The net effect
538 is that SIGKILL always kills the process (all its threads), even if
539 some threads of the process are ptraced.
540
541 When the tracee calls _exit(2), it reports its death to its tracer.
542 Other threads are not affected.
543
544 When any thread executes exit_group(2), every tracee in its thread
545 group reports its death to its tracer.
546
547 If the PTRACE_O_TRACEEXIT option is on, PTRACE_EVENT_EXIT will happen
548 before actual death. This applies to exits via exit(2), exit_group(2),
549 and signal deaths (except SIGKILL, depending on the kernel version; see
550 BUGS below), and when threads are torn down on execve(2) in a multi‐
551 threaded process.
552
553 The tracer cannot assume that the ptrace-stopped tracee exists. There
554 are many scenarios when the tracee may die while stopped (such as
555 SIGKILL). Therefore, the tracer must be prepared to handle an ESRCH
556 error on any ptrace operation. Unfortunately, the same error is
557 returned if the tracee exists but is not ptrace-stopped (for commands
558 which require a stopped tracee), or if it is not traced by the process
559 which issued the ptrace call. The tracer needs to keep track of the
560 stopped/running state of the tracee, and interpret ESRCH as "tracee
561 died unexpectedly" only if it knows that the tracee has been observed
562 to enter ptrace-stop. Note that there is no guarantee that wait‐
563 pid(WNOHANG) will reliably report the tracee's death status if a ptrace
564 operation returned ESRCH. waitpid(WNOHANG) may return 0 instead. In
565 other words, the tracee may be "not yet fully dead", but already refus‐
566 ing ptrace requests.
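One way to apply this advice is to wrap restart requests in a helper that interprets ESRCH in light of the tracer's own bookkeeping. The sketch below is only an illustration: the enum, the function name, and the known_stopped flag are invented for the example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

enum tracee_fate {
    TRACEE_OK,       /* request succeeded */
    TRACEE_GONE,     /* ESRCH on a tracee last seen in ptrace-stop */
    TRACEE_ERROR     /* not stopped, not our tracee, or another error */
};

/* Restart a tracee with PTRACE_CONT. known_stopped must be nonzero only
 * if the tracer has observed this tracee entering ptrace-stop and has not
 * restarted it since. */
enum tracee_fate cont_tracee(pid_t pid, int sig, int known_stopped)
{
    errno = 0;
    if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) == 0)
        return TRACEE_OK;

    /* ESRCH alone is ambiguous; only the tracer's own state tracking
     * allows it to be read as "tracee died unexpectedly". */
    if (errno == ESRCH && known_stopped)
        return TRACEE_GONE;
    return TRACEE_ERROR;
}
```

After a TRACEE_GONE result, the tracer should still be prepared for waitpid(2) with WNOHANG to return 0 for a while.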
567
568 The tracer can't assume that the tracee always ends its life by report‐
569 ing WIFEXITED(status) or WIFSIGNALED(status); there are cases where
this does not occur. For example, if a thread other than the thread
group leader does an execve(2), it disappears; its PID will never be seen
572 again, and any subsequent ptrace stops will be reported under the
573 thread group leader's PID.
574
575 Stopped states
576 A tracee can be in two states: running or stopped. For the purposes of
577 ptrace, a tracee which is blocked in a system call (such as read(2),
578 pause(2), etc.) is nevertheless considered to be running, even if the
579 tracee is blocked for a long time. The state of the tracee after
580 PTRACE_LISTEN is somewhat of a gray area: it is not in any ptrace-stop
581 (ptrace commands won't work on it, and it will deliver waitpid(2) noti‐
582 fications), but it also may be considered "stopped" because it is not
583 executing instructions (is not scheduled), and if it was in group-stop
584 before PTRACE_LISTEN, it will not respond to signals until SIGCONT is
585 received.
586
A tracee can be stopped in several distinct states, and discussions of
ptrace often conflate them. Therefore, it is important to use precise
terms.
590
591 In this manual page, any stopped state in which the tracee is ready to
592 accept ptrace commands from the tracer is called ptrace-stop. Ptrace-
593 stops can be further subdivided into signal-delivery-stop, group-stop,
594 syscall-stop, PTRACE_EVENT stops, and so on. These stopped states are
595 described in detail below.
596
597 When the running tracee enters ptrace-stop, it notifies its tracer
598 using waitpid(2) (or one of the other "wait" system calls). Most of
599 this manual page assumes that the tracer waits with:
600
601 pid = waitpid(pid_or_minus_1, &status, __WALL);
602
603 Ptrace-stopped tracees are reported as returns with pid greater than 0
604 and WIFSTOPPED(status) true.
605
606 The __WALL flag does not include the WSTOPPED and WEXITED flags, but
607 implies their functionality.
608
609 Setting the WCONTINUED flag when calling waitpid(2) is not recommended:
610 the "continued" state is per-process and consuming it can confuse the
611 real parent of the tracee.
612
613 Use of the WNOHANG flag may cause waitpid(2) to return 0 ("no wait
614 results available yet") even if the tracer knows there should be a
615 notification. Example:
616
    errno = 0;
    ptrace(PTRACE_CONT, pid, 0L, 0L);
    if (errno == ESRCH) {
        /* tracee is dead */
        r = waitpid(pid, &status, __WALL | WNOHANG);
        /* r can still be 0 here! */
    }
624
625 The following kinds of ptrace-stops exist: signal-delivery-stops,
626 group-stops, PTRACE_EVENT stops, syscall-stops. They all are reported
627 by waitpid(2) with WIFSTOPPED(status) true. They may be differentiated
628 by examining the value status>>8, and if there is ambiguity in that
629 value, by querying PTRACE_GETSIGINFO. (Note: the WSTOPSIG(status)
630 macro can't be used to perform this examination, because it returns the
631 value (status>>8) & 0xff.)
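The examination of status>>8 can be sketched as a small classifier. The enum and function names are invented for the example. The sketch assumes that PTRACE_O_TRACESYSGOOD is in effect, since otherwise syscall-stops look like plain SIGTRAP stops. It does not try to separate signal-delivery-stop from group-stop, which may require PTRACE_GETSIGINFO.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/ptrace.h>     /* PTRACE_EVENT_* values for callers */
#include <sys/wait.h>

enum stop_kind {
    STOP_NONE,              /* status does not describe a stop */
    STOP_SYSCALL,           /* syscall-stop (needs PTRACE_O_TRACESYSGOOD) */
    STOP_EVENT,             /* PTRACE_EVENT stop; *detail = event number */
    STOP_SIGNAL_OR_GROUP    /* signal-delivery-stop or group-stop;
                               *detail = signal number */
};

enum stop_kind classify_stop(int status, int *detail)
{
    if (!WIFSTOPPED(status))
        return STOP_NONE;

    int sig = WSTOPSIG(status);            /* (status>>8) & 0xff */
    int event = (unsigned)status >> 16;    /* remaining bits of status>>8 */

    if (event != 0) {
        *detail = event;
        return STOP_EVENT;
    }
    if (sig == (SIGTRAP | 0x80)) {
        *detail = 0;
        return STOP_SYSCALL;
    }
    *detail = sig;
    return STOP_SIGNAL_OR_GROUP;
}
```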
632
633 Signal-delivery-stop
634 When a (possibly multithreaded) process receives any signal except
635 SIGKILL, the kernel selects an arbitrary thread which handles the sig‐
636 nal. (If the signal is generated with tgkill(2), the target thread can
637 be explicitly selected by the caller.) If the selected thread is
638 traced, it enters signal-delivery-stop. At this point, the signal is
639 not yet delivered to the process, and can be suppressed by the tracer.
640 If the tracer doesn't suppress the signal, it passes the signal to the
641 tracee in the next ptrace restart request. This second step of signal
642 delivery is called signal injection in this manual page. Note that if
643 the signal is blocked, signal-delivery-stop doesn't happen until the
644 signal is unblocked, with the usual exception that SIGSTOP can't be
645 blocked.
646
647 Signal-delivery-stop is observed by the tracer as waitpid(2) returning
648 with WIFSTOPPED(status) true, with the signal returned by WSTOPSIG(sta‐
649 tus). If the signal is SIGTRAP, this may be a different kind of
650 ptrace-stop; see the "Syscall-stops" and "execve" sections below for
651 details. If WSTOPSIG(status) returns a stopping signal, this may be a
652 group-stop; see below.
653
654 Signal injection and suppression
655 After signal-delivery-stop is observed by the tracer, the tracer should
656 restart the tracee with the call
657
658 ptrace(PTRACE_restart, pid, 0, sig)
659
660 where PTRACE_restart is one of the restarting ptrace requests. If sig
661 is 0, then a signal is not delivered. Otherwise, the signal sig is
662 delivered. This operation is called signal injection in this manual
663 page, to distinguish it from signal-delivery-stop.
664
665 The sig value may be different from the WSTOPSIG(status) value: the
666 tracer can cause a different signal to be injected.
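The difference between injecting and suppressing a signal can be demonstrated with a sketch like the following. The names on_usr1() and deliver_usr1() are invented for the example. The child's handler exits with code 1, so the exit code shows whether the handler ran.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs in the tracee only if the tracer injects SIGUSR1. */
static void on_usr1(int sig)
{
    (void)sig;
    _exit(1);
}

/* Hypothetical helper: returns 1 if the tracee's handler ran, 0 if the
 * signal was suppressed, and -1 on error. */
int deliver_usr1(int inject)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        signal(SIGUSR1, on_usr1);
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGUSR1);      /* causes signal-delivery-stop first */
        _exit(0);            /* reached only if SIGUSR1 was suppressed */
    }

    int status;
    if (waitpid(pid, &status, __WALL) == -1 || !WIFSTOPPED(status))
        return -1;

    /* Signal injection: pass the stop signal back, or 0 to suppress it. */
    int sig = inject ? WSTOPSIG(status) : 0;
    if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, __WALL);
        return -1;
    }

    if (waitpid(pid, &status, __WALL) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

deliver_usr1(1) is expected to return 1 and deliver_usr1(0) to return 0.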
667
668 Note that a suppressed signal still causes system calls to return pre‐
669 maturely. In this case, system calls will be restarted: the tracer
670 will observe the tracee to reexecute the interrupted system call (or
671 restart_syscall(2) system call for a few system calls which use a dif‐
672 ferent mechanism for restarting) if the tracer uses PTRACE_SYSCALL.
Even system calls (such as poll(2)) that are not restartable after a
signal are restarted after the signal is suppressed; however, kernel bugs
exist that cause some system calls to fail with EINTR even though no
observable signal is injected into the tracee.
677
678 Restarting ptrace commands issued in ptrace-stops other than signal-
679 delivery-stop are not guaranteed to inject a signal, even if sig is
680 nonzero. No error is reported; a nonzero sig may simply be ignored.
681 Ptrace users should not try to "create a new signal" this way: use
682 tgkill(2) instead.
683
684 The fact that signal injection requests may be ignored when restarting
685 the tracee after ptrace stops that are not signal-delivery-stops is a
686 cause of confusion among ptrace users. One typical scenario is that
687 the tracer observes group-stop, mistakes it for signal-delivery-stop,
688 restarts the tracee with
689
690 ptrace(PTRACE_restart, pid, 0, stopsig)
691
692 with the intention of injecting stopsig, but stopsig gets ignored and
693 the tracee continues to run.
694
695 The SIGCONT signal has a side effect of waking up (all threads of) a
696 group-stopped process. This side effect happens before signal-deliv‐
697 ery-stop. The tracer can't suppress this side effect (it can only sup‐
698 press signal injection, which only causes the SIGCONT handler to not be
699 executed in the tracee, if such a handler is installed). In fact,
700 waking up from group-stop may be followed by signal-delivery-stop for
701 signal(s) other than SIGCONT, if they were pending when SIGCONT was
702 delivered. In other words, SIGCONT may not be the first signal
703 observed by the tracee after it was sent.
704
705 Stopping signals cause (all threads of) a process to enter group-stop.
706 This side effect happens after signal injection, and therefore can be
707 suppressed by the tracer.
708
709 In Linux 2.4 and earlier, the SIGSTOP signal can't be injected.
710
711 PTRACE_GETSIGINFO can be used to retrieve a siginfo_t structure which
712 corresponds to the delivered signal. PTRACE_SETSIGINFO may be used to
713 modify it. If PTRACE_SETSIGINFO has been used to alter siginfo_t, the
714 si_signo field and the sig parameter in the restarting command must
715 match, otherwise the result is undefined.
716
717 Group-stop
718 When a (possibly multithreaded) process receives a stopping signal, all
719 threads stop. If some threads are traced, they enter a group-stop.
720 Note that the stopping signal will first cause signal-delivery-stop (on
721 one tracee only), and only after it is injected by the tracer (or after
722 it was dispatched to a thread which isn't traced), will group-stop be
723 initiated on all tracees within the multithreaded process. As usual,
724 every tracee reports its group-stop separately to the corresponding
725 tracer.
726
727 Group-stop is observed by the tracer as waitpid(2) returning with WIF‐
728 STOPPED(status) true, with the stopping signal available via WSTOP‐
729 SIG(status). The same result is returned by some other classes of
730 ptrace-stops, therefore the recommended practice is to perform the call
731
732 ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo)
733
734 The call can be avoided if the signal is not SIGSTOP, SIGTSTP, SIGTTIN,
735 or SIGTTOU; only these four signals are stopping signals. If the
736 tracer sees something else, it can't be a group-stop. Otherwise, the
737 tracer needs to call PTRACE_GETSIGINFO. If PTRACE_GETSIGINFO fails
738 with EINVAL, then it is definitely a group-stop. (Other failure codes
739 are possible, such as ESRCH ("no such process") if a SIGKILL killed the
740 tracee.)
741
742 If the tracee was attached using PTRACE_SEIZE, group-stop is indicated by
743 PTRACE_EVENT_STOP: status>>16 == PTRACE_EVENT_STOP. This allows detec‐
744 tion of group-stops without requiring an extra PTRACE_GETSIGINFO call.
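
A minimal sketch of this status test is shown below. The function name is illustrative. The additional test on the stop signal is an assumption of this sketch: PTRACE_EVENT_STOP is also reported for stops that are not group-stops (such as those induced by PTRACE_INTERRUPT), so the sketch treats the stop as a group-stop only when the reported signal is a stopping signal.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#ifndef PTRACE_EVENT_STOP
#define PTRACE_EVENT_STOP 128   /* fallback for older headers */
#endif

/* True if status, as returned by waitpid(), reports a group-stop of a
   tracee that was attached with PTRACE_SEIZE. */
static bool is_seized_group_stop(int status)
{
    int sig;

    if (!WIFSTOPPED(status) || (status >> 16) != PTRACE_EVENT_STOP)
        return false;
    /* Assumption: in a group-stop the stop signal is one of the four
       stopping signals; other PTRACE_EVENT_STOP stops report SIGTRAP. */
    sig = WSTOPSIG(status);
    return sig == SIGSTOP || sig == SIGTSTP ||
           sig == SIGTTIN || sig == SIGTTOU;
}
```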
745
746 As of Linux 2.6.38, after the tracer sees the tracee ptrace-stop and
747 until it restarts or kills it, the tracee will not run, and will not
748 send notifications (except SIGKILL death) to the tracer, even if the
749 tracer enters into another waitpid(2) call.
750
751 The kernel behavior described in the previous paragraph causes a prob‐
752 lem with transparent handling of stopping signals. If the tracer
753 restarts the tracee after group-stop, the stopping signal is effec‐
754 tively ignored—the tracee doesn't remain stopped, it runs. If the
755 tracer doesn't restart the tracee before entering into the next wait‐
756 pid(2), future SIGCONT signals will not be reported to the tracer; this
757 would cause the SIGCONT signals to have no effect on the tracee.
758
759 Since Linux 3.4, there is a method to overcome this problem: instead of
760 PTRACE_CONT, a PTRACE_LISTEN command can be used to restart a tracee in
761 a way where it does not execute, but waits for a new event which it can
762 report via waitpid(2) (such as when it is restarted by a SIGCONT).
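
As an illustration, a tracer that wants a group-stopped tracee to stay stopped could issue the request through a small wrapper such as the one below. The wrapper name is hypothetical, and the sketch assumes the tracee was attached with PTRACE_SEIZE, which PTRACE_LISTEN requires.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

#ifndef PTRACE_LISTEN
#define PTRACE_LISTEN 0x4208    /* fallback for older headers */
#endif

/* Hypothetical helper: leave a seized tracee stopped after a
   group-stop while still receiving its later events via waitpid().
   Returns 0 on success, -1 on error with errno set by ptrace(). */
static int park_after_group_stop(pid_t pid)
{
    return ptrace(PTRACE_LISTEN, pid, NULL, NULL) == -1 ? -1 : 0;
}
```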
763
764 PTRACE_EVENT stops
765 If the tracer sets PTRACE_O_TRACE_* options, the tracee will enter
766 ptrace-stops called PTRACE_EVENT stops.
767
768 PTRACE_EVENT stops are observed by the tracer as waitpid(2) returning
769 with WIFSTOPPED(status), and WSTOPSIG(status) returns SIGTRAP. An
770 additional bit is set in the higher byte of the status word: the value
771 status>>8 will be
772
773 (SIGTRAP | (PTRACE_EVENT_foo << 8)).
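
The encoding above can be decoded with a few lines of C. This is a sketch only; ptrace_event_of() is an illustrative name, and returning 0 for "no event" is a convention chosen here.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Returns the PTRACE_EVENT_* code carried by a waitpid() status, or 0
   if the status does not describe a SIGTRAP stop with an event. */
static int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        return 0;
    /* status>>8 is (SIGTRAP | event<<8), so the event is status>>16. */
    return (status >> 16) & 0xffff;
}
```

A tracer would typically switch on the returned value and compare it against PTRACE_EVENT_FORK, PTRACE_EVENT_EXEC, and so on.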
774
775 The following events exist:
776
777 PTRACE_EVENT_VFORK
778 Stop before return from vfork(2) or clone(2) with the
779 CLONE_VFORK flag. When the tracee is continued after this stop,
780 it will wait for the child to exit or exec before continuing its
781 execution (in other words, the usual behavior on vfork(2)).
782
783 PTRACE_EVENT_FORK
784 Stop before return from fork(2) or clone(2) with the exit signal
785 set to SIGCHLD.
786
787 PTRACE_EVENT_CLONE
788 Stop before return from clone(2).
789
790 PTRACE_EVENT_VFORK_DONE
791 Stop before return from vfork(2) or clone(2) with the
792 CLONE_VFORK flag, but after the child unblocked this tracee by
793 exiting or execing.
794
795 For all four stops described above, the stop occurs in the parent
796 (i.e., the tracee), not in the newly created thread.
797 PTRACE_GETEVENTMSG can be used to retrieve the new thread's ID.
798
799 PTRACE_EVENT_EXEC
800 Stop before return from execve(2). Since Linux 3.0,
801 PTRACE_GETEVENTMSG returns the former thread ID.
802
803 PTRACE_EVENT_EXIT
804 Stop before exit (including death from exit_group(2)), signal
805 death, or exit caused by execve(2) in a multithreaded process.
806 PTRACE_GETEVENTMSG returns the exit status. Registers can be
807 examined (unlike when "real" exit happens). The tracee is still
808 alive; it needs to be PTRACE_CONTed or PTRACE_DETACHed to finish
809 exiting.
810
811 PTRACE_EVENT_STOP
812 Stop induced by PTRACE_INTERRUPT command, or group-stop, or ini‐
813 tial ptrace-stop when a new child is attached (only if attached
814 using PTRACE_SEIZE).
815
816 PTRACE_EVENT_SECCOMP
817 Stop triggered by a seccomp(2) rule on tracee syscall entry when
818 PTRACE_O_TRACESECCOMP has been set by the tracer. The seccomp
819 event message data (from the SECCOMP_RET_DATA portion of the
820 seccomp filter rule) can be retrieved with PTRACE_GETEVENTMSG.
821 The semantics of this stop are described in detail in a separate
822 section below.
823
824 PTRACE_GETSIGINFO on PTRACE_EVENT stops returns SIGTRAP in si_signo,
825 with si_code set to (event<<8) | SIGTRAP.
826
827 Syscall-stops
828 If the tracee was restarted by PTRACE_SYSCALL or PTRACE_SYSEMU, the
829 tracee enters syscall-enter-stop just prior to entering any system call
830 (which will not be executed if the restart was using PTRACE_SYSEMU,
831 regardless of any change made to registers at this point or how the
832 tracee is restarted after this stop). No matter which method caused
833 the syscall-entry-stop, if the tracer restarts the tracee with
834 PTRACE_SYSCALL, the tracee enters syscall-exit-stop when the system
835 call is finished, or if it is interrupted by a signal. (That is,
836 signal-delivery-stop never happens between syscall-enter-stop and
837 syscall-exit-stop; it happens after syscall-exit-stop.) If the tracee is
838 continued using any other method (including PTRACE_SYSEMU), no
839 syscall-exit-stop occurs. Note that all mentions of PTRACE_SYSEMU apply
840 equally to PTRACE_SYSEMU_SINGLESTEP.
841
842 However, even if the tracee was continued using PTRACE_SYSCALL, it is
843 not guaranteed that the next stop will be a syscall-exit-stop. Other
844 possibilities are that the tracee may stop in a PTRACE_EVENT stop
845 (including seccomp stops), exit (if it entered _exit(2) or
846 exit_group(2)), be killed by SIGKILL, or die silently (if it is a
847 thread group leader, the execve(2) happened in another thread, and that
848 thread is not traced by the same tracer; this situation is discussed
849 later).
850
851 Syscall-enter-stop and syscall-exit-stop are observed by the tracer as
852 waitpid(2) returning with WIFSTOPPED(status) true, and WSTOPSIG(status)
853 giving SIGTRAP. If the PTRACE_O_TRACESYSGOOD option was set by the
854 tracer, then WSTOPSIG(status) will give the value (SIGTRAP | 0x80).
855
856 Syscall-stops can be distinguished from signal-delivery-stop with SIG‐
857 TRAP by querying PTRACE_GETSIGINFO for the following cases:
858
859 si_code <= 0
860 SIGTRAP was delivered as a result of a user-space action, for
861 example, a system call (tgkill(2), kill(2), sigqueue(3), etc.),
862 expiration of a POSIX timer, change of state on a POSIX message
863 queue, or completion of an asynchronous I/O request.
864
865 si_code == SI_KERNEL (0x80)
866 SIGTRAP was sent by the kernel.
867
868 si_code == SIGTRAP or si_code == (SIGTRAP|0x80)
869 This is a syscall-stop.
870
871 However, syscall-stops happen very often (twice per system call), and
872 performing PTRACE_GETSIGINFO for every syscall-stop may be somewhat
873 expensive.
874
875 Some architectures allow the cases to be distinguished by examining
876 registers. For example, on x86, rax == -ENOSYS in syscall-enter-stop.
877 Since SIGTRAP (like any other signal) always happens after syscall-
878 exit-stop, and at this point rax almost never contains -ENOSYS, the
879 SIGTRAP looks like "syscall-stop which is not syscall-enter-stop"; in
880 other words, it looks like a "stray syscall-exit-stop" and can be
881 detected this way. But such detection is fragile and is best avoided.
882
883 Using the PTRACE_O_TRACESYSGOOD option is the recommended method to
884 distinguish syscall-stops from other kinds of ptrace-stops, since it is
885 reliable and does not incur a performance penalty.
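
With PTRACE_O_TRACESYSGOOD in effect, the test reduces to a comparison on the wait status, as in this sketch (the function name is illustrative):

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* True if status reports a syscall-stop of a tracee for which
   PTRACE_O_TRACESYSGOOD has been set. */
static bool is_syscall_stop(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}
```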
886
887 Syscall-enter-stop and syscall-exit-stop are indistinguishable from
888 each other by the tracer. The tracer needs to keep track of the
889 sequence of ptrace-stops in order to not misinterpret syscall-enter-
890 stop as syscall-exit-stop or vice versa. In general, a syscall-enter-
891 stop is always followed by syscall-exit-stop, PTRACE_EVENT stop, or the
892 tracee's death; no other kinds of ptrace-stop can occur in between.
893 However, note that seccomp stops (see below) can cause syscall-exit-
894 stops, without preceding syscall-entry-stops. If seccomp is in use,
895 care needs to be taken not to misinterpret such stops as syscall-entry-
896 stops.
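
One simple way to keep track of the sequence is a per-tracee flag that is toggled on every syscall-stop. The sketch below uses invented names and deliberately ignores the seccomp case just mentioned, where a syscall-exit-stop can arrive without a preceding entry stop; a tracer that uses seccomp needs additional bookkeeping.

```c
#include <stdbool.h>

/* Per-tracee bookkeeping; the struct and function names are
   illustrative. */
struct syscall_state {
    bool in_syscall;    /* true between enter-stop and exit-stop */
};

enum syscall_stop_kind { SYSCALL_ENTER, SYSCALL_EXIT };

/* Call once for every syscall-stop reported for the tracee. */
static enum syscall_stop_kind note_syscall_stop(struct syscall_state *st)
{
    st->in_syscall = !st->in_syscall;
    return st->in_syscall ? SYSCALL_ENTER : SYSCALL_EXIT;
}

/* Call when the tracee was restarted after a syscall-enter-stop with
   a request other than PTRACE_SYSCALL: no exit-stop will follow. */
static void forget_pending_exit(struct syscall_state *st)
{
    st->in_syscall = false;
}
```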
897
898 If after syscall-enter-stop, the tracer uses a restarting command other
899 than PTRACE_SYSCALL, syscall-exit-stop is not generated.
900
901 PTRACE_GETSIGINFO on syscall-stops returns SIGTRAP in si_signo, with
902 si_code set to SIGTRAP or (SIGTRAP|0x80).
903
904 PTRACE_EVENT_SECCOMP stops (Linux 3.5 to 4.7)
905 The behavior of PTRACE_EVENT_SECCOMP stops and their interaction with
906 other kinds of ptrace stops has changed between kernel versions. This
907 documents the behavior from their introduction until Linux 4.7 (inclu‐
908 sive). The behavior in later kernel versions is documented in the next
909 section.
910
911 A PTRACE_EVENT_SECCOMP stop occurs whenever a SECCOMP_RET_TRACE rule is
912 triggered. This is independent of which method was used to restart
913 the system call. Notably, seccomp still runs even if the tracee was
914 restarted using PTRACE_SYSEMU and this system call is unconditionally
915 skipped.
916
917 Restarts from this stop will behave as if the stop had occurred right
918 before the system call in question. In particular, both PTRACE_SYSCALL
919 and PTRACE_SYSEMU will normally cause a subsequent syscall-entry-stop.
920 However, if after the PTRACE_EVENT_SECCOMP the system call number is
921 negative, both the syscall-entry-stop and the system call itself will
922 be skipped. This means that if the system call number is negative
923 after a PTRACE_EVENT_SECCOMP and the tracee is restarted using
924 PTRACE_SYSCALL, the next observed stop will be a syscall-exit-stop,
925 rather than the syscall-entry-stop that might have been expected.
926
927 PTRACE_EVENT_SECCOMP stops (since Linux 4.8)
928 Starting with Linux 4.8, the PTRACE_EVENT_SECCOMP stop was reordered to
929 occur between syscall-entry-stop and syscall-exit-stop. Note that sec‐
930 comp no longer runs (and no PTRACE_EVENT_SECCOMP will be reported) if
931 the system call is skipped due to PTRACE_SYSEMU.
932
933 Functionally, a PTRACE_EVENT_SECCOMP stop behaves comparably to a
934 syscall-entry-stop (i.e., continuations using PTRACE_SYSCALL will cause
935 syscall-exit-stops, the system call number may be changed, and any other
936 modified registers are visible to the to-be-executed system call as
937 well). Note that there may have been, but need not have been, a
938 preceding syscall-entry-stop.
939
940 After a PTRACE_EVENT_SECCOMP stop, seccomp will be rerun, with a SEC‐
941 COMP_RET_TRACE rule now functioning the same as a SECCOMP_RET_ALLOW.
942 Specifically, this means that if registers are not modified during the
943 PTRACE_EVENT_SECCOMP stop, the system call will then be allowed.
944
945 PTRACE_SINGLESTEP stops
946 [Details of these kinds of stops are yet to be documented.]
947
948 Informational and restarting ptrace commands
949 Most ptrace commands (all except PTRACE_ATTACH, PTRACE_SEIZE,
950 PTRACE_TRACEME, PTRACE_INTERRUPT, and PTRACE_KILL) require the tracee
951 to be in a ptrace-stop, otherwise they fail with ESRCH.
952
953 When the tracee is in ptrace-stop, the tracer can read and write data
954 to the tracee using informational commands. These commands leave the
955 tracee in ptrace-stopped state:
956
957 ptrace(PTRACE_PEEKTEXT/PEEKDATA/PEEKUSER, pid, addr, 0);
958 ptrace(PTRACE_POKETEXT/POKEDATA/POKEUSER, pid, addr, long_val);
959 ptrace(PTRACE_GETREGS/GETFPREGS, pid, 0, &struct);
960 ptrace(PTRACE_SETREGS/SETFPREGS, pid, 0, &struct);
961 ptrace(PTRACE_GETREGSET, pid, NT_foo, &iov);
962 ptrace(PTRACE_SETREGSET, pid, NT_foo, &iov);
963 ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo);
964 ptrace(PTRACE_SETSIGINFO, pid, 0, &siginfo);
965 ptrace(PTRACE_GETEVENTMSG, pid, 0, &long_var);
966 ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags);
967
968 Note that some errors are not reported. For example, setting signal
969 information (siginfo) may have no effect in some ptrace-stops, yet the
970 call may succeed (return 0 and not set errno); querying
971 PTRACE_GETEVENTMSG may succeed and return some random value if the current
972 ptrace-stop is not documented as returning a meaningful event message.
973
974 The call
975
976 ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags);
977
978 affects one tracee. The tracee's current flags are replaced. Flags
979 are inherited by new tracees created and "auto-attached" via active
980 PTRACE_O_TRACEFORK, PTRACE_O_TRACEVFORK, or PTRACE_O_TRACECLONE
981 options.
982
983 Another group of commands makes the ptrace-stopped tracee run. They
984 have the form:
985
986 ptrace(cmd, pid, 0, sig);
987
988 where cmd is PTRACE_CONT, PTRACE_LISTEN, PTRACE_DETACH, PTRACE_SYSCALL,
989 PTRACE_SINGLESTEP, PTRACE_SYSEMU, or PTRACE_SYSEMU_SINGLESTEP. If the
990 tracee is in signal-delivery-stop, sig is the signal to be injected (if
991 it is nonzero). Otherwise, sig may be ignored. (When restarting a
992 tracee from a ptrace-stop other than signal-delivery-stop, recommended
993 practice is to always pass 0 in sig.)
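
The rule can be illustrated with a helper that picks the sig argument from the wait status. This is a sketch under several assumptions that are not requirements of ptrace(): the tracee was attached with PTRACE_SEIZE, PTRACE_O_TRACESYSGOOD is set, and the tracer uses neither single-stepping nor breakpoints, so that a stop with a zero event field and a plain signal number can only be a signal-delivery-stop. The function name is hypothetical.

```c
#include <signal.h>
#include <sys/wait.h>

/* Signal to pass as the last argument of a restarting request, under
   the assumptions stated above. */
static int sig_for_restart(int status)
{
    int sig;

    if (!WIFSTOPPED(status))
        return 0;
    sig = WSTOPSIG(status);
    if ((status >> 16) != 0 || sig == (SIGTRAP | 0x80))
        return 0;       /* event stop, group-stop, or syscall-stop */
    return sig;         /* signal-delivery-stop: reinject the signal */
}
```

A tracer that wants to suppress a particular signal would override the returned value with 0 before restarting the tracee.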
994
995 Attaching and detaching
996 A thread can be attached to the tracer using the call
997
998 ptrace(PTRACE_ATTACH, pid, 0, 0);
999
1000 or
1001
1002 ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_flags);
1003
1004 PTRACE_ATTACH sends SIGSTOP to this thread. If the tracer wants this
1005 SIGSTOP to have no effect, it needs to suppress it. Note that if other
1006 signals are concurrently sent to this thread during attach, the tracer
1007 may see the tracee enter signal-delivery-stop with other signal(s)
1008 first! The usual practice is to reinject these signals until SIGSTOP
1009 is seen, then suppress SIGSTOP injection. The design bug here is that
1010 a ptrace attach and a concurrently delivered SIGSTOP may race and the
1011 concurrent SIGSTOP may be lost.
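
The usual practice described above might look like the following sketch. The function name and return convention are assumptions of this example, EINTR handling is omitted for brevity, and the sketch inherits the race with concurrently sent SIGSTOP signals noted above.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: attach to thread tid and wait for the SIGSTOP
   generated by PTRACE_ATTACH.  Returns 0 with the tracee left in
   signal-delivery-stop for SIGSTOP, or -1 on error or tracee death. */
static int attach_and_wait(pid_t tid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, tid, NULL, NULL) == -1)
        return -1;
    for (;;) {
        if (waitpid(tid, &status, __WALL) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            return -1;          /* tracee exited or was killed */
        if (WSTOPSIG(status) == SIGSTOP)
            return 0;           /* caller restarts it with sig == 0 */
        /* Some other signal arrived first: reinject it and wait again. */
        if (ptrace(PTRACE_CONT, tid, NULL,
                   (void *)(long)WSTOPSIG(status)) == -1)
            return -1;
    }
}
```

After a successful return, restarting the tracee with a sig argument of 0 suppresses the SIGSTOP.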
1012
1013 Since attaching sends SIGSTOP and the tracer usually suppresses it,
1014 this may cause a stray EINTR return from the currently executing system
1015 call in the tracee, as described in the "Signal injection and suppres‐
1016 sion" section.
1017
1018 Since Linux 3.4, PTRACE_SEIZE can be used instead of PTRACE_ATTACH.
1019 PTRACE_SEIZE does not stop the attached process. If you need to stop
1020 it after attach (or at any other time) without sending it any signals,
1021 use the PTRACE_INTERRUPT command.
1022
1023 The request
1024
1025 ptrace(PTRACE_TRACEME, 0, 0, 0);
1026
1027 turns the calling thread into a tracee. The thread continues to run
1028 (doesn't enter ptrace-stop). A common practice is to follow the
1029 PTRACE_TRACEME with
1030
1031 raise(SIGSTOP);
1032
1033 and allow the parent (which is our tracer now) to observe our signal-
1034 delivery-stop.
1035
1036 If the PTRACE_O_TRACEFORK, PTRACE_O_TRACEVFORK, or PTRACE_O_TRACECLONE
1037 options are in effect, then children created by, respectively, fork(2)
1038 or clone(2) with the exit signal set to SIGCHLD, vfork(2) or clone(2)
1039 with the CLONE_VFORK flag, and other kinds of clone(2), are
1040 automatically attached to the same tracer which traced their parent.
1041 SIGSTOP is delivered to the children, causing them to enter
1042 signal-delivery-stop after they exit the system call which created them.
1043
1044 Detaching of the tracee is performed by:
1045
1046 ptrace(PTRACE_DETACH, pid, 0, sig);
1047
1048 PTRACE_DETACH is a restarting operation; therefore it requires the
1049 tracee to be in ptrace-stop. If the tracee is in signal-delivery-stop,
1050 a signal can be injected. Otherwise, the sig parameter may be silently
1051 ignored.
1052
1053 If the tracee is running when the tracer wants to detach it, the usual
1054 solution is to send SIGSTOP (using tgkill(2), to make sure it goes to
1055 the correct thread), wait for the tracee to stop in signal-delivery-
1056 stop for SIGSTOP and then detach it (suppressing SIGSTOP injection). A
1057 design bug is that this can race with concurrent SIGSTOPs. Another
1058 complication is that the tracee may enter other ptrace-stops and needs
1059 to be restarted and waited for again, until SIGSTOP is seen. Yet
1060 another complication is to be sure that the tracee is not already
1061 ptrace-stopped, because no signal delivery happens while it is—not even
1062 SIGSTOP.
1063
1064 If the tracer dies, all tracees are automatically detached and
1065 restarted, unless they were in group-stop. Handling of restart from
1066 group-stop is currently buggy, but the "as planned" behavior is to
1067 leave the tracee stopped and waiting for SIGCONT. If the tracee is
1068 restarted from signal-delivery-stop, the pending signal is injected.
1069
1070 execve(2) under ptrace
1071 When one thread in a multithreaded process calls execve(2), the kernel
1072 destroys all other threads in the process, and resets the thread ID of
1073 the execing thread to the thread group ID (process ID). (Or, to put
1074 things another way, when a multithreaded process does an execve(2), at
1075 completion of the call, it appears as though the execve(2) occurred in
1076 the thread group leader, regardless of which thread did the execve(2).)
1077 This resetting of the thread ID looks very confusing to tracers:
1078
1079 * All other threads stop in PTRACE_EVENT_EXIT stop, if the
1080 PTRACE_O_TRACEEXIT option was turned on. Then all other threads
1081 except the thread group leader report death as if they exited via
1082 _exit(2) with exit code 0.
1083
1084 * The execing tracee changes its thread ID while it is in the
1085 execve(2). (Remember, under ptrace, the "pid" returned from wait‐
1086 pid(2), or fed into ptrace calls, is the tracee's thread ID.) That
1087 is, the tracee's thread ID is reset to be the same as its process
1088 ID, which is the same as the thread group leader's thread ID.
1089
1090 * Then a PTRACE_EVENT_EXEC stop happens, if the PTRACE_O_TRACEEXEC
1091 option was turned on.
1092
1093 * If the thread group leader has reported its PTRACE_EVENT_EXIT stop
1094 by this time, it appears to the tracer that the dead thread group leader
1095 "reappears from nowhere". (Note: the thread group leader does not
1096 report death via WIFEXITED(status) while there is at least one other
1097 live thread. This eliminates the possibility that the tracer will
1098 see it dying and then reappearing.) If the thread group leader was
1099 still alive, for the tracer this may look as if thread group leader
1100 returns from a different system call than it entered, or even
1101 "returned from a system call even though it was not in any system
1102 call". If the thread group leader was not traced (or was traced by
1103 a different tracer), then during execve(2) it will appear as if it
1104 has become a tracee of the tracer of the execing tracee.
1105
1106 All of the above effects are the artifacts of the thread ID change in
1107 the tracee.
1108
1109 The PTRACE_O_TRACEEXEC option is the recommended tool for dealing with
1110 this situation. First, it enables PTRACE_EVENT_EXEC stop, which occurs
1111 before execve(2) returns. In this stop, the tracer can use
1112 PTRACE_GETEVENTMSG to retrieve the tracee's former thread ID. (This
1113 feature was introduced in Linux 3.0.) Second, the PTRACE_O_TRACEEXEC
1114 option disables legacy SIGTRAP generation on execve(2).
1115
1116 When the tracer receives PTRACE_EVENT_EXEC stop notification, it is
1117 guaranteed that, except for this tracee and the thread group leader, no
1118 other threads from the process are alive.
1119
1120 On receiving the PTRACE_EVENT_EXEC stop notification, the tracer should
1121 clean up all its internal data structures describing the threads of
1122 this process, and retain only one data structure—one which describes
1123 the single still running tracee, with
1124
1125 thread ID == thread group ID == process ID.
1126
1127 Example: two threads call execve(2) at the same time:
1128
1129 *** we get syscall-enter-stop in thread 1: **
1130 PID1 execve("/bin/foo", "foo" <unfinished ...>
1131 *** we issue PTRACE_SYSCALL for thread 1 **
1132 *** we get syscall-enter-stop in thread 2: **
1133 PID2 execve("/bin/bar", "bar" <unfinished ...>
1134 *** we issue PTRACE_SYSCALL for thread 2 **
1135 *** we get PTRACE_EVENT_EXEC for PID0, we issue PTRACE_SYSCALL **
1136 *** we get syscall-exit-stop for PID0: **
1137 PID0 <... execve resumed> ) = 0
1138
1139 If the PTRACE_O_TRACEEXEC option is not in effect for the execing
1140 tracee, and if the tracee was PTRACE_ATTACHed rather than
1141 PTRACE_SEIZEd, the kernel delivers an extra SIGTRAP to the tracee after
1142 execve(2) returns. This is an ordinary signal (similar to one which
1143 can be generated by kill -TRAP), not a special kind of ptrace-stop.
1144 Employing PTRACE_GETSIGINFO for this signal returns si_code set to 0
1145 (SI_USER). This signal may be blocked by signal mask, and thus may be
1146 delivered (much) later.
1147
1148 Usually, the tracer (for example, strace(1)) would not want to show
1149 this extra post-execve SIGTRAP signal to the user, and would suppress
1150 its delivery to the tracee (if SIGTRAP is set to SIG_DFL, it is a
1151 killing signal). However, determining which SIGTRAP to suppress is not
1152 easy. Setting the PTRACE_O_TRACEEXEC option or using PTRACE_SEIZE and
1153 thus suppressing this extra SIGTRAP is the recommended approach.
1154
1155 Real parent
1156 The ptrace API (ab)uses the standard UNIX parent/child signaling over
1157 waitpid(2). This used to cause the real parent of the process to stop
1158 receiving several kinds of waitpid(2) notifications when the child
1159 process is traced by some other process.
1160
1161 Many of these bugs have been fixed, but as of Linux 2.6.38 several
1162 still exist; see BUGS below.
1163
1164 As of Linux 2.6.38, the following is believed to work correctly:
1165
1166 * exit/death by signal is reported first to the tracer, then, when the
1167 tracer consumes the waitpid(2) result, to the real parent (to the
1168 real parent only when the whole multithreaded process exits). If
1169 the tracer and the real parent are the same process, the report is
1170 sent only once.
1171
1172 RETURN VALUE
1173 On success, the PTRACE_PEEK* requests return the requested data (but
1174 see NOTES), the PTRACE_SECCOMP_GET_FILTER request returns the number of
1175 instructions in the BPF program, and other requests return zero.
1176
1177 On error, all requests return -1, and errno is set appropriately.
1178 Since the value returned by a successful PTRACE_PEEK* request may be
1179 -1, the caller must clear errno before the call, and then check it
1180 afterward to determine whether or not an error occurred.
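
The errno convention for PTRACE_PEEK* requests can be wrapped as in the sketch below. The helper name and its 0/-1 return convention are illustrative.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical helper: read one word from the tracee's memory at
   addr.  Returns 0 and stores the word in *out, or returns -1 with
   errno set by ptrace(). */
static int peek_word(pid_t pid, void *addr, long *out)
{
    long word;

    errno = 0;          /* -1 may be a legitimate data value */
    word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}
```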
1181
1182 ERRORS
1183 EBUSY (i386 only) There was an error with allocating or freeing a
1184 debug register.
1185
1186 EFAULT There was an attempt to read from or write to an invalid area in
1187 the tracer's or the tracee's memory, probably because the area
1188 wasn't mapped or accessible. Unfortunately, under Linux, dif‐
1189 ferent variations of this fault will return EIO or EFAULT more
1190 or less arbitrarily.
1191
1192 EINVAL An attempt was made to set an invalid option.
1193
1194 EIO request is invalid, or an attempt was made to read from or write
1195 to an invalid area in the tracer's or the tracee's memory, or
1196 there was a word-alignment violation, or an invalid signal was
1197 specified during a restart request.
1198
1199 EPERM The specified process cannot be traced. This could be because
1200 the tracer has insufficient privileges (the required capability
1201 is CAP_SYS_PTRACE); unprivileged processes cannot trace pro‐
1202 cesses that they cannot send signals to or those running set-
1203 user-ID/set-group-ID programs, for obvious reasons. Alterna‐
1204 tively, the process may already be being traced, or (on kernels
1205 before 2.6.26) be init(1) (PID 1).
1206
1207 ESRCH The specified process does not exist, or is not currently being
1208 traced by the caller, or is not stopped (for requests that
1209 require a stopped tracee).
1210
1211 CONFORMING TO
1212 SVr4, 4.3BSD.
1213
1214 NOTES
1215 Although arguments to ptrace() are interpreted according to the proto‐
1216 type given, glibc currently declares ptrace() as a variadic function
1217 with only the request argument fixed. It is recommended to always sup‐
1218 ply four arguments, even if the requested operation does not use them,
1219 setting unused/ignored arguments to 0L or (void *) 0.
1220
1221 In Linux kernels before 2.6.26, init(1), the process with PID 1, may
1222 not be traced.
1223
1224 A tracee's parent continues to be the tracer even if that tracer calls
1225 execve(2).
1226
1227 The layout of the contents of memory and the USER area are quite oper‐
1228 ating-system- and architecture-specific. The offset supplied, and the
1229 data returned, might not entirely match with the definition of struct
1230 user.
1231
1232 The size of a "word" is determined by the operating-system variant
1233 (e.g., for 32-bit Linux it is 32 bits).
1234
1235 This page documents the way the ptrace() call works currently in Linux.
1236 Its behavior differs significantly on other flavors of UNIX. In any
1237 case, use of ptrace() is highly specific to the operating system and
1238 architecture.
1239
1240 Ptrace access mode checking
1241 Various parts of the kernel-user-space API (not just ptrace()
1242 operations) require so-called "ptrace access mode" checks, whose outcome
1243 determines whether an operation is permitted (or, in a few cases,
1244 causes a "read" operation to return sanitized data). These checks are
1245 performed in cases where one process can inspect sensitive information
1246 about, or in some cases modify the state of, another process. The
1247 checks are based on factors such as the credentials and capabilities of
1248 the two processes, whether or not the "target" process is dumpable, and
1249 the results of checks performed by any enabled Linux Security Module
1250 (LSM)—for example, SELinux, Yama, or Smack—and by the commoncap LSM
1251 (which is always invoked).
1252
1253 Prior to Linux 2.6.27, all access checks were of a single type. Since
1254 Linux 2.6.27, two access mode levels are distinguished:
1255
1256 PTRACE_MODE_READ
1257 For "read" operations or other operations that are less danger‐
1258 ous, such as: get_robust_list(2); kcmp(2); reading
1259 /proc/[pid]/auxv, /proc/[pid]/environ, or /proc/[pid]/stat; or
1260 readlink(2) of a /proc/[pid]/ns/* file.
1261
1262 PTRACE_MODE_ATTACH
1263 For "write" operations, or other operations that are more dan‐
1264 gerous, such as: ptrace attaching (PTRACE_ATTACH) to another
1265 process or calling process_vm_writev(2). (PTRACE_MODE_ATTACH
1266 was effectively the default before Linux 2.6.27.)
1267
1268 Since Linux 4.5, the above access mode checks are combined (ORed) with
1269 one of the following modifiers:
1270
1271 PTRACE_MODE_FSCREDS
1272 Use the caller's filesystem UID and GID (see credentials(7)) or
1273 effective capabilities for LSM checks.
1274
1275 PTRACE_MODE_REALCREDS
1276 Use the caller's real UID and GID or permitted capabilities for
1277 LSM checks. This was effectively the default before Linux 4.5.
1278
1279 Because combining one of the credential modifiers with one of the
1280 aforementioned access modes is typical, some macros are defined in the
1281 kernel sources for the combinations:
1282
1283 PTRACE_MODE_READ_FSCREDS
1284 Defined as PTRACE_MODE_READ | PTRACE_MODE_FSCREDS.
1285
1286 PTRACE_MODE_READ_REALCREDS
1287 Defined as PTRACE_MODE_READ | PTRACE_MODE_REALCREDS.
1288
1289 PTRACE_MODE_ATTACH_FSCREDS
1290 Defined as PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS.
1291
1292 PTRACE_MODE_ATTACH_REALCREDS
1293 Defined as PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS.
1294
1295 One further modifier can be ORed with the access mode:
1296
1297 PTRACE_MODE_NOAUDIT (since Linux 3.3)
1298 Don't audit this access mode check. This modifier is employed
1299 for ptrace access mode checks (such as checks when reading
1300 /proc/[pid]/stat) that merely cause the output to be filtered or
1301 sanitized, rather than causing an error to be returned to the
1302 caller. In these cases, accessing the file is not a security
1303 violation and there is no reason to generate a security audit
1304 record. This modifier suppresses the generation of such an
1305 audit record for the particular access check.
1306
1307 Note that all of the PTRACE_MODE_* constants described in this subsec‐
1308 tion are kernel-internal, and not visible to user space. The constant
1309 names are mentioned here in order to label the various kinds of ptrace
1310 access mode checks that are performed for various system calls and
1311 accesses to various pseudofiles (e.g., under /proc). These names are
1312 used in other manual pages to provide a simple shorthand for labeling
1313 the different kernel checks.
1314
1315 The algorithm employed for ptrace access mode checking determines
1316 whether the calling process is allowed to perform the corresponding
1317 action on the target process. (In the case of opening /proc/[pid]
1318 files, the "calling process" is the one opening the file, and the
1319 process with the corresponding PID is the "target process".) The algo‐
1320 rithm is as follows:
1321
1322 1. If the calling thread and the target thread are in the same thread
1323 group, access is always allowed.
1324
1325 2. If the access mode specifies PTRACE_MODE_FSCREDS, then, for the
1326 check in the next step, employ the caller's filesystem UID and GID.
1327 (As noted in credentials(7), the filesystem UID and GID almost
1328 always have the same values as the corresponding effective IDs.)
1329
1330 Otherwise, the access mode specifies PTRACE_MODE_REALCREDS, so use
1331 the caller's real UID and GID for the checks in the next step.
1332 (Most APIs that check the caller's UID and GID use the effective
1333 IDs. For historical reasons, the PTRACE_MODE_REALCREDS check uses
1334 the real IDs instead.)
1335
1336 3. Deny access if neither of the following is true:
1337
1338 · The real, effective, and saved-set user IDs of the target match
1339 the caller's user ID, and the real, effective, and saved-set group
1340 IDs of the target match the caller's group ID.
1341
1342 · The caller has the CAP_SYS_PTRACE capability in the user namespace
1343 of the target.
1344
4. Deny access if the target process's "dumpable" attribute has a value
other than 1 (SUID_DUMP_USER; see the discussion of PR_SET_DUMPABLE
in prctl(2)), and the caller does not have the CAP_SYS_PTRACE
capability in the user namespace of the target process.
1349
1350 5. The kernel LSM security_ptrace_access_check() interface is invoked
1351 to see if ptrace access is permitted. The results depend on the
1352 LSM(s). The implementation of this interface in the commoncap LSM
1353 performs the following steps:
1354
1355 a) If the access mode includes PTRACE_MODE_FSCREDS, then use the
1356 caller's effective capability set in the following check; other‐
1357 wise (the access mode specifies PTRACE_MODE_REALCREDS, so) use
1358 the caller's permitted capability set.
1359
1360 b) Deny access if neither of the following is true:
1361
· The caller and the target process are in the same user
namespace, and the caller's capabilities are a superset of the
target process's permitted capabilities.
1365
1366 · The caller has the CAP_SYS_PTRACE capability in the target
1367 process's user namespace.
1368
1369 Note that the commoncap LSM does not distinguish between
1370 PTRACE_MODE_READ and PTRACE_MODE_ATTACH.
1371
1372 6. If access has not been denied by any of the preceding steps, then
1373 access is allowed.
1374
1375 /proc/sys/kernel/yama/ptrace_scope
1376 On systems with the Yama Linux Security Module (LSM) installed (i.e.,
1377 the kernel was configured with CONFIG_SECURITY_YAMA), the
1378 /proc/sys/kernel/yama/ptrace_scope file (available since Linux 3.4) can
1379 be used to restrict the ability to trace a process with ptrace() (and
1380 thus also the ability to use tools such as strace(1) and gdb(1)). The
1381 goal of such restrictions is to prevent attack escalation whereby a
1382 compromised process can ptrace-attach to other sensitive processes
1383 (e.g., a GPG agent or an SSH session) owned by the user in order to
1384 gain additional credentials that may exist in memory and thus expand
1385 the scope of the attack.
1386
1387 More precisely, the Yama LSM limits two types of operations:
1388
1389 * Any operation that performs a ptrace access mode PTRACE_MODE_ATTACH
1390 check—for example, ptrace() PTRACE_ATTACH. (See the "Ptrace access
1391 mode checking" discussion above.)
1392
1393 * ptrace() PTRACE_TRACEME.
1394
1395 A process that has the CAP_SYS_PTRACE capability can update the
1396 /proc/sys/kernel/yama/ptrace_scope file with one of the following val‐
1397 ues:
1398
1399 0 ("classic ptrace permissions")
1400 No additional restrictions on operations that perform
1401 PTRACE_MODE_ATTACH checks (beyond those imposed by the commoncap
1402 and other LSMs).
1403
1404 The use of PTRACE_TRACEME is unchanged.
1405
1406 1 ("restricted ptrace") [default value]
1407 When performing an operation that requires a PTRACE_MODE_ATTACH
1408 check, the calling process must either have the CAP_SYS_PTRACE
1409 capability in the user namespace of the target process or it
1410 must have a predefined relationship with the target process. By
1411 default, the predefined relationship is that the target process
1412 must be a descendant of the caller.
1413
A target process can employ the prctl(2) PR_SET_PTRACER operation
to declare an additional PID that is allowed to perform
PTRACE_MODE_ATTACH operations on the target. See the kernel
source file Documentation/admin-guide/LSM/Yama.rst (or
Documentation/security/Yama.txt before Linux 4.13) for further details.
1419
1420 The use of PTRACE_TRACEME is unchanged.
1421
1422 2 ("admin-only attach")
1423 Only processes with the CAP_SYS_PTRACE capability in the user
1424 namespace of the target process may perform PTRACE_MODE_ATTACH
1425 operations or trace children that employ PTRACE_TRACEME.
1426
1427 3 ("no attach")
1428 No process may perform PTRACE_MODE_ATTACH operations or trace
1429 children that employ PTRACE_TRACEME.
1430
1431 Once this value has been written to the file, it cannot be
1432 changed.
1433
1434 With respect to values 1 and 2, note that creating a new user namespace
1435 effectively removes the protection offered by Yama. This is because a
1436 process in the parent user namespace whose effective UID matches the
1437 UID of the creator of a child namespace has all capabilities (including
1438 CAP_SYS_PTRACE) when performing operations within the child user names‐
1439 pace (and further-removed descendants of that namespace). Conse‐
1440 quently, when a process tries to use user namespaces to sandbox itself,
1441 it inadvertently weakens the protections offered by the Yama LSM.
1442
1443 C library/kernel differences
1444 At the system call level, the PTRACE_PEEKTEXT, PTRACE_PEEKDATA, and
1445 PTRACE_PEEKUSER requests have a different API: they store the result at
1446 the address specified by the data parameter, and the return value is
1447 the error flag. The glibc wrapper function provides the API given in
1448 DESCRIPTION above, with the result being returned via the function
1449 return value.
1450
BUGS
On hosts with 2.6 kernel headers, PTRACE_SETOPTIONS is declared with a
different value than the one for 2.4. This leads to applications
compiled with 2.6 kernel headers failing when run on 2.4 kernels. This
can be worked around by redefining PTRACE_SETOPTIONS to
PTRACE_OLDSETOPTIONS, if that is defined.
1457
Group-stop notifications are sent to the tracer, but not to the real
parent. Last confirmed on Linux 2.6.38.6.
1460
If a thread group leader is traced and exits by calling _exit(2), a
PTRACE_EVENT_EXIT stop will happen for it (if requested), but the
subsequent WIFEXITED notification will not be delivered until all other
threads exit. As explained above, if one of the other threads calls
execve(2), the death of the thread group leader will never be reported.
If the execed thread is not traced by this tracer, the tracer will
never know that execve(2) happened. One possible workaround is to
PTRACE_DETACH the thread group leader instead of restarting it in this
case. Last confirmed on Linux 2.6.38.6.
1470
1471 A SIGKILL signal may still cause a PTRACE_EVENT_EXIT stop before actual
1472 signal death. This may be changed in the future; SIGKILL is meant to
1473 always immediately kill tasks even under ptrace. Last confirmed on
1474 Linux 3.13.
1475
Some system calls return with EINTR if a signal was sent to a tracee,
but delivery was suppressed by the tracer. (This is a very typical
operation: it is usually done by debuggers on every attach, in order not
to introduce a bogus SIGSTOP.) As of Linux 3.2.9, the following system
calls are affected (this list is likely incomplete): epoll_wait(2), and
read(2) from an inotify(7) file descriptor. The usual symptom of this
bug is that when you attach to a quiescent process with the command
1483
1484 strace -p <process-ID>
1485
1486 then, instead of the usual and expected one-line output such as
1487
1488 restart_syscall(<... resuming interrupted call ...>_
1489
1490 or
1491
1492 select(6, [5], NULL, [5], NULL_
1493
1494 ('_' denotes the cursor position), you observe more than one line. For
1495 example:
1496
1497 clock_gettime(CLOCK_MONOTONIC, {15370, 690928118}) = 0
1498 epoll_wait(4,_
1499
What is not visible here is that the process was blocked in
epoll_wait(2) before strace(1) attached to it. Attaching caused
epoll_wait(2) to return to user space with the error EINTR. In this
particular case, the program reacted to EINTR by checking the current
time, and then executing epoll_wait(2) again. (Programs that do not
expect such "stray" EINTR errors may behave in an unintended way upon
an strace(1) attach.)
1507
1508 Contrary to the normal rules, the glibc wrapper for ptrace() can set
1509 errno to zero.
1510
SEE ALSO
1512 gdb(1), ltrace(1), strace(1), clone(2), execve(2), fork(2), gettid(2),
1513 prctl(2), seccomp(2), sigaction(2), tgkill(2), vfork(2), waitpid(2),
1514 exec(3), capabilities(7), signal(7)
1515
COLOPHON
1517 This page is part of release 5.04 of the Linux man-pages project. A
1518 description of the project, information about reporting bugs, and the
1519 latest version of this page, can be found at
1520 https://www.kernel.org/doc/man-pages/.
1521
1522
1523
1524Linux 2019-10-10 PTRACE(2)