PRCTL(2)                   Linux Programmer's Manual                  PRCTL(2)

NAME

       prctl - operations on a process

SYNOPSIS

       #include <sys/prctl.h>

       int prctl(int option, unsigned long arg2, unsigned long arg3,
                 unsigned long arg4, unsigned long arg5);

DESCRIPTION

15       prctl()  is  called  with  a first argument describing what to do (with
16       values defined in <linux/prctl.h>), and further arguments with  a  sig‐
17       nificance depending on the first one.  The first argument can be:
18
19       PR_CAP_AMBIENT (since Linux 4.3)
20              Reads  or  changes  the  ambient  capability  set of the calling
21              thread, according to the value of arg2, which must be one of the
22              following:
23
24              PR_CAP_AMBIENT_RAISE
25                     The  capability specified in arg3 is added to the ambient
26                     set.  The specified capability must already be present in
27                     both  the  permitted  and  the  inheritable  sets  of the
28                     process.   This  operation  is  not  permitted   if   the
29                     SECBIT_NO_CAP_AMBIENT_RAISE securebit is set.
30
31              PR_CAP_AMBIENT_LOWER
32                     The  capability  specified  in  arg3  is removed from the
33                     ambient set.
34
35              PR_CAP_AMBIENT_IS_SET
36                     The prctl() call returns 1 if the capability in  arg3  is
37                     in the ambient set and 0 if it is not.
38
39              PR_CAP_AMBIENT_CLEAR_ALL
40                     All  capabilities  will  be removed from the ambient set.
41                     This operation requires setting arg3 to zero.
42
43              In all of the above operations, arg4 and arg5 must be  specified
44              as 0.
45
46       PR_CAPBSET_READ (since Linux 2.6.25)
47              Return (as the function result) 1 if the capability specified in
48              arg2 is in the calling thread's capability bounding set, or 0 if
49              it   is   not.    (The   capability  constants  are  defined  in
50              <linux/capability.h>.)  The  capability  bounding  set  dictates
51              whether  the process can receive the capability through a file's
52              permitted capability set on a subsequent call to execve(2).
53
54              If the capability specified in arg2 is not valid, then the  call
55              fails with the error EINVAL.
56
57       PR_CAPBSET_DROP (since Linux 2.6.25)
58              If  the calling thread has the CAP_SETPCAP capability within its
59              user namespace, then drop the capability specified by arg2  from
60              the  calling  thread's capability bounding set.  Any children of
61              the calling thread will inherit the newly reduced bounding set.
62
              The call fails with the error EPERM if the calling thread does
              not have the CAP_SETPCAP capability; with EINVAL if arg2 does
              not represent a valid capability; or with EINVAL if file
              capabilities are not enabled in the kernel, in which case
              bounding sets are not supported.
67
68       PR_SET_CHILD_SUBREAPER (since Linux 3.4)
69              If  arg2  is nonzero, set the "child subreaper" attribute of the
70              calling process; if arg2 is zero, unset the attribute.
71
              A subreaper fulfills the role of init(1) for its descendant
              processes.  When a process becomes orphaned (i.e., its immediate
              parent terminates), that process is reparented to the nearest
              still-living ancestor subreaper.  Subsequently, calls to
              getppid(2) in the orphaned process return the PID of the
              subreaper process, and when the orphan terminates, it is the
              subreaper process that receives a SIGCHLD signal and is able to
              wait(2) on the process to discover its termination status.
81
82              The setting of this bit is not inherited by children created  by
83              fork(2)   and   clone(2).    The  setting  is  preserved  across
84              execve(2).
85
86              Establishing a subreaper process is useful in session management
87              frameworks where a hierarchical group of processes is managed by
88              a subreaper process that needs to be informed when  one  of  the
89              processes—for  example,  a double-forked daemon—terminates (per‐
90              haps so that it can restart that process).  Some init(1)  frame‐
91              works  (e.g., systemd(1)) employ a subreaper process for similar
92              reasons.
93
94       PR_GET_CHILD_SUBREAPER (since Linux 3.4)
95              Return the "child subreaper" setting of the caller, in the loca‐
96              tion pointed to by (int *) arg2.
97
98       PR_SET_DUMPABLE (since Linux 2.3.20)
99              Set  the  state of the "dumpable" flag, which determines whether
100              core dumps are produced for the calling process upon delivery of
101              a signal whose default behavior is to produce a core dump.
102
103              In  kernels  up  to  and including 2.6.12, arg2 must be either 0
104              (SUID_DUMP_DISABLE,   process   is   not    dumpable)    or    1
105              (SUID_DUMP_USER,  process  is dumpable).  Between kernels 2.6.13
106              and 2.6.17, the value 2 was also  permitted,  which  caused  any
107              binary  which normally would not be dumped to be dumped readable
108              by root only;  for  security  reasons,  this  feature  has  been
109              removed.    (See   also   the   description   of   /proc/sys/fs/
110              suid_dumpable in proc(5).)
111
112              Normally, this flag is set to 1.  However, it is  reset  to  the
113              current  value  contained in the file /proc/sys/fs/suid_dumpable
114              (which by default has the value 0),  in  the  following  circum‐
115              stances:
116
117              *  The process's effective user or group ID is changed.
118
119              *  The  process's  filesystem  user  or group ID is changed (see
120                 credentials(7)).
121
122              *  The process executes (execve(2)) a set-user-ID or  set-group-
123                 ID  program,  resulting  in  a change of either the effective
124                 user ID or the effective group ID.
125
126              *  The process executes (execve(2))  a  program  that  has  file
127                 capabilities (see capabilities(7)), but only if the permitted
128                 capabilities gained exceed those already  permitted  for  the
129                 process.
130
              Processes that are not dumpable cannot be attached via
              ptrace(2) PTRACE_ATTACH; see ptrace(2) for further details.
133
134              If a process is not dumpable, the  ownership  of  files  in  the
135              process's  /proc/[pid]  directory  is  affected  as described in
136              proc(5).
137
138       PR_GET_DUMPABLE (since Linux 2.3.20)
139              Return (as the function result) the current state of the calling
140              process's dumpable flag.
141
142       PR_SET_ENDIAN (since Linux 2.6.18, PowerPC only)
143              Set the endian-ness of the calling process to the value given in
144              arg2, which should  be  one  of  the  following:  PR_ENDIAN_BIG,
145              PR_ENDIAN_LITTLE, or PR_ENDIAN_PPC_LITTLE (PowerPC pseudo little
146              endian).
147
148       PR_GET_ENDIAN (since Linux 2.6.18, PowerPC only)
149              Return the endian-ness of the calling process, in  the  location
150              pointed to by (int *) arg2.
151
152       PR_SET_FP_MODE (since Linux 4.0, only on MIPS)
153              On  the MIPS architecture, user-space code can be built using an
154              ABI which permits linking with code that  has  more  restrictive
155              floating-point  (FP) requirements.  For example, user-space code
156              may be built to target the O32 FPXX ABI  and  linked  with  code
157              built  for either one of the more restrictive FP32 or FP64 ABIs.
158              When more restrictive code is linked in, the overall requirement
159              for  the  process  is to use the more restrictive floating-point
160              mode.
161
162              Because the kernel has no means of knowing in advance which mode
163              the  process  should  be executed in, and because these restric‐
164              tions  can  change  over  the  lifetime  of  the  process,   the
165              PR_SET_FP_MODE  operation  is  provided  to allow control of the
166              floating-point mode from user space.
167
168              The (unsigned int) arg2 argument is a bit  mask  describing  the
169              floating-point mode used:
170
171              PR_FP_MODE_FR
172                     When  this bit is unset (so called FR=0 or FR0 mode), the
173                     32 floating-point registers are 32 bits wide, and  64-bit
174                     registers  are  represented as a pair of registers (even-
175                     and odd- numbered, with the even-numbered  register  con‐
176                     taining  the lower 32 bits, and the odd-numbered register
177                     containing the higher 32 bits).
178
179                     When this bit is set  (on  supported  hardware),  the  32
180                     floating-point registers are 64 bits wide (so called FR=1
181                     or FR1 mode).   Note  that  modern  MIPS  implementations
182                     (MIPS R6 and newer) support FR=1 mode only.
183
184                     Applications  that  use the O32 FP32 ABI can operate only
185                     when this bit is unset (FR=0; or they can  be  used  with
186                     FRE  enabled,  see below).  Applications that use the O32
187                     FP64 ABI (and the O32 FP64A ABI, which exists to  provide
188                     the  ability  to  operate  with  existing  FP32 code; see
189                     below) can operate only when  this  bit  is  set  (FR=1).
190                     Applications  that  use the O32 FPXX ABI can operate with
191                     either FR=0 or FR=1.
192
193              PR_FP_MODE_FRE
194                     Enable emulation of  32-bit  floating-point  mode.   When
195                     this  mode  is enabled, it emulates 32-bit floating-point
196                     operations by raising a reserved-instruction exception on
197                     every instruction that uses 32-bit formats and the kernel
198                     then handles the instruction in software.   (The  problem
199                     lies  in  the discrepancy of handling odd-numbered regis‐
200                     ters which are the high 32 bits of 64-bit registers  with
201                     even  numbers  in FR=0 mode and the lower 32-bit parts of
                     odd-numbered 64-bit registers in FR=1 mode.)  Enabling
                     this bit is necessary when code with the O32 FP32 ABI
                     should operate with code with the compatible O32 FPXX or
                     O32 FP64A ABIs (which require FR=1 FPU mode), or when a
                     binary with the FP32 ABI is executed on newer hardware
                     (MIPS R6 onwards) that lacks FR=0 mode support.
209
210                     Note that this mode makes sense only when the FPU  is  in
211                     64-bit mode (FR=1).
212
213                     Note  that the use of emulation inherently has a signifi‐
214                     cant performance hit and should be avoided if possible.
215
216              In the N32/N64 ABI, 64-bit floating-point mode is  always  used,
217              so  FPU emulation is not required and the FPU always operates in
218              FR=1 mode.
219
220              This option is mainly intended for use  by  the  dynamic  linker
221              (ld.so(8)).
222
223              The arguments arg3, arg4, and arg5 are ignored.
224
225       PR_GET_FP_MODE (since Linux 4.0, only on MIPS)
226              Get  the  current  floating-point  mode  (see the description of
227              PR_SET_FP_MODE for details).
228
229              On success, the call returns a bit  mask  which  represents  the
230              current floating-point mode.
231
232              The arguments arg2, arg3, arg4, and arg5 are ignored.
233
234       PR_SET_FPEMU (since Linux 2.4.18, 2.5.9, only on ia64)
235              Set   floating-point  emulation  control  bits  to  arg2.   Pass
236              PR_FPEMU_NOPRINT to silently  emulate  floating-point  operation
237              accesses, or PR_FPEMU_SIGFPE to not emulate floating-point oper‐
238              ations and send SIGFPE instead.
239
240       PR_GET_FPEMU (since Linux 2.4.18, 2.5.9, only on ia64)
241              Return floating-point emulation control bits,  in  the  location
242              pointed to by (int *) arg2.
243
244       PR_SET_FPEXC (since Linux 2.4.21, 2.5.32, only on PowerPC)
              Set floating-point exception mode to arg2.  The following values
              can be passed:

              PR_FP_EXC_SW_ENABLE
                     Use FPEXC for FP exception enables.

              PR_FP_EXC_DIV
                     Floating-point divide by zero.

              PR_FP_EXC_OVF
                     Floating-point overflow.

              PR_FP_EXC_UND
                     Floating-point underflow.

              PR_FP_EXC_RES
                     Floating-point inexact result.

              PR_FP_EXC_INV
                     Floating-point invalid operation.

              PR_FP_EXC_DISABLED
                     FP exceptions disabled.

              PR_FP_EXC_NONRECOV
                     Async nonrecoverable exception mode.

              PR_FP_EXC_ASYNC
                     Async recoverable exception mode.

              PR_FP_EXC_PRECISE
                     Precise exception mode.
255
256       PR_GET_FPEXC (since Linux 2.4.21, 2.5.32, only on PowerPC)
257              Return floating-point exception mode, in the location pointed to
258              by (int *) arg2.
259
260       PR_SET_KEEPCAPS (since Linux 2.2.18)
              Set the state of the calling thread's "keep capabilities" flag.
              The effect of this flag is described in capabilities(7).  arg2
              must be either 0 (clear the flag) or 1 (set the flag).  The
              "keep capabilities" value will be reset to 0 on subsequent calls
              to execve(2).
266
267       PR_GET_KEEPCAPS (since Linux 2.2.18)
268              Return (as the function result) the current state of the calling
269              thread's "keep capabilities" flag.  See  capabilities(7)  for  a
270              description of this flag.
271
272       PR_MCE_KILL (since Linux 2.6.32)
273              Set  the  machine  check  memory  corruption kill policy for the
274              calling thread.  If arg2 is PR_MCE_KILL_CLEAR, clear the  thread
275              memory  corruption  kill policy and use the system-wide default.
276              (The system-wide default is defined by /proc/sys/vm/memory_fail‐
277              ure_early_kill; see proc(5).)  If arg2 is PR_MCE_KILL_SET, use a
278              thread-specific memory corruption kill policy.   In  this  case,
279              arg3    defines    whether    the    policy    is   early   kill
280              (PR_MCE_KILL_EARLY), late kill (PR_MCE_KILL_LATE), or  the  sys‐
281              tem-wide  default  (PR_MCE_KILL_DEFAULT).  Early kill means that
282              the thread receives a SIGBUS signal as soon as  hardware  memory
283              corruption  is  detected inside its address space.  In late kill
284              mode, the process is killed only when it  accesses  a  corrupted
285              page.   See sigaction(2) for more information on the SIGBUS sig‐
286              nal.  The policy is inherited by children.  The remaining unused
287              prctl() arguments must be zero for future compatibility.
288
289       PR_MCE_KILL_GET (since Linux 2.6.32)
290              Return  the  current per-process machine check kill policy.  All
291              unused prctl() arguments must be zero.
292
293       PR_SET_MM (since Linux 3.3)
294              Modify certain kernel memory map descriptor fields of the  call‐
295              ing  process.   Usually  these  fields are set by the kernel and
296              dynamic loader (see ld.so(8) for more information) and a regular
297              application  should  not  use  this feature.  However, there are
298              cases, such as self-modifying programs, where  a  program  might
299              find it useful to change its own memory map.
300
301              The  calling  process must have the CAP_SYS_RESOURCE capability.
302              The value in arg2 is one of the options below, while  arg3  pro‐
303              vides  a  new value for the option.  The arg4 and arg5 arguments
304              must be zero if unused.
305
306              Since Linux 3.10,  this  feature  is  available  all  the  time.
307              Before  Linux 3.10, this feature is available only if the kernel
308              is built with the CONFIG_CHECKPOINT_RESTORE option enabled.
309
310              PR_SET_MM_START_CODE
311                     Set the address above which the  program  text  can  run.
312                     The  corresponding  memory area must be readable and exe‐
313                     cutable, but not writable or shareable  (see  mprotect(2)
314                     and mmap(2) for more information).
315
316              PR_SET_MM_END_CODE
317                     Set  the  address  below  which the program text can run.
318                     The corresponding memory area must be readable  and  exe‐
319                     cutable, but not writable or shareable.
320
321              PR_SET_MM_START_DATA
322                     Set the address above which initialized and uninitialized
323                     (bss) data are placed.   The  corresponding  memory  area
324                     must  be  readable  and  writable,  but not executable or
325                     shareable.
326
327              PR_SET_MM_END_DATA
328                     Set the address below which initialized and uninitialized
329                     (bss)  data  are  placed.   The corresponding memory area
330                     must be readable and  writable,  but  not  executable  or
331                     shareable.
332
333              PR_SET_MM_START_STACK
334                     Set  the  start  address of the stack.  The corresponding
335                     memory area must be readable and writable.
336
337              PR_SET_MM_START_BRK
                     Set the address above which the program heap can be
                     expanded with the brk(2) call.  The address must be
                     greater than the ending address of the current program
                     data segment.  In addition, the combined size of the
                     resulting heap and the data segment cannot exceed the
                     RLIMIT_DATA resource limit (see setrlimit(2)).
344
345              PR_SET_MM_BRK
346                     Set  the  current brk(2) value.  The requirements for the
347                     address are  the  same  as  for  the  PR_SET_MM_START_BRK
348                     option.
349
350              The following options are available since Linux 3.5.
351
352              PR_SET_MM_ARG_START
353                     Set  the  address above which the program command line is
354                     placed.
355
356              PR_SET_MM_ARG_END
357                     Set the address below which the program command  line  is
358                     placed.
359
360              PR_SET_MM_ENV_START
361                     Set  the  address  above which the program environment is
362                     placed.
363
364              PR_SET_MM_ENV_END
365                     Set the address below which the  program  environment  is
366                     placed.
367
368                     The     address    passed    with    PR_SET_MM_ARG_START,
369                     PR_SET_MM_ARG_END,        PR_SET_MM_ENV_START,        and
370                     PR_SET_MM_ENV_END  should belong to a process stack area.
371                     Thus, the corresponding memory  area  must  be  readable,
372                     writable,  and  (depending  on  the kernel configuration)
373                     have the MAP_GROWSDOWN attribute set (see mmap(2)).
374
375              PR_SET_MM_AUXV
                     Set a new auxiliary vector.  The arg3 argument should
                     provide the address of the vector.  The arg4 argument is
                     the size of the vector.
379
380              PR_SET_MM_EXE_FILE
                     Supersede the /proc/pid/exe symbolic link with a new one
                     pointing to a new executable file identified by the file
                     descriptor provided in the arg3 argument.  The file
                     descriptor should be obtained with a regular open(2) call.
385
386                     To  change  the  symbolic  link,  one  needs to unmap all
387                     existing executable memory areas, including those created
388                     by the kernel itself (for example the kernel usually cre‐
389                     ates at least one executable  memory  area  for  the  ELF
390                     .text section).
391
                     The second limitation is that such a transition can be
                     done only once in a process lifetime.  Any further
                     attempts will be rejected.  This should help system
                     administrators monitor unusual symbolic-link transitions
                     over all processes running on a system.
397
398              The following options are available since Linux 3.18.
399
400              PR_SET_MM_MAP
401                     Provides  one-shot access to all the addresses by passing
402                     in a struct prctl_mm_map (as defined in <linux/prctl.h>).
403                     The arg4 argument should provide the size of the struct.
404
405                     This  feature  is  available  only if the kernel is built
406                     with the CONFIG_CHECKPOINT_RESTORE option enabled.
407
408              PR_SET_MM_MAP_SIZE
409                     Returns the size of the struct  prctl_mm_map  the  kernel
410                     expects.   This  allows  user  space to find a compatible
411                     struct.  The arg4 argument should  be  a  pointer  to  an
412                     unsigned int.
413
414                     This  feature  is  available  only if the kernel is built
415                     with the CONFIG_CHECKPOINT_RESTORE option enabled.
416
417       PR_MPX_ENABLE_MANAGEMENT, PR_MPX_DISABLE_MANAGEMENT (since Linux 3.19)
418              Enable or disable kernel management of Memory Protection  eXten‐
419              sions (MPX) bounds tables.  The arg2, arg3, arg4, and arg5 argu‐
420              ments must be zero.
421
422              MPX is  a  hardware-assisted  mechanism  for  performing  bounds
423              checking on pointers.  It consists of a set of registers storing
424              bounds information and a set  of  special  instruction  prefixes
425              that  tell  the  CPU  on  which instructions it should do bounds
426              enforcement.  There is a limited number of these  registers  and
427              when there are more pointers than registers, their contents must
428              be "spilled" into a set of  tables.   These  tables  are  called
429              "bounds  tables"  and the MPX prctl() operations control whether
430              the kernel manages their allocation and freeing.
431
432              When management is enabled, the kernel will take over allocation
433              and  freeing of the bounds tables.  It does this by trapping the
434              #BR exceptions that result at first use of missing bounds tables
435              and  instead of delivering the exception to user space, it allo‐
436              cates the table and populates  the  bounds  directory  with  the
437              location  of  the  new table.  For freeing, the kernel checks to
438              see if bounds tables are present for memory which is  not  allo‐
439              cated, and frees them if so.
440
441              Before  enabling  MPX management using PR_MPX_ENABLE_MANAGEMENT,
442              the application must first have allocated  a  user-space  buffer
443              for  the bounds directory and placed the location of that direc‐
444              tory in the bndcfgu register.
445
446              These calls fail if the CPU or  kernel  does  not  support  MPX.
447              Kernel  support  for MPX is enabled via the CONFIG_X86_INTEL_MPX
              configuration option.  You can check whether the CPU supports
              MPX by looking for the 'mpx' CPUID bit, for example with the
              following command:
451
452                   cat /proc/cpuinfo | grep ' mpx '
453
454              A thread may not switch in or out of long  (64-bit)  mode  while
455              MPX is enabled.
456
457              All threads in a process are affected by these calls.
458
459              The  child  of  a  fork(2) inherits the state of MPX management.
460              During execve(2), MPX management is  reset  to  a  state  as  if
461              PR_MPX_DISABLE_MANAGEMENT had been called.
462
463              For further information on Intel MPX, see the kernel source file
464              Documentation/x86/intel_mpx.txt.
465
466       PR_SET_NAME (since Linux 2.6.9)
              Set the name of the calling thread, using the value in the
              location pointed to by (char *) arg2.  The name can be up to 16
              bytes long, including the terminating null byte.  (If the length
              of the string, including the terminating null byte, exceeds 16
              bytes, the string is silently truncated.)  This is the same
              attribute that can be set via pthread_setname_np(3) and
              retrieved using pthread_getname_np(3).  The attribute is
              likewise accessible via /proc/self/task/[tid]/comm, where tid is
              the thread ID of the calling thread.
476
477       PR_GET_NAME (since Linux 2.6.11)
478              Return the name of the calling thread, in the buffer pointed  to
479              by  (char *)  arg2.   The buffer should allow space for up to 16
480              bytes; the returned string will be null-terminated.
481
482       PR_SET_NO_NEW_PRIVS (since Linux 3.5)
483              Set the calling thread's no_new_privs bit to the value in  arg2.
484              With  no_new_privs  set  to  1,  execve(2) promises not to grant
485              privileges to do anything that could not have been done  without
486              the  execve(2)  call (for example, rendering the set-user-ID and
487              set-group-ID mode bits, and file  capabilities  non-functional).
488              Once  set, this bit cannot be unset.  The setting of this bit is
489              inherited by children created by fork(2) and clone(2), and  pre‐
490              served across execve(2).
491
492              Since  Linux  4.10, the value of a thread's no_new_privs bit can
493              be viewed via the NoNewPrivs  field  in  the  /proc/[pid]/status
494              file.
495
496              For  more  information,  see  the  kernel source file Documenta‐
497              tion/userspace-api/no_new_privs.rst        (or        Documenta‐
498              tion/prctl/no_new_privs.txt  before  Linux 4.13).  See also sec‐
499              comp(2).
500
501       PR_GET_NO_NEW_PRIVS (since Linux 3.5)
502              Return (as the function result) the value  of  the  no_new_privs
503              bit  for the calling thread.  A value of 0 indicates the regular
504              execve(2) behavior.  A value of 1 indicates execve(2) will oper‐
505              ate in the privilege-restricting mode described above.
506
507       PR_SET_PDEATHSIG (since Linux 2.1.57)
              Set the parent death signal of the calling process to arg2
              (either a signal value in the range 1..maxsig, or 0 to clear).
              This is the signal that the calling process will get when its
              parent dies.  This value is cleared for the child of a fork(2)
              and (since Linux 2.4.36 / 2.6.23) when executing a set-user-ID
              or set-group-ID binary, or a binary that has associated
              capabilities (see capabilities(7)).  Otherwise, this value is
              preserved across execve(2).
516
517              Warning: the "parent" in this  case  is  considered  to  be  the
518              thread  that  created  this process.  In other words, the signal
519              will be sent when that  thread  terminates  (via,  for  example,
520              pthread_exit(3)),  rather  than  after all of the threads in the
521              parent process terminate.
522
523       PR_GET_PDEATHSIG (since Linux 2.3.15)
524              Return the current value of the parent process death signal,  in
525              the location pointed to by (int *) arg2.
526
527       PR_SET_PTRACER (since Linux 3.4)
528              This is meaningful only when the Yama LSM is enabled and in mode
529              1   ("restricted    ptrace",    visible    via    /proc/sys/ker‐
530              nel/yama/ptrace_scope).   When  a "ptracer process ID" is passed
531              in arg2, the caller is declaring that the  ptracer  process  can
532              ptrace(2)  the  calling  process  as if it were a direct process
533              ancestor.  Each PR_SET_PTRACER operation replaces  the  previous
534              "ptracer process ID".  Employing PR_SET_PTRACER with arg2 set to
535              0  clears  the  caller's  "ptracer  process  ID".   If  arg2  is
536              PR_SET_PTRACER_ANY,  the  ptrace restrictions introduced by Yama
537              are effectively disabled for the calling process.
538
539              For further information, see the kernel source  file  Documenta‐
540              tion/admin-guide/LSM/Yama.rst       (or      Documentation/secu‐
541              rity/Yama.txt before Linux 4.13).
542
543       PR_SET_SECCOMP (since Linux 2.6.23)
544              Set the secure computing (seccomp) mode for the calling  thread,
545              to limit the available system calls.  The more recent seccomp(2)
546              system  call  provides  a  superset  of  the  functionality   of
547              PR_SET_SECCOMP.
548
549              The  seccomp  mode is selected via arg2.  (The seccomp constants
550              are defined in <linux/seccomp.h>.)
551
552              With arg2 set to SECCOMP_MODE_STRICT, the only system calls that
553              the  thread is permitted to make are read(2), write(2), _exit(2)
554              (but not exit_group(2)), and sigreturn(2).  Other  system  calls
555              result  in the delivery of a SIGKILL signal.  Strict secure com‐
556              puting mode is useful for number-crunching applications that may
557              need to execute untrusted byte code, perhaps obtained by reading
558              from a pipe or socket.  This operation is available only if  the
559              kernel is configured with CONFIG_SECCOMP enabled.
560
561              With arg2 set to SECCOMP_MODE_FILTER (since Linux 3.5), the sys‐
562              tem calls allowed are defined by a pointer to a Berkeley  Packet
563              Filter  passed  in  arg3.   This argument is a pointer to struct
564              sock_fprog; it can be designed to filter arbitrary system  calls
565              and  system  call arguments.  This mode is available only if the
566              kernel is configured with CONFIG_SECCOMP_FILTER enabled.
567
568              If SECCOMP_MODE_FILTER filters permit fork(2), then the  seccomp
569              mode  is  inherited by children created by fork(2); if execve(2)
570              is  permitted,  then  the  seccomp  mode  is  preserved   across
571              execve(2).  If the filters permit prctl() calls, then additional
572              filters can be added; they are run in order until the first non-
573              allow result is seen.
574
575              For  further  information, see the kernel source file Documenta‐
576              tion/userspace-api/seccomp_filter.rst       (or       Documenta‐
577              tion/prctl/seccomp_filter.txt before Linux 4.13).
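
              The following sketch illustrates strict mode only.  The  helper
              name  run_strict_child()  is hypothetical.  It assumes glibc, in
              which _exit() is implemented with  exit_group(2),  so  the  child
              uses the raw exit system call instead.

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that enters strict seccomp mode,
 * writes one line to `fd`, and exits.  Returns the child's wait status,
 * or -1 if fork(2) or waitpid(2) failed.
 */
int run_strict_child(int fd)
{
    static const char msg[] = "child: in strict seccomp mode\n";
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, (unsigned long) SECCOMP_MODE_STRICT,
                  0UL, 0UL, 0UL) == -1)
            _exit(2);           /* strict mode was not entered */

        /* From here on, only read, write, exit, and sigreturn work. */
        if (write(fd, msg, sizeof(msg) - 1) == -1)
            syscall(SYS_exit, 1);

        /* exit_group(2) would be fatal here, so use the raw exit call. */
        syscall(SYS_exit, 0);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```

              An exit status of 2 from the child means that strict mode could
              not be entered, for example because CONFIG_SECCOMP  is  disabled
              or the process is already running under a seccomp filter.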
578
579       PR_GET_SECCOMP (since Linux 2.6.23)
580              Return (as the function result) the secure computing mode of the
581              calling thread.  If the caller is not in secure computing  mode,
582              this operation returns 0; if the caller is in strict secure com‐
583              puting mode, then the prctl() call will cause a  SIGKILL  signal
584              to be sent to the process.  If the caller is in filter mode, and
585              this system call is allowed by the seccomp filters,  it  returns
586              2; otherwise, the process is killed with a SIGKILL signal.  This
587              operation is available only if the  kernel  is  configured  with
588              CONFIG_SECCOMP enabled.
589
590              Since  Linux  3.8,  the  Seccomp field of the /proc/[pid]/status
591              file provides a method of obtaining the same information,  with‐
592              out the risk that the process is killed; see proc(5).
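
              Both ways of obtaining the mode can be sketched as  below.   The
              function  names  are  hypothetical;  the /proc variant assumes a
              mounted procfs and a kernel that provides the Seccomp field.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Returns the seccomp mode (0, 1, or 2) read from /proc/self/status,
   or -1 if the file or the "Seccomp:" field is unavailable. */
int seccomp_mode_from_proc(void)
{
    char line[256];
    int mode = -1;
    FILE *f = fopen("/proc/self/status", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof(line), f) != NULL) {
        if (sscanf(line, "Seccomp: %d", &mode) == 1)
            break;
    }
    fclose(f);
    return mode;
}

/* Returns the mode via prctl(); may get the caller killed if it is
   already in strict mode or if a filter denies prctl(). */
int seccomp_mode_from_prctl(void)
{
    return prctl(PR_GET_SECCOMP, 0UL, 0UL, 0UL, 0UL);
}
```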
593
594       PR_SET_SECUREBITS (since Linux 2.6.26)
595              Set  the  "securebits"  flags of the calling thread to the value
596              supplied in arg2.  See capabilities(7).
597
598       PR_GET_SECUREBITS (since Linux 2.6.26)
599              Return (as the function result) the "securebits"  flags  of  the
600              calling thread.  See capabilities(7).
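
              As a small illustration, the  returned  flags  can  be  tested
              against  the  masks  from  <linux/securebits.h>.   The  helper
              name is hypothetical, and SECBIT_KEEP_CAPS  is  chosen  only  as
              an example flag; see capabilities(7) for the full set.

```c
#include <linux/securebits.h>
#include <sys/prctl.h>

/* Hypothetical helper: returns 1 if SECBIT_KEEP_CAPS is set for the
   calling thread, 0 if it is clear, or -1 if prctl() failed. */
int keep_caps_flag(void)
{
    int bits = prctl(PR_GET_SECUREBITS, 0UL, 0UL, 0UL, 0UL);

    if (bits == -1)
        return -1;
    return (bits & SECBIT_KEEP_CAPS) ? 1 : 0;
}
```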
601
602       PR_SET_THP_DISABLE (since Linux 3.15)
603              Set  the state of the "THP disable" flag for the calling thread.
604              If arg2 has a nonzero value, the flag is set,  otherwise  it  is
605              cleared.   Setting  this  flag  provides  a method for disabling
606              transparent huge pages for jobs where the code cannot  be  modi‐
607              fied,  and  using a malloc hook with madvise(2) is not an option
608              (i.e., statically allocated data).  The setting of the "THP dis‐
609              able"  flag  is  inherited by a child created via fork(2) and is
610              preserved across execve(2).
611
612       PR_TASK_PERF_EVENTS_DISABLE (since Linux 2.6.31)
613              Disable  all  performance  counters  attached  to  the   calling
614              process, regardless of whether the counters were created by this
615              process or another process.  Performance counters created by the
616              calling  process  for  other processes are unaffected.  For more
617              information on performance counters, see the Linux kernel source
618              file tools/perf/design.txt.
619
620              Originally    called    PR_TASK_PERF_COUNTERS_DISABLE;   renamed
621              (retaining the same numerical value) in Linux 2.6.32.
622
623       PR_TASK_PERF_EVENTS_ENABLE (since Linux 2.6.31)
624              The converse of PR_TASK_PERF_EVENTS_DISABLE; enable  performance
625              counters attached to the calling process.
626
627              Originally called PR_TASK_PERF_COUNTERS_ENABLE; renamed in Linux
628              2.6.32.
629
630       PR_GET_THP_DISABLE (since Linux 3.15)
631              Return (via the function result) the current setting of the "THP
632              disable"  flag  for the calling thread: either 1, if the flag is
633              set, or 0, if it is not.
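
              A wrapper program might set the flag before executing an unmod‐
              ifiable job, relying on the flag being preserved across  fork(2)
              and  execve(2).   A  minimal  sketch, with hypothetical helper
              names:

```c
#include <sys/prctl.h>

/* Set (disable != 0) or clear (disable == 0) the "THP disable" flag.
   Returns 0 on success, or -1 with errno set. */
int set_thp_disabled(int disable)
{
    return prctl(PR_SET_THP_DISABLE, disable ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Returns 1 if the flag is set, 0 if not, or -1 with errno set. */
int get_thp_disabled(void)
{
    return prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL);
}
```

              Note that all unused arguments are passed as zero, as  required
              by both operations.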
634
635       PR_GET_TID_ADDRESS (since Linux 3.5)
636              Retrieve the clear_child_tid address set  by  set_tid_address(2)
637              and  the  clone(2)  CLONE_CHILD_CLEARTID  flag,  in the location
638              pointed to by (int **) arg2.  This feature is available only  if
639              the  kernel  is  built with the CONFIG_CHECKPOINT_RESTORE option
640              enabled.  Note that since the prctl() system call does not  have
641              a compat implementation for the AMD64 x32 and MIPS n32 ABIs, and
642              the kernel writes out a pointer using the kernel's pointer size,
643              this operation expects a user-space buffer of 8 (not 4) bytes on
644              these ABIs.
645
646       PR_SET_TIMERSLACK (since Linux 2.6.28)
647              Each thread has two associated timer slack values:  a  "default"
648              value, and a "current" value.  This operation sets the "current"
649              timer slack value for the calling  thread.   If  the  nanosecond
650              value  supplied in arg2 is greater than zero, then the "current"
651              value is set to this value.  If arg2 is less than  or  equal  to
652              zero,  the  "current"  timer  slack  is  reset  to  the thread's
653              "default" timer slack value.
654
655              The "current" timer slack is used by the kernel to  group  timer
656              expirations  for  the  calling  thread  that  are  close  to one
657              another; as a consequence, timer expirations for the thread  may
658              be  up  to  the  specified  number of nanoseconds late (but will
659              never expire early).  Grouping timer expirations can help reduce
660              system power consumption by minimizing CPU wake-ups.
661
662              The  timer  expirations affected by timer slack are those set by
663              select(2),   pselect(2),   poll(2),   ppoll(2),   epoll_wait(2),
664              epoll_pwait(2),  clock_nanosleep(2),  nanosleep(2), and futex(2)
665              (and thus the library functions implemented via futexes, includ‐
666              ing    pthread_cond_timedwait(3),    pthread_mutex_timedlock(3),
667              pthread_rwlock_timedrdlock(3),    pthread_rwlock_timedwrlock(3),
668              and sem_timedwait(3)).
669
670              Timer slack is not applied to threads that are scheduled under a
671              real-time scheduling policy (see sched_setscheduler(2)).
672
673              When a new thread is created, the two  timer  slack  values  are
674              made  the  same  as  the "current" value of the creating thread.
675              Thereafter, a thread can adjust its "current" timer slack  value
676              via  PR_SET_TIMERSLACK.   The  "default" value can't be changed.
677              The timer slack values of init (PID 1), the ancestor of all pro‐
678              cesses,  are  50,000  nanoseconds  (50 microseconds).  The timer
679              slack values are preserved across execve(2).
680
681              Since Linux 4.6, the "current" timer slack value of any  process
682              can  be  examined  and  changed  via the file /proc/[pid]/timer‐
683              slack_ns.  See proc(5).
684
685       PR_GET_TIMERSLACK (since Linux 2.6.28)
686              Return (as the function result) the "current" timer slack  value
687              of the calling thread.
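
              The set, get, and reset behavior described above can be sketched
              as follows.  The helper names are hypothetical.  Because prctl()
              returns an int, very large slack values cannot be reported faith‐
              fully through PR_GET_TIMERSLACK.

```c
#include <sys/prctl.h>

/* Set the "current" timer slack of the calling thread, in nanoseconds.
   Passing 0 resets it to the thread's "default" value.
   Returns 0 on success, or -1 with errno set. */
int set_timer_slack_ns(unsigned long ns)
{
    return prctl(PR_SET_TIMERSLACK, ns, 0UL, 0UL, 0UL);
}

/* Returns the "current" timer slack in nanoseconds, or -1 on error. */
int get_timer_slack_ns(void)
{
    return prctl(PR_GET_TIMERSLACK, 0UL, 0UL, 0UL, 0UL);
}
```

              For example, a  thread  that  tolerates  late  wake-ups  could
              call  set_timer_slack_ns(1000000)  to allow up to one millisecond
              of slack, and later call set_timer_slack_ns(0)  to  return  to
              its default.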
688
689       PR_SET_TIMING (since Linux 2.6.0-test4)
690              Set  whether  to  use  (normal, traditional) statistical process
691              timing or accurate timestamp-based process  timing,  by  passing
692              PR_TIMING_STATISTICAL  or  PR_TIMING_TIMESTAMP to arg2.  PR_TIM‐
693              ING_TIMESTAMP is not currently implemented  (attempting  to  set
694              this mode will yield the error EINVAL).
695
696       PR_GET_TIMING (since Linux 2.6.0-test4)
697              Return  (as  the function result) which process timing method is
698              currently in use.
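
              The  following  sketch  (hypothetical  helper  name)  checks  the
              documented behavior that a  request  for  PR_TIMING_TIMESTAMP  is
              rejected with EINVAL.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Returns 1 if PR_TIMING_TIMESTAMP is rejected with EINVAL, else 0. */
int timestamp_timing_rejected(void)
{
    errno = 0;
    return prctl(PR_SET_TIMING, (unsigned long) PR_TIMING_TIMESTAMP,
                 0UL, 0UL, 0UL) == -1 && errno == EINVAL;
}
```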
699
700       PR_SET_TSC (since Linux 2.6.26, x86 only)
701              Set the state of the  flag  determining  whether  the  timestamp
702              counter  can be read by the process.  Pass PR_TSC_ENABLE to arg2
703              to allow it to be read, or PR_TSC_SIGSEGV to generate a  SIGSEGV
704              when the process tries to read the timestamp counter.
705
706       PR_GET_TSC (since Linux 2.6.26, x86 only)
707              Return  the  state of the flag determining whether the timestamp
708              counter can be read, in the location pointed to by (int *) arg2.
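
              A sketch of reading and changing the flag follows.   The  helper
              names  are  hypothetical.   On  architectures  other than x86 the
              calls are expected to fail.

```c
#include <sys/prctl.h>

/* Returns PR_TSC_ENABLE or PR_TSC_SIGSEGV, or -1 with errno set. */
int get_tsc_mode(void)
{
    int mode = 0;

    if (prctl(PR_GET_TSC, (unsigned long) &mode, 0UL, 0UL, 0UL) == -1)
        return -1;
    return mode;
}

/* mode is PR_TSC_ENABLE or PR_TSC_SIGSEGV.
   Returns 0 on success, or -1 with errno set. */
int set_tsc_mode(int mode)
{
    return prctl(PR_SET_TSC, (unsigned long) mode, 0UL, 0UL, 0UL);
}
```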
709
710       PR_SET_UNALIGN
711              (Only on: ia64, since Linux 2.3.48; parisc, since Linux  2.6.15;
712              PowerPC,  since  Linux  2.6.18;  Alpha,  since Linux 2.6.22; sh,
713              since Linux 2.6.34; tile, since Linux 3.12) Set unaligned access
714              control  bits  to arg2.  Pass PR_UNALIGN_NOPRINT to silently fix
715              up unaligned user accesses,  or  PR_UNALIGN_SIGBUS  to  generate
716              SIGBUS  on  unaligned user access.  Alpha also supports an addi‐
717              tional flag with the value of 4 and no corresponding named  con‐
718              stant, which instructs the kernel not to fix up unaligned
719              accesses (it is analogous to providing the UAC_NOFIX flag in the
720              SSI_NVPAIRS operation of the setsysinfo() system call on Tru64).
721
722       PR_GET_UNALIGN
723              (see  PR_SET_UNALIGN  for  information on versions and architec‐
724              tures) Return unaligned access control  bits,  in  the  location
725              pointed to by (unsigned int *) arg2.
726

RETURN VALUE

728       On   success,  PR_GET_DUMPABLE,  PR_GET_KEEPCAPS,  PR_GET_NO_NEW_PRIVS,
729       PR_GET_THP_DISABLE, PR_CAPBSET_READ, PR_GET_TIMING,  PR_GET_TIMERSLACK,
730       PR_GET_SECUREBITS,     PR_MCE_KILL_GET,     PR_CAP_AMBIENT+PR_CAP_AMBI‐
731       ENT_IS_SET, and (if it returns) PR_GET_SECCOMP return  the  nonnegative
732       values  described  above.  All other option values return 0 on success.
733       On error, -1 is returned, and errno is set appropriately.
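
       For the options that return a value, a caller has to distinguish  the
       nonnegative  result  from  the -1 error return.  A minimal sketch, in
       which the helper name prctl_query() is hypothetical:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical helper for options that take no further arguments and
   return their result as the function value.  Returns that value, or
   -1 after printing a diagnostic to stderr. */
int prctl_query(int option, const char *name)
{
    int value = prctl(option, 0UL, 0UL, 0UL, 0UL);

    if (value == -1)
        fprintf(stderr, "prctl(%s): %s\n", name, strerror(errno));
    return value;
}
```

       For  example,  prctl_query(PR_GET_DUMPABLE, "PR_GET_DUMPABLE") yields
       the "dumpable" flag of the calling process.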
734

ERRORS

736       EACCES option is PR_SET_SECCOMP and arg2  is  SECCOMP_MODE_FILTER,  but
737              the  process  does  not have the CAP_SYS_ADMIN capability or has
738              not set  the  no_new_privs  attribute  (see  the  discussion  of
739              PR_SET_NO_NEW_PRIVS above).
740
741       EACCES option is PR_SET_MM, arg2 is PR_SET_MM_EXE_FILE, and the file is
742              not executable.
743
744       EBADF  option is PR_SET_MM, arg2 is PR_SET_MM_EXE_FILE,  and  the  file
745              descriptor passed in arg3 is not valid.
746
747       EBUSY  option is PR_SET_MM, arg2 is PR_SET_MM_EXE_FILE, and this is the
748              second attempt to change the /proc/pid/exe symbolic link,  which
749              is prohibited.
750
751       EFAULT arg2 is an invalid address.
752
753       EFAULT option  is PR_SET_SECCOMP, arg2 is SECCOMP_MODE_FILTER, the sys‐
754              tem was built with CONFIG_SECCOMP_FILTER, and arg3 is an invalid
755              address.
756
757       EINVAL The value of option is not recognized.
758
759       EINVAL option  is  PR_MCE_KILL  or  PR_MCE_KILL_GET  or  PR_SET_MM, and
760              unused prctl() arguments were not specified as zero.
761
762       EINVAL arg2 is not a valid value for this option.
763
764       EINVAL option is PR_SET_SECCOMP or PR_GET_SECCOMP, and the  kernel  was
765              not configured with CONFIG_SECCOMP.
766
767       EINVAL option  is  PR_SET_SECCOMP, arg2 is SECCOMP_MODE_FILTER, and the
768              kernel was not configured with CONFIG_SECCOMP_FILTER.
769
770       EINVAL option is PR_SET_MM, and one of the following is true:
771
772              *  arg4 or arg5 is nonzero;
773
774              *  arg3 is greater than TASK_SIZE (the limit on the size of  the
775                 user address space for this architecture);
776
777              *  arg2     is     PR_SET_MM_START_CODE,     PR_SET_MM_END_CODE,
778                 PR_SET_MM_START_DATA,         PR_SET_MM_END_DATA,          or
779                 PR_SET_MM_START_STACK, and the permissions of the correspond‐
780                 ing memory area are not as required;
781
782              *  arg2 is PR_SET_MM_START_BRK or  PR_SET_MM_BRK,  and  arg3  is
783                 less  than  or equal to the end of the data segment or speci‐
784                 fies a value that would cause the RLIMIT_DATA resource  limit
785                 to be exceeded.
786
787       EINVAL option  is PR_SET_PTRACER and arg2 is not 0, PR_SET_PTRACER_ANY,
788              or the PID of an existing process.
789
790       EINVAL option is PR_SET_PDEATHSIG and arg2 is not a valid  signal  num‐
791              ber.
792
793       EINVAL option  is PR_SET_DUMPABLE and arg2 is neither SUID_DUMP_DISABLE
794              nor SUID_DUMP_USER.
795
796       EINVAL option is PR_SET_TIMING and arg2 is not PR_TIMING_STATISTICAL.
797
798       EINVAL option is PR_SET_NO_NEW_PRIVS and arg2 is  not  equal  to  1  or
799              arg3, arg4, or arg5 is nonzero.
800
801       EINVAL option  is  PR_GET_NO_NEW_PRIVS and arg2, arg3, arg4, or arg5 is
802              nonzero.
803
804       EINVAL option is PR_SET_THP_DISABLE and arg3, arg4, or arg5 is nonzero.
805
806       EINVAL option is PR_GET_THP_DISABLE and arg2, arg3, arg4,  or  arg5  is
807              nonzero.
808
809       EINVAL option is PR_CAP_AMBIENT and an unused argument (arg4, arg5, or,
810              in the case of PR_CAP_AMBIENT_CLEAR_ALL, arg3)  is  nonzero;  or
811              arg2  has  an  invalid  value;  or arg2 is PR_CAP_AMBIENT_LOWER,
812              PR_CAP_AMBIENT_RAISE, or PR_CAP_AMBIENT_IS_SET and arg3 does not
813              specify a valid capability.
814
815       ENXIO  option is  PR_MPX_ENABLE_MANAGEMENT or PR_MPX_DISABLE_MANAGEMENT
816              and the kernel or the  CPU  does  not  support  MPX  management.
817              Check that the kernel and processor have MPX support.
818
819       EOPNOTSUPP
820              option  is PR_SET_FP_MODE and arg2 has an invalid or unsupported
821              value.
822
823       EPERM  option is PR_SET_SECUREBITS, and the caller does  not  have  the
824              CAP_SETPCAP  capability,  or  tried to unset a "locked" flag, or
825              tried to set a flag whose corresponding locked flag was set (see
826              capabilities(7)).
827
828       EPERM  option      is     PR_SET_KEEPCAPS,     and     the     caller's
829              SECBIT_KEEP_CAPS_LOCKED flag is set (see capabilities(7)).
830
831       EPERM  option is PR_CAPBSET_DROP, and the  caller  does  not  have  the
832              CAP_SETPCAP capability.
833
834       EPERM  option   is   PR_SET_MM,  and  the  caller  does  not  have  the
835              CAP_SYS_RESOURCE capability.
836
837       EPERM  option is PR_CAP_AMBIENT and arg2 is  PR_CAP_AMBIENT_RAISE,  but
838              either  the  capability  specified in arg3 is not present in the
839              process's permitted and  inheritable  capability  sets,  or  the
840              SECBIT_NO_CAP_AMBIENT_RAISE securebit has been set.
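
       A few of the EINVAL cases listed above can be observed without special
       privileges.  The following sketch uses a hypothetical helper that re‐
       turns the errno value of a failed call:

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical helper: returns 0 if the call succeeds, otherwise the
   errno value that prctl() set.  arg3 to arg5 are passed as zero. */
int prctl_errno(int option, unsigned long arg2)
{
    errno = 0;
    if (prctl(option, arg2, 0UL, 0UL, 0UL) == -1)
        return errno;
    return 0;
}
```

       With this helper, prctl_errno(-1, 0) shows the error for  an  unrecog‐
       nized  option,  prctl_errno(PR_SET_PDEATHSIG, 100000) the error for an
       invalid signal number, and prctl_errno(PR_GET_NO_NEW_PRIVS, 1) the er‐
       ror for a nonzero unused argument; each is expected to be EINVAL.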
841

VERSIONS

843       The prctl() system call was introduced in Linux 2.1.57.
844

CONFORMING TO

846       This  call  is  Linux-specific.   IRIX  has a prctl() system call (also
847       introduced in Linux 2.1.44 as irix_prctl  on  the  MIPS  architecture),
848       with prototype
849
850           ptrdiff_t prctl(int option, int arg2, int arg3);
851
852       and  options  to  get the maximum number of processes per user, get the
853       maximum number of processors the calling  process  can  use,  find  out
854       whether  a specified process is currently blocked, get or set the maxi‐
855       mum stack size, and so on.
856

SEE ALSO

858       signal(2), core(5)
859

COLOPHON

861       This page is part of release 4.16 of the Linux  man-pages  project.   A
862       description  of  the project, information about reporting bugs, and the
863       latest    version    of    this    page,    can     be     found     at
864       https://www.kernel.org/doc/man-pages/.
865
866
867
868Linux                             2018-02-02                          PRCTL(2)