prctl(2)                      System Calls Manual                     prctl(2)

NAME
       prctl - operations on a process or thread

LIBRARY
       Standard C library (libc, -lc)

SYNOPSIS
       #include <sys/prctl.h>

       int prctl(int option, ...
                 /* unsigned long arg2, unsigned long arg3,
                 unsigned long arg4, unsigned long arg5 */ );

DESCRIPTION

19       prctl()  manipulates  various  aspects  of  the behavior of the calling
20       thread or process.
21
22       Note that careless use of some prctl() operations can confuse the user-
23       space  run-time  environment,  so  these operations should be used with
24       care.
25
26       prctl() is called with a first argument describing  what  to  do  (with
27       values  defined  in <linux/prctl.h>), and further arguments with a sig‐
28       nificance depending on the first one.  The first argument can be:
29
30       PR_CAP_AMBIENT (since Linux 4.3)
31              Reads or changes the  ambient  capability  set  of  the  calling
32              thread, according to the value of arg2, which must be one of the
33              following:
34
35              PR_CAP_AMBIENT_RAISE
36                     The capability specified in arg3 is added to the  ambient
37                     set.  The specified capability must already be present in
38                     both the  permitted  and  the  inheritable  sets  of  the
39                     process.    This   operation  is  not  permitted  if  the
40                     SECBIT_NO_CAP_AMBIENT_RAISE securebit is set.
41
42              PR_CAP_AMBIENT_LOWER
43                     The capability specified in arg3 is removed from the  am‐
44                     bient set.
45
46              PR_CAP_AMBIENT_IS_SET
47                     The  prctl()  call returns 1 if the capability in arg3 is
48                     in the ambient set and 0 if it is not.
49
50              PR_CAP_AMBIENT_CLEAR_ALL
51                     All capabilities will be removed from  the  ambient  set.
52                     This operation requires setting arg3 to zero.
53
54              In  all of the above operations, arg4 and arg5 must be specified
55              as 0.
56
57              Higher-level interfaces layered on top of the  above  operations
58              are provided in the libcap(3) library in the form of cap_get_am‐
59              bient(3), cap_set_ambient(3), and cap_reset_ambient(3).
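
              The operations above can be wrapped in small helpers.  The
              following is a minimal sketch, not part of libcap(3); the
              function names are hypothetical and error handling is left to
              the caller, who can inspect errno after a return of -1.

```c
#include <linux/capability.h>
#include <sys/prctl.h>

/* Hypothetical helper: 1 if cap is in the ambient set, 0 if not, -1 on error. */
int ambient_is_set(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_IS_SET, cap, 0UL, 0UL);
}

/* Hypothetical helper: add cap to the ambient set.  Fails (returns -1) if
   cap is not in both the permitted and inheritable sets. */
int ambient_raise(unsigned long cap)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap, 0UL, 0UL);
}

/* Hypothetical helper: empty the ambient set; arg3 must be zero. */
int ambient_clear_all(void)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_CLEAR_ALL, 0UL, 0UL, 0UL);
}
```

              A caller might invoke ambient_raise(CAP_NET_BIND_SERVICE)
              before execve(2) so that an unprivileged helper program keeps
              that capability.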
60
61       PR_CAPBSET_READ (since Linux 2.6.25)
62              Return (as the function result) 1 if the capability specified in
63              arg2 is in the calling thread's capability bounding set, or 0 if
64              it is not.  (The capability constants are defined in  <linux/ca‐
65              pability.h>.)   The capability bounding set dictates whether the
66              process can receive the capability through  a  file's  permitted
67              capability set on a subsequent call to execve(2).
68
69              If  the capability specified in arg2 is not valid, then the call
70              fails with the error EINVAL.
71
72              A higher-level interface layered on top  of  this  operation  is
73              provided   in   the   libcap(3)   library   in   the   form   of
74              cap_get_bound(3).
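
              Because an invalid capability number makes the call fail, the
              bounding set can be enumerated by probing increasing values of
              arg2 until the call fails.  The sketch below assumes that valid
              capability numbers are contiguous from 0; the function name
              and the upper bound of 64 are illustrative assumptions.

```c
#include <sys/prctl.h>

/* Hypothetical helper: count the capabilities present in the calling
   thread's bounding set.  Probing stops at the first failing call. */
int count_bounding_caps(void)
{
    int count = 0;

    /* 64 is an arbitrary safety bound chosen for this sketch. */
    for (unsigned long cap = 0; cap < 64; cap++) {
        int r = prctl(PR_CAPBSET_READ, cap, 0UL, 0UL, 0UL);
        if (r < 0)
            break;          /* first invalid capability number */
        count += r;         /* r is 1 if present, 0 if absent */
    }
    return count;
}
```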
75
76       PR_CAPBSET_DROP (since Linux 2.6.25)
77              If the calling thread has the CAP_SETPCAP capability within  its
78              user  namespace, then drop the capability specified by arg2 from
79              the calling thread's capability bounding set.  Any  children  of
80              the calling thread will inherit the newly reduced bounding set.
81
              The call fails with the error: EPERM if the calling thread does
              not have the CAP_SETPCAP capability; EINVAL if arg2 does not
              represent a valid capability; or EINVAL if file capabilities are
              not enabled in the kernel, in which case bounding sets are not
              supported.
86
87              A higher-level interface layered on top  of  this  operation  is
88              provided   in   the   libcap(3)   library   in   the   form   of
89              cap_drop_bound(3).
90
91       PR_SET_CHILD_SUBREAPER (since Linux 3.4)
92              If arg2 is nonzero, set the "child subreaper" attribute  of  the
93              calling process; if arg2 is zero, unset the attribute.
94
95              A subreaper fulfills the role of init(1) for its descendant pro‐
96              cesses.  When a process becomes orphaned  (i.e.,  its  immediate
97              parent  terminates), then that process will be reparented to the
98              nearest still living ancestor subreaper.  Subsequently, calls to
99              getppid(2)  in  the  orphaned process will now return the PID of
100              the subreaper process, and when the orphan terminates, it is the
101              subreaper process that will receive a SIGCHLD signal and will be
102              able to wait(2) on the process to discover its termination  sta‐
103              tus.
104
105              The  setting of the "child subreaper" attribute is not inherited
106              by children created by fork(2) and  clone(2).   The  setting  is
107              preserved across execve(2).
108
109              Establishing a subreaper process is useful in session management
110              frameworks where a hierarchical group of processes is managed by
111              a  subreaper  process  that needs to be informed when one of the
112              processes—for example, a double-forked  daemon—terminates  (per‐
113              haps  so that it can restart that process).  Some init(1) frame‐
114              works (e.g., systemd(1)) employ a subreaper process for  similar
115              reasons.
116
117       PR_GET_CHILD_SUBREAPER (since Linux 3.4)
118              Return the "child subreaper" setting of the caller, in the loca‐
119              tion pointed to by (int *) arg2.
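
              Note the asymmetry between the two operations: the setter takes
              the value in arg2, while the getter stores the result through a
              pointer.  A minimal sketch with hypothetical helper names:

```c
#include <sys/prctl.h>

/* Hypothetical helper: set (enable != 0) or clear the subreaper attribute. */
int set_child_subreaper(int enable)
{
    return prctl(PR_SET_CHILD_SUBREAPER, enable ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: return the current setting, or -1 on error.
   The kernel writes the value to the int that arg2 points to. */
int get_child_subreaper(void)
{
    int value = 0;

    if (prctl(PR_GET_CHILD_SUBREAPER, (unsigned long) &value,
              0UL, 0UL, 0UL) == -1)
        return -1;
    return value;
}
```

              A supervisor would typically call set_child_subreaper(1) early
              and then collect orphaned descendants with wait(2).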
120
121       PR_SET_DUMPABLE (since Linux 2.3.20)
122              Set the state of  the  "dumpable"  attribute,  which  determines
123              whether core dumps are produced for the calling process upon de‐
124              livery of a signal whose default behavior is to produce  a  core
125              dump.
126
127              Up  to  and  including  Linux  2.6.12,  arg2  must  be  either 0
128              (SUID_DUMP_DISABLE,   process   is   not    dumpable)    or    1
129              (SUID_DUMP_USER, process is dumpable).  Between Linux 2.6.13 and
130              Linux 2.6.17, the value 2 was also permitted, which  caused  any
131              binary  which normally would not be dumped to be dumped readable
132              by root only; for security reasons, this feature  has  been  re‐
133              moved.   (See also the description of /proc/sys/fs/suid_dumpable
134              in proc(5).)
135
136              Normally, the "dumpable" attribute is set to 1.  However, it  is
137              reset  to  the current value contained in the file /proc/sys/fs/
138              suid_dumpable (which by default has the value 0), in the follow‐
139              ing circumstances:
140
141              •  The process's effective user or group ID is changed.
142
143              •  The  process's  filesystem  user  or group ID is changed (see
144                 credentials(7)).
145
146              •  The process executes (execve(2)) a set-user-ID or  set-group-
147                 ID  program,  resulting  in  a change of either the effective
148                 user ID or the effective group ID.
149
150              •  The process executes (execve(2)) a program that has file  ca‐
151                 pabilities  (see  capabilities(7)), but only if the permitted
152                 capabilities gained exceed those already  permitted  for  the
153                 process.
154
              Processes that are not dumpable cannot be attached to via
              ptrace(2) PTRACE_ATTACH; see ptrace(2) for further details.
157
158              If a process is not dumpable, the  ownership  of  files  in  the
159              process's  /proc/pid  directory  is  affected  as  described  in
160              proc(5).
161
162       PR_GET_DUMPABLE (since Linux 2.3.20)
163              Return (as the function result) the current state of the calling
164              process's dumpable attribute.
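
              A process that handles secrets might clear the attribute and
              confirm the change, as in the following sketch.  The helper
              names are hypothetical; the literal values 0 and 1 correspond
              to SUID_DUMP_DISABLE and SUID_DUMP_USER described above.

```c
#include <sys/prctl.h>

/* Hypothetical helper: make the process dumpable (on != 0) or not. */
int set_dumpable(int on)
{
    return prctl(PR_SET_DUMPABLE, on ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: current "dumpable" state as the function result. */
int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);
}
```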
165
166       PR_SET_ENDIAN (since Linux 2.6.18, PowerPC only)
167              Set the endian-ness of the calling process to the value given in
168              arg2, which should  be  one  of  the  following:  PR_ENDIAN_BIG,
169              PR_ENDIAN_LITTLE, or PR_ENDIAN_PPC_LITTLE (PowerPC pseudo little
170              endian).
171
172       PR_GET_ENDIAN (since Linux 2.6.18, PowerPC only)
173              Return the endian-ness of the calling process, in  the  location
174              pointed to by (int *) arg2.
175
176       PR_SET_FP_MODE (since Linux 4.0, only on MIPS)
177              On  the MIPS architecture, user-space code can be built using an
178              ABI which permits linking with code that  has  more  restrictive
179              floating-point  (FP) requirements.  For example, user-space code
180              may be built to target the O32 FPXX ABI  and  linked  with  code
181              built  for either one of the more restrictive FP32 or FP64 ABIs.
182              When more restrictive code is linked in, the overall requirement
183              for  the  process  is to use the more restrictive floating-point
184              mode.
185
186              Because the kernel has no means of knowing in advance which mode
187              the  process  should  be executed in, and because these restric‐
188              tions  can  change  over  the  lifetime  of  the  process,   the
189              PR_SET_FP_MODE  operation  is  provided  to allow control of the
190              floating-point mode from user space.
191
192              The (unsigned int) arg2 argument is a bit  mask  describing  the
193              floating-point mode used:
194
195              PR_FP_MODE_FR
                     When this bit is unset (so-called FR=0 or FR0 mode), the
                     32 floating-point registers are 32 bits wide, and 64-bit
                     registers are represented as a pair of registers (even-
                     and odd-numbered, with the even-numbered register
                     containing the lower 32 bits, and the odd-numbered
                     register containing the higher 32 bits).

                     When this bit is set (on supported hardware), the 32
                     floating-point registers are 64 bits wide (so-called FR=1
                     or FR1 mode).  Note that modern MIPS implementations
                     (MIPS R6 and newer) support FR=1 mode only.

                     Applications that use the O32 FP32 ABI can operate only
                     when this bit is unset (FR=0), or with FRE enabled (see
                     below).  Applications that use the O32 FP64 ABI (and the
                     O32 FP64A ABI, which exists to provide the ability to
                     operate with existing FP32 code; see below) can operate
                     only when this bit is set (FR=1).  Applications that use
                     the O32 FPXX ABI can operate with either FR=0 or FR=1.
216
217              PR_FP_MODE_FRE
218                     Enable emulation of  32-bit  floating-point  mode.   When
219                     this  mode  is enabled, it emulates 32-bit floating-point
220                     operations by raising a reserved-instruction exception on
221                     every instruction that uses 32-bit formats and the kernel
222                     then handles the instruction in software.   (The  problem
223                     lies  in  the discrepancy of handling odd-numbered regis‐
224                     ters which are the high 32 bits of 64-bit registers  with
225                     even  numbers  in FR=0 mode and the lower 32-bit parts of
                     odd-numbered 64-bit registers in FR=1 mode.)  Enabling
                     this bit is necessary when code with the O32 FP32 ABI
                     must operate alongside code with the compatible O32 FPXX
                     or O32 FP64A ABIs (which require FR=1 FPU mode), or when
                     a binary with the FP32 ABI is executed on newer hardware
                     (MIPS R6 onwards) that lacks FR=0 mode support.
233
234                     Note that this mode makes sense only when the FPU  is  in
235                     64-bit mode (FR=1).
236
237                     Note  that the use of emulation inherently has a signifi‐
238                     cant performance hit and should be avoided if possible.
239
240              In the N32/N64 ABI, 64-bit floating-point mode is  always  used,
241              so  FPU emulation is not required and the FPU always operates in
242              FR=1 mode.
243
244              This option is mainly intended for use  by  the  dynamic  linker
245              (ld.so(8)).
246
247              The arguments arg3, arg4, and arg5 are ignored.
248
249       PR_GET_FP_MODE (since Linux 4.0, only on MIPS)
250              Return  (as the function result) the current floating-point mode
251              (see the description of PR_SET_FP_MODE for details).
252
253              On success, the call returns a bit  mask  which  represents  the
254              current floating-point mode.
255
256              The arguments arg2, arg3, arg4, and arg5 are ignored.
257
258       PR_SET_FPEMU (since Linux 2.4.18, 2.5.9, only on ia64)
259              Set   floating-point  emulation  control  bits  to  arg2.   Pass
260              PR_FPEMU_NOPRINT to silently  emulate  floating-point  operation
261              accesses, or PR_FPEMU_SIGFPE to not emulate floating-point oper‐
262              ations and send SIGFPE instead.
263
264       PR_GET_FPEMU (since Linux 2.4.18, 2.5.9, only on ia64)
265              Return floating-point emulation control bits,  in  the  location
266              pointed to by (int *) arg2.
267
268       PR_SET_FPEXC (since Linux 2.4.21, 2.5.32, only on PowerPC)
269              Set    floating-point    exception    mode    to   arg2.    Pass
270              PR_FP_EXC_SW_ENABLE to  use  FPEXC  for  FP  exception  enables,
271              PR_FP_EXC_DIV  for  floating-point divide by zero, PR_FP_EXC_OVF
272              for floating-point overflow,  PR_FP_EXC_UND  for  floating-point
273              underflow,  PR_FP_EXC_RES  for  floating-point  inexact  result,
274              PR_FP_EXC_INV    for    floating-point    invalid     operation,
275              PR_FP_EXC_DISABLED  for FP exceptions disabled, PR_FP_EXC_NONRE‐
276              COV for async nonrecoverable exception mode, PR_FP_EXC_ASYNC for
277              async  recoverable exception mode, PR_FP_EXC_PRECISE for precise
278              exception mode.
279
280       PR_GET_FPEXC (since Linux 2.4.21, 2.5.32, only on PowerPC)
281              Return floating-point exception mode, in the location pointed to
282              by (int *) arg2.
283
284       PR_SET_IO_FLUSHER (since Linux 5.6)
              If a user process is involved in the block layer or filesystem
              I/O path, and can allocate memory while processing I/O requests,
              it must set arg2 to 1.  This puts the process in the IO_FLUSHER
              state, which gives it special treatment so that it can make
              progress when allocating memory.  If arg2 is 0, the process
              clears the IO_FLUSHER state, and the default behavior is used.
292
293              The calling process must have the CAP_SYS_RESOURCE capability.
294
295              arg3, arg4, and arg5 must be zero.
296
297              The IO_FLUSHER state is inherited by a child process created via
298              fork(2) and is preserved across execve(2).
299
              Examples of IO_FLUSHER applications are FUSE daemons, SCSI
              device emulation daemons, and daemons that perform error
              handling, such as multipath path recovery applications.
303
       PR_GET_IO_FLUSHER (since Linux 5.6)
305              Return (as the function result)  the  IO_FLUSHER  state  of  the
306              caller.   A  value  of  1  indicates  that  the caller is in the
307              IO_FLUSHER state; 0 indicates that the  caller  is  not  in  the
308              IO_FLUSHER state.
309
310              The calling process must have the CAP_SYS_RESOURCE capability.
311
312              arg2, arg3, arg4, and arg5 must be zero.
313
314       PR_SET_KEEPCAPS (since Linux 2.2.18)
315              Set  the state of the calling thread's "keep capabilities" flag.
316              The effect of this flag is described in  capabilities(7).   arg2
317              must  be  either  0  (clear  the flag) or 1 (set the flag).  The
318              "keep capabilities" value will be reset to 0 on subsequent calls
319              to execve(2).
320
321       PR_GET_KEEPCAPS (since Linux 2.2.18)
322              Return (as the function result) the current state of the calling
323              thread's "keep capabilities" flag.  See  capabilities(7)  for  a
324              description of this flag.
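
              The flag is usually set immediately before a UID change so that
              permitted capabilities survive the switch.  The following
              sketch shows only the prctl() side; the helper names are
              hypothetical, and the subsequent credential change is omitted.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set (keep != 0) or clear "keep capabilities". */
int set_keepcaps(int keep)
{
    return prctl(PR_SET_KEEPCAPS, keep ? 1UL : 0UL, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: current flag state as the function result. */
int get_keepcaps(void)
{
    return prctl(PR_GET_KEEPCAPS, 0UL, 0UL, 0UL, 0UL);
}
```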
325
326       PR_MCE_KILL (since Linux 2.6.32)
327              Set  the  machine  check  memory  corruption kill policy for the
328              calling thread.  If arg2 is PR_MCE_KILL_CLEAR, clear the  thread
329              memory  corruption  kill policy and use the system-wide default.
330              (The system-wide default is defined by /proc/sys/vm/memory_fail‐
331              ure_early_kill; see proc(5).)  If arg2 is PR_MCE_KILL_SET, use a
332              thread-specific memory corruption kill policy.   In  this  case,
333              arg3    defines    whether    the    policy    is   early   kill
334              (PR_MCE_KILL_EARLY), late kill (PR_MCE_KILL_LATE), or  the  sys‐
335              tem-wide  default  (PR_MCE_KILL_DEFAULT).  Early kill means that
336              the thread receives a SIGBUS signal as soon as  hardware  memory
337              corruption  is  detected inside its address space.  In late kill
338              mode, the process is killed only when it  accesses  a  corrupted
339              page.   See sigaction(2) for more information on the SIGBUS sig‐
340              nal.  The policy is inherited by children.  The remaining unused
341              prctl() arguments must be zero for future compatibility.
342
343       PR_MCE_KILL_GET (since Linux 2.6.32)
344              Return  (as the function result) the current per-process machine
345              check kill policy.  All unused prctl() arguments must be zero.
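
              A thread that wants to react to memory corruption as soon as it
              is detected could request the early kill policy, as sketched
              below.  The helper names are hypothetical.  The sketch assumes
              that PR_MCE_KILL_GET reports the policy using the same
              PR_MCE_KILL_EARLY, PR_MCE_KILL_LATE, and PR_MCE_KILL_DEFAULT
              constants; this page does not spell out the returned values.

```c
#include <sys/prctl.h>

/* Hypothetical helper: request a thread-specific early kill policy. */
int mce_set_early_kill(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0UL, 0UL);
}

/* Hypothetical helper: revert to the system-wide default policy. */
int mce_clear_policy(void)
{
    return prctl(PR_MCE_KILL, PR_MCE_KILL_CLEAR, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: current policy as the function result, -1 on error.
   All unused arguments are passed as zero, as required. */
int mce_get_policy(void)
{
    return prctl(PR_MCE_KILL_GET, 0UL, 0UL, 0UL, 0UL);
}
```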
346
347       PR_SET_MM (since Linux 3.3)
348              Modify certain kernel memory map descriptor fields of the  call‐
349              ing process.  Usually these fields are set by the kernel and dy‐
350              namic loader (see ld.so(8) for more information) and  a  regular
351              application  should  not  use  this feature.  However, there are
352              cases, such as self-modifying programs, where  a  program  might
353              find it useful to change its own memory map.
354
355              The  calling  process must have the CAP_SYS_RESOURCE capability.
356              The value in arg2 is one of the options below, while  arg3  pro‐
357              vides  a  new value for the option.  The arg4 and arg5 arguments
358              must be zero if unused.
359
360              Before Linux 3.10, this feature is available only if the  kernel
361              is built with the CONFIG_CHECKPOINT_RESTORE option enabled.
362
363              PR_SET_MM_START_CODE
364                     Set  the  address  above  which the program text can run.
365                     The corresponding memory area must be readable  and  exe‐
366                     cutable,  but  not writable or shareable (see mprotect(2)
367                     and mmap(2) for more information).
368
369              PR_SET_MM_END_CODE
370                     Set the address below which the  program  text  can  run.
371                     The  corresponding  memory area must be readable and exe‐
372                     cutable, but not writable or shareable.
373
374              PR_SET_MM_START_DATA
375                     Set the address above which initialized and uninitialized
376                     (bss)  data  are  placed.   The corresponding memory area
377                     must be readable and  writable,  but  not  executable  or
378                     shareable.
379
380              PR_SET_MM_END_DATA
381                     Set the address below which initialized and uninitialized
382                     (bss) data are placed.   The  corresponding  memory  area
383                     must  be  readable  and  writable,  but not executable or
384                     shareable.
385
386              PR_SET_MM_START_STACK
387                     Set the start address of the  stack.   The  corresponding
388                     memory area must be readable and writable.
389
390              PR_SET_MM_START_BRK
                     Set the address above which the program heap can be
                     expanded with the brk(2) call.  The address must be
                     greater than the ending address of the current program
                     data segment.  In addition, the combined size of the
                     resulting heap and the data segment can't exceed the
                     RLIMIT_DATA resource limit (see setrlimit(2)).
397
398              PR_SET_MM_BRK
399                     Set the current brk(2) value.  The requirements  for  the
400                     address  are  the same as for the PR_SET_MM_START_BRK op‐
401                     tion.
402
403              The following options are available since Linux 3.5.
404
405              PR_SET_MM_ARG_START
406                     Set the address above which the program command  line  is
407                     placed.
408
409              PR_SET_MM_ARG_END
410                     Set  the  address below which the program command line is
411                     placed.
412
413              PR_SET_MM_ENV_START
414                     Set the address above which the  program  environment  is
415                     placed.
416
417              PR_SET_MM_ENV_END
418                     Set  the  address  below which the program environment is
419                     placed.
420
421                     The    address    passed    with     PR_SET_MM_ARG_START,
422                     PR_SET_MM_ARG_END,        PR_SET_MM_ENV_START,        and
423                     PR_SET_MM_ENV_END should belong to a process stack  area.
424                     Thus,  the  corresponding  memory  area must be readable,
425                     writable, and (depending  on  the  kernel  configuration)
426                     have the MAP_GROWSDOWN attribute set (see mmap(2)).
427
428              PR_SET_MM_AUXV
                     Set a new auxiliary vector.  The arg3 argument should
                     provide the address of the vector.  The arg4 argument is
                     the size of the vector.
432
433              PR_SET_MM_EXE_FILE
                     Supersede the /proc/pid/exe symbolic link with a new one
                     pointing to a new executable file identified by the file
                     descriptor provided in the arg3 argument.  The file
                     descriptor should be obtained with a regular open(2) call.
438
                     To change the symbolic link, one needs to unmap all
                     existing executable memory areas, including those created
                     by the kernel itself (for example, the kernel usually
                     creates at least one executable memory area for the ELF
                     .text section).
444
445                     In Linux 4.9 and earlier, the  PR_SET_MM_EXE_FILE  opera‐
446                     tion  can be performed only once in a process's lifetime;
447                     attempting to perform the operation a second time results
448                     in  the  error  EPERM.  This restriction was enforced for
449                     security reasons that were subsequently deemed  specious,
450                     and  the  restriction  was  removed in Linux 4.10 because
451                     some user-space applications needed to perform this oper‐
452                     ation more than once.
453
454              The following options are available since Linux 3.18.
455
456              PR_SET_MM_MAP
457                     Provides  one-shot access to all the addresses by passing
458                     in a struct prctl_mm_map (as defined in <linux/prctl.h>).
459                     The arg4 argument should provide the size of the struct.
460
461                     This  feature  is  available  only if the kernel is built
462                     with the CONFIG_CHECKPOINT_RESTORE option enabled.
463
464              PR_SET_MM_MAP_SIZE
465                     Returns the size of the struct  prctl_mm_map  the  kernel
466                     expects.   This  allows  user  space to find a compatible
467                     struct.  The arg4 argument should be a pointer to an  un‐
468                     signed int.
469
470                     This  feature  is  available  only if the kernel is built
471                     with the CONFIG_CHECKPOINT_RESTORE option enabled.
472
473       PR_SET_VMA (since Linux 5.17)
474              Sets an attribute specified in arg2  for  virtual  memory  areas
475              starting  from  the  address  specified in arg3 and spanning the
476              size specified in arg4.  arg5 specifies the value of the  attri‐
477              bute to be set.
478
479              Note  that assigning an attribute to a virtual memory area might
480              prevent it from being merged with adjacent virtual memory  areas
481              due to the difference in that attribute's value.
482
483              Currently, arg2 must be one of:
484
485              PR_SET_VMA_ANON_NAME
486                     Set  a  name  for  anonymous  virtual memory areas.  arg5
487                     should be a pointer to a null-terminated string  contain‐
488                     ing the name.  The name length including null byte cannot
                     exceed 80 bytes.  If arg5 is NULL, the name of the
                     appropriate anonymous virtual memory areas will be reset.
                     The name can contain only printable ASCII characters
                     (including space), except '[', ']', '\', '$', and '`'.
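
                     An allocator might label its mappings so that they can
                     be told apart when inspecting the process's memory map.
                     The following is a minimal sketch: the function name is
                     hypothetical, the fallback definitions are for older
                     headers that lack the constants, and a failure to set
                     the name (for example, on a kernel without this
                     feature) is deliberately ignored.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/prctl.h>

/* Fallbacks for headers that predate Linux 5.17. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/* Hypothetical helper: map len bytes of anonymous memory and try to name
   the mapping.  Returns NULL only if the mapping itself fails. */
void *map_named(size_t len, const char *name)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Best effort: arg3 = start address, arg4 = size, arg5 = name. */
    (void) prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                 (unsigned long) p, (unsigned long) len,
                 (unsigned long) name);
    return p;
}
```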
493
494       PR_MPX_ENABLE_MANAGEMENT,  PR_MPX_DISABLE_MANAGEMENT (since Linux 3.19,
495       removed in Linux 5.4; only on x86)
496              Enable or disable kernel management of Memory Protection  eXten‐
497              sions (MPX) bounds tables.  The arg2, arg3, arg4, and arg5 argu‐
498              ments must be zero.
499
500              MPX is  a  hardware-assisted  mechanism  for  performing  bounds
501              checking on pointers.  It consists of a set of registers storing
502              bounds information and a set  of  special  instruction  prefixes
503              that  tell the CPU on which instructions it should do bounds en‐
504              forcement.  There is a limited number  of  these  registers  and
505              when there are more pointers than registers, their contents must
506              be "spilled" into a set of  tables.   These  tables  are  called
507              "bounds  tables"  and the MPX prctl() operations control whether
508              the kernel manages their allocation and freeing.
509
510              When management is enabled, the kernel will take over allocation
511              and  freeing of the bounds tables.  It does this by trapping the
512              #BR exceptions that result at first use of missing bounds tables
513              and  instead of delivering the exception to user space, it allo‐
514              cates the table and populates the bounds directory with the  lo‐
515              cation  of the new table.  For freeing, the kernel checks to see
516              if bounds tables are present for memory which is not  allocated,
517              and frees them if so.
518
519              Before  enabling  MPX management using PR_MPX_ENABLE_MANAGEMENT,
520              the application must first have allocated  a  user-space  buffer
521              for  the bounds directory and placed the location of that direc‐
522              tory in the bndcfgu register.
523
524              These calls fail if the CPU or  kernel  does  not  support  MPX.
525              Kernel  support  for MPX is enabled via the CONFIG_X86_INTEL_MPX
              configuration option.  You can check whether the CPU supports
              MPX by looking for the mpx CPUID bit, for example with the
              following command:
529
530                  cat /proc/cpuinfo | grep ' mpx '
531
532              A thread may not switch in or out of long  (64-bit)  mode  while
533              MPX is enabled.
534
535              All threads in a process are affected by these calls.
536
537              The  child  of  a  fork(2) inherits the state of MPX management.
538              During execve(2), MPX management is  reset  to  a  state  as  if
539              PR_MPX_DISABLE_MANAGEMENT had been called.
540
541              For further information on Intel MPX, see the kernel source file
542              Documentation/x86/intel_mpx.txt.
543
544              Due to a lack of toolchain support, PR_MPX_ENABLE_MANAGEMENT and
545              PR_MPX_DISABLE_MANAGEMENT  are  not  supported  in Linux 5.4 and
546              later.
547
548       PR_SET_NAME (since Linux 2.6.9)
549              Set the name of the calling thread, using the value in the loca‐
550              tion  pointed  to  by  (char  *) arg2.  The name can be up to 16
551              bytes long, including the terminating null byte.  (If the length
552              of  the  string, including the terminating null byte, exceeds 16
553              bytes, the string is silently truncated.)  This is the same  at‐
554              tribute  that can be set via pthread_setname_np(3) and retrieved
555              using pthread_getname_np(3).  The attribute is likewise accessi‐
556              ble via /proc/self/task/tid/comm (see proc(5)), where tid is the
557              thread ID of the calling thread, as returned by gettid(2).
558
559       PR_GET_NAME (since Linux 2.6.11)
560              Return the name of the calling thread, in the buffer pointed  to
561              by  (char  *)  arg2.  The buffer should allow space for up to 16
562              bytes; the returned string will be null-terminated.
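
              The 16-byte limit, and the silent truncation described under
              PR_SET_NAME, can be observed with a short round trip.  The
              helper names in this sketch are hypothetical.

```c
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical helper: nonzero if name fits without truncation
   (at most 15 characters plus the terminating null byte). */
int name_fits(const char *name)
{
    return strlen(name) < 16;
}

/* Hypothetical helper: set the calling thread's name. */
int set_thread_name(const char *name)
{
    return prctl(PR_SET_NAME, (unsigned long) name, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: fetch the calling thread's name into a
   16-byte buffer; the result is null-terminated. */
int get_thread_name(char buf[16])
{
    return prctl(PR_GET_NAME, (unsigned long) buf, 0UL, 0UL, 0UL);
}
```

              Setting a 32-character name and reading it back yields only the
              first 15 characters.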
563
564       PR_SET_NO_NEW_PRIVS (since Linux 3.5)
565              Set the calling thread's no_new_privs attribute to the value  in
566              arg2.   With  no_new_privs  set  to 1, execve(2) promises not to
567              grant privileges to do anything that could not  have  been  done
568              without the execve(2) call (for example, rendering the set-user-
569              ID and set-group-ID mode bits, and file  capabilities  non-func‐
570              tional).   Once set, the no_new_privs attribute cannot be unset.
571              The setting of this attribute is inherited by  children  created
572              by fork(2) and clone(2), and preserved across execve(2).
573
574              Since Linux 4.10, the value of a thread's no_new_privs attribute
575              can be viewed via the NoNewPrivs field in  the  /proc/pid/status
576              file.
577
578              For  more  information,  see  the  kernel source file Documenta‐
579              tion/userspace-api/no_new_privs.rst        (or        Documenta‐
580              tion/prctl/no_new_privs.txt  before  Linux 4.13).  See also sec‐
581              comp(2).
582
583       PR_GET_NO_NEW_PRIVS (since Linux 3.5)
584              Return (as the function result) the value  of  the  no_new_privs
585              attribute  for  the  calling thread.  A value of 0 indicates the
586              regular execve(2) behavior.  A value of  1  indicates  execve(2)
587              will operate in the privilege-restricting mode described above.
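
              As an illustration, the sketch below  sets  the  attribute  and
              reads  it  back.   The  helper  name  lock_down_privileges() is
              hypothetical.  Keep in mind that the change cannot be undone for
              the calling thread.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set no_new_privs for the calling thread and
 * return the value read back with PR_GET_NO_NEW_PRIVS (1 on success),
 * or -1 on error.  The unused arguments are passed as 0. */
int lock_down_privileges(void)
{
    if (prctl(PR_SET_NO_NEW_PRIVS, 1UL, 0UL, 0UL, 0UL) == -1)
        return -1;
    return prctl(PR_GET_NO_NEW_PRIVS, 0UL, 0UL, 0UL, 0UL);
}
```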
588
589       PR_PAC_RESET_KEYS (since Linux 5.0, only on arm64)
590              Securely reset the thread's pointer authentication keys to fresh
591              random values generated by the kernel.
592
593              The set of keys to be reset is specified by arg2, which must  be
594              a logical OR of zero or more of the following:
595
596              PR_PAC_APIAKEY
597                     instruction authentication key A
598
599              PR_PAC_APIBKEY
600                     instruction authentication key B
601
602              PR_PAC_APDAKEY
603                     data authentication key A
604
605              PR_PAC_APDBKEY
606                     data authentication key B
607
608              PR_PAC_APGAKEY
609                     generic authentication “A” key.
610
611                     (Yes folks, there really is no generic B key.)
612
613              As a special case, if arg2 is zero, then all the keys are reset.
614              Since new keys could be added in future, this is the recommended
615              way  to  completely  wipe  the existing keys when establishing a
616              clean execution context.  Note that there  is  no  need  to  use
617              PR_PAC_RESET_KEYS  in  preparation  for calling execve(2), since
618              execve(2) resets all the pointer authentication keys.
619
620              The remaining arguments arg3, arg4, and arg5 must all be zero.
621
622              If the arguments are invalid, and in particular if arg2 contains
623              set  bits  that are unrecognized or that correspond to a key not
624              available on this platform, then the call fails with error  EIN‐
625              VAL.
626
627              Warning: Because the compiler or run-time environment may be us‐
628              ing some or all of the keys, a successful PR_PAC_RESET_KEYS  may
629              crash  the  calling process.  The conditions for using it safely
630              are complex and system-dependent.  Don't use it unless you  know
631              what you are doing.
632
633              For  more  information,  see  the  kernel source file Documenta‐
634              tion/arm64/pointer-authentication.rst       (or       Documenta‐
635              tion/arm64/pointer-authentication.txt before Linux 5.3).
636
637       PR_SET_PDEATHSIG (since Linux 2.1.57)
638              Set  the parent-death signal of the calling process to arg2 (ei‐
639              ther a signal value in the range [1, NSIG - 1], or 0 to  clear).
640              This  is  the  signal that the calling process will get when its
641              parent dies.
642
643              Warning: the "parent" in this  case  is  considered  to  be  the
644              thread  that  created  this process.  In other words, the signal
645              will be sent when that  thread  terminates  (via,  for  example,
646              pthread_exit(3)),  rather  than  after all of the threads in the
647              parent process terminate.
648
649              The parent-death signal is sent upon subsequent  termination  of
650              the  parent  thread  and also upon termination of each subreaper
651              process (see the description of PR_SET_CHILD_SUBREAPER above) to
652              which  the  caller  is  subsequently  reparented.  If the parent
653              thread and all ancestor subreapers have  already  terminated  by
654              the time of the PR_SET_PDEATHSIG operation, then no parent-death
655              signal is sent to the caller.
656
657              The parent-death signal is process-directed (see signal(7)) and,
658              if  the  child installs a handler using the sigaction(2) SA_SIG‐
659              INFO flag, the si_pid field of the  siginfo_t  argument  of  the
660              handler contains the PID of the terminating parent process.
661
662              The  parent-death  signal  setting is cleared for the child of a
663              fork(2).  It is also (since Linux 2.4.36 / 2.6.23) cleared  when
664              executing a set-user-ID or set-group-ID binary, or a binary that
665              has associated capabilities  (see  capabilities(7));  otherwise,
666              this value is preserved across execve(2).  The parent-death sig‐
667              nal setting is also cleared upon changes to any of the following
668              thread  credentials:  effective  user  ID,  effective  group ID,
669              filesystem user ID, or filesystem group ID.
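
              A common pattern, sketched here under  the  assumption  that  no
              subreaper  is involved, is for a child to arm the signal immedi‐
              ately after fork(2) and then check that the  parent  has  not
              already  gone away.  The helper names die_with_parent() and cur‐
              rent_parent_death_signal() are hypothetical.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, meant to be called in a child right after fork(2).
 * expected_parent is the PID the parent obtained from getpid() before
 * forking.  Returns 0 if the signal is armed, 1 if the parent had
 * already changed (so no signal will arrive for it), or -1 on error. */
int die_with_parent(pid_t expected_parent, int sig)
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long) sig, 0UL, 0UL, 0UL) == -1)
        return -1;
    /* Close the race: the parent may have died before the prctl() call. */
    if (getppid() != expected_parent)
        return 1;
    return 0;
}

/* Read the current setting back; returns the signal number or -1. */
int current_parent_death_signal(void)
{
    int sig = 0;

    if (prctl(PR_GET_PDEATHSIG, (unsigned long) &sig, 0UL, 0UL, 0UL) == -1)
        return -1;
    return sig;
}
```

              A child would typically call  die_with_parent(parent_pid,  SIG‐
              TERM) and exit at once if the result is 1.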
670
671       PR_GET_PDEATHSIG (since Linux 2.3.15)
672              Return the current value of the parent process death signal,  in
673              the location pointed to by (int *) arg2.
674
675       PR_SET_PTRACER (since Linux 3.4)
676              This is meaningful only when the Yama LSM is enabled and in mode
677              1   ("restricted    ptrace",    visible    via    /proc/sys/ker‐
678              nel/yama/ptrace_scope).   When  a "ptracer process ID" is passed
679              in arg2, the caller is declaring that the  ptracer  process  can
680              ptrace(2) the calling process as if it were a direct process an‐
681              cestor.  Each PR_SET_PTRACER  operation  replaces  the  previous
682              "ptracer process ID".  Employing PR_SET_PTRACER with arg2 set to
683              0  clears  the  caller's  "ptracer  process  ID".   If  arg2  is
684              PR_SET_PTRACER_ANY,  the  ptrace restrictions introduced by Yama
685              are effectively disabled for the calling process.
686
687              For further information, see the kernel source  file  Documenta‐
688              tion/admin-guide/LSM/Yama.rst       (or      Documentation/secu‐
689              rity/Yama.txt before Linux 4.13).
690
691       PR_SET_SECCOMP (since Linux 2.6.23)
692              Set the secure computing (seccomp) mode for the calling  thread,
693              to limit the available system calls.  The more recent seccomp(2)
694              system  call  provides  a  superset  of  the  functionality   of
695              PR_SET_SECCOMP,  and is the preferred interface for new applica‐
696              tions.
697
698              The seccomp mode is selected via arg2.  (The  seccomp  constants
699              are  defined in <linux/seccomp.h>.)  The following values can be
700              specified:
701
702              SECCOMP_MODE_STRICT (since Linux 2.6.23)
703                     See the description of  SECCOMP_SET_MODE_STRICT  in  sec‐
704                     comp(2).
705
706                     This operation is available only if the kernel is config‐
707                     ured with CONFIG_SECCOMP enabled.
708
709              SECCOMP_MODE_FILTER (since Linux 3.5)
710                     The allowed system calls are defined by a  pointer  to  a
711                     Berkeley  Packet Filter passed in arg3.  This argument is
712                     a pointer to struct sock_fprog; it  can  be  designed  to
713                     filter  arbitrary system calls and system call arguments.
714                     See the description of  SECCOMP_SET_MODE_FILTER  in  sec‐
715                     comp(2).
716
717                     This operation is available only if the kernel is config‐
718                     ured with CONFIG_SECCOMP_FILTER enabled.
719
720              For further details on seccomp filtering, see seccomp(2).
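
              The following sketch shows one way a  small  filter  might  be
              installed  with  SECCOMP_MODE_FILTER.  It makes a single system
              call fail with EPERM and allows everything else.  The helper name
              deny_syscall_with_eperm() is hypothetical, and the filter delib‐
              erately omits the architecture check that a production  filter
              should  perform  first  (see  seccomp(2)).  The sketch also sets
              no_new_privs, which seccomp(2) requires of unprivileged callers.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: make system call number nr fail with EPERM for
 * the calling thread.  Returns 0 on success, or -1 with errno set.
 * Assumption: no architecture check is done, to keep the sketch short. */
int deny_syscall_with_eperm(int nr)
{
    struct sock_filter filter[] = {
        /* Load the system call number from struct seccomp_data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
        /* If it matches nr, run the next instruction; else skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (unsigned int) nr, 0, 1),
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(filter) / sizeof(filter[0]),
        .filter = filter,
    };

    if (prctl(PR_SET_NO_NEW_PRIVS, 1UL, 0UL, 0UL, 0UL) == -1)
        return -1;
    return prctl(PR_SET_SECCOMP, (unsigned long) SECCOMP_MODE_FILTER,
                 (unsigned long) &prog, 0UL, 0UL);
}
```

              Once installed, the filter cannot be removed, so a sketch  like
              this is best tried in a throwaway process.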
721
722       PR_GET_SECCOMP (since Linux 2.6.23)
723              Return (as the function result) the secure computing mode of the
724              calling  thread.  If the caller is not in secure computing mode,
725              this operation returns 0; if the caller is in strict secure com‐
726              puting  mode,  then the prctl() call will cause a SIGKILL signal
727              to be sent to the process.  If the caller is in filter mode, and
728              this  system  call is allowed by the seccomp filters, it returns
729              2; otherwise, the process is killed with a SIGKILL signal.
730
731              This operation is available only if  the  kernel  is  configured
732              with CONFIG_SECCOMP enabled.
733
734              Since  Linux 3.8, the Seccomp field of the /proc/pid/status file
735              provides a method of obtaining the same information, without the
736              risk that the process is killed; see proc(5).
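
              A minimal sketch of the /proc approach follows.  The helper name
              seccomp_mode_from_proc()  is  hypothetical.   Note  that it reads
              /proc/self/status, which describes the main thread; in a  multi‐
              threaded program a different path would be needed.

```c
#include <stdio.h>

/* Hypothetical helper: return the Seccomp field of /proc/self/status
 * (0, 1, or 2), or -1 if the file cannot be read or lacks the field. */
int seccomp_mode_from_proc(void)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    int mode = -1;

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Only the "Seccomp:" line matches this format. */
        if (sscanf(line, "Seccomp: %d", &mode) == 1)
            break;
    }
    fclose(fp);
    return mode;
}
```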
737
738       PR_SET_SECUREBITS (since Linux 2.6.26)
739              Set  the  "securebits"  flags of the calling thread to the value
740              supplied in arg2.  See capabilities(7).
741
742       PR_GET_SECUREBITS (since Linux 2.6.26)
743              Return (as the function result) the "securebits"  flags  of  the
744              calling thread.  See capabilities(7).
745
746       PR_GET_SPECULATION_CTRL (since Linux 4.17)
747              Return  (as  the  function  result) the state of the speculation
748              misfeature specified in arg2.  This must be PR_SPEC_STORE_BYPASS
749              or, since Linux 4.20, PR_SPEC_INDIRECT_BRANCH (otherwise the call
750              fails with the error ENODEV).
751
752              The return value uses bits 0-3 with the following meaning:
753
754              PR_SPEC_PRCTL
755                     Mitigation can be controlled per thread by  PR_SET_SPECU‐
756                     LATION_CTRL.
757
758              PR_SPEC_ENABLE
759                     The  speculation  feature  is enabled, mitigation is dis‐
760                     abled.
761
762              PR_SPEC_DISABLE
763                     The speculation feature is disabled,  mitigation  is  en‐
764                     abled.
765
766              PR_SPEC_FORCE_DISABLE
767                     Same as PR_SPEC_DISABLE but cannot be undone.
768
769              PR_SPEC_DISABLE_NOEXEC (since Linux 5.1)
770                     Same as PR_SPEC_DISABLE, but the state will be cleared on
771                     execve(2).
772
773              If all bits are 0, then the CPU is not affected by the  specula‐
774              tion misfeature.
775
776              If  PR_SPEC_PRCTL is set, then per-thread control of the mitiga‐
777              tion is available.  If not set, prctl() for the speculation mis‐
778              feature will fail.
779
780              The  arg3, arg4, and arg5 arguments must be specified as 0; oth‐
781              erwise the call fails with the error EINVAL.
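
              The return value can be decoded roughly as sketched below.   The
              helper  names  spec_state_name()  and  query_store_bypass() are
              hypothetical, and the strings are chosen only for illustration.

```c
#include <sys/prctl.h>

/* Hypothetical helper: map a PR_GET_SPECULATION_CTRL result to text. */
const char *spec_state_name(int state)
{
    unsigned long bits = (unsigned long) state;

    if (state < 0)
        return "error";                 /* prctl() failed; see errno */
    if (state == 0)
        return "not affected";          /* all bits are 0 */
    if (bits & PR_SPEC_FORCE_DISABLE)
        return "force-disabled";        /* mitigated, cannot be undone */
#ifdef PR_SPEC_DISABLE_NOEXEC
    if (bits & PR_SPEC_DISABLE_NOEXEC)
        return "disabled until execve"; /* mitigated, cleared on execve(2) */
#endif
    if (bits & PR_SPEC_DISABLE)
        return "disabled";              /* mitigation is enabled */
    if (bits & PR_SPEC_ENABLE)
        return "enabled";               /* mitigation is disabled */
    return "unknown";
}

/* Query the speculative store bypass state of the calling thread. */
int query_store_bypass(void)
{
    return prctl(PR_GET_SPECULATION_CTRL,
                 (unsigned long) PR_SPEC_STORE_BYPASS, 0UL, 0UL, 0UL);
}
```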
782
783       PR_SET_SPECULATION_CTRL (since Linux 4.17)
784              Sets the state of the speculation misfeature specified in  arg2.
785              The speculation-misfeature settings are per-thread attributes.
786
787              Currently, arg2 must be one of:
788
789              PR_SPEC_STORE_BYPASS
790                     Set the state of the speculative store bypass misfeature.
791
792              PR_SPEC_INDIRECT_BRANCH (since Linux 4.20)
793                     Set  the state of the indirect branch speculation misfea‐
794                     ture.
795
796              If arg2 does not have one of the above  values,  then  the  call
797              fails with the error ENODEV.
798
799              The arg3 argument is used to hand in the control value, which is
800              one of the following:
801
802              PR_SPEC_ENABLE
803                     The speculation feature is enabled,  mitigation  is  dis‐
804                     abled.
805
806              PR_SPEC_DISABLE
807                     The  speculation  feature  is disabled, mitigation is en‐
808                     abled.
809
810              PR_SPEC_FORCE_DISABLE
811                     Same as PR_SPEC_DISABLE, but cannot be undone.  A  subse‐
812                     quent PR_SET_SPECULATION_CTRL call with PR_SPEC_ENABLE and
813                     the same arg2 will fail with the error EPERM.
814
815              PR_SPEC_DISABLE_NOEXEC (since Linux 5.1)
816                     Same as PR_SPEC_DISABLE, but the state will be cleared on
817                     execve(2).   Currently  only  supported for arg2 equal to
818                     PR_SPEC_STORE_BYPASS.
819
820              Any unsupported value in arg3 will result in  the  call  failing
821              with the error ERANGE.
822
823              The  arg4  and  arg5 arguments must be specified as 0; otherwise
824              the call fails with the error EINVAL.
825
826              The  speculation  feature  can  also  be   controlled   by   the
827              spec_store_bypass_disable  boot  parameter.   This parameter may
828              enforce a read-only policy which will result in the prctl() call
829              failing with the error ENXIO.  For further details, see the ker‐
830              nel source file Documentation/admin-guide/kernel-parameters.txt.
831
832       PR_SVE_SET_VL (since Linux 4.15, only on arm64)
833              Configure the thread's SVE vector length, as specified by  (int)
834              arg2.  Arguments arg3, arg4, and arg5 are ignored.
835
836              The bits of arg2 corresponding to PR_SVE_VL_LEN_MASK must be set
837              to the desired vector length in bytes.  This is  interpreted  as
838              an  upper  bound:  the kernel will select the greatest available
839              vector length that does not exceed the value specified.  In par‐
840              ticular,  specifying  SVE_VL_MAX (defined in <asm/sigcontext.h>)
841              for the PR_SVE_VL_LEN_MASK bits requests the  maximum  supported
842              vector length.
843
844              In  addition,  the  other bits of arg2 must be set to one of the
845              following combinations of flags:
846
847              0      Perform the change immediately.  At the next execve(2) in
848                     the  thread, the vector length will be reset to the value
849                     configured in /proc/sys/abi/sve_default_vector_length.
850
851              PR_SVE_VL_INHERIT
852                     Perform the  change  immediately.   Subsequent  execve(2)
853                     calls will preserve the new vector length.
854
855              PR_SVE_SET_VL_ONEXEC
856                     Defer the change, so that it is performed at the next ex‐
857                     ecve(2) in the thread.  Further execve(2) calls will  re‐
858                     set   the  vector  length  to  the  value  configured  in
859                     /proc/sys/abi/sve_default_vector_length.
860
861              PR_SVE_SET_VL_ONEXEC | PR_SVE_VL_INHERIT
862                     Defer the change, so that it is performed at the next ex‐
863                     ecve(2) in the thread.  Further execve(2) calls will pre‐
864                     serve the new vector length.
865
866              In all cases, any previously pending  deferred  change  is  can‐
867              celed.
868
869              The  call fails with error EINVAL if SVE is not supported on the
870              platform, if arg2 is unrecognized or invalid, or  the  value  in
871              the  bits of arg2 corresponding to PR_SVE_VL_LEN_MASK is outside
872              the range SVE_VL_MIN..SVE_VL_MAX or is not a multiple of 16.
873
874              On success, a nonnegative value is returned that  describes  the
875              selected configuration.  If PR_SVE_SET_VL_ONEXEC was included in
876              arg2, then the configuration described by the return value  will
877              take effect at the next execve(2).  Otherwise, the configuration
878              is already in effect when the PR_SVE_SET_VL  call  returns.   In
879              either  case, the value is encoded in the same way as the return
880              value of PR_SVE_GET_VL.  Note that there is no explicit flag  in
881              the return value corresponding to PR_SVE_SET_VL_ONEXEC.
882
883              The configuration (including any pending deferred change) is in‐
884              herited across fork(2) and clone(2).
885
886              For more information, see  the  kernel  source  file  Documenta‐
887              tion/arm64/sve.rst  (or Documentation/arm64/sve.txt before Linux
888              5.3).
889
890              Warning: Because the compiler or run-time environment may be us‐
891              ing  SVE,  using this call without the PR_SVE_SET_VL_ONEXEC flag
892              may crash the calling process.   The  conditions  for  using  it
893              safely  are  complex  and system-dependent.  Don't use it unless
894              you really know what you are doing.
895
896       PR_SVE_GET_VL (since Linux 4.15, only on arm64)
897              Get the thread's current SVE vector length configuration.
898
899              Arguments arg2, arg3, arg4, and arg5 are ignored.
900
901              Provided that the kernel and platform support SVE,  this  opera‐
902              tion  always  succeeds,  returning  a nonnegative value that de‐
903              scribes the current configuration.  The  bits  corresponding  to
904              PR_SVE_VL_LEN_MASK   contain  the  currently  configured  vector
905              length in bytes.  The bit corresponding to PR_SVE_VL_INHERIT in‐
906              dicates  whether  the vector length will be inherited across ex‐
907              ecve(2).
908
909              Note that there is no way to determine whether there is a  pend‐
910              ing vector length change that has not yet taken effect.
911
912              For  more  information,  see  the  kernel source file Documenta‐
913              tion/arm64/sve.rst (or Documentation/arm64/sve.txt before  Linux
914              5.3).
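
              The encoding shared by PR_SVE_SET_VL and  PR_SVE_GET_VL  can  be
              sketched with plain bit operations, as below.  The helper names
              are hypothetical.  The sketch checks only the  multiple-of-16
              rule;  the  kernel  additionally  enforces  the range SVE_VL_MIN
              to SVE_VL_MAX.

```c
#include <sys/prctl.h>

/* Hypothetical helper: build arg2 for PR_SVE_SET_VL.  Returns 0 if
 * vl_bytes is not a nonzero multiple of 16 that fits the length field. */
unsigned long sve_set_vl_arg(unsigned long vl_bytes, int inherit, int on_exec)
{
    unsigned long arg = vl_bytes;

    if (vl_bytes == 0 || vl_bytes % 16 != 0 ||
        (vl_bytes & ~(unsigned long) PR_SVE_VL_LEN_MASK) != 0)
        return 0;
    if (inherit)
        arg |= (unsigned long) PR_SVE_VL_INHERIT;
    if (on_exec)
        arg |= (unsigned long) PR_SVE_SET_VL_ONEXEC;
    return arg;
}

/* Extract the vector length in bytes from a nonnegative return value. */
unsigned int sve_vl_bytes(int ret)
{
    return (unsigned int) ret & PR_SVE_VL_LEN_MASK;
}

/* Report whether the length will be inherited across execve(2). */
int sve_vl_inherited(int ret)
{
    return (ret & PR_SVE_VL_INHERIT) != 0;
}
```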
915
916       PR_SET_SYSCALL_USER_DISPATCH (since Linux 5.11, x86 only)
917              Configure  the  Syscall  User Dispatch mechanism for the calling
918              thread.  This mechanism allows an application to selectively in‐
919              tercept  system calls so that they can be handled within the ap‐
920              plication itself.  Interception takes the form of  a  thread-di‐
921              rected  SIGSYS  signal  that  is delivered to the thread when it
922              makes a system call.  If intercepted, the system call is not ex‐
923              ecuted by the kernel.
924
925              To  enable  this  mechanism,  arg2  should be set to PR_SYS_DIS‐
926              PATCH_ON.  Once enabled, further system  calls  will  be  selec‐
927              tively  intercepted, depending on a control variable provided by
928              user space.  In this case, arg3 and arg4  respectively  identify
929              the  offset  and  length of a single contiguous memory region in
930              the process address space from where system calls are always al‐
931              lowed to be executed, regardless of the control variable.  (Typ‐
932              ically, this area would include the area  of  memory  containing
933              the C library.)
934
935              arg5  points  to  a char-sized variable that is a fast switch to
936              allow/block system call execution without the overhead of  doing
937              another system call to re-configure Syscall User Dispatch.  This
938              control variable can  either  be  set  to  SYSCALL_DISPATCH_FIL‐
939              TER_BLOCK   to   block   system   calls  from  executing  or  to
940              SYSCALL_DISPATCH_FILTER_ALLOW to temporarily allow  them  to  be
941              executed.   This  value is checked by the kernel on every system
942              call entry, and any unexpected value will raise  an  uncatchable
943              SIGSYS at that time, killing the application.
944
945              When a system call is intercepted, the kernel sends a thread-di‐
946              rected SIGSYS signal to the triggering thread.   Various  fields
947              will  be set in the siginfo_t structure (see sigaction(2)) asso‐
948              ciated with the signal:
949
950              •  si_signo will contain SIGSYS.
951
952              •  si_call_addr will show the address of  the  system  call  in‐
953                 struction.
954
955              •  si_syscall  and  si_arch  will indicate which system call was
956                 attempted.
957
958              •  si_code will contain SYS_USER_DISPATCH.
959
960              •  si_errno will be set to 0.
961
962              The program counter will be as though the system  call  happened
963              (i.e., the program counter will not point to the system call in‐
964              struction).
965
966              When the signal handler returns to the kernel, the  system  call
967              completes immediately and returns to the calling thread, without
968              actually being executed.  If necessary (i.e., when emulating the
969              system call in user space), the signal handler should  set  the
970              system call return value to a sane value, by modifying the  reg‐
971              ister context stored in the ucontext argument of the signal han‐
972              dler.  See sigaction(2),  sigreturn(2),  and  getcontext(3)  for
973              more information.
974
975              If  arg2 is set to PR_SYS_DISPATCH_OFF, Syscall User Dispatch is
976              disabled for that thread.  The remaining arguments must  be  set
977              to 0.
978
979              The  setting  is  not preserved across fork(2), clone(2), or ex‐
980              ecve(2).
981
982              For more information, see  the  kernel  source  file  Documenta‐
983              tion/admin-guide/syscall-user-dispatch.rst.
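
              The enable and disable steps might look like the sketch  below.
              The  helper  names  and  the variable dispatch_selector are hy‐
              pothetical.  The sketch leaves the control variable  at  SYSCALL_DIS‐
              PATCH_FILTER_ALLOW, so no system call is intercepted; a real user
              would install a SIGSYS handler before  storing  SYSCALL_DISPATCH_FIL‐
              TER_BLOCK.   The  fallback  definitions are an assumption for build
              environments whose headers predate Linux 5.11.

```c
#include <sys/prctl.h>

/* Fallback definitions for older headers; the values are those found
 * in <linux/prctl.h> since Linux 5.11. */
#ifndef PR_SET_SYSCALL_USER_DISPATCH
#define PR_SET_SYSCALL_USER_DISPATCH 59
#define PR_SYS_DISPATCH_OFF 0
#define PR_SYS_DISPATCH_ON 1
#define SYSCALL_DISPATCH_FILTER_ALLOW 0
#define SYSCALL_DISPATCH_FILTER_BLOCK 1
#endif

/* The control variable; the kernel reads it on every system call entry. */
static volatile char dispatch_selector = SYSCALL_DISPATCH_FILTER_ALLOW;

/* Enable Syscall User Dispatch with an empty always-allowed region
 * (arg3 = 0, arg4 = 0).  Returns 0 on success, or -1 with errno set. */
int syscall_dispatch_enable(void)
{
    dispatch_selector = SYSCALL_DISPATCH_FILTER_ALLOW;
    return prctl(PR_SET_SYSCALL_USER_DISPATCH,
                 (unsigned long) PR_SYS_DISPATCH_ON, 0UL, 0UL,
                 (unsigned long) &dispatch_selector);
}

/* Disable the mechanism; the remaining arguments must be 0. */
int syscall_dispatch_disable(void)
{
    return prctl(PR_SET_SYSCALL_USER_DISPATCH,
                 (unsigned long) PR_SYS_DISPATCH_OFF, 0UL, 0UL, 0UL);
}
```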
984
985       PR_SET_TAGGED_ADDR_CTRL (since Linux 5.4, only on arm64)
986              Controls  support for passing tagged user-space addresses to the
987              kernel (i.e., addresses where bits 56-63 are not all zero).
988
989              The level of support is selected by arg2, which can  be  one  of
990              the following:
991
992              0      Addresses that are passed for the purpose of being deref‐
993                     erenced by the kernel must be untagged.
994
995              PR_TAGGED_ADDR_ENABLE
996                     Addresses that are passed for the purpose of being deref‐
997                     erenced  by the kernel may be tagged, with the exceptions
998                     summarized below.
999
1000              The remaining arguments arg3, arg4, and arg5 must all be zero.
1001
1002              On success, the mode specified in arg2 is set  for  the  calling
1003              thread and the return value is 0.  If the arguments are invalid,
1004              the mode specified in arg2 is unrecognized, or if  this  feature
1005              is    unsupported    by    the    kernel    or    disabled   via
1006              /proc/sys/abi/tagged_addr_disabled, the call fails with the  er‐
1007              ror EINVAL.
1008
1009              In  particular,  if  prctl(PR_SET_TAGGED_ADDR_CTRL,  0, 0, 0, 0)
1010              fails with EINVAL, then all addresses passed to the kernel  must
1011              be untagged.
1012
1013              Irrespective  of  which mode is set, addresses passed to certain
1014              interfaces must always be untagged:
1015
1016              •  brk(2), mmap(2), shmat(2), shmdt(2), and the new_address  ar‐
1017                 gument of mremap(2).
1018
1019                 (Prior  to Linux 5.6 these accepted tagged addresses, but the
1020                 behaviour may not be what you expect.  Don't rely on it.)
1021
1022              •  ‘polymorphic’ interfaces that accept  pointers  to  arbitrary
1023                 types  cast  to  a void * or other generic type, specifically
1024                 prctl(), ioctl(2), and in general setsockopt(2) (only certain
1025                 specific setsockopt(2) options allow tagged addresses).
1026
1027              This  list  of exclusions may shrink when moving from one kernel
1028              version to a later kernel version.  While the  kernel  may  make
1029              some  guarantees  for  backwards  compatibility reasons, for the
1030              purposes of new software the effect of passing tagged  addresses
1031              to these interfaces is unspecified.
1032
1033              The  mode  set  by  this  call  is  inherited across fork(2) and
1034              clone(2).  The mode is reset by execve(2) to 0 (i.e., tagged ad‐
1035              dresses not permitted in the user/kernel ABI).
1036
1037              For  more  information,  see  the  kernel source file Documenta‐
1038              tion/arm64/tagged-address-abi.rst.
1039
1040              Warning: This call is primarily intended for use by the run-time
1041              environment.   A  successful  PR_SET_TAGGED_ADDR_CTRL call else‐
1042              where may crash the calling process.  The conditions  for  using
1043              it safely are complex and system-dependent.  Don't use it unless
1044              you know what you are doing.
1045
1046       PR_GET_TAGGED_ADDR_CTRL (since Linux 5.4, only on arm64)
1047              Returns the current tagged address mode for the calling thread.
1048
1049              Arguments arg2, arg3, arg4, and arg5 must all be zero.
1050
1051              If the arguments are invalid or this feature is disabled or  un‐
1052              supported by the kernel, the call fails with EINVAL.  In partic‐
1053              ular, if prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)  fails  with
1054              EINVAL,  then  this feature is definitely either unsupported, or
1055              disabled via /proc/sys/abi/tagged_addr_disabled.  In this  case,
1056              all addresses passed to the kernel must be untagged.
1057
1058              Otherwise,  the  call returns a nonnegative value describing the
1059              current tagged address mode, encoded in the same way as the arg2
1060              argument of PR_SET_TAGGED_ADDR_CTRL.
1061
1062              For  more  information,  see  the  kernel source file Documenta‐
1063              tion/arm64/tagged-address-abi.rst.
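
              A run-time check along the lines described above might  be  writ‐
              ten  as  follows.  The helper name tagged_addresses_enabled() is
              hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: returns 1 if tagged addresses are enabled for
 * the calling thread, 0 if they are not, or -1 if the call fails (for
 * example with EINVAL when the feature is unsupported or disabled), in
 * which case all addresses passed to the kernel must be untagged. */
int tagged_addresses_enabled(void)
{
    int mode = prctl(PR_GET_TAGGED_ADDR_CTRL, 0UL, 0UL, 0UL, 0UL);

    if (mode == -1)
        return -1;
    return ((unsigned long) mode & PR_TAGGED_ADDR_ENABLE) != 0;
}
```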
1064
1065       PR_TASK_PERF_EVENTS_DISABLE (since Linux 2.6.31)
1066              Disable  all  performance  counters  attached  to  the   calling
1067              process, regardless of whether the counters were created by this
1068              process or another process.  Performance counters created by the
1069              calling  process  for  other processes are unaffected.  For more
1070              information on performance counters, see the Linux kernel source
1071              file tools/perf/design.txt.
1072
1073              Originally  called  PR_TASK_PERF_COUNTERS_DISABLE;  renamed (re‐
1074              taining the same numerical value) in Linux 2.6.32.
1075
1076       PR_TASK_PERF_EVENTS_ENABLE (since Linux 2.6.31)
1077              The converse of PR_TASK_PERF_EVENTS_DISABLE; enable  performance
1078              counters attached to the calling process.
1079
1080              Originally called PR_TASK_PERF_COUNTERS_ENABLE; renamed in Linux
1081              2.6.32.
1082
1083       PR_SET_THP_DISABLE (since Linux 3.15)
1084              Set the state of the "THP disable" flag for the calling  thread.
1085              If  arg2  has  a nonzero value, the flag is set, otherwise it is
1086              cleared.  Setting this flag  provides  a  method  for  disabling
1087              transparent  huge  pages for jobs where the code cannot be modi‐
1088              fied, and using a malloc hook with madvise(2) is not  an  option
1089              (i.e., statically allocated data).  The setting of the "THP dis‐
1090              able" flag is inherited by a child created via  fork(2)  and  is
1091              preserved across execve(2).
1092
1093       PR_GET_THP_DISABLE (since Linux 3.15)
1094              Return  (as the function result) the current setting of the "THP
1095              disable" flag for the calling thread: either 1, if the  flag  is
1096              set, or 0, if it is not.
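
              The two operations can be combined as in the  following  sketch.
              The helper name set_thp_disabled() is hypothetical.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set or clear the "THP disable" flag and return
 * the value read back with PR_GET_THP_DISABLE (1 or 0), or -1 on error.
 * The unused arguments are passed as 0. */
int set_thp_disabled(int disable)
{
    if (prctl(PR_SET_THP_DISABLE, disable ? 1UL : 0UL, 0UL, 0UL, 0UL) == -1)
        return -1;
    return prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL);
}
```

              A launcher could call set_thp_disabled(1) before execve(2),  re‐
              lying on the flag being preserved across the exec.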
1097
1098       PR_GET_TID_ADDRESS (since Linux 3.5)
1099              Return the clear_child_tid address set by set_tid_address(2) and
1100              the clone(2) CLONE_CHILD_CLEARTID flag, in the location  pointed
1101              to by (int **) arg2.  This feature is available only if the ker‐
1102              nel is built with the CONFIG_CHECKPOINT_RESTORE option  enabled.
1103              Note  that  since the prctl() system call does not have a compat
1104              implementation for the AMD64 x32 and MIPS n32 ABIs, and the ker‐
1105              nel  writes  out a pointer using the kernel's pointer size, this
1106              operation expects a user-space buffer of  8  (not  4)  bytes  on
1107              these ABIs.
1108
1109       PR_SET_TIMERSLACK (since Linux 2.6.28)
1110              Each  thread  has two associated timer slack values: a "default"
1111              value, and a "current" value.  This operation sets the "current"
1112              timer  slack  value for the calling thread.  arg2 is an unsigned
1113              long value, so the maximum "current" value is ULONG_MAX and  the
1114              minimum  "current" value is 1.  If the nanosecond value supplied
1115              in arg2 is greater than zero, then the "current" value is set to
1116              this value.  If arg2 is equal to zero, the "current" timer slack
1117              is reset to the thread's "default" timer slack value.
1118
1119              The "current" timer slack is used by the kernel to  group  timer
1120              expirations  for  the  calling  thread that are close to one an‐
1121              other; as a consequence, timer expirations for the thread may be
1122              up  to  the specified number of nanoseconds late (but will never
1123              expire early).  Grouping timer expirations can help reduce  sys‐
1124              tem power consumption by minimizing CPU wake-ups.
1125
1126              The  timer  expirations affected by timer slack are those set by
1127              select(2),   pselect(2),   poll(2),   ppoll(2),   epoll_wait(2),
1128              epoll_pwait(2),  clock_nanosleep(2),  nanosleep(2), and futex(2)
1129              (and thus the library functions implemented via futexes, includ‐
1130              ing    pthread_cond_timedwait(3),    pthread_mutex_timedlock(3),
1131              pthread_rwlock_timedrdlock(3),    pthread_rwlock_timedwrlock(3),
1132              and sem_timedwait(3)).
1133
1134              Timer slack is not applied to threads that are scheduled under a
1135              real-time scheduling policy (see sched_setscheduler(2)).
1136
1137              When a new thread is created, the two  timer  slack  values  are
1138              made  the  same  as  the "current" value of the creating thread.
1139              Thereafter, a thread can adjust its "current" timer slack  value
1140              via  PR_SET_TIMERSLACK.   The  "default" value can't be changed.
1141              The timer slack values of init (PID 1), the ancestor of all pro‐
1142              cesses,  are  50,000  nanoseconds  (50 microseconds).  The timer
1143              slack value is inherited by a child created via fork(2), and  is
1144              preserved across execve(2).
1145
              Since Linux 4.6, the "current" timer slack value of any process
              can be examined and changed via the file
              /proc/pid/timerslack_ns.  See proc(5).
1149
1150       PR_GET_TIMERSLACK (since Linux 2.6.28)
1151              Return  (as the function result) the "current" timer slack value
1152              of the calling thread.
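
              The set and get operations described above can be combined as in
              the following minimal sketch.  The helper name set_timer_slack()
              is hypothetical and not part of any library; the sketch assumes
              a thread that is not running under a real-time scheduling
              policy.

```c
#include <sys/prctl.h>

/* Hypothetical helper: set the calling thread's "current" timer slack to
   slack_ns nanoseconds (0 resets it to the thread's "default" value) and
   return the value now in effect, or -1 with errno set on failure. */
long set_timer_slack(unsigned long slack_ns)
{
    if (prctl(PR_SET_TIMERSLACK, slack_ns, 0UL, 0UL, 0UL) == -1)
        return -1;

    /* PR_GET_TIMERSLACK returns the "current" value as the function result. */
    return prctl(PR_GET_TIMERSLACK, 0UL, 0UL, 0UL, 0UL);
}
```

              For example, set_timer_slack(1000000UL) requests a slack of one
              millisecond, and set_timer_slack(0UL) restores the default.
              Because prctl() returns an int, a "current" value larger than
              INT_MAX cannot be read back reliably this way.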
1153
1154       PR_SET_TIMING (since Linux 2.6.0)
              Set whether to use (normal, traditional) statistical process
              timing or accurate timestamp-based process timing, by passing
              PR_TIMING_STATISTICAL or PR_TIMING_TIMESTAMP in arg2.
              PR_TIMING_TIMESTAMP is not currently implemented (attempting to
              set this mode will yield the error EINVAL).
1160
1161       PR_GET_TIMING (since Linux 2.6.0)
1162              Return (as the function result) which process timing  method  is
1163              currently in use.
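
              The behavior described for the two timing operations can be
              probed with a short sketch like the one below.  Both helper
              names are hypothetical, chosen only for illustration.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical helper: return 1 if the kernel rejects PR_TIMING_TIMESTAMP
   with EINVAL, 0 if it unexpectedly accepts it, and -1 on any other error. */
int timestamp_timing_rejected(void)
{
    if (prctl(PR_SET_TIMING, (unsigned long) PR_TIMING_TIMESTAMP,
              0UL, 0UL, 0UL) == 0)
        return 0;
    return errno == EINVAL ? 1 : -1;
}

/* Hypothetical helper: return the process timing method currently in use,
   as reported by PR_GET_TIMING. */
int current_timing_method(void)
{
    return prctl(PR_GET_TIMING, 0UL, 0UL, 0UL, 0UL);
}
```

              Since only statistical timing is implemented, the second helper
              is expected to report PR_TIMING_STATISTICAL.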
1164
1165       PR_SET_TSC (since Linux 2.6.26, x86 only)
1166              Set  the  state  of  the  flag determining whether the timestamp
1167              counter can be read by the process.  Pass PR_TSC_ENABLE to  arg2
1168              to  allow it to be read, or PR_TSC_SIGSEGV to generate a SIGSEGV
1169              when the process tries to read the timestamp counter.
1170
1171       PR_GET_TSC (since Linux 2.6.26, x86 only)
1172              Return the state of the flag determining whether  the  timestamp
1173              counter can be read, in the location pointed to by (int *) arg2.
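
              As a hedged illustration of the two TSC operations, the sketch
              below wraps them in helpers whose names (get_tsc_state() and
              set_tsc_state()) are hypothetical.  On architectures other than
              x86 both calls are expected to fail.

```c
#include <sys/prctl.h>

/* Hypothetical helper: store the TSC access state (PR_TSC_ENABLE or
   PR_TSC_SIGSEGV) in *state.  Returns 0 on success, or -1 with errno set. */
int get_tsc_state(int *state)
{
    /* PR_GET_TSC writes its result through the pointer passed in arg2. */
    return prctl(PR_GET_TSC, (unsigned long) state, 0UL, 0UL, 0UL);
}

/* Hypothetical helper: allow TSC reads (PR_TSC_ENABLE) or make them raise
   SIGSEGV (PR_TSC_SIGSEGV).  Returns 0 on success, or -1 with errno set. */
int set_tsc_state(int state)
{
    return prctl(PR_SET_TSC, (unsigned long) state, 0UL, 0UL, 0UL);
}
```

              Note that PR_GET_TSC, unlike most "get" operations, does not
              return the state as the function result.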
1174
1175       PR_SET_UNALIGN
1176              (Only  on: ia64, since Linux 2.3.48; parisc, since Linux 2.6.15;
1177              PowerPC, since Linux 2.6.18;  Alpha,  since  Linux  2.6.22;  sh,
1178              since Linux 2.6.34; tile, since Linux 3.12) Set unaligned access
1179              control bits to arg2.  Pass PR_UNALIGN_NOPRINT to  silently  fix
1180              up  unaligned  user  accesses,  or PR_UNALIGN_SIGBUS to generate
              SIGBUS on unaligned user access.  Alpha also supports an
              additional flag with the value 4 and no corresponding named
              constant, which instructs the kernel not to fix up unaligned
              accesses (this is analogous to providing the UAC_NOFIX flag in
              the SSI_NVPAIRS operation of the setsysinfo() system call on
              Tru64).
1186
1187       PR_GET_UNALIGN
1188              (See PR_SET_UNALIGN for information on  versions  and  architec‐
1189              tures.)   Return  unaligned access control bits, in the location
1190              pointed to by (unsigned int *) arg2.
1191
1192       PR_GET_AUXV (since Linux 6.4)
1193              Get the auxiliary vector (auxv) into the buffer  pointed  to  by
1194              (void  *) arg2, whose length is given by arg3.  If the buffer is
1195              not long enough for the full auxiliary vector, the copy will  be
1196              truncated.   Return  (as the function result) the full length of
1197              the auxiliary vector.  arg4 and arg5 must be 0.
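
              The truncation rule above suggests the calling pattern sketched
              below.  The helper name copy_auxv() is hypothetical, and the
              fallback definition of PR_GET_AUXV is an assumption made for
              systems whose headers predate Linux 6.4.

```c
#include <stddef.h>
#include <sys/prctl.h>

#ifndef PR_GET_AUXV
#define PR_GET_AUXV 0x41555856  /* assumed fallback for pre-6.4 headers */
#endif

/* Hypothetical helper: copy at most len bytes of the auxiliary vector into
   buf.  Returns the full length of the auxiliary vector in bytes, or -1
   with errno set (for example, on kernels without PR_GET_AUXV). */
long copy_auxv(void *buf, size_t len)
{
    /* arg4 and arg5 must be 0. */
    return prctl(PR_GET_AUXV, (unsigned long) buf, (unsigned long) len,
                 0UL, 0UL);
}
```

              If the value returned is greater than len, the copy was
              truncated, and the caller can retry with a buffer of at least
              the returned size.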
1198

RETURN VALUE

       On success, PR_CAP_AMBIENT+PR_CAP_AMBIENT_IS_SET, PR_CAPBSET_READ,
       PR_GET_DUMPABLE, PR_GET_FP_MODE, PR_GET_IO_FLUSHER, PR_GET_KEEPCAPS,
       PR_MCE_KILL_GET, PR_GET_NO_NEW_PRIVS, PR_GET_SECUREBITS,
       PR_GET_SPECULATION_CTRL, PR_SVE_GET_VL, PR_SVE_SET_VL,
       PR_GET_TAGGED_ADDR_CTRL, PR_GET_THP_DISABLE, PR_GET_TIMING,
       PR_GET_TIMERSLACK, PR_GET_AUXV, and (if it returns) PR_GET_SECCOMP
       return the nonnegative values described above.  All other option
       values return 0 on success.  On error, -1 is returned, and errno is set
       to indicate the error.
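
       Because some operations return a nonnegative value rather than 0,
       callers should test for -1 specifically instead of testing for a
       nonzero result.  A minimal sketch, using PR_GET_DUMPABLE as the example
       operation and a hypothetical helper name:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical helper: return the calling process's "dumpable" attribute
   (a nonnegative value), or -1 after reporting the error on stderr. */
int query_dumpable(void)
{
    int r = prctl(PR_GET_DUMPABLE, 0UL, 0UL, 0UL, 0UL);

    /* Only -1 signals an error; any nonnegative value is a valid result. */
    if (r == -1)
        fprintf(stderr, "PR_GET_DUMPABLE: %s\n", strerror(errno));
    return r;
}
```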
1208

ERRORS

1210       EACCES option  is  PR_SET_SECCOMP  and arg2 is SECCOMP_MODE_FILTER, but
1211              the process does not have the CAP_SYS_ADMIN  capability  or  has
1212              not  set  the  no_new_privs  attribute  (see  the  discussion of
1213              PR_SET_NO_NEW_PRIVS above).
1214
       EACCES option is PR_SET_MM, arg2 is PR_SET_MM_EXE_FILE, and the file is
              not executable.

       EBADF  option is PR_SET_MM, arg2 is PR_SET_MM_EXE_FILE, and the file
              descriptor passed in arg3 is not valid.

       EBUSY  option is PR_SET_MM, arg2 is PR_SET_MM_EXE_FILE, and this is the
              second attempt to change the /proc/pid/exe symbolic link, which
              is prohibited.
1224
1225       EFAULT arg2 is an invalid address.
1226
1227       EFAULT option is PR_SET_SECCOMP, arg2 is SECCOMP_MODE_FILTER, the  sys‐
1228              tem was built with CONFIG_SECCOMP_FILTER, and arg3 is an invalid
1229              address.
1230
1231       EFAULT option is PR_SET_SYSCALL_USER_DISPATCH and arg5 has  an  invalid
1232              address.
1233
1234       EINVAL The  value of option is not recognized, or not supported on this
1235              system.
1236
1237       EINVAL option is PR_MCE_KILL or PR_MCE_KILL_GET or PR_SET_MM,  and  un‐
1238              used prctl() arguments were not specified as zero.
1239
       EINVAL arg2 is not a valid value for this option.
1241
1242       EINVAL option  is  PR_SET_SECCOMP or PR_GET_SECCOMP, and the kernel was
1243              not configured with CONFIG_SECCOMP.
1244
1245       EINVAL option is PR_SET_SECCOMP, arg2 is SECCOMP_MODE_FILTER,  and  the
1246              kernel was not configured with CONFIG_SECCOMP_FILTER.
1247
       EINVAL option is PR_SET_MM, and one of the following is true:

              •  arg4 or arg5 is nonzero;

              •  arg3 is greater than TASK_SIZE (the limit on the size of the
                 user address space for this architecture);

              •  arg2 is PR_SET_MM_START_CODE, PR_SET_MM_END_CODE,
                 PR_SET_MM_START_DATA, PR_SET_MM_END_DATA, or
                 PR_SET_MM_START_STACK, and the permissions of the
                 corresponding memory area are not as required;

              •  arg2 is PR_SET_MM_START_BRK or PR_SET_MM_BRK, and arg3 is
                 less than or equal to the end of the data segment or
                 specifies a value that would cause the RLIMIT_DATA resource
                 limit to be exceeded.
1264
1265       EINVAL option is PR_SET_PTRACER and arg2 is not 0,  PR_SET_PTRACER_ANY,
1266              or the PID of an existing process.
1267
1268       EINVAL option  is  PR_SET_PDEATHSIG and arg2 is not a valid signal num‐
1269              ber.
1270
1271       EINVAL option is PR_SET_DUMPABLE and arg2 is neither  SUID_DUMP_DISABLE
1272              nor SUID_DUMP_USER.
1273
1274       EINVAL option is PR_SET_TIMING and arg2 is not PR_TIMING_STATISTICAL.
1275
1276       EINVAL option  is  PR_SET_NO_NEW_PRIVS  and  arg2  is not equal to 1 or
1277              arg3, arg4, or arg5 is nonzero.
1278
1279       EINVAL option is PR_GET_NO_NEW_PRIVS and arg2, arg3, arg4, or  arg5  is
1280              nonzero.
1281
1282       EINVAL option is PR_SET_THP_DISABLE and arg3, arg4, or arg5 is nonzero.
1283
1284       EINVAL option  is  PR_GET_THP_DISABLE  and arg2, arg3, arg4, or arg5 is
1285              nonzero.
1286
1287       EINVAL option is PR_CAP_AMBIENT and an unused argument (arg4, arg5, or,
1288              in  the  case  of PR_CAP_AMBIENT_CLEAR_ALL, arg3) is nonzero; or
1289              arg2 has an invalid  value;  or  arg2  is  PR_CAP_AMBIENT_LOWER,
1290              PR_CAP_AMBIENT_RAISE, or PR_CAP_AMBIENT_IS_SET and arg3 does not
1291              specify a valid capability.
1292
       EINVAL option is PR_GET_SPECULATION_CTRL or PR_SET_SPECULATION_CTRL
              and unused arguments to prctl() are not 0.

       EINVAL option is PR_PAC_RESET_KEYS and the arguments are invalid or
              unsupported.  See the description of PR_PAC_RESET_KEYS above for
              details.
1297
1298       EINVAL option  is PR_SVE_SET_VL and the arguments are invalid or unsup‐
1299              ported, or SVE is not available on this platform.  See  the  de‐
1300              scription of PR_SVE_SET_VL above for details.
1301
1302       EINVAL option  is  PR_SVE_GET_VL and SVE is not available on this plat‐
1303              form.
1304
1305       EINVAL option is PR_SET_SYSCALL_USER_DISPATCH and one of the  following
1306              is true:
1307
              •  arg2 is PR_SYS_DISPATCH_OFF and the remaining arguments are
                 not 0;

              •  arg2 is PR_SYS_DISPATCH_ON and the memory range specified is
                 outside the address space of the process;

              •  arg2 is invalid.
1315
1316       EINVAL option  is PR_SET_TAGGED_ADDR_CTRL and the arguments are invalid
1317              or unsupported.  See the description of  PR_SET_TAGGED_ADDR_CTRL
1318              above for details.
1319
1320       EINVAL option  is PR_GET_TAGGED_ADDR_CTRL and the arguments are invalid
1321              or unsupported.  See the description of  PR_GET_TAGGED_ADDR_CTRL
1322              above for details.
1323
       ENODEV option is PR_SET_SPECULATION_CTRL and the kernel or CPU does not
              support the requested speculation misfeature.
1326
1327       ENXIO  option was PR_MPX_ENABLE_MANAGEMENT or PR_MPX_DISABLE_MANAGEMENT
1328              and  the  kernel  or  the  CPU  does not support MPX management.
1329              Check that the kernel and processor have MPX support.
1330
       ENXIO  option is PR_SET_SPECULATION_CTRL and control of the selected
              speculation misfeature is not possible.  See
              PR_GET_SPECULATION_CTRL for the bit fields that indicate which
              options are available.
1335
1336       EOPNOTSUPP
1337              option  is PR_SET_FP_MODE and arg2 has an invalid or unsupported
1338              value.
1339
1340       EPERM  option is PR_SET_SECUREBITS, and the caller does  not  have  the
1341              CAP_SETPCAP  capability,  or  tried to unset a "locked" flag, or
1342              tried to set a flag whose corresponding locked flag was set (see
1343              capabilities(7)).
1344
       EPERM  option is PR_SET_SPECULATION_CTRL, the speculation misfeature
              was disabled with PR_SPEC_FORCE_DISABLE, and the caller tried to
              enable it again.
1348
1349       EPERM  option      is     PR_SET_KEEPCAPS,     and     the     caller's
1350              SECBIT_KEEP_CAPS_LOCKED flag is set (see capabilities(7)).
1351
1352       EPERM  option is PR_CAPBSET_DROP, and the  caller  does  not  have  the
1353              CAP_SETPCAP capability.
1354
1355       EPERM  option   is   PR_SET_MM,  and  the  caller  does  not  have  the
1356              CAP_SYS_RESOURCE capability.
1357
       EPERM  option is PR_CAP_AMBIENT and arg2 is PR_CAP_AMBIENT_RAISE, but
              either the capability specified in arg3 is not present in the
              process's permitted and inheritable capability sets, or the
              SECBIT_NO_CAP_AMBIENT_RAISE securebit has been set.
1362
       ERANGE option is PR_SET_SPECULATION_CTRL and arg3 is not one of
              PR_SPEC_ENABLE, PR_SPEC_DISABLE, PR_SPEC_FORCE_DISABLE, or
              PR_SPEC_DISABLE_NOEXEC.
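
       Several of the EINVAL cases above arise when unused arguments are not
       passed as zero.  The following sketch shows one way to make such a call
       and inspect errno, using PR_SET_NO_NEW_PRIVS as the example; the helper
       name is hypothetical.  Note that the no_new_privs attribute cannot be
       cleared once it has been set.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical helper: set no_new_privs for the calling thread and report
   why the call failed, if it did.  Returns 0 on success, -1 on error. */
int enable_no_new_privs(void)
{
    /* arg2 must be 1 and arg3, arg4, and arg5 must be 0; any other
       combination fails with EINVAL. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1UL, 0UL, 0UL, 0UL) == 0)
        return 0;

    if (errno == EINVAL)
        fprintf(stderr, "PR_SET_NO_NEW_PRIVS: bad argument or unsupported\n");
    else
        fprintf(stderr, "PR_SET_NO_NEW_PRIVS: %s\n", strerror(errno));
    return -1;
}
```

       Passing every unused argument explicitly as 0UL, rather than omitting
       it, avoids handing the kernel indeterminate values through the variadic
       interface.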
1366

VERSIONS

1368       IRIX  has  a  prctl()  system  call (also introduced in Linux 2.1.44 as
1369       irix_prctl on the MIPS architecture), with prototype
1370
1371           ptrdiff_t prctl(int option, int arg2, int arg3);
1372
1373       and options to get the maximum number of processes per  user,  get  the
1374       maximum  number  of  processors  the  calling process can use, find out
1375       whether a specified process is currently blocked, get or set the  maxi‐
1376       mum stack size, and so on.
1377

STANDARDS

1379       Linux.
1380

HISTORY

1382       Linux 2.1.57, glibc 2.0.6
1383

SEE ALSO

1385       signal(2), core(5)
1386
1387
1388
Linux man-pages 6.05              2023-07-28                          prctl(2)