CLONE(2)                   Linux Programmer's Manual                  CLONE(2)

NAME

       clone, __clone2 - create a child process

SYNOPSIS

       /* Prototype for the glibc wrapper function */

       #define _GNU_SOURCE
       #include <sched.h>

       int clone(int (*fn)(void *), void *child_stack,
                 int flags, void *arg, ...
                 /* pid_t *ptid, void *newtls, pid_t *ctid */ );

       /* For the prototype of the raw system call, see NOTES */

DESCRIPTION

       clone() creates a new process, in a manner similar to fork(2).

       This page describes both the glibc clone() wrapper function and the
       underlying system call on which it is based.  The main text describes
       the wrapper function; the differences for the raw system call are
       described toward the end of this page.

       Unlike fork(2), clone() allows the child process to share parts of its
       execution context with the calling process, such as the virtual address
       space, the table of file descriptors, and the table of signal handlers.
       (Note that in this manual page, "calling process" normally corresponds
       to "parent process".  But see the description of CLONE_PARENT below.)

       One use of clone() is to implement threads: multiple flows of control
       in a program that run concurrently in a shared address space.

       When the child process is created with clone(), it commences execution
       by calling the function pointed to by the argument fn.  (This differs
       from fork(2), where execution continues in the child from the point of
       the fork(2) call.)  The arg argument is passed as the argument of the
       function fn.

       When the fn(arg) function returns, the child process terminates.  The
       integer returned by fn is the exit status for the child process.  The
       child process may also terminate explicitly by calling exit(2) or after
       receiving a fatal signal.

       The child_stack argument specifies the location of the stack used by
       the child process.  Since the child and calling process may share
       memory, it is not possible for the child process to execute on the same
       stack as the calling process.  The calling process must therefore set
       up memory space for the child stack and pass a pointer to this space to
       clone().  Stacks grow downward on all processors that run Linux (except
       the HP PA processors), so child_stack usually points to the topmost
       address of the memory space set up for the child stack.

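       The mechanics described so far can be sketched as follows.  This is a
       minimal illustration, not a reference implementation: the names
       child_fn, run_child, and STACK_SIZE are hypothetical, and the 1 MiB
       stack obtained with mmap(2) is an arbitrary choice.  The child returns
       the integer passed through arg, and the caller reads it back as the
       exit status.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size chosen for this sketch */

/* Child entry point: its return value becomes the child's exit status. */
static int child_fn(void *arg)
{
    return *(int *)arg;
}

/*
 * Hypothetical helper: run child_fn() in a child created with clone().
 * Returns the child's exit status, or -1 on error.
 */
static int run_child(int status_to_return)
{
    char *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* The stack grows downward, so pass the topmost address. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, SIGCHLD,
                      &status_to_return);
    int result = -1;
    if (pid != -1) {
        int wstatus;
        if (waitpid(pid, &wstatus, 0) == pid && WIFEXITED(wstatus))
            result = WEXITSTATUS(wstatus);
    }

    munmap(stack, STACK_SIZE);
    return result;
}
```

       Because CLONE_VM is not used here, the child works on a copy of the
       caller's memory, so the pointer passed as arg stays valid in the child.
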
       The low byte of flags contains the number of the termination signal
       sent to the parent when the child dies.  If this signal is specified as
       anything other than SIGCHLD, then the parent process must specify the
       __WALL or __WCLONE options when waiting for the child with wait(2).  If
       no signal is specified, then the parent process is not signaled when
       the child terminates.

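       The following sketch illustrates the waiting rule above under the
       assumption that the child is created with no termination signal (low
       byte of flags is 0).  The helper name wait_needs_wall is hypothetical.
       A plain waitpid(2) is expected to fail with ECHILD for such a child,
       while waitpid(2) with __WALL reaps it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

static int child_fn(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Hypothetical helper.  Returns 1 if a plain waitpid() fails with ECHILD
 * while waitpid() with __WALL succeeds, 0 if the behavior differs, and
 * -1 on a setup error.
 */
static int wait_needs_wall(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Low byte of flags is 0: no signal is sent when the child exits. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, 0, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int plain_failed = (waitpid(pid, NULL, 0) == -1 && errno == ECHILD);
    int wall_ok = (waitpid(pid, NULL, __WALL) == pid);

    free(stack);
    return plain_failed && wall_ok;
}
```
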
       flags may also be bitwise-ORed with zero or more of the following
       constants, in order to specify what is shared between the calling
       process and the child process:

       CLONE_CHILD_CLEARTID (since Linux 2.5.49)
              Clear (zero) the child thread ID at the location ctid in child
              memory when the child exits, and do a wakeup on the futex at
              that address.  The address involved may be changed by the
              set_tid_address(2) system call.  This is used by threading
              libraries.

       CLONE_CHILD_SETTID (since Linux 2.5.49)
              Store the child thread ID at the location ctid in the child's
              memory.  The store operation completes before clone() returns
              control to user space in the child process.  (Note that the
              store operation may not have completed before clone() returns in
              the parent process, which will be relevant if the CLONE_VM flag
              is also employed.)

       CLONE_FILES (since Linux 2.0)
              If CLONE_FILES is set, the calling process and the child process
              share the same file descriptor table.  Any file descriptor
              created by the calling process or by the child process is also
              valid in the other process.  Similarly, if one of the processes
              closes a file descriptor, or changes its associated flags (using
              the fcntl(2) F_SETFD operation), the other process is also
              affected.  If a process sharing a file descriptor table calls
              execve(2), its file descriptor table is duplicated (unshared).

              If CLONE_FILES is not set, the child process inherits a copy of
              all file descriptors opened in the calling process at the time
              of clone().  Subsequent operations that open or close file
              descriptors, or change file descriptor flags, performed by
              either the calling process or the child process do not affect
              the other process.  Note, however, that the duplicated file
              descriptors in the child refer to the same open file
              descriptions as the corresponding file descriptors in the
              calling process, and thus share file offsets and file status
              flags (see open(2)).

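              As an illustration of the CLONE_FILES behavior, the sketch
              below lets the child close a pipe descriptor and then checks
              whether the descriptor is still valid in the caller.  The
              helper names close_fd and child_close_visible are hypothetical.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

/* Child: close the descriptor whose number is passed through arg. */
static int close_fd(void *arg)
{
    return close(*(int *)arg) == 0 ? 0 : 1;
}

/*
 * Hypothetical helper.  extra_flags is 0 or CLONE_FILES.  Returns 1 if
 * the child's close() also closed the descriptor in the caller, 0 if
 * the caller's descriptor survived, and -1 on a setup error.
 */
static int child_close_visible(int extra_flags)
{
    int fds[2];
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL || pipe(fds) == -1) {
        free(stack);
        return -1;
    }

    int result = -1;
    pid_t pid = clone(close_fd, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &fds[0]);
    if (pid != -1 && waitpid(pid, NULL, 0) == pid)
        result = (fcntl(fds[0], F_GETFD) == -1);   /* fails if closed */

    if (result != 1)
        close(fds[0]);   /* still open in the caller: clean up */
    close(fds[1]);
    free(stack);
    return result;
}
```

              With CLONE_FILES the helper is expected to return 1; without
              it, the child only closes its own copy and the result is 0.
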
       CLONE_FS (since Linux 2.0)
              If CLONE_FS is set, the caller and the child process share the
              same filesystem information.  This includes the root of the
              filesystem, the current working directory, and the umask.  Any
              call to chroot(2), chdir(2), or umask(2) performed by the
              calling process or the child process also affects the other
              process.

              If CLONE_FS is not set, the child process works on a copy of the
              filesystem information of the calling process at the time of the
              clone() call.  Calls to chroot(2), chdir(2), or umask(2)
              performed later by one of the processes do not affect the other
              process.

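              The umask part of this sharing can be observed with a small
              sketch.  The helper names set_umask and child_umask_visible are
              hypothetical; the caller's umask is restored before returning.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

/* Child: change the umask and exit. */
static int set_umask(void *arg)
{
    (void)arg;
    umask(077);
    return 0;
}

/*
 * Hypothetical helper.  extra_flags is 0 or CLONE_FS.  Returns 1 if the
 * child's umask(077) became visible in the caller, 0 if it did not, and
 * -1 on error.
 */
static int child_umask_visible(int extra_flags)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    mode_t saved = umask(022);   /* start from a known value */
    int result = -1;
    pid_t pid = clone(set_umask, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    if (pid != -1 && waitpid(pid, NULL, 0) == pid) {
        mode_t now = umask(saved);   /* read current value and restore */
        result = (now == 077);
    } else {
        umask(saved);
    }

    free(stack);
    return result;
}
```
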
       CLONE_IO (since Linux 2.6.25)
              If CLONE_IO is set, then the new process shares an I/O context
              with the calling process.  If this flag is not set, then (as
              with fork(2)) the new process has its own I/O context.

              The I/O context is the I/O scope of the disk scheduler (i.e.,
              what the I/O scheduler uses to model scheduling of a process's
              I/O).  If processes share the same I/O context, they are treated
              as one by the I/O scheduler.  As a consequence, they get to
              share disk time.  For some I/O schedulers, if two processes
              share an I/O context, they will be allowed to interleave their
              disk access.  If several threads are doing I/O on behalf of the
              same process (aio_read(3), for instance), they should employ
              CLONE_IO to get better I/O performance.

              If the kernel is not configured with the CONFIG_BLOCK option,
              this flag is a no-op.

       CLONE_NEWCGROUP (since Linux 4.6)
              Create the process in a new cgroup namespace.  If this flag is
              not set, then (as with fork(2)) the process is created in the
              same cgroup namespace as the calling process.  This flag is
              intended for the implementation of containers.

              For further information on cgroup namespaces, see
              cgroup_namespaces(7).

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWCGROUP.

       CLONE_NEWIPC (since Linux 2.6.19)
              If CLONE_NEWIPC is set, then create the process in a new IPC
              namespace.  If this flag is not set, then (as with fork(2)) the
              process is created in the same IPC namespace as the calling
              process.  This flag is intended for the implementation of
              containers.

              An IPC namespace provides an isolated view of System V IPC
              objects (see sysvipc(7)) and (since Linux 2.6.30) POSIX message
              queues (see mq_overview(7)).  The common characteristic of these
              IPC mechanisms is that IPC objects are identified by mechanisms
              other than filesystem pathnames.

              Objects created in an IPC namespace are visible to all other
              processes that are members of that namespace, but are not
              visible to processes in other IPC namespaces.

              When an IPC namespace is destroyed (i.e., when the last process
              that is a member of the namespace terminates), all IPC objects
              in the namespace are automatically destroyed.

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWIPC.  This flag can't be specified in conjunction with
              CLONE_SYSVSEM.

              For further information on IPC namespaces, see namespaces(7).

       CLONE_NEWNET (since Linux 2.6.24)
              (The implementation of this flag was completed only around
              kernel version 2.6.29.)

              If CLONE_NEWNET is set, then create the process in a new network
              namespace.  If this flag is not set, then (as with fork(2)) the
              process is created in the same network namespace as the calling
              process.  This flag is intended for the implementation of
              containers.

              A network namespace provides an isolated view of the networking
              stack (network device interfaces, IPv4 and IPv6 protocol stacks,
              IP routing tables, firewall rules, the /proc/net and
              /sys/class/net directory trees, sockets, etc.).  A physical
              network device can live in exactly one network namespace.  A
              virtual network (veth(4)) device pair provides a pipe-like
              abstraction that can be used to create tunnels between network
              namespaces, and can be used to create a bridge to a physical
              network device in another namespace.

              When a network namespace is freed (i.e., when the last process
              in the namespace terminates), its physical network devices are
              moved back to the initial network namespace (not to the parent
              of the process).  For further information on network namespaces,
              see namespaces(7).

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWNET.

       CLONE_NEWNS (since Linux 2.4.19)
              If CLONE_NEWNS is set, the cloned child is started in a new
              mount namespace, initialized with a copy of the namespace of the
              parent.  If CLONE_NEWNS is not set, the child lives in the same
              mount namespace as the parent.

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWNS.  It is not permitted to specify both CLONE_NEWNS
              and CLONE_FS in the same clone() call.

              For further information on mount namespaces, see namespaces(7)
              and mount_namespaces(7).

       CLONE_NEWPID (since Linux 2.6.24)
              If CLONE_NEWPID is set, then create the process in a new PID
              namespace.  If this flag is not set, then (as with fork(2)) the
              process is created in the same PID namespace as the calling
              process.  This flag is intended for the implementation of
              containers.

              For further information on PID namespaces, see namespaces(7) and
              pid_namespaces(7).

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWPID.  This flag can't be specified in conjunction with
              CLONE_THREAD or CLONE_PARENT.

       CLONE_NEWUSER
              (This flag first became meaningful for clone() in Linux 2.6.23;
              the current clone() semantics were merged in Linux 3.5; and the
              final pieces needed to make user namespaces completely usable
              were merged in Linux 3.8.)

              If CLONE_NEWUSER is set, then create the process in a new user
              namespace.  If this flag is not set, then (as with fork(2)) the
              process is created in the same user namespace as the calling
              process.

              Before Linux 3.8, use of CLONE_NEWUSER required that the caller
              have three capabilities: CAP_SYS_ADMIN, CAP_SETUID, and
              CAP_SETGID.  Starting with Linux 3.8, no privileges are needed
              to create a user namespace.

              This flag can't be specified in conjunction with CLONE_THREAD or
              CLONE_PARENT.  For security reasons, CLONE_NEWUSER cannot be
              specified in conjunction with CLONE_FS.

              For further information on user namespaces, see namespaces(7)
              and user_namespaces(7).

       CLONE_NEWUTS (since Linux 2.6.19)
              If CLONE_NEWUTS is set, then create the process in a new UTS
              namespace, whose identifiers are initialized by duplicating the
              identifiers from the UTS namespace of the calling process.  If
              this flag is not set, then (as with fork(2)) the process is
              created in the same UTS namespace as the calling process.  This
              flag is intended for the implementation of containers.

              A UTS namespace is the set of identifiers returned by uname(2);
              among these, the domain name and the hostname can be modified by
              setdomainname(2) and sethostname(2), respectively.  Changes made
              to the identifiers in a UTS namespace are visible to all other
              processes in the same namespace, but are not visible to
              processes in other UTS namespaces.

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWUTS.

              For further information on UTS namespaces, see namespaces(7).

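              The isolation of the hostname can be sketched as follows.  This
              example makes two assumptions that are not part of the text
              above: it adds CLONE_NEWUSER so that an unprivileged caller can
              attempt the call (this works only where unprivileged user
              namespaces are permitted), and it uses the hypothetical helper
              names rename_host and uts_isolated.  Where the call is refused
              (for example with EPERM), the helper reports -1.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

/* Child: change the hostname inside its own UTS namespace. */
static int rename_host(void *arg)
{
    const char *name = arg;
    return sethostname(name, strlen(name)) == 0 ? 0 : 1;
}

/*
 * Hypothetical helper.  Returns 1 if the child changed its hostname
 * while the caller's hostname stayed the same, 0 if the caller's
 * hostname changed, and -1 if the namespaces could not be set up.
 */
static int uts_isolated(void)
{
    char before[256] = "", after[256] = "";
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL || gethostname(before, sizeof(before) - 1) == -1) {
        free(stack);
        return -1;
    }

    int result = -1;
    int wstatus;
    /* Assumption: CLONE_NEWUSER lets an unprivileged caller try this. */
    pid_t pid = clone(rename_host, stack + STACK_SIZE,
                      CLONE_NEWUSER | CLONE_NEWUTS | SIGCHLD,
                      "sketch-child");
    if (pid != -1 && waitpid(pid, &wstatus, 0) == pid &&
        WIFEXITED(wstatus) && WEXITSTATUS(wstatus) == 0 &&
        gethostname(after, sizeof(after) - 1) == 0)
        result = (strcmp(before, after) == 0);

    free(stack);
    return result;
}
```
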
       CLONE_PARENT (since Linux 2.3.12)
              If CLONE_PARENT is set, then the parent of the new child (as
              returned by getppid(2)) will be the same as that of the calling
              process.

              If CLONE_PARENT is not set, then (as with fork(2)) the child's
              parent is the calling process.

              Note that it is the parent process, as returned by getppid(2),
              which is signaled when the child terminates, so that if
              CLONE_PARENT is set, then the parent of the calling process,
              rather than the calling process itself, will be signaled.

       CLONE_PARENT_SETTID (since Linux 2.5.49)
              Store the child thread ID at the location ptid in the parent's
              memory.  (In Linux 2.5.32-2.5.48 there was a flag CLONE_SETTID
              that did this.)  The store operation completes before clone()
              returns control to user space.

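              The sketch below combines CLONE_PARENT_SETTID with
              CLONE_CHILD_CLEARTID (described earlier) and CLONE_VM, so that
              the caller can observe both stores.  The helper name
              tid_words_updated is hypothetical.  The child deliberately
              avoids library calls because it shares memory with the caller.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

static int child_fn(void *arg)
{
    (void)arg;
    return 0;   /* no libc calls: memory is shared with the caller */
}

/*
 * Hypothetical helper.  Returns 1 if ptid held the child's thread ID
 * when clone() returned and ctid was zeroed once the child had exited,
 * 0 otherwise, and -1 on error.
 */
static int tid_words_updated(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    pid_t ptid = 0;
    pid_t ctid = -1;   /* sentinel, expected to be cleared to 0 */
    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      CLONE_VM | CLONE_PARENT_SETTID |
                      CLONE_CHILD_CLEARTID | SIGCHLD,
                      NULL, &ptid, NULL, &ctid);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int set_ok = (ptid == pid);
    int cleared_ok = (waitpid(pid, NULL, 0) == pid && ctid == 0);

    free(stack);
    return set_ok && cleared_ok;
}
```
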
       CLONE_PID (Linux 2.0 to 2.5.15)
              If CLONE_PID is set, the child process is created with the same
              process ID as the calling process.  This is good for hacking the
              system, but otherwise not of much use.  From Linux 2.3.21
              onward, this flag could be specified only by the system boot
              process (PID 0).  The flag disappeared completely from the
              kernel sources in Linux 2.5.16.  Since then, the kernel silently
              ignores this bit if it is specified in flags.

       CLONE_PTRACE (since Linux 2.2)
              If CLONE_PTRACE is specified, and the calling process is being
              traced, then trace the child also (see ptrace(2)).

       CLONE_SETTLS (since Linux 2.5.32)
              The TLS (Thread Local Storage) descriptor is set to newtls.

              The interpretation of newtls and the resulting effect is
              architecture dependent.  On x86, newtls is interpreted as a
              struct user_desc * (see set_thread_area(2)).  On x86-64 it is
              the new value to be set for the %fs base register (see the
              ARCH_SET_FS argument to arch_prctl(2)).  On architectures with a
              dedicated TLS register, it is the new value of that register.

       CLONE_SIGHAND (since Linux 2.0)
              If CLONE_SIGHAND is set, the calling process and the child
              process share the same table of signal handlers.  If the calling
              process or child process calls sigaction(2) to change the
              behavior associated with a signal, the behavior is changed in
              the other process as well.  However, the calling process and
              child processes still have distinct signal masks and sets of
              pending signals.  So, one of them may block or unblock signals
              using sigprocmask(2) without affecting the other process.

              If CLONE_SIGHAND is not set, the child process inherits a copy
              of the signal handlers of the calling process at the time
              clone() is called.  Calls to sigaction(2) performed later by one
              of the processes have no effect on the other process.

              Since Linux 2.6.0, flags must also include CLONE_VM if
              CLONE_SIGHAND is specified.

       CLONE_STOPPED (since Linux 2.6.0)
              If CLONE_STOPPED is set, then the child is initially stopped (as
              though it was sent a SIGSTOP signal), and must be resumed by
              sending it a SIGCONT signal.

              This flag was deprecated from Linux 2.6.25 onward, and was
              removed altogether in Linux 2.6.38.  Since then, the kernel
              silently ignores it without error.  Starting with Linux 4.6, the
              same bit was reused for the CLONE_NEWCGROUP flag.

       CLONE_SYSVSEM (since Linux 2.5.10)
              If CLONE_SYSVSEM is set, then the child and the calling process
              share a single list of System V semaphore adjustment (semadj)
              values (see semop(2)).  In this case, the shared list
              accumulates semadj values across all processes sharing the list,
              and semaphore adjustments are performed only when the last
              process that is sharing the list terminates (or ceases sharing
              the list using unshare(2)).  If this flag is not set, then the
              child has a separate semadj list that is initially empty.

       CLONE_THREAD (since Linux 2.4.0)
              If CLONE_THREAD is set, the child is placed in the same thread
              group as the calling process.  To make the remainder of the
              discussion of CLONE_THREAD more readable, the term "thread" is
              used to refer to the processes within a thread group.

              Thread groups were a feature added in Linux 2.4 to support the
              POSIX threads notion of a set of threads that share a single
              PID.  Internally, this shared PID is the so-called thread group
              identifier (TGID) for the thread group.  Since Linux 2.4, calls
              to getpid(2) return the TGID of the caller.

              The threads within a group can be distinguished by their
              (system-wide) unique thread IDs (TID).  A new thread's TID is
              available as the function result returned to the caller of
              clone(), and a thread can obtain its own TID using gettid(2).

              When a call is made to clone() without specifying CLONE_THREAD,
              then the resulting thread is placed in a new thread group whose
              TGID is the same as the thread's TID.  This thread is the leader
              of the new thread group.

              A new thread created with CLONE_THREAD has the same parent
              process as the caller of clone() (i.e., like CLONE_PARENT), so
              that calls to getppid(2) return the same value for all of the
              threads in a thread group.  When a CLONE_THREAD thread
              terminates, the thread that created it using clone() is not sent
              a SIGCHLD (or other termination) signal; nor can the status of
              such a thread be obtained using wait(2).  (The thread is said to
              be detached.)

              After all of the threads in a thread group terminate, the parent
              process of the thread group is sent a SIGCHLD (or other
              termination) signal.

              If any of the threads in a thread group performs an execve(2),
              then all threads other than the thread group leader are
              terminated, and the new program is executed in the thread group
              leader.

              If one of the threads in a thread group creates a child using
              fork(2), then any thread in the group can wait(2) for that
              child.

              Since Linux 2.5.35, flags must also include CLONE_SIGHAND if
              CLONE_THREAD is specified (and note that, since Linux 2.6.0,
              CLONE_SIGHAND also requires CLONE_VM to be included).

              Signal dispositions and actions are process-wide: if an
              unhandled signal is delivered to a thread, then it will affect
              (terminate, stop, continue, be ignored in) all members of the
              thread group.

              Each thread has its own signal mask, as set by sigprocmask(2).

              A signal may be process-directed or thread-directed.  A
              process-directed signal is targeted at a thread group (i.e., a
              TGID), and is delivered to an arbitrarily selected thread from
              among those that are not blocking the signal.  A signal may be
              process-directed because it was generated by the kernel for
              reasons other than a hardware exception, or because it was sent
              using kill(2) or sigqueue(3).  A thread-directed signal is
              targeted at (i.e., delivered to) a specific thread.  A signal
              may be thread-directed because it was sent using tgkill(2) or
              pthread_sigqueue(3), or because the thread executed a machine
              language instruction that triggered a hardware exception (e.g.,
              invalid memory access triggering SIGSEGV or a floating-point
              exception triggering SIGFPE).

              A call to sigpending(2) returns a signal set that is the union
              of the pending process-directed signals and the signals that are
              pending for the calling thread.

              If a process-directed signal is delivered to a thread group, and
              the thread group has installed a handler for the signal, then
              the handler will be invoked in exactly one, arbitrarily selected
              member of the thread group that has not blocked the signal.  If
              multiple threads in a group are waiting to accept the same
              signal using sigwaitinfo(2), the kernel will arbitrarily select
              one of these threads to receive the signal.

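              The TGID/TID relationship described for CLONE_THREAD can be
              sketched as follows.  The names record_ids, struct ids, and
              thread_shares_tgid are hypothetical.  Because a CLONE_THREAD
              child cannot be waited for, this sketch assumes one possible
              way to detect its exit: it points both ptid and ctid at the
              same word and sleeps on that word with futex(2) until
              CLONE_CHILD_CLEARTID zeroes it.  The new thread uses only raw
              system calls, since it runs without its own TLS.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

struct ids {
    pid_t tgid;
    pid_t tid;
};

/* Thread body: record the IDs seen by the new thread. */
static int record_ids(void *arg)
{
    struct ids *out = arg;
    out->tgid = (pid_t)syscall(SYS_getpid);
    out->tid = (pid_t)syscall(SYS_gettid);
    return 0;
}

/*
 * Hypothetical helper.  Returns 1 if the new thread saw the caller's
 * PID as its TGID and a distinct TID equal to clone()'s return value,
 * 0 otherwise, and -1 on error.
 */
static int thread_shares_tgid(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    struct ids seen = { 0, 0 };
    pid_t tid_word = 0;   /* set via ptid, zeroed via ctid on exit */
    pid_t tid = clone(record_ids, stack + STACK_SIZE,
                      CLONE_VM | CLONE_SIGHAND | CLONE_THREAD |
                      CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID,
                      &seen, &tid_word, NULL, &tid_word);
    if (tid == -1) {
        free(stack);
        return -1;
    }

    /* Sleep until the kernel clears tid_word and wakes the futex. */
    pid_t cur;
    while ((cur = __atomic_load_n(&tid_word, __ATOMIC_ACQUIRE)) != 0)
        syscall(SYS_futex, &tid_word, FUTEX_WAIT, cur, NULL, NULL, 0);

    free(stack);
    return seen.tgid == getpid() && seen.tid == tid &&
           seen.tid != seen.tgid;
}
```
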
       CLONE_UNTRACED (since Linux 2.5.46)
              If CLONE_UNTRACED is specified, then a tracing process cannot
              force CLONE_PTRACE on this child process.

       CLONE_VFORK (since Linux 2.2)
              If CLONE_VFORK is set, the execution of the calling process is
              suspended until the child releases its virtual memory resources
              via a call to execve(2) or _exit(2) (as with vfork(2)).

              If CLONE_VFORK is not set, then both the calling process and the
              child are schedulable after the call, and an application should
              not rely on execution occurring in any particular order.

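              The ordering guarantee of CLONE_VFORK can be sketched by
              combining it with CLONE_VM: by the time clone() returns in the
              caller, the child has already run and exited, so its write is
              visible without waiting.  The helper names set_flag and
              vfork_child_ran_first are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

/* Child: set the shared flag and exit (no libc calls needed). */
static int set_flag(void *arg)
{
    *(volatile int *)arg = 1;
    return 0;
}

/*
 * Hypothetical helper.  Returns 1 if the child's write was already
 * visible when clone() returned, 0 if not, and -1 on error.
 */
static int vfork_child_ran_first(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    volatile int flag = 0;
    pid_t pid = clone(set_flag, stack + STACK_SIZE,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, (void *)&flag);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int seen = flag;   /* read before waiting for the child */
    int reaped = (waitpid(pid, NULL, 0) == pid);

    free(stack);
    return reaped ? seen : -1;
}
```
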
       CLONE_VM (since Linux 2.0)
              If CLONE_VM is set, the calling process and the child process
              run in the same memory space.  In particular, memory writes
              performed by the calling process or by the child process are
              also visible in the other process.  Moreover, any memory mapping
              or unmapping performed with mmap(2) or munmap(2) by the child or
              calling process also affects the other process.

              If CLONE_VM is not set, the child process runs in a separate
              copy of the memory space of the calling process at the time of
              clone().  Memory writes or file mappings/unmappings performed by
              one of the processes do not affect the other, as with fork(2).

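              A minimal sketch of the difference, using the hypothetical
              helpers store_42 and value_after_child: the child writes to an
              integer owned by the caller, and the caller inspects it after
              the child has exited.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)   /* arbitrary size for this sketch */

/* Child: overwrite the integer that arg points to. */
static int store_42(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/*
 * Hypothetical helper.  extra_flags is 0 or CLONE_VM.  Returns the
 * caller's view of the integer after the child has exited, or -1 on
 * error.
 */
static int value_after_child(int extra_flags)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    int value = 1;
    int result = -1;
    pid_t pid = clone(store_42, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &value);
    if (pid != -1 && waitpid(pid, NULL, 0) == pid)
        result = value;

    free(stack);
    return result;
}
```

              With CLONE_VM the caller sees the child's write (42); without
              it, the child modifies only its own copy and the caller still
              sees 1.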

NOTES

       Note that the glibc clone() wrapper function makes some changes in the
       memory pointed to by child_stack (changes required to set the stack up
       correctly for the child) before invoking the clone() system call.  So,
       in cases where clone() is used to recursively create children, do not
       use the buffer employed for the parent's stack as the stack of the
       child.

   C library/kernel differences
       The raw clone() system call corresponds more closely to fork(2) in that
       execution in the child continues from the point of the call.  As such,
       the fn and arg arguments of the clone() wrapper function are omitted.

       Another difference for the raw clone() system call is that the
       child_stack argument may be NULL, in which case the child uses a
       duplicate of the parent's stack.  (Copy-on-write semantics ensure that
       the child gets separate copies of stack pages when either process
       modifies the stack.)  In this case, for correct operation, the CLONE_VM
       option should not be specified.  (If the child shares the parent's
       memory because of the use of the CLONE_VM flag, then no copy-on-write
       duplication occurs and chaos is likely to result.)

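       The fork-like behavior of the raw system call can be sketched with
       syscall(2).  This example assumes an architecture where flags is the
       first argument (such as x86-64); all remaining arguments are passed as
       zero, so a NULL child_stack is used and CLONE_VM is not set.  The
       helper name raw_clone_status is hypothetical.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper.  Creates a child with the raw clone() system
 * call and returns the child's exit status, or -1 on error.
 */
static int raw_clone_status(int code)
{
    /* flags = SIGCHLD, child_stack = NULL, other arguments unused. */
    long pid = syscall(SYS_clone, (unsigned long)SIGCHLD,
                       0UL, 0UL, 0UL, 0UL);
    if (pid == -1)
        return -1;

    if (pid == 0)
        _exit(code);   /* child: execution continues from the call */

    int wstatus;
    if (waitpid((pid_t)pid, &wstatus, 0) != (pid_t)pid ||
        !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```
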
       The order of the arguments also differs in the raw system call, and
       there are variations in the arguments across architectures, as detailed
       in the following paragraphs.

       The raw system call interface on x86-64 and some other architectures
       (including sh, tile, ia-64, and alpha) is:

           long clone(unsigned long flags, void *child_stack,
                      int *ptid, int *ctid,
                      unsigned long newtls);

       On x86-32, and several other common architectures (including score,
       ARM, ARM 64, PA-RISC, arc, Power PC, xtensa, and MIPS), the order of
       the last two arguments is reversed:

           long clone(unsigned long flags, void *child_stack,
                      int *ptid, unsigned long newtls,
                      int *ctid);

       On the cris and s390 architectures, the order of the first two
       arguments is reversed:

           long clone(void *child_stack, unsigned long flags,
                      int *ptid, int *ctid,
                      unsigned long newtls);

       On the microblaze architecture, an additional argument is supplied:

           long clone(unsigned long flags, void *child_stack,
                      int stack_size,         /* Size of stack */
                      int *ptid, int *ctid,
                      unsigned long newtls);

   blackfin, m68k, and sparc
       The argument-passing conventions on blackfin, m68k, and sparc are
       different from the descriptions above.  For details, see the kernel
       (and glibc) source.

   ia64
       On ia64, a different interface is used:

           int __clone2(int (*fn)(void *),
                        void *child_stack_base, size_t stack_size,
                        int flags, void *arg, ...
                        /* pid_t *ptid, struct user_desc *tls, pid_t *ctid */ );

       The prototype shown above is for the glibc wrapper function; for the
       system call itself, the prototype can be described as follows (it is
       identical to the clone() prototype on microblaze):

           long clone2(unsigned long flags, void *child_stack_base,
                       int stack_size,         /* Size of stack */
                       int *ptid, int *ctid,
                       unsigned long tls);

       __clone2() operates in the same way as clone(), except that
       child_stack_base points to the lowest address of the child's stack
       area, and stack_size specifies the size of the stack pointed to by
       child_stack_base.

   Linux 2.4 and earlier
       In Linux 2.4 and earlier, clone() does not take the arguments ptid,
       newtls, and ctid.

RETURN VALUE

       On success, the thread ID of the child process is returned in the
       caller's thread of execution.  On failure, -1 is returned in the
       caller's context, no child process will be created, and errno will be
       set appropriately.

ERRORS

       EAGAIN Too many processes are already running; see fork(2).

       EINVAL CLONE_SIGHAND was specified, but CLONE_VM was not.  (Since Linux
              2.6.0.)

       EINVAL CLONE_THREAD was specified, but CLONE_SIGHAND was not.  (Since
              Linux 2.5.35.)

       EINVAL CLONE_THREAD was specified, but the current process previously
              called unshare(2) with the CLONE_NEWPID flag or used setns(2) to
              reassociate itself with a PID namespace.

       EINVAL Both CLONE_FS and CLONE_NEWNS were specified in flags.

       EINVAL (since Linux 3.9)
              Both CLONE_NEWUSER and CLONE_FS were specified in flags.

       EINVAL Both CLONE_NEWIPC and CLONE_SYSVSEM were specified in flags.

       EINVAL One (or both) of CLONE_NEWPID or CLONE_NEWUSER and one (or both)
              of CLONE_THREAD or CLONE_PARENT were specified in flags.

       EINVAL Returned by the glibc clone() wrapper function when fn or
              child_stack is specified as NULL.

       EINVAL CLONE_NEWIPC was specified in flags, but the kernel was not
              configured with the CONFIG_SYSVIPC and CONFIG_IPC_NS options.

       EINVAL CLONE_NEWNET was specified in flags, but the kernel was not
              configured with the CONFIG_NET_NS option.

       EINVAL CLONE_NEWPID was specified in flags, but the kernel was not
              configured with the CONFIG_PID_NS option.

       EINVAL CLONE_NEWUSER was specified in flags, but the kernel was not
              configured with the CONFIG_USER_NS option.

       EINVAL CLONE_NEWUTS was specified in flags, but the kernel was not
              configured with the CONFIG_UTS_NS option.

       EINVAL child_stack is not aligned to a suitable boundary for this
              architecture.  For example, on aarch64, child_stack must be a
              multiple of 16.

       ENOMEM Cannot allocate sufficient memory to allocate a task structure
              for the child, or to copy those parts of the caller's context
              that need to be copied.

       ENOSPC (since Linux 3.7)
              CLONE_NEWPID was specified in flags, but the limit on the
              nesting depth of PID namespaces would have been exceeded; see
              pid_namespaces(7).

       ENOSPC (since Linux 4.9; beforehand EUSERS)
              CLONE_NEWUSER was specified in flags, and the call would cause
              the limit on the number of nested user namespaces to be
              exceeded.  See user_namespaces(7).

              From Linux 3.11 to Linux 4.8, the error diagnosed in this case
              was EUSERS.

       ENOSPC (since Linux 4.9)
              One of the values in flags specified the creation of a new user
              namespace, but doing so would have caused the limit defined by
              the corresponding file in /proc/sys/user to be exceeded.  For
              further details, see namespaces(7).

       EPERM  CLONE_NEWCGROUP, CLONE_NEWIPC, CLONE_NEWNET, CLONE_NEWNS,
              CLONE_NEWPID, or CLONE_NEWUTS was specified by an unprivileged
              process (process without CAP_SYS_ADMIN).

       EPERM  CLONE_PID was specified by a process other than process 0.
              (This error occurs only on Linux 2.5.15 and earlier.)

       EPERM  CLONE_NEWUSER was specified in flags, but either the effective
              user ID or the effective group ID of the caller does not have a
              mapping in the parent namespace (see user_namespaces(7)).

       EPERM  (since Linux 3.9)
              CLONE_NEWUSER was specified in flags and the caller is in a
              chroot environment (i.e., the caller's root directory does not
              match the root directory of the mount namespace in which it
              resides).

       ERESTARTNOINTR (since Linux 2.6.17)
              System call was interrupted by a signal and will be restarted.
              (This can be seen only during a trace.)

       EUSERS (Linux 3.11 to Linux 4.8)
              CLONE_NEWUSER was specified in flags, and the limit on the
              number of nested user namespaces would be exceeded.  See the
              discussion of the ENOSPC error above.


CONFORMING TO

       clone() is Linux-specific and should not be used in programs intended
       to be portable.

NOTES

       The kcmp(2) system call can be used to test whether two processes share
       various resources such as a file descriptor table, System V semaphore
       undo operations, or a virtual address space.

       Handlers registered using pthread_atfork(3) are not executed during a
       call to clone().

       In the Linux 2.4.x series, CLONE_THREAD generally does not make the
       parent of the new thread the same as the parent of the calling process.
       However, for kernel versions 2.4.7 to 2.4.18 the CLONE_THREAD flag
       implied the CLONE_PARENT flag (as in Linux 2.6.0 and later).

       For a while there was a CLONE_DETACHED flag (introduced in Linux
       2.5.32), which indicated that the parent wanted no child-exit signal.
       In Linux 2.6.2, the need to give this flag together with CLONE_THREAD
       disappeared.  This flag is still defined, but has no effect.

       On i386, clone() should not be called through vsyscall, but directly
       through int $0x80.


BUGS

       GNU C library versions 2.3.4 up to and including 2.24 contained a
       wrapper function for getpid(2) that performed caching of PIDs.  This
       caching relied on support in the glibc wrapper for clone(), but
       limitations in the implementation meant that the cache was not up to
       date in some circumstances.  In particular, if a signal was delivered
       to the child immediately after the clone() call, then a call to
       getpid(2) in a handler for the signal could return the PID of the
       calling process ("the parent"), if the clone wrapper had not yet had a
       chance to update the PID cache in the child.  (This discussion ignores
       the case where the child was created using CLONE_THREAD, when getpid(2)
       should return the same value in the child and in the process that
       called clone(), since the caller and the child are in the same thread
       group.  The stale-cache problem also does not occur if the flags
       argument includes CLONE_VM.)  To obtain the true PID, it was sometimes
       necessary to bypass the wrapper with code such as the following:

           #include <sys/syscall.h>    /* SYS_getpid */
           #include <sys/types.h>      /* pid_t */
           #include <unistd.h>         /* syscall() */

           pid_t mypid;

           /* Invoke the system call directly, bypassing the PID cache */
           mypid = syscall(SYS_getpid);

       Because of the stale-cache problem, as well as other problems noted in
       getpid(2), the PID caching feature was removed in glibc 2.25.

EXAMPLE

       The following program demonstrates the use of clone() to create a child
       process that executes in a separate UTS namespace.  The child changes
       the hostname in its UTS namespace.  Both parent and child then display
       the system hostname, making it possible to see that the hostname
       differs in the UTS namespaces of the parent and child.  For an example
       of the use of this program, see setns(2).

   Program source
       #define _GNU_SOURCE
       #include <sys/wait.h>
       #include <sys/utsname.h>
       #include <sched.h>
       #include <signal.h>
       #include <string.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>

       #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                               } while (0)

       static int              /* Start function for cloned child */
       childFunc(void *arg)
       {
           struct utsname uts;

           /* Change hostname in UTS namespace of child */

           if (sethostname(arg, strlen(arg)) == -1)
               errExit("sethostname");

           /* Retrieve and display hostname */

           if (uname(&uts) == -1)
               errExit("uname");
           printf("uts.nodename in child:  %s\n", uts.nodename);

           /* Keep the namespace open for a while, by sleeping.
              This allows some experimentation--for example, another
              process might join the namespace. */

           sleep(200);

           return 0;           /* Child terminates now */
       }

       #define STACK_SIZE (1024 * 1024)    /* Stack size for cloned child */

       int
       main(int argc, char *argv[])
       {
           char *stack;                    /* Start of stack buffer */
           char *stackTop;                 /* End of stack buffer */
           pid_t pid;
           struct utsname uts;

           if (argc < 2) {
               fprintf(stderr, "Usage: %s <child-hostname>\n", argv[0]);
               exit(EXIT_FAILURE);         /* Missing argument is an error */
           }

           /* Allocate stack for child */

           stack = malloc(STACK_SIZE);
           if (stack == NULL)
               errExit("malloc");
           stackTop = stack + STACK_SIZE;  /* Assume stack grows downward */

           /* Create child that has its own UTS namespace;
              child commences execution in childFunc() */

           pid = clone(childFunc, stackTop, CLONE_NEWUTS | SIGCHLD, argv[1]);
           if (pid == -1)
               errExit("clone");
           printf("clone() returned %ld\n", (long) pid);

           /* Parent falls through to here */

           sleep(1);           /* Give child time to change its hostname */

           /* Display hostname in parent's UTS namespace. This will be
              different from hostname in child's UTS namespace. */

           if (uname(&uts) == -1)
               errExit("uname");
           printf("uts.nodename in parent: %s\n", uts.nodename);

           if (waitpid(pid, NULL, 0) == -1)    /* Wait for child */
               errExit("waitpid");
           printf("child has terminated\n");

           exit(EXIT_SUCCESS);
       }

SEE ALSO

       fork(2), futex(2), getpid(2), gettid(2), kcmp(2), set_thread_area(2),
       set_tid_address(2), setns(2), tkill(2), unshare(2), wait(2),
       capabilities(7), namespaces(7), pthreads(7)

COLOPHON

       This page is part of release 5.02 of the Linux man-pages project.  A
       description of the project, information about reporting bugs, and the
       latest version of this page, can be found at
       https://www.kernel.org/doc/man-pages/.

Linux                             2019-08-02                          CLONE(2)