CLONE(2)                   Linux Programmer's Manual                  CLONE(2)

NAME

       clone, __clone2 - create a child process

SYNOPSIS

       /* Prototype for the glibc wrapper function */

       #define _GNU_SOURCE
       #include <sched.h>

       int clone(int (*fn)(void *), void *child_stack,
                 int flags, void *arg, ...
                 /* pid_t *ptid, void *newtls, pid_t *ctid */ );

       /* For the prototype of the raw system call, see NOTES */

DESCRIPTION

       clone() creates a new process, in a manner similar to fork(2).

       This page describes both the glibc clone() wrapper function and the
       underlying system call on which it is based.  The main text describes
       the wrapper function; the differences for the raw system call are
       described toward the end of this page.

       Unlike fork(2), clone() allows the child process to share parts of its
       execution context with the calling process, such as the virtual address
       space, the table of file descriptors, and the table of signal handlers.
       (Note that on this manual page, "calling process" normally corresponds
       to "parent process".  But see the description of CLONE_PARENT below.)

       One use of clone() is to implement threads: multiple flows of control
       in a program that run concurrently in a shared address space.

       When the child process is created with clone(), it commences execution
       by calling the function pointed to by the argument fn.  (This differs
       from fork(2), where execution continues in the child from the point of
       the fork(2) call.)  The arg argument is passed as the argument of the
       function fn.

       When the fn(arg) function returns, the child process terminates.  The
       integer returned by fn is the exit status for the child process.  The
       child process may also terminate explicitly by calling exit(2) or after
       receiving a fatal signal.

       The child_stack argument specifies the location of the stack used by
       the child process.  Since the child and calling process may share
       memory, it is not possible for the child process to execute in the same
       stack as the calling process.  The calling process must therefore set
       up memory space for the child stack and pass a pointer to this space to
       clone().  Stacks grow downward on all processors that run Linux (except
       the HP PA processors), so child_stack usually points to the topmost
       address of the memory space set up for the child stack.

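       The following is a minimal sketch of the calling sequence described
       above, assuming a downward-growing stack.  The helper name
       run_child(), the function child_fn(), and the STACK_SIZE value are
       made up for this illustration and are not part of the interface.  The
       child runs child_fn(), and the value it returns is collected by the
       parent as the exit status.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

/* Entry point of the child; its return value becomes the exit status. */
static int child_fn(void *arg)
{
    return *(int *) arg;
}

/* Hypothetical helper: run child_fn() in a clone() child and return the
   exit status collected by the parent, or -1 on error. */
int run_child(int code)
{
    assert(code >= 0 && code <= 255);   /* an exit status is 8 bits wide */

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward here, so pass the topmost address. */
    char *stack_top = stack + STACK_SIZE;

    /* SIGCHLD in the low byte of flags lets a plain waitpid() reap the
       child.  CLONE_VM is not set, so the child sees a copy of 'code'. */
    pid_t pid = clone(child_fn, stack_top, SIGCHLD, &code);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited != pid || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```
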
       The low byte of flags contains the number of the termination signal
       sent to the parent when the child dies.  If this signal is specified as
       anything other than SIGCHLD, then the parent process must specify the
       __WALL or __WCLONE options when waiting for the child with wait(2).  If
       no signal is specified, then the parent process is not signaled when
       the child terminates.

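       As an illustration of the waiting rule above, the sketch below creates
       a child with no termination signal (low byte of flags is 0).  A plain
       waitpid(2) is then expected to fail with ECHILD, while a wait with
       __WALL succeeds.  The names reap_without_signal(), child_fn(), and
       STACK_SIZE are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

static int child_fn(void *arg)
{
    (void) arg;
    return 3;                       /* exit status seen by the parent */
}

/* Hypothetical helper: returns the child's exit status, or -1 if the
   wait calls did not behave as described in the text above. */
int reap_without_signal(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Low byte of flags is 0: no signal is sent when the child dies. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, 0, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status;

    /* Without __WALL or __WCLONE, this child is not eligible. */
    if (waitpid(pid, &status, 0) != -1 || errno != ECHILD) {
        free(stack);                /* the child runs on its own copy */
        return -1;
    }

    /* __WALL waits for children regardless of termination signal. */
    pid_t waited = waitpid(pid, &status, __WALL);
    free(stack);
    if (waited != pid)
        return -1;

    assert(WIFEXITED(status));      /* child_fn() simply returns */
    return WEXITSTATUS(status);
}
```
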
       flags may also be bitwise-ORed with zero or more of the following
       constants, in order to specify what is shared between the calling
       process and the child process:

       CLONE_CHILD_CLEARTID (since Linux 2.5.49)
              Clear (zero) the child thread ID at the location ctid in child
              memory when the child exits, and do a wakeup on the futex at
              that address.  The address involved may be changed by the
              set_tid_address(2) system call.  This is used by threading
              libraries.

       CLONE_CHILD_SETTID (since Linux 2.5.49)
              Store the child thread ID at the location ctid in the child's
              memory.  The store operation completes before clone() returns
              control to user space.

       CLONE_FILES (since Linux 2.0)
              If CLONE_FILES is set, the calling process and the child process
              share the same file descriptor table.  Any file descriptor
              created by the calling process or by the child process is also
              valid in the other process.  Similarly, if one of the processes
              closes a file descriptor, or changes its associated flags (using
              the fcntl(2) F_SETFD operation), the other process is also
              affected.  If a process sharing a file descriptor table calls
              execve(2), its file descriptor table is duplicated (unshared).

              If CLONE_FILES is not set, the child process inherits a copy of
              all file descriptors opened in the calling process at the time
              of clone().  Subsequent operations that open or close file
              descriptors, or change file descriptor flags, performed by
              either the calling process or the child process do not affect
              the other process.  Note, however, that the duplicated file
              descriptors in the child refer to the same open file
              descriptions as the corresponding file descriptors in the
              calling process, and thus share file offsets and file status
              flags (see open(2)).

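              The effect of CLONE_FILES can be observed with a sketch such
              as the following, in which the child closes a descriptor and
              the parent then checks whether its own descriptor is still
              valid.  The helper fd_survives_child_close() and the use of
              /dev/null are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

/* Child: close the descriptor whose number is passed in arg. */
static int close_fd(void *arg)
{
    return close(*(int *) arg) == 0 ? 0 : 1;
}

/* Hypothetical helper: returns 1 if the parent's descriptor is still
   open after the child closed it, 0 if it is gone, -1 on error.
   Pass CLONE_FILES or 0 as extra_flags. */
int fd_survives_child_close(int extra_flags)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd == -1)
        return -1;

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        close(fd);
        return -1;
    }

    pid_t pid = clone(close_fd, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, &fd);
    if (pid == -1) {
        free(stack);
        close(fd);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited != pid) {
        close(fd);
        return -1;
    }
    assert(WIFEXITED(status));      /* close_fd() simply returns */

    /* F_GETFD fails with EBADF if fd left the parent's table. */
    int still_open = fcntl(fd, F_GETFD) != -1;
    if (still_open)
        close(fd);
    return still_open;
}
```
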
       CLONE_FS (since Linux 2.0)
              If CLONE_FS is set, the caller and the child process share the
              same filesystem information.  This includes the root of the
              filesystem, the current working directory, and the umask.  Any
              call to chroot(2), chdir(2), or umask(2) performed by the
              calling process or the child process also affects the other
              process.

              If CLONE_FS is not set, the child process works on a copy of the
              filesystem information of the calling process at the time of the
              clone() call.  Calls to chroot(2), chdir(2), or umask(2)
              performed later by one of the processes do not affect the other
              process.

       CLONE_IO (since Linux 2.6.25)
              If CLONE_IO is set, then the new process shares an I/O context
              with the calling process.  If this flag is not set, then (as
              with fork(2)) the new process has its own I/O context.

              The I/O context is the I/O scope of the disk scheduler (i.e.,
              what the I/O scheduler uses to model scheduling of a process's
              I/O).  If processes share the same I/O context, they are treated
              as one by the I/O scheduler.  As a consequence, they get to
              share disk time.  For some I/O schedulers, if two processes
              share an I/O context, they will be allowed to interleave their
              disk access.  If several threads are doing I/O on behalf of the
              same process (aio_read(3), for instance), they should employ
              CLONE_IO to get better I/O performance.

              If the kernel is not configured with the CONFIG_BLOCK option,
              this flag is a no-op.

       CLONE_NEWCGROUP (since Linux 4.6)
              Create the process in a new cgroup namespace.  If this flag is
              not set, then (as with fork(2)) the process is created in the
              same cgroup namespace as the calling process.  This flag is
              intended for the implementation of containers.

              For further information on cgroup namespaces, see
              cgroup_namespaces(7).

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWCGROUP.

       CLONE_NEWIPC (since Linux 2.6.19)
              If CLONE_NEWIPC is set, then create the process in a new IPC
              namespace.  If this flag is not set, then (as with fork(2)), the
              process is created in the same IPC namespace as the calling
              process.  This flag is intended for the implementation of
              containers.

              An IPC namespace provides an isolated view of System V IPC
              objects (see svipc(7)) and (since Linux 2.6.30) POSIX message
              queues (see mq_overview(7)).  The common characteristic of these
              IPC mechanisms is that IPC objects are identified by mechanisms
              other than filesystem pathnames.

              Objects created in an IPC namespace are visible to all other
              processes that are members of that namespace, but are not
              visible to processes in other IPC namespaces.

              When an IPC namespace is destroyed (i.e., when the last process
              that is a member of the namespace terminates), all IPC objects
              in the namespace are automatically destroyed.

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWIPC.  This flag can't be specified in conjunction with
              CLONE_SYSVSEM.

              For further information on IPC namespaces, see namespaces(7).

       CLONE_NEWNET (since Linux 2.6.24)
              (The implementation of this flag was completed only by about
              kernel version 2.6.29.)

              If CLONE_NEWNET is set, then create the process in a new network
              namespace.  If this flag is not set, then (as with fork(2)) the
              process is created in the same network namespace as the calling
              process.  This flag is intended for the implementation of
              containers.

              A network namespace provides an isolated view of the networking
              stack (network device interfaces, IPv4 and IPv6 protocol stacks,
              IP routing tables, firewall rules, the /proc/net and
              /sys/class/net directory trees, sockets, etc.).  A physical
              network device can live in exactly one network namespace.  A
              virtual network (veth(4)) device pair provides a pipe-like
              abstraction that can be used to create tunnels between network
              namespaces, and can be used to create a bridge to a physical
              network device in another namespace.

              When a network namespace is freed (i.e., when the last process
              in the namespace terminates), its physical network devices are
              moved back to the initial network namespace (not to the parent
              of the process).  For further information on network namespaces,
              see namespaces(7).

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWNET.

       CLONE_NEWNS (since Linux 2.4.19)
              If CLONE_NEWNS is set, the cloned child is started in a new
              mount namespace, initialized with a copy of the namespace of the
              parent.  If CLONE_NEWNS is not set, the child lives in the same
              mount namespace as the parent.

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWNS.  It is not permitted to specify both CLONE_NEWNS
              and CLONE_FS in the same clone() call.

              For further information on mount namespaces, see namespaces(7)
              and mount_namespaces(7).

       CLONE_NEWPID (since Linux 2.6.24)
              If CLONE_NEWPID is set, then create the process in a new PID
              namespace.  If this flag is not set, then (as with fork(2)) the
              process is created in the same PID namespace as the calling
              process.  This flag is intended for the implementation of
              containers.

              For further information on PID namespaces, see namespaces(7) and
              pid_namespaces(7).

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWPID.  This flag can't be specified in conjunction with
              CLONE_THREAD or CLONE_PARENT.

       CLONE_NEWUSER
              (This flag first became meaningful for clone() in Linux 2.6.23,
              the current clone() semantics were merged in Linux 3.5, and the
              final pieces to make the user namespaces completely usable were
              merged in Linux 3.8.)

              If CLONE_NEWUSER is set, then create the process in a new user
              namespace.  If this flag is not set, then (as with fork(2)) the
              process is created in the same user namespace as the calling
              process.

              Before Linux 3.8, use of CLONE_NEWUSER required that the caller
              have three capabilities: CAP_SYS_ADMIN, CAP_SETUID, and
              CAP_SETGID.  Starting with Linux 3.8, no privileges are needed
              to create a user namespace.

              This flag can't be specified in conjunction with CLONE_THREAD or
              CLONE_PARENT.  For security reasons, CLONE_NEWUSER cannot be
              specified in conjunction with CLONE_FS.

              For further information on user namespaces, see namespaces(7)
              and user_namespaces(7).

       CLONE_NEWUTS (since Linux 2.6.19)
              If CLONE_NEWUTS is set, then create the process in a new UTS
              namespace, whose identifiers are initialized by duplicating the
              identifiers from the UTS namespace of the calling process.  If
              this flag is not set, then (as with fork(2)) the process is
              created in the same UTS namespace as the calling process.  This
              flag is intended for the implementation of containers.

              A UTS namespace is the set of identifiers returned by uname(2);
              among these, the domain name and the hostname can be modified by
              setdomainname(2) and sethostname(2), respectively.  Changes made
              to the identifiers in a UTS namespace are visible to all other
              processes in the same namespace, but are not visible to
              processes in other UTS namespaces.

              Only a privileged process (CAP_SYS_ADMIN) can employ
              CLONE_NEWUTS.

              For further information on UTS namespaces, see namespaces(7).

       CLONE_PARENT (since Linux 2.3.12)
              If CLONE_PARENT is set, then the parent of the new child (as
              returned by getppid(2)) will be the same as that of the calling
              process.

              If CLONE_PARENT is not set, then (as with fork(2)) the child's
              parent is the calling process.

              Note that it is the parent process, as returned by getppid(2),
              which is signaled when the child terminates, so that if
              CLONE_PARENT is set, then the parent of the calling process,
              rather than the calling process itself, will be signaled.

       CLONE_PARENT_SETTID (since Linux 2.5.49)
              Store the child thread ID at the location ptid in the parent's
              memory.  (In Linux 2.5.32-2.5.48 there was a flag CLONE_SETTID
              that did this.)  The store operation completes before clone()
              returns control to user space.

       CLONE_PID (Linux 2.0 to 2.5.15)
              If CLONE_PID is set, the child process is created with the same
              process ID as the calling process.  This is good for hacking the
              system, but otherwise of not much use.  From Linux 2.3.21
              onward, this flag could be specified only by the system boot
              process (PID 0).  The flag disappeared completely from the
              kernel sources in Linux 2.5.16.  Since then, the kernel silently
              ignores this bit if it is specified in flags.

       CLONE_PTRACE (since Linux 2.2)
              If CLONE_PTRACE is specified, and the calling process is being
              traced, then trace the child also (see ptrace(2)).

       CLONE_SETTLS (since Linux 2.5.32)
              The TLS (Thread Local Storage) descriptor is set to newtls.

              The interpretation of newtls and the resulting effect is
              architecture dependent.  On x86, newtls is interpreted as a
              struct user_desc * (see set_thread_area(2)).  On x86-64 it is
              the new value to be set for the %fs base register (see the
              ARCH_SET_FS argument to arch_prctl(2)).  On architectures with a
              dedicated TLS register, it is the new value of that register.

       CLONE_SIGHAND (since Linux 2.0)
              If CLONE_SIGHAND is set, the calling process and the child
              process share the same table of signal handlers.  If the calling
              process or child process calls sigaction(2) to change the
              behavior associated with a signal, the behavior is changed in
              the other process as well.  However, the calling process and
              child processes still have distinct signal masks and sets of
              pending signals.  So, one of them may block or unblock signals
              using sigprocmask(2) without affecting the other process.

              If CLONE_SIGHAND is not set, the child process inherits a copy
              of the signal handlers of the calling process at the time
              clone() is called.  Calls to sigaction(2) performed later by one
              of the processes have no effect on the other process.

              Since Linux 2.6.0-test6, flags must also include CLONE_VM if
              CLONE_SIGHAND is specified.

       CLONE_STOPPED (since Linux 2.6.0-test2)
              If CLONE_STOPPED is set, then the child is initially stopped (as
              though it was sent a SIGSTOP signal), and must be resumed by
              sending it a SIGCONT signal.

              This flag was deprecated from Linux 2.6.25 onward, and was
              removed altogether in Linux 2.6.38.  Since then, the kernel
              silently ignores it without error.  Starting with Linux 4.6, the
              same bit was reused for the CLONE_NEWCGROUP flag.

       CLONE_SYSVSEM (since Linux 2.5.10)
              If CLONE_SYSVSEM is set, then the child and the calling process
              share a single list of System V semaphore adjustment (semadj)
              values (see semop(2)).  In this case, the shared list
              accumulates semadj values across all processes sharing the list,
              and semaphore adjustments are performed only when the last
              process that is sharing the list terminates (or ceases sharing
              the list using unshare(2)).  If this flag is not set, then the
              child has a separate semadj list that is initially empty.

       CLONE_THREAD (since Linux 2.4.0-test8)
              If CLONE_THREAD is set, the child is placed in the same thread
              group as the calling process.  To make the remainder of the
              discussion of CLONE_THREAD more readable, the term "thread" is
              used to refer to the processes within a thread group.

              Thread groups were a feature added in Linux 2.4 to support the
              POSIX threads notion of a set of threads that share a single
              PID.  Internally, this shared PID is the so-called thread group
              identifier (TGID) for the thread group.  Since Linux 2.4, calls
              to getpid(2) return the TGID of the caller.

              The threads within a group can be distinguished by their
              (system-wide) unique thread IDs (TID).  A new thread's TID is
              available as the function result returned to the caller of
              clone(), and a thread can obtain its own TID using gettid(2).

              When a call is made to clone() without specifying CLONE_THREAD,
              then the resulting thread is placed in a new thread group whose
              TGID is the same as the thread's TID.  This thread is the leader
              of the new thread group.

              A new thread created with CLONE_THREAD has the same parent
              process as the caller of clone() (i.e., like CLONE_PARENT), so
              that calls to getppid(2) return the same value for all of the
              threads in a thread group.  When a CLONE_THREAD thread
              terminates, the thread that created it using clone() is not sent
              a SIGCHLD (or other termination) signal; nor can the status of
              such a thread be obtained using wait(2).  (The thread is said to
              be detached.)

              After all of the threads in a thread group terminate, the parent
              process of the thread group is sent a SIGCHLD (or other
              termination) signal.

              If any of the threads in a thread group performs an execve(2),
              then all threads other than the thread group leader are
              terminated, and the new program is executed in the thread group
              leader.

              If one of the threads in a thread group creates a child using
              fork(2), then any thread in the group can wait(2) for that
              child.

              Since Linux 2.5.35, flags must also include CLONE_SIGHAND if
              CLONE_THREAD is specified (and note that, since Linux
              2.6.0-test6, CLONE_SIGHAND also requires CLONE_VM to be
              included).

              Signals may be sent to a thread group as a whole (i.e., a TGID)
              using kill(2), or to a specific thread (i.e., TID) using
              tgkill(2).

              Signal dispositions and actions are process-wide: if an
              unhandled signal is delivered to a thread, then it will affect
              (terminate, stop, continue, be ignored in) all members of the
              thread group.

              Each thread has its own signal mask, as set by sigprocmask(2),
              but signals can be pending either: for the whole process (i.e.,
              deliverable to any member of the thread group), when sent with
              kill(2); or for an individual thread, when sent with tgkill(2).
              A call to sigpending(2) returns a signal set that is the union
              of the signals pending for the whole process and the signals
              that are pending for the calling thread.

              If kill(2) is used to send a signal to a thread group, and the
              thread group has installed a handler for the signal, then the
              handler will be invoked in exactly one, arbitrarily selected
              member of the thread group that has not blocked the signal.  If
              multiple threads in a group are waiting to accept the same
              signal using sigwaitinfo(2), the kernel will arbitrarily select
              one of these threads to receive a signal sent using kill(2).

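              The relationship between PID and TID can be illustrated with
              the sketch below.  It is not how a threading library is
              written: the new thread shares the caller's TLS and therefore
              does little more than make two system calls.  Because
              wait(2) cannot be used, the caller waits for the kernel to
              clear the word registered with CLONE_CHILD_CLEARTID.  The
              names thread_shares_pid(), thread_fn(), tid_word, and
              STACK_SIZE are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

static volatile pid_t seen_tgid;    /* getpid() as seen by the new thread */
static volatile pid_t seen_tid;     /* gettid() as seen by the new thread */
static volatile pid_t tid_word;     /* set on creation, cleared on exit */

static int thread_fn(void *arg)
{
    (void) arg;
    seen_tgid = getpid();
    seen_tid = (pid_t) syscall(SYS_gettid);
    return 0;
}

/* Hypothetical helper: returns 1 if the new thread reported the caller's
   PID and a TID equal to the clone() result, 0 if not, -1 on error. */
int thread_shares_pid(void)
{
    assert(sizeof(tid_word) == 4);  /* futex words are 32 bits wide */

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* CLONE_THREAD requires CLONE_SIGHAND, which requires CLONE_VM. */
    int flags = CLONE_VM | CLONE_SIGHAND | CLONE_THREAD |
                CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID;

    tid_word = -1;
    pid_t tid = clone(thread_fn, stack + STACK_SIZE, flags, NULL,
                      (pid_t *) &tid_word, NULL, (pid_t *) &tid_word);
    if (tid == -1) {
        free(stack);
        return -1;
    }

    /* Sleep on the futex until the kernel zeroes tid_word at exit. */
    pid_t cur;
    while ((cur = tid_word) != 0)
        syscall(SYS_futex, &tid_word, FUTEX_WAIT, cur, NULL, NULL, 0);

    free(stack);                    /* the thread has left its stack */
    return seen_tgid == getpid() && seen_tid == tid &&
           seen_tid != getpid();
}
```
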
       CLONE_UNTRACED (since Linux 2.5.46)
              If CLONE_UNTRACED is specified, then a tracing process cannot
              force CLONE_PTRACE on this child process.

       CLONE_VFORK (since Linux 2.2)
              If CLONE_VFORK is set, the execution of the calling process is
              suspended until the child releases its virtual memory resources
              via a call to execve(2) or _exit(2) (as with vfork(2)).

              If CLONE_VFORK is not set, then both the calling process and the
              child are schedulable after the call, and an application should
              not rely on execution occurring in any particular order.

       CLONE_VM (since Linux 2.0)
              If CLONE_VM is set, the calling process and the child process
              run in the same memory space.  In particular, memory writes
              performed by the calling process or by the child process are
              also visible in the other process.  Moreover, any memory mapping
              or unmapping performed with mmap(2) or munmap(2) by the child or
              calling process also affects the other process.

              If CLONE_VM is not set, the child process runs in a separate
              copy of the memory space of the calling process at the time of
              clone().  Memory writes or file mappings/unmappings performed by
              one of the processes do not affect the other, as with fork(2).

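       The memory-sharing behavior of CLONE_VM, and the ordering guarantee of
       CLONE_VFORK, can be observed with a sketch such as the following.  The
       child writes 1 to a global variable; the parent samples the variable
       once immediately after clone() returns and once after waiting.  The
       names observe_child_write(), writer(), shared_word, and STACK_SIZE are
       assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for this sketch */

static volatile int shared_word;

/* Child: write to a global that may or may not be shared. */
static int writer(void *arg)
{
    (void) arg;
    shared_word = 1;
    return 0;
}

/* Hypothetical helper: returns the value of shared_word that the parent
   sees after the child has terminated, or -1 on error.  The value seen
   right after clone() returned is stored in *seen_on_return.  Pass 0,
   CLONE_VM, or CLONE_VM | CLONE_VFORK as extra_flags. */
int observe_child_write(int extra_flags, int *seen_on_return)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_word = 0;
    pid_t pid = clone(writer, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    /* With CLONE_VFORK the parent was suspended until the child
       exited, so the child's write has already happened here. */
    *seen_on_return = shared_word;

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited != pid)
        return -1;

    assert(WIFEXITED(status));      /* writer() simply returns */
    return shared_word;
}
```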

NOTES

       Note that the glibc clone() wrapper function makes some changes in the
       memory pointed to by child_stack (changes required to set the stack up
       correctly for the child) before invoking the clone() system call.  So,
       in cases where clone() is used to recursively create children, do not
       use the buffer employed for the parent's stack as the stack of the
       child.

   C library/kernel differences
       The raw clone() system call corresponds more closely to fork(2) in that
       execution in the child continues from the point of the call.  As such,
       the fn and arg arguments of the clone() wrapper function are omitted.

       Another difference for the raw clone() system call is that the
       child_stack argument may be NULL, in which case the child uses a
       duplicate of the parent's stack.  (Copy-on-write semantics ensure that
       the child gets separate copies of stack pages when either process
       modifies the stack.)  In this case, for correct operation, the CLONE_VM
       option should not be specified.  (If the child shares the parent's
       memory because of the use of the CLONE_VM flag, then no copy-on-write
       duplication occurs and chaos is likely to result.)

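       A minimal sketch of the raw system call used in this fork-like manner
       follows.  It passes SIGCHLD as flags and zero for every other argument,
       so it does not depend on the order of the trailing arguments; it does
       assume an architecture on which flags is the first argument.  The
       helper name raw_clone_child_status() and the exit status 42 are made up
       for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child with the raw system call and
   return the exit status collected by the parent, or -1 on error. */
int raw_clone_child_status(void)
{
    /* flags = SIGCHLD, child_stack = NULL, other arguments zero.
       CLONE_VM is not set, so the child gets a copy-on-write
       duplicate of the parent's stack. */
    long ret = syscall(SYS_clone, (unsigned long) SIGCHLD,
                       NULL, NULL, NULL, 0UL);
    if (ret == -1)
        return -1;

    if (ret == 0) {
        /* Child: execution continued from the point of the call. */
        _exit(42);
    }

    /* Parent: ret is the thread ID of the child. */
    int status;
    pid_t waited = waitpid((pid_t) ret, &status, 0);
    if (waited != (pid_t) ret)
        return -1;

    assert(WIFEXITED(status));      /* the child calls _exit() */
    return WEXITSTATUS(status);
}
```
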
       The order of the arguments also differs in the raw system call, and
       there are variations in the arguments across architectures, as detailed
       in the following paragraphs.

       The raw system call interface on x86-64 and some other architectures
       (including sh, tile, and alpha) is:

           long clone(unsigned long flags, void *child_stack,
                      int *ptid, int *ctid,
                      unsigned long newtls);

       On x86-32, and several other common architectures (including score,
       ARM, ARM 64, PA-RISC, arc, Power PC, xtensa, and MIPS), the order of
       the last two arguments is reversed:

           long clone(unsigned long flags, void *child_stack,
                      int *ptid, unsigned long newtls,
                      int *ctid);

       On the cris and s390 architectures, the order of the first two
       arguments is reversed:

           long clone(void *child_stack, unsigned long flags,
                      int *ptid, int *ctid,
                      unsigned long newtls);

       On the microblaze architecture, an additional argument is supplied:

           long clone(unsigned long flags, void *child_stack,
                      int stack_size,         /* Size of stack */
                      int *ptid, int *ctid,
                      unsigned long newtls);

   blackfin, m68k, and sparc
       The argument-passing conventions on blackfin, m68k, and sparc are
       different from the descriptions above.  For details, see the kernel
       (and glibc) source.

   ia64
       On ia64, a different interface is used:

       int __clone2(int (*fn)(void *),
                    void *child_stack_base, size_t stack_size,
                    int flags, void *arg, ...
                    /* pid_t *ptid, struct user_desc *tls, pid_t *ctid */ );

       The prototype shown above is for the glibc wrapper function; the raw
       system call interface has no fn or arg argument, and changes the order
       of the arguments so that flags is the first argument, and tls is the
       last argument.

       __clone2() operates in the same way as clone(), except that
       child_stack_base points to the lowest address of the child's stack
       area, and stack_size specifies the size of the stack pointed to by
       child_stack_base.

   Linux 2.4 and earlier
       In Linux 2.4 and earlier, clone() does not take arguments ptid, tls,
       and ctid.

RETURN VALUE

       On success, the thread ID of the child process is returned in the
       caller's thread of execution.  On failure, -1 is returned in the
       caller's context, no child process will be created, and errno will be
       set appropriately.

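       The failure convention can be exercised with flag combinations that the
       flag descriptions above state are invalid: CLONE_SIGHAND without
       CLONE_VM, and CLONE_THREAD without CLONE_SIGHAND.  The following is a
       sketch only; the names clone_errno(), never_runs(), and STACK_SIZE are
       assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>

#define STACK_SIZE (64 * 1024)      /* arbitrary size chosen for this sketch */

/* Never executed: the clone() calls below are expected to fail. */
static int never_runs(void *arg)
{
    (void) arg;
    return 0;
}

/* Hypothetical helper: attempt clone() with an invalid flag combination
   and return the resulting errno value (0 if the call succeeded). */
int clone_errno(int flags)
{
    /* Only meant for combinations that the kernel rejects. */
    assert(((flags & CLONE_SIGHAND) && !(flags & CLONE_VM)) ||
           ((flags & CLONE_THREAD) && !(flags & CLONE_SIGHAND)));

    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return ENOMEM;

    errno = 0;
    pid_t pid = clone(never_runs, stack + STACK_SIZE, flags, NULL);
    int saved_errno = errno;        /* save before free() can change it */

    free(stack);
    return pid == -1 ? saved_errno : 0;
}
```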

ERRORS

       EAGAIN Too many processes are already running; see fork(2).

       EINVAL CLONE_SIGHAND was specified, but CLONE_VM was not.  (Since Linux
              2.6.0-test6.)

       EINVAL CLONE_THREAD was specified, but CLONE_SIGHAND was not.  (Since
              Linux 2.5.35.)

       EINVAL Both CLONE_FS and CLONE_NEWNS were specified in flags.

       EINVAL (since Linux 3.9)
              Both CLONE_NEWUSER and CLONE_FS were specified in flags.

       EINVAL Both CLONE_NEWIPC and CLONE_SYSVSEM were specified in flags.

       EINVAL One (or both) of CLONE_NEWPID or CLONE_NEWUSER and one (or both)
              of CLONE_THREAD or CLONE_PARENT were specified in flags.

       EINVAL Returned by the glibc clone() wrapper function when fn or
              child_stack is specified as NULL.

       EINVAL CLONE_NEWIPC was specified in flags, but the kernel was not
              configured with the CONFIG_SYSVIPC and CONFIG_IPC_NS options.

       EINVAL CLONE_NEWNET was specified in flags, but the kernel was not
              configured with the CONFIG_NET_NS option.

       EINVAL CLONE_NEWPID was specified in flags, but the kernel was not
              configured with the CONFIG_PID_NS option.

       EINVAL CLONE_NEWUTS was specified in flags, but the kernel was not
              configured with the CONFIG_UTS_NS option.

       EINVAL child_stack is not aligned to a suitable boundary for this
              architecture.  For example, on aarch64, child_stack must be a
              multiple of 16.

       ENOMEM Insufficient memory is available to allocate a task structure
              for the child, or to copy those parts of the caller's context
              that need to be copied.

       ENOSPC (since Linux 3.7)
              CLONE_NEWPID was specified in flags, but the limit on the
              nesting depth of PID namespaces would have been exceeded; see
              pid_namespaces(7).

       ENOSPC (since Linux 4.9; beforehand EUSERS)
              CLONE_NEWUSER was specified in flags, and the call would cause
              the limit on the number of nested user namespaces to be
              exceeded.  See user_namespaces(7).

              From Linux 3.11 to Linux 4.8, the error diagnosed in this case
              was EUSERS.

       ENOSPC (since Linux 4.9)
              One of the values in flags specified the creation of a new user
              namespace, but doing so would have caused the limit defined by
              the corresponding file in /proc/sys/user to be exceeded.  For
              further details, see namespaces(7).

       EPERM  CLONE_NEWCGROUP, CLONE_NEWIPC, CLONE_NEWNET, CLONE_NEWNS,
              CLONE_NEWPID, or CLONE_NEWUTS was specified by an unprivileged
              process (process without CAP_SYS_ADMIN).

       EPERM  CLONE_PID was specified by a process other than process 0.
              (This error occurs only on Linux 2.5.15 and earlier.)

       EPERM  CLONE_NEWUSER was specified in flags, but either the effective
              user ID or the effective group ID of the caller does not have a
              mapping in the parent namespace (see user_namespaces(7)).

       EPERM  (since Linux 3.9)
              CLONE_NEWUSER was specified in flags and the caller is in a
              chroot environment (i.e., the caller's root directory does not
              match the root directory of the mount namespace in which it
              resides).

       ERESTARTNOINTR (since Linux 2.6.17)
              System call was interrupted by a signal and will be restarted.
              (This can be seen only during a trace.)

       EUSERS (Linux 3.11 to Linux 4.8)
              CLONE_NEWUSER was specified in flags, and the limit on the
              number of nested user namespaces would be exceeded.  See the
              discussion of the ENOSPC error above.
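       Two of the EINVAL cases above concern the child_stack argument.  The
       following sketch, offered as an illustration only, shows the glibc
       wrapper rejecting a NULL stack and one way to round a stack top down
       to a 16-byte boundary.  The helper names and the fixed alignment of
       16 are assumptions for this example; the required alignment is
       architecture-specific.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Start function that is never expected to run in this sketch */
static int
noop_child(void *arg)
{
    (void) arg;
    return 0;
}

/* Hypothetical helper: call the glibc clone() wrapper with a NULL child
   stack and report the resulting errno.  The wrapper is documented to
   fail with EINVAL before any child is created.  Returns 0 if the call
   unexpectedly succeeded. */
static int
errno_for_null_stack(void)
{
    errno = 0;
    if (clone(noop_child, NULL, SIGCHLD, NULL) == -1)
        return errno;
    return 0;
}

/* Hypothetical helper: compute the top of a stack buffer of the given
   size, rounded down to a multiple of 16 (assumed alignment). */
static void *
aligned_stack_top(void *base, size_t size)
{
    uintptr_t top = (uintptr_t) base + size;
    return (void *) (top & ~(uintptr_t) 15);
}
```

       Rounding the top downward keeps the result inside the buffer, at the
       cost of up to 15 unused bytes at the end.
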

CONFORMING TO

       clone() is Linux-specific and should not be used in programs intended
       to be portable.

NOTES

       The kcmp(2) system call can be used to test whether two processes share
       various resources such as a file descriptor table, System V semaphore
       undo operations, or a virtual address space.

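       As a rough sketch of such a test, the helper below asks kcmp(2)
       whether two processes share a virtual address space.  The name
       shares_vm() is hypothetical.  glibc provides no wrapper for kcmp(2),
       so the call goes through syscall(2); the sketch assumes that the
       kernel was built with kcmp support and that the caller is permitted
       to inspect both processes.

```c
#define _GNU_SOURCE
#include <linux/kcmp.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: compare the virtual address spaces of two
   processes.  Returns 0 if they are shared, a positive value if they
   are not, and -1 with errno set if kcmp(2) fails (for example, if the
   system call is unavailable or not permitted). */
static int
shares_vm(pid_t pid1, pid_t pid2)
{
    /* The two trailing index arguments are unused for KCMP_VM */
    return (int) syscall(SYS_kcmp, pid1, pid2, KCMP_VM, 0UL, 0UL);
}
```

       For a child created with CLONE_VM, shares_vm(getpid(), child_pid)
       would be expected to return 0; for a child created with fork(2), a
       positive value.
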
       Handlers registered using pthread_atfork(3) are not executed during a
       call to clone().

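       The difference from fork(2) can be observed with a counter, as in
       this sketch.  The helper prepare_calls_for() and the other names are
       hypothetical.  The sketch assumes a single-threaded caller, and older
       glibc versions may require linking with -pthread.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define ATFORK_STACK_SIZE (64 * 1024)   /* Arbitrary size for this sketch */

static int prepare_calls;       /* Incremented by the "prepare" handler */

static void
count_prepare(void)
{
    prepare_calls++;
}

static int
atfork_child(void *arg)
{
    (void) arg;
    return 0;
}

/* Hypothetical helper: create one child with fork() (use_clone == 0) or
   with clone() (use_clone != 0), and return how many times the prepare
   handler ran during that call.  Returns -1 on error. */
static int
prepare_calls_for(int use_clone)
{
    static int registered;
    char *stack = NULL;
    pid_t pid;

    if (!registered) {
        if (pthread_atfork(count_prepare, NULL, NULL) != 0)
            return -1;
        registered = 1;
    }

    int before = prepare_calls;

    if (use_clone) {
        stack = malloc(ATFORK_STACK_SIZE);
        if (stack == NULL)
            return -1;
        pid = clone(atfork_child, stack + ATFORK_STACK_SIZE, SIGCHLD, NULL);
    } else {
        pid = fork();
        if (pid == 0)
            _exit(0);           /* Child of fork() exits immediately */
    }

    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }

    free(stack);
    return prepare_calls - before;
}
```

       With this helper, a fork() is expected to run the handler once,
       while a clone() leaves the counter unchanged.
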
       In the Linux 2.4.x series, CLONE_THREAD generally does not make the
       parent of the new thread the same as the parent of the calling process.
       However, for kernel versions 2.4.7 to 2.4.18 the CLONE_THREAD flag
       implied the CLONE_PARENT flag (as in Linux 2.6.0 and later).

       For a while there was a CLONE_DETACHED flag (introduced in Linux
       2.5.32), which indicated that the parent wanted no child-exit signal.
       In Linux 2.6.2, the need to specify this flag together with
       CLONE_THREAD disappeared.  The flag is still defined, but has no
       effect.

       On i386, clone() should not be called through vsyscall, but directly
       through int $0x80.

BUGS

       GNU C library versions 2.3.4 up to and including 2.24 contained a
       wrapper function for getpid(2) that performed caching of PIDs.  This
       caching relied on support in the glibc wrapper for clone(), but
       limitations in the implementation meant that the cache was not up to
       date in some circumstances.  In particular, if a signal was delivered
       to the child immediately after the clone() call, then a call to
       getpid(2) in a handler for the signal could return the PID of the
       calling process ("the parent"), if the clone wrapper had not yet had a
       chance to update the PID cache in the child.  (This discussion ignores
       the case where the child was created using CLONE_THREAD, when getpid(2)
       should return the same value in the child and in the process that
       called clone(), since the caller and the child are in the same thread
       group.  The stale-cache problem also does not occur if the flags
       argument includes CLONE_VM.)  To obtain the true PID, it was sometimes
       necessary to use code such as the following:

           #include <sys/syscall.h>
           #include <unistd.h>

           pid_t mypid;

           mypid = syscall(SYS_getpid);    /* Bypass the glibc PID cache */

       Because of the stale-cache problem, as well as other problems noted in
       getpid(2), the PID caching feature was removed in glibc 2.25.

EXAMPLE

       The following program demonstrates the use of clone() to create a child
       process that executes in a separate UTS namespace.  The child changes
       the hostname in its UTS namespace.  Both parent and child then display
       the system hostname, making it possible to see that the hostname
       differs in the UTS namespaces of the parent and child.  For an example
       of the use of this program, see setns(2).

   Program source
       #define _GNU_SOURCE
       #include <sys/wait.h>
       #include <sys/utsname.h>
       #include <sched.h>
       #include <signal.h>
       #include <string.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>

       #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                               } while (0)

       static int              /* Start function for cloned child */
       childFunc(void *arg)
       {
           struct utsname uts;

           /* Change hostname in UTS namespace of child */

           if (sethostname(arg, strlen(arg)) == -1)
               errExit("sethostname");

           /* Retrieve and display hostname */

           if (uname(&uts) == -1)
               errExit("uname");
           printf("uts.nodename in child:  %s\n", uts.nodename);

           /* Keep the namespace open for a while, by sleeping.
              This allows some experimentation--for example, another
              process might join the namespace. */

           sleep(200);

           return 0;           /* Child terminates now */
       }

       #define STACK_SIZE (1024 * 1024)    /* Stack size for cloned child */

       int
       main(int argc, char *argv[])
       {
           char *stack;                    /* Start of stack buffer */
           char *stackTop;                 /* End of stack buffer */
           pid_t pid;
           struct utsname uts;

           if (argc < 2) {
               fprintf(stderr, "Usage: %s <child-hostname>\n", argv[0]);
               exit(EXIT_FAILURE);         /* Missing argument is an error */
           }

           /* Allocate stack for child */

           stack = malloc(STACK_SIZE);
           if (stack == NULL)
               errExit("malloc");
           stackTop = stack + STACK_SIZE;  /* Assume stack grows downward */

           /* Create child that has its own UTS namespace;
              child commences execution in childFunc() */

           pid = clone(childFunc, stackTop, CLONE_NEWUTS | SIGCHLD, argv[1]);
           if (pid == -1)
               errExit("clone");
           printf("clone() returned %ld\n", (long) pid);

           /* Parent falls through to here */

           sleep(1);           /* Give child time to change its hostname */

           /* Display hostname in parent's UTS namespace. This will be
              different from hostname in child's UTS namespace. */

           if (uname(&uts) == -1)
               errExit("uname");
           printf("uts.nodename in parent: %s\n", uts.nodename);

           if (waitpid(pid, NULL, 0) == -1)    /* Wait for child */
               errExit("waitpid");
           printf("child has terminated\n");

           exit(EXIT_SUCCESS);
       }

SEE ALSO

       fork(2), futex(2), getpid(2), gettid(2), kcmp(2), set_thread_area(2),
       set_tid_address(2), setns(2), tkill(2), unshare(2), wait(2),
       capabilities(7), namespaces(7), pthreads(7)

COLOPHON

       This page is part of release 4.16 of the Linux man-pages project.  A
       description of the project, information about reporting bugs, and the
       latest version of this page, can be found at
       https://www.kernel.org/doc/man-pages/.

Linux                             2017-09-15                          CLONE(2)