NAMESPACES(7)            Linux Programmer's Manual            NAMESPACES(7)

NAME
namespaces - overview of Linux namespaces

DESCRIPTION
A namespace wraps a global system resource in an abstraction that makes
it appear to the processes within the namespace that they have their
own isolated instance of the global resource. Changes to the global
resource are visible to other processes that are members of the
namespace, but are invisible to other processes. One use of namespaces
is to implement containers.

Linux provides the following namespaces:

    Namespace   Constant          Isolates
    Cgroup      CLONE_NEWCGROUP   Cgroup root directory
    IPC         CLONE_NEWIPC      System V IPC, POSIX message queues
    Network     CLONE_NEWNET      Network devices, stacks, ports, etc.
    Mount       CLONE_NEWNS       Mount points
    PID         CLONE_NEWPID      Process IDs
    User        CLONE_NEWUSER     User and group IDs
    UTS         CLONE_NEWUTS      Hostname and NIS domain name

This page describes the various namespaces and the associated /proc
files, and summarizes the APIs for working with namespaces.

The namespaces API
As well as various /proc files described below, the namespaces API
includes the following system calls:

clone(2)
    The clone(2) system call creates a new process. If the flags
    argument of the call specifies one or more of the CLONE_NEW*
    flags listed below, then new namespaces are created for each
    flag, and the child process is made a member of those
    namespaces. (This system call also implements a number of
    features unrelated to namespaces.)

setns(2)
    The setns(2) system call allows the calling process to join an
    existing namespace. The namespace to join is specified via a
    file descriptor that refers to one of the /proc/[pid]/ns files
    described below.

unshare(2)
    The unshare(2) system call moves the calling process to a new
    namespace. If the flags argument of the call specifies one or
    more of the CLONE_NEW* flags listed below, then new namespaces
    are created for each flag, and the calling process is made a
    member of those namespaces. (This system call also implements a
    number of features unrelated to namespaces.)

ioctl(2)
    Various ioctl(2) operations can be used to discover information
    about namespaces. These operations are described in
    ioctl_ns(2).

Creation of new namespaces using clone(2) and unshare(2) in most cases
requires the CAP_SYS_ADMIN capability, since, in the new namespace, the
creator will have the power to change global resources that are visible
to other processes that are subsequently created in, or join, the
namespace. User namespaces are the exception: since Linux 3.8, no
privilege is required to create a user namespace.
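
The privilege rule above can be probed from an ordinary process. The following is a minimal sketch, not a reference implementation: it assumes a Linux system whose C library exports unshare(2), calls it through ctypes in a forked child so that the caller's own namespaces are left untouched, and reports the resulting errno. The helper name try_unshare is hypothetical, and the flag values are copied from <linux/sched.h>.

```python
import ctypes
import errno
import os

# Flag values as defined in <linux/sched.h>.
CLONE_NEWUTS = 0x04000000
CLONE_NEWUSER = 0x10000000


def try_unshare(flags):
    """Call unshare(2) in a forked child; return 0 or the errno it failed with."""
    libc = ctypes.CDLL(None, use_errno=True)
    pid = os.fork()
    if pid == 0:
        # Child process: the result is reported through the exit status.
        status = 255  # used only if something unexpected goes wrong
        try:
            status = 0 if libc.unshare(flags) == 0 else ctypes.get_errno()
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


if __name__ == "__main__":
    attempts = (
        ("UTS only", CLONE_NEWUTS),
        ("user + UTS", CLONE_NEWUSER | CLONE_NEWUTS),
    )
    for label, flags in attempts:
        err = try_unshare(flags)
        print(label, "->", "ok" if err == 0 else errno.errorcode.get(err, err))
```

Without CAP_SYS_ADMIN, the first attempt would normally be refused, while the second may succeed because the user namespace is created together with the UTS namespace. Whether it does depends on local policy, since some systems restrict unprivileged user namespaces.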

The /proc/[pid]/ns/ directory
Each process has a /proc/[pid]/ns/ subdirectory containing one entry
for each namespace that supports being manipulated by setns(2):

    $ ls -l /proc/$$/ns
    total 0
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 cgroup -> cgroup:[4026531835]
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 ipc -> ipc:[4026531839]
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 mnt -> mnt:[4026531840]
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 net -> net:[4026531969]
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid -> pid:[4026531836]
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 pid_for_children -> pid:[4026531834]
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 user -> user:[4026531837]
    lrwxrwxrwx. 1 mtk mtk 0 Apr 28 12:46 uts -> uts:[4026531838]

Bind mounting (see mount(2)) one of the files in this directory to
somewhere else in the filesystem keeps the corresponding namespace of
the process specified by pid alive even if all processes currently in
the namespace terminate.

Opening one of the files in this directory (or a file that is bind
mounted to one of these files) returns a file handle for the
corresponding namespace of the process specified by pid. As long as
this file descriptor remains open, the namespace will remain alive,
even if all processes in the namespace terminate. The file descriptor
can be passed to setns(2).
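
As an illustration of the file-descriptor mechanism just described, the sketch below opens a namespace file and inspects the resulting descriptor. It assumes a Linux system with /proc mounted; the helper name open_namespace is hypothetical.

```python
import os


def open_namespace(pid, ns_type):
    """Open /proc/<pid>/ns/<ns_type> read-only and return the descriptor."""
    return os.open(f"/proc/{pid}/ns/{ns_type}", os.O_RDONLY)


if __name__ == "__main__":
    fd = open_namespace(os.getpid(), "uts")
    try:
        # The descriptor refers to the namespace object itself, so its
        # inode number is the one embedded in the symbolic link target.
        print(os.readlink(f"/proc/{os.getpid()}/ns/uts"), os.fstat(fd).st_ino)
        # While fd stays open, the namespace is kept alive. The descriptor
        # could also be passed to setns(2).
    finally:
        os.close(fd)
```

In Python 3.12 and later, os.setns() accepts such a descriptor directly. Joining most namespace types requires additional privilege, so that step is left out of the sketch.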

In Linux 3.7 and earlier, these files were visible as hard links.
Since Linux 3.8, they appear as symbolic links. If two processes are
in the same namespace, then the device IDs and inode numbers of their
/proc/[pid]/ns/xxx symbolic links will be the same; an application can
check this using the stat.st_dev and stat.st_ino fields returned by
stat(2). The content of this symbolic link is a string containing the
namespace type and inode number as in the following example:

    $ readlink /proc/$$/ns/uts
    uts:[4026531838]
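
The two checks described above (comparing st_dev and st_ino, and decoding the link text) can be sketched as follows. The function names same_namespace and parse_ns_link are hypothetical, and the code assumes the "type:[inode]" link format shown in the example.

```python
import os
import re


def same_namespace(pid_a, pid_b, ns_type):
    """Return True if both processes are in the same namespace of this type."""
    a = os.stat(f"/proc/{pid_a}/ns/{ns_type}")
    b = os.stat(f"/proc/{pid_b}/ns/{ns_type}")
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def parse_ns_link(target):
    """Split a link target such as 'uts:[4026531838]' into (type, inode)."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


if __name__ == "__main__":
    me = os.getpid()
    print(same_namespace(me, os.getppid(), "uts"))
    print(parse_ns_link(os.readlink(f"/proc/{me}/ns/uts")))
```

Comparing the stat(2) fields is more direct than comparing link strings, and it is the method the text recommends.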

The symbolic links in this subdirectory are as follows:

/proc/[pid]/ns/cgroup (since Linux 4.6)
    This file is a handle for the cgroup namespace of the process.

/proc/[pid]/ns/ipc (since Linux 3.0)
    This file is a handle for the IPC namespace of the process.

/proc/[pid]/ns/mnt (since Linux 3.8)
    This file is a handle for the mount namespace of the process.

/proc/[pid]/ns/net (since Linux 3.0)
    This file is a handle for the network namespace of the process.

/proc/[pid]/ns/pid (since Linux 3.8)
    This file is a handle for the PID namespace of the process.
    This handle is permanent for the lifetime of the process (i.e.,
    a process's PID namespace membership never changes).

/proc/[pid]/ns/pid_for_children (since Linux 4.12)
    This file is a handle for the PID namespace of child processes
    created by this process. This can change as a consequence of
    calls to unshare(2) and setns(2) (see pid_namespaces(7)), so the
    file may differ from /proc/[pid]/ns/pid. The symbolic link
    gains a value only after the first child process is created in
    the namespace. (Beforehand, readlink(2) of the symbolic link
    will return an empty buffer.)

/proc/[pid]/ns/user (since Linux 3.8)
    This file is a handle for the user namespace of the process.

/proc/[pid]/ns/uts (since Linux 3.0)
    This file is a handle for the UTS namespace of the process.

Permission to dereference or read (readlink(2)) these symbolic links is
governed by a ptrace access mode PTRACE_MODE_READ_FSCREDS check; see
ptrace(2).
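
A short sketch of how a tool might enumerate these links while tolerating the permission check is given below. The function name list_namespaces is hypothetical. The set of entries returned depends on the kernel version, as the "since Linux" annotations above indicate.

```python
import os


def list_namespaces(pid="self"):
    """Map each entry of /proc/<pid>/ns to its link target (None if unreadable)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for entry in sorted(os.listdir(ns_dir)):
        try:
            result[entry] = os.readlink(os.path.join(ns_dir, entry))
        except PermissionError:
            # The ptrace access mode check failed for this target process.
            result[entry] = None
    return result


if __name__ == "__main__":
    for name, target in list_namespaces().items():
        print(f"{name:18} {target}")
```

For another user's process, listing the directory itself may also be refused. The sketch lets that error propagate to the caller rather than hiding it.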

The /proc/sys/user directory
The files in the /proc/sys/user directory (which is present since Linux
4.9) expose limits on the number of namespaces of various types that
can be created. The files are as follows:

max_cgroup_namespaces
    The value in this file defines a per-user limit on the number of
    cgroup namespaces that may be created in the user namespace.

max_ipc_namespaces
    The value in this file defines a per-user limit on the number of
    IPC namespaces that may be created in the user namespace.

max_mnt_namespaces
    The value in this file defines a per-user limit on the number of
    mount namespaces that may be created in the user namespace.

max_net_namespaces
    The value in this file defines a per-user limit on the number of
    network namespaces that may be created in the user namespace.

max_pid_namespaces
    The value in this file defines a per-user limit on the number of
    PID namespaces that may be created in the user namespace.

max_user_namespaces
    The value in this file defines a per-user limit on the number of
    user namespaces that may be created in the user namespace.

max_uts_namespaces
    The value in this file defines a per-user limit on the number of
    UTS namespaces that may be created in the user namespace.

Note the following details about these files:

* The values in these files are modifiable by privileged processes.

* The values exposed by these files are the limits for the user
  namespace in which the opening process resides.

* The limits are per-user. Each user in the same user namespace can
  create namespaces up to the defined limit.

* The limits apply to all users, including UID 0.

* These limits apply in addition to any other per-namespace limits
  (such as those for PID and user namespaces) that may be enforced.

* Upon encountering these limits, clone(2) and unshare(2) fail with
  the error ENOSPC.

* For the initial user namespace, the default value in each of these
  files is half the limit on the number of threads that may be created
  (/proc/sys/kernel/threads-max). In all descendant user namespaces,
  the default value in each file is MAXINT.

* When a namespace is created, the object is also accounted against
  ancestor namespaces. More precisely:

  + Each user namespace has a creator UID.

  + When a namespace is created, it is accounted against the creator
    UIDs in each of the ancestor user namespaces, and the kernel
    ensures that the corresponding namespace limit for the creator
    UID in the ancestor namespace is not exceeded.

  + The aforementioned point ensures that creating a new user
    namespace cannot be used as a means to escape the limits in force
    in the current user namespace.
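
The limits in force for the caller's user namespace can be read like any other /proc file. The following sketch collects them into a dictionary. The function name read_namespace_limits is hypothetical, and the directory is passed as a parameter only so that the code degrades gracefully on kernels that lack it.

```python
import os


def read_namespace_limits(directory="/proc/sys/user"):
    """Return {file name: limit} for the max_*_namespaces files that exist."""
    limits = {}
    if not os.path.isdir(directory):
        # The directory is absent on kernels older than Linux 4.9.
        return limits
    for name in sorted(os.listdir(directory)):
        if name.startswith("max_") and name.endswith("_namespaces"):
            with open(os.path.join(directory, name)) as f:
                limits[name] = int(f.read().strip())
    return limits


if __name__ == "__main__":
    for name, value in read_namespace_limits().items():
        print(f"{name:24} {value}")
```

A program that receives ENOSPC from clone(2) or unshare(2) could consult these values to report which limit was probably reached.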

Cgroup namespaces (CLONE_NEWCGROUP)
See cgroup_namespaces(7).

IPC namespaces (CLONE_NEWIPC)
IPC namespaces isolate certain IPC resources, namely, System V IPC
objects (see sysvipc(7)) and (since Linux 2.6.30) POSIX message queues
(see mq_overview(7)). The common characteristic of these IPC
mechanisms is that IPC objects are identified by mechanisms other than
filesystem pathnames.

Each IPC namespace has its own set of System V IPC identifiers and its
own POSIX message queue filesystem. Objects created in an IPC
namespace are visible to all other processes that are members of that
namespace, but are not visible to processes in other IPC namespaces.

The following /proc interfaces are distinct in each IPC namespace:

* The POSIX message queue interfaces in /proc/sys/fs/mqueue.

* The System V IPC interfaces in /proc/sys/kernel, namely: msgmax,
  msgmnb, msgmni, sem, shmall, shmmax, shmmni, and shm_rmid_forced.

* The System V IPC interfaces in /proc/sysvipc.

When an IPC namespace is destroyed (i.e., when the last process that is
a member of the namespace terminates), all IPC objects in the namespace
are automatically destroyed.

Use of IPC namespaces requires a kernel that is configured with the
CONFIG_IPC_NS option.

Network namespaces (CLONE_NEWNET)
See network_namespaces(7).

Mount namespaces (CLONE_NEWNS)
See mount_namespaces(7).

PID namespaces (CLONE_NEWPID)
See pid_namespaces(7).

User namespaces (CLONE_NEWUSER)
See user_namespaces(7).

UTS namespaces (CLONE_NEWUTS)
UTS namespaces provide isolation of two system identifiers: the
hostname and the NIS domain name. These identifiers are set using
sethostname(2) and setdomainname(2), and can be retrieved using
uname(2), gethostname(2), and getdomainname(2).

When a process creates a new UTS namespace using clone(2) or unshare(2)
with the CLONE_NEWUTS flag, the hostname and NIS domain name of the new
UTS namespace are copied from the corresponding values in the caller's
UTS namespace.

Use of UTS namespaces requires a kernel that is configured with the
CONFIG_UTS_NS option.
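
The isolation described above can be observed with a small experiment. The sketch below is an illustration under stated assumptions, not a definitive recipe: it assumes that the system permits unprivileged user namespaces, so that a forked child can create a user namespace and a UTS namespace together and then change the hostname without affecting the parent. The function name hostname_in_new_uts_namespace is hypothetical, and the flag values are copied from <linux/sched.h>.

```python
import ctypes
import os
import socket

# Flag values as defined in <linux/sched.h>.
CLONE_NEWUTS = 0x04000000
CLONE_NEWUSER = 0x10000000


def hostname_in_new_uts_namespace(new_name):
    """Set a hostname inside a fresh UTS namespace in a child process.

    Returns the hostname seen by the child, or None if the namespaces
    could not be created or the hostname could not be changed.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_end)
            if libc.unshare(CLONE_NEWUSER | CLONE_NEWUTS) == 0:
                socket.sethostname(new_name)
                os.write(write_end, socket.gethostname().encode())
        finally:
            os._exit(0)
    os.close(write_end)
    with os.fdopen(read_end, "rb") as pipe:
        seen = pipe.read().decode()
    os.waitpid(pid, 0)
    return seen or None


if __name__ == "__main__":
    before = socket.gethostname()
    print("child saw:  ", hostname_in_new_uts_namespace("uts-demo"))
    print("parent sees:", socket.gethostname(), "(was", before + ")")
```

If the namespaces are created successfully, the child reports the new name while the parent's hostname stays as it was. Where creation is refused, the function returns None and nothing is changed.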

Namespace lifetime
Absent any other factors, a namespace is automatically torn down when
the last process in the namespace terminates or leaves the namespace.
However, there are a number of other factors that may pin a namespace
into existence even though it has no member processes. These factors
include the following:

* An open file descriptor or a bind mount exists for the corresponding
  /proc/[pid]/ns/* file.

* The namespace is hierarchical (i.e., a PID or user namespace), and
  has a child namespace.

* It is a user namespace that owns one or more nonuser namespaces.

* It is a PID namespace, and there is a process that refers to the
  namespace via a /proc/[pid]/ns/pid_for_children symbolic link.

* It is an IPC namespace, and a corresponding mount of an mqueue
  filesystem (see mq_overview(7)) refers to this namespace.

* It is a PID namespace, and a corresponding mount of a proc(5)
  filesystem refers to this namespace.

EXAMPLE
See clone(2) and user_namespaces(7).

SEE ALSO
nsenter(1), readlink(1), unshare(1), clone(2), ioctl_ns(2), setns(2),
unshare(2), proc(5), capabilities(7), cgroup_namespaces(7), cgroups(7),
credentials(7), network_namespaces(7), pid_namespaces(7),
user_namespaces(7), lsns(8), pam_namespace(8), switch_root(8)

COLOPHON
This page is part of release 5.02 of the Linux man-pages project. A
description of the project, information about reporting bugs, and the
latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.

Linux                           2019-08-02                    NAMESPACES(7)