UNSHARE(1) User Commands UNSHARE(1)

NAME
unshare - run program in new namespaces

SYNOPSIS
unshare [options] [program [arguments]]

DESCRIPTION
The unshare command creates new namespaces (as specified by the
command-line options described below) and then executes the specified
program. If program is not given, then "${SHELL}" is run (default:
/bin/sh).

By default, a new namespace persists only as long as it has member
processes. A new namespace can be made persistent even when it has no
member processes by bind mounting /proc/pid/ns/type files to a
filesystem path. A namespace that has been made persistent in this way
can subsequently be entered with nsenter(1) even after the program
terminates (except for PID namespaces, where a permanently running init
process is required). Once a persistent namespace is no longer needed,
it can be unpersisted by using umount(8) to remove the bind mount. See
the EXAMPLES section for more details.
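The /proc/pid/ns/type files mentioned above are symbolic links whose targets identify a namespace. The following is a minimal sketch, not part of unshare itself; the helper name ns_id is hypothetical, and the second command assumes that unprivileged user namespaces are enabled on the system:

```shell
# Print the identifier of one namespace of a process (default: the caller).
# ns_id is a hypothetical helper used only for this illustration.
ns_id() {
    readlink "/proc/${2:-self}/ns/$1" 2>/dev/null || echo unknown
}

outer=$(ns_id uts)
echo "current UTS namespace: $outer"

# Create a new UTS namespace and print its identifier from the inside.
# If unshare is not permitted here, report that instead of failing.
inner=$(unshare --user --map-root-user --uts readlink /proc/self/ns/uts 2>/dev/null) \
    || inner=unavailable
echo "inside unshare:        $inner"
```

Two processes are in the same namespace exactly when the link targets are identical, so a differing value inside unshare indicates that a new namespace was created.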

Since util-linux version 2.36, unshare uses the
/proc/[pid]/ns/pid_for_children and /proc/[pid]/ns/time_for_children
files for persistent PID and time namespaces. This change requires
Linux kernel 4.17 or newer.

The following types of namespaces can be created with unshare:

mount namespace
Mounting and unmounting filesystems will not affect the rest of the
system, except for filesystems which are explicitly marked as
shared (with mount --make-shared; see /proc/self/mountinfo or
findmnt -o+PROPAGATION for the shared flags). For further details,
see mount_namespaces(7).

Since util-linux version 2.27, unshare automatically sets
propagation to private in a new mount namespace to make sure that
the new namespace is really unshared. This feature can be disabled
with the option --propagation unchanged. Note that private is
the kernel default.
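The propagation type in effect can be checked with findmnt(8). A small sketch, assuming findmnt is installed and unprivileged user namespaces are enabled; the helper name propagation_of is hypothetical:

```shell
# Print the propagation type of the mount containing a path (default: /).
# propagation_of is a hypothetical helper used only for this illustration.
propagation_of() {
    p=$(findmnt -n -o PROPAGATION --target "${1:-/}" 2>/dev/null)
    echo "${p:-unknown}"
}

echo "initial namespace: $(propagation_of /)"

# Same query from inside a freshly created mount namespace.
inside=$(unshare --user --map-root-user --mount \
    findmnt -n -o PROPAGATION --target / 2>/dev/null)
echo "new namespace:     ${inside:-unavailable}"
```

Adding --propagation unchanged to the unshare command line should make the second value follow whatever the kernel assigns instead of the setting applied by unshare.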

UTS namespace
Setting hostname or domainname will not affect the rest of the
system. For further details, see uts_namespaces(7).

IPC namespace
The process will have an independent namespace for POSIX message
queues as well as System V message queues, semaphore sets and
shared memory segments. For further details, see ipc_namespaces(7).

network namespace
The process will have independent IPv4 and IPv6 stacks, IP routing
tables, firewall rules, the /proc/net and /sys/class/net directory
trees, sockets, etc. For further details, see
network_namespaces(7).

PID namespace
Children will have a distinct set of PID-to-process mappings from
their parent. For further details, see pid_namespaces(7).

cgroup namespace
The process will have a virtualized view of /proc/self/cgroup, and
new cgroup mounts will be rooted at the namespace cgroup root. For
further details, see cgroup_namespaces(7).

user namespace
The process will have a distinct set of UIDs, GIDs and
capabilities. For further details, see user_namespaces(7).

time namespace
The process can have a distinct view of CLOCK_MONOTONIC and/or
CLOCK_BOOTTIME which can be changed using
/proc/self/timens_offsets. For further details, see
time_namespaces(7).

OPTIONS
-i, --ipc[=file]
Create a new IPC namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file.

-m, --mount[=file]
Create a new mount namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file. Note
that file must be located on a mount whose propagation type is not
shared (or an error results). Use the command findmnt
-o+PROPAGATION when not sure about the current setting. See also
the examples below.

-n, --net[=file]
Create a new network namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file.

-p, --pid[=file]
Create a new PID namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file.
(Creation of a persistent PID namespace will fail if the --fork
option is not also specified.)

See also the --fork and --mount-proc options.

-u, --uts[=file]
Create a new UTS namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file.

-U, --user[=file]
Create a new user namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file.

-C, --cgroup[=file]
Create a new cgroup namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file.

-T, --time[=file]
Create a new time namespace. If file is specified, then the
namespace is made persistent by creating a bind mount at file. The
--monotonic and --boottime options can be used to specify the
corresponding offset in the time namespace.

-f, --fork
Fork the specified program as a child process of unshare rather
than running it directly. This is useful when creating a new PID
namespace. Note that while unshare is waiting for the child process,
it ignores SIGINT and SIGTERM and does not forward any signals
to the child. Signals must therefore be sent directly to the child
process.

--keep-caps
When the --user option is given, ensure that capabilities granted
in the user namespace are preserved in the child process.

--kill-child[=signame]
When unshare terminates, have signame be sent to the forked child
process. Combined with --pid this allows for an easy and reliable
killing of the entire process tree below unshare. If not given,
signame defaults to SIGKILL. This option implies --fork.

--mount-proc[=mountpoint]
Just before running the program, mount the proc filesystem at
mountpoint (default is /proc). This is useful when creating a new
PID namespace. It also implies creating a new mount namespace since
the /proc mount would otherwise disrupt existing programs on the
system. The new proc filesystem is explicitly mounted as private
(with MS_PRIVATE|MS_REC).

--map-user=uid|name
Run the program only after the current effective user ID has been
mapped to uid. If this option is specified multiple times, the last
occurrence takes precedence. This option implies --user.

--map-users=inneruid:outeruid:count|auto
Run the program only after the block of user IDs of size count
beginning at outeruid has been mapped to the block of user IDs
beginning at inneruid. This mapping is created with newuidmap(1).
If the range of user IDs overlaps with the mapping specified by
--map-user, then a "hole" will be removed from the mapping. This
may result in the highest user ID of the mapping not being mapped.
The special value auto will map the first block of user IDs owned
by the effective user from /etc/subuid to a block starting at user
ID 0. If this option is specified multiple times, the last
occurrence takes precedence. This option implies --user.

Before util-linux version 2.39, this option expected a
comma-separated argument of the form outeruid,inneruid,count but
that format is now deprecated for consistency with the ordering
used in /proc/[pid]/uid_map and the X-mount.idmap mount option.
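Scripts written for releases before 2.39 may still pass the comma-separated form. The helper below is a hypothetical sketch (the name to_new_map_format is not part of util-linux) that rewrites such an argument into the current inner:outer:count order and leaves other values untouched:

```shell
# Convert a legacy "outer,inner,count" argument to "inner:outer:count".
# Arguments already in colon form, and the keyword "auto", pass through.
# to_new_map_format is a hypothetical helper, not a util-linux command.
to_new_map_format() {
    case "$1" in
        *,*,*)
            outer=${1%%,*}      # text before the first comma
            rest=${1#*,}        # text after the first comma
            inner=${rest%%,*}
            count=${rest#*,}
            printf '%s:%s:%s\n' "$inner" "$outer" "$count"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

to_new_map_format 100000,0,65536    # legacy order
to_new_map_format 0:100000:65536    # already current
to_new_map_format auto
```

The result can then be passed on, for example as --map-users="$(to_new_map_format "$arg")". The same reordering applies to --map-groups.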

--map-group=gid|name
Run the program only after the current effective group ID has been
mapped to gid. If this option is specified multiple times, the last
occurrence takes precedence. This option implies --setgroups=deny
and --user.

--map-groups=innergid:outergid:count|auto
Run the program only after the block of group IDs of size count
beginning at outergid has been mapped to the block of group IDs
beginning at innergid. This mapping is created with newgidmap(1).
If the range of group IDs overlaps with the mapping specified by
--map-group, then a "hole" will be removed from the mapping. This
may result in the highest group ID of the mapping not being mapped.
The special value auto will map the first block of group IDs owned
by the effective user from /etc/subgid to a block starting at group
ID 0. If this option is specified multiple times, the last
occurrence takes precedence. This option implies --user.

Before util-linux version 2.39, this option expected a
comma-separated argument of the form outergid,innergid,count but
that format is now deprecated for consistency with the ordering
used in /proc/[pid]/gid_map and the X-mount.idmap mount option.

--map-auto
Map the first block of user IDs owned by the effective user from
/etc/subuid to a block starting at user ID 0. In the same manner,
also map the first block of group IDs owned by the effective user
from /etc/subgid to a block starting at group ID 0. This option is
intended to handle the common case where the first block of
subordinate user and group IDs can map the whole user and group ID
space. This option is equivalent to specifying --map-users=auto and
--map-groups=auto.

-r, --map-root-user
Run the program only after the current effective user and group IDs
have been mapped to the superuser UID and GID in the newly created
user namespace. This makes it possible to conveniently gain
capabilities needed to manage various aspects of the newly created
namespaces (such as configuring interfaces in the network namespace
or mounting filesystems in the mount namespace) even when run
unprivileged. As a mere convenience feature, it does not support
more sophisticated use cases, such as mapping multiple ranges of
UIDs and GIDs. This option implies --setgroups=deny and --user.
This option is equivalent to --map-user=0 --map-group=0.

-c, --map-current-user
Run the program only after the current effective user and group IDs
have been mapped to the same UID and GID in the newly created user
namespace. This option implies --setgroups=deny and --user. This
option is equivalent to --map-user=$(id -ru) --map-group=$(id -rg).
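The effect of the two convenience mappings can be compared by printing the user ID seen inside the new namespace. This sketch assumes unprivileged user namespaces are enabled; where they are not, it reports the check as unavailable rather than failing:

```shell
outer_uid=$(id -u)

# -c, --map-current-user: the caller keeps its own UID inside the namespace.
same_uid=$(unshare --map-current-user id -u 2>/dev/null) || same_uid=unavailable

# -r, --map-root-user: the caller appears as the superuser instead.
root_uid=$(unshare --map-root-user id -u 2>/dev/null) || root_uid=unavailable

echo "outside: $outer_uid  with -c: $same_uid  with -r: $root_uid"
```

Neither option grants additional privileges outside the namespace; only the view of the IDs inside it changes.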

--propagation private|shared|slave|unchanged
Recursively set the mount propagation flag in the new mount
namespace. The default is to set the propagation to private. It is
possible to disable this feature with the argument unchanged. The
option is silently ignored when the mount namespace (--mount) is
not requested.

--setgroups allow|deny
Allow or deny the setgroups(2) system call in a user namespace.

To be able to call setgroups(2), the calling process must at least
have CAP_SETGID. But since Linux 3.19 a further restriction
applies: the kernel gives permission to call setgroups(2) only
after the GID map (/proc/pid/gid_map) has been set. The GID map
is writable by root when setgroups(2) is enabled (i.e., allow, the
default), and the GID map becomes writable by unprivileged
processes when setgroups(2) is permanently disabled (with deny).
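The current policy is exposed by the kernel in /proc/pid/setgroups (available since Linux 3.19). A minimal sketch for inspecting it, assuming unprivileged user namespaces are enabled; the helper name setgroups_state is hypothetical:

```shell
# Report the setgroups(2) policy of a process (default: the caller).
# setgroups_state is a hypothetical helper used only for this illustration.
setgroups_state() {
    cat "/proc/${1:-self}/setgroups" 2>/dev/null || echo unknown
}

echo "outside: $(setgroups_state)"

# --map-root-user implies --setgroups=deny, so the policy read inside the
# new user namespace is expected to be "deny".
inside=$(unshare --user --map-root-user cat /proc/self/setgroups 2>/dev/null) \
    || inside=unavailable
echo "inside:  $inside"
```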

-R, --root=dir
Run the command with root directory set to dir.

-w, --wd=dir
Change working directory to dir.

-S, --setuid uid
Set the user ID which will be used in the entered namespace.

-G, --setgid gid
Set the group ID which will be used in the entered namespace and
drop supplementary groups.

--monotonic offset
Set the offset of CLOCK_MONOTONIC which will be used in the entered
time namespace. This option requires unsharing a time namespace
with --time.

--boottime offset
Set the offset of CLOCK_BOOTTIME which will be used in the entered
time namespace. This option requires unsharing a time namespace
with --time.

-h, --help
Display help text and exit.

-V, --version
Print version and exit.

NOTES
Mounting the proc and sysfs filesystems as root in a user namespace
has to be restricted so that a less privileged user cannot gain
access to sensitive files that a more privileged user made unavailable.
In short, the rule for proc and sysfs is to behave as closely to a bind
mount as possible.

EXAMPLES
The following command creates a PID namespace, using --fork to ensure
that the executed command is performed in a child process that (being
the first process in the namespace) has PID 1. The --mount-proc option
ensures that a new mount namespace is also simultaneously created and
that a new proc(5) filesystem is mounted that contains information
corresponding to the new PID namespace. When the readlink(1) command
terminates, the new namespaces are automatically torn down.

# unshare --fork --pid --mount-proc readlink /proc/self
1

As an unprivileged user, create a new user namespace where the user’s
credentials are mapped to the root IDs inside the namespace:

$ id -u; id -g
1000
1000
$ unshare --user --map-root-user \
    sh -c 'whoami; cat /proc/self/uid_map /proc/self/gid_map'
root
0 1000 1
0 1000 1

As an unprivileged user, create a user namespace where the first 65536
IDs are all mapped, and the user’s credentials are mapped to the root
IDs inside the namespace. The map is determined by the subordinate IDs
assigned in subuid(5) and subgid(5). Demonstrate this mapping by
creating a file with user ID 1 and group ID 1. For brevity, only the
user ID mappings are shown:

$ id -u
1000
$ cat /etc/subuid
1000:100000:65536
$ unshare --user --map-auto --map-root-user
# id -u
0
# cat /proc/self/uid_map
0 1000 1
1 100000 65535
# touch file; chown 1:1 file
# ls -ln --time-style=+ file
-rw-r--r-- 1 1 1 0 file
# exit
$ ls -ln --time-style=+ file
-rw-r--r-- 1 100000 100000 0 file
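
The uid_map shown above follows from simple arithmetic on the /etc/subuid entry. The sketch below reproduces it for the sample values of this example only; the function name expected_uid_map is hypothetical, the subuid line is passed as a string rather than read from the file, and only the case where the single-ID mapping collides with the first ID of the block is covered:

```shell
# Print the uid_map expected for --map-auto --map-root-user.
# $1: the caller's UID, $2: one /etc/subuid line (owner:first_id:count).
# expected_uid_map is a hypothetical helper used only for this illustration.
expected_uid_map() {
    rest=${2#*:}                # drop the owner field
    sub_start=${rest%%:*}
    sub_count=${rest#*:}

    # --map-root-user: the caller's UID becomes ID 0 inside the namespace.
    echo "0 $1 1"
    # --map-auto asks for a block starting at inner ID 0, but ID 0 is taken,
    # so the block starts at 1 and is one ID shorter (the "hole").
    echo "1 $sub_start $((sub_count - 1))"
}

expected_uid_map 1000 1000:100000:65536

# The highest subordinate ID is the one that ends up without a mapping.
echo "unmapped outer ID: $((100000 + 65536 - 1))"
```

This also explains the final ls output: inner ID 1 corresponds to outer ID 100000, the first subordinate ID.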

The first of the following commands creates a new persistent UTS
namespace and modifies the hostname as seen in that namespace. The
namespace is then entered with nsenter(1) in order to display the
modified hostname; this step demonstrates that the UTS namespace
continues to exist even though the namespace had no member processes
after the unshare command terminated. The namespace is then destroyed
by removing the bind mount.

# touch /root/uts-ns
# unshare --uts=/root/uts-ns hostname FOO
# nsenter --uts=/root/uts-ns hostname
FOO
# umount /root/uts-ns

The following commands establish a persistent mount namespace
referenced by the bind mount /root/namespaces/mnt. In order to ensure
that the creation of that bind mount succeeds, the parent directory
(/root/namespaces) is made a bind mount whose propagation type is not
shared.

# mount --bind /root/namespaces /root/namespaces
# mount --make-private /root/namespaces
# touch /root/namespaces/mnt
# unshare --mount=/root/namespaces/mnt

The following commands demonstrate the use of the --kill-child option
when creating a PID namespace, in order to ensure that when unshare is
killed, all of the processes within the PID namespace are killed.

# set +m                # Don't print job status messages

# unshare --pid --fork --mount-proc --kill-child -- \
    bash --norc -c '(sleep 555 &) && (ps a &) && sleep 999' &
[1] 53456
#   PID TTY      STAT   TIME COMMAND
      1 pts/3    S+     0:00 sleep 999
      3 pts/3    S+     0:00 sleep 555
      5 pts/3    R+     0:00 ps a

# ps h -o 'comm' $!     # Show that background job is unshare(1)
unshare
# kill $!               # Kill unshare(1)
# pidof sleep

The pidof(1) command prints no output, because the sleep processes have
been killed. More precisely, when the sleep process that has PID 1 in
the namespace (i.e., the namespace’s init process) was killed, this
caused all other processes in the namespace to be killed. By contrast,
a similar series of commands where the --kill-child option is not used
shows that when unshare terminates, the processes in the PID namespace
are not killed:

# unshare --pid --fork --mount-proc -- \
    bash --norc -c '(sleep 555 &) && (ps a &) && sleep 999' &
[1] 53479
#   PID TTY      STAT   TIME COMMAND
      1 pts/3    S+     0:00 sleep 999
      3 pts/3    S+     0:00 sleep 555
      5 pts/3    R+     0:00 ps a

# kill $!
# pidof sleep
53482 53480

The following example demonstrates the creation of a time namespace
where the boottime clock is set to a point several years in the past:

# uptime -p             # Show uptime in initial time namespace
up 21 hours, 30 minutes
# unshare --time --fork --boottime 300000000 uptime -p
up 9 years, 28 weeks, 1 day, 2 hours, 50 minutes
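
The offset passed to --boottime is a number of seconds. As a small sketch, an offset for a chosen number of years can be computed in the shell; the variable names are illustrative, years are counted as 365 days, and the unshare call is only attempted as root because setting the offset needs privileges:

```shell
# Offset in seconds for roughly ten years (365-day years, an approximation).
years=10
offset=$(( years * 365 * 24 * 60 * 60 ))
echo "offset: $offset seconds"

# Only try to create the time namespace when running as root.
if [ "$(id -u)" -eq 0 ]; then
    unshare --time --fork --boottime "$offset" uptime -p \
        || echo "time namespace could not be created here"
fi
```

The same arithmetic applies to --monotonic.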

AUTHORS
Mikhail Gusarov <dottedmag@dottedmag.net>, Karel Zak <kzak@redhat.com>

SEE ALSO
newuidmap(1), newgidmap(1), clone(2), unshare(2), namespaces(7),
mount(8)

REPORTING BUGS
For bug reports, use the issue tracker at
https://github.com/util-linux/util-linux/issues.

AVAILABILITY
The unshare command is part of the util-linux package which can be
downloaded from Linux Kernel Archive
<https://www.kernel.org/pub/linux/utils/util-linux/>.

util-linux 2.39.2 2023-06-14 UNSHARE(1)