unshare(1)

UNSHARE(1)                       User Commands                      UNSHARE(1)

NAME

       unshare - run program in new namespaces

SYNOPSIS

       unshare [options] [program [arguments]]

DESCRIPTION

       The unshare command creates new namespaces (as specified by the
       command-line options described below) and then executes the specified
       program. If program is not given, then "${SHELL}" is run (default:
       /bin/sh).

       By default, a new namespace persists only as long as it has member
       processes. A new namespace can be made persistent even when it has no
       member processes by bind mounting /proc/pid/ns/type files to a
       filesystem path. A namespace that has been made persistent in this way
       can subsequently be entered with nsenter(1) even after the program
       terminates (except PID namespaces where a permanently running init
       process is required). Once a persistent namespace is no longer needed,
       it can be unpersisted by using umount(8) to remove the bind mount. See
       the EXAMPLES section for more details.

       Since util-linux version 2.36, unshare uses the
       /proc/[pid]/ns/pid_for_children and /proc/[pid]/ns/time_for_children
       files for persistent PID and TIME namespaces. This change requires
       Linux kernel 4.17 or newer.

       The following types of namespaces can be created with unshare:

       mount namespace
           Mounting and unmounting filesystems will not affect the rest of the
           system, except for filesystems which are explicitly marked as
           shared (with mount --make-shared; see /proc/self/mountinfo or
           findmnt -o+PROPAGATION for the shared flags). For further details,
           see mount_namespaces(7).

           Since util-linux version 2.27, unshare automatically sets
           propagation to private in a new mount namespace to make sure that
           the new namespace is really unshared. It’s possible to disable this
           feature with option --propagation unchanged. Note that private is
           the kernel default.

       UTS namespace
           Setting hostname or domainname will not affect the rest of the
           system. For further details, see uts_namespaces(7).

       IPC namespace
           The process will have an independent namespace for POSIX message
           queues as well as System V message queues, semaphore sets and
           shared memory segments. For further details, see ipc_namespaces(7).

       network namespace
           The process will have independent IPv4 and IPv6 stacks, IP routing
           tables, firewall rules, the /proc/net and /sys/class/net directory
           trees, sockets, etc. For further details, see
           network_namespaces(7).

       PID namespace
           Children will have a distinct set of PID-to-process mappings from
           their parent. For further details, see pid_namespaces(7).

       cgroup namespace
           The process will have a virtualized view of /proc/self/cgroup, and
           new cgroup mounts will be rooted at the namespace cgroup root. For
           further details, see cgroup_namespaces(7).

       user namespace
           The process will have a distinct set of UIDs, GIDs and
           capabilities. For further details, see user_namespaces(7).

       time namespace
           The process can have a distinct view of CLOCK_MONOTONIC and/or
           CLOCK_BOOTTIME which can be changed using
           /proc/self/timens_offsets. For further details, see
           time_namespaces(7).
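
       The namespaces that a process currently belongs to can be inspected
       through the /proc/pid/ns/type files mentioned above. The following is
       a minimal sketch, not part of unshare itself; the function name
       list_namespaces is hypothetical. It prints one line per namespace type
       exposed by the running kernel, which makes it easy to compare the view
       before and after running a command under unshare.

```shell
#!/bin/sh
# Hypothetical helper: print "<type> <link target>" for every namespace
# link of the current shell. The bracketed number in the link target
# identifies the namespace and differs between distinct namespaces.
list_namespaces() {
    for link in /proc/self/ns/*; do
        # Skip the unexpanded glob if /proc/self/ns is not available.
        [ -L "$link" ] || continue
        printf '%s %s\n' "${link##*/}" "$(readlink "$link")"
    done
}

list_namespaces
```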

OPTIONS

       -i, --ipc[=file]
           Create a new IPC namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file.

       -m, --mount[=file]
           Create a new mount namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file. Note
           that file must be located on a mount whose propagation type is not
           shared (or an error results). Use the command findmnt
           -o+PROPAGATION when not sure about the current setting. See also
           the examples below.

       -n, --net[=file]
           Create a new network namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file.

       -p, --pid[=file]
           Create a new PID namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file.
           (Creation of a persistent PID namespace will fail if the --fork
           option is not also specified.)

           See also the --fork and --mount-proc options.

       -u, --uts[=file]
           Create a new UTS namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file.

       -U, --user[=file]
           Create a new user namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file.

       -C, --cgroup[=file]
           Create a new cgroup namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file.

       -T, --time[=file]
           Create a new time namespace. If file is specified, then the
           namespace is made persistent by creating a bind mount at file. The
           --monotonic and --boottime options can be used to specify the
           corresponding offset in the time namespace.

       -f, --fork
           Fork the specified program as a child process of unshare rather
           than running it directly. This is useful when creating a new PID
           namespace. Note that while unshare is waiting for the child
           process, it ignores SIGINT and SIGTERM and does not forward any
           signals to the child. Signals must therefore be sent directly to
           the child process.

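       A minimal sketch of why --fork matters for PID namespaces: the forked
       child is the first process in the new namespace and therefore sees
       itself as PID 1. The function name pid_inside_new_pidns is
       hypothetical, and --user --map-root-user is added here only as an
       assumption so that the sketch can run without root. On systems that do
       not permit these namespaces it prints "unsupported" instead.

```shell
#!/bin/sh
# Hypothetical helper: print the PID a command sees for itself inside a
# new PID namespace, or "unsupported" if the namespaces cannot be created.
pid_inside_new_pidns() {
    unshare --user --map-root-user --pid --fork --mount-proc \
        readlink /proc/self 2>/dev/null || echo unsupported
}

pid_inside_new_pidns
```
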
       --keep-caps
           When the --user option is given, ensure that capabilities granted
           in the user namespace are preserved in the child process.

       --kill-child[=signame]
           When unshare terminates, have signame be sent to the forked child
           process. Combined with --pid this allows for an easy and reliable
           killing of the entire process tree below unshare. If not given,
           signame defaults to SIGKILL. This option implies --fork.

       --mount-proc[=mountpoint]
           Just before running the program, mount the proc filesystem at
           mountpoint (default is /proc). This is useful when creating a new
           PID namespace. It also implies creating a new mount namespace since
           the /proc mount would otherwise mess up existing programs on the
           system. The new proc filesystem is explicitly mounted as private
           (with MS_PRIVATE|MS_REC).

       --map-user=uid|name
           Run the program only after the current effective user ID has been
           mapped to uid. If this option is specified multiple times, the last
           occurrence takes precedence. This option implies --user.

       --map-users=outeruid,inneruid,count|auto
           Run the program only after the block of user IDs of size count
           beginning at outeruid has been mapped to the block of user IDs
           beginning at inneruid. This mapping is created with newuidmap(1).
           If the range of user IDs overlaps with the mapping specified by
           --map-user, then a "hole" will be removed from the mapping. This
           may result in the highest user ID of the mapping not being mapped.
           The special value auto will map the first block of user IDs owned
           by the effective user from /etc/subuid to a block starting at user
           ID 0. If this option is specified multiple times, the last
           occurrence takes precedence. This option implies --user.

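       The "hole" arithmetic can be worked through for the one case that the
       EXAMPLES section documents: the caller's UID mapped to root, combined
       with a subordinate block that would otherwise start at inner UID 0.
       The sketch below only reproduces that case; the function name
       uid_map_with_root_hole and its parameters are hypothetical, and the
       output uses the "inner outer count" column order of
       /proc/self/uid_map.

```shell
#!/bin/sh
# Hypothetical helper: print the uid_map lines for "caller mapped to
# root" plus a subordinate block, with the hole at inner UID 0 removed.
#   $1 = caller's UID outside the namespace
#   $2 = first subordinate UID (second field of the /etc/subuid entry)
#   $3 = size of the subordinate block (third field)
uid_map_with_root_hole() {
    outer_uid=$1 sub_start=$2 sub_count=$3
    # Inner UID 0 is taken by the single-ID mapping of the caller.
    printf '%d %d %d\n' 0 "$outer_uid" 1
    # The block is shifted to start at inner UID 1 and shortened by one,
    # so the last subordinate ID of the block is left unused.
    printf '%d %d %d\n' 1 "$sub_start" "$((sub_count - 1))"
}

uid_map_with_root_hole 1000 100000 65536
```

       For UID 1000 and the /etc/subuid entry 1000:100000:65536 this prints
       the same two mappings as the uid_map shown in the EXAMPLES section.
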
       --map-group=gid|name
           Run the program only after the current effective group ID has been
           mapped to gid. If this option is specified multiple times, the last
           occurrence takes precedence. This option implies --setgroups=deny
           and --user.

       --map-groups=outergid,innergid,count|auto
           Run the program only after the block of group IDs of size count
           beginning at outergid has been mapped to the block of group IDs
           beginning at innergid. This mapping is created with newgidmap(1).
           If the range of group IDs overlaps with the mapping specified by
           --map-group, then a "hole" will be removed from the mapping. This
           may result in the highest group ID of the mapping not being mapped.
           The special value auto will map the first block of group IDs owned
           by the effective user from /etc/subgid to a block starting at group
           ID 0. If this option is specified multiple times, the last
           occurrence takes precedence. This option implies --user.

       --map-auto
           Map the first block of user IDs owned by the effective user from
           /etc/subuid to a block starting at user ID 0. In the same manner,
           also map the first block of group IDs owned by the effective group
           from /etc/subgid to a block starting at group ID 0. This option is
           intended to handle the common case where the first block of
           subordinate user and group IDs can map the whole user and group ID
           space. This option is equivalent to specifying --map-users=auto and
           --map-groups=auto.

       -r, --map-root-user
           Run the program only after the current effective user and group IDs
           have been mapped to the superuser UID and GID in the newly created
           user namespace. This makes it possible to conveniently gain
           capabilities needed to manage various aspects of the newly created
           namespaces (such as configuring interfaces in the network namespace
           or mounting filesystems in the mount namespace) even when run
           unprivileged. As a mere convenience feature, it does not support
           more sophisticated use cases, such as mapping multiple ranges of
           UIDs and GIDs. This option implies --setgroups=deny and --user.
           This option is equivalent to --map-user=0 --map-group=0.

       -c, --map-current-user
           Run the program only after the current effective user and group IDs
           have been mapped to the same UID and GID in the newly created user
           namespace. This option implies --setgroups=deny and --user. This
           option is equivalent to --map-user=$(id -ru) --map-group=$(id -rg).

       --propagation private|shared|slave|unchanged
           Recursively set the mount propagation flag in the new mount
           namespace. The default is to set the propagation to private. It is
           possible to disable this feature with the argument unchanged. The
           option is silently ignored when the mount namespace (--mount) is
           not requested.

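       A minimal sketch for checking the effect of the default propagation
       setting, using the findmnt PROPAGATION column mentioned in the
       DESCRIPTION section. The function name root_propagation_in_new_mountns
       is hypothetical, and --user --map-root-user is an assumption added
       only so that the sketch can run without root. Given the default
       described above, the reported value should be private; "unsupported"
       is printed where the namespaces cannot be created.

```shell
#!/bin/sh
# Hypothetical helper: report the propagation type of / as seen inside a
# new mount namespace, or "unsupported" if that cannot be determined.
root_propagation_in_new_mountns() {
    unshare --user --map-root-user --mount \
        findmnt -n -o PROPAGATION / 2>/dev/null || echo unsupported
}

root_propagation_in_new_mountns
```
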
       --setgroups allow|deny
           Allow or deny the setgroups(2) system call in a user namespace.

           To be able to call setgroups(2), the calling process must at least
           have CAP_SETGID. But since Linux 3.19 a further restriction
           applies: the kernel gives permission to call setgroups(2) only
           after the GID map (/proc/pid/gid_map) has been set. The GID map
           is writable by root when setgroups(2) is enabled (i.e., allow, the
           default), and the GID map becomes writable by unprivileged
           processes when setgroups(2) is permanently disabled (with deny).

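       The current policy can be read back from /proc/self/setgroups (see
       user_namespaces(7)). The sketch below is illustrative only and the
       function name setgroups_policy_in_userns is hypothetical. Because
       --map-root-user implies --setgroups=deny, the value read inside the
       new user namespace is expected to be deny; "unsupported" is printed
       where user namespaces cannot be created.

```shell
#!/bin/sh
# Hypothetical helper: print the setgroups(2) policy seen inside a new
# user namespace created with --map-root-user, or "unsupported".
setgroups_policy_in_userns() {
    unshare --user --map-root-user cat /proc/self/setgroups 2>/dev/null \
        || echo unsupported
}

setgroups_policy_in_userns
```
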
       -R, --root=dir
           Run the command with root directory set to dir.

       -w, --wd=dir
           Change working directory to dir.

       -S, --setuid uid
           Set the user ID which will be used in the entered namespace.

       -G, --setgid gid
           Set the group ID which will be used in the entered namespace and
           drop supplementary groups.

       --monotonic offset
           Set the offset of CLOCK_MONOTONIC which will be used in the entered
           time namespace. This option requires unsharing a time namespace
           with --time.

       --boottime offset
           Set the offset of CLOCK_BOOTTIME which will be used in the entered
           time namespace. This option requires unsharing a time namespace
           with --time.

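       The unit of offset is not stated here; the EXAMPLES section, where an
       offset of 300000000 yields an uptime of roughly nine and a half years,
       suggests seconds. Under that assumption, a small hypothetical helper
       (days_to_offset) can turn a number of days into an offset value:

```shell
#!/bin/sh
# Hypothetical helper: convert days to seconds, assuming that --boottime
# and --monotonic offsets are expressed in seconds.
days_to_offset() {
    echo "$(( $1 * 86400 ))"
}

days_to_offset 30    # 30 days * 86400 s/day = 2592000
```

       The result could then be passed along the lines of
       unshare --time --fork --boottime "$(days_to_offset 30)" uptime -p,
       which needs sufficient privileges to create a time namespace.
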
       -h, --help
           Display help text and exit.

       -V, --version
           Print version and exit.

NOTES

       When the proc and sysfs filesystems are mounted as root in a user
       namespace, they have to be restricted so that a less privileged user
       cannot get more access to sensitive files that a more privileged user
       made unavailable. In short, the rule for proc and sysfs is that such a
       mount behaves as much like a bind mount as possible.

EXAMPLES

       The following command creates a PID namespace, using --fork to ensure
       that the executed command is performed in a child process that (being
       the first process in the namespace) has PID 1. The --mount-proc option
       ensures that a new mount namespace is also simultaneously created and
       that a new proc(5) filesystem is mounted that contains information
       corresponding to the new PID namespace. When the readlink(1) command
       terminates, the new namespaces are automatically torn down.

           # unshare --fork --pid --mount-proc readlink /proc/self
           1

       As an unprivileged user, create a new user namespace where the user’s
       credentials are mapped to the root IDs inside the namespace:

           $ id -u; id -g
           1000
           1000
           $ unshare --user --map-root-user \
                   sh -c 'whoami; cat /proc/self/uid_map /proc/self/gid_map'
           root
                    0       1000          1
                    0       1000          1

       As an unprivileged user, create a user namespace where the first 65536
       IDs are all mapped, and the user’s credentials are mapped to the root
       IDs inside the namespace. The map is determined by the subordinate IDs
       assigned in subuid(5) and subgid(5). Demonstrate this mapping by
       creating a file with user ID 1 and group ID 1. For brevity, only the
       user ID mappings are shown:

           $ id -u
           1000
           $ cat /etc/subuid
           1000:100000:65536
           $ unshare --user --map-auto --map-root-user
           # id -u
           0
           # cat /proc/self/uid_map
                    0       1000          1
                    1     100000      65535
           # touch file; chown 1:1 file
           # ls -ln --time-style=+ file
           -rw-r--r-- 1 1 1 0  file
           # exit
           $ ls -ln --time-style=+ file
           -rw-r--r-- 1 100000 100000 0  file

       The first of the following commands creates a new persistent UTS
       namespace and modifies the hostname as seen in that namespace. The
       namespace is then entered with nsenter(1) in order to display the
       modified hostname; this step demonstrates that the UTS namespace
       continues to exist even though the namespace had no member processes
       after the unshare command terminated. The namespace is then destroyed
       by removing the bind mount.

           # touch /root/uts-ns
           # unshare --uts=/root/uts-ns hostname FOO
           # nsenter --uts=/root/uts-ns hostname
           FOO
           # umount /root/uts-ns

       The following commands establish a persistent mount namespace
       referenced by the bind mount /root/namespaces/mnt. In order to ensure
       that the creation of that bind mount succeeds, the parent directory
       (/root/namespaces) is made a bind mount whose propagation type is not
       shared.

           # mount --bind /root/namespaces /root/namespaces
           # mount --make-private /root/namespaces
           # touch /root/namespaces/mnt
           # unshare --mount=/root/namespaces/mnt

       The following commands demonstrate the use of the --kill-child option
       when creating a PID namespace, in order to ensure that when unshare is
       killed, all of the processes within the PID namespace are killed.

           # set +m                # Don't print job status messages
           # unshare --pid --fork --mount-proc --kill-child -- \
                  bash --norc -c '(sleep 555 &) && (ps a &) && sleep 999' &
           [1] 53456
           #     PID TTY      STAT   TIME COMMAND
                 1 pts/3    S+     0:00 sleep 999
                 3 pts/3    S+     0:00 sleep 555
                 5 pts/3    R+     0:00 ps a

           # ps h -o 'comm' $!     # Show that background job is unshare(1)
           unshare
           # kill $!               # Kill unshare(1)
           # pidof sleep

       The pidof(1) command prints no output, because the sleep processes have
       been killed. More precisely, when the sleep process that has PID 1 in
       the namespace (i.e., the namespace’s init process) was killed, this
       caused all other processes in the namespace to be killed. By contrast,
       a similar series of commands where the --kill-child option is not used
       shows that when unshare terminates, the processes in the PID namespace
       are not killed:

           # unshare --pid --fork --mount-proc -- \
                  bash --norc -c '(sleep 555 &) && (ps a &) && sleep 999' &
           [1] 53479
           #     PID TTY      STAT   TIME COMMAND
                 1 pts/3    S+     0:00 sleep 999
                 3 pts/3    S+     0:00 sleep 555
                 5 pts/3    R+     0:00 ps a

           # kill $!
           # pidof sleep
           53482 53480

       The following example demonstrates the creation of a time namespace
       where the boottime clock is set to a point several years in the past:

           # uptime -p             # Show uptime in initial time namespace
           up 21 hours, 30 minutes
           # unshare --time --fork --boottime 300000000 uptime -p
           up 9 years, 28 weeks, 1 day, 2 hours, 50 minutes

AUTHORS

       Mikhail Gusarov <dottedmag@dottedmag.net>, Karel Zak <kzak@redhat.com>

REPORTING BUGS

       For bug reports, use the issue tracker at
       https://github.com/util-linux/util-linux/issues.

AVAILABILITY

       The unshare command is part of the util-linux package which can be
       downloaded from Linux Kernel Archive
       <https://www.kernel.org/pub/linux/utils/util-linux/>.

util-linux 2.38.1                 2022-05-11                        UNSHARE(1)