CH-RUN(1)                        Charliecloud                        CH-RUN(1)

NAME

       ch-run - Run a command in a Charliecloud container

SYNOPSIS

          $ ch-run [OPTION...] NEWROOT CMD [ARG...]

DESCRIPTION

       Run command CMD in a fully unprivileged Charliecloud container using
       the flattened and unpacked image directory located at NEWROOT.

          -b, --bind=SRC[:DST]
                 Bind-mount SRC at guest DST. If DST is not specified, it
                 defaults to the same path as on the host; i.e., the default
                 is --bind=SRC:SRC. Can be repeated.

                 If --write is given and DST does not exist, it will be
                 created as an empty directory. However, DST must be entirely
                 within the image itself; DST cannot enter a previous bind
                 mount. For example, --bind /foo:/tmp/foo will fail because
                 /tmp is shared with the host via bind-mount (unless
                 --private-tmp is given).

                 Most images already have ten directories /mnt/[0-9]
                 available as mount points.

                 Symlinks in DST are followed, and absolute links can have
                 surprising behavior. Bind-mounting happens after namespace
                 setup but before pivoting into the container image, so
                 absolute links use the host root. For example, suppose the
                 image has a symlink /foo -> /mnt. Then, --bind=/bar:/foo
                 will bind-mount on the host’s /mnt, which is inaccessible on
                 the host because namespaces are already set up and also
                 inaccessible in the container because of the subsequent
                 pivot into the image. Currently, this problem is only
                 detected when DST needs to be created: ch-run will refuse to
                 follow absolute symlinks in this case, to avoid directory
                 creation surprises.

          -c, --cd=DIR
                 Initial working directory in container.

          --ch-ssh
                 Bind ch-ssh(1) into container at /usr/bin/ch-ssh.

          --env-no-expand
                 Don’t expand variables when using --set-env.

          -g, --gid=GID
                 Run as group GID within container.

          -j, --join
                 Use the same container (namespaces) as peer ch-run
                 invocations.

          --join-pid=PID
                 Join the namespaces of an existing process.

          --join-ct=N
                 Number of ch-run peers (implies --join; default: see below).

          --join-tag=TAG
                 Label for ch-run peer group (implies --join; default: see
                 below).

          --no-home
                 By default, your host home directory (i.e., $HOME) is
                 bind-mounted at guest /home/$USER. This is accomplished by
                 mounting a new tmpfs at /home, which hides any image content
                 under that path. If this is specified, neither of these
                 things happens and the image’s /home is exposed unaltered.

          --no-passwd
                 By default, temporary /etc/passwd and /etc/group files are
                 created according to the UID and GID maps for the container
                 and bind-mounted into it. If this is specified, no such
                 temporary files are created and the image’s files are
                 exposed.

          -t, --private-tmp
                 By default, /tmp is shared with the host. If this is
                 specified, a new tmpfs is mounted on the container’s /tmp
                 instead.

          --set-env=FILE, --set-env=VAR=VALUE
                 Set environment variables, either those listed in the file
                 at host path FILE or the single variable VAR to VALUE.

          -u, --uid=UID
                 Run as user UID within container.

          --unset-env=GLOB
                 Unset environment variables whose names match GLOB.

          -v, --verbose
                 Be more verbose (can be repeated).

          -w, --write
                 Mount image read-write (by default, the image is mounted
                 read-only).

          -?, --help
                 Print help and exit.

          --usage
                 Print a short usage message and exit.

          -V, --version
                 Print version and exit.

       Note: Because ch-run is fully unprivileged, it is not possible to
       change UIDs and GIDs within the container (the relevant system calls
       fail). In particular, setuid, setgid, and setcap executables do not
       work. As a precaution, ch-run calls prctl(PR_SET_NO_NEW_PRIVS, 1) to
       disable these executables within the container. This does not reduce
       functionality but is a “belt and suspenders” precaution to reduce the
       attack surface should bugs in these system calls or elsewhere arise.
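       The symlink caveat described under --bind can be checked before
       running anything. The following is a minimal sketch, not part of
       Charliecloud: check_bind_dst is a hypothetical helper that walks the
       components of DST inside an unpacked image directory and reports the
       first absolute symlink it finds. It assumes readlink(1) is available
       and inspects only the path as written, so links reached through other
       links are not examined.

```shell
# Hypothetical helper (not a Charliecloud command): succeed if no component
# of DST, looked up inside image directory IMG, is an absolute symlink.
check_bind_dst() (
    img=$1
    dst=$2
    path=$img
    set -f      # no filename expansion while splitting DST
    IFS=/       # split DST on slashes; local because the body is a subshell
    for part in $dst; do
        [ -n "$part" ] || continue      # skip empty pieces from "/" or "//"
        path=$path/$part
        if [ -L "$path" ]; then
            target=$(readlink "$path")
            case $target in
                /*) printf 'absolute symlink: %s -> %s\n' "$path" "$target"
                    exit 1 ;;
            esac
        fi
    done
    exit 0
)
```

       For the example above, where the image contains /foo -> /mnt, calling
       check_bind_dst with the image directory and /foo returns nonzero,
       while a destination such as /mnt/0 passes.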

HOST FILES AND DIRECTORIES AVAILABLE IN CONTAINER VIA BIND MOUNTS

       In addition to any directories specified by the user with --bind,
       ch-run has standard host files and directories that are bind-mounted
       in as well.

       The following host files and directories are bind-mounted at the same
       location in the container. These give access to the host’s devices
       and various kernel facilities. (Recall that Charliecloud provides
       minimal isolation and containerized processes are mostly normal
       unprivileged processes.) They cannot be disabled and are required;
       i.e., they must exist both on the host and within the image.

          • /dev

          • /proc

          • /sys

       The following are optional: each is bind-mounted only if the path
       exists both on the host and within the image, without error or
       warning if not.

          • /etc/hosts and /etc/resolv.conf. Because Charliecloud containers
            share the host network namespace, they need the same hostname
            resolution configuration.

          • /etc/machine-id. Provides a unique ID for the OS installation;
            matching the host works for most situations. Needed to support
            D-Bus, some software licensing situations, and likely other use
            cases. See also issue #1050.

          • /var/lib/hugetlbfs at guest /var/opt/cray/hugetlbfs, and
            /var/opt/cray/alps/spool. These support Cray MPI.

          • $PREFIX/bin/ch-ssh at guest /usr/bin/ch-ssh. SSH wrapper that
            automatically containerizes after connecting.

       Additional bind mounts are done by default but can be disabled; see
       the options above.

          • $HOME at /home/$USER (and image /home is hidden). Makes user data
            and init files available.

          • /tmp. Provides a temporary directory that persists between
            container runs and is shared with non-containerized application
            components.

          • temporary files at /etc/passwd and /etc/group. Usernames and
            group names need to be customized for each container run.

MULTIPLE PROCESSES IN THE SAME CONTAINER WITH --JOIN

       By default, different ch-run invocations use different user and mount
       namespaces (i.e., different containers). While this has no impact on
       sharing most resources between invocations, there are a few important
       exceptions. These include:

       1. ptrace(2), used by debuggers and related tools. One can attach a
          debugger to processes in descendant namespaces, but not sibling
          namespaces. The practical effect of this is that (without --join),
          you can’t run a command with ch-run and then attach to it with a
          debugger also run with ch-run.

       2. Cross-memory attach (CMA) is used by cooperating processes to
          communicate by simply reading and writing one another’s memory.
          This is also not permitted between sibling namespaces. This
          affects various MPI implementations that use CMA to pass messages
          between ranks on the same node, because it’s faster than
          traditional shared memory.

       --join is designed to address this by placing related ch-run commands
       (the “peer group”) in the same container. This is done by one of the
       peers creating the namespaces with unshare(2) and the others joining
       with setns(2).

       To do so, we need to know the number of peers and a name for the
       group. These are specified by additional arguments that can
       (hopefully) be left at default values in most cases:

       • --join-ct sets the number of peers. The default is the value of the
         first of the following environment variables that is defined:
         OMPI_COMM_WORLD_LOCAL_SIZE, SLURM_STEP_TASKS_PER_NODE,
         SLURM_CPUS_ON_NODE.

       • --join-tag sets the tag that names the peer group. The default is
         environment variable SLURM_STEP_ID, if defined; otherwise, the PID
         of ch-run’s parent. Tags can be re-used for peer groups that start
         at different times, i.e., once all peer ch-run have replaced
         themselves with the user command, the tag can be re-used.

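       The default lookups described above can be sketched in shell, for
       example to log the values a launcher script would end up with. This
       is an illustration of the documented rules rather than ch-run’s own
       code; default_join_ct and default_join_tag are hypothetical names, and
       the sketch assumes ch-run would be started directly from the shell
       that runs it, so that shell’s PID ($$) stands in for the PID of
       ch-run’s parent.

```shell
# Hypothetical helper: print the value of the first defined variable in the
# documented order, or fail if none of them is defined.
default_join_ct() {
    if [ -n "${OMPI_COMM_WORLD_LOCAL_SIZE+set}" ]; then
        printf '%s\n' "$OMPI_COMM_WORLD_LOCAL_SIZE"
    elif [ -n "${SLURM_STEP_TASKS_PER_NODE+set}" ]; then
        printf '%s\n' "$SLURM_STEP_TASKS_PER_NODE"
    elif [ -n "${SLURM_CPUS_ON_NODE+set}" ]; then
        printf '%s\n' "$SLURM_CPUS_ON_NODE"
    else
        return 1
    fi
}

# Hypothetical helper: SLURM_STEP_ID if defined, otherwise this shell's PID
# (assumed here to be the parent of the ch-run processes it launches).
default_join_tag() {
    if [ -n "${SLURM_STEP_ID+set}" ]; then
        printf '%s\n' "$SLURM_STEP_ID"
    else
        printf '%s\n' "$$"
    fi
}
```

       When neither default fits, the same values can be passed explicitly
       with --join-ct and --join-tag.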
       Caveats:

       • One cannot currently add peers after the fact, for example, if one
         decides to start a debugger after the fact. (This is only required
         for code with bugs and is thus an unusual use case.)

       • ch-run instances race. The winner of this race sets up the
         namespaces, and the other peers use the winner to find the
         namespaces to join. Therefore, if the user command of the winner
         exits, any remaining peers will not be able to join the namespaces,
         even if they are still active. There is currently no general way to
         specify which ch-run should be the winner.

       • If --join-ct is too high, the winning ch-run’s user command exits
         before all peers join, or ch-run itself crashes, IPC resources such
         as semaphores and shared memory segments will be leaked. These
         appear as files in /dev/shm/ and can be removed with rm(1).

       • Many of the arguments given to the race losers, such as the image
         path and --bind, will be ignored in favor of what was given to the
         winner.

ENVIRONMENT VARIABLES

       ch-run leaves environment variables unchanged, i.e., the host
       environment is passed through unaltered, except for:

       • limited tweaks to avoid significant guest breakage;

       • variables set by the user via --set-env;

       • variables unset by the user via --unset-env; and

       • CH_RUNNING, which is always set.

       This section describes these features.

       The default tweaks happen first, and then --set-env and --unset-env
       in the order specified on the command line. The latter two can be
       repeated arbitrarily many times, e.g., to add or remove multiple
       variable sets or to add only some of the variables in a file.

   Default behavior
       By default, ch-run makes the following environment variable changes:

       • $CH_RUNNING: Set to Weird Al Yankovic. While a process can figure
         out that it’s in an unprivileged container and what namespaces are
         active without this hint, the checks can be messy, and there is no
         way to tell that it’s a Charliecloud container specifically. This
         variable makes such a test simple and well-defined. (Note: This
         variable is unaffected by --unset-env.)

       • $HOME: If the path to your home directory is not /home/$USER on the
         host, then an inherited $HOME will be incorrect inside the guest.
         This confuses some software, such as Spack.

         Thus, we change $HOME to /home/$USER, unless --no-home is
         specified, in which case it is left unchanged.

       • $PATH: Newer Linux distributions replace some root-level
         directories, such as /bin, with symlinks to their counterparts in
         /usr.

         Some of these distributions (e.g., Fedora 24) have also dropped
         /bin from the default $PATH. This is a problem when the guest OS
         does not have a merged /usr (e.g., Debian 8 “Jessie”). Thus, we add
         /bin to $PATH if it’s not already present.

         Further reading:

         • The case for the /usr Merge

         • Fedora

         • Debian

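       Two of the default tweaks are easy to test for from a script. The
       sketch below is illustrative only: in_charliecloud and path_has_bin
       are hypothetical helper names, and the first checks only that
       CH_RUNNING is set, not what it is set to.

```shell
# True if the CH_RUNNING variable is set, whatever its value.
in_charliecloud() {
    [ -n "${CH_RUNNING+set}" ]
}

# True if /bin is already one of the colon-separated entries in PATH.
path_has_bin() {
    case ":$PATH:" in
        *:/bin:*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

       A script that behaves differently inside a container could branch on
       in_charliecloud instead of probing namespaces itself.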
   Setting variables with --set-env
       The purpose of --set-env is to set environment variables in addition
       to (or instead of) those inherited from the host shell.

       If the argument contains an equals character, then it is interpreted
       as a variable name and value; otherwise, it is a host path to a file
       with one variable name/value per line (guest paths can be specified
       by prepending the image path). Values given replace any already set
       (i.e., if a variable is repeated, the last value wins). Environment
       variables in the value are expanded unless --env-no-expand is given,
       though see below for syntax differences from the shell.

       For example, to prepend /opt/bin to the current shell’s path (note
       the single quotes, which keep the shell from expanding $PATH; here
       the result would be equivalent if we let the shell do it):

          $ ch-run --set-env='PATH=/opt/bin:$PATH' ...

       To add variables set by Dockerfile ENV instructions to the current
       environment:

          $ ch-run --set-env=$IMG/ch/environment ...

       To prepend /opt/bin to the path set by the Dockerfile (here we really
       can’t let the shell expand $PATH):

          $ ch-run --set-env=$IMG/ch/environment --set-env='PATH=/opt/bin:$PATH' ...

       The syntax of the argument is a key-value pair separated by the first
       equals character (=, ASCII 61), with optional single straight quotes
       (', ASCII 39) around the value, though be aware that quotes are also
       interpreted by the shell. Newlines (ASCII 10) are not permitted in
       either key or value. The value may be empty, but not the key.

       Environment variables in the value are expanded unless
       --env-no-expand is given. When expansion is enabled, the value is
       treated as a sequence of possibly-empty items separated by colon (:,
       ASCII 58). If an item begins with dollar sign ($, ASCII 36), then the
       rest of the item is the name of an environment variable. If this
       variable is set to a non-empty value, that value is substituted for
       the item; otherwise (i.e., the variable is unset or the empty
       string), the item is deleted, including a delimiter colon. The
       purpose of omitting empty expansions is to avoid surprising behavior
       such as an empty element in $PATH meaning the current directory. If
       no expansions happen, this paragraph is a no-op.

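       The expansion rule can be modeled in a few lines of shell. This is a
       sketch of the rule as described here, not ch-run’s implementation:
       expand_value is a hypothetical name, variables are looked up with
       printenv(1) so only exported variables count, and a name that
       printenv cannot find (including an empty or malformed one) is treated
       as unset.

```shell
# Hypothetical model of --set-env value expansion: split on colons, replace
# items of the form $NAME, and drop items whose variable is unset or empty.
expand_value() (
    rest="$1:"          # sentinel colon so the last item is also processed
    out=
    emitted=0
    while [ -n "$rest" ]; do
        item=${rest%%:*}    # text before the first colon
        rest=${rest#*:}     # text after the first colon
        case $item in
            '$'*)
                # Item names a variable; look it up in the environment.
                val=$(printenv "${item#?}" 2>/dev/null) || val=
                [ -n "$val" ] || continue   # unset or empty: drop the item
                item=$val
                ;;
        esac
        if [ "$emitted" -eq 1 ]; then
            out=$out:$item
        else
            out=$item
            emitted=1
        fi
    done
    printf '%s\n' "$out"
)
```

       With BAR exported as bar and UNSET not set, the model turns
       baz:$UNSET:qux into baz:qux and leaves :bar:baz:: untouched, because
       empty items are removed only when they result from an expansion.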
       If a file is given instead, it is a sequence of such arguments, one
       per line. Empty lines are ignored. No comments are interpreted. (This
       syntax is designed to accept the output of printenv and be easily
       produced by other simple mechanisms.)

       Examples of valid arguments, assuming that environment variable $BAR
       is set to bar and $UNSET is unset (or set to the empty string):

       ┌───────────────────────────────┬───────┬───────────────────────┐
       │ Line                          │ Key   │ Value                 │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=bar                       │ FOO   │ bar                   │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=bar=baz                   │ FOO   │ bar=baz               │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FLAGS=-march=foo -mtune=bar   │ FLAGS │ -march=foo -mtune=bar │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FLAGS='-march=foo -mtune=bar' │ FLAGS │ -march=foo -mtune=bar │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=$BAR                      │ FOO   │ bar                   │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=$BAR:baz                  │ FOO   │ bar:baz               │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=                          │ FOO   │ empty string (not     │
       │                               │       │ unset)                │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=$UNSET                    │ FOO   │ empty string (not     │
       │                               │       │ unset or $UNSET)      │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=baz:$UNSET:qux            │ FOO   │ baz:qux (not          │
       │                               │       │ baz::qux)             │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=:bar:baz::                │ FOO   │ :bar:baz::            │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=''                        │ FOO   │ empty string (not     │
       │                               │       │ unset)                │
       ├───────────────────────────────┼───────┼───────────────────────┤
       │ FOO=''''                      │ FOO   │ '' (two single        │
       │                               │       │ quotes)               │
       └───────────────────────────────┴───────┴───────────────────────┘

       Example invalid lines:

       ┌─────────┬─────────────────────┐
       │ Line    │ Problem             │
       ├─────────┼─────────────────────┤
       │ FOO bar │ no separator        │
       ├─────────┼─────────────────────┤
       │ =bar    │ key cannot be empty │
       └─────────┴─────────────────────┘

       Example valid lines that are probably not what you want:

       ┌──────────────────┬───────┬───────────┬───────────────────────────────┐
       │ Line             │ Key   │ Value     │ Problem                       │
       ├──────────────────┼───────┼───────────┼───────────────────────────────┤
       │ FOO="bar"        │ FOO   │ "bar"     │ double quotes aren’t stripped │
       ├──────────────────┼───────┼───────────┼───────────────────────────────┤
       │ FOO=bar # baz    │ FOO   │ bar # baz │ comments not supported        │
       ├──────────────────┼───────┼───────────┼───────────────────────────────┤
       │ FOO=bar\tbaz     │ FOO   │ bar\tbaz  │ backslashes are not special   │
       ├──────────────────┼───────┼───────────┼───────────────────────────────┤
       │  FOO=bar         │  FOO  │ bar       │ leading space in key          │
       ├──────────────────┼───────┼───────────┼───────────────────────────────┤
       │ FOO= bar         │ FOO   │  bar      │ leading space in value        │
       ├──────────────────┼───────┼───────────┼───────────────────────────────┤
       │ $FOO=bar         │ $FOO  │ bar       │ variables not expanded in key │
       ├──────────────────┼───────┼───────────┼───────────────────────────────┤
       │ FOO=$BAR baz:qux │ FOO   │ qux       │ variable BAR baz not set      │
       └──────────────────┴───────┴───────────┴───────────────────────────────┘

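       The key/value syntax illustrated in the tables above can be expressed
       as a small parser. This is a sketch under the rules stated in this
       section only: parse_env_line is a hypothetical name, it prints the key
       and value separated by a tab, and it does not perform variable
       expansion.

```shell
# Hypothetical parser for one --set-env line: split at the first "=",
# reject a missing separator or empty key, strip one pair of single quotes.
parse_env_line() (
    line=$1
    case $line in
        *=*) ;;
        *) echo 'no separator' >&2; exit 1 ;;
    esac
    key=${line%%=*}     # everything before the first "="
    value=${line#*=}    # everything after the first "="
    if [ -z "$key" ]; then
        echo 'key cannot be empty' >&2
        exit 1
    fi
    case $value in
        \'*\') value=${value#\'}; value=${value%\'} ;;   # strip one pair
    esac
    printf '%s\t%s\n' "$key" "$value"
)
```

       Double quotes, comment markers, backslashes, and leading spaces pass
       through unchanged, which is why the lines in the last table are valid
       but rarely what was intended.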
   Removing variables with --unset-env
       The purpose of --unset-env=GLOB is to remove unwanted environment
       variables. The argument GLOB is a glob pattern (dialect fnmatch(3)
       with no flags); all variables with matching names are removed from
       the environment.

       WARNING:
          Because the shell also interprets glob patterns, if any wildcard
          characters are in GLOB, it is important to put it in single quotes
          to avoid surprises.

       GLOB must be a non-empty string.

       Example 1: Remove the single environment variable FOO:

          $ export FOO=bar
          $ env | grep -F FOO
          FOO=bar
          $ ch-run --unset-env=FOO $CH_TEST_IMGDIR/chtest -- env | grep -F FOO
          $

       Example 2: Hide from a container the fact that it’s running in a
       Slurm allocation, by removing all variables beginning with SLURM. You
       might want to do this to test an MPI program with one rank and no
       launcher:

          $ salloc -N1
          $ env | grep -E '^SLURM' | wc
             44      44    1092
          $ ch-run $CH_TEST_IMGDIR/mpihello-openmpi -- /hello/hello
          [... long error message ...]
          $ ch-run --unset-env='SLURM*' $CH_TEST_IMGDIR/mpihello-openmpi -- /hello/hello
          0: MPI version:
          Open MPI v3.1.3, package: Open MPI root@c897a83f6f92 Distribution, ident: 3.1.3, repo rev: v3.1.3, Oct 29, 2018
          0: init ok cn001.localdomain, 1 ranks, userns 4026532530
          0: send/receive ok
          0: finalize ok

       Example 3: Clear the environment completely (remove all variables):

          $ ch-run --unset-env='*' $CH_TEST_IMGDIR/chtest -- env
          $

       Note that some programs, such as shells, set some environment
       variables even if started with no init files:

          $ ch-run --unset-env='*' $CH_TEST_IMGDIR/debian9 -- bash --noprofile --norc -c env
          SHLVL=1
          PWD=/
          _=/usr/bin/env
          $
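       To preview which variables a pattern would remove before passing it
       to --unset-env, the matching can be approximated on the host. This is
       a sketch, not a ch-run feature: would_unset is a hypothetical name, it
       relies on the shell’s case pattern matching as a stand-in for
       fnmatch(3) with no flags, and it lists variable names through awk’s
       ENVIRON array.

```shell
# Hypothetical preview: print the names of environment variables that
# match the glob given as the first argument.
would_unset() {
    awk 'BEGIN { for (name in ENVIRON) print name }' |
    while IFS= read -r name; do
        case $name in
            $1) printf '%s\n' "$name" ;;    # unquoted: treated as a pattern
        esac
    done
}
```

       As with --unset-env itself, quote the pattern when calling it, e.g.
       would_unset 'SLURM*'.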

EXAMPLES

       Run the command echo hello inside a Charliecloud container using the
       unpacked image at /data/foo:

          $ ch-run /data/foo -- echo hello
          hello

       Run an MPI job that can use CMA to communicate:

          $ srun ch-run --join /data/foo -- bar

REPORTING BUGS

       If Charliecloud was obtained from your Linux distribution, use your
       distribution’s bug reporting procedures.

       Otherwise, report bugs to: <https://github.com/hpc/charliecloud/issues>

SEE ALSO

       charliecloud(7)

       Full documentation at: <https://hpc.github.io/charliecloud>

       2014–2021, Triad National Security, LLC

0.25                         2021-09-20 00:00 UTC                    CH-RUN(1)