CH-FROMHOST(1)                   Charliecloud                   CH-FROMHOST(1)



NAME
       ch-fromhost - Inject files from the host into an image directory, with
       various magic

SYNOPSIS
       $ ch-fromhost [OPTION ...] [FILE_OPTION ...] IMGDIR

DESCRIPTION
       NOTE:
          This command is experimental. Features may be incomplete and/or
          buggy. Please report any issues you find, so we can fix them!

       Inject files from the host into the Charliecloud image directory
       IMGDIR.

       The purpose of this command is to inject arbitrary host files that are
       necessary to access host-specific resources, usually a GPU or a
       proprietary interconnect, into a container. It is not a general
       copy-to-image tool; see the further discussion of use cases below.

       It should be run after ch-convert and before ch-run. After invocation,
       the image is no longer portable to other hosts.

       Injection is not atomic; if an error occurs partway through injection,
       the image is left in an undefined state and should be re-unpacked from
       storage. Injection is currently implemented using a simple file copy,
       but that may change in the future.

       Arbitrary file and libfabric injection are handled differently.

   Arbitrary files
       Arbitrary file paths that contain the strings /bin or /sbin are
       assumed to be executables and are placed in /usr/bin within the
       container. Paths that are not loadable libfabric providers and contain
       the strings /lib or .so are assumed to be shared libraries and are
       placed in the first-priority directory reported by ldconfig (see
       --lib-path below). Other files are placed in the directory specified
       by --dest.

       If any shared libraries are injected, run ldconfig inside the
       container (using ch-run -w) after injection.

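       The destination heuristic above can be sketched in POSIX shell. This
       is only an illustration of the documented rules, not ch-fromhost's
       actual implementation; the function name classify is ours.

```shell
# Illustrative sketch of the documented destination rules; not the real
# ch-fromhost code. Providers (*-fi.so) are checked first, then
# executables, then shared libraries.
classify () {
    case $1 in
        *-fi.so)          echo provider ;;    # loadable libfabric provider
        */bin/*|*/sbin/*) echo executable ;;  # goes to /usr/bin in image
        */lib/*|*.so*)    echo library ;;     # goes to ldconfig's first dir
        *)                echo other ;;       # needs --dest
    esac
}

classify /usr/bin/nvidia-smi      # executable
classify /usr/lib64/libfoo.so.1   # library
classify /home/ofi/libgnix-fi.so  # provider
classify /etc/quux                # other
```
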
   Libfabric
       MPI implementations have numerous ways of communicating messages over
       interconnects. We use libfabric (OFI), an OpenFabrics framework that
       exports fabric communication services to applications, to manage these
       communications with built-in or loadable fabric providers.

       • https://ofiwg.github.io/libfabric

       • https://ofiwg.github.io/libfabric/v1.14.0/man/fi_provider.3.html

       Using OFI, we can (a) uniformly manage fabric communication services
       for both OpenMPI and MPICH, and (b) use simplified methods of
       accessing proprietary host hardware, e.g., Cray’s Gemini/Aries and
       Slingshot (CXI).

       OFI providers implement the application-facing software interfaces
       needed to access network-specific protocols, drivers, and hardware.
       Loadable providers, i.e., compiled OFI libraries that end in -fi.so,
       for example Cray’s libgnix-fi.so, can be copied into, and used by, an
       image with an MPI configured against OFI. Alternatively, the image’s
       libfabric.so can be overwritten with the host’s. See details and
       quirks below.

OPTIONS
   To specify which files to inject
       -c, --cmd CMD
              Inject files listed in the standard output of command CMD.

       -f, --file FILE
              Inject files listed in the file FILE.

       -p, --path PATH
              Inject the file at PATH.

       --cray-mpi-cxi
              Inject the Cray libfabric for Slingshot. This is equivalent to
              --path $CH_FROMHOST_OFI_CXI, where $CH_FROMHOST_OFI_CXI is the
              path to the Cray host libfabric libfabric.so.

       --cray-mpi-gni
              Inject the Cray Gemini/Aries GNI libfabric provider
              libgnix-fi.so. This is equivalent to --fi-provider
              $CH_FROMHOST_OFI_GNI, where $CH_FROMHOST_OFI_GNI is the path
              to the Cray host uGNI provider libgnix-fi.so.

       --nvidia
              Use nvidia-container-cli list (from libnvidia-container) to
              find executables and libraries to inject.

       These can be repeated, and at least one must be specified.

   To specify the destination within the image
       -d, --dest DST
              Place files specified later in directory IMGDIR/DST, overriding
              the inferred destination, if any. If a file’s destination
              cannot be inferred and --dest has not been specified, exit with
              an error. This can be repeated to place files in varying
              destinations.

   Additional arguments
       --fi-path
              Print the guest destination path for libfabric providers and
              replacement.

       --lib-path
              Print the guest destination path for shared libraries inferred
              as described above.

       --no-ldconfig
              Don’t run ldconfig even if we appear to have injected shared
              libraries.

       -h, --help
              Print help and exit.

       -v, --verbose
              List the injected files.

       --version
              Print version and exit.

WHEN TO USE THIS COMMAND
       This command does a lot of heuristic magic; while it can copy
       arbitrary files into an image, this usage is discouraged and prone to
       error. Here are some use cases and the recommended approach:

       1. I have some files on my build host that I want to include in the
          image. Use the COPY instruction within your Dockerfile. Note that
          it’s OK to build an image that meets your specific needs but isn’t
          generally portable, e.g., only runs on the specific
          micro-architectures you’re using.

       2. I have an already-built image and want to install a program I
          compiled separately into the image. Consider whether building a
          new derived image with a Dockerfile is appropriate. Another good
          option is to bind-mount the directory containing your program at
          run time. A less good option is to cp(1) the program into your
          image, because this permanently alters the image in a
          non-reproducible way.

       3. I have some shared libraries that I need in the image for
          functionality or performance, and they aren’t available in a place
          where I can use COPY. This is the intended use case of
          ch-fromhost. You can use --cmd, --file, --ofi, and/or --path to
          put together a custom solution. But, please consider filing an
          issue so we can package your functionality with a tidy option like
          --nvidia.

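       For use case 2 above, a run-time bind mount with ch-run’s -b/--bind
       option might look like the following sketch; the host directory,
       mount point, and program name are hypothetical.

```shell
# Hypothetical example for use case 2: map a host directory into the
# image at run time with ch-run's -b/--bind instead of copying files in.
img=/var/tmp/baz
host_tools=/home/alice/tools            # hypothetical host directory
cmd="ch-run -b $host_tools:/mnt/0 $img -- /mnt/0/bin/myprog"
echo "$cmd"   # shown rather than run; needs ch-run and an unpacked image
```
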
LIBFABRIC QUIRKS
       The implementation of libfabric provider injection and replacement is
       experimental and has a couple of quirks.

       1. Containers must have the following software installed:

          a. libfabric (https://ofiwg.github.io/libfabric/). See
             charliecloud/examples/Dockerfile.libfabric.

          b. A corresponding open-source MPI implementation configured and
             built against the container libfabric, e.g., MPICH or OpenMPI.
             See charliecloud/examples/Dockerfile.mpich and
             charliecloud/examples/Dockerfile.openmpi.

       2. At run time, a libfabric provider can be specified with the
          variable FI_PROVIDER. The path to search for shared providers can
          be specified with FI_PROVIDER_PATH. These variables can be
          inherited from the host or explicitly set with the container’s
          environment file /ch/environment via --set-env.

          To avoid issues and reduce complexity, the inferred injection
          destination for libfabric providers and replacement will always be
          the path in the image where libfabric.so is found.

       3. The Cray GNI loadable provider, libgnix-fi.so, will link to
          compiler(s) in the programming environment by default. For
          example, if it is built under the PrgEnv-intel programming
          environment, it will have links to files at paths /opt/gcc and
          /opt/intel that ch-run will not bind automatically.

          Managing all possible bind-mount paths is untenable. Thus, this
          experimental implementation injects libraries linked to a
          libgnix-fi.so built with the minimal modules necessary to compile,
          i.e.:

          • modules

          • craype-network-aries

          • eproxy

          • slurm

          • cray-mpich

          • craype-haswell

          • craype-hugepages2M

          A Cray GNI provider linked against more complicated PEs will still
          work, assuming (1) the user explicitly bind-mounts the missing
          libraries listed in its ldd output, and (2) no such library
          conflicts with container functionality, e.g., glibc.so.

       4. At the time of this writing, a Cray Slingshot optimized provider
          is not available; however, recent libfabric source activity
          indicates there may be one at some point; see:
          https://github.com/ofiwg/libfabric/pull/7839.

          For now, on Cray systems with Slingshot (CXI), we need to
          overwrite the container’s libfabric.so with the host’s using
          --path. See examples for details.

       5. Tested only for C programs compiled with GCC. Additional bind
          mounts or kludging may be needed for untested use cases. If you’d
          like to use another compiler or programming environment, please
          get in touch so we can implement the necessary support.

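       Quirk 3 above suggests bind-mounting the libraries listed in the
       provider’s ldd output. The helper below (ours, not part of
       ch-fromhost) extracts the resolved host paths from such output; the
       canned input stands in for a real ldd run on libgnix-fi.so.

```shell
# Extract resolved host library paths from `ldd` output; each path is a
# candidate for an explicit ch-run bind mount. Helper name is ours.
lib_paths () { awk '$2 == "=>" && $3 ~ /^\// { print $3 }'; }

# On a real system you would run:  ldd libgnix-fi.so | lib_paths
# Demonstration with canned ldd-style output:
printf '\tlibssh.so.4 => /usr/lib64/libssh.so.4 (0x1000)\n\tlinux-vdso.so.1 (0x2000)\n' | lib_paths
```
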
       Please file a bug if we missed anything above or if you know how to
       make the code better.

NOTES
       Symbolic links are dereferenced, i.e., the files pointed to are
       injected, not the links themselves.

       As a corollary, do not include symlinks to shared libraries. These
       will be re-created by ldconfig.

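       The dereferencing behavior can be demonstrated with readlink(1); this
       throwaway example (ours) shows that a typical library symlink
       resolves to its target file, which is what would actually be
       injected.

```shell
# Demo of symlink dereferencing: injection sees the link target
# (libfoo.so.1.0), not the symlink (libfoo.so). Temp dir is throwaway.
d=$(mktemp -d)
touch "$d/libfoo.so.1.0"
ln -s libfoo.so.1.0 "$d/libfoo.so"
target=$(readlink -f "$d/libfoo.so")
echo "$target"                # resolves to .../libfoo.so.1.0
rm -r "$d"
```
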
       There are two alternate approaches for nVidia GPU libraries:

       1. Link libnvidia-containers into ch-run and call the library
          functions directly. However, this would mean that Charliecloud
          would either (a) need to be compiled differently on machines with
          and without nVidia GPUs or (b) have libnvidia-containers available
          even on machines without nVidia GPUs. Neither of these is
          consistent with Charliecloud’s philosophies of simplicity and
          minimal dependencies.

       2. Use nvidia-container-cli configure to do the injecting. This would
          require that containers have a half-started state, where the
          namespaces are active and everything is mounted but pivot_root(2)
          has not been performed. This is not feasible because Charliecloud
          has no notion of a half-started container.

       Further, while these alternate approaches would simplify or eliminate
       this script for nVidia GPUs, they would not solve the problem for
       other situations.

BUGS
       File paths may not contain colons or newlines.

       ldconfig tends to print stat errors; these are typically non-fatal
       and occur when trying to probe common library paths. See issue #732.

EXAMPLES
   libfabric
       Cray Slingshot CXI injection.

       Replace the image libfabric, i.e., libfabric.so, with the Cray host’s
       libfabric at host path /opt/cray-libfabric/lib64/libfabric.so.

          $ ch-fromhost -v --path /opt/cray-libfabric/lib64/libfabric.so /tmp/ompi
          [ debug ] queueing files
          [ debug ] cray libfabric: /opt/cray-libfabric/lib64/libfabric.so
          [ debug ] searching image for inferred libfabric destiation
          [ debug ] found /tmp/ompi/usr/local/lib/libfabric.so
          [ debug ] adding cray libfabric libraries
          [ debug ] skipping /lib64/libcom_err.so.2
          [...]
          [ debug ] queueing files
          [ debug ] shared library: /usr/lib64/libcxi.so.1
          [ debug ] queueing files
          [ debug ] shared library: /usr/lib64/libcxi.so.1.2.1
          [ debug ] queueing files
          [ debug ] shared library: /usr/lib64/libjson-c.so.3
          [ debug ] queueing files
          [ debug ] shared library: /usr/lib64/libjson-c.so.3.0.1
          [...]
          [ debug ] queueing files
          [ debug ] shared library: /usr/lib64/libssh.so.4
          [ debug ] queueing files
          [ debug ] shared library: /usr/lib64/libssh.so.4.7.4
          [...]
          [ debug ] inferred shared library destination: /tmp/ompi//usr/local/lib
          [ debug ] injecting into image: /tmp/ompi/
          [ debug ] mkdir -p /tmp/ompi//var/lib/hugetlbfs
          [ debug ] mkdir -p /tmp/ompi//var/spool/slurmd
          [ debug ] echo '/usr/lib64' >> /tmp/ompi//etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] /opt/cray-libfabric/lib64/libfabric.so -> /usr/local/lib (inferred)
          [ debug ] /usr/lib64/libcxi.so.1 -> /usr/local/lib (inferred)
          [ debug ] /usr/lib64/libcxi.so.1.2.1 -> /usr/local/lib (inferred)
          [ debug ] /usr/lib64/libjson-c.so.3 -> /usr/local/lib (inferred)
          [ debug ] /usr/lib64/libjson-c.so.3.0.1 -> /usr/local/lib (inferred)
          [ debug ] /usr/lib64/libssh.so.4 -> /usr/local/lib (inferred)
          [ debug ] /usr/lib64/libssh.so.4.7.4 -> /usr/local/lib (inferred)
          [ debug ] running ldconfig
          [ debug ] ch-run -w /tmp/ompi/ -- /sbin/ldconfig
          [ debug ] validating ldconfig cache
          done

       Same as above, except also inject Cray’s fi_info to verify Slingshot
       provider access.

          $ ch-fromhost -v --path /opt/cray/libfabric/1.15.0.0/lib64/libfabric.so \
                        -d /usr/local/bin \
                        --path /opt/cray/libfabric/1.15.0.0/bin/fi_info \
                        /tmp/ompi
          [...]
          $ ch-run /tmp/ompi/ -- fi_info -p cxi
          provider: cxi
              fabric: cxi
          [...]
              type: FI_EP_RDM
              protocol: FI_PROTO_CXI

       Cray GNI shared provider injection.

       Add the Cray host-built GNI provider libgnix-fi.so to the image and
       verify with fi_info.

          $ ch-fromhost -v --path /home/ofi/libgnix-fi.so /tmp/ompi
          [ debug ] queueing files
          [ debug ] libfabric shared provider: /home/ofi/libgnix-fi.so
          [ debug ] searching /tmp/ompi for libfabric shared provider destination
          [ debug ] found: /tmp/ompi/usr/local/lib/libfabric.so
          [ debug ] inferred provider destination: //usr/local/lib/libfabric
          [ debug ] injecting into image: /tmp/ompi
          [ debug ] mkdir -p /tmp/ompi//usr/local/lib/libfabric
          [ debug ] mkdir -p /tmp/ompi/var/lib/hugetlbfs
          [ debug ] mkdir -p /tmp/ompi/var/opt/cray/alps/spool
          [ debug ] mkdir -p /tmp/ompi/opt/cray/wlm_detect
          [ debug ] mkdir -p /tmp/ompi/etc/opt/cray/wlm_detect
          [ debug ] mkdir -p /tmp/ompi/opt/cray/udreg
          [ debug ] mkdir -p /tmp/ompi/opt/cray/xpmem
          [ debug ] mkdir -p /tmp/ompi/opt/cray/ugni
          [ debug ] mkdir -p /tmp/ompi/opt/cray/alps
          [ debug ] echo '/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] echo '/opt/cray/alps/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] echo '/opt/cray/udreg/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] echo '/opt/cray/ugni/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] echo '/opt/cray/wlm_detect/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] echo '/opt/cray/xpmem/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] echo '/usr/lib64' >> /tmp/ompi/etc/ld.so.conf.d/ch-ofi.conf
          [ debug ] /home/ofi/libgnix-fi.so -> //usr/local/lib/libfabric (inferred)
          [ debug ] running ldconfig
          [ debug ] ch-run -w /tmp/ompi -- /sbin/ldconfig
          [ debug ] validating ldconfig cache
          done

          $ ch-run /tmp/ompi -- fi_info -p gni
          provider: gni
              fabric: gni
          [...]
              type: FI_EP_RDM
              protocol: FI_PROTO_GNI

   Arbitrary
       Place shared library /usr/lib64/libfoo.so at path /usr/lib/libfoo.so
       (assuming /usr/lib is the first directory searched by the dynamic
       loader in the image) and executable /bin/bar at path /usr/bin/bar,
       within the image /var/tmp/baz. Then, create appropriate symlinks to
       libfoo and update the ld.so cache.

          $ cat qux.txt
          /bin/bar
          /usr/lib64/libfoo.so
          $ ch-fromhost --file qux.txt /var/tmp/baz

       Same as above:

          $ ch-fromhost --cmd 'cat qux.txt' /var/tmp/baz

       Same as above:

          $ ch-fromhost --path /bin/bar --path /usr/lib64/libfoo.so /var/tmp/baz

       Same as above, but place the files into /corge instead (and the
       shared library will not be found by ldconfig):

          $ ch-fromhost --dest /corge --file qux.txt /var/tmp/baz

       Same as above, and also place file /etc/quux at /etc/quux within the
       container:

          $ ch-fromhost --file qux.txt --dest /etc --path /etc/quux /var/tmp/baz

       Inject the executables and libraries recommended by nVidia into the
       image, and then run ldconfig:

          $ ch-fromhost --nvidia /var/tmp/baz
          asking ldconfig for shared library destination
          /sbin/ldconfig: Can’t stat /libx32: No such file or directory
          /sbin/ldconfig: Can’t stat /usr/libx32: No such file or directory
          shared library destination: /usr/lib64//bind9-export
          injecting into image: /var/tmp/baz
          /usr/bin/nvidia-smi -> /usr/bin (inferred)
          /usr/bin/nvidia-debugdump -> /usr/bin (inferred)
          /usr/bin/nvidia-persistenced -> /usr/bin (inferred)
          /usr/bin/nvidia-cuda-mps-control -> /usr/bin (inferred)
          /usr/bin/nvidia-cuda-mps-server -> /usr/bin (inferred)
          /usr/lib64/libnvidia-ml.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
          /usr/lib64/libnvidia-cfg.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
          [...]
          /usr/lib64/libGLESv2_nvidia.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
          /usr/lib64/libGLESv1_CM_nvidia.so.460.32.03 -> /usr/lib64//bind9-export (inferred)
          running ldconfig

ACKNOWLEDGEMENTS
       This command was inspired by the similar Shifter feature that allows
       Shifter containers to use the Cray Aries network. We particularly
       appreciate the help provided by Shane Canon and Doug Jacobsen during
       our implementation of --cray-mpi.

       We appreciate the advice of Ryan Olson at nVidia on implementing
       --nvidia.

REPORTING BUGS
       If Charliecloud was obtained from your Linux distribution, use your
       distribution’s bug reporting procedures.

       Otherwise, report bugs to: https://github.com/hpc/charliecloud/issues

SEE ALSO
       charliecloud(7)

       Full documentation at: <https://hpc.github.io/charliecloud>

COPYRIGHT
       2014–2022, Triad National Security, LLC and others



0.32                         2023-07-19 00:00 UTC               CH-FROMHOST(1)