lxc.container.conf(5)                                    lxc.container.conf(5)

NAME
       lxc.container.conf - LXC container configuration file

DESCRIPTION
       LXC is the well-known and heavily tested low-level Linux container
       runtime. It has been in active development since 2008 and has proven
       itself in critical production environments world-wide. Some of its
       core contributors are the same people that helped to implement
       various well-known containerization features inside the Linux
       kernel.

       LXC's main focus is system containers. That is, containers which
       offer an environment as close as possible to the one you'd get from
       a VM but without the overhead that comes with running a separate
       kernel and simulating all the hardware.

       This is achieved through a combination of kernel security features
       such as namespaces, mandatory access control and control groups.

       LXC has support for unprivileged containers. Unprivileged containers
       are containers that are run without any privileges. This requires
       support for user namespaces in the kernel that the container is run
       on. LXC was the first runtime to support unprivileged containers
       after user namespaces were merged into the mainline kernel.

       In essence, user namespaces isolate given sets of UIDs and GIDs.
       This is achieved by establishing a mapping between a range of UIDs
       and GIDs on the host to a different (unprivileged) range of UIDs and
       GIDs in the container. The kernel will translate this mapping in
       such a way that inside the container all UIDs and GIDs appear as you
       would expect from the host whereas on the host these UIDs and GIDs
       are in fact unprivileged. For example, a process running as UID and
       GID 0 inside the container might appear as UID and GID 100000 on the
       host. The implementation and working details can be gathered from
       the corresponding user namespace man page. UID and GID mappings can
       be defined with the lxc.idmap key.

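       As an illustration, a mapping of 65536 host IDs starting at 100000
       onto container IDs starting at 0 would look like the following (the
       ranges are illustrative and must be delegated to the calling user,
       e.g. via /etc/subuid and /etc/subgid):

              lxc.idmap = u 0 100000 65536
              lxc.idmap = g 0 100000 65536
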
       Linux containers are defined with a simple configuration file. Each
       option in the configuration file has the form key = value fitting in
       one line. The "#" character means the line is a comment. List
       options, like capabilities and cgroups options, can be used with no
       value to clear any previously defined values of that option.

       LXC namespace configuration keys use single dots. This means complex
       configuration keys such as lxc.net.0 expose various subkeys such as
       lxc.net.0.type, lxc.net.0.link, lxc.net.0.ipv6.address, and others
       for even more fine-grained configuration.

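       A minimal configuration file might look like this (the hostname and
       bridge name are illustrative):

              # Sample container configuration
              lxc.uts.name = mycontainer
              lxc.arch = x86_64
              lxc.net.0.type = veth
              lxc.net.0.link = br0
              lxc.net.0.flags = up
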
CONFIGURATION
       In order to ease administration of multiple related containers, it
       is possible to have a container configuration file cause another
       file to be loaded. For instance, network configuration can be
       defined in one common file which is included by multiple containers.
       Then, if the containers are moved to another host, only one file may
       need to be updated.

       lxc.include
              Specify the file to be included. The included file must be in
              the same valid lxc configuration file format.

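              For example, to pull shared settings from a common file (the
              path is illustrative):

              lxc.include = /usr/share/lxc/config/common.conf
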
ARCHITECTURE
       Allows one to set the architecture for the container. For example,
       set a 32-bit architecture for a container running 32-bit binaries on
       a 64-bit host. This helps container scripts which rely on the
       architecture to do some work like downloading packages.

       lxc.arch
              Specify the architecture for the container.

              Some valid options are x86, i686, x86_64, amd64

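              For instance, to run a 32-bit userspace on a 64-bit host:

              lxc.arch = i686
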
HOSTNAME
       The utsname section defines the hostname to be set for the
       container. That means the container can set its own hostname without
       changing the one from the system. That makes the hostname private
       for the container.

       lxc.uts.name
              specify the hostname for the container

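              For example (the name is illustrative):

              lxc.uts.name = mycontainer
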
HALT SIGNAL
       Allows one to specify the signal name or number sent to the
       container's init process to cleanly shut down the container.
       Different init systems could use different signals to perform a
       clean shutdown sequence. This option allows the signal to be
       specified in kill(1) fashion, e.g. SIGPWR, SIGRTMIN+14, SIGRTMAX-10
       or a plain number. The default signal is SIGPWR.

       lxc.signal.halt
              specify the signal used to halt the container

REBOOT SIGNAL
       Allows one to specify the signal name or number used to reboot the
       container. This option allows the signal to be specified in kill(1)
       fashion, e.g. SIGTERM, SIGRTMIN+14, SIGRTMAX-10 or a plain number.
       The default signal is SIGINT.

       lxc.signal.reboot
              specify the signal used to reboot the container

STOP SIGNAL
       Allows one to specify the signal name or number used to forcibly
       shut down the container. This option allows the signal to be
       specified in kill(1) fashion, e.g. SIGKILL, SIGRTMIN+14,
       SIGRTMAX-10 or a plain number. The default signal is SIGKILL.

       lxc.signal.stop
              specify the signal used to stop the container

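       For example, a container whose init is systemd typically halts on
       SIGRTMIN+3 (an assumption here; verify against your init system's
       documentation):

              lxc.signal.halt = SIGRTMIN+3
              lxc.signal.reboot = SIGINT
              lxc.signal.stop = SIGKILL
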
INIT COMMAND
       Sets the command to use as the init system for the containers.

       lxc.execute.cmd
              Absolute path from container rootfs to the binary to run by
              default. This mostly makes sense for lxc-execute.

       lxc.init.cmd
              Absolute path from container rootfs to the binary to use as
              init. This mostly makes sense for lxc-start. Default is
              /sbin/init.

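              For example, to start systemd directly instead of the
              default /sbin/init (the path is distribution-dependent):

              lxc.init.cmd = /lib/systemd/systemd
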
INIT WORKING DIRECTORY
       Sets the absolute path inside the container as the working directory
       for the container. LXC will switch to this directory before
       executing init.

       lxc.init.cwd
              Absolute path inside the container to use as the working
              directory.

INIT ID
       Sets the UID/GID to use for the init system, and subsequent
       commands. Note that using a non-root UID when booting a system
       container will likely not work due to missing privileges. Setting
       the UID/GID is mostly useful when running application containers.
       Defaults to: UID(0), GID(0)

       lxc.init.uid
              UID to use for init.

       lxc.init.gid
              GID to use for init.

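       For an application container running as an unprivileged user (the
       path and IDs are illustrative):

              lxc.init.cwd = /srv/app
              lxc.init.uid = 1000
              lxc.init.gid = 1000
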
PROC
       Configure proc filesystem for the container.

       lxc.proc.[proc file name]
              Specify the proc file name to be set. The file names
              available are those listed under /proc/PID/. Example:

              lxc.proc.oom_score_adj = 10

EPHEMERAL
       Allows one to specify whether a container will be destroyed on
       shutdown.

       lxc.ephemeral
              The only allowed values are 0 and 1. Set this to 1 to
              destroy a container on shutdown.

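              For example, to have a throwaway container removed when it
              stops:

              lxc.ephemeral = 1
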
NETWORK
       The network section defines how the network is virtualized in the
       container. The network virtualization acts at layer two. In order to
       use the network virtualization, parameters must be specified to
       define the network interfaces of the container. Several virtual
       interfaces can be assigned and used in a container even if the
       system has only one physical network interface.

       lxc.net
              may be used without a value to clear all previous network
              options.

       lxc.net.[i].type
              specify what kind of network virtualization to be used for
              the container. Must be specified before any other option(s)
              on the net device. Multiple networks can be specified by
              using an additional index i after all lxc.net.* keys. For
              example, lxc.net.0.type = veth and lxc.net.1.type = veth
              specify two different networks of the same type. All keys
              sharing the same index i will be treated as belonging to the
              same network. For example, lxc.net.0.link = br0 will belong
              to lxc.net.0.type. Currently, the different virtualization
              types can be:

              none: will cause the container to share the host's network
              namespace. This means the host network devices are usable in
              the container. It also means that if both the container and
              host have upstart as init, 'halt' in a container (for
              instance) will shut down the host. Note that unprivileged
              containers do not work with this setting due to an inability
              to mount sysfs. An unsafe workaround would be to bind mount
              the host's sysfs.

              empty: will create only the loopback interface.

              veth: a virtual ethernet pair device is created with one
              side assigned to the container and the other side on the
              host. lxc.net.[i].veth.mode specifies the mode the veth
              parent will use on the host. The accepted modes are bridge
              and router. The mode defaults to bridge if not specified. In
              bridge mode the host side is attached to a bridge specified
              by the lxc.net.[i].link option. If the bridge link is not
              specified, then the veth pair device will be created but not
              attached to any bridge. Otherwise, the bridge has to be
              created on the system before starting the container. lxc
              won't handle any configuration outside of the container. In
              router mode static routes are created on the host for the
              container's IP addresses pointing to the host side veth
              interface. Additionally Proxy ARP and Proxy NDP entries are
              added on the host side veth interface for the gateway IPs
              defined in the container to allow the container to reach the
              host. By default, lxc chooses a name for the network device
              belonging to the outside of the container, but if you wish
              to handle this name yourself, you can tell lxc to set a
              specific name with the lxc.net.[i].veth.pair option (except
              for unprivileged containers where this option is ignored for
              security reasons). Static routes can be added on the host
              pointing to the container using the
              lxc.net.[i].veth.ipv4.route and lxc.net.[i].veth.ipv6.route
              options. Several lines specify several routes. The route is
              in format x.y.z.t/m, eg. 192.168.1.0/24.

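              A typical bridged veth interface might be configured like
              this (the bridge name and address are illustrative):

              lxc.net.0.type = veth
              lxc.net.0.link = br0
              lxc.net.0.flags = up
              lxc.net.0.ipv4.address = 192.168.1.123/24
              lxc.net.0.ipv4.gateway = auto
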
              vlan: a vlan interface is linked with the interface
              specified by the lxc.net.[i].link and assigned to the
              container. The vlan identifier is specified with the option
              lxc.net.[i].vlan.id.

              macvlan: a macvlan interface is linked with the interface
              specified by the lxc.net.[i].link and assigned to the
              container. lxc.net.[i].macvlan.mode specifies the mode the
              macvlan will use to communicate between different macvlan on
              the same upper device. The accepted modes are private, vepa,
              bridge and passthru. In private mode, the device never
              communicates with any other device on the same upper_dev
              (default). In vepa mode, the new Virtual Ethernet Port
              Aggregator (VEPA) mode, it assumes that the adjacent bridge
              returns all frames where both source and destination are
              local to the macvlan port, i.e. the bridge is set up as a
              reflective relay. Broadcast frames coming in from the
              upper_dev get flooded to all macvlan interfaces in VEPA
              mode, local frames are not delivered locally. In bridge
              mode, it provides the behavior of a simple bridge between
              different macvlan interfaces on the same port. Frames from
              one interface to another one get delivered directly and are
              not sent out externally. Broadcast frames get flooded to all
              other bridge ports and to the external interface, but when
              they come back from a reflective relay, we don't deliver
              them again. Since we know all the MAC addresses, the macvlan
              bridge mode does not require learning or STP like the bridge
              module does. In passthru mode, all frames received by the
              physical interface are forwarded to the macvlan interface.
              Only one macvlan interface in passthru mode is possible for
              one physical interface.

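              A macvlan interface in bridge mode attached to a physical
              NIC might look like this (the parent interface name is
              illustrative):

              lxc.net.0.type = macvlan
              lxc.net.0.macvlan.mode = bridge
              lxc.net.0.link = eth0
              lxc.net.0.flags = up
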
              ipvlan: an ipvlan interface is linked with the interface
              specified by the lxc.net.[i].link and assigned to the
              container. lxc.net.[i].ipvlan.mode specifies the mode the
              ipvlan will use to communicate between different ipvlan on
              the same upper device. The accepted modes are l3, l3s and
              l2. It defaults to l3 mode. In l3 mode TX processing up to
              L3 happens on the stack instance attached to the dependent
              device and packets are switched to the stack instance of the
              parent device for the L2 processing and routing from that
              instance will be used before packets are queued on the
              outbound device. In this mode the dependent devices can
              neither receive nor send multicast / broadcast traffic. In
              l3s mode TX processing is very similar to the l3 mode except
              that iptables (conn-tracking) works in this mode and hence
              it is L3-symmetric (l3s). This will have slightly less
              performance but that shouldn't matter since you are choosing
              this mode over plain-l3 mode to make conn-tracking work. In
              l2 mode TX processing happens on the stack instance attached
              to the dependent device and packets are switched and queued
              to the parent device for sending out. In this mode the
              dependent devices will RX/TX multicast and broadcast (if
              applicable) as well. lxc.net.[i].ipvlan.isolation specifies
              the isolation mode. The accepted isolation values are
              bridge, private and vepa. It defaults to bridge. In bridge
              isolation mode dependent devices can cross-talk among
              themselves apart from talking through the parent device. In
              private isolation mode the port is set in private mode, i.e.
              the port won't allow cross communication between dependent
              devices. In vepa isolation mode the port is set in VEPA
              mode, i.e. the port will offload switching functionality to
              the external entity as described in 802.1Qbg.

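              An ipvlan interface in the default l3 mode might be
              configured as follows (the parent interface name and address
              are illustrative; the dev gateway is used because ipvlan is
              a layer 3 mode):

              lxc.net.0.type = ipvlan
              lxc.net.0.ipvlan.mode = l3
              lxc.net.0.ipvlan.isolation = bridge
              lxc.net.0.link = eth0
              lxc.net.0.flags = up
              lxc.net.0.ipv4.address = 192.168.1.123/32
              lxc.net.0.ipv4.gateway = dev
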
              phys: an already existing interface specified by the
              lxc.net.[i].link is assigned to the container.

       lxc.net.[i].flags
              Specify an action to do for the network.

              up: activates the interface.

       lxc.net.[i].link
              Specify the interface to be used for real network traffic.

       lxc.net.[i].l2proxy
              Controls whether layer 2 IP neighbour proxy entries will be
              added to the lxc.net.[i].link interface for the IP addresses
              of the container. Can be set to 0 or 1. Defaults to 0. When
              used with IPv4 addresses, the following sysctl value needs
              to be set:
              net.ipv4.conf.[link].forwarding=1
              When used with IPv6 addresses, the following sysctl values
              need to be set:
              net.ipv6.conf.[link].proxy_ndp=1
              net.ipv6.conf.[link].forwarding=1

       lxc.net.[i].mtu
              Specify the maximum transfer unit for this interface.

       lxc.net.[i].name
              The interface name is dynamically allocated, but if another
              name is needed because the configuration files being used by
              the container use a generic name, eg. eth0, this option will
              rename the interface in the container.

       lxc.net.[i].hwaddr
              The interface mac address is dynamically allocated by
              default to the virtual interface, but in some cases, this is
              needed to resolve a mac address conflict or to always have
              the same link-local ipv6 address. Any "x" in the address
              will be replaced by a random value; this allows setting
              hwaddr templates.

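              For example, to pin the vendor prefix and randomize only the
              final octets (a common template pattern):

              lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx
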
       lxc.net.[i].ipv4.address
              Specify the ipv4 address to assign to the virtualized
              interface. Several lines specify several ipv4 addresses. The
              address is in format x.y.z.t/m, eg. 192.168.1.123/24.

       lxc.net.[i].ipv4.gateway
              Specify the ipv4 address to use as the gateway inside the
              container. The address is in format x.y.z.t, eg.
              192.168.1.123. Can also have the special value auto, which
              means to take the primary address from the bridge interface
              (as specified by the lxc.net.[i].link option) and use that
              as the gateway. auto is only available when using the veth,
              macvlan and ipvlan network types. Can also have the special
              value of dev, which means to set the default gateway as a
              device route. This is primarily for use with layer 3 network
              modes, such as IPVLAN.

       lxc.net.[i].ipv6.address
              Specify the ipv6 address to assign to the virtualized
              interface. Several lines specify several ipv6 addresses. The
              address is in format x::y/m, eg.
              2003:db8:1:0:214:1234:fe0b:3596/64

       lxc.net.[i].ipv6.gateway
              Specify the ipv6 address to use as the gateway inside the
              container. The address is in format x::y, eg.
              2003:db8:1:0::1 Can also have the special value auto, which
              means to take the primary address from the bridge interface
              (as specified by the lxc.net.[i].link option) and use that
              as the gateway. auto is only available when using the veth,
              macvlan and ipvlan network types. Can also have the special
              value of dev, which means to set the default gateway as a
              device route. This is primarily for use with layer 3 network
              modes, such as IPVLAN.

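       Dual-stack addressing on a bridged interface might look like this
       (the addresses are illustrative):

              lxc.net.0.ipv4.address = 192.168.1.123/24
              lxc.net.0.ipv4.gateway = auto
              lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3596/64
              lxc.net.0.ipv6.gateway = auto
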
       lxc.net.[i].script.up
              Add a configuration option to specify a script to be
              executed after creating and configuring the network used
              from the host side.

              In addition to the information available to all hooks, the
              following information is provided to the script:

              • LXC_HOOK_TYPE: the hook type. This is either 'up' or
                'down'.

              • LXC_HOOK_SECTION: the section type 'net'.

              • LXC_NET_TYPE: the network type. This is one of the valid
                network types listed here (e.g. 'vlan', 'macvlan',
                'ipvlan', 'veth').

              • LXC_NET_PARENT: the parent device on the host. This is
                only set for network types 'macvlan', 'veth', 'phys'.

              • LXC_NET_PEER: the name of the peer device on the host.
                This is only set for 'veth' network types. Note that this
                information is only available when lxc.hook.version is
                set to 1.

              Whether this information is provided in the form of
              environment variables or as arguments to the script depends
              on the value of lxc.hook.version. If set to 1 then
              information is provided in the form of environment
              variables. If set to 0 information is provided as arguments
              to the script.

              Standard output from the script is logged at debug level.
              Standard error is not logged, but can be captured by the
              hook redirecting its standard error to standard output.

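              For example, to run a hook with the information passed as
              environment variables (the script path is hypothetical):

              lxc.hook.version = 1
              lxc.net.0.script.up = /usr/local/bin/lxc-net-up.sh
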
       lxc.net.[i].script.down
              Add a configuration option to specify a script to be
              executed before destroying the network used from the host
              side.

              In addition to the information available to all hooks, the
              following information is provided to the script:

              • LXC_HOOK_TYPE: the hook type. This is either 'up' or
                'down'.

              • LXC_HOOK_SECTION: the section type 'net'.

              • LXC_NET_TYPE: the network type. This is one of the valid
                network types listed here (e.g. 'vlan', 'macvlan',
                'ipvlan', 'veth').

              • LXC_NET_PARENT: the parent device on the host. This is
                only set for network types 'macvlan', 'veth', 'phys'.

              • LXC_NET_PEER: the name of the peer device on the host.
                This is only set for 'veth' network types. Note that this
                information is only available when lxc.hook.version is
                set to 1.

              Whether this information is provided in the form of
              environment variables or as arguments to the script depends
              on the value of lxc.hook.version. If set to 1 then
              information is provided in the form of environment
              variables. If set to 0 information is provided as arguments
              to the script.

              Standard output from the script is logged at debug level.
              Standard error is not logged, but can be captured by the
              hook redirecting its standard error to standard output.

NEW PSEUDO TTY INSTANCE (DEVPTS)
       For stricter isolation the container can have its own private
       instance of the pseudo tty.

       lxc.pty.max
              If set, the container will have a new pseudo tty instance,
              making this private to it. The value specifies the maximum
              number of pseudo ttys allowed for a pty instance (this
              limitation is not implemented yet).

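              For example:

              lxc.pty.max = 1024
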
CONTAINER SYSTEM CONSOLE
       If the container is configured with a root filesystem and the
       inittab file is setup to use the console, you may want to specify
       where the output of this console goes.

       lxc.console.buffer.size
              Setting this option instructs liblxc to allocate an
              in-memory ringbuffer. The container's console output will be
              written to the ringbuffer. Note that the ringbuffer must be
              at least as big as a standard page size. When passed a value
              smaller than a single page size liblxc will allocate a
              ringbuffer of a single page size. A page size is usually
              4KB. The keyword 'auto' will cause liblxc to allocate a
              ringbuffer of 128KB. When manually specifying a size for the
              ringbuffer the value should be a power of 2 when converted
              to bytes. Valid size prefixes are 'KB', 'MB', 'GB'. (Note
              that all conversions are based on multiples of 1024. That
              means 'KB' == 'KiB', 'MB' == 'MiB', 'GB' == 'GiB'.
              Additionally, the case of the suffix is ignored, i.e. 'kB',
              'KB' and 'Kb' are treated equally.)

       lxc.console.size
              Setting this option instructs liblxc to place a limit on the
              size of the console log file specified in
              lxc.console.logfile. Note that the size of the log file must
              be at least as big as a standard page size. When passed a
              value smaller than a single page size liblxc will set the
              size of the log file to a single page size. A page size is
              usually 4KB. The keyword 'auto' will cause liblxc to place a
              limit of 128KB on the log file. When manually specifying a
              size for the log file the value should be a power of 2 when
              converted to bytes. Valid size prefixes are 'KB', 'MB',
              'GB'. (Note that all conversions are based on multiples of
              1024. That means 'KB' == 'KiB', 'MB' == 'MiB', 'GB' ==
              'GiB'. Additionally, the case of the suffix is ignored, i.e.
              'kB', 'KB' and 'Kb' are treated equally.) If users want to
              mirror the console ringbuffer on disk they should set
              lxc.console.size equal to lxc.console.buffer.size.

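              For example, to keep a 256 KiB in-memory ringbuffer mirrored
              to a size-limited, rotatable log file (the path is
              illustrative):

              lxc.console.buffer.size = 256KB
              lxc.console.size = 256KB
              lxc.console.logfile = /var/log/lxc/mycontainer-console.log
              lxc.console.rotate = 1
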
       lxc.console.logfile
              Specify a path to a file where the console output will be
              written. Note that in contrast to the on-disk ringbuffer
              logfile this file will keep growing, potentially filling up
              the user's disk if not rotated and deleted. This problem can
              also be avoided by using the in-memory ringbuffer options
              lxc.console.buffer.size and lxc.console.buffer.logfile.

       lxc.console.rotate
              Whether to rotate the console logfile specified in
              lxc.console.logfile. Users can send an API request to rotate
              the logfile. Note that the old logfile will have the same
              name as the original with the suffix ".1" appended. Users
              wishing to prevent the console log file from filling the
              disk should rotate the logfile and delete it if unneeded.
              This problem can also be avoided by using the in-memory
              ringbuffer options lxc.console.buffer.size and
              lxc.console.buffer.logfile.

       lxc.console.path
              Specify a path to a device to which the console will be
              attached. The keyword 'none' will simply disable the
              console. Note, when specifying 'none' and creating a device
              node for the console in the container at /dev/console, or
              bind-mounting the host's /dev/console into the container at
              /dev/console, the container will have direct access to the
              host's /dev/console. This is dangerous when the container
              has write access to the device and should thus be used with
              caution.

CONSOLE THROUGH THE TTYS
       This option is useful if the container is configured with a root
       filesystem and the inittab file is setup to launch a getty on the
       ttys. The option specifies the number of ttys to be available for
       the container. The number of gettys in the inittab file of the
       container should not be greater than the number of ttys specified
       in this option, otherwise the excess getty sessions will die and
       respawn indefinitely giving annoying messages on the console or in
       /var/log/messages.

       lxc.tty.max
              Specify the number of ttys to make available to the
              container.

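              For example, to provide four ttys:

              lxc.tty.max = 4
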
CONSOLE DEVICES LOCATION
       LXC consoles are provided through Unix98 PTYs created on the host
       and bind-mounted over the expected devices in the container. By
       default, they are bind-mounted over /dev/console and /dev/ttyN.
       This can prevent package upgrades in the guest. Therefore you can
       specify a directory location (under /dev) under which LXC will
       create the files and bind-mount over them. These will then be
       symbolically linked to /dev/console and /dev/ttyN. A package
       upgrade can then succeed as it is able to remove and replace the
       symbolic links.

       lxc.tty.dir
              Specify a directory under /dev under which to create the
              container console devices. Note that LXC will move any
              bind-mounts or device nodes for /dev/console into this
              directory.

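              For example, to have the console devices created under
              /dev/lxc (the directory name is illustrative):

              lxc.tty.dir = lxc
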
/DEV DIRECTORY
       By default, lxc creates a few symbolic links (fd, stdin, stdout,
       stderr) in the container's /dev directory but does not
       automatically create device node entries. This allows the
       container's /dev to be set up as needed in the container rootfs. If
       lxc.autodev is set to 1, then after mounting the container's rootfs
       LXC will mount a fresh tmpfs under /dev (limited to 500K by
       default, unless defined in lxc.autodev.tmpfs.size) and fill in a
       minimal set of initial devices. This is generally required when
       starting a container containing a "systemd" based "init" but may be
       optional at other times. Additional devices in the container's /dev
       directory may be created through the use of the lxc.hook.autodev
       hook.

       lxc.autodev
              Set this to 0 to stop LXC from mounting and populating a
              minimal /dev when starting the container.

       lxc.autodev.tmpfs.size
              Set this to define the size of the /dev tmpfs. The default
              value is 500000 (500K). If the parameter is used but without
              a value, the default value is used.

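       For example, to populate a minimal /dev on a 1 MB tmpfs (the size
       is illustrative):

              lxc.autodev = 1
              lxc.autodev.tmpfs.size = 1000000
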
MOUNT POINTS
       The mount points section specifies the different places to be
       mounted. These mount points will be private to the container and
       won't be visible by the processes running outside of the container.
       This is useful to mount /etc, /var or /home, for example.

       NOTE - LXC will generally ensure that mount targets and relative
       bind-mount sources are properly confined under the container root,
       to avoid attacks involving over-mounting host directories and
       files. (Symbolic links in absolute mount sources are ignored.)
       However, if the container configuration first mounts a directory
       which is under the control of the container user, such as
       /home/joe, into the container at some path, and then mounts under
       that path, then a TOCTTOU attack would be possible where the
       container user modifies a symbolic link under their home directory
       at just the right time.

       lxc.mount.fstab
              specify a file location in the fstab format, containing the
              mount information. The mount target location can and in most
              cases should be a relative path, which will become relative
              to the mounted container root. For instance,

              proc proc proc nodev,noexec,nosuid 0 0

              Will mount a proc filesystem under the container's /proc,
              regardless of where the root filesystem comes from. This is
              resilient to block device backed filesystems as well as
              container cloning.

              Note that when mounting a filesystem from an image file or
              block device the third field (fs_vfstype) cannot be auto as
              with mount(8) but must be explicitly specified.

       lxc.mount.entry
              Specify a mount point corresponding to a line in the fstab
              format. Moreover lxc supports mount propagation, such as
              rshared or rprivate, and adds three additional mount
              options: optional (don't fail if the mount does not work),
              create=dir or create=file (create the dir or file when the
              point will be mounted), and relative (the source path is
              taken to be relative to the mounted container root). For
              instance,

              dev/null proc/kcore none bind,relative 0 0

              Will expand dev/null to ${LXC_ROOTFS_MOUNT}/dev/null, and
              mount it to proc/kcore inside the container.

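              Another example: bind-mounting a host directory into the
              container, creating the target directory if needed and
              tolerating its absence (the paths are illustrative):

              lxc.mount.entry = /srv/data srv/data none bind,create=dir,optional 0 0
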
       lxc.mount.auto
              specify which standard kernel file systems should be
              automatically mounted. This may dramatically simplify the
              configuration. The file systems are:

              • proc:mixed (or proc): mount /proc as read-write, but
                remount /proc/sys and /proc/sysrq-trigger read-only for
                security / container isolation purposes.

              • proc:rw: mount /proc as read-write

              • sys:mixed (or sys): mount /sys as read-only but with
                /sys/devices/virtual/net writable.

              • sys:ro: mount /sys as read-only for security / container
                isolation purposes.

              • sys:rw: mount /sys as read-write

              • cgroup:mixed: Mount a tmpfs to /sys/fs/cgroup, create
                directories for all hierarchies to which the container is
                added, create subdirectories in those hierarchies with the
                name of the cgroup, and bind-mount the container's own
                cgroup into that directory. The container will be able to
                write to its own cgroup directory, but not the parents,
                since they will be remounted read-only.

              • cgroup:mixed:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup:mixed.
                This is mainly useful when cgroup namespaces are enabled,
                in which case LXC will normally leave mounting cgroups to
                the init binary of the container since it is perfectly
                safe to do so.

              • cgroup:ro: similar to cgroup:mixed, but everything will be
                mounted read-only.

              • cgroup:ro:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup:ro. This
                is mainly useful when cgroup namespaces are enabled, in
                which case LXC will normally leave mounting cgroups to the
                init binary of the container since it is perfectly safe to
                do so.

              • cgroup:rw: similar to cgroup:mixed, but everything will be
                mounted read-write. Note that the paths leading up to the
                container's own cgroup will be writable, but will not be a
                cgroup filesystem but just part of the tmpfs of
                /sys/fs/cgroup

              • cgroup:rw:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup:rw. This
                is mainly useful when cgroup namespaces are enabled, in
                which case LXC will normally leave mounting cgroups to the
                init binary of the container since it is perfectly safe to
                do so.

              • cgroup (without specifier): defaults to cgroup:rw if the
                container retains the CAP_SYS_ADMIN capability,
                cgroup:mixed otherwise.

              • cgroup-full:mixed: mount a tmpfs to /sys/fs/cgroup, create
                directories for all hierarchies to which the container is
                added, bind-mount the hierarchies from the host to the
                container and make everything read-only except the
                container's own cgroup. Note that compared to cgroup,
                where all paths leading up to the container's own cgroup
                are just simple directories in the underlying tmpfs, here
                /sys/fs/cgroup/$hierarchy will contain the host's full
                cgroup hierarchy, albeit read-only outside the container's
                own cgroup. This may leak quite a bit of information into
                the container.

              • cgroup-full:mixed:force: The force option will cause LXC
                to perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to
                cgroup-full:mixed. This is mainly useful when cgroup
                namespaces are enabled, in which case LXC will normally
                leave mounting cgroups to the init binary of the container
                since it is perfectly safe to do so.

              • cgroup-full:ro: similar to cgroup-full:mixed, but
                everything will be mounted read-only.

              • cgroup-full:ro:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup-full:ro.
                This is mainly useful when cgroup namespaces are enabled,
                in which case LXC will normally leave mounting cgroups to
                the init binary of the container since it is perfectly
                safe to do so.

              • cgroup-full:rw: similar to cgroup-full:mixed, but
                everything will be mounted read-write. Note that in this
                case, the container may escape its own cgroup. (Note also
                that if the container has CAP_SYS_ADMIN support and can
                mount the cgroup filesystem itself, it may do so anyway.)

              • cgroup-full:rw:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup-full:rw.
                This is mainly useful when cgroup namespaces are enabled,
                in which case LXC will normally leave mounting cgroups to
                the init binary of the container since it is perfectly
                safe to do so.

              • cgroup-full (without specifier): defaults to
                cgroup-full:rw if the container retains the CAP_SYS_ADMIN
                capability, cgroup-full:mixed otherwise.

694 If cgroup namespaces are enabled, then any cgroup auto-mounting request
695 will be ignored, since the container can mount the filesystems itself,
696 and automounting can confuse the container init.
697
698 Note that if automatic mounting of the cgroup filesystem is enabled,
699 the tmpfs under /sys/fs/cgroup will always be mounted read-write (but
700 for the :mixed and :ro cases, the individual hierarchies,
701 /sys/fs/cgroup/$hierarchy, will be read-only). This is in order to work
702 around a quirk in Ubuntu's mountall(8) command that will cause contain‐
703 ers to wait for user input at boot if /sys/fs/cgroup is mounted read-
704 only and the container can't remount it read-write due to a lack of
705 CAP_SYS_ADMIN.
706
707 Examples:
708
709 lxc.mount.auto = proc sys cgroup
710 lxc.mount.auto = proc:rw sys:rw cgroup-full:rw
711
712
713 ROOT FILE SYSTEM
714 The root file system of the container can be different from that of the
715 host system.
716
717 lxc.rootfs.path
718 specify the root file system for the container. It can be an im‐
719 age file, a directory or a block device. If not specified, the
720 container shares its root file system with the host.
721
722 For directory or simple block-device backed containers, a path‐
723 name can be used. If the rootfs is backed by an nbd device, then
724 nbd:file:1 specifies that file should be attached to an nbd de‐
725 vice, and partition 1 should be mounted as the rootfs. nbd:file
726 specifies that the nbd device itself should be mounted. over‐
727 layfs:/lower:/upper specifies that the rootfs should be an over‐
728 lay with /upper being mounted read-write over a read-only mount
729 of /lower. For overlay, multiple /lower directories can be spec‐
730 ified. loop:/file tells lxc to attach /file to a loop device and
731 mount the loop device.
732
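              For example, a directory-backed rootfs and an overlay-backed
              rootfs might be specified as follows (the paths and the
              container name "c1" are illustrative):

                     lxc.rootfs.path = /var/lib/lxc/c1/rootfs
                     lxc.rootfs.path = overlayfs:/var/lib/lxc/c1/lower:/var/lib/lxc/c1/upper
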
733 lxc.rootfs.mount
734 where to recursively bind lxc.rootfs.path before pivoting. This
735 is to ensure success of the pivot_root(8) syscall. Any directory
736 suffices, the default should generally work.
737
738 lxc.rootfs.options
739 Specify extra mount options to use when mounting the rootfs.
740 The format of the mount options corresponds to the format used
741 in fstab. In addition, LXC supports the custom idmap= mount op‐
742 tion. This option can be used to tell LXC to create an idmapped
743 mount for the container's rootfs. This is useful when the user
744 doesn't want to recursively chown the rootfs of the container to
745 match the idmapping of the user namespace the container is going
746 to use. Instead an idmapped mount can be used to handle this.
747 The argument for idmap= can either be a path pointing to a user
748 namespace file that LXC will open and use to idmap the rootfs or
749 the special value "container" which will instruct LXC to use the
750 container's user namespace to idmap the rootfs.
751
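              For example, to have LXC idmap the rootfs according to the
              container's own user namespace using the special "container"
              value:

                     lxc.rootfs.options = idmap=container
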
752 lxc.rootfs.managed
753 Set this to 0 to indicate that LXC is not managing the container
754 storage, then LXC will not modify the container storage. The de‐
755 fault is 1.
756
757 CONTROL GROUPS ("CGROUPS")
758 The control group section contains the configuration for the different
759 subsystems. lxc does not check the correctness of the subsystem name.
760 This has the disadvantage of not detecting configuration errors until
761 the container is started, but has the advantage of permitting any fu‐
762 ture subsystem.
763
764 The kernel implementation of cgroups has changed significantly over the
765 years. With Linux 4.5, support for a new cgroup filesystem was added,
766 usually referred to as "cgroup2" or "unified hierarchy". Since then the
767 old cgroup filesystem is usually referred to as "cgroup1" or the
768 "legacy hierarchies". Please see the cgroups manual page for a detailed
769 explanation of the differences between the two versions.
770
771 LXC distinguishes settings for the legacy and the unified hierarchy by
772 using different configuration key prefixes. To alter settings for con‐
773 trollers in a legacy hierarchy the key prefix lxc.cgroup. must be used
774 and in order to alter the settings for a controller in the unified hi‐
775 erarchy the lxc.cgroup2. key must be used. Note that LXC will ignore
776 lxc.cgroup. settings on systems that only use the unified hierarchy.
777 Conversely, it will ignore lxc.cgroup2. options on systems that only
778 use legacy hierarchies.
779
780 At its core a cgroup hierarchy is a way to hierarchically organize pro‐
781 cesses. Usually a cgroup hierarchy will have one or more "controllers"
782 enabled. A "controller" in a cgroup hierarchy is usually responsible
783 for distributing a specific type of system resource along the hierar‐
784 chy. Controllers include the "pids" controller, the "cpu" controller,
785 the "memory" controller and others. Some controllers however do not
786 fall into the category of distributing a system resource, instead they
787 are often referred to as "utility" controllers. One utility controller
788 is the device controller. Instead of distributing a system resource, it
789 allows device access to be managed.
790
791 In the legacy hierarchy the device controller was implemented like most
792 other controllers as a set of files that could be written to. These
793 files were named "devices.allow" and "devices.deny". The legacy device
794 controller allowed the implementation of both "allowlists" and
795 "denylists".
796
797 An allowlist is a device program that by default blocks access to all
798 devices. In order to access specific devices "allow rules" for particu‐
799 lar devices or device classes must be specified. In contrast, a
800 denylist is a device program that by default allows access to all de‐
801 vices. In order to restrict access to specific devices "deny rules" for
802 particular devices or device classes must be specified.
803
804 In the unified cgroup hierarchy the implementation of the device con‐
805 troller has completely changed. Instead of files to read from and write
806 to, an eBPF program of BPF_PROG_TYPE_CGROUP_DEVICE can be attached to a
807 cgroup. Even though the kernel implementation has changed completely
808 LXC tries to allow for the same semantics to be followed in the legacy
809 device cgroup and the unified eBPF-based device controller. The follow‐
810 ing paragraphs explain the semantics for the unified eBPF-based device
811 controller.
812
813 As mentioned the format for specifying device rules for the unified
814 eBPF-based device controller is the same as for the legacy cgroup de‐
815 vice controller; only the configuration key prefix has changed.
816 Specifically, device rules for the legacy cgroup device controller are
817 specified via lxc.cgroup.devices.allow and lxc.cgroup.devices.deny
818 whereas for the cgroup2 eBPF-based device controller lxc.cgroup2.de‐
819 vices.allow and lxc.cgroup2.devices.deny must be used.
820
821 • An allowlist device rule
822
823 lxc.cgroup2.devices.deny = a
824
825
826 will cause LXC to instruct the kernel to block access to all devices
827 by default. To grant access to devices, allow device rules must be
828 added via the lxc.cgroup2.devices.allow key. This is referred to as an
829 "allowlist" device program.
830
831 • A denylist device rule
832
833 lxc.cgroup2.devices.allow = a
834
835
836 will cause LXC to instruct the kernel to allow access to all devices
837 by default. To deny access to devices, deny device rules must be added
838 via the lxc.cgroup2.devices.deny key. This is referred to as a
839 "denylist" device program.
840
841 • Specifying either of the two aforementioned rules will cause all previous
842 rules to be cleared, i.e. the device list will be reset.
843
844 • When an allowlist program is requested, i.e. access to all devices is
845 blocked by default, specific deny rules for individual devices or de‐
846 vice classes are ignored.
847
848 • When a denylist program is requested, i.e. access to all devices is
849 allowed by default, specific allow rules for individual devices or
850 device classes are ignored.
851
852 For example the set of rules:
853
854 lxc.cgroup2.devices.deny = a
855 lxc.cgroup2.devices.allow = c *:* m
856 lxc.cgroup2.devices.allow = b *:* m
857 lxc.cgroup2.devices.allow = c 1:3 rwm
858
859
860 implements an allowlist device program, i.e. the kernel will block ac‐
861 cess to all devices not specifically allowed in this list. This partic‐
862 ular program states that all character and block devices may be created
863 but only /dev/null might be read or written.
864
865 If we instead switch to the following set of rules:
866
867 lxc.cgroup2.devices.allow = a
868 lxc.cgroup2.devices.deny = c *:* m
869 lxc.cgroup2.devices.deny = b *:* m
870 lxc.cgroup2.devices.deny = c 1:3 rwm
871
872
873 then LXC would instruct the kernel to implement a denylist, i.e. the
874 kernel will allow access to all devices not specifically denied in this
875 list. This particular program states that no character devices or block
876 devices might be created and that /dev/null is not allowed to be
877 read, written, or created.
878
879 Now consider the same program but followed by a "global rule" which de‐
880 termines the type of device program (allowlist or denylist) as ex‐
881 plained above:
882
883 lxc.cgroup2.devices.allow = a
884 lxc.cgroup2.devices.deny = c *:* m
885 lxc.cgroup2.devices.deny = b *:* m
886 lxc.cgroup2.devices.deny = c 1:3 rwm
887 lxc.cgroup2.devices.allow = a
888
889
890 The last line will cause LXC to reset the device list without changing
891 the type of device program.
892
893 If we specify:
894
895 lxc.cgroup2.devices.allow = a
896 lxc.cgroup2.devices.deny = c *:* m
897 lxc.cgroup2.devices.deny = b *:* m
898 lxc.cgroup2.devices.deny = c 1:3 rwm
899 lxc.cgroup2.devices.deny = a
900
901
902 instead then the last line will cause LXC to reset the device list and
903 switch from an allowlist program to a denylist program.
904
905 lxc.cgroup.[controller name].[controller file]
906 Specify the control group value to be set on a legacy cgroup hi‐
907 erarchy. The controller name is the literal name of the control
908 group. The permitted names and the syntax of their values are not
909 dictated by LXC; instead they depend on the features of the Linux
910 kernel running at the time the container is started, eg.
911 lxc.cgroup.cpuset.cpus
912
913 lxc.cgroup2.[controller name].[controller file]
914 Specify the control group value to be set on the unified cgroup
915 hierarchy. The controller name is the literal name of the con‐
916 trol group. The permitted names and the syntax of their values
917 are not dictated by LXC; instead they depend on the features of
918 the Linux kernel running at the time the container is started,
919 eg. lxc.cgroup2.memory.high
920
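              As an illustrative sketch (the values are examples, not
              recommendations), a memory and a pids limit on the unified
              hierarchy might look like:

                     lxc.cgroup2.memory.max = 512M
                     lxc.cgroup2.pids.max = 100
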
921 lxc.cgroup.dir
922 specify a directory or path in which the container's cgroup will
923 be created. For example, setting lxc.cgroup.dir =
924 my-cgroup/first for a container named "c1" will create the con‐
925 tainer's cgroup as a sub-cgroup of "my-cgroup". For example, if
926 the user's current cgroup "my-user" is located in the root
927 cgroup of the cpuset controller in a cgroup v1 hierarchy this
928 would create the cgroup "/sys/fs/cgroup/cpuset/my-user/my-
929 cgroup/first/c1" for the container. Any missing cgroups will be
930 created by LXC. This presupposes that the user has write access
931 to its current cgroup.
932
933 lxc.cgroup.relative
934 Set this to 1 to instruct LXC to never escape to the root
935 cgroup. This makes it easy for users to adhere to restrictions
936 enforced by cgroup2 and systemd. Specifically, this makes it
937 possible to run LXC containers as systemd services.
938
939 CAPABILITIES
940 Capabilities can be dropped in the container if it is run as
941 root.
942
943 lxc.cap.drop
944 Specify the capability to be dropped in the container. Several
945 capabilities may be given on a single line, separated by
946 spaces. The format is the lower case of the capability defini‐
947 tion without the "CAP_" prefix, eg. CAP_SYS_MODULE should be
948 specified as sys_module. See capabilities(7). If used with no
949 value, lxc will clear any drop capabilities specified up to this
950 point.
951
952 lxc.cap.keep
953 Specify the capability to be kept in the container. All other
954 capabilities will be dropped. When a special value of "none" is
955 encountered, lxc will clear any keep capabilities specified up
956 to this point. A value of "none" alone can be used to drop all
957 capabilities.
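
              For example, to drop the ability to load kernel modules and
              to perform raw I/O (capability names per capabilities(7)):

                     lxc.cap.drop = sys_module sys_rawio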
958
959 NAMESPACES
960 A namespace can be cloned (lxc.namespace.clone), kept (lxc.name‐
961 space.keep) or shared (lxc.namespace.share.[namespace identifier]).
962
963 lxc.namespace.clone
964 Specify namespaces which the container is supposed to be created
965 with. The namespaces to create are specified as a space sepa‐
966 rated list. Each namespace must correspond to one of the stan‐
967 dard namespace identifiers as seen in the /proc/PID/ns direc‐
968 tory. When lxc.namespace.clone is not explicitly set all name‐
969 spaces supported by the kernel and the current configuration
970 will be used.
971
972 To create a new mount, net and ipc namespace set lxc.name‐
973 space.clone=mount net ipc.
974
975 lxc.namespace.keep
976 Specify namespaces which the container is supposed to inherit
977 from the process that created it. The namespaces to keep are
978 specified as a space separated list. Each namespace must corre‐
979 spond to one of the standard namespace identifiers as seen in
980 the /proc/PID/ns directory. lxc.namespace.keep is a
981 denylist option, i.e. it is useful when enforcing that contain‐
982 ers must keep a specific set of namespaces.
983
984 To keep the network, user and ipc namespace set lxc.name‐
985 space.keep=user net ipc.
986
987 Note that sharing pid namespaces will likely not work with most
988 init systems.
989
990 Note that if the container requests a new user namespace and the
991 container wants to inherit the network namespace it needs to in‐
992 herit the user namespace as well.
993
994 lxc.namespace.share.[namespace identifier]
995 Specify a namespace to inherit from another container or
996 process. The [namespace identifier] suffix needs to be replaced
997 with one of the namespaces that appear in the /proc/PID/ns di‐
998 rectory.
999
1000 To inherit the namespace from another process set the lxc.name‐
1001 space.share.[namespace identifier] to the PID of the process,
1002 e.g. lxc.namespace.share.net=42.
1003
1004 To inherit the namespace from another container set the
1005 lxc.namespace.share.[namespace identifier] to the name of the
1006 container, e.g. lxc.namespace.share.pid=c3.
1007
1008 To inherit the namespace from another container located in a
1009 different path than the standard liblxc path set the lxc.name‐
1010 space.share.[namespace identifier] to the full path to the con‐
1011 tainer, e.g. lxc.namespace.share.user=/opt/c3.
1012
1013 In order to inherit namespaces the caller needs to have suffi‐
1014 cient privilege over the process or container.
1015
1016 Note that sharing pid namespaces between system containers will
1017 likely not work with most init systems.
1018
1019 Note that if two processes are in different user namespaces and
1020 one process wants to inherit the other's network namespace it
1021 usually needs to inherit the user namespace as well.
1022
1023 Note that without careful additional configuration of an LSM,
1024 sharing user+pid namespaces with a task may allow that task to
1025 escalate privileges to that of the task calling liblxc.
1026
1027 RESOURCE LIMITS
1028 The soft and hard resource limits for the container can be changed.
1029 Unprivileged containers can only lower them. Resources which are not
1030 explicitly specified will be inherited.
1031
1032 lxc.prlimit.[limit name]
1033 Specify the resource limit to be set. A limit is specified as
1034 two colon separated values which are either numeric or the word
1035 'unlimited'. A single value can be used as a shortcut to set
1036 both soft and hard limit to the same value. The permitted names
1037 are the "RLIMIT_" resource names in lowercase without the "RLIMIT_"
1038 prefix, eg. RLIMIT_NOFILE should be specified as "nofile". See
1039 setrlimit(2). If used with no value, lxc will clear the re‐
1040 source limit specified up to this point. A resource with no ex‐
1041 plicitly configured limitation will be inherited from the
1042 process starting up the container.
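
              For example, to cap open files at a soft limit of 1024 and a
              hard limit of 4096, and to leave locked memory unlimited (the
              values are illustrative):

                     lxc.prlimit.nofile = 1024:4096
                     lxc.prlimit.memlock = unlimited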
1043
1044 SYSCTL
1045 Configure kernel parameters for the container.
1046
1047 lxc.sysctl.[kernel parameters name]
1048 Specify the kernel parameters to be set. The parameters avail‐
1049 able are those listed under /proc/sys/. Note that not all
1050 sysctls are namespaced. Changing non-namespaced sysctls will
1051 cause the system-wide setting to be modified. See sysctl(8). If
1052 used with no value, lxc will clear the parameters specified up
1053 to this point.
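
              For example, to enable IPv4 forwarding inside the container's
              network namespace (a namespaced sysctl):

                     lxc.sysctl.net.ipv4.ip_forward = 1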
1054
1055 APPARMOR PROFILE
1056 If lxc was compiled and installed with apparmor support, and the host
1057 system has apparmor enabled, then the apparmor profile under which the
1058 container should be run can be specified in the container configura‐
1059 tion. The default is lxc-container-default-cgns if the host kernel is
1060 cgroup namespace aware, or lxc-container-default otherwise.
1061
1062 lxc.apparmor.profile
1063 Specify the apparmor profile under which the container should be
1064 run. To specify that the container should be unconfined, use
1065
1066 lxc.apparmor.profile = unconfined
1067
1068 If the apparmor profile should remain unchanged (i.e. if you are
1069 nesting containers and are already confined), then use
1070
1071 lxc.apparmor.profile = unchanged
1072
1073 If you instruct LXC to generate the apparmor profile, then use
1074
1075 lxc.apparmor.profile = generated
1076
1077 lxc.apparmor.allow_incomplete
1078 Apparmor profiles are pathname based. Therefore many file re‐
1079 strictions require mount restrictions to be effective against a
1080 determined attacker. However, these mount restrictions are not
1081 yet implemented in the upstream kernel. Without the mount re‐
1082 strictions, the apparmor profiles still protect against acciden‐
1083 tal damage.
1084
1085 If this flag is 0 (default), then the container will not be
1086 started if the kernel lacks the apparmor mount features, so that
1087 a regression after a kernel upgrade will be detected. To start
1088 the container under partial apparmor protection, set this flag
1089 to 1.
1090
1091 lxc.apparmor.allow_nesting
1092 If set to 1, this causes the following changes. When generated
1093 apparmor profiles are used, they will contain the necessary
1094 changes to allow creating a nested container. In addition to the
1095 usual mount points, /dev/.lxc/proc and /dev/.lxc/sys will con‐
1096 tain procfs and sysfs mount points without the lxcfs overlays,
1097 which, if generated apparmor profiles are being used, will not
1098 be read/writable directly.
1099
1100 lxc.apparmor.raw
1101 A list of raw AppArmor profile lines to append to the profile.
1102 Only valid when using generated profiles.
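
              For example, to append a rule denying mounts (AppArmor rule
              syntax; shown purely for illustration):

                     lxc.apparmor.raw = deny mount,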
1103
1104 SELINUX CONTEXT
1105 If lxc was compiled and installed with SELinux support, and the host
1106 system has SELinux enabled, then the SELinux context under which the
1107 container should be run can be specified in the container configura‐
1108 tion. The default is unconfined_t, which means that lxc will not at‐
1109 tempt to change contexts. See /usr/share/lxc/selinux/lxc.te for an ex‐
1110 ample policy and more information.
1111
1112 lxc.selinux.context
1113 Specify the SELinux context under which the container should be
1114 run or unconfined_t. For example
1115
1116 lxc.selinux.context = system_u:system_r:lxc_t:s0:c22
1117
1118 lxc.selinux.context.keyring
1119 Specify the SELinux context under which the container's keyring
1120 should be created. By default this is the same as lxc.selinux.con‐
1121 text, or the context lxc is executed under if lxc.selinux.con‐
1122 text has not been set.
1123
1124 lxc.selinux.context.keyring = system_u:system_r:lxc_t:s0:c22
1125
1126 KERNEL KEYRING
1127 The Linux Keyring facility is primarily a way for various kernel compo‐
1128 nents to retain or cache security data, authentication keys, encryption
1129 keys, and other data in the kernel. By default lxc will create a new
1130 session keyring for the started application.
1131
1132 lxc.keyring.session
1133 Disable the creation of a new session keyring by lxc. The started
1134 application will then inherit the current session keyring. By
1135 default, or when passing the value 1, a new keyring will be cre‐
1136 ated.
1137
1138 lxc.keyring.session = 0
1139
1140 SECCOMP CONFIGURATION
1141 A container can be started with a reduced set of available system calls
1142 by loading a seccomp profile at startup. The seccomp configuration file
1143 must begin with a version number on the first line, a policy type on
1144 the second line, followed by the configuration.
1145
1146 Versions 1 and 2 are currently supported. In version 1, the policy is a
1147 simple allowlist. The second line therefore must read "allowlist", with
1148 the rest of the file containing one (numeric) syscall number per line.
1149 Each syscall number is allowlisted, while every unlisted number is
1150 denylisted for use in the container.
1151
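       A minimal version 1 policy might look like the following (syscall
       numbers are architecture-dependent; the numbers below are purely
       illustrative):

              1
              allowlist
              0
              1
              2
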
1152 In version 2, the policy may be denylist or allowlist, supports per-
1153 rule and per-policy default actions, and supports per-architecture sys‐
1154 tem call resolution from textual names.
1155
1156 An example denylist policy, in which all system calls are allowed ex‐
1157 cept for mknod, which will simply do nothing and return 0 (success),
1158 looks like:
1159
1160 2
1161 denylist
1162 mknod errno 0
1163 ioctl notify
1164
1165
1166 Specifying "errno" as action will cause LXC to register a seccomp fil‐
1167 ter that will cause a specific errno to be returned to the caller. The
1168 errno value can be specified after the "errno" action word.
1169
1170 Specifying "notify" as action will cause LXC to register a seccomp lis‐
1171 tener and retrieve a listener file descriptor from the kernel. When a
1172 syscall is made that is registered as "notify" the kernel will generate
1173 a poll event and send a message over the file descriptor. The caller
1174 can read this message and inspect the syscall including its arguments.
1175 Based on this information the caller is expected to send back a message
1176 informing the kernel which action to take. Until that message is sent
1177 the kernel will block the calling process. The format of the messages
1178 to read and send is documented in seccomp itself.
1179
1180 lxc.seccomp.profile
1181 Specify a file containing the seccomp configuration to load be‐
1182 fore the container starts.
1183
1184 lxc.seccomp.allow_nesting
1185 If this flag is set to 1, then seccomp filters will be stacked
1186 regardless of whether a seccomp profile is already loaded. This
1187 allows nested containers to load their own seccomp profile. The
1188 default setting is 0.
1189
1190 lxc.seccomp.notify.proxy
1191 Specify a unix socket to which LXC will connect and forward sec‐
1192 comp events to. The path must be in the form
1193 unix:/path/to/socket or unix:@socket. The former specifies a
1194 path-bound unix domain socket while the latter specifies an ab‐
1195 stract unix domain socket.
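
              For example (the socket path is hypothetical):

                     lxc.seccomp.notify.proxy = unix:/run/seccomp-proxy.sock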
1196
1197 lxc.seccomp.notify.cookie
1198 An additional string sent along with proxied seccomp notifica‐
1199 tion requests.
1200
1201 PR_SET_NO_NEW_PRIVS
1202 With PR_SET_NO_NEW_PRIVS active execve() promises not to grant privi‐
1203 leges to do anything that could not have been done without the execve()
1204 call (for example, rendering the set-user-ID and set-group-ID mode
1205 bits, and file capabilities non-functional). Once set, this bit cannot
1206 be unset. The setting of this bit is inherited by children created by
1207 fork() and clone(), and preserved across execve(). Note that
1208 PR_SET_NO_NEW_PRIVS is applied after the container has changed into its
1209 intended AppArmor profile or SElinux context.
1210
1211 lxc.no_new_privs
1212 Specify whether the PR_SET_NO_NEW_PRIVS flag should be set for
1213 the container. Set to 1 to activate.
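
              For example:

                     lxc.no_new_privs = 1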
1214
1215 UID MAPPINGS
1216 A container can be started in a private user namespace with user and
1217 group id mappings. For instance, you can map userid 0 in the container
1218 to userid 200000 on the host. The root user in the container will be
1219 privileged in the container, but unprivileged on the host. Normally a
1220 system container will want a range of ids, so you would map, for in‐
1221 stance, user and group ids 0 through 20,000 in the container to the ids
1222 200,000 through 220,000.
1223
1224 lxc.idmap
1225 Four values must be provided. First a character, either 'u', or
1226 'g', to specify whether user or group ids are being mapped. Next
1227 is the first userid as seen in the user namespace of the con‐
1228 tainer. Next is the userid as seen on the host. Finally, a range
1229 indicating the number of consecutive ids to map.
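
              For example, to map the first 65536 user and group ids in the
              container onto host ids starting at 100000 (the ranges are
              illustrative):

                     lxc.idmap = u 0 100000 65536
                     lxc.idmap = g 0 100000 65536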
1230
1231 CONTAINER HOOKS
1232 Container hooks are programs or scripts which can be executed at vari‐
1233 ous times in a container's lifetime.
1234
1235 When a container hook is executed, additional information is passed
1236 along. The lxc.hook.version argument can be used to determine if the
1237 following arguments are passed as command line arguments or through en‐
1238 vironment variables. The arguments are:
1239
1240 • Container name.
1241
1242 • Section (always 'lxc').
1243
1244 • The hook type (i.e. 'clone' or 'pre-mount').
1245
1246 • Additional arguments. In the case of the clone hook, any extra argu‐
1247 ments passed will appear as further arguments to the hook. In the
1248 case of the stop hook, paths to file descriptors for each of the con‐
1249 tainer's namespaces along with their types are passed.
1250
1251 The following environment variables are set:
1252
1253 • LXC_CGNS_AWARE: indicator whether the container is cgroup namespace
1254 aware.
1255
1256 • LXC_CONFIG_FILE: the path to the container configuration file.
1257
1258 • LXC_HOOK_TYPE: the hook type (e.g. 'clone', 'mount', 'pre-mount').
1259 Note that the existence of this environment variable is conditional
1260 on the value of lxc.hook.version. If it is set to 1 then
1261 LXC_HOOK_TYPE will be set.
1262
1263 • LXC_HOOK_SECTION: the section type (e.g. 'lxc', 'net'). Note that the
1264 existence of this environment variable is conditional on the value of
1265 lxc.hook.version. If it is set to 1 then LXC_HOOK_SECTION will be
1266 set.
1267
1268 • LXC_HOOK_VERSION: the version of the hooks. This value is identical
1269 to the value of the container's lxc.hook.version config item. If it
1270 is set to 0 then old-style hooks are used. If it is set to 1 then
1271 new-style hooks are used.
1272
1273 • LXC_LOG_LEVEL: the container's log level.
1274
1275 • LXC_NAME: the container's name.
1276
1277 • LXC_[NAMESPACE IDENTIFIER]_NS: path under /proc/PID/fd/ to a file de‐
1278 scriptor referring to the container's namespace. For each preserved
1279 namespace type there will be a separate environment variable. These
1280 environment variables will only be set if lxc.hook.version is set to
1281 1.
1282
1283 • LXC_ROOTFS_MOUNT: the path to the mounted root filesystem.
1284
1285 • LXC_ROOTFS_PATH: this is the lxc.rootfs.path entry for the container.
1286 Note this is likely not where the mounted rootfs is to be found, use
1287 LXC_ROOTFS_MOUNT for that.
1288
1289 • LXC_SRC_NAME: in the case of the clone hook, this is the original
1290 container's name.
1291
1292 Standard output from the hooks is logged at debug level. Standard er‐
1293 ror is not logged, but can be captured by the hook redirecting its
1294 standard error to standard output.
1295
1296 lxc.hook.version
1297 Set this to 1 to pass the arguments in the new style via
1298 environment variables, or to 0 to pass them as arguments. This setting
1299 affects all hook arguments that were traditionally passed as
1300 arguments to the script. Specifically, it affects the container
1301 name, section (e.g. 'lxc', 'net') and hook type (e.g. 'clone',
1302 'mount', 'pre-mount') arguments. If new-style hooks are used
1303 then the arguments will be available as environment variables.
1304 The container name will be set in LXC_NAME. (This is set inde‐
1305 pendently of the value used for this config item.) The section
1306 will be set in LXC_HOOK_SECTION and the hook type will be set in
1307 LXC_HOOK_TYPE. It also affects how the paths to file descrip‐
1308 tors referring to the container's namespaces are passed. If set
1309 to 1 then for each namespace a separate environment variable
1310 LXC_[NAMESPACE IDENTIFIER]_NS will be set. If set to 0 then the
1311 paths will be passed as arguments to the stop hook.
1312
1313 lxc.hook.pre-start
1314 A hook to be run in the host's namespace before the container
1315 ttys, consoles, or mounts are up.
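
              For example (the script path is hypothetical):

                     lxc.hook.pre-start = /usr/local/bin/prepare-container.sh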
1316
1317 lxc.hook.pre-mount
1318 A hook to be run in the container's fs namespace but before the
1319 rootfs has been set up. This allows for manipulation of the
1320 rootfs, e.g. to mount an encrypted filesystem. Mounts done in
1321 this hook will not be reflected on the host (apart from mounts
1322 propagation), so they will be automatically cleaned up when the
1323 container shuts down.
1324
1325 lxc.hook.mount
1326 A hook to be run in the container's namespace after mounting has
1327 been done, but before the pivot_root.
1328
1329 lxc.hook.autodev
1330 A hook to be run in the container's namespace after mounting has
1331 been done and after any mount hooks have run, but before the
1332 pivot_root, if lxc.autodev == 1. The purpose of this hook is to
1333 assist in populating the /dev directory of the container when
1334 using the autodev option for systemd based containers. The con‐
1335 tainer's /dev directory is relative to the ${LXC_ROOTFS_MOUNT}
1336 environment variable available when the hook is run.
1337
1338 lxc.hook.start-host
1339 A hook to be run in the host's namespace after the container has
1340 been setup, and immediately before starting the container init.
1341
1342 lxc.hook.start
1343 A hook to be run in the container's namespace immediately before
1344 executing the container's init. This requires the program to be
1345 available in the container.
1346
1347 lxc.hook.stop
1348 A hook to be run in the host's namespace with references to the
1349 container's namespaces after the container has been shut down.
1350 For each namespace an extra argument is passed to the hook con‐
1351 taining the namespace's type and a filename that can be used to
1352 obtain a file descriptor to the corresponding namespace, sepa‐
1353 rated by a colon. The type is the name as it would appear in the
1354 /proc/PID/ns directory. For instance for the mount namespace
1355 the argument usually looks like mnt:/proc/PID/fd/12.
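
            The argument format above can be parsed with standard shell
            parameter expansion; the following stop-hook sketch (hypothetical,
            not shipped with LXC) prints each namespace reference it receives:

            ```shell
            #!/bin/sh
            # Hypothetical lxc.hook.stop script. Each extra argument has the
            # form "<ns-type>:<path>", e.g. "mnt:/proc/PID/fd/12".
            for ref in "$@"; do
                ns_type=${ref%%:*}   # text before the first colon, e.g. "mnt"
                ns_path=${ref#*:}    # text after the first colon
                echo "namespace ${ns_type} via ${ns_path}"
            done
            ```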
1356
1357 lxc.hook.post-stop
1358 A hook to be run in the host's namespace after the container has
1359 been shut down.
1360
1361 lxc.hook.clone
1362 A hook to be run when the container is cloned to a new one. See
1363 lxc-clone(1) for more information.
1364
1365 lxc.hook.destroy
1366 A hook to be run when the container is destroyed.
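
            Hooks are registered by pointing the relevant key at an executable
            on the host; the script paths below are illustrative:

                lxc.hook.pre-mount = /usr/local/lib/lxc-hooks/prepare-rootfs
                lxc.hook.post-stop = /usr/local/lib/lxc-hooks/cleanup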
1367
1368 CONTAINER HOOKS ENVIRONMENT VARIABLES
1369 A number of environment variables are made available to the startup
1370 hooks to provide configuration information and assist in the function‐
1371 ing of the hooks. Not all variables are valid in all contexts. In par‐
1372 ticular, all paths are relative to the host system and, as such, not
1373 valid during the lxc.hook.start hook.
1374
1375 LXC_NAME
1376 The LXC name of the container. Useful for logging messages in
1377 common log environments. [-n]
1378
1379 LXC_CONFIG_FILE
1380        Host relative path to the container configuration file. This
1381        allows the hook to reference the original, top-level
1382        configuration file for the container in order to locate any
1383        additional configuration information not otherwise made
1384        available. [-f]
1385
1386 LXC_CONSOLE
1387 The path to the console output of the container if not NULL.
1388 [-c] [lxc.console.path]
1389
1390 LXC_CONSOLE_LOGPATH
1391 The path to the console log output of the container if not NULL.
1392 [-L]
1393
1394 LXC_ROOTFS_MOUNT
1395 The mount location to which the container is initially bound.
1396 This will be the host relative path to the container rootfs for
1397 the container instance being started and is where changes should
1398 be made for that instance. [lxc.rootfs.mount]
1399
1400 LXC_ROOTFS_PATH
1401 The host relative path to the container root which has been
1402 mounted to the rootfs.mount location. [lxc.rootfs.path]
1403
1404 LXC_SRC_NAME
1405 Only for the clone hook. Is set to the original container name.
1406
1407 LXC_TARGET
1408 Only for the stop hook. Is set to "stop" for a container shut‐
1409 down or "reboot" for a container reboot.
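
            For instance, a stop hook might branch on this variable (sketch;
            hypothetical script):

            ```shell
            #!/bin/sh
            # Hypothetical lxc.hook.stop script distinguishing a final
            # shutdown from a reboot via the LXC_TARGET environment variable.
            case "${LXC_TARGET:-}" in
                stop)   echo "container ${LXC_NAME} stopped for good" ;;
                reboot) echo "container ${LXC_NAME} is rebooting" ;;
            esac
            ```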
1410
1411 LXC_CGNS_AWARE
1412 If unset, then this version of lxc is not aware of cgroup name‐
1413 spaces. If set, it will be set to 1, and lxc is aware of cgroup
1414 namespaces. Note this does not guarantee that cgroup namespaces
1415 are enabled in the kernel. This is used by the lxcfs mount hook.
1416
1417 LOGGING
1418 Logging can be configured on a per-container basis. By default, depend‐
1419 ing upon how the lxc package was compiled, container startup is logged
1420 only at the ERROR level, and logged to a file named after the container
1421 (with '.log' appended) either under the container path, or under
1422 /var/log/lxc.
1423
1424 Both the default log level and the log file can be specified in the
1425 container configuration file, overriding the default behavior. Note
1426 that the configuration file entries can in turn be overridden by the
1427 command line options to lxc-start.
1428
1429 lxc.log.level
1430 The level at which to log. The log level is an integer in the
1431 range of 0..8 inclusive, where a lower number means more verbose
1432 debugging. In particular 0 = trace, 1 = debug, 2 = info, 3 = no‐
1433 tice, 4 = warn, 5 = error, 6 = critical, 7 = alert, and 8 = fa‐
1434 tal. If unspecified, the level defaults to 5 (error), so that
1435 only errors and above are logged.
1436
1437 Note that when a script (such as either a hook script or a net‐
1438 work interface up or down script) is called, the script's stan‐
1439 dard output is logged at level 1, debug.
1440
1441 lxc.log.file
1442 The file to which logging info should be written.
1443
1444 lxc.log.syslog
1445 Send logging info to syslog. It respects the log level defined
1446 in lxc.log.level. The argument should be the syslog facility to
1447        use; valid ones are: daemon, local0, local1, local2, local3,
1448        local4, local5, local6, local7.
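
            For example, to log at debug level to a custom file and mirror
            messages to the daemon syslog facility (the log path is
            illustrative):

                lxc.log.level = 1
                lxc.log.file = /var/log/lxc/mycontainer.log
                lxc.log.syslog = daemon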
1449
1450 AUTOSTART
1451 The autostart options support marking which containers should be auto-
1452 started and in what order. These options may be used by LXC tools di‐
1453 rectly or by external tooling provided by the distributions.
1454
1455 lxc.start.auto
1456 Whether the container should be auto-started. Valid values are
1457 0 (off) and 1 (on).
1458
1459 lxc.start.delay
1460 How long to wait (in seconds) after the container is started be‐
1461 fore starting the next one.
1462
1463 lxc.start.order
1464 An integer used to sort the containers when auto-starting a se‐
1465 ries of containers at once. A lower value means an earlier
1466 start.
1467
1468 lxc.monitor.unshare
1469        If not zero, the mount namespace will be unshared from the host
1470 before initializing the container (before running any pre-start
1471 hooks). This requires the CAP_SYS_ADMIN capability at startup.
1472 Default is 0.
1473
1474 lxc.monitor.signal.pdeath
1475 Set the signal to be sent to the container's init when the lxc
1476 monitor exits. By default it is set to SIGKILL which will cause
1477 all container processes to be killed when the lxc monitor
1478 process dies. To ensure that containers stay alive even if lxc
1479        monitor dies, set this to 0.
1480
1481 lxc.group
1482 A multi-value key (can be used multiple times) to put the con‐
1483 tainer in a container group. Those groups can then be used
1484 (amongst other things) to start a series of related containers.
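
            Putting these keys together, a container meant to start early at
            boot might carry the following (values are illustrative):

                lxc.start.auto = 1
                lxc.start.delay = 5
                lxc.start.order = 10
                lxc.group = onboot
                lxc.group = web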
1485
1486 AUTOSTART AND SYSTEM BOOT
1487 Each container can be part of any number of groups or no group at all.
1488 Two groups are special. One is the NULL group, i.e. the container does
1489 not belong to any group. The other group is the "onboot" group.
1490
1491 When the system boots with the LXC service enabled, it will first at‐
1492       tempt to boot any containers with lxc.start.auto == 1 that are members
1493       of the "onboot" group. Startup proceeds in order of lxc.start.order.
1494 If an lxc.start.delay has been specified, that delay will be honored
1495 before attempting to start the next container to give the current con‐
1496 tainer time to begin initialization and reduce overloading the host
1497 system. After starting the members of the "onboot" group, the LXC sys‐
1498 tem will proceed to boot containers with lxc.start.auto == 1 which are
1499 not members of any group (the NULL group) and proceed as with the on‐
1500 boot group.
1501
1502 CONTAINER ENVIRONMENT
1503 If you want to pass environment variables into the container (that is,
1504 environment variables which will be available to init and all of its
1505       descendants), you can use lxc.environment parameters to do so. Be
1506       careful that you do not pass in anything sensitive; any process in
1507       the container which doesn't have its environment scrubbed will have these
1508 variables available to it, and environment variables are always avail‐
1509 able via /proc/PID/environ.
1510
1511 This configuration parameter can be specified multiple times; once for
1512 each environment variable you wish to configure.
1513
1514 lxc.environment
1515 Specify an environment variable to pass into the container. Ex‐
1516 ample:
1517
1518 lxc.environment = APP_ENV=production
1519 lxc.environment = SYSLOG_SERVER=192.0.2.42
1520
1521
1522 It is possible to inherit host environment variables by setting
1523 the name of the variable without a "=" sign. For example:
1524
1525 lxc.environment = PATH
1526
1527
1528 EXAMPLES
1529    In addition to the few examples given below, you will find some other
1530    examples of configuration files in /usr/share/doc/lxc/examples.
1531
1532 NETWORK
1533    This configuration sets up a container to use a veth pair device, with
1534    one side plugged into a bridge br0 (configured beforehand on the system
1535    by the administrator). The virtual network device visible in
1536 the container is renamed to eth0.
1537
1538 lxc.uts.name = myhostname
1539 lxc.net.0.type = veth
1540 lxc.net.0.flags = up
1541 lxc.net.0.link = br0
1542 lxc.net.0.name = eth0
1543 lxc.net.0.hwaddr = 4a:49:43:49:79:bf
1544 lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
1545 lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
1546
1547
1548 UID/GID MAPPING
1549 This configuration will map both user and group ids in the range 0-9999
1550 in the container to the ids 100000-109999 on the host.
1551
1552 lxc.idmap = u 0 100000 10000
1553 lxc.idmap = g 0 100000 10000
1554
1555
1556 CONTROL GROUP
1557    This configuration will set up several control group settings for the
1558    application: cpuset.cpus restricts usage to the defined CPUs,
1559    cpu.shares prioritizes the control group, and devices.allow makes the
1560    specified devices usable.
1561
1562 lxc.cgroup.cpuset.cpus = 0,1
1563 lxc.cgroup.cpu.shares = 1234
1564 lxc.cgroup.devices.deny = a
1565 lxc.cgroup.devices.allow = c 1:3 rw
1566 lxc.cgroup.devices.allow = b 8:0 rw
1567
1568
1569 COMPLEX CONFIGURATION
1570    This example shows a complex configuration building a complex network
1571    stack, using control groups, setting a new hostname, mounting some
1572    locations and changing the root file system.
1573
1574 lxc.uts.name = complex
1575 lxc.net.0.type = veth
1576 lxc.net.0.flags = up
1577 lxc.net.0.link = br0
1578 lxc.net.0.hwaddr = 4a:49:43:49:79:bf
1579 lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
1580 lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
1581 lxc.net.0.ipv6.address = 2003:db8:1:0:214:5432:feab:3588
1582 lxc.net.1.type = macvlan
1583 lxc.net.1.flags = up
1584 lxc.net.1.link = eth0
1585 lxc.net.1.hwaddr = 4a:49:43:49:79:bd
1586 lxc.net.1.ipv4.address = 10.2.3.4/24
1587 lxc.net.1.ipv4.address = 192.168.10.125/24
1588 lxc.net.1.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3596
1589 lxc.net.2.type = phys
1590 lxc.net.2.flags = up
1591 lxc.net.2.link = random0
1592 lxc.net.2.hwaddr = 4a:49:43:49:79:ff
1593 lxc.net.2.ipv4.address = 10.2.3.6/24
1594 lxc.net.2.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3297
1595 lxc.cgroup.cpuset.cpus = 0,1
1596 lxc.cgroup.cpu.shares = 1234
1597 lxc.cgroup.devices.deny = a
1598 lxc.cgroup.devices.allow = c 1:3 rw
1599 lxc.cgroup.devices.allow = b 8:0 rw
1600 lxc.mount.fstab = /etc/fstab.complex
1601 lxc.mount.entry = /lib /root/myrootfs/lib none ro,bind 0 0
1602 lxc.rootfs.path = dir:/mnt/rootfs.complex
1603 lxc.rootfs.options = idmap=container
1604 lxc.cap.drop = sys_module mknod setuid net_raw
1605 lxc.cap.drop = mac_override
1606
1607
1608 SEE ALSO
1609    chroot(1), pivot_root(8), fstab(5), capabilities(7), lxc(7),
1610    lxc-create(1), lxc-copy(1), lxc-destroy(1), lxc-start(1), lxc-stop(1),
1611    lxc-execute(1), lxc-console(1), lxc-monitor(1), lxc-wait(1),
1612    lxc-cgroup(1), lxc-ls(1), lxc-info(1), lxc-freeze(1), lxc-unfreeze(1),
1613    lxc-attach(1), lxc.conf(5)
1616
1617 AUTHOR
1618    Daniel Lezcano <daniel.lezcano@free.fr>
1619
1620
1621
1622 2021-09-18 lxc.container.conf(5)