lxc.container.conf(5)                                    lxc.container.conf(5)

NAME
       lxc.container.conf - LXC container configuration file

DESCRIPTION
       LXC is the well-known and heavily tested low-level Linux container
       runtime. It has been in active development since 2008 and has proven
       itself in critical production environments world-wide. Some of its
       core contributors are the same people that helped to implement
       various well-known containerization features inside the Linux
       kernel.

       LXC's main focus is system containers. That is, containers which
       offer an environment as close as possible to the one you'd get from
       a VM, but without the overhead that comes with running a separate
       kernel and simulating all the hardware.

       This is achieved through a combination of kernel security features
       such as namespaces, mandatory access control and control groups.

       LXC has support for unprivileged containers. Unprivileged containers
       are containers that are run without any privilege. This requires
       support for user namespaces in the kernel that the container is run
       on. LXC was the first runtime to support unprivileged containers
       after user namespaces were merged into the mainline kernel.

       In essence, user namespaces isolate given sets of UIDs and GIDs.
       This is achieved by establishing a mapping between a range of UIDs
       and GIDs on the host and a different (unprivileged) range of UIDs
       and GIDs in the container. The kernel translates this mapping in
       such a way that, inside the container, all UIDs and GIDs appear as
       you would expect on the host, whereas on the host these UIDs and
       GIDs are in fact unprivileged. For example, a process running as UID
       and GID 0 inside the container might appear as UID and GID 100000 on
       the host. The implementation and working details can be gathered
       from the corresponding user namespace man page. UID and GID mappings
       can be defined with the lxc.idmap key.
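
       As an illustrative sketch (the exact ranges are site-specific), a
       mapping that places the first 65536 container UIDs and GIDs onto the
       host range starting at 100000 could be written as:

              lxc.idmap = u 0 100000 65536
              lxc.idmap = g 0 100000 65536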

       Linux containers are defined with a simple configuration file. Each
       option in the configuration file has the form key = value and fits
       on one line. A line starting with "#" is a comment. List options,
       like capabilities and cgroups options, can be used with no value to
       clear any previously defined values of that option.

       LXC namespace configuration keys use single dots. This means complex
       configuration keys such as lxc.net.0 expose various subkeys such as
       lxc.net.0.type, lxc.net.0.link, lxc.net.0.ipv6.address, and others
       for even more fine-grained configuration.

CONFIGURATION
       In order to ease administration of multiple related containers, it
       is possible to have a container configuration file cause another
       file to be loaded. For instance, network configuration can be
       defined in one common file which is included by multiple containers.
       Then, if the containers are moved to another host, only one file may
       need to be updated.

       lxc.include
              Specify the file to be included. The included file must be in
              the same valid lxc configuration file format.
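
       For example, assuming shared network settings had been written to a
       file such as /etc/lxc/common.net.conf (a hypothetical path), each
       container could pull them in with:

              lxc.include = /etc/lxc/common.net.conf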

ARCHITECTURE
       Allows one to set the architecture for the container. For example,
       set a 32-bit architecture for a container running 32-bit binaries on
       a 64-bit host. This fixes the container scripts which rely on the
       architecture to do some work like downloading the packages.

       lxc.arch
              Specify the architecture for the container.

              Some valid options are x86, i686, x86_64, amd64.
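
       For example, to run a 32-bit userspace on a 64-bit host one might
       set:

              lxc.arch = i686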

HOSTNAME
       The utsname section defines the hostname to be set for the
       container. That means the container can set its own hostname without
       changing the one of the system. That makes the hostname private to
       the container.

       lxc.uts.name
              Specify the hostname for the container.
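
       For example (the name is chosen arbitrarily):

              lxc.uts.name = webserver01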

HALT SIGNAL
       Allows one to specify the signal name or number sent to the
       container's init process to cleanly shut down the container.
       Different init systems could use different signals to perform a
       clean shutdown sequence. This option allows the signal to be
       specified in kill(1) fashion, e.g. SIGPWR, SIGRTMIN+14, SIGRTMAX-10
       or a plain number. The default signal is SIGPWR.

       lxc.signal.halt
              Specify the signal used to halt the container.
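
       For example, a container whose init expects SIGRTMIN+3 for an
       orderly shutdown (as systemd-based containers commonly do) could be
       configured with:

              lxc.signal.halt = SIGRTMIN+3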

REBOOT SIGNAL
       Allows one to specify a signal name or number to reboot the
       container. This option allows the signal to be specified in kill(1)
       fashion, e.g. SIGTERM, SIGRTMIN+14, SIGRTMAX-10 or a plain number.
       The default signal is SIGINT.

       lxc.signal.reboot
              Specify the signal used to reboot the container.

STOP SIGNAL
       Allows one to specify a signal name or number to forcibly shut down
       the container. This option allows the signal to be specified in
       kill(1) fashion, e.g. SIGKILL, SIGRTMIN+14, SIGRTMAX-10 or a plain
       number. The default signal is SIGKILL.

       lxc.signal.stop
              Specify the signal used to stop the container.

INIT COMMAND
       Sets the command to use as the init system for the container.

       lxc.execute.cmd
              Absolute path from the container rootfs to the binary to run
              by default. This mostly makes sense for lxc-execute.

       lxc.init.cmd
              Absolute path from the container rootfs to the binary to use
              as init. This mostly makes sense for lxc-start. The default
              is /sbin/init.
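
       For example, an application container could run a custom supervisor
       instead of a full init (the path is purely illustrative):

              lxc.init.cmd = /usr/bin/myapp-supervisor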

INIT WORKING DIRECTORY
       Sets an absolute path inside the container as the working directory
       for the container. LXC will switch to this directory before
       executing init.

       lxc.init.cwd
              Absolute path inside the container to use as the working
              directory.

INIT ID
       Sets the UID/GID to use for the init system, and subsequent
       commands. Note that using a non-root UID when booting a system
       container will likely not work due to missing privileges. Setting
       the UID/GID is mostly useful when running application containers.
       Defaults to: UID(0), GID(0).

       lxc.init.uid
              UID to use for init.

       lxc.init.gid
              GID to use for init.
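
       For example, an application container could start its payload as an
       unprivileged user (the IDs are chosen arbitrarily):

              lxc.init.uid = 1000
              lxc.init.gid = 1000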

CORE SCHEDULING
       Core scheduling defines whether the container payload is marked as
       schedulable on the same core. Doing so will cause the kernel
       scheduler to ensure that tasks that are not in the same group never
       run simultaneously on a core. This can serve as an extra security
       measure to prevent the container payload from mounting
       cross-hyperthread attacks.

       lxc.sched.core
              The only allowed values are 0 and 1. Set this to 1 to create
              a core scheduling domain for the container or 0 to not create
              one. If not set explicitly, no core scheduling domain will be
              created for the container.

PROC
       Configure the proc filesystem for the container.

       lxc.proc.[proc file name]
              Specify the proc file name to be set. The file names
              available are those listed under /proc/PID/. Example:

              lxc.proc.oom_score_adj = 10

EPHEMERAL
       Allows one to specify whether a container will be destroyed on
       shutdown.

       lxc.ephemeral
              The only allowed values are 0 and 1. Set this to 1 to destroy
              a container on shutdown.

NETWORK
       The network section defines how the network is virtualized in the
       container. The network virtualization acts at layer two. In order to
       use the network virtualization, parameters must be specified to
       define the network interfaces of the container. Several virtual
       interfaces can be assigned and used in a container even if the
       system has only one physical network interface.

       lxc.net
              may be used without a value to clear all previous network
              options.

       lxc.net.[i].type
              Specify what kind of network virtualization is to be used for
              the container. Must be specified before any other option(s)
              on the net device. Multiple networks can be specified by
              using an additional index i after all lxc.net.* keys. For
              example, lxc.net.0.type = veth and lxc.net.1.type = veth
              specify two different networks of the same type. All keys
              sharing the same index i will be treated as belonging to the
              same network. For example, lxc.net.0.link = br0 will belong
              to lxc.net.0.type. Currently, the different virtualization
              types can be:

              none: will cause the container to share the host's network
              namespace. This means the host network devices are usable in
              the container. It also means that if both the container and
              host have upstart as init, 'halt' in a container (for
              instance) will shut down the host. Note that unprivileged
              containers do not work with this setting due to an inability
              to mount sysfs. An unsafe workaround would be to bind mount
              the host's sysfs.

              empty: will create only the loopback interface.

              veth: a virtual ethernet pair device is created with one side
              assigned to the container and the other side on the host.
              lxc.net.[i].veth.mode specifies the mode the veth parent will
              use on the host. The accepted modes are bridge and router.
              The mode defaults to bridge if not specified. In bridge mode
              the host side is attached to a bridge specified by the
              lxc.net.[i].link option. If the bridge link is not specified,
              then the veth pair device will be created but not attached to
              any bridge. Otherwise, the bridge has to be created on the
              system before starting the container. lxc won't handle any
              configuration outside of the container. In router mode
              static routes are created on the host for the container's IP
              addresses pointing to the host side veth interface.
              Additionally, Proxy ARP and Proxy NDP entries are added on
              the host side veth interface for the gateway IPs defined in
              the container to allow the container to reach the host. By
              default, lxc chooses a name for the network device belonging
              to the outside of the container, but if you wish to handle
              this name yourself, you can tell lxc to set a specific name
              with the lxc.net.[i].veth.pair option (except for
              unprivileged containers where this option is ignored for
              security reasons). Static routes can be added on the host
              pointing to the container using the
              lxc.net.[i].veth.ipv4.route and lxc.net.[i].veth.ipv6.route
              options. Several lines specify several routes. The route is
              in format x.y.z.t/m, e.g. 192.168.1.0/24.
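
       As a sketch, a typical bridged veth setup (assuming a bridge named
       br0 already exists on the host) might look like:

              lxc.net.0.type = veth
              lxc.net.0.link = br0
              lxc.net.0.flags = up
              lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx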

              vlan: a vlan interface is linked with the interface specified
              by the lxc.net.[i].link and assigned to the container. The
              vlan identifier is specified with the option
              lxc.net.[i].vlan.id.

              macvlan: a macvlan interface is linked with the interface
              specified by the lxc.net.[i].link and assigned to the
              container. lxc.net.[i].macvlan.mode specifies the mode the
              macvlan will use to communicate between different macvlan
              devices on the same upper device. The accepted modes are
              private, vepa, bridge and passthru. In private mode, the
              device never communicates with any other device on the same
              upper_dev (default). In vepa mode, the new Virtual Ethernet
              Port Aggregator (VEPA) mode, it assumes that the adjacent
              bridge returns all frames where both source and destination
              are local to the macvlan port, i.e. the bridge is set up as a
              reflective relay. Broadcast frames coming in from the
              upper_dev get flooded to all macvlan interfaces in VEPA mode;
              local frames are not delivered locally. In bridge mode, it
              provides the behavior of a simple bridge between different
              macvlan interfaces on the same port. Frames from one
              interface to another one get delivered directly and are not
              sent out externally. Broadcast frames get flooded to all
              other bridge ports and to the external interface, but when
              they come back from a reflective relay, we don't deliver them
              again. Since we know all the MAC addresses, the macvlan
              bridge mode does not require learning or STP like the bridge
              module does. In passthru mode, all frames received by the
              physical interface are forwarded to the macvlan interface.
              Only one macvlan interface in passthru mode is possible for
              one physical interface.

              ipvlan: an ipvlan interface is linked with the interface
              specified by the lxc.net.[i].link and assigned to the
              container. lxc.net.[i].ipvlan.mode specifies the mode the
              ipvlan will use to communicate between different ipvlan
              devices on the same upper device. The accepted modes are l3,
              l3s and l2. It defaults to l3 mode. In l3 mode TX processing
              up to L3 happens on the stack instance attached to the
              dependent device and packets are switched to the stack
              instance of the parent device for the L2 processing, and
              routing from that instance will be used before packets are
              queued on the outbound device. In this mode the dependent
              devices will not receive nor can they send multicast /
              broadcast traffic. In l3s mode TX processing is very similar
              to the l3 mode except that iptables (conn-tracking) works in
              this mode and hence it is L3-symmetric (l3s). This will have
              slightly less performance but that shouldn't matter since you
              are choosing this mode over plain-l3 mode to make
              conn-tracking work. In l2 mode TX processing happens on the
              stack instance attached to the dependent device and packets
              are switched and queued to the parent device to be sent out.
              In this mode the dependent devices will RX/TX multicast and
              broadcast (if applicable) as well.
              lxc.net.[i].ipvlan.isolation specifies the isolation mode.
              The accepted isolation values are bridge, private and vepa.
              It defaults to bridge. In bridge isolation mode dependent
              devices can cross-talk among themselves apart from talking
              through the parent device. In private isolation mode the
              port is set in private mode, i.e. the port won't allow cross
              communication between dependent devices. In vepa isolation
              mode the port is set in VEPA mode, i.e. the port will offload
              switching functionality to the external entity as described
              in 802.1Qbg.

              phys: an already existing interface specified by the
              lxc.net.[i].link is assigned to the container.

       lxc.net.[i].flags
              Specify an action to do for the network.

              up: activates the interface.

       lxc.net.[i].link
              Specify the interface to be used for real network traffic.

       lxc.net.[i].l2proxy
              Controls whether layer 2 IP neighbour proxy entries will be
              added to the lxc.net.[i].link interface for the IP addresses
              of the container. Can be set to 0 or 1. Defaults to 0. When
              used with IPv4 addresses, the following sysctl value needs to
              be set:
              net.ipv4.conf.[link].forwarding=1
              When used with IPv6 addresses, the following sysctl values
              need to be set:
              net.ipv6.conf.[link].proxy_ndp=1
              net.ipv6.conf.[link].forwarding=1

       lxc.net.[i].mtu
              Specify the maximum transfer unit for this interface.

       lxc.net.[i].name
              The interface name is dynamically allocated, but if another
              name is needed because the configuration files being used by
              the container use a generic name, e.g. eth0, this option will
              rename the interface in the container.

       lxc.net.[i].hwaddr
              The interface MAC address is dynamically allocated by default
              to the virtual interface, but in some cases this is needed to
              resolve a MAC address conflict or to always have the same
              link-local ipv6 address. Any "x" in the address will be
              replaced by a random value; this allows setting hwaddr
              templates.

       lxc.net.[i].ipv4.address
              Specify the ipv4 address to assign to the virtualized
              interface. Several lines specify several ipv4 addresses. The
              address is in format x.y.z.t/m, e.g. 192.168.1.123/24. You
              can optionally specify the broadcast address after the IP
              address, e.g. 192.168.1.123/24 255.255.255.255. Otherwise it
              is automatically calculated from the IP address.
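
       For example, a statically addressed interface (the addresses are
       illustrative) could be configured as:

              lxc.net.0.ipv4.address = 192.168.1.123/24
              lxc.net.0.ipv4.gateway = 192.168.1.1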

       lxc.net.[i].ipv4.gateway
              Specify the ipv4 address to use as the gateway inside the
              container. The address is in format x.y.z.t, e.g.
              192.168.1.123. Can also have the special value auto, which
              means to take the primary address from the bridge interface
              (as specified by the lxc.net.[i].link option) and use that as
              the gateway. auto is only available when using the veth,
              macvlan and ipvlan network types. Can also have the special
              value of dev, which means to set the default gateway as a
              device route. This is primarily for use with layer 3 network
              modes, such as IPVLAN.

       lxc.net.[i].ipv6.address
              Specify the ipv6 address to assign to the virtualized
              interface. Several lines specify several ipv6 addresses. The
              address is in format x::y/m, e.g.
              2003:db8:1:0:214:1234:fe0b:3596/64.

       lxc.net.[i].ipv6.gateway
              Specify the ipv6 address to use as the gateway inside the
              container. The address is in format x::y, e.g.
              2003:db8:1:0::1. Can also have the special value auto, which
              means to take the primary address from the bridge interface
              (as specified by the lxc.net.[i].link option) and use that as
              the gateway. auto is only available when using the veth,
              macvlan and ipvlan network types. Can also have the special
              value of dev, which means to set the default gateway as a
              device route. This is primarily for use with layer 3 network
              modes, such as IPVLAN.

       lxc.net.[i].script.up
              Add a configuration option to specify a script to be executed
              after creating and configuring the network used from the host
              side.

              In addition to the information available to all hooks, the
              following information is provided to the script:

              • LXC_HOOK_TYPE: the hook type. This is either 'up' or
                'down'.

              • LXC_HOOK_SECTION: the section type 'net'.

              • LXC_NET_TYPE: the network type. This is one of the valid
                network types listed here (e.g. 'vlan', 'macvlan',
                'ipvlan', 'veth').

              • LXC_NET_PARENT: the parent device on the host. This is only
                set for network types 'macvlan', 'veth', 'phys'.

              • LXC_NET_PEER: the name of the peer device on the host. This
                is only set for 'veth' network types. Note that this
                information is only available when lxc.hook.version is set
                to 1.

              Whether this information is provided in the form of
              environment variables or as arguments to the script depends
              on the value of lxc.hook.version. If set to 1 then
              information is provided in the form of environment variables.
              If set to 0 information is provided as arguments to the
              script.

              Standard output from the script is logged at debug level.
              Standard error is not logged, but can be captured by the hook
              redirecting its standard error to standard output.

       lxc.net.[i].script.down
              Add a configuration option to specify a script to be executed
              before destroying the network used from the host side.

              In addition to the information available to all hooks, the
              following information is provided to the script:

              • LXC_HOOK_TYPE: the hook type. This is either 'up' or
                'down'.

              • LXC_HOOK_SECTION: the section type 'net'.

              • LXC_NET_TYPE: the network type. This is one of the valid
                network types listed here (e.g. 'vlan', 'macvlan',
                'ipvlan', 'veth').

              • LXC_NET_PARENT: the parent device on the host. This is only
                set for network types 'macvlan', 'veth', 'phys'.

              • LXC_NET_PEER: the name of the peer device on the host. This
                is only set for 'veth' network types. Note that this
                information is only available when lxc.hook.version is set
                to 1.

              Whether this information is provided in the form of
              environment variables or as arguments to the script depends
              on the value of lxc.hook.version. If set to 1 then
              information is provided in the form of environment variables.
              If set to 0 information is provided as arguments to the
              script.

              Standard output from the script is logged at debug level.
              Standard error is not logged, but can be captured by the hook
              redirecting its standard error to standard output.

NEW PSEUDO TTY INSTANCE (DEVPTS)
       For stricter isolation the container can have its own private
       instance of the pseudo tty.

       lxc.pty.max
              If set, the container will have a new pseudo tty instance,
              making this private to it. The value specifies the maximum
              number of pseudo ttys allowed for a pty instance (this
              limitation is not implemented yet).

CONTAINER SYSTEM CONSOLE
       If the container is configured with a root filesystem and the
       inittab file is setup to use the console, you may want to specify
       where the output of this console goes.

       lxc.console.buffer.size
              Setting this option instructs liblxc to allocate an in-memory
              ringbuffer. The container's console output will be written to
              the ringbuffer. Note that the ringbuffer must be at least as
              big as a standard page size. When passed a value smaller than
              a single page size liblxc will allocate a ringbuffer of a
              single page size. A page size is usually 4KB. The keyword
              'auto' will cause liblxc to allocate a ringbuffer of 128KB.
              When manually specifying a size for the ringbuffer the value
              should be a power of 2 when converted to bytes. Valid size
              prefixes are 'KB', 'MB', 'GB'. (Note that all conversions are
              based on multiples of 1024. That means 'KB' == 'KiB', 'MB' ==
              'MiB', 'GB' == 'GiB'. Additionally, the case of the suffix is
              ignored, i.e. 'kB', 'KB' and 'Kb' are treated equally.)
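
       For example, to keep the last 128KB of console output in an
       in-memory ringbuffer and mirror it to an equally sized log file
       (the path is illustrative):

              lxc.console.buffer.size = auto
              lxc.console.size = auto
              lxc.console.logfile = /var/log/lxc/c1-console.log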

       lxc.console.size
              Setting this option instructs liblxc to place a limit on the
              size of the console log file specified in
              lxc.console.logfile. Note that the size of the log file must
              be at least as big as a standard page size. When passed a
              value smaller than a single page size liblxc will set the
              size of the log file to a single page size. A page size is
              usually 4KB. The keyword 'auto' will cause liblxc to place a
              limit of 128KB on the log file. When manually specifying a
              size for the log file the value should be a power of 2 when
              converted to bytes. Valid size prefixes are 'KB', 'MB', 'GB'.
              (Note that all conversions are based on multiples of 1024.
              That means 'KB' == 'KiB', 'MB' == 'MiB', 'GB' == 'GiB'.
              Additionally, the case of the suffix is ignored, i.e. 'kB',
              'KB' and 'Kb' are treated equally.) If users want to mirror
              the console ringbuffer on disk they should set
              lxc.console.size equal to lxc.console.buffer.size.

       lxc.console.logfile
              Specify a path to a file where the console output will be
              written. Note that in contrast to the on-disk ringbuffer
              logfile this file will keep growing, potentially filling up
              the user's disks if not rotated and deleted. This problem can
              also be avoided by using the in-memory ringbuffer options
              lxc.console.buffer.size and lxc.console.buffer.logfile.

       lxc.console.rotate
              Whether to rotate the console logfile specified in
              lxc.console.logfile. Users can send an API request to rotate
              the logfile. Note that the old logfile will have the same
              name as the original with the suffix ".1" appended. Users
              wishing to prevent the console log file from filling the disk
              should rotate the logfile and delete it if unneeded. This
              problem can also be avoided by using the in-memory ringbuffer
              options lxc.console.buffer.size and
              lxc.console.buffer.logfile.

       lxc.console.path
              Specify a path to a device to which the console will be
              attached. The keyword 'none' will simply disable the console.
              Note, when specifying 'none' and creating a device node for
              the console in the container at /dev/console, or
              bind-mounting the host's /dev/console into the container at
              /dev/console, the container will have direct access to the
              host's /dev/console. This is dangerous when the container has
              write access to the device and should thus be used with
              caution.

CONSOLE THROUGH THE TTYS
       This option is useful if the container is configured with a root
       filesystem and the inittab file is setup to launch a getty on the
       ttys. The option specifies the number of ttys to be available for
       the container. The number of gettys in the inittab file of the
       container should not be greater than the number of ttys specified in
       this option, otherwise the excess getty sessions will die and
       respawn indefinitely giving annoying messages on the console or in
       /var/log/messages.

       lxc.tty.max
              Specify the number of ttys to make available to the
              container.
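
       For example, to provide four ttys to the container:

              lxc.tty.max = 4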

CONSOLE DEVICES LOCATION
       LXC consoles are provided through Unix98 PTYs created on the host
       and bind-mounted over the expected devices in the container. By
       default, they are bind-mounted over /dev/console and /dev/ttyN. This
       can prevent package upgrades in the guest. Therefore you can specify
       a directory location (under /dev) under which LXC will create the
       files and bind-mount over them. These will then be symbolically
       linked to /dev/console and /dev/ttyN. A package upgrade can then
       succeed as it is able to remove and replace the symbolic links.

       lxc.tty.dir
              Specify a directory under /dev under which to create the
              container console devices. Note that LXC will move any
              bind-mounts or device nodes for /dev/console into this
              directory.

/DEV DIRECTORY
       By default, lxc creates a few symbolic links (fd, stdin, stdout,
       stderr) in the container's /dev directory but does not automatically
       create device node entries. This allows the container's /dev to be
       set up as needed in the container rootfs. If lxc.autodev is set to
       1, then after mounting the container's rootfs LXC will mount a fresh
       tmpfs under /dev (limited to 500K by default, unless defined in
       lxc.autodev.tmpfs.size) and fill in a minimal set of initial
       devices. This is generally required when starting a container
       containing a "systemd" based "init" but may be optional at other
       times. Additional devices in the container's /dev directory may be
       created through the use of the lxc.hook.autodev hook.

       lxc.autodev
              Set this to 0 to stop LXC from mounting and populating a
              minimal /dev when starting the container.

       lxc.autodev.tmpfs.size
              Set this to define the size of the /dev tmpfs. The default
              value is 500000 (500K). If the parameter is used but without
              a value, the default value is used.
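
       For example, to populate a minimal /dev backed by a larger tmpfs
       (the size is chosen arbitrarily):

              lxc.autodev = 1
              lxc.autodev.tmpfs.size = 1000000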

MOUNT POINTS
       The mount points section specifies the different places to be
       mounted. These mount points will be private to the container and
       won't be visible by the processes running outside of the container.
       This is useful to mount /etc, /var or /home, for example.

       NOTE - LXC will generally ensure that mount targets and relative
       bind-mount sources are properly confined under the container root,
       to avoid attacks involving over-mounting host directories and files.
       (Symbolic links in absolute mount sources are ignored.) However, if
       the container configuration first mounts a directory which is under
       the control of the container user, such as /home/joe, into the
       container at some path, and then mounts under that path, then a
       TOCTTOU attack would be possible where the container user modifies a
       symbolic link under their home directory at just the right time.

       lxc.mount.fstab
              Specify a file location in the fstab format, containing the
              mount information. The mount target location can, and in most
              cases should, be a relative path, which will become relative
              to the mounted container root. For instance,

              proc proc proc nodev,noexec,nosuid 0 0

              will mount a proc filesystem under the container's /proc,
              regardless of where the root filesystem comes from. This is
              resilient to block device backed filesystems as well as
              container cloning.

              Note that when mounting a filesystem from an image file or
              block device the third field (fs_vfstype) cannot be auto as
              with mount(8) but must be explicitly specified.

       lxc.mount.entry
              Specify a mount point corresponding to a line in the fstab
              format. Moreover lxc supports mount propagation, such as
              rshared or rprivate, and adds three additional mount options:
              optional (don't fail if the mount does not work), create=dir
              or create=file (create the dir or file when the point will be
              mounted), and relative (the source path is taken to be
              relative to the mounted container root). For instance,

              dev/null proc/kcore none bind,relative 0 0

              will expand dev/null to ${LXC_ROOTFS_MOUNT}/dev/null, and
              mount it to proc/kcore inside the container.

       lxc.mount.auto
              Specify which standard kernel file systems should be
              automatically mounted. This may dramatically simplify the
              configuration. The file systems are:

              • proc:mixed (or proc): mount /proc as read-write, but
                remount /proc/sys and /proc/sysrq-trigger read-only for
                security / container isolation purposes.

              • proc:rw: mount /proc as read-write.

              • sys:mixed (or sys): mount /sys as read-only but with
                /sys/devices/virtual/net writable.

              • sys:ro: mount /sys as read-only for security / container
                isolation purposes.

              • sys:rw: mount /sys as read-write.

              • cgroup:mixed: Mount a tmpfs to /sys/fs/cgroup, create
                directories for all hierarchies to which the container is
                added, create subdirectories in those hierarchies with the
                name of the cgroup, and bind-mount the container's own
                cgroup into that directory. The container will be able to
                write to its own cgroup directory, but not the parents,
                since they will be remounted read-only.

              • cgroup:mixed:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup:mixed.
                This is mainly useful when cgroup namespaces are enabled,
                where LXC will normally leave mounting cgroups to the init
                binary of the container since it is perfectly safe to do
                so.

              • cgroup:ro: similar to cgroup:mixed, but everything will be
                mounted read-only.

              • cgroup:ro:force: The force option will cause LXC to perform
                the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup:ro. This
                is mainly useful when cgroup namespaces are enabled, where
                LXC will normally leave mounting cgroups to the init binary
                of the container since it is perfectly safe to do so.

              • cgroup:rw: similar to cgroup:mixed, but everything will be
                mounted read-write. Note that the paths leading up to the
                container's own cgroup will be writable, but will not be a
                cgroup filesystem, just part of the tmpfs of
                /sys/fs/cgroup.

              • cgroup:rw:force: The force option will cause LXC to perform
                the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup:rw. This
                is mainly useful when cgroup namespaces are enabled, where
                LXC will normally leave mounting cgroups to the init binary
                of the container since it is perfectly safe to do so.

              • cgroup (without specifier): defaults to cgroup:rw if the
                container retains the CAP_SYS_ADMIN capability,
                cgroup:mixed otherwise.

              • cgroup-full:mixed: mount a tmpfs to /sys/fs/cgroup, create
                directories for all hierarchies to which the container is
                added, bind-mount the hierarchies from the host to the
                container and make everything read-only except the
                container's own cgroup. Note that compared to cgroup, where
                all paths leading up to the container's own cgroup are just
                simple directories in the underlying tmpfs, here
                /sys/fs/cgroup/$hierarchy will contain the host's full
                cgroup hierarchy, albeit read-only outside the container's
                own cgroup. This may leak quite a bit of information into
                the container.

              • cgroup-full:mixed:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to
                cgroup-full:mixed. This is mainly useful when cgroup
                namespaces are enabled, where LXC will normally leave
                mounting cgroups to the init binary of the container since
                it is perfectly safe to do so.

              • cgroup-full:ro: similar to cgroup-full:mixed, but
                everything will be mounted read-only.

              • cgroup-full:ro:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup-full:ro.
                This is mainly useful when cgroup namespaces are enabled,
                where LXC will normally leave mounting cgroups to the init
                binary of the container since it is perfectly safe to do
                so.

              • cgroup-full:rw: similar to cgroup-full:mixed, but
                everything will be mounted read-write. Note that in this
                case, the container may escape its own cgroup. (Note also
                that if the container has CAP_SYS_ADMIN support and can
                mount the cgroup filesystem itself, it may do so anyway.)

              • cgroup-full:rw:force: The force option will cause LXC to
                perform the cgroup mounts for the container under all
                circumstances. Otherwise it is similar to cgroup-full:rw.
                This is mainly useful when cgroup namespaces are enabled,
                where LXC will normally leave mounting cgroups to the init
                binary of the container since it is perfectly safe to do
                so.

              • cgroup-full (without specifier): defaults to cgroup-full:rw
                if the container retains the CAP_SYS_ADMIN capability,
                cgroup-full:mixed otherwise.
709
710 If cgroup namespaces are enabled, then any cgroup auto-mounting request
711 will be ignored, since the container can mount the filesystems itself,
712 and automounting can confuse the container init.
713
714 Note that if automatic mounting of the cgroup filesystem is enabled,
715 the tmpfs under /sys/fs/cgroup will always be mounted read-write (but
716 for the :mixed and :ro cases, the individual hierarchies,
717 /sys/fs/cgroup/$hierarchy, will be read-only). This is in order to work
718 around a quirk in Ubuntu's mountall(8) command that will cause contain‐
719 ers to wait for user input at boot if /sys/fs/cgroup is mounted read-
720 only and the container can't remount it read-write due to a lack of
721 CAP_SYS_ADMIN.
722
723 Examples:
724
725 lxc.mount.auto = proc sys cgroup
726 lxc.mount.auto = proc:rw sys:rw cgroup-full:rw
727
728
729 ROOT FILE SYSTEM
       The root file system of the container can be different from that of
       the host system.
732
733 lxc.rootfs.path
734 specify the root file system for the container. It can be an im‐
735 age file, a directory or a block device. If not specified, the
736 container shares its root file system with the host.
737
              For directory or simple block-device backed containers, a
              pathname can be used. If the rootfs is backed by an nbd
              device, then nbd:file:1 specifies that file should be
              attached to an nbd device, and partition 1 should be
              mounted as the rootfs. nbd:file specifies that the nbd
              device itself should be mounted.
              overlayfs:/lower:/upper specifies that the rootfs should
              be an overlay with /upper being mounted read-write over a
              read-only mount of /lower. For overlay, multiple /lower
              directories can be specified. loop:/file tells lxc to
              attach /file to a loop device and mount the loop device.
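
              For example, each of the following is a valid rootfs
              specification (the paths are illustrative):

                  lxc.rootfs.path = dir:/var/lib/lxc/c1/rootfs
                  lxc.rootfs.path = overlayfs:/srv/c1/lower:/srv/c1/upper
                  lxc.rootfs.path = loop:/srv/c1.img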
748
749 lxc.rootfs.mount
750 where to recursively bind lxc.rootfs.path before pivoting. This
              is to ensure success of the pivot_root(8) syscall. Any
              directory suffices; the default should generally work.
753
754 lxc.rootfs.options
755 Specify extra mount options to use when mounting the rootfs.
756 The format of the mount options corresponds to the format used
757 in fstab. In addition, LXC supports the custom idmap= mount op‐
758 tion. This option can be used to tell LXC to create an idmapped
759 mount for the container's rootfs. This is useful when the user
760 doesn't want to recursively chown the rootfs of the container to
761 match the idmapping of the user namespace the container is going
762 to use. Instead an idmapped mount can be used to handle this.
763 The argument for idmap= can either be a path pointing to a user
764 namespace file that LXC will open and use to idmap the rootfs or
765 the special value "container" which will instruct LXC to use the
766 container's user namespace to idmap the rootfs.
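
              For example, to request an idmapped mount based on the
              container's own user namespace:

                  lxc.rootfs.options = idmap=container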
767
768 lxc.rootfs.managed
              Set this to 0 to indicate that LXC is not managing the
              container storage; LXC will then not modify the container
              storage. The default is 1.
772
773 CONTROL GROUPS ("CGROUPS")
       The control group section contains the configuration for the
       different subsystems. lxc does not check the correctness of the
       subsystem name.
776 This has the disadvantage of not detecting configuration errors until
777 the container is started, but has the advantage of permitting any fu‐
778 ture subsystem.
779
780 The kernel implementation of cgroups has changed significantly over the
781 years. With Linux 4.5 support for a new cgroup filesystem was added
782 usually referred to as "cgroup2" or "unified hierarchy". Since then the
783 old cgroup filesystem is usually referred to as "cgroup1" or the
784 "legacy hierarchies". Please see the cgroups manual page for a detailed
785 explanation of the differences between the two versions.
786
787 LXC distinguishes settings for the legacy and the unified hierarchy by
788 using different configuration key prefixes. To alter settings for con‐
789 trollers in a legacy hierarchy the key prefix lxc.cgroup. must be used
790 and in order to alter the settings for a controller in the unified hi‐
791 erarchy the lxc.cgroup2. key must be used. Note that LXC will ignore
792 lxc.cgroup. settings on systems that only use the unified hierarchy.
793 Conversely, it will ignore lxc.cgroup2. options on systems that only
794 use legacy hierarchies.
795
796 At its core a cgroup hierarchy is a way to hierarchically organize pro‐
797 cesses. Usually a cgroup hierarchy will have one or more "controllers"
798 enabled. A "controller" in a cgroup hierarchy is usually responsible
799 for distributing a specific type of system resource along the hierar‐
800 chy. Controllers include the "pids" controller, the "cpu" controller,
801 the "memory" controller and others. Some controllers however do not
802 fall into the category of distributing a system resource, instead they
803 are often referred to as "utility" controllers. One utility controller
804 is the device controller. Instead of distributing a system resource it
805 allows one to manage device access.
806
807 In the legacy hierarchy the device controller was implemented like most
       other controllers as a set of files that could be written to. These
       files were named "devices.allow" and "devices.deny". The legacy device
810 controller allowed the implementation of both "allowlists" and
811 "denylists".
812
813 An allowlist is a device program that by default blocks access to all
814 devices. In order to access specific devices "allow rules" for particu‐
815 lar devices or device classes must be specified. In contrast, a
816 denylist is a device program that by default allows access to all de‐
817 vices. In order to restrict access to specific devices "deny rules" for
818 particular devices or device classes must be specified.
819
       In the unified cgroup hierarchy the implementation of the device
       controller has completely changed. Instead of files to read from and
       write to, an eBPF program of BPF_PROG_TYPE_CGROUP_DEVICE can be
       attached to a
823 cgroup. Even though the kernel implementation has changed completely
824 LXC tries to allow for the same semantics to be followed in the legacy
825 device cgroup and the unified eBPF-based device controller. The follow‐
826 ing paragraphs explain the semantics for the unified eBPF-based device
827 controller.
828
829 As mentioned the format for specifying device rules for the unified
830 eBPF-based device controller is the same as for the legacy cgroup de‐
831 vice controller; only the configuration key prefix has changed.
832 Specifically, device rules for the legacy cgroup device controller are
833 specified via lxc.cgroup.devices.allow and lxc.cgroup.devices.deny
834 whereas for the cgroup2 eBPF-based device controller lxc.cgroup2.de‐
835 vices.allow and lxc.cgroup2.devices.deny must be used.
836
       • An allowlist device rule
838
839 lxc.cgroup2.devices.deny = a
840
841
         will cause LXC to instruct the kernel to block access to all devices
         by default. To grant access to devices, allow rules must be added
         via the lxc.cgroup2.devices.allow key. This is referred to as an
         "allowlist" device program.
846
847 • A denylist device rule
848
849 lxc.cgroup2.devices.allow = a
850
851
         will cause LXC to instruct the kernel to allow access to all devices
         by default. To deny access to devices, deny rules must be added via
         the lxc.cgroup2.devices.deny key. This is referred to as a
         "denylist" device program.
856
       • Specifying either of the aforementioned two rules will cause all
         previous rules to be cleared, i.e. the device list will be reset.
859
860 • When an allowlist program is requested, i.e. access to all devices is
861 blocked by default, specific deny rules for individual devices or de‐
862 vice classes are ignored.
863
864 • When a denylist program is requested, i.e. access to all devices is
865 allowed by default, specific allow rules for individual devices or
866 device classes are ignored.
867
868 For example the set of rules:
869
870 lxc.cgroup2.devices.deny = a
871 lxc.cgroup2.devices.allow = c *:* m
872 lxc.cgroup2.devices.allow = b *:* m
873 lxc.cgroup2.devices.allow = c 1:3 rwm
874
875
876 implements an allowlist device program, i.e. the kernel will block ac‐
877 cess to all devices not specifically allowed in this list. This partic‐
878 ular program states that all character and block devices may be created
879 but only /dev/null might be read or written.
880
881 If we instead switch to the following set of rules:
882
883 lxc.cgroup2.devices.allow = a
884 lxc.cgroup2.devices.deny = c *:* m
885 lxc.cgroup2.devices.deny = b *:* m
886 lxc.cgroup2.devices.deny = c 1:3 rwm
887
888
889 then LXC would instruct the kernel to implement a denylist, i.e. the
890 kernel will allow access to all devices not specifically denied in this
       list. This particular program states that no character or block
       devices may be created and that /dev/null is not allowed to be read,
       written, or created.
894
895 Now consider the same program but followed by a "global rule" which de‐
896 termines the type of device program (allowlist or denylist) as ex‐
897 plained above:
898
899 lxc.cgroup2.devices.allow = a
900 lxc.cgroup2.devices.deny = c *:* m
901 lxc.cgroup2.devices.deny = b *:* m
902 lxc.cgroup2.devices.deny = c 1:3 rwm
903 lxc.cgroup2.devices.allow = a
904
905
906 The last line will cause LXC to reset the device list without changing
907 the type of device program.
908
909 If we specify:
910
911 lxc.cgroup2.devices.allow = a
912 lxc.cgroup2.devices.deny = c *:* m
913 lxc.cgroup2.devices.deny = b *:* m
914 lxc.cgroup2.devices.deny = c 1:3 rwm
915 lxc.cgroup2.devices.deny = a
916
917
       instead, then the last line will cause LXC to reset the device list
       and switch from an allowlist program to a denylist program.
920
921 lxc.cgroup.[controller name].[controller file]
922 Specify the control group value to be set on a legacy cgroup hi‐
923 erarchy. The controller name is the literal name of the control
              group. The permitted names and the syntax of their values
              are not dictated by LXC; instead they depend on the
              features of the Linux kernel running at the time the
              container is started, e.g. lxc.cgroup.cpuset.cpus
928
929 lxc.cgroup2.[controller name].[controller file]
930 Specify the control group value to be set on the unified cgroup
931 hierarchy. The controller name is the literal name of the con‐
              trol group. The permitted names and the syntax of their
              values are not dictated by LXC; instead they depend on the
              features of the Linux kernel running at the time the
              container is started, e.g. lxc.cgroup2.memory.high
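
              For example, the following settings (the values are
              illustrative) pin a container to the first two CPUs via
              the legacy cpuset controller and set a memory throttle
              limit on the unified hierarchy:

                  lxc.cgroup.cpuset.cpus = 0-1
                  lxc.cgroup2.memory.high = 512M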
936
937 lxc.cgroup.dir
938 specify a directory or path in which the container's cgroup will
939 be created. For example, setting lxc.cgroup.dir =
940 my-cgroup/first for a container named "c1" will create the con‐
941 tainer's cgroup as a sub-cgroup of "my-cgroup". For example, if
942 the user's current cgroup "my-user" is located in the root
943 cgroup of the cpuset controller in a cgroup v1 hierarchy this
944 would create the cgroup "/sys/fs/cgroup/cpuset/my-user/my-
945 cgroup/first/c1" for the container. Any missing cgroups will be
946 created by LXC. This presupposes that the user has write access
947 to its current cgroup.
948
949 lxc.cgroup.relative
950 Set this to 1 to instruct LXC to never escape to the root
951 cgroup. This makes it easy for users to adhere to restrictions
952 enforced by cgroup2 and systemd. Specifically, this makes it
953 possible to run LXC containers as systemd services.
954
955 CAPABILITIES
       Capabilities can be dropped in the container if it is run as root.
958
959 lxc.cap.drop
960 Specify the capability to be dropped in the container. A single
961 line defining several capabilities with a space separation is
962 allowed. The format is the lower case of the capability defini‐
963 tion without the "CAP_" prefix, eg. CAP_SYS_MODULE should be
964 specified as sys_module. See capabilities(7). If used with no
965 value, lxc will clear any drop capabilities specified up to this
966 point.
967
968 lxc.cap.keep
969 Specify the capability to be kept in the container. All other
970 capabilities will be dropped. When a special value of "none" is
971 encountered, lxc will clear any keep capabilities specified up
972 to this point. A value of "none" alone can be used to drop all
973 capabilities.
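
              For example, either of the following approaches could be
              used; the chosen capabilities are illustrative, and
              lxc.cap.drop and lxc.cap.keep would normally not be
              combined:

                  lxc.cap.drop = sys_module sys_rawio
                  lxc.cap.keep = chown kill setgid setuid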
974
975 NAMESPACES
976 A namespace can be cloned (lxc.namespace.clone), kept (lxc.name‐
977 space.keep) or shared (lxc.namespace.share.[namespace identifier]).
978
979 lxc.namespace.clone
980 Specify namespaces which the container is supposed to be created
981 with. The namespaces to create are specified as a space sepa‐
982 rated list. Each namespace must correspond to one of the stan‐
983 dard namespace identifiers as seen in the /proc/PID/ns direc‐
984 tory. When lxc.namespace.clone is not explicitly set all name‐
985 spaces supported by the kernel and the current configuration
986 will be used.
987
988 To create a new mount, net and ipc namespace set lxc.name‐
989 space.clone=mount net ipc.
990
991 lxc.namespace.keep
992 Specify namespaces which the container is supposed to inherit
993 from the process that created it. The namespaces to keep are
994 specified as a space separated list. Each namespace must corre‐
995 spond to one of the standard namespace identifiers as seen in
              the /proc/PID/ns directory. lxc.namespace.keep is a
              denylist option, i.e. it is useful for enforcing that
              containers must keep a specific set of namespaces.
999
1000 To keep the network, user and ipc namespace set lxc.name‐
1001 space.keep=user net ipc.
1002
1003 Note that sharing pid namespaces will likely not work with most
1004 init systems.
1005
1006 Note that if the container requests a new user namespace and the
1007 container wants to inherit the network namespace it needs to in‐
1008 herit the user namespace as well.
1009
1010 lxc.namespace.share.[namespace identifier]
1011 Specify a namespace to inherit from another container or
1012 process. The [namespace identifier] suffix needs to be replaced
1013 with one of the namespaces that appear in the /proc/PID/ns di‐
1014 rectory.
1015
1016 To inherit the namespace from another process set the lxc.name‐
1017 space.share.[namespace identifier] to the PID of the process,
1018 e.g. lxc.namespace.share.net=42.
1019
1020 To inherit the namespace from another container set the
1021 lxc.namespace.share.[namespace identifier] to the name of the
1022 container, e.g. lxc.namespace.share.pid=c3.
1023
1024 To inherit the namespace from another container located in a
1025 different path than the standard liblxc path set the lxc.name‐
1026 space.share.[namespace identifier] to the full path to the con‐
1027 tainer, e.g. lxc.namespace.share.user=/opt/c3.
1028
1029 In order to inherit namespaces the caller needs to have suffi‐
1030 cient privilege over the process or container.
1031
1032 Note that sharing pid namespaces between system containers will
1033 likely not work with most init systems.
1034
1035 Note that if two processes are in different user namespaces and
1036 one process wants to inherit the other's network namespace it
1037 usually needs to inherit the user namespace as well.
1038
1039 Note that without careful additional configuration of an LSM,
1040 sharing user+pid namespaces with a task may allow that task to
1041 escalate privileges to that of the task calling liblxc.
1042
1043 RESOURCE LIMITS
1044 The soft and hard resource limits for the container can be changed.
1045 Unprivileged containers can only lower them. Resources which are not
1046 explicitly specified will be inherited.
1047
1048 lxc.prlimit.[limit name]
1049 Specify the resource limit to be set. A limit is specified as
1050 two colon separated values which are either numeric or the word
1051 'unlimited'. A single value can be used as a shortcut to set
              both soft and hard limit to the same value. The permitted
              names are the "RLIMIT_" resource names in lowercase
              without the "RLIMIT_" prefix, e.g. RLIMIT_NOFILE should be
              specified as "nofile". See
1055 setrlimit(2). If used with no value, lxc will clear the re‐
1056 source limit specified up to this point. A resource with no ex‐
1057 plicitly configured limitation will be inherited from the
1058 process starting up the container.
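
              For example (the values are illustrative):

                  lxc.prlimit.nofile = 1024:4096
                  lxc.prlimit.memlock = unlimited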
1059
1060 SYSCTL
1061 Configure kernel parameters for the container.
1062
1063 lxc.sysctl.[kernel parameters name]
1064 Specify the kernel parameters to be set. The parameters avail‐
              able are those listed under /proc/sys/. Note that not all
              sysctls are namespaced; changing non-namespaced sysctls
              will cause the system-wide setting to be modified. See
              sysctl(8). If
1068 used with no value, lxc will clear the parameters specified up
1069 to this point.
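
              For example, to enable IPv4 forwarding inside the
              container's network namespace (a namespaced sysctl):

                  lxc.sysctl.net.ipv4.ip_forward = 1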
1070
1071 APPARMOR PROFILE
1072 If lxc was compiled and installed with apparmor support, and the host
1073 system has apparmor enabled, then the apparmor profile under which the
1074 container should be run can be specified in the container configura‐
1075 tion. The default is lxc-container-default-cgns if the host kernel is
1076 cgroup namespace aware, or lxc-container-default otherwise.
1077
1078 lxc.apparmor.profile
1079 Specify the apparmor profile under which the container should be
1080 run. To specify that the container should be unconfined, use
1081
1082 lxc.apparmor.profile = unconfined
1083
1084 If the apparmor profile should remain unchanged (i.e. if you are
1085 nesting containers and are already confined), then use
1086
1087 lxc.apparmor.profile = unchanged
1088
1089 If you instruct LXC to generate the apparmor profile, then use
1090
1091 lxc.apparmor.profile = generated
1092
1093 lxc.apparmor.allow_incomplete
1094 Apparmor profiles are pathname based. Therefore many file re‐
1095 strictions require mount restrictions to be effective against a
1096 determined attacker. However, these mount restrictions are not
1097 yet implemented in the upstream kernel. Without the mount re‐
              strictions, the apparmor profiles still protect against
              accidental damage.
1100
1101 If this flag is 0 (default), then the container will not be
1102 started if the kernel lacks the apparmor mount features, so that
1103 a regression after a kernel upgrade will be detected. To start
1104 the container under partial apparmor protection, set this flag
1105 to 1.
1106
1107 lxc.apparmor.allow_nesting
              If set to 1, this causes the following changes. When generated
1109 apparmor profiles are used, they will contain the necessary
1110 changes to allow creating a nested container. In addition to the
1111 usual mount points, /dev/.lxc/proc and /dev/.lxc/sys will con‐
1112 tain procfs and sysfs mount points without the lxcfs overlays,
1113 which, if generated apparmor profiles are being used, will not
1114 be read/writable directly.
1115
1116 lxc.apparmor.raw
1117 A list of raw AppArmor profile lines to append to the profile.
1118 Only valid when using generated profiles.
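
              For example, to append an extra rule to a generated
              profile (the rule shown is illustrative):

                  lxc.apparmor.raw = deny mount,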
1119
1120 SELINUX CONTEXT
1121 If lxc was compiled and installed with SELinux support, and the host
1122 system has SELinux enabled, then the SELinux context under which the
1123 container should be run can be specified in the container configura‐
1124 tion. The default is unconfined_t, which means that lxc will not at‐
1125 tempt to change contexts. See /usr/share/lxc/selinux/lxc.te for an ex‐
1126 ample policy and more information.
1127
1128 lxc.selinux.context
1129 Specify the SELinux context under which the container should be
1130 run or unconfined_t. For example
1131
1132 lxc.selinux.context = system_u:system_r:lxc_t:s0:c22
1133
1134 lxc.selinux.context.keyring
1135 Specify the SELinux context under which the container's keyring
              should be created. By default this is the same as
              lxc.selinux.context, or the context lxc is executed under
              if lxc.selinux.context has not been set.
1139
1140 lxc.selinux.context.keyring = system_u:system_r:lxc_t:s0:c22
1141
1142 KERNEL KEYRING
1143 The Linux Keyring facility is primarily a way for various kernel compo‐
1144 nents to retain or cache security data, authentication keys, encryption
1145 keys, and other data in the kernel. By default lxc will create a new
1146 session keyring for the started application.
1147
1148 lxc.keyring.session
              Set this to 0 to disable the creation of a new session
              keyring by lxc; the started application will then inherit
              the current session keyring. By default, or when passing
              the value 1, a new keyring will be created.
1153
1154 lxc.keyring.session = 0
1155
1156 SECCOMP CONFIGURATION
1157 A container can be started with a reduced set of available system calls
1158 by loading a seccomp profile at startup. The seccomp configuration file
1159 must begin with a version number on the first line, a policy type on
1160 the second line, followed by the configuration.
1161
1162 Versions 1 and 2 are currently supported. In version 1, the policy is a
1163 simple allowlist. The second line therefore must read "allowlist", with
1164 the rest of the file containing one (numeric) syscall number per line.
       Each syscall number is allowlisted, while every unlisted number is
       denylisted for use in the container.
1167
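       A minimal version 1 policy, allowing only a handful of syscalls,
       might look as follows. Note that the syscall numbers are
       architecture-specific; the illustrative numbers below assume x86_64,
       where 0 is read, 1 is write and 60 is exit:

              1
              allowlist
              0
              1
              60
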
1168 In version 2, the policy may be denylist or allowlist, supports per-
1169 rule and per-policy default actions, and supports per-architecture sys‐
1170 tem call resolution from textual names.
1171
1172 An example denylist policy, in which all system calls are allowed ex‐
1173 cept for mknod, which will simply do nothing and return 0 (success),
1174 looks like:
1175
1176 2
1177 denylist
1178 mknod errno 0
1179 ioctl notify
1180
1181
1182 Specifying "errno" as action will cause LXC to register a seccomp fil‐
1183 ter that will cause a specific errno to be returned to the caller. The
1184 errno value can be specified after the "errno" action word.
1185
1186 Specifying "notify" as action will cause LXC to register a seccomp lis‐
1187 tener and retrieve a listener file descriptor from the kernel. When a
1188 syscall is made that is registered as "notify" the kernel will generate
       a poll event and send a message over the file descriptor. The caller
       can read this message and inspect the syscall including its
       arguments. Based on this information the caller is expected to send
       back a message informing the kernel which action to take. Until that
       message is sent the kernel will block the calling process. The format
       of the messages to read and send is documented in seccomp itself.
1195
1196 lxc.seccomp.profile
1197 Specify a file containing the seccomp configuration to load be‐
1198 fore the container starts.
1199
1200 lxc.seccomp.allow_nesting
1201 If this flag is set to 1, then seccomp filters will be stacked
1202 regardless of whether a seccomp profile is already loaded. This
1203 allows nested containers to load their own seccomp profile. The
1204 default setting is 0.
1205
1206 lxc.seccomp.notify.proxy
              Specify a unix socket to which LXC will connect and
              forward seccomp events. The path must be in the form
1209 unix:/path/to/socket or unix:@socket. The former specifies a
1210 path-bound unix domain socket while the latter specifies an ab‐
1211 stract unix domain socket.
1212
1213 lxc.seccomp.notify.cookie
1214 An additional string sent along with proxied seccomp notifica‐
1215 tion requests.
1216
1217 PR_SET_NO_NEW_PRIVS
1218 With PR_SET_NO_NEW_PRIVS active execve() promises not to grant privi‐
1219 leges to do anything that could not have been done without the execve()
1220 call (for example, rendering the set-user-ID and set-group-ID mode
1221 bits, and file capabilities non-functional). Once set, this bit cannot
1222 be unset. The setting of this bit is inherited by children created by
1223 fork() and clone(), and preserved across execve(). Note that
1224 PR_SET_NO_NEW_PRIVS is applied after the container has changed into its
1225 intended AppArmor profile or SElinux context.
1226
1227 lxc.no_new_privs
1228 Specify whether the PR_SET_NO_NEW_PRIVS flag should be set for
1229 the container. Set to 1 to activate.
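
              For example:

                  lxc.no_new_privs = 1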
1230
1231 UID MAPPINGS
1232 A container can be started in a private user namespace with user and
1233 group id mappings. For instance, you can map userid 0 in the container
1234 to userid 200000 on the host. The root user in the container will be
1235 privileged in the container, but unprivileged on the host. Normally a
1236 system container will want a range of ids, so you would map, for in‐
1237 stance, user and group ids 0 through 20,000 in the container to the ids
1238 200,000 through 220,000.
1239
1240 lxc.idmap
1241 Four values must be provided. First a character, either 'u', or
1242 'g', to specify whether user or group ids are being mapped. Next
1243 is the first userid as seen in the user namespace of the con‐
1244 tainer. Next is the userid as seen on the host. Finally, a range
1245 indicating the number of consecutive ids to map.
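
              For example, to implement the mapping described above,
              where user and group ids 0 through 20,000 in the container
              are mapped to ids 200,000 through 220,000 on the host:

                  lxc.idmap = u 0 200000 20001
                  lxc.idmap = g 0 200000 20001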
1246
1247 CONTAINER HOOKS
1248 Container hooks are programs or scripts which can be executed at vari‐
1249 ous times in a container's lifetime.
1250
1251 When a container hook is executed, additional information is passed
1252 along. The lxc.hook.version argument can be used to determine if the
1253 following arguments are passed as command line arguments or through en‐
1254 vironment variables. The arguments are:
1255
1256 • Container name.
1257
1258 • Section (always 'lxc').
1259
1260 • The hook type (i.e. 'clone' or 'pre-mount').
1261
1262 • Additional arguments. In the case of the clone hook, any extra argu‐
1263 ments passed will appear as further arguments to the hook. In the
1264 case of the stop hook, paths to filedescriptors for each of the con‐
1265 tainer's namespaces along with their types are passed.
1266
1267 The following environment variables are set:
1268
1269 • LXC_CGNS_AWARE: indicator whether the container is cgroup namespace
1270 aware.
1271
1272 • LXC_CONFIG_FILE: the path to the container configuration file.
1273
1274 • LXC_HOOK_TYPE: the hook type (e.g. 'clone', 'mount', 'pre-mount').
1275 Note that the existence of this environment variable is conditional
1276 on the value of lxc.hook.version. If it is set to 1 then
1277 LXC_HOOK_TYPE will be set.
1278
1279 • LXC_HOOK_SECTION: the section type (e.g. 'lxc', 'net'). Note that the
1280 existence of this environment variable is conditional on the value of
1281 lxc.hook.version. If it is set to 1 then LXC_HOOK_SECTION will be
1282 set.
1283
1284 • LXC_HOOK_VERSION: the version of the hooks. This value is identical
1285 to the value of the container's lxc.hook.version config item. If it
1286 is set to 0 then old-style hooks are used. If it is set to 1 then
1287 new-style hooks are used.
1288
1289 • LXC_LOG_LEVEL: the container's log level.
1290
       • LXC_NAME: the container's name.
1292
1293 • LXC_[NAMESPACE IDENTIFIER]_NS: path under /proc/PID/fd/ to a file de‐
1294 scriptor referring to the container's namespace. For each preserved
1295 namespace type there will be a separate environment variable. These
1296 environment variables will only be set if lxc.hook.version is set to
1297 1.
1298
1299 • LXC_ROOTFS_MOUNT: the path to the mounted root filesystem.
1300
1301 • LXC_ROOTFS_PATH: this is the lxc.rootfs.path entry for the container.
1302 Note this is likely not where the mounted rootfs is to be found, use
1303 LXC_ROOTFS_MOUNT for that.
1304
1305 • LXC_SRC_NAME: in the case of the clone hook, this is the original
1306 container's name.
1307
1308 Standard output from the hooks is logged at debug level. Standard er‐
1309 ror is not logged, but can be captured by the hook redirecting its
1310 standard error to standard output.
1311
1312 lxc.hook.version
              To pass the arguments in the new style via environment
              variables, set this to 1; otherwise, set it to 0 to pass
              them as command-line arguments. This setting affects all
              hook arguments that were traditionally passed as
              arguments to the script. Specifically, it affects the
              container
1317 name, section (e.g. 'lxc', 'net') and hook type (e.g. 'clone',
1318 'mount', 'pre-mount') arguments. If new-style hooks are used
1319 then the arguments will be available as environment variables.
1320 The container name will be set in LXC_NAME. (This is set inde‐
1321 pendently of the value used for this config item.) The section
1322 will be set in LXC_HOOK_SECTION and the hook type will be set in
1323 LXC_HOOK_TYPE. It also affects how the paths to file descrip‐
1324 tors referring to the container's namespaces are passed. If set
1325 to 1 then for each namespace a separate environment variable
1326 LXC_[NAMESPACE IDENTIFIER]_NS will be set. If set to 0 then the
1327 paths will be passed as arguments to the stop hook.
1328
1329 lxc.hook.pre-start
1330 A hook to be run in the host's namespace before the container
1331 ttys, consoles, or mounts are up.
1332
1333 lxc.hook.pre-mount
1334 A hook to be run in the container's fs namespace but before the
1335 rootfs has been set up. This allows for manipulation of the
1336 rootfs, i.e. to mount an encrypted filesystem. Mounts done in
              this hook will not be reflected on the host (apart from
              mount propagation), so they will be automatically cleaned
              up when the
1339 container shuts down.
1340
1341 lxc.hook.mount
1342 A hook to be run in the container's namespace after mounting has
1343 been done, but before the pivot_root.
1344
1345 lxc.hook.autodev
1346 A hook to be run in the container's namespace after mounting has
1347 been done and after any mount hooks have run, but before the
1348 pivot_root, if lxc.autodev == 1. The purpose of this hook is to
1349 assist in populating the /dev directory of the container when
1350 using the autodev option for systemd based containers. The con‐
1351 tainer's /dev directory is relative to the ${LXC_ROOTFS_MOUNT}
1352 environment variable available when the hook is run.
1353
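As an illustration of the idea, an autodev hook might create a few directories under the rootfs mount; the helper name and the set of entries created are hypothetical, and real hooks would typically also create device nodes:

```shell
#!/bin/sh
# Hypothetical autodev hook sketch: prepare the container's /dev
# relative to ${LXC_ROOTFS_MOUNT}. Real hooks would also create
# device nodes (e.g. with mknod); only directories are shown here.
populate_dev() {
    dev="$1/dev"
    mkdir -p "$dev/pts" "$dev/shm" || return 1
    echo "prepared $dev"
}
if [ -n "${LXC_ROOTFS_MOUNT:-}" ]; then
    populate_dev "${LXC_ROOTFS_MOUNT}"
fi
```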
1354 lxc.hook.start-host
1355 A hook to be run in the host's namespace after the container has
1356 been set up, and immediately before starting the container init.
1357
1358 lxc.hook.start
1359 A hook to be run in the container's namespace immediately before
1360 executing the container's init. This requires the program to be
1361 available in the container.
1362
1363 lxc.hook.stop
1364 A hook to be run in the host's namespace with references to the
1365 container's namespaces after the container has been shut down.
1366 For each namespace an extra argument is passed to the hook con‐
1367 taining the namespace's type and a filename that can be used to
1368 obtain a file descriptor to the corresponding namespace, sepa‐
1369 rated by a colon. The type is the name as it would appear in the
1370 /proc/PID/ns directory. For instance for the mount namespace
1371 the argument usually looks like mnt:/proc/PID/fd/12.
1372
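With old-style hooks (lxc.hook.version = 0) these colon-separated pairs can be split with plain parameter expansion; the sketch below assumes nothing beyond what is described above, and the helper name is illustrative:

```shell
#!/bin/sh
# Sketch of parsing the stop hook's namespace arguments, each of the
# form "type:path" (e.g. "mnt:/proc/1234/fd/12").
parse_ns_args() {
    for arg in "$@"; do
        ns_type="${arg%%:*}"   # text before the first colon, e.g. "mnt"
        ns_path="${arg#*:}"    # text after the first colon
        echo "namespace $ns_type at $ns_path"
    done
}
# The first three arguments are container name, section and hook type;
# drop them before parsing the namespace pairs.
if [ "$#" -ge 4 ]; then
    shift 3
    parse_ns_args "$@"
fi
```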
1373 lxc.hook.post-stop
1374 A hook to be run in the host's namespace after the container has
1375 been shut down.
1376
1377 lxc.hook.clone
1378 A hook to be run when the container is cloned to a new one. See
1379 lxc-clone(1) for more information.
1380
1381 lxc.hook.destroy
1382 A hook to be run when the container is destroyed.
1383
1384 CONTAINER HOOKS ENVIRONMENT VARIABLES
1385 A number of environment variables are made available to the startup
1386 hooks to provide configuration information and assist in the function‐
1387 ing of the hooks. Not all variables are valid in all contexts. In par‐
1388 ticular, all paths are relative to the host system and, as such, not
1389 valid during the lxc.hook.start hook.
1390
1391 LXC_NAME
1392 The LXC name of the container. Useful for logging messages in
1393 common log environments. [-n]
1394
1395 LXC_CONFIG_FILE
1396 Host relative path to the container configuration file. This
1397 allows the container to reference the original, top-level
1398 configuration file in order to locate any additional
1399 configuration information not otherwise made available.
1400 [-f]
1401
1402 LXC_CONSOLE
1403 The path to the console output of the container if not NULL.
1404 [-c] [lxc.console.path]
1405
1406 LXC_CONSOLE_LOGPATH
1407 The path to the console log output of the container if not NULL.
1408 [-L]
1409
1410 LXC_ROOTFS_MOUNT
1411 The mount location to which the container is initially bound.
1412 This will be the host relative path to the container rootfs for
1413 the container instance being started and is where changes should
1414 be made for that instance. [lxc.rootfs.mount]
1415
1416 LXC_ROOTFS_PATH
1417 The host relative path to the container root which has been
1418 mounted to the rootfs.mount location. [lxc.rootfs.path]
1419
1420 LXC_SRC_NAME
1421 Only for the clone hook. Is set to the original container name.
1422
1423 LXC_TARGET
1424 Only for the stop hook. Is set to "stop" for a container shut‐
1425 down or "reboot" for a container reboot.
1426
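A stop hook can branch on this variable to distinguish a shutdown from a reboot; the messages below are placeholders:

```shell
#!/bin/sh
# Sketch of a stop hook acting on LXC_TARGET, which is "stop" for a
# shutdown and "reboot" for a reboot. The actions are placeholders.
handle_target() {
    case "$1" in
        stop)   echo "final cleanup after shutdown" ;;
        reboot) echo "keeping state for the reboot" ;;
        *)      echo "unexpected target: $1" ;;
    esac
}
if [ -n "${LXC_TARGET:-}" ]; then
    handle_target "${LXC_TARGET}"
fi
```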
1427 LXC_CGNS_AWARE
1428 If unset, then this version of lxc is not aware of cgroup name‐
1429 spaces. If set, it will be set to 1, and lxc is aware of cgroup
1430 namespaces. Note this does not guarantee that cgroup namespaces
1431 are enabled in the kernel. This is used by the lxcfs mount hook.
1432
1433 LOGGING
1434 Logging can be configured on a per-container basis. By default, depend‐
1435 ing upon how the lxc package was compiled, container startup is logged
1436 only at the ERROR level, and logged to a file named after the container
1437 (with '.log' appended) either under the container path, or under
1438 /var/log/lxc.
1439
1440 Both the default log level and the log file can be specified in the
1441 container configuration file, overriding the default behavior. Note
1442 that the configuration file entries can in turn be overridden by the
1443 command line options to lxc-start.
1444
1445 lxc.log.level
1446 The level at which to log. The log level is an integer in the
1447 range of 0..8 inclusive, where a lower number means more verbose
1448 debugging. In particular 0 = trace, 1 = debug, 2 = info, 3 = no‐
1449 tice, 4 = warn, 5 = error, 6 = critical, 7 = alert, and 8 = fa‐
1450 tal. If unspecified, the level defaults to 5 (error), so that
1451 only errors and above are logged.
1452
1453 Note that when a script (such as either a hook script or a net‐
1454 work interface up or down script) is called, the script's stan‐
1455 dard output is logged at level 1, debug.
1456
1457 lxc.log.file
1458 The file to which logging info should be written.
1459
1460 lxc.log.syslog
1461 Send logging info to syslog. It respects the log level defined
1462 in lxc.log.level. The argument should be the syslog facility to
1463 use; valid values are: daemon, local0, local1, local2, local3,
1464 local4, local5, local6, local7.
1465
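For example, a container that should log verbosely to its own file could combine these keys as follows (the file path is illustrative):

```
lxc.log.level = 1
lxc.log.file = /var/log/lxc/mycontainer.log
```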
1466 AUTOSTART
1467 The autostart options support marking which containers should be auto-
1468 started and in what order. These options may be used by LXC tools di‐
1469 rectly or by external tooling provided by the distributions.
1470
1471 lxc.start.auto
1472 Whether the container should be auto-started. Valid values are
1473 0 (off) and 1 (on).
1474
1475 lxc.start.delay
1476 How long to wait (in seconds) after the container is started be‐
1477 fore starting the next one.
1478
1479 lxc.start.order
1480 An integer used to sort the containers when auto-starting a se‐
1481 ries of containers at once. A lower value means an earlier
1482 start.
1483
1484 lxc.monitor.unshare
1485 If not zero, the mount namespace will be unshared from the host
1486 before initializing the container (before running any pre-start
1487 hooks). This requires the CAP_SYS_ADMIN capability at startup.
1488 Default is 0.
1489
1490 lxc.monitor.signal.pdeath
1491 Set the signal to be sent to the container's init when the lxc
1492 monitor exits. By default it is set to SIGKILL which will cause
1493 all container processes to be killed when the lxc monitor
1494 process dies. To ensure that containers stay alive even if the
1495 lxc monitor dies, set this to 0.
1496
1497 lxc.group
1498 A multi-value key (can be used multiple times) to put the con‐
1499 tainer in a container group. Those groups can then be used
1500 (amongst other things) to start a series of related containers.
1501
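Taken together, a container that should start at boot, early, and as part of a named group might use the following (the group name "web" is illustrative):

```
lxc.start.auto = 1
lxc.start.delay = 5
lxc.start.order = 100
lxc.group = onboot
lxc.group = web
```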
1502 AUTOSTART AND SYSTEM BOOT
1503 Each container can be part of any number of groups or no group at all.
1504 Two groups are special. One is the NULL group, i.e. the container does
1505 not belong to any group. The other group is the "onboot" group.
1506
1507 When the system boots with the LXC service enabled, it will first
1508 attempt to boot any containers with lxc.start.auto == 1 that are
1509 members of the "onboot" group, in order of lxc.start.order.
1510 If an lxc.start.delay has been specified, that delay will be honored
1511 before attempting to start the next container to give the current con‐
1512 tainer time to begin initialization and reduce overloading the host
1513 system. After starting the members of the "onboot" group, the LXC sys‐
1514 tem will proceed to boot containers with lxc.start.auto == 1 which are
1515 not members of any group (the NULL group) and proceed as with the on‐
1516 boot group.
1517
1518 CONTAINER ENVIRONMENT
1519 If you want to pass environment variables into the container (that is,
1520 environment variables which will be available to init and all of its
1521 descendants), you can use lxc.environment parameters to do so. Be
1522 careful that you do not pass in anything sensitive; any process in
1523 the container which doesn't have its environment scrubbed will have
1524 these variables available to it, and environment variables are
1525 always available via /proc/PID/environ.
1526
1527 This configuration parameter can be specified multiple times; once for
1528 each environment variable you wish to configure.
1529
1530 lxc.environment
1531 Specify an environment variable to pass into the container. Ex‐
1532 ample:
1533
1534 lxc.environment = APP_ENV=production
1535 lxc.environment = SYSLOG_SERVER=192.0.2.42
1536
1537
1538 It is possible to inherit host environment variables by setting
1539 the name of the variable without a "=" sign. For example:
1540
1541 lxc.environment = PATH
1542
1543
1544 EXAMPLES
1545 In addition to the few examples given below, you will find some
1546 other examples of configuration files in /usr/share/doc/lxc/examples.
1547
1548 NETWORK
1549 This configuration sets up a container to use a veth pair device with
1550 one side plugged to a bridge br0 (which has been configured before on
1551 the system by the administrator). The virtual network device visible in
1552 the container is renamed to eth0.
1553
1554 lxc.uts.name = myhostname
1555 lxc.net.0.type = veth
1556 lxc.net.0.flags = up
1557 lxc.net.0.link = br0
1558 lxc.net.0.name = eth0
1559 lxc.net.0.hwaddr = 4a:49:43:49:79:bf
1560 lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
1561 lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
1562
1563
1564 UID/GID MAPPING
1565 This configuration will map both user and group ids in the range 0-9999
1566 in the container to the ids 100000-109999 on the host.
1567
1568 lxc.idmap = u 0 100000 10000
1569 lxc.idmap = g 0 100000 10000
1570
1571
1572 CONTROL GROUP
1573 This configuration will set up several control groups for the
1574 application: cpuset.cpus restricts usage to the defined CPUs,
1575 cpu.shares prioritizes the control group, and devices.allow makes
1576 the specified devices usable.
1577
1578 lxc.cgroup.cpuset.cpus = 0,1
1579 lxc.cgroup.cpu.shares = 1234
1580 lxc.cgroup.devices.deny = a
1581 lxc.cgroup.devices.allow = c 1:3 rw
1582 lxc.cgroup.devices.allow = b 8:0 rw
1583
1584
1585 COMPLEX CONFIGURATION
1586 This example show a complex configuration making a complex network
1587 stack, using the control groups, setting a new hostname, mounting some
1588 locations and a changing root file system.
1589
1590 lxc.uts.name = complex
1591 lxc.net.0.type = veth
1592 lxc.net.0.flags = up
1593 lxc.net.0.link = br0
1594 lxc.net.0.hwaddr = 4a:49:43:49:79:bf
1595 lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
1596 lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
1597 lxc.net.0.ipv6.address = 2003:db8:1:0:214:5432:feab:3588
1598 lxc.net.1.type = macvlan
1599 lxc.net.1.flags = up
1600 lxc.net.1.link = eth0
1601 lxc.net.1.hwaddr = 4a:49:43:49:79:bd
1602 lxc.net.1.ipv4.address = 10.2.3.4/24
1603 lxc.net.1.ipv4.address = 192.168.10.125/24
1604 lxc.net.1.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3596
1605 lxc.net.2.type = phys
1606 lxc.net.2.flags = up
1607 lxc.net.2.link = random0
1608 lxc.net.2.hwaddr = 4a:49:43:49:79:ff
1609 lxc.net.2.ipv4.address = 10.2.3.6/24
1610 lxc.net.2.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3297
1611 lxc.cgroup.cpuset.cpus = 0,1
1612 lxc.cgroup.cpu.shares = 1234
1613 lxc.cgroup.devices.deny = a
1614 lxc.cgroup.devices.allow = c 1:3 rw
1615 lxc.cgroup.devices.allow = b 8:0 rw
1616 lxc.mount.fstab = /etc/fstab.complex
1617 lxc.mount.entry = /lib /root/myrootfs/lib none ro,bind 0 0
1618 lxc.rootfs.path = dir:/mnt/rootfs.complex
1619 lxc.rootfs.options = idmap=container
1620 lxc.cap.drop = sys_module mknod setuid net_raw
1621 lxc.cap.drop = mac_override
1622
1623
1624 SEE ALSO
1625 chroot(1), pivot_root(8), fstab(5), capabilities(7)
1626
1627 SEE ALSO
1628 lxc(7), lxc-create(1), lxc-copy(1), lxc-destroy(1), lxc-start(1), lxc-
1629 stop(1), lxc-execute(1), lxc-console(1), lxc-monitor(1), lxc-wait(1),
1630 lxc-cgroup(1), lxc-ls(1), lxc-info(1), lxc-freeze(1), lxc-unfreeze(1),
1631 lxc-attach(1), lxc.conf(5)
1632
1633 AUTHOR
1634 Daniel Lezcano <daniel.lezcano@free.fr>
1635
1636
1637
1638 2022-07-21 lxc.container.conf(5)