lxc.container.conf(5)

6 lxc.container.conf - LXC container configuration file
7
LXC is the well-known and heavily tested low-level Linux container
runtime. It has been in active development since 2008 and has proven
itself in critical production environments worldwide. Some of its core
contributors are the same people who helped to implement various
well-known containerization features inside the Linux kernel.
14
LXC's main focus is system containers. That is, containers which offer
an environment as close as possible to the one you'd get from a VM but
without the overhead that comes with running a separate kernel and
simulating all the hardware.
19
20 This is achieved through a combination of kernel security features such
21 as namespaces, mandatory access control and control groups.
22
23 LXC has support for unprivileged containers. Unprivileged containers
24 are containers that are run without any privilege. This requires sup‐
25 port for user namespaces in the kernel that the container is run on.
26 LXC was the first runtime to support unprivileged containers after user
27 namespaces were merged into the mainline kernel.
28
In essence, user namespaces isolate given sets of UIDs and GIDs. This
is achieved by mapping a range of UIDs and GIDs in the container to a
different, unprivileged range of UIDs and GIDs on the host. The kernel
translates this mapping so that inside the container all UIDs and GIDs
appear as you would expect on a normal host, whereas on the host these
UIDs and GIDs are in fact unprivileged. For example, a process running
as UID and GID 0 inside the container might appear as UID and GID
100000 on the host. The implementation and working details are
described in the user_namespaces(7) man page. UID and GID mappings can
be defined with the lxc.idmap key.
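
As an illustration of such a mapping, the following fragment is a
minimal sketch that maps container IDs 0 through 65535 to host IDs
starting at 100000. The host start of 100000 follows the example in
the text; the range size of 65536 is an assumption chosen for the
example.

```
# container UIDs 0-65535 appear on the host as 100000-165535
lxc.idmap = u 0 100000 65536
# the same mapping for GIDs
lxc.idmap = g 0 100000 65536
```

With this mapping, a process running as root (UID 0) in the container
would be seen as UID 100000 on the host.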
40
41 Linux containers are defined with a simple configuration file. Each op‐
42 tion in the configuration file has the form key = value fitting in one
43 line. The "#" character means the line is a comment. List options, like
44 capabilities and cgroups options, can be used with no value to clear
45 any previously defined values of that option.
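
The syntax rules above can be sketched as follows. The hostname value
is made up for the example, and lxc.cap.drop is used here only as a
representative list option for capabilities.

```
# This line is a comment
lxc.uts.name = web01

# A list option with no value clears everything set for it so far
lxc.cap.drop =
```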
46
LXC namespaces its configuration keys using single dots. This means
complex configuration keys such as lxc.net.0 expose various subkeys
such as lxc.net.0.type, lxc.net.0.link, lxc.net.0.ipv6.address, and
others for even more fine-grained configuration.
51
52 CONFIGURATION
53 In order to ease administration of multiple related containers, it is
54 possible to have a container configuration file cause another file to
55 be loaded. For instance, network configuration can be defined in one
56 common file which is included by multiple containers. Then, if the con‐
57 tainers are moved to another host, only one file may need to be up‐
58 dated.
59
60 lxc.include
61 Specify the file to be included. The included file must be in
62 the same valid lxc configuration file format.
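
A possible layout, assuming a shared file at the hypothetical path
/etc/lxc/common-net.conf, might look like this in each container's
configuration file:

```
# pull in the network settings shared by several containers
lxc.include = /etc/lxc/common-net.conf

# container-specific settings follow
lxc.uts.name = web01
```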
63
64 ARCHITECTURE
Allows one to set the architecture for the container. For example, set
a 32-bit architecture for a container running 32-bit binaries on a
64-bit host. This fixes container scripts that rely on the
architecture to do some work, such as downloading packages.
69
70 lxc.arch
71 Specify the architecture for the container.
72
73 Some valid options are x86, i686, x86_64, amd64
74
75 HOSTNAME
76 The utsname section defines the hostname to be set for the container.
77 That means the container can set its own hostname without changing the
78 one from the system. That makes the hostname private for the container.
79
80 lxc.uts.name
81 specify the hostname for the container
82
83 HALT SIGNAL
Allows one to specify the signal name or number sent to the container's
init process to cleanly shut down the container. Different init systems
could use different signals to perform a clean shutdown sequence. This
option allows the signal to be specified in kill(1) fashion, e.g.
SIGPWR, SIGRTMIN+14, SIGRTMAX-10 or a plain number. The default signal
is SIGPWR.
90
91 lxc.signal.halt
92 specify the signal used to halt the container
93
94 REBOOT SIGNAL
Allows one to specify the signal name or number to reboot the container.
This option allows the signal to be specified in kill(1) fashion, e.g.
SIGTERM, SIGRTMIN+14, SIGRTMAX-10 or a plain number. The default signal
is SIGINT.
99
100 lxc.signal.reboot
101 specify the signal used to reboot the container
102
103 STOP SIGNAL
Allows one to specify the signal name or number to forcibly shut down
the container. This option allows the signal to be specified in kill(1)
fashion, e.g. SIGKILL, SIGRTMIN+14, SIGRTMAX-10 or a plain number. The
default signal is SIGKILL.
108
109 lxc.signal.stop
110 specify the signal used to stop the container
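
For reference, the three signal keys can be written out explicitly.
The fragment below simply restates the documented defaults; the
commented line shows the kill(1)-style offset form with an arbitrary
value taken from the examples above.

```
lxc.signal.halt = SIGPWR
lxc.signal.reboot = SIGINT
lxc.signal.stop = SIGKILL

# offset form, for an init system that expects a different halt signal
# lxc.signal.halt = SIGRTMIN+14
```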
111
112 INIT COMMAND
Sets the command to use as the init system for the container.
114
115 lxc.execute.cmd
116 Absolute path from container rootfs to the binary to run by de‐
117 fault. This mostly makes sense for lxc-execute.
118
119 lxc.init.cmd
120 Absolute path from container rootfs to the binary to use as
121 init. This mostly makes sense for lxc-start. Default is
122 /sbin/init.
123
124 INIT WORKING DIRECTORY
Sets the absolute path inside the container to use as the working
directory for the container's init. LXC will switch to this directory
before executing init.
128
129 lxc.init.cwd
130 Absolute path inside the container to use as the working direc‐
131 tory.
132
133 INIT ID
134 Sets the UID/GID to use for the init system, and subsequent commands.
135 Note that using a non-root UID when booting a system container will
136 likely not work due to missing privileges. Setting the UID/GID is
137 mostly useful when running application containers. Defaults to:
138 UID(0), GID(0)
139
140 lxc.init.uid
141 UID to use for init.
142
143 lxc.init.gid
144 GID to use for init.
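
The init-related keys above could be combined for an application
container roughly as follows. This is a sketch only: the binary path,
working directory, and the IDs 1000 are hypothetical values, and the
fragment assumes the container is started with lxc-execute.

```
# binary to run by default inside the container (hypothetical path)
lxc.execute.cmd = /usr/local/bin/myapp
# directory to switch to before executing it
lxc.init.cwd = /srv/myapp
# run as a non-root user and group
lxc.init.uid = 1000
lxc.init.gid = 1000
```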
145
146 PROC
147 Configure proc filesystem for the container.
148
149 lxc.proc.[proc file name]
150 Specify the proc file name to be set. The file names available
151 are those listed under /proc/PID/. Example:
152
153 lxc.proc.oom_score_adj = 10
154
155
156 EPHEMERAL
157 Allows one to specify whether a container will be destroyed on shut‐
158 down.
159
160 lxc.ephemeral
161 The only allowed values are 0 and 1. Set this to 1 to destroy a
162 container on shutdown.
163
164 NETWORK
165 The network section defines how the network is virtualized in the con‐
166 tainer. The network virtualization acts at layer two. In order to use
167 the network virtualization, parameters must be specified to define the
168 network interfaces of the container. Several virtual interfaces can be
169 assigned and used in a container even if the system has only one physi‐
170 cal network interface.
171
172 lxc.net
173 may be used without a value to clear all previous network op‐
174 tions.
175
176 lxc.net.[i].type
177 specify what kind of network virtualization to be used for the
178 container. Must be specified before any other option(s) on the
179 net device. Multiple networks can be specified by using an ad‐
180 ditional index i after all lxc.net.* keys. For example,
181 lxc.net.0.type = veth and lxc.net.1.type = veth specify two dif‐
182 ferent networks of the same type. All keys sharing the same in‐
183 dex i will be treated as belonging to the same network. For ex‐
184 ample, lxc.net.0.link = br0 will belong to lxc.net.0.type. Cur‐
185 rently, the different virtualization types can be:
186
187 none: will cause the container to share the host's network name‐
188 space. This means the host network devices are usable in the
189 container. It also means that if both the container and host
190 have upstart as init, 'halt' in a container (for instance) will
191 shut down the host. Note that unprivileged containers do not
192 work with this setting due to an inability to mount sysfs. An
193 unsafe workaround would be to bind mount the host's sysfs.
194
195 empty: will create only the loopback interface.
196
197 veth: a virtual ethernet pair device is created with one side
198 assigned to the container and the other side on the host.
199 lxc.net.[i].veth.mode specifies the mode the veth parent will
200 use on the host. The accepted modes are bridge and router. The
201 mode defaults to bridge if not specified. In bridge mode the
202 host side is attached to a bridge specified by the
203 lxc.net.[i].link option. If the bridge link is not specified,
204 then the veth pair device will be created but not attached to
205 any bridge. Otherwise, the bridge has to be created on the sys‐
206 tem before starting the container. lxc won't handle any config‐
207 uration outside of the container. In router mode static routes
208 are created on the host for the container's IP addresses point‐
209 ing to the host side veth interface. Additionally Proxy ARP and
210 Proxy NDP entries are added on the host side veth interface for
211 the gateway IPs defined in the container to allow the container
to reach the host. By default, lxc chooses a name for the network
device belonging to the outside of the container, but if you wish to
handle this name yourself, you can tell lxc to set a specific name
with the lxc.net.[i].veth.pair option (except for unprivileged
containers, where this option is ignored for security reasons). Static
routes can be added on the host pointing to the container using the
lxc.net.[i].veth.ipv4.route and lxc.net.[i].veth.ipv6.route options.
Several lines specify several routes. The route is in format
x.y.z.t/m, e.g. 192.168.1.0/24.
222
223 vlan: a vlan interface is linked with the interface specified by
224 the lxc.net.[i].link and assigned to the container. The vlan
225 identifier is specified with the option lxc.net.[i].vlan.id.
226
227 macvlan: a macvlan interface is linked with the interface speci‐
228 fied by the lxc.net.[i].link and assigned to the container.
229 lxc.net.[i].macvlan.mode specifies the mode the macvlan will use
230 to communicate between different macvlan on the same upper de‐
231 vice. The accepted modes are private, vepa, bridge and passthru.
232 In private mode, the device never communicates with any other
233 device on the same upper_dev (default). In vepa mode, the new
234 Virtual Ethernet Port Aggregator (VEPA) mode, it assumes that
235 the adjacent bridge returns all frames where both source and
236 destination are local to the macvlan port, i.e. the bridge is
237 set up as a reflective relay. Broadcast frames coming in from
238 the upper_dev get flooded to all macvlan interfaces in VEPA
239 mode, local frames are not delivered locally. In bridge mode, it
240 provides the behavior of a simple bridge between different
241 macvlan interfaces on the same port. Frames from one interface
242 to another one get delivered directly and are not sent out ex‐
243 ternally. Broadcast frames get flooded to all other bridge ports
244 and to the external interface, but when they come back from a
245 reflective relay, we don't deliver them again. Since we know all
246 the MAC addresses, the macvlan bridge mode does not require
247 learning or STP like the bridge module does. In passthru mode,
248 all frames received by the physical interface are forwarded to
249 the macvlan interface. Only one macvlan interface in passthru
250 mode is possible for one physical interface.
251
252 ipvlan: an ipvlan interface is linked with the interface speci‐
253 fied by the lxc.net.[i].link and assigned to the container.
254 lxc.net.[i].ipvlan.mode specifies the mode the ipvlan will use
255 to communicate between different ipvlan on the same upper de‐
256 vice. The accepted modes are l3, l3s and l2. It defaults to l3
257 mode. In l3 mode TX processing up to L3 happens on the stack
258 instance attached to the dependent device and packets are
259 switched to the stack instance of the parent device for the L2
260 processing and routing from that instance will be used before
261 packets are queued on the outbound device. In this mode the de‐
262 pendent devices will not receive nor can send multicast / broad‐
263 cast traffic. In l3s mode TX processing is very similar to the
264 L3 mode except that iptables (conn-tracking) works in this mode
265 and hence it is L3-symmetric (L3s). This will have slightly
266 less performance but that shouldn't matter since you are choos‐
267 ing this mode over plain-L3 mode to make conn-tracking work. In
268 l2 mode TX processing happens on the stack instance attached to
269 the dependent device and packets are switched and queued to the
270 parent device to send devices out. In this mode the dependent
271 devices will RX/TX multicast and broadcast (if applicable) as
272 well. lxc.net.[i].ipvlan.isolation specifies the isolation
273 mode. The accepted isolation values are bridge, private and
274 vepa. It defaults to bridge. In bridge isolation mode depen‐
275 dent devices can cross-talk among themselves apart from talking
276 through the parent device. In private isolation mode the port
277 is set in private mode. i.e. port won't allow cross communica‐
278 tion between dependent devices. In vepa isolation mode the port
279 is set in VEPA mode. i.e. port will offload switching function‐
280 ality to the external entity as described in 802.1Qbg.
281
282 phys: an already existing interface specified by the
283 lxc.net.[i].link is assigned to the container.
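
To make the indexing and the per-type options concrete, here is a
sketch of a container with two interfaces. The host device names br0
and eth0 are assumptions; the bridge must already exist on the host.

```
# first network: veth pair attached to an existing host bridge
lxc.net.0.type = veth
lxc.net.0.veth.mode = bridge
lxc.net.0.link = br0

# second network: macvlan on top of a host interface, bridge mode
lxc.net.1.type = macvlan
lxc.net.1.macvlan.mode = bridge
lxc.net.1.link = eth0
```

Note that the type key comes first for each index, as required above.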
284
285 lxc.net.[i].flags
286 Specify an action to do for the network.
287
288 up: activates the interface.
289
290 lxc.net.[i].link
291 Specify the interface to be used for real network traffic.
292
293 lxc.net.[i].l2proxy
Controls whether layer 2 IP neighbour proxy entries will be
added to the lxc.net.[i].link interface for the IP addresses of
the container. Can be set to 0 or 1. Defaults to 0. When used
with IPv4 addresses, the following sysctl value needs to be set:

net.ipv4.conf.[link].forwarding=1

When used with IPv6 addresses, the following sysctl values need
to be set:

net.ipv6.conf.[link].proxy_ndp=1
net.ipv6.conf.[link].forwarding=1
302
303 lxc.net.[i].mtu
Specify the maximum transmission unit (MTU) for this interface.
305
306 lxc.net.[i].name
307 The interface name is dynamically allocated, but if another name
308 is needed because the configuration files being used by the con‐
309 tainer use a generic name, eg. eth0, this option will rename the
310 interface in the container.
311
312 lxc.net.[i].hwaddr
The interface MAC address is dynamically allocated by default to
the virtual interface, but in some cases a fixed address is needed
to resolve a MAC address conflict or to always have the same
link-local IPv6 address. Any "x" in the address will be replaced by
a random value, which allows setting hwaddr templates.
318
319 lxc.net.[i].ipv4.address
320 Specify the ipv4 address to assign to the virtualized interface.
321 Several lines specify several ipv4 addresses. The address is in
322 format x.y.z.t/m, eg. 192.168.1.123/24.
323
324 lxc.net.[i].ipv4.gateway
325 Specify the ipv4 address to use as the gateway inside the con‐
326 tainer. The address is in format x.y.z.t, eg. 192.168.1.123.
327 Can also have the special value auto, which means to take the
328 primary address from the bridge interface (as specified by the
329 lxc.net.[i].link option) and use that as the gateway. auto is
330 only available when using the veth, macvlan and ipvlan network
331 types. Can also have the special value of dev, which means to
332 set the default gateway as a device route. This is primarily
333 for use with layer 3 network modes, such as IPVLAN.
334
335 lxc.net.[i].ipv6.address
Specify the ipv6 address to assign to the virtualized interface.
Several lines specify several ipv6 addresses. The address is in
format x::y/m, e.g. 2001:db8:1:0:214:1234:fe0b:3596/64.
339
340 lxc.net.[i].ipv6.gateway
Specify the ipv6 address to use as the gateway inside the
container. The address is in format x::y, e.g. 2001:db8:1:0::1.
Can also have the special value auto, which means to take the
primary address from the bridge interface (as specified by the
lxc.net.[i].link option) and use that as the gateway. auto is
only available when using the veth, macvlan and ipvlan network
types. Can also have the special value of dev, which means to
set the default gateway as a device route. This is primarily
for use with layer 3 network modes, such as IPVLAN.
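
Putting the addressing keys together, a bridged veth interface might
be configured as in the sketch below. The bridge name br0, the
addresses, and the hwaddr template prefix are illustrative
assumptions, not recommended values.

```
lxc.net.0.type = veth
lxc.net.0.link = br0
lxc.net.0.flags = up
# name of the interface as seen inside the container
lxc.net.0.name = eth0
# each "x" is replaced by a random value
lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx
lxc.net.0.ipv4.address = 192.168.1.123/24
# use the primary address of br0 as the gateway
lxc.net.0.ipv4.gateway = auto
lxc.net.0.ipv6.address = 2001:db8:1:0:214:1234:fe0b:3596/64
lxc.net.0.ipv6.gateway = auto
```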
350
351 lxc.net.[i].script.up
352 Add a configuration option to specify a script to be executed
353 after creating and configuring the network used from the host
354 side.
355
In addition to the information available to all hooks, the
following information is provided to the script:
358
359 • LXC_HOOK_TYPE: the hook type. This is either 'up' or 'down'.
360
361 • LXC_HOOK_SECTION: the section type 'net'.
362
363 • LXC_NET_TYPE: the network type. This is one of the valid net‐
364 work types listed here (e.g. 'vlan', 'macvlan', 'ipvlan',
365 'veth').
366
• LXC_NET_PARENT: the parent device on the host. This is only
set for network types 'macvlan', 'veth', 'phys'.
369
370 • LXC_NET_PEER: the name of the peer device on the host. This is
371 only set for 'veth' network types. Note that this information
372 is only available when lxc.hook.version is set to 1.
373
374 Whether this information is provided in the form of environment vari‐
375 ables or as arguments to the script depends on the value of
376 lxc.hook.version. If set to 1 then information is provided in the form
377 of environment variables. If set to 0 information is provided as argu‐
378 ments to the script.
379
380 Standard output from the script is logged at debug level. Standard er‐
381 ror is not logged, but can be captured by the hook redirecting its
382 standard error to standard output.
383
384 lxc.net.[i].script.down
385 Add a configuration option to specify a script to be executed
386 before destroying the network used from the host side.
387
In addition to the information available to all hooks, the
following information is provided to the script:
390
391 • LXC_HOOK_TYPE: the hook type. This is either 'up' or 'down'.
392
393 • LXC_HOOK_SECTION: the section type 'net'.
394
395 • LXC_NET_TYPE: the network type. This is one of the valid net‐
396 work types listed here (e.g. 'vlan', 'macvlan', 'ipvlan',
397 'veth').
398
• LXC_NET_PARENT: the parent device on the host. This is only
set for network types 'macvlan', 'veth', 'phys'.
401
402 • LXC_NET_PEER: the name of the peer device on the host. This is
403 only set for 'veth' network types. Note that this information
404 is only available when lxc.hook.version is set to 1.
405
406 Whether this information is provided in the form of environment vari‐
407 ables or as arguments to the script depends on the value of
408 lxc.hook.version. If set to 1 then information is provided in the form
409 of environment variables. If set to 0 information is provided as argu‐
410 ments to the script.
411
412 Standard output from the script is logged at debug level. Standard er‐
413 ror is not logged, but can be captured by the hook redirecting its
414 standard error to standard output.
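
A sketch of how the two network hooks might be wired up is shown
below. The script paths are hypothetical. Setting lxc.hook.version
to 1 means the scripts receive LXC_NET_TYPE, LXC_NET_PARENT and
LXC_NET_PEER as environment variables rather than as arguments.

```
lxc.hook.version = 1

lxc.net.0.type = veth
lxc.net.0.link = br0
# run on the host after the network has been created and configured
lxc.net.0.script.up = /usr/local/bin/lxc-net-up
# run on the host before the network is destroyed
lxc.net.0.script.down = /usr/local/bin/lxc-net-down
```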
415
416 NEW PSEUDO TTY INSTANCE (DEVPTS)
417 For stricter isolation the container can have its own private instance
418 of the pseudo tty.
419
420 lxc.pty.max
421 If set, the container will have a new pseudo tty instance, mak‐
422 ing this private to it. The value specifies the maximum number
423 of pseudo ttys allowed for a pty instance (this limitation is
424 not implemented yet).
425
426 CONTAINER SYSTEM CONSOLE
If the container is configured with a root filesystem and the inittab
file is set up to use the console, you may want to specify where the
output of this console goes.
430
431 lxc.console.buffer.size
432 Setting this option instructs liblxc to allocate an in-memory
433 ringbuffer. The container's console output will be written to
434 the ringbuffer. Note that ringbuffer must be at least as big as
435 a standard page size. When passed a value smaller than a single
436 page size liblxc will allocate a ringbuffer of a single page
437 size. A page size is usually 4KB. The keyword 'auto' will cause
liblxc to allocate a ringbuffer of 128KB. When manually specifying
a size for the ringbuffer the value should be a power of 2 when
converted to bytes. Valid size suffixes are 'KB', 'MB', 'GB'.
(Note that all conversions are based on multiples of 1024. That
means 'KB' == 'KiB', 'MB' == 'MiB', 'GB' == 'GiB'. Additionally,
the case of the suffix is ignored, i.e. 'kB', 'KB' and 'Kb' are
treated equally.)
445
446 lxc.console.size
447 Setting this option instructs liblxc to place a limit on the
448 size of the console log file specified in lxc.console.logfile.
449 Note that size of the log file must be at least as big as a
450 standard page size. When passed a value smaller than a single
451 page size liblxc will set the size of log file to a single page
452 size. A page size is usually 4KB. The keyword 'auto' will cause
liblxc to place a limit of 128KB on the log file. When manually
specifying a size for the log file the value should be a power
of 2 when converted to bytes. Valid size suffixes are 'KB',
'MB', 'GB'. (Note that all conversions are based on multiples of
1024. That means 'KB' == 'KiB', 'MB' == 'MiB', 'GB' == 'GiB'.
Additionally, the case of the suffix is ignored, i.e. 'kB', 'KB'
and 'Kb' are treated equally.) If users want to mirror the
console ringbuffer on disk they should set lxc.console.size equal
to lxc.console.buffer.size.
462
463 lxc.console.logfile
Specify a path to a file where the console output will be
written. Note that in contrast to the on-disk ringbuffer logfile
this file will keep growing, potentially filling up the user's
disk if not rotated and deleted. This problem can also be
avoided by using the in-memory ringbuffer options
lxc.console.buffer.size and lxc.console.buffer.logfile.
470
471 lxc.console.rotate
472 Whether to rotate the console logfile specified in lxc.con‐
473 sole.logfile. Users can send an API request to rotate the log‐
474 file. Note that the old logfile will have the same name as the
475 original with the suffix ".1" appended. Users wishing to pre‐
476 vent the console log file from filling the disk should rotate
477 the logfile and delete it if unneeded. This problem can also be
478 avoided by using the in-memory ringbuffer options lxc.con‐
479 sole.buffer.size and lxc.console.buffer.logfile.
480
481 lxc.console.path
Specify a path to a device to which the console will be
attached. The keyword 'none' will simply disable the console.
Note, when specifying 'none' and creating a device node for the
console in the container at /dev/console or bind-mounting the
host's /dev/console into the container at /dev/console the
container will have direct access to the host's /dev/console.
This is dangerous when the container has write access to the
device and should thus be used with caution.
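
As a sketch of the console options working together, the fragment
below mirrors a 1MB in-memory ringbuffer to a size-limited log file.
The log path is hypothetical, and the value 1 for lxc.console.rotate
is an assumption that the key takes a 0/1 switch.

```
# in-memory ringbuffer; 1MB is a power of 2 when converted to bytes
lxc.console.buffer.size = 1MB

# on-disk log, limited to the same size so it mirrors the ringbuffer
lxc.console.logfile = /var/log/lxc/c1-console.log
lxc.console.size = 1MB

# allow the log file to be rotated (assumed 0/1 value)
lxc.console.rotate = 1
```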
490
491 CONSOLE THROUGH THE TTYS
This option is useful if the container is configured with a root
filesystem and the inittab file is set up to launch a getty on the ttys.
494 The option specifies the number of ttys to be available for the con‐
495 tainer. The number of gettys in the inittab file of the container
496 should not be greater than the number of ttys specified in this option,
497 otherwise the excess getty sessions will die and respawn indefinitely
498 giving annoying messages on the console or in /var/log/messages.
499
500 lxc.tty.max
501 Specify the number of tty to make available to the container.
502
503 CONSOLE DEVICES LOCATION
504 LXC consoles are provided through Unix98 PTYs created on the host and
505 bind-mounted over the expected devices in the container. By default,
506 they are bind-mounted over /dev/console and /dev/ttyN. This can prevent
package upgrades in the guest. Therefore you can specify a directory
location (under /dev) under which LXC will create the files and
bind-mount over them. These will then be symbolically linked to
/dev/console and /dev/ttyN. A package upgrade can then succeed as it is
able to remove and replace the symbolic links.
512
513 lxc.tty.dir
514 Specify a directory under /dev under which to create the con‐
515 tainer console devices. Note that LXC will move any bind-mounts
516 or device nodes for /dev/console into this directory.
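
The tty and pty keys might be combined as in the following sketch.
The counts and the directory name lxc are illustrative assumptions.
With this setting the console devices would be created under
/dev/lxc and symbolically linked from /dev/console and /dev/ttyN.

```
# number of ttys available to the container; the container's inittab
# should not start more gettys than this
lxc.tty.max = 4
# directory under /dev in which the console devices are created
lxc.tty.dir = lxc
# private devpts instance for the container
lxc.pty.max = 1024
```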
517
518 /DEV DIRECTORY
519 By default, lxc creates a few symbolic links (fd,stdin,stdout,stderr)
520 in the container's /dev directory but does not automatically create de‐
521 vice node entries. This allows the container's /dev to be set up as
522 needed in the container rootfs. If lxc.autodev is set to 1, then after
523 mounting the container's rootfs LXC will mount a fresh tmpfs under /dev
524 (limited to 500K by default, unless defined in lxc.autodev.tmpfs.size)
525 and fill in a minimal set of initial devices. This is generally re‐
526 quired when starting a container containing a "systemd" based "init"
but may be optional at other times. Additional devices in the
container's /dev directory may be created through the use of the
lxc.hook.autodev hook.
530
531 lxc.autodev
532 Set this to 0 to stop LXC from mounting and populating a minimal
533 /dev when starting the container.
534
535 lxc.autodev.tmpfs.size
536 Set this to define the size of the /dev tmpfs. The default
537 value is 500000 (500K). If the parameter is used but without
538 value, the default value is used.
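
For example, a container that needs a larger /dev could be configured
as in this sketch. The size value is an assumption: it is written in
the same unit as the documented default of 500000, and the figure
itself is arbitrary.

```
# mount a fresh tmpfs on /dev and populate a minimal set of devices
lxc.autodev = 1
# double the default size of the /dev tmpfs
lxc.autodev.tmpfs.size = 1000000
```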
539
540 MOUNT POINTS
The mount points section specifies the different places to be mounted.
These mount points will be private to the container and won't be
visible to processes running outside of the container. This is useful
to mount /etc, /var or /home, for example.
545
546 NOTE - LXC will generally ensure that mount targets and relative bind-
547 mount sources are properly confined under the container root, to avoid
attacks involving over-mounting host directories and files. (Symbolic
links in absolute mount sources are ignored.) However, if the container
configuration first mounts a directory which is under the control of
the container user, such as /home/joe, into the container at some path,
and then mounts under path, then a TOCTTOU attack would be possible
where the container user modifies a symbolic link under their home
directory at just the right time.
555
556 lxc.mount.fstab
557 specify a file location in the fstab format, containing the
558 mount information. The mount target location can and in most
559 cases should be a relative path, which will become relative to
560 the mounted container root. For instance,
561
562 proc proc proc nodev,noexec,nosuid 0 0
563
564
565 Will mount a proc filesystem under the container's /proc, re‐
566 gardless of where the root filesystem comes from. This is re‐
567 silient to block device backed filesystems as well as container
568 cloning.
569
570 Note that when mounting a filesystem from an image file or block
571 device the third field (fs_vfstype) cannot be auto as with
572 mount(8) but must be explicitly specified.
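
A sketch of how the key and the fstab file relate is shown below.
The path /var/lib/lxc/c1/fstab is a hypothetical location, and the
sysfs line is an added illustration alongside the proc line from the
text.

```
# in the container configuration file
lxc.mount.fstab = /var/lib/lxc/c1/fstab

# contents of /var/lib/lxc/c1/fstab
# (relative targets are resolved under the mounted container root)
proc   proc  proc   nodev,noexec,nosuid  0 0
sysfs  sys   sysfs  defaults             0 0
```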
573
574 lxc.mount.entry
Specify a mount point corresponding to a line in the fstab
format. Moreover lxc supports mount propagation, such as rshared
or rprivate, and adds three additional mount options:

• optional: don't fail if the mount does not work.

• create=dir or create=file: create the directory (or file) when
the point will be mounted.

• relative: the source path is taken to be relative to the
mounted container root.

For instance,
582
583 dev/null proc/kcore none bind,relative 0 0
584
585
586 Will expand dev/null to ${LXC_ROOTFS_MOUNT}/dev/null, and mount
587 it to proc/kcore inside the container.
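
The additional mount options can be combined with ordinary fstab
options. The following sketch bind-mounts two host paths; both host
paths and their targets are hypothetical.

```
# bind-mount a host directory; create the target directory if it is
# missing, and do not fail container start if the mount does not work
lxc.mount.entry = /srv/data srv/data none bind,optional,create=dir 0 0

# bind-mount a single host file read-only, creating the target file
lxc.mount.entry = /etc/resolv.conf etc/resolv.conf none bind,ro,create=file 0 0
```

The targets are written as relative paths so that they resolve under
the mounted container root.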
588
589 lxc.mount.auto
590 specify which standard kernel file systems should be automati‐
591 cally mounted. This may dramatically simplify the configuration.
592 The file systems are:
593
594 • proc:mixed (or proc): mount /proc as read-write, but remount
595 /proc/sys and /proc/sysrq-trigger read-only for security /
596 container isolation purposes.
597
598 • proc:rw: mount /proc as read-write
599
600 • sys:mixed (or sys): mount /sys as read-only but with /sys/de‐
601 vices/virtual/net writable.
602
603 • sys:ro: mount /sys as read-only for security / container iso‐
604 lation purposes.
605
606 • sys:rw: mount /sys as read-write
607
608 • cgroup:mixed: Mount a tmpfs to /sys/fs/cgroup, create directo‐
609 ries for all hierarchies to which the container is added, cre‐
610 ate subdirectories in those hierarchies with the name of the
611 cgroup, and bind-mount the container's own cgroup into that
612 directory. The container will be able to write to its own
613 cgroup directory, but not the parents, since they will be re‐
614 mounted read-only.
615
616 • cgroup:mixed:force: The force option will cause LXC to perform
617 the cgroup mounts for the container under all circumstances.
618 Otherwise it is similar to cgroup:mixed. This is mainly use‐
619 ful when the cgroup namespaces are enabled where LXC will nor‐
620 mally leave mounting cgroups to the init binary of the con‐
621 tainer since it is perfectly safe to do so.
622
623 • cgroup:ro: similar to cgroup:mixed, but everything will be
624 mounted read-only.
625
626 • cgroup:ro:force: The force option will cause LXC to perform
627 the cgroup mounts for the container under all circumstances.
628 Otherwise it is similar to cgroup:ro. This is mainly useful
629 when the cgroup namespaces are enabled where LXC will normally
630 leave mounting cgroups to the init binary of the container
631 since it is perfectly safe to do so.
632
633 • cgroup:rw: similar to cgroup:mixed, but everything will be
634 mounted read-write. Note that the paths leading up to the con‐
635 tainer's own cgroup will be writable, but will not be a cgroup
636 filesystem but just part of the tmpfs of /sys/fs/cgroup
637
638 • cgroup:rw:force: The force option will cause LXC to perform
639 the cgroup mounts for the container under all circumstances.
640 Otherwise it is similar to cgroup:rw. This is mainly useful
641 when the cgroup namespaces are enabled where LXC will normally
642 leave mounting cgroups to the init binary of the container
643 since it is perfectly safe to do so.
644
645 • cgroup (without specifier): defaults to cgroup:rw if the con‐
646 tainer retains the CAP_SYS_ADMIN capability, cgroup:mixed oth‐
647 erwise.
648
649 • cgroup-full:mixed: mount a tmpfs to /sys/fs/cgroup, create di‐
650 rectories for all hierarchies to which the container is added,
651 bind-mount the hierarchies from the host to the container and
652 make everything read-only except the container's own cgroup.
653 Note that compared to cgroup, where all paths leading up to
654 the container's own cgroup are just simple directories in the
655 underlying tmpfs, here /sys/fs/cgroup/$hierarchy will contain
656 the host's full cgroup hierarchy, albeit read-only outside the
657 container's own cgroup. This may leak quite a bit of informa‐
658 tion into the container.
659
660 • cgroup-full:mixed:force: The force option will cause LXC to
661 perform the cgroup mounts for the container under all circum‐
662 stances. Otherwise it is similar to cgroup-full:mixed. This
663 is mainly useful when the cgroup namespaces are enabled where
664 LXC will normally leave mounting cgroups to the init binary of
665 the container since it is perfectly safe to do so.
666
667 • cgroup-full:ro: similar to cgroup-full:mixed, but everything
668 will be mounted read-only.
669
670 • cgroup-full:ro:force: The force option will cause LXC to per‐
671 form the cgroup mounts for the container under all circum‐
672 stances. Otherwise it is similar to cgroup-full:ro. This is
673 mainly useful when the cgroup namespaces are enabled where LXC
674 will normally leave mounting cgroups to the init binary of the
675 container since it is perfectly safe to do so.
676
677 • cgroup-full:rw: similar to cgroup-full:mixed, but everything
678 will be mounted read-write. Note that in this case, the con‐
679 tainer may escape its own cgroup. (Note also that if the con‐
680 tainer has CAP_SYS_ADMIN support and can mount the cgroup
681 filesystem itself, it may do so anyway.)
682
683 • cgroup-full:rw:force: The force option will cause LXC to per‐
684 form the cgroup mounts for the container under all circum‐
685 stances. Otherwise it is similar to cgroup-full:rw. This is
686 mainly useful when the cgroup namespaces are enabled where LXC
687 will normally leave mounting cgroups to the init binary of the
688 container since it is perfectly safe to do so.
689
690 • cgroup-full (without specifier): defaults to cgroup-full:rw if
691 the container retains the CAP_SYS_ADMIN capability,
692 cgroup-full:mixed otherwise.
693
694 If cgroup namespaces are enabled, then any cgroup auto-mounting request
695 will be ignored, since the container can mount the filesystems itself,
696 and automounting can confuse the container init.
697
698 Note that if automatic mounting of the cgroup filesystem is enabled,
699 the tmpfs under /sys/fs/cgroup will always be mounted read-write (but
700 for the :mixed and :ro cases, the individual hierarchies,
701 /sys/fs/cgroup/$hierarchy, will be read-only). This is in order to work
702 around a quirk in Ubuntu's mountall(8) command that will cause contain‐
703 ers to wait for user input at boot if /sys/fs/cgroup is mounted read-
704 only and the container can't remount it read-write due to a lack of
705 CAP_SYS_ADMIN.
706
707 Examples:
708
709 lxc.mount.auto = proc sys cgroup
710 lxc.mount.auto = proc:rw sys:rw cgroup-full:rw
711
712
713 ROOT FILE SYSTEM
714 The root file system of the container can be different than that of the
715 host system.
716
717 lxc.rootfs.path
718 specify the root file system for the container. It can be an im‐
719 age file, a directory or a block device. If not specified, the
720 container shares its root file system with the host.
721
722 For directory or simple block-device backed containers, a pathname
723 can be used. If the rootfs is backed by an nbd device, then
724 nbd:file:1 specifies that file should be attached to an nbd device,
725 and partition 1 should be mounted as the rootfs. nbd:file
726 specifies that the nbd device itself should be mounted. over‐
727 layfs:/lower:/upper specifies that the rootfs should be an over‐
728 lay with /upper being mounted read-write over a read-only mount
729 of /lower. For overlay multiple /lower directories can be spec‐
730 ified. loop:/file tells lxc to attach /file to a loop device and
731 mount the loop device.
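
The forms described above could be written as follows. This is only a
sketch: each line is an alternative, and the directory path in the
first line is a hypothetical placeholder rather than a default.

    # plain directory (hypothetical path)
    lxc.rootfs.path = /var/lib/lxc/c1/rootfs
    # overlay with /upper mounted read-write over a read-only /lower
    lxc.rootfs.path = overlayfs:/lower:/upper
    # attach /file to a loop device and mount it
    lxc.rootfs.path = loop:/file
    # attach file to an nbd device and mount partition 1
    lxc.rootfs.path = nbd:file:1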
732
733 lxc.rootfs.mount
734 where to recursively bind lxc.rootfs.path before pivoting. This
735 is to ensure success of the pivot_root(2) system call. Any directory
736 suffices; the default should generally work.
737
738 lxc.rootfs.options
739 extra mount options to use when mounting the rootfs.
740
741 lxc.rootfs.managed
742 Set this to 0 to indicate that LXC is not managing the container
743 storage; LXC will then not modify the container storage. The
744 default is 1.
745
746 CONTROL GROUPS ("CGROUPS")
747 The control group section contains the configuration for the different
748 subsystems. lxc does not check the correctness of the subsystem name.
749 This has the disadvantage of not detecting configuration errors until
750 the container is started, but has the advantage of permitting any fu‐
751 ture subsystem.
752
753 The kernel implementation of cgroups has changed significantly over the
754 years. Linux 4.5 added support for a new cgroup filesystem, usually
755 referred to as "cgroup2" or the "unified hierarchy". Since then the
756 old cgroup filesystem is usually referred to as "cgroup1" or the
757 "legacy hierarchies". Please see the cgroups manual page for a detailed
758 explanation of the differences between the two versions.
759
760 LXC distinguishes settings for the legacy and the unified hierarchy by
761 using different configuration key prefixes. To alter settings for con‐
762 trollers in a legacy hierarchy the key prefix lxc.cgroup. must be used
763 and in order to alter the settings for a controller in the unified hi‐
764 erarchy the lxc.cgroup2. key must be used. Note that LXC will ignore
765 lxc.cgroup. settings on systems that only use the unified hierarchy.
766 Conversely, it will ignore lxc.cgroup2. options on systems that only
767 use legacy hierarchies.
768
769 At its core a cgroup hierarchy is a way to hierarchically organize pro‐
770 cesses. Usually a cgroup hierarchy will have one or more "controllers"
771 enabled. A "controller" in a cgroup hierarchy is usually responsible
772 for distributing a specific type of system resource along the hierar‐
773 chy. Controllers include the "pids" controller, the "cpu" controller,
774 the "memory" controller and others. Some controllers however do not
775 fall into the category of distributing a system resource, instead they
776 are often referred to as "utility" controllers. One utility controller
777 is the device controller. Instead of distributing a system resource, it
778 manages device access.
779
780 In the legacy hierarchy the device controller was implemented like most
781 other controllers as a set of files that could be written to. These
782 files were named "devices.allow" and "devices.deny". The legacy device
783 controller allowed the implementation of both "allowlists" and
784 "denylists".
785
786 An allowlist is a device program that by default blocks access to all
787 devices. In order to access specific devices "allow rules" for particu‐
788 lar devices or device classes must be specified. In contrast, a
789 denylist is a device program that by default allows access to all de‐
790 vices. In order to restrict access to specific devices "deny rules" for
791 particular devices or device classes must be specified.
792
793 In the unified cgroup hierarchy the implementation of the device
794 controller has completely changed. Instead of files to read from and
795 write to, an eBPF program of type BPF_PROG_TYPE_CGROUP_DEVICE can be
796 attached to a cgroup. Even though the kernel implementation has changed
797 completely, LXC tries to allow for the same semantics to be followed in
798 the legacy device cgroup and the unified eBPF-based device controller.
799 The following paragraphs explain the semantics for the unified
800 eBPF-based device controller.
801
802 As mentioned, the format for specifying device rules for the unified
803 eBPF-based device controller is the same as for the legacy cgroup
804 device controller; only the configuration key prefix has changed.
805 Specifically, device rules for the legacy cgroup device controller are
806 specified via lxc.cgroup.devices.allow and lxc.cgroup.devices.deny
807 whereas for the cgroup2 eBPF-based device controller
808 lxc.cgroup2.devices.allow and lxc.cgroup2.devices.deny must be used.
809
810 • An allowlist device rule
811
812 lxc.cgroup2.devices.deny = a
813
814
815 will cause LXC to instruct the kernel to block access to all devices
816 by default. To grant access to devices allow device rules must be
817 added via the lxc.cgroup2.devices.allow key. This is referred to as an
818 "allowlist" device program.
819
820 • A denylist device rule
821
822 lxc.cgroup2.devices.allow = a
823
824
825 will cause LXC to instruct the kernel to allow access to all devices
826 by default. To deny access to devices deny device rules must be added
827 via the lxc.cgroup2.devices.deny key. This is referred to as a
828 "denylist" device program.
829
830 • Specifying any of the aforementioned two rules will cause all previous
831 rules to be cleared, i.e. the device list will be reset.
832
833 • When an allowlist program is requested, i.e. access to all devices is
834 blocked by default, specific deny rules for individual devices or de‐
835 vice classes are ignored.
836
837 • When a denylist program is requested, i.e. access to all devices is
838 allowed by default, specific allow rules for individual devices or
839 device classes are ignored.
840
841 For example the set of rules:
842
843 lxc.cgroup2.devices.deny = a
844 lxc.cgroup2.devices.allow = c *:* m
845 lxc.cgroup2.devices.allow = b *:* m
846 lxc.cgroup2.devices.allow = c 1:3 rwm
847
848
849 implements an allowlist device program, i.e. the kernel will block ac‐
850 cess to all devices not specifically allowed in this list. This partic‐
851 ular program states that all character and block devices may be created
852 but only /dev/null may be read or written.
853
854 If we instead switch to the following set of rules:
855
856 lxc.cgroup2.devices.allow = a
857 lxc.cgroup2.devices.deny = c *:* m
858 lxc.cgroup2.devices.deny = b *:* m
859 lxc.cgroup2.devices.deny = c 1:3 rwm
860
861
862 then LXC would instruct the kernel to implement a denylist, i.e. the
863 kernel will allow access to all devices not specifically denied in this
864 list. This particular program states that no character devices or block
865 devices may be created and that /dev/null is not allowed to be
866 read, written, or created.
867
868 Now consider the same program but followed by a "global rule" which de‐
869 termines the type of device program (allowlist or denylist) as ex‐
870 plained above:
871
872 lxc.cgroup2.devices.allow = a
873 lxc.cgroup2.devices.deny = c *:* m
874 lxc.cgroup2.devices.deny = b *:* m
875 lxc.cgroup2.devices.deny = c 1:3 rwm
876 lxc.cgroup2.devices.allow = a
877
878
879 The last line will cause LXC to reset the device list without changing
880 the type of device program.
881
882 If we specify:
883
884 lxc.cgroup2.devices.allow = a
885 lxc.cgroup2.devices.deny = c *:* m
886 lxc.cgroup2.devices.deny = b *:* m
887 lxc.cgroup2.devices.deny = c 1:3 rwm
888 lxc.cgroup2.devices.deny = a
889
890
891 instead, then the last line will cause LXC to reset the device list and
892 switch from a denylist program to an allowlist program.
893
894 lxc.cgroup.[controller name].[controller file]
895 Specify the control group value to be set on a legacy cgroup hi‐
896 erarchy. The controller name is the literal name of the control
897 group. The permitted names and the syntax of their values are not
898 dictated by LXC; instead they depend on the features of the Linux
899 kernel running at the time the container is started, e.g.
900 lxc.cgroup.cpuset.cpus
901
902 lxc.cgroup2.[controller name].[controller file]
903 Specify the control group value to be set on the unified cgroup
904 hierarchy. The controller name is the literal name of the control
905 group. The permitted names and the syntax of their values are
906 not dictated by LXC; instead they depend on the features of
907 the Linux kernel running at the time the container is started,
908 e.g. lxc.cgroup2.memory.high
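
As a minimal sketch of the two key prefixes, the following lines set
one value on a legacy hierarchy and one on the unified hierarchy. The
values are illustrative assumptions only; which files exist and which
values they accept depends on the running kernel.

    # legacy hierarchy (cgroup1): restrict the container to CPUs 0 and 1
    lxc.cgroup.cpuset.cpus = 0,1
    # unified hierarchy (cgroup2): memory.high in bytes (512 MiB here)
    lxc.cgroup2.memory.high = 536870912

On a system that only uses one kind of hierarchy, the line for the
other kind is ignored, as explained above.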
909
910 lxc.cgroup.dir
911 specify a directory or path in which the container's cgroup will
912 be created. For example, setting lxc.cgroup.dir =
913 my-cgroup/first for a container named "c1" will create the con‐
914 tainer's cgroup as a sub-cgroup of "my-cgroup". For example, if
915 the user's current cgroup "my-user" is located in the root
916 cgroup of the cpuset controller in a cgroup v1 hierarchy this
917 would create the cgroup "/sys/fs/cgroup/cpuset/my-user/my-
918 cgroup/first/c1" for the container. Any missing cgroups will be
919 created by LXC. This presupposes that the user has write access
920 to its current cgroup.
921
922 lxc.cgroup.relative
923 Set this to 1 to instruct LXC to never escape to the root
924 cgroup. This makes it easy for users to adhere to restrictions
925 enforced by cgroup2 and systemd. Specifically, this makes it
926 possible to run LXC containers as systemd services.
927
928 CAPABILITIES
929 Capabilities can be dropped in the container if the container is run as
930 root.
931
932 lxc.cap.drop
933 Specify the capability to be dropped in the container. A single
934 line defining several capabilities with a space separation is
935 allowed. The format is the lower case of the capability definition
936 without the "CAP_" prefix, e.g. CAP_SYS_MODULE should be
937 specified as sys_module. See capabilities(7). If used with no
938 value, lxc will clear any drop capabilities specified up to this
939 point.
940
941 lxc.cap.keep
942 Specify the capability to be kept in the container. All other
943 capabilities will be dropped. When a special value of "none" is
944 encountered, lxc will clear any keep capabilities specified up
945 to this point. A value of "none" alone can be used to drop all
946 capabilities.
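
The two keys might be used as in the following sketch. The capability
names are chosen for illustration only, and the two keys are shown as
alternatives on the assumption that a configuration uses one approach
or the other, not both.

    # drop a few capabilities, keep everything else
    lxc.cap.drop = sys_module sys_time
    lxc.cap.drop = mac_admin mac_override

    # alternatively: keep only the listed capabilities, drop all others
    lxc.cap.keep = chown setuid setgid net_bind_service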
947
948 NAMESPACES
949 A namespace can be cloned (lxc.namespace.clone), kept (lxc.name‐
950 space.keep) or shared (lxc.namespace.share.[namespace identifier]).
951
952 lxc.namespace.clone
953 Specify namespaces which the container is supposed to be created
954 with. The namespaces to create are specified as a space sepa‐
955 rated list. Each namespace must correspond to one of the stan‐
956 dard namespace identifiers as seen in the /proc/PID/ns direc‐
957 tory. When lxc.namespace.clone is not explicitly set all name‐
958 spaces supported by the kernel and the current configuration
959 will be used.
960
961 To create a new mount, net and ipc namespace set lxc.name‐
962 space.clone=mount net ipc.
963
964 lxc.namespace.keep
965 Specify namespaces which the container is supposed to inherit
966 from the process that created it. The namespaces to keep are
967 specified as a space separated list. Each namespace must corre‐
968 spond to one of the standard namespace identifiers as seen in
969 the /proc/PID/ns directory. The lxc.namespace.keep is a
970 denylist option, i.e. it is useful when enforcing that contain‐
971 ers must keep a specific set of namespaces.
972
973 To keep the network, user and ipc namespace set lxc.name‐
974 space.keep=user net ipc.
975
976 Note that sharing pid namespaces will likely not work with most
977 init systems.
978
979 Note that if the container requests a new user namespace and the
980 container wants to inherit the network namespace it needs to in‐
981 herit the user namespace as well.
982
983 lxc.namespace.share.[namespace identifier]
984 Specify a namespace to inherit from another container or
985 process. The [namespace identifier] suffix needs to be replaced
986 with one of the namespaces that appear in the /proc/PID/ns di‐
987 rectory.
988
989 To inherit the namespace from another process set the lxc.name‐
990 space.share.[namespace identifier] to the PID of the process,
991 e.g. lxc.namespace.share.net=42.
992
993 To inherit the namespace from another container set the
994 lxc.namespace.share.[namespace identifier] to the name of the
995 container, e.g. lxc.namespace.share.pid=c3.
996
997 To inherit the namespace from another container located in a
998 different path than the standard liblxc path set the lxc.name‐
999 space.share.[namespace identifier] to the full path to the con‐
1000 tainer, e.g. lxc.namespace.share.user=/opt/c3.
1001
1002 In order to inherit namespaces the caller needs to have suffi‐
1003 cient privilege over the process or container.
1004
1005 Note that sharing pid namespaces between system containers will
1006 likely not work with most init systems.
1007
1008 Note that if two processes are in different user namespaces and
1009 one process wants to inherit the other's network namespace it
1010 usually needs to inherit the user namespace as well.
1011
1012 Note that without careful additional configuration of an LSM,
1013 sharing user+pid namespaces with a task may allow that task to
1014 escalate privileges to that of the task calling liblxc.
1015
1016 RESOURCE LIMITS
1017 The soft and hard resource limits for the container can be changed.
1018 Unprivileged containers can only lower them. Resources which are not
1019 explicitly specified will be inherited.
1020
1021 lxc.prlimit.[limit name]
1022 Specify the resource limit to be set. A limit is specified as
1023 two colon separated values which are either numeric or the word
1024 'unlimited'. A single value can be used as a shortcut to set
1025 both soft and hard limit to the same value. The permitted names are
1026 the "RLIMIT_" resource names in lowercase without the "RLIMIT_"
1027 prefix, e.g. RLIMIT_NOFILE should be specified as "nofile". See
1028 setrlimit(2). If used with no value, lxc will clear the re‐
1029 source limit specified up to this point. A resource with no ex‐
1030 plicitly configured limitation will be inherited from the
1031 process starting up the container.
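
A short sketch of the value formats described above; the numbers are
illustrative assumptions, not recommended settings.

    # soft limit 1024, hard limit 4096
    lxc.prlimit.nofile = 1024:4096
    # a single value sets both the soft and the hard limit
    lxc.prlimit.nproc = 512
    # the word 'unlimited' is accepted in place of a number
    lxc.prlimit.core = unlimited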
1032
1033 SYSCTL
1034 Configure kernel parameters for the container.
1035
1036 lxc.sysctl.[kernel parameters name]
1037 Specify the kernel parameters to be set. The parameters avail‐
1038 able are those listed under /proc/sys/. Note that not all
1039 sysctls are namespaced. Changing non-namespaced sysctls will
1040 cause the system-wide setting to be modified. See sysctl(8). If
1041 used with no value, lxc will clear the parameters specified up
1042 to this point.
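
For illustration, the key name is the path below /proc/sys/ with dots
as separators. The two parameters below are only examples, picked
because they belong to the network and IPC namespaces on current
kernels; the values are assumptions.

    # corresponds to /proc/sys/net/ipv4/ip_forward
    lxc.sysctl.net.ipv4.ip_forward = 1
    # corresponds to /proc/sys/kernel/msgmax
    lxc.sysctl.kernel.msgmax = 65536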
1043
1044 APPARMOR PROFILE
1045 If lxc was compiled and installed with apparmor support, and the host
1046 system has apparmor enabled, then the apparmor profile under which the
1047 container should be run can be specified in the container configura‐
1048 tion. The default is lxc-container-default-cgns if the host kernel is
1049 cgroup namespace aware, or lxc-container-default otherwise.
1050
1051 lxc.apparmor.profile
1052 Specify the apparmor profile under which the container should be
1053 run. To specify that the container should be unconfined, use
1054
1055 lxc.apparmor.profile = unconfined
1056
1057 If the apparmor profile should remain unchanged (i.e. if you are
1058 nesting containers and are already confined), then use
1059
1060 lxc.apparmor.profile = unchanged
1061
1062 To instruct LXC to generate the apparmor profile, use
1063
1064 lxc.apparmor.profile = generated
1065
1066 lxc.apparmor.allow_incomplete
1067 Apparmor profiles are pathname based. Therefore many file re‐
1068 strictions require mount restrictions to be effective against a
1069 determined attacker. However, these mount restrictions are not
1070 yet implemented in the upstream kernel. Without the mount
1071 restrictions, the apparmor profiles still protect against
1072 accidental damage.
1073
1074 If this flag is 0 (default), then the container will not be
1075 started if the kernel lacks the apparmor mount features, so that
1076 a regression after a kernel upgrade will be detected. To start
1077 the container under partial apparmor protection, set this flag
1078 to 1.
1079
1080 lxc.apparmor.allow_nesting
1081 If this is set to 1, the following changes apply. When generated
1082 apparmor profiles are used, they will contain the necessary
1083 changes to allow creating a nested container. In addition to the
1084 usual mount points, /dev/.lxc/proc and /dev/.lxc/sys will con‐
1085 tain procfs and sysfs mount points without the lxcfs overlays,
1086 which, if generated apparmor profiles are being used, will not
1087 be read/writable directly.
1088
1089 lxc.apparmor.raw
1090 A list of raw AppArmor profile lines to append to the profile.
1091 Only valid when using generated profiles.
1092
1093 SELINUX CONTEXT
1094 If lxc was compiled and installed with SELinux support, and the host
1095 system has SELinux enabled, then the SELinux context under which the
1096 container should be run can be specified in the container configura‐
1097 tion. The default is unconfined_t, which means that lxc will not at‐
1098 tempt to change contexts. See /usr/share/lxc/selinux/lxc.te for an ex‐
1099 ample policy and more information.
1100
1101 lxc.selinux.context
1102 Specify the SELinux context under which the container should be
1103 run or unconfined_t. For example
1104
1105 lxc.selinux.context = system_u:system_r:lxc_t:s0:c22
1106
1107 lxc.selinux.context.keyring
1108 Specify the SELinux context under which the container's keyring
1109 should be created. By default this is the same as
1110 lxc.selinux.context, or the context lxc is executed under if
1111 lxc.selinux.context has not been set.
1112
1113 lxc.selinux.context.keyring = system_u:system_r:lxc_t:s0:c22
1114
1115 KERNEL KEYRING
1116 The Linux Keyring facility is primarily a way for various kernel compo‐
1117 nents to retain or cache security data, authentication keys, encryption
1118 keys, and other data in the kernel. By default lxc will create a new
1119 session keyring for the started application.
1120
1121 lxc.keyring.session
1122 Disable the creation of new session keyring by lxc. The started
1123 application will then inherit the current session keyring. By
1124 default, or when passing the value 1, a new keyring will be cre‐
1125 ated.
1126
1127 lxc.keyring.session = 0
1128
1129 SECCOMP CONFIGURATION
1130 A container can be started with a reduced set of available system calls
1131 by loading a seccomp profile at startup. The seccomp configuration file
1132 must begin with a version number on the first line, a policy type on
1133 the second line, followed by the configuration.
1134
1135 Versions 1 and 2 are currently supported. In version 1, the policy is a
1136 simple allowlist. The second line therefore must read "allowlist", with
1137 the rest of the file containing one (numeric) syscall number per line.
1138 Each syscall number is allowlisted, while every unlisted number is
1139 denylisted for use in the container.
1140
1141 In version 2, the policy may be denylist or allowlist, supports per-
1142 rule and per-policy default actions, and supports per-architecture sys‐
1143 tem call resolution from textual names.
1144
1145 An example denylist policy, in which all system calls are allowed
1146 except for mknod, which will simply do nothing and return 0 (success),
1147 and ioctl, which is forwarded to a seccomp listener, looks like:
1148
1149 2
1150 denylist
1151 mknod errno 0
1152 ioctl notify
1153
1154
1155 Specifying "errno" as action will cause LXC to register a seccomp fil‐
1156 ter that will cause a specific errno to be returned to the caller. The
1157 errno value can be specified after the "errno" action word.
1158
1159 Specifying "notify" as action will cause LXC to register a seccomp lis‐
1160 tener and retrieve a listener file descriptor from the kernel. When a
1161 syscall is made that is registered as "notify" the kernel will generate
1162 a poll event and send a message over the file descriptor. The caller
1163 can read this message and inspect the syscall, including its arguments.
1164 Based on this information the caller is expected to send back a message
1165 informing the kernel which action to take. Until that message is sent
1166 the kernel will block the calling process. The format of the messages
1167 to read and send is documented in seccomp itself.
1168
1169 lxc.seccomp.profile
1170 Specify a file containing the seccomp configuration to load be‐
1171 fore the container starts.
1172
1173 lxc.seccomp.allow_nesting
1174 If this flag is set to 1, then seccomp filters will be stacked
1175 regardless of whether a seccomp profile is already loaded. This
1176 allows nested containers to load their own seccomp profile. The
1177 default setting is 0.
1178
1179 lxc.seccomp.notify.proxy
1180 Specify a unix socket to which LXC will connect and forward sec‐
1181 comp events to. The path must be in the form
1182 unix:/path/to/socket or unix:@socket. The former specifies a
1183 path-bound unix domain socket while the latter specifies an ab‐
1184 stract unix domain socket.
1185
1186 lxc.seccomp.notify.cookie
1187 An additional string sent along with proxied seccomp notifica‐
1188 tion requests.
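
Taken together, the seccomp keys might be combined as in the following
sketch. The profile path and the cookie string are hypothetical
placeholders; the socket path uses the form given above.

    # hypothetical location of the profile file
    lxc.seccomp.profile = /path/to/seccomp.profile
    # forward "notify" events to a path-bound unix domain socket
    lxc.seccomp.notify.proxy = unix:/path/to/socket
    # hypothetical string sent along with each proxied request
    lxc.seccomp.notify.cookie = c1-cookie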
1189
1190 PR_SET_NO_NEW_PRIVS
1191 With PR_SET_NO_NEW_PRIVS active execve() promises not to grant privi‐
1192 leges to do anything that could not have been done without the execve()
1193 call (for example, rendering the set-user-ID and set-group-ID mode
1194 bits, and file capabilities non-functional). Once set, this bit cannot
1195 be unset. The setting of this bit is inherited by children created by
1196 fork() and clone(), and preserved across execve(). Note that
1197 PR_SET_NO_NEW_PRIVS is applied after the container has changed into its
1198 intended AppArmor profile or SELinux context.
1199
1200 lxc.no_new_privs
1201 Specify whether the PR_SET_NO_NEW_PRIVS flag should be set for
1202 the container. Set to 1 to activate.
1203
1204 UID MAPPINGS
1205 A container can be started in a private user namespace with user and
1206 group id mappings. For instance, you can map userid 0 in the container
1207 to userid 200000 on the host. The root user in the container will be
1208 privileged in the container, but unprivileged on the host. Normally a
1209 system container will want a range of ids, so you would map, for in‐
1210 stance, user and group ids 0 through 20,000 in the container to the ids
1211 200,000 through 220,000.
1212
1213 lxc.idmap
1214 Four values must be provided. First a character, either 'u', or
1215 'g', to specify whether user or group ids are being mapped. Next
1216 is the first userid as seen in the user namespace of the container.
1217 Next is the first userid as seen on the host. Finally, a range
1218 indicating the number of consecutive ids to map.
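
As a sketch of the mapping described at the start of this section,
assuming a range of 20000 consecutive ids is wanted:

    # container uids 0..19999 appear as host uids 200000..219999
    lxc.idmap = u 0 200000 20000
    # container gids 0..19999 appear as host gids 200000..219999
    lxc.idmap = g 0 200000 20000

With these lines, an id N inside the container (0 <= N < 20000)
corresponds to id 200000 + N on the host, so uid 0 in the container is
uid 200000 on the host.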
1219
1220 CONTAINER HOOKS
1221 Container hooks are programs or scripts which can be executed at vari‐
1222 ous times in a container's lifetime.
1223
1224 When a container hook is executed, additional information is passed
1225 along. The lxc.hook.version argument can be used to determine if the
1226 following arguments are passed as command line arguments or through en‐
1227 vironment variables. The arguments are:
1228
1229 • Container name.
1230
1231 • Section (always 'lxc').
1232
1233 • The hook type (e.g. 'clone' or 'pre-mount').
1234
1235 • Additional arguments. In the case of the clone hook, any extra argu‐
1236 ments passed will appear as further arguments to the hook. In the
1237 case of the stop hook, paths to file descriptors for each of the
1238 container's namespaces along with their types are passed.
1239
1240 The following environment variables are set:
1241
1242 • LXC_CGNS_AWARE: indicator whether the container is cgroup namespace
1243 aware.
1244
1245 • LXC_CONFIG_FILE: the path to the container configuration file.
1246
1247 • LXC_HOOK_TYPE: the hook type (e.g. 'clone', 'mount', 'pre-mount').
1248 Note that the existence of this environment variable is conditional
1249 on the value of lxc.hook.version. If it is set to 1 then
1250 LXC_HOOK_TYPE will be set.
1251
1252 • LXC_HOOK_SECTION: the section type (e.g. 'lxc', 'net'). Note that the
1253 existence of this environment variable is conditional on the value of
1254 lxc.hook.version. If it is set to 1 then LXC_HOOK_SECTION will be
1255 set.
1256
1257 • LXC_HOOK_VERSION: the version of the hooks. This value is identical
1258 to the value of the container's lxc.hook.version config item. If it
1259 is set to 0 then old-style hooks are used. If it is set to 1 then
1260 new-style hooks are used.
1261
1262 • LXC_LOG_LEVEL: the container's log level.
1263
1264 • LXC_NAME: the container's name.
1265
1266 • LXC_[NAMESPACE IDENTIFIER]_NS: path under /proc/PID/fd/ to a file de‐
1267 scriptor referring to the container's namespace. For each preserved
1268 namespace type there will be a separate environment variable. These
1269 environment variables will only be set if lxc.hook.version is set to
1270 1.
1271
1272 • LXC_ROOTFS_MOUNT: the path to the mounted root filesystem.
1273
1274 • LXC_ROOTFS_PATH: this is the lxc.rootfs.path entry for the container.
1275 Note this is likely not where the mounted rootfs is to be found, use
1276 LXC_ROOTFS_MOUNT for that.
1277
1278 • LXC_SRC_NAME: in the case of the clone hook, this is the original
1279 container's name.
1280
1281 Standard output from the hooks is logged at debug level. Standard er‐
1282 ror is not logged, but can be captured by the hook redirecting its
1283 standard error to standard output.
1284
1285 lxc.hook.version
1286 Set to 1 to pass the arguments in new style via environment
1287 variables, or to 0 to pass them as arguments. This setting
1288 affects all hook arguments that were traditionally passed as
1289 arguments to the script. Specifically, it affects the container
1290 name, section (e.g. 'lxc', 'net') and hook type (e.g. 'clone',
1291 'mount', 'pre-mount') arguments. If new-style hooks are used
1292 then the arguments will be available as environment variables.
1293 The container name will be set in LXC_NAME. (This is set inde‐
1294 pendently of the value used for this config item.) The section
1295 will be set in LXC_HOOK_SECTION and the hook type will be set in
1296 LXC_HOOK_TYPE. It also affects how the paths to file descrip‐
1297 tors referring to the container's namespaces are passed. If set
1298 to 1 then for each namespace a separate environment variable
1299 LXC_[NAMESPACE IDENTIFIER]_NS will be set. If set to 0 then the
1300 paths will be passed as arguments to the stop hook.
1301
1302 lxc.hook.pre-start
1303 A hook to be run in the host's namespace before the container
1304 ttys, consoles, or mounts are up.
1305
1306 lxc.hook.pre-mount
1307 A hook to be run in the container's fs namespace but before the
1308 rootfs has been set up. This allows for manipulation of the
1309 rootfs, e.g. to mount an encrypted filesystem. Mounts done in
1310 this hook will not be reflected on the host (apart from mount
1311 propagation), so they will be automatically cleaned up when the
1312 container shuts down.
1313
1314 lxc.hook.mount
1315 A hook to be run in the container's namespace after mounting has
1316 been done, but before the pivot_root.
1317
1318 lxc.hook.autodev
1319 A hook to be run in the container's namespace after mounting has
1320 been done and after any mount hooks have run, but before the
1321 pivot_root, if lxc.autodev == 1. The purpose of this hook is to
1322 assist in populating the /dev directory of the container when
1323 using the autodev option for systemd based containers. The con‐
1324 tainer's /dev directory is relative to the ${LXC_ROOTFS_MOUNT}
1325 environment variable available when the hook is run.
1326
1327 lxc.hook.start-host
1328 A hook to be run in the host's namespace after the container has
1329 been set up, and immediately before starting the container init.
1330
1331 lxc.hook.start
1332 A hook to be run in the container's namespace immediately before
1333 executing the container's init. This requires the program to be
1334 available in the container.
1335
1336 lxc.hook.stop
1337 A hook to be run in the host's namespace with references to the
1338 container's namespaces after the container has been shut down.
1339 For each namespace an extra argument is passed to the hook con‐
1340 taining the namespace's type and a filename that can be used to
1341 obtain a file descriptor to the corresponding namespace, sepa‐
1342 rated by a colon. The type is the name as it would appear in the
1343 /proc/PID/ns directory. For instance for the mount namespace
1344 the argument usually looks like mnt:/proc/PID/fd/12.
1345
1346 lxc.hook.post-stop
1347 A hook to be run in the host's namespace after the container has
1348 been shut down.
1349
1350 lxc.hook.clone
1351 A hook to be run when the container is cloned to a new one. See
1352 lxc-clone(1) for more information.
1353
1354 lxc.hook.destroy
1355 A hook to be run when the container is destroyed.
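
A configuration using several of these hooks might look like the
following sketch. All script paths are hypothetical placeholders; the
scripts themselves are not provided by LXC.

    # pass container name, section and hook type via environment variables
    lxc.hook.version = 1
    # runs in the host's namespace before ttys, consoles or mounts are up
    lxc.hook.pre-start = /path/to/pre-start.sh
    # runs in the container's namespace after mounting, before pivot_root
    lxc.hook.mount = /path/to/mount.sh
    # runs in the host's namespace after the container has been shut down
    lxc.hook.post-stop = /path/to/post-stop.sh

With lxc.hook.version set to 1, each script can read LXC_NAME,
LXC_HOOK_SECTION and LXC_HOOK_TYPE from its environment instead of
from its command line arguments.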
1356
1357 CONTAINER HOOKS ENVIRONMENT VARIABLES
1358 A number of environment variables are made available to the startup
1359 hooks to provide configuration information and assist in the function‐
1360 ing of the hooks. Not all variables are valid in all contexts. In par‐
1361 ticular, all paths are relative to the host system and, as such, not
1362 valid during the lxc.hook.start hook.
1363
1364 LXC_NAME
1365 The LXC name of the container. Useful for logging messages in
1366 common log environments. [-n]
1367
1368 LXC_CONFIG_FILE
1369 Host relative path to the container configuration file. This
1370 lets a hook reference the original, top-level configuration
1371 file for the container in order to locate any additional
1372 configuration information not otherwise made available.
1373 [-f]
1374
1375 LXC_CONSOLE
1376 The path to the console output of the container if not NULL.
1377 [-c] [lxc.console.path]
1378
1379 LXC_CONSOLE_LOGPATH
1380 The path to the console log output of the container if not NULL.
1381 [-L]
1382
1383 LXC_ROOTFS_MOUNT
1384 The mount location to which the container is initially bound.
1385 This will be the host relative path to the container rootfs for
1386 the container instance being started and is where changes should
1387 be made for that instance. [lxc.rootfs.mount]
1388
1389 LXC_ROOTFS_PATH
1390 The host relative path to the container root which has been
1391 mounted to the rootfs.mount location. [lxc.rootfs.path]
1392
1393 LXC_SRC_NAME
1394 Set only for the clone hook, to the original container name.
1395
1396 LXC_TARGET
1397 Set only for the stop hook: "stop" for a container shutdown
1398 or "reboot" for a container reboot.
1399
1400 LXC_CGNS_AWARE
1401 If unset, then this version of lxc is not aware of cgroup name‐
1402 spaces. If set, it will be set to 1, and lxc is aware of cgroup
1403 namespaces. Note this does not guarantee that cgroup namespaces
1404 are enabled in the kernel. This is used by the lxcfs mount hook.
1405
1406 LOGGING
1407 Logging can be configured on a per-container basis. By default, depend‐
1408 ing upon how the lxc package was compiled, container startup is logged
1409 only at the ERROR level, and logged to a file named after the container
1410 (with '.log' appended) either under the container path, or under
1411 /var/log/lxc.
1412
1413 Both the default log level and the log file can be specified in the
1414 container configuration file, overriding the default behavior. Note
1415 that the configuration file entries can in turn be overridden by the
1416 command line options to lxc-start.
1417
1418 lxc.log.level
1419 The level at which to log. The log level is an integer in the
1420 range of 0..8 inclusive, where a lower number means more verbose
1421 debugging. In particular 0 = trace, 1 = debug, 2 = info, 3 = no‐
1422 tice, 4 = warn, 5 = error, 6 = critical, 7 = alert, and 8 = fa‐
1423 tal. If unspecified, the level defaults to 5 (error), so that
1424 only errors and above are logged.
1425
1426 Note that when a script (such as either a hook script or a net‐
1427 work interface up or down script) is called, the script's stan‐
1428 dard output is logged at level 1, debug.
1429
1430 lxc.log.file
1431 The file to which logging info should be written.
1432
1433 lxc.log.syslog
1434 Send logging info to syslog. It respects the log level defined
1435 in lxc.log.level. The argument should be the syslog facility to
1436 use. Valid facilities are: daemon, local0, local1, local2,
1437 local3, local4, local5, local6, local7.
1438
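As a sketch of the logging keys described above, the fragment below raises verbosity to debug, so that the standard output of hook scripts (logged at level 1) is captured, and writes it to a per-container file. The container name web01 and the file path are assumptions chosen for illustration.

```
# 1 = debug; also captures the standard output of hook and network scripts.
lxc.log.level = 1
# Hypothetical log location for a container named web01.
lxc.log.file = /var/log/lxc/web01.log
# Send log messages to syslog using the daemon facility.
lxc.log.syslog = daemon
```

A level or log file given on the lxc-start command line would take precedence over these entries.
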
1439 AUTOSTART
1440 The autostart options support marking which containers should be auto-
1441 started and in what order. These options may be used by LXC tools di‐
1442 rectly or by external tooling provided by the distributions.
1443
1444 lxc.start.auto
1445 Whether the container should be auto-started. Valid values are
1446 0 (off) and 1 (on).
1447
1448 lxc.start.delay
1449 How long to wait (in seconds) after the container is started be‐
1450 fore starting the next one.
1451
1452 lxc.start.order
1453 An integer used to sort the containers when auto-starting a se‐
1454 ries of containers at once. A lower value means an earlier
1455 start.
1456
1457 lxc.monitor.unshare
1458 If not zero the mount namespace will be unshared from the host
1459 before initializing the container (before running any pre-start
1460 hooks). This requires the CAP_SYS_ADMIN capability at startup.
1461 Default is 0.
1462
1463 lxc.monitor.signal.pdeath
1464 Set the signal to be sent to the container's init when the lxc
1465 monitor exits. By default it is set to SIGKILL which will cause
1466 all container processes to be killed when the lxc monitor
1467 process dies. To ensure that containers stay alive even if the lxc
1468 monitor dies, set this to 0.
1469
1470 lxc.group
1471 A multi-value key (can be used multiple times) to put the con‐
1472 tainer in a container group. Those groups can then be used
1473 (amongst other things) to start a series of related containers.
1474
1475 AUTOSTART AND SYSTEM BOOT
1476 Each container can be part of any number of groups or no group at all.
1477 Two groups are special. One is the NULL group, i.e. the container does
1478 not belong to any group. The other group is the "onboot" group.
1479
1480 When the system boots with the LXC service enabled, it will first
1481 attempt to boot any containers with lxc.start.auto == 1 that are
1482 members of the "onboot" group. The startup will be in order of
1483 lxc.start.order. If an lxc.start.delay has been specified, that delay
1484 will be honored before attempting to start the next container, to give
1485 the current container time to begin initialization and to reduce
1486 overloading of the host system. After starting the members of the
1487 "onboot" group, the LXC system will proceed to boot containers with
1488 lxc.start.auto == 1 which are not members of any group (the NULL
1489 group) and proceed as with the "onboot" group.
1490
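The boot ordering described above can be sketched with two hypothetical containers, db01 and web01. The names and values are assumptions for illustration, and each part belongs in that container's own configuration file.

```
# --- configuration of db01 ---
# Member of the "onboot" group, so it is started in the first pass.
lxc.start.auto = 1
lxc.start.order = 10
# Wait 15 seconds after starting db01 before starting the next container.
lxc.start.delay = 15
lxc.group = onboot

# --- configuration of web01 ---
# Auto-started but in no group (the NULL group), so it is started
# only after the members of the "onboot" group.
lxc.start.auto = 1
lxc.start.order = 20
```
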
1491 CONTAINER ENVIRONMENT
1492 If you want to pass environment variables into the container (that is,
1493 environment variables which will be available to init and all of its
1494 descendants), you can use lxc.environment parameters to do so. Be
1495 careful not to pass in anything sensitive: any process in the
1496 container which doesn't have its environment scrubbed will have these
1497 variables available to it, and environment variables are always
1498 available via /proc/PID/environ.
1499
1500 This configuration parameter can be specified multiple times; once for
1501 each environment variable you wish to configure.
1502
1503 lxc.environment
1504 Specify an environment variable to pass into the container. Ex‐
1505 ample:
1506
1507 lxc.environment = APP_ENV=production
1508 lxc.environment = SYSLOG_SERVER=192.0.2.42
1509
1510
1511 It is possible to inherit host environment variables by setting
1512 the name of the variable without a "=" sign. For example:
1513
1514 lxc.environment = PATH
1515
1516
1518 In addition to the few examples given below, you will find other
1519 example configuration files in /usr/share/doc/lxc/examples.
1520
1521 NETWORK
1522 This configuration sets up a container to use a veth pair device with
1523 one side plugged to a bridge br0 (which has been configured before on
1524 the system by the administrator). The virtual network device visible in
1525 the container is renamed to eth0.
1526
1527 lxc.uts.name = myhostname
1528 lxc.net.0.type = veth
1529 lxc.net.0.flags = up
1530 lxc.net.0.link = br0
1531 lxc.net.0.name = eth0
1532 lxc.net.0.hwaddr = 4a:49:43:49:79:bf
1533 lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
1534 lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
1535
1536
1537 UID/GID MAPPING
1538 This configuration will map both user and group ids in the range 0-9999
1539 in the container to the ids 100000-109999 on the host.
1540
1541 lxc.idmap = u 0 100000 10000
1542 lxc.idmap = g 0 100000 10000
1543
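A mapping does not have to be a single contiguous range. The following sketch maps container IDs 0-9999 as above, except that container ID 1000 is passed straight through to host ID 1000. It assumes that the user starting the container is permitted to map host ID 1000 (for example root, or an unprivileged user with matching /etc/subuid and /etc/subgid entries).

```
# Container IDs 0-999 map to host IDs 100000-100999.
lxc.idmap = u 0 100000 1000
lxc.idmap = g 0 100000 1000
# Container ID 1000 maps straight through to host ID 1000.
lxc.idmap = u 1000 1000 1
lxc.idmap = g 1000 1000 1
# Container IDs 1001-9999 map to host IDs 101001-109999.
lxc.idmap = u 1001 101001 8999
lxc.idmap = g 1001 101001 8999
```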
1544
1545 CONTROL GROUP
1546 This configuration sets up several control groups for the application:
1547 cpuset.cpus restricts usage to the listed CPUs, cpu.shares sets the
1548 relative CPU priority of the control group, and devices.deny followed by
1549 devices.allow makes only the specified devices usable.
1550
1551 lxc.cgroup.cpuset.cpus = 0,1
1552 lxc.cgroup.cpu.shares = 1234
1553 lxc.cgroup.devices.deny = a
1554 lxc.cgroup.devices.allow = c 1:3 rw
1555 lxc.cgroup.devices.allow = b 8:0 rw
1556
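On hosts that use the unified (cgroup v2) hierarchy, a roughly comparable setup might use the lxc.cgroup2 prefix instead. This is a sketch under the assumption that the installed LXC version supports lxc.cgroup2 keys and that the cpuset, cpu and memory controllers are available. The values are illustrative and are not a conversion of the cpu.shares value above.

```
lxc.cgroup2.cpuset.cpus = 0,1
# cpu.weight is the cgroup v2 counterpart of cpu.shares (default 100).
lxc.cgroup2.cpu.weight = 100
lxc.cgroup2.memory.max = 512M
```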
1557
1558 COMPLEX CONFIGURATION
1559 This example shows a complex configuration that builds a complex
1560 network stack, uses the control groups, sets a new hostname, mounts
1561 some locations and changes the root file system.
1562
1563 lxc.uts.name = complex
1564 lxc.net.0.type = veth
1565 lxc.net.0.flags = up
1566 lxc.net.0.link = br0
1567 lxc.net.0.hwaddr = 4a:49:43:49:79:bf
1568 lxc.net.0.ipv4.address = 10.2.3.5/24 10.2.3.255
1569 lxc.net.0.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3597
1570 lxc.net.0.ipv6.address = 2003:db8:1:0:214:5432:feab:3588
1571 lxc.net.1.type = macvlan
1572 lxc.net.1.flags = up
1573 lxc.net.1.link = eth0
1574 lxc.net.1.hwaddr = 4a:49:43:49:79:bd
1575 lxc.net.1.ipv4.address = 10.2.3.4/24
1576 lxc.net.1.ipv4.address = 192.168.10.125/24
1577 lxc.net.1.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3596
1578 lxc.net.2.type = phys
1579 lxc.net.2.flags = up
1580 lxc.net.2.link = dummy0
1581 lxc.net.2.hwaddr = 4a:49:43:49:79:ff
1582 lxc.net.2.ipv4.address = 10.2.3.6/24
1583 lxc.net.2.ipv6.address = 2003:db8:1:0:214:1234:fe0b:3297
1584 lxc.cgroup.cpuset.cpus = 0,1
1585 lxc.cgroup.cpu.shares = 1234
1586 lxc.cgroup.devices.deny = a
1587 lxc.cgroup.devices.allow = c 1:3 rw
1588 lxc.cgroup.devices.allow = b 8:0 rw
1589 lxc.mount.fstab = /etc/fstab.complex
1590 lxc.mount.entry = /lib /root/myrootfs/lib none ro,bind 0 0
1591 lxc.rootfs.path = dir:/mnt/rootfs.complex
1592 lxc.cap.drop = sys_module mknod setuid net_raw
1593 lxc.cap.drop = mac_override
1594
1595
1597 chroot(1), pivot_root(8), fstab(5), capabilities(7)
1598
1600 lxc(7), lxc-create(1), lxc-copy(1), lxc-destroy(1), lxc-start(1), lxc-
1601 stop(1), lxc-execute(1), lxc-console(1), lxc-monitor(1), lxc-wait(1),
1602 lxc-cgroup(1), lxc-ls(1), lxc-info(1), lxc-freeze(1), lxc-unfreeze(1),
1603 lxc-attach(1), lxc.conf(5)
1604
1606 Daniel Lezcano <daniel.lezcano@free.fr>
1607
1608
1609
1610 2021-05-08 lxc.container.conf(5)