SYSTEMD.RESOURCE-CONTROL(5)   systemd.resource-control   SYSTEMD.RESOURCE-CONTROL(5)

NAME

       systemd.resource-control - Resource control unit settings

SYNOPSIS

       slice.slice, scope.scope, service.service, socket.socket, mount.mount,
       swap.swap

DESCRIPTION

       Unit configuration files for services, slices, scopes, sockets, mount
       points, and swap devices share a subset of configuration options for
       resource control of spawned processes. Internally, this relies on the
       Linux Control Groups (cgroups) kernel concept for organizing processes
       in a hierarchical tree of named groups for the purpose of resource
       management.

       This man page lists the configuration options shared by those six unit
       types. See systemd.unit(5) for the common options of all unit
       configuration files, and systemd.slice(5), systemd.scope(5),
       systemd.service(5), systemd.socket(5), systemd.mount(5), and
       systemd.swap(5) for more information on the specific unit configuration
       files. The resource control configuration options are configured in the
       [Slice], [Scope], [Service], [Socket], [Mount], or [Swap] sections,
       depending on the unit type.

       In addition, options which control resources available to programs
       executed by systemd are listed in systemd.exec(5). Those options
       complement options listed here.

       See the New Control Group Interfaces[1] for an introduction on how to
       make use of resource control APIs from programs.

   Setting resource controls for a group of related units
       As described in systemd.unit(5), the settings listed here may be set
       through the main file of a unit and drop-in snippets in *.d/
       directories. The list of directories searched for drop-ins includes
       names formed by repeatedly truncating the unit name after all dashes.
       This is particularly convenient to set resource limits for a group of
       units with similar names.

       For example, every user gets their own slice user-nnn.slice. Drop-ins
       with local configuration that affect user 1000 may be placed in
       /etc/systemd/system/user-1000.slice,
       /etc/systemd/system/user-1000.slice.d/*.conf, but also
       /etc/systemd/system/user-.slice.d/*.conf. This last directory applies
       to all user slices.
50
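       As a minimal sketch of the drop-in mechanism described above, the
       following snippet would apply to every user slice. The file name
       50-limits.conf and the limit values are illustrative assumptions, not
       defaults or recommendations.

       ```ini
       # /etc/systemd/system/user-.slice.d/50-limits.conf (hypothetical file name)
       [Slice]
       # Example values only; choose limits that suit the system.
       MemoryMax=4G
       TasksMax=2000
       ```

       The same snippet placed in /etc/systemd/system/user-1000.slice.d/
       instead would affect only user 1000.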

IMPLICIT DEPENDENCIES

       The following dependencies are implicitly added:

       •   Units with the Slice= setting set automatically acquire Requires=
           and After= dependencies on the specified slice unit.
56
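       For illustration, a service assigned to a slice might look like the
       sketch below. The unit name, the slice name, and the ExecStart= path
       are hypothetical.

       ```ini
       # batch-job.service (hypothetical unit)
       [Service]
       ExecStart=/usr/local/bin/batch-job
       # Per the rule above, this implicitly adds Requires=batch.slice and
       # After=batch.slice to the unit.
       Slice=batch.slice
       ```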

UNIFIED AND LEGACY CONTROL GROUP HIERARCHIES

       The unified control group hierarchy is the new version of the kernel
       control group interface; see Control Groups v2[2]. Depending on the
       resource type, there are differences in resource control capabilities.
       Also, because of interface changes, some resource types have a separate
       set of options on the unified hierarchy.

       CPU
           CPUWeight= and StartupCPUWeight= replace CPUShares= and
           StartupCPUShares=, respectively.

           The "cpuacct" controller does not exist separately on the unified
           hierarchy.

       Memory
           MemoryMax= replaces MemoryLimit=. MemoryLow= and MemoryHigh= are
           effective only on the unified hierarchy.

       IO
           "IO"-prefixed settings are a superset of and replace
           "BlockIO"-prefixed ones. On the unified hierarchy, IO resource
           control also applies to buffered writes.

       To ease the transition, there is best-effort translation between the
       two versions of settings. For each controller, if any of the settings
       for the unified hierarchy are present, all settings for the legacy
       hierarchy are ignored. If the resulting settings are for the other type
       of hierarchy, the configurations are translated before application.

       Legacy control group hierarchy (see Control Groups version 1[3]), also
       called cgroup-v1, doesn't allow safe delegation of controllers to
       unprivileged processes. If the system uses the legacy control group
       hierarchy, resource control is disabled for the systemd user instance,
       see systemd(1).
91
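       The replacements listed above can be summarized in a unit that uses
       only unified-hierarchy names. This is a sketch; the unit name and the
       values are hypothetical, and the comments merely restate the mapping
       given in this section.

       ```ini
       # example.service (hypothetical unit)
       [Service]
       # Replaces CPUShares=
       CPUWeight=100
       # Replaces MemoryLimit=
       MemoryMax=1G
       # Replaces BlockIOWeight=
       IOWeight=100
       ```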

OPTIONS

       Units of the types listed above can have settings for resource control
       configuration:

       CPUAccounting=
           Turn on CPU usage accounting for this unit. Takes a boolean
           argument. Note that turning on CPU accounting for one unit will
           also implicitly turn it on for all units contained in the same
           slice and for all its parent slices and the units contained
           therein. The system default for this setting may be controlled with
           DefaultCPUAccounting= in systemd-system.conf(5).

       CPUWeight=weight, StartupCPUWeight=weight
           Assign the specified CPU time weight to the processes executed, if
           the unified control group hierarchy is used on the system. These
           options take an integer value and control the "cpu.weight" control
           group attribute. The allowed range is 1 to 10000. Defaults to 100.
           For details about this control group attribute, see Control Groups
           v2[2] and CFS Scheduler[4]. The available CPU time is split up
           among all units within one slice relative to their CPU time weight.

           While StartupCPUWeight= only applies to the startup phase of the
           system, CPUWeight= applies to normal runtime of the system, and if
           the former is not set also to the startup phase. Using
           StartupCPUWeight= allows prioritizing specific services at boot-up
           differently than during normal runtime.

           These settings replace CPUShares= and StartupCPUShares=.
120
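           As a sketch, the following fragment gives a service twice the
           default weight during normal runtime and five times the default
           during startup, relative to sibling units in the same slice that
           keep the default of 100. The unit name and values are
           illustrative assumptions.

           ```ini
           # example.service (hypothetical unit)
           [Service]
           CPUAccounting=yes
           CPUWeight=200
           StartupCPUWeight=500
           ```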
       CPUQuota=
           Assign the specified CPU time quota to the processes executed.
           Takes a percentage value, suffixed with "%". The percentage
           specifies how much CPU time the unit shall get at maximum, relative
           to the total CPU time available on one CPU. Use values > 100% for
           allotting CPU time on more than one CPU. This controls the
           "cpu.max" attribute on the unified control group hierarchy and
           "cpu.cfs_quota_us" on legacy. For details about these control group
           attributes, see Control Groups v2[2] and sched-bwc.txt[5].

           Example: CPUQuota=20% ensures that the executed processes will
           never get more than 20% CPU time on one CPU.

       CPUQuotaPeriodSec=
           Assign the duration over which the CPU time quota specified by
           CPUQuota= is measured. Takes a time duration value in seconds, with
           an optional suffix such as "ms" for milliseconds (or "s" for
           seconds). The default setting is 100ms. The period is clamped to
           the range supported by the kernel, which is [1ms, 1000ms].
           Additionally, the period is adjusted up so that the quota interval
           is also at least 1ms. Setting CPUQuotaPeriodSec= to an empty value
           resets it to the default.

           This controls the second field of the "cpu.max" attribute on the
           unified control group hierarchy and "cpu.cfs_period_us" on legacy.
           For details about these control group attributes, see Control
           Groups v2[2] and CFS Scheduler[4].

           Example: CPUQuotaPeriodSec=10ms requests that the CPU quota be
           measured in periods of 10ms.
151
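           The two settings combine as follows. With a 20% quota measured
           over a 10ms period, the unit may run for at most 2ms of CPU time
           in every 10ms window. The "cpu.max" value in the comment is an
           expectation derived from the kernel's microsecond-based
           "quota period" format, not output captured from a real system.

           ```ini
           # example.service (hypothetical unit)
           [Service]
           CPUQuota=20%
           CPUQuotaPeriodSec=10ms
           # Expected on the unified hierarchy (assumption):
           #   cpu.max = "2000 10000"   (2000us of every 10000us)
           ```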
       AllowedCPUs=
           Restrict processes to be executed on specific CPUs. Takes a list of
           CPU indices or ranges separated by either whitespace or commas. CPU
           ranges are specified by the lower and upper CPU indices separated
           by a dash.

           Setting AllowedCPUs= doesn't guarantee that all of the CPUs will be
           used by the processes as it may be limited by parent units. The
           effective configuration is reported as EffectiveCPUs=.

           This setting is supported only with the unified control group
           hierarchy.

       AllowedMemoryNodes=
           Restrict processes to be executed on specific memory NUMA nodes.
           Takes a list of memory NUMA node indices or ranges separated by
           either whitespace or commas. Memory NUMA node ranges are specified
           by the lower and upper NUMA node indices separated by a dash.

           Setting AllowedMemoryNodes= doesn't guarantee that all of the
           memory NUMA nodes will be used by the processes as it may be
           limited by parent units. The effective configuration is reported as
           EffectiveMemoryNodes=.

           This setting is supported only with the unified control group
           hierarchy.
178
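           A sketch of the list syntax described above, mixing a range with
           a single index. The unit name and the CPU and node numbers are
           assumptions and must exist on the machine in question.

           ```ini
           # example.service (hypothetical unit)
           [Service]
           # CPUs 0, 1, 2, 3 and 8
           AllowedCPUs=0-3,8
           # NUMA nodes 0 and 1
           AllowedMemoryNodes=0-1
           ```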
       MemoryAccounting=
           Turn on process and kernel memory accounting for this unit. Takes a
           boolean argument. Note that turning on memory accounting for one
           unit will also implicitly turn it on for all units contained in the
           same slice and for all its parent slices and the units contained
           therein. The system default for this setting may be controlled with
           DefaultMemoryAccounting= in systemd-system.conf(5).

       MemoryMin=bytes, MemoryLow=bytes
           Specify the memory usage protection of the executed processes in
           this unit. When reclaiming memory, the unit is treated as if it
           were using less memory, so that memory is preferentially reclaimed
           from unprotected units. Using MemoryLow= results in a weaker
           protection where memory may still be reclaimed to avoid invoking
           the OOM killer in case there is no other reclaimable memory.

           For a protection to be effective, it is generally required to set a
           corresponding allocation on all ancestors, which is then
           distributed between children (with the exception of the root
           slice). Any MemoryMin= or MemoryLow= allocation that is not
           explicitly distributed to specific children is used to create a
           shared protection for all children. As this is a shared protection,
           the children will freely compete for the memory.

           Takes a memory size in bytes. If the value is suffixed with K, M, G
           or T, the specified memory size is parsed as Kilobytes, Megabytes,
           Gigabytes, or Terabytes (with the base 1024), respectively.
           Alternatively, a percentage value may be specified, which is taken
           relative to the installed physical memory on the system. If
           assigned the special value "infinity", all available memory is
           protected, which may be useful in order to always inherit all of
           the protection afforded by ancestors. This controls the
           "memory.min" or "memory.low" control group attribute. For details
           about this control group attribute, see Memory Interface Files[6].

           This setting is supported only if the unified control group
           hierarchy is used and disables MemoryLimit=.

           Units may have their children use a default "memory.min" or
           "memory.low" value by specifying DefaultMemoryMin= or
           DefaultMemoryLow=, which have the same semantics as MemoryMin= and
           MemoryLow=. This setting does not affect "memory.min" or
           "memory.low" in the unit itself. Using it to set a default child
           allocation is only useful on kernels older than 5.7, which do not
           support the "memory_recursiveprot" cgroup2 mount option.

       MemoryHigh=bytes
           Specify the throttling limit on memory usage of the executed
           processes in this unit. Memory usage may go above the limit if
           unavoidable, but the processes are heavily slowed down and memory
           is taken away aggressively in such cases. This is the main
           mechanism to control memory usage of a unit.

           Takes a memory size in bytes. If the value is suffixed with K, M, G
           or T, the specified memory size is parsed as Kilobytes, Megabytes,
           Gigabytes, or Terabytes (with the base 1024), respectively.
           Alternatively, a percentage value may be specified, which is taken
           relative to the installed physical memory on the system. If
           assigned the special value "infinity", no memory throttling is
           applied. This controls the "memory.high" control group attribute.
           For details about this control group attribute, see Memory
           Interface Files[6].

           This setting is supported only if the unified control group
           hierarchy is used and disables MemoryLimit=.

       MemoryMax=bytes
           Specify the absolute limit on memory usage of the executed
           processes in this unit. If memory usage cannot be contained under
           the limit, the out-of-memory killer is invoked inside the unit. It
           is recommended to use MemoryHigh= as the main control mechanism and
           use MemoryMax= as the last line of defense.

           Takes a memory size in bytes. If the value is suffixed with K, M, G
           or T, the specified memory size is parsed as Kilobytes, Megabytes,
           Gigabytes, or Terabytes (with the base 1024), respectively.
           Alternatively, a percentage value may be specified, which is taken
           relative to the installed physical memory on the system. If
           assigned the special value "infinity", no memory limit is applied.
           This controls the "memory.max" control group attribute. For details
           about this control group attribute, see Memory Interface Files[6].

           This setting replaces MemoryLimit=.

       MemorySwapMax=bytes
           Specify the absolute limit on swap usage of the executed processes
           in this unit.

           Takes a swap size in bytes. If the value is suffixed with K, M, G
           or T, the specified swap size is parsed as Kilobytes, Megabytes,
           Gigabytes, or Terabytes (with the base 1024), respectively. If
           assigned the special value "infinity", no swap limit is applied.
           This controls the "memory.swap.max" control group attribute. For
           details about this control group attribute, see Memory Interface
           Files[6].

           This setting is supported only if the unified control group
           hierarchy is used and disables MemoryLimit=.
278
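           The memory settings above are typically layered. The sketch below
           follows the recommendation to throttle with MemoryHigh= first and
           keep MemoryMax= as the last line of defense. The unit name and
           all sizes are illustrative assumptions.

           ```ini
           # example.service (hypothetical unit)
           [Service]
           MemoryAccounting=yes
           # Weak protection; only effective if the ancestors carry a
           # corresponding allocation, as explained above.
           MemoryLow=256M
           # Throttling starts here (1G = 1024 * 1024 * 1024 bytes).
           MemoryHigh=1G
           # The out-of-memory killer is invoked if usage cannot stay below this.
           MemoryMax=1536M
           MemorySwapMax=512M
           ```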
       TasksAccounting=
           Turn on task accounting for this unit. Takes a boolean argument. If
           enabled, the system manager will keep track of the number of tasks
           in the unit. The number of tasks accounted this way includes both
           kernel threads and userspace processes, with each thread counting
           individually. Note that turning on tasks accounting for one unit
           will also implicitly turn it on for all units contained in the same
           slice and for all its parent slices and the units contained
           therein. The system default for this setting may be controlled with
           DefaultTasksAccounting= in systemd-system.conf(5).

       TasksMax=N
           Specify the maximum number of tasks that may be created in the
           unit. This ensures that the number of tasks accounted for the unit
           (see above) stays below a specific limit. This either takes an
           absolute number of tasks or a percentage value that is taken
           relative to the configured maximum number of tasks on the system.
           If assigned the special value "infinity", no tasks limit is
           applied. This controls the "pids.max" control group attribute. For
           details about this control group attribute, see Process Number
           Controller[7].

           The system default for this setting may be controlled with
           DefaultTasksMax= in systemd-system.conf(5).
303
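           A sketch of the absolute form of TasksMax=, with the percentage
           form shown as a commented-out alternative. The unit name and
           values are illustrative assumptions.

           ```ini
           # example.service (hypothetical unit)
           [Service]
           TasksAccounting=yes
           # Absolute limit on the number of tasks in the unit
           TasksMax=512
           # Alternative: a share of the system-wide task maximum
           #TasksMax=10%
           ```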
       IOAccounting=
           Turn on Block I/O accounting for this unit, if the unified control
           group hierarchy is used on the system. Takes a boolean argument.
           Note that turning on block I/O accounting for one unit will also
           implicitly turn it on for all units contained in the same slice and
           for all its parent slices and the units contained therein. The
           system default for this setting may be controlled with
           DefaultIOAccounting= in systemd-system.conf(5).

           This setting replaces BlockIOAccounting= and disables settings
           prefixed with BlockIO or StartupBlockIO.

       IOWeight=weight, StartupIOWeight=weight
           Set the default overall block I/O weight for the executed
           processes, if the unified control group hierarchy is used on the
           system. Takes a single weight value (between 1 and 10000) to set
           the default block I/O weight. This controls the "io.weight" control
           group attribute, which defaults to 100. For details about this
           control group attribute, see IO Interface Files[8]. The available
           I/O bandwidth is split up among all units within one slice relative
           to their block I/O weight.

           While StartupIOWeight= only applies to the startup phase of the
           system, IOWeight= applies to the later runtime of the system, and
           if the former is not set also to the startup phase. This allows
           prioritizing specific services at boot-up differently than during
           runtime.

           These settings replace BlockIOWeight= and StartupBlockIOWeight= and
           disable settings prefixed with BlockIO or StartupBlockIO.

       IODeviceWeight=device weight
           Set the per-device overall block I/O weight for the executed
           processes, if the unified control group hierarchy is used on the
           system. Takes a space-separated pair of a file path and a weight
           value to specify the device specific weight value, between 1 and
           10000. (Example: "/dev/sda 1000"). The file path may be specified
           as path to a block device node or as any other file, in which case
           the backing block device of the file system of the file is
           determined. This controls the "io.weight" control group attribute,
           which defaults to 100. Use this option multiple times to set
           weights for multiple devices. For details about this control group
           attribute, see IO Interface Files[8].

           This setting replaces BlockIODeviceWeight= and disables settings
           prefixed with BlockIO or StartupBlockIO.

           The specified device node should reference a block device that has
           an I/O scheduler associated, i.e. should not refer to partition or
           loopback block devices, but to the originating, physical device.
           When a path to a regular file or directory is specified, an attempt
           is made to discover the correct originating device backing the
           file system of the specified path. This works correctly only for
           simpler cases, where the file system is directly placed on a
           partition or physical block device, or where simple 1:1 encryption
           using dm-crypt/LUKS is used. This discovery does not cover complex
           storage and in particular RAID and volume management storage
           devices.

       IOReadBandwidthMax=device bytes, IOWriteBandwidthMax=device bytes
           Set the per-device overall block I/O bandwidth maximum limit for
           the executed processes, if the unified control group hierarchy is
           used on the system. This limit is not work-conserving and the
           executed processes are not allowed to use more even if the device
           has idle capacity. Takes a space-separated pair of a file path and
           a bandwidth value (in bytes per second) to specify the device
           specific bandwidth. The file path may be a path to a block device
           node, or any other file, in which case the backing block device of
           the file system of the file is used. If the bandwidth is
           suffixed with K, M, G, or T, the specified bandwidth is parsed as
           Kilobytes, Megabytes, Gigabytes, or Terabytes, respectively, to the
           base of 1000. (Example:
           "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 5M"). This
           controls the "io.max" control group attribute. Use this option
           multiple times to set bandwidth limits for multiple devices. For
           details about this control group attribute, see IO Interface
           Files[8].

           These settings replace BlockIOReadBandwidth= and
           BlockIOWriteBandwidth= and disable settings prefixed with BlockIO
           or StartupBlockIO.

           Similar restrictions on block device discovery as for
           IODeviceWeight= apply, see above.

       IOReadIOPSMax=device IOPS, IOWriteIOPSMax=device IOPS
           Set the per-device overall block I/O IOs-Per-Second maximum limit
           for the executed processes, if the unified control group hierarchy
           is used on the system. This limit is not work-conserving and the
           executed processes are not allowed to use more even if the device
           has idle capacity. Takes a space-separated pair of a file path and
           an IOPS value to specify the device specific IOPS. The file path
           may be a path to a block device node, or any other file, in which
           case the backing block device of the file system of the file is
           used. If the IOPS is suffixed with K, M, G, or T, the specified
           IOPS is parsed as KiloIOPS, MegaIOPS, GigaIOPS, or TeraIOPS,
           respectively, to the base of 1000. (Example:
           "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 1K"). This
           controls the "io.max" control group attribute. Use this option
           multiple times to set IOPS limits for multiple devices. For details
           about this control group attribute, see IO Interface Files[8].

           These settings are supported only if the unified control group
           hierarchy is used and disable settings prefixed with BlockIO or
           StartupBlockIO.

           Similar restrictions on block device discovery as for
           IODeviceWeight= apply, see above.
412
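           The I/O settings above can be combined in one unit, as in the
           sketch below. The unit name and the device /dev/sda are
           assumptions; the values reuse the examples given in the text.

           ```ini
           # example.service (hypothetical unit)
           [Service]
           IOAccounting=yes
           # Default weight across all devices
           IOWeight=200
           # Higher weight on one specific device
           IODeviceWeight=/dev/sda 1000
           # 5M = 5 * 1000 * 1000 bytes per second (base 1000)
           IOReadBandwidthMax=/dev/sda 5M
           # 1K = 1000 write operations per second
           IOWriteIOPSMax=/dev/sda 1K
           ```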
       IODeviceLatencyTargetSec=device target
           Set the per-device average target I/O latency for the executed
           processes, if the unified control group hierarchy is used on the
           system. Takes a file path and a timespan separated by a space to
           specify the device specific latency target. (Example: "/dev/sda
           25ms"). The file path may be specified as path to a block device
           node or as any other file, in which case the backing block device
           of the file system of the file is determined. This controls the
           "io.latency" control group attribute. Use this option multiple
           times to set latency targets for multiple devices. For details
           about this control group attribute, see IO Interface Files[8].

           Implies "IOAccounting=yes".

           This setting is supported only if the unified control group
           hierarchy is used.

           Similar restrictions on block device discovery as for
           IODeviceWeight= apply, see above.

       IPAccounting=
           Takes a boolean argument. If true, turns on IPv4 and IPv6 network
           traffic accounting for packets sent or received by the unit. When
           this option is turned on, all IPv4 and IPv6 sockets created by any
           process of the unit are accounted for.

           When this option is used in socket units, it applies to all IPv4
           and IPv6 sockets associated with it (including both listening and
           connection sockets where this applies). Note that for
           socket-activated services, this configuration setting and the
           accounting data of the service unit and the socket unit are kept
           separate, and displayed separately. No propagation of the setting
           and the collected statistics is done, in either direction.
           Moreover, any traffic sent or received on any of the socket unit's
           sockets is accounted to the socket unit, and never to the service
           unit it might have activated, even if the socket is used by it.

           The system default for this setting may be controlled with
           DefaultIPAccounting= in systemd-system.conf(5).

       IPAddressAllow=ADDRESS[/PREFIXLENGTH]...,
       IPAddressDeny=ADDRESS[/PREFIXLENGTH]...
           Turn on network traffic filtering for IP packets sent and received
           over AF_INET and AF_INET6 sockets. Both directives take a space
           separated list of IPv4 or IPv6 addresses, each optionally suffixed
           with an address prefix length in bits after a "/" character. If the
           suffix is omitted, the address is considered a host address, i.e.
           the filter covers the whole address (32 bits for IPv4, 128 bits for
           IPv6).

           The access lists configured with this option are applied to all
           sockets created by processes of this unit (or in the case of socket
           units, associated with it). The lists are implicitly combined with
           any lists configured for any of the parent slice units this unit
           might be a member of. By default both access lists are empty. Both
           ingress and egress traffic is filtered by these settings. In case
           of ingress traffic the source IP address is checked against these
           access lists, in case of egress traffic the destination IP address
           is checked. The following rules are applied in turn:

           •   Access is granted when the checked IP address matches an entry
               in the IPAddressAllow= list.

           •   Otherwise, access is denied when the checked IP address matches
               an entry in the IPAddressDeny= list.

           •   Otherwise, access is granted.

           In order to implement an allow-listing IP firewall, it is
           recommended to use an IPAddressDeny=any setting on an upper-level
           slice unit (such as the root slice -.slice or the slice containing
           all system services system.slice; see systemd.special(7) for
           details on these slice units), plus individual per-service
           IPAddressAllow= lines permitting network access to relevant
           services, and only them.

           Note that for socket-activated services, the IP access list
           configured on the socket unit applies to all sockets associated
           with it directly, but not to any sockets created by the ultimately
           activated services for it. Conversely, the IP access list
           configured for the service is not applied to any sockets passed
           into the service via socket activation. Thus, it is usually a good
           idea to replicate the IP access lists on both the socket and the
           service unit. Nevertheless, it may make sense to maintain one list
           more open and the other one more restricted, depending on the
           use case.

           If these settings are used multiple times in the same unit the
           specified lists are combined. If an empty string is assigned to
           these settings the specific access list is reset and all previous
           settings are undone.

           In place of explicit IPv4 or IPv6 address and prefix length
           specifications a small set of symbolic names may be used. The
           following names are defined:

           Table 1. Special address/network names
           ┌──────────────┬─────────────────────┬─────────────────────┐
           │Symbolic Name │ Definition          │ Meaning             │
           ├──────────────┼─────────────────────┼─────────────────────┤
           │any           │ 0.0.0.0/0 ::/0      │ Any host            │
           ├──────────────┼─────────────────────┼─────────────────────┤
           │localhost     │ 127.0.0.0/8 ::1/128 │ All addresses on    │
           │              │                     │ the local loopback  │
           ├──────────────┼─────────────────────┼─────────────────────┤
           │link-local    │ 169.254.0.0/16      │ All link-local IP   │
           │              │ fe80::/64           │ addresses           │
           ├──────────────┼─────────────────────┼─────────────────────┤
           │multicast     │ 224.0.0.0/4         │ All IP multicasting │
           │              │ ff00::/8            │ addresses           │
           └──────────────┴─────────────────────┴─────────────────────┘

           Note that these settings might not be supported on some systems
           (for example if eBPF control group support is not enabled in the
           underlying kernel or container manager). These settings will have
           no effect in that case. If compatibility with such systems is
           desired it is hence recommended to not exclusively rely on them for
           IP security.
530
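           The allow-listing firewall recommended above can be sketched with
           two drop-ins. The file names, the service name, and the permitted
           network are hypothetical. Because the allow list is checked
           first, the per-service entries take precedence over the
           slice-wide deny rule.

           ```ini
           # /etc/systemd/system/system.slice.d/10-ip-deny.conf (hypothetical)
           [Slice]
           IPAddressDeny=any

           # /etc/systemd/system/example.service.d/10-ip-allow.conf (hypothetical)
           [Service]
           IPAccounting=yes
           IPAddressAllow=localhost 192.168.1.0/24
           ```

           For a socket-activated service, the same IPAddressAllow= line
           would usually be replicated in the [Socket] section of the
           corresponding socket unit, as noted above.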
531       IPIngressFilterPath=BPF_FS_PROGRAM_PATH,
532       IPEgressFilterPath=BPF_FS_PROGRAM_PATH
533           Add custom network traffic filters implemented as BPF programs,
534           applying to all IP packets sent and received over AF_INET and
535           AF_INET6 sockets. Takes an absolute path to a pinned BPF program in
536           the BPF virtual filesystem (/sys/fs/bpf/).
537
538           The filters configured with this option are applied to all sockets
539           created by processes of this unit (or in the case of socket units,
540           associated with it). The filters are loaded in addition to filters
541           any of the parent slice units this unit might be a member of as
542           well as any IPAddressAllow= and IPAddressDeny= filters in any of
543           these units. By default there are no filters specified.
544
545           If these settings are used multiple times in the same unit all the
546           specified programs are attached. If an empty string is assigned to
547           these settings the program list is reset and all previous specified
548           programs ignored.
549
550           Note that for socket-activated services, the IP filter programs
551           configured on the socket unit apply to all sockets associated with
552           it directly, but not to any sockets created by the ultimately
553           activated services for it. Conversely, the IP filter programs
554           configured for the service are not applied to any sockets passed
555           into the service via socket activation. Thus, it is usually a good
556           idea to replicate the IP filter programs on both the socket and
557           the service unit. However, it often makes sense to maintain one
558           configuration more open and the other one more restricted,
559           depending on the use case.
560
561           Note that these settings might not be supported on some systems
562           (for example if eBPF control group support is not enabled in the
563           underlying kernel or container manager). These settings will fail
564           the service in that case. If compatibility with such systems is
565           desired it is hence recommended to attach your filter manually
566           (requires Delegate=yes) instead of using this setting.
567
568       DeviceAllow=
569           Control access to specific device nodes by the executed processes.
570           Takes two space-separated strings: a device node specifier followed
571           by a combination of r, w, m to control reading, writing, or
572           creation of the specific device node(s) by the unit (mknod),
573           respectively. On cgroup-v1 this controls the "devices.allow"
574           control group attribute. For details about this control group
575           attribute, see Device Whitelist Controller[9]. In the unified
576           cgroup hierarchy this functionality is implemented using eBPF
577           filtering.
578
579           The device node specifier is either a path to a device node in the
580           file system, starting with /dev/, or a string starting with either
581           "char-" or "block-" followed by a device group name, as listed in
582           /proc/devices. The latter is useful to allow-list all current and
583           future devices belonging to a specific device group at once. The
584           device group is matched according to filename globbing rules; you
585           may hence use the "*" and "?" wildcards. (Note that such globbing
586           wildcards are not available for device node path specifications!)
587           In order to match device nodes by numeric major/minor, use device
588           node paths in the /dev/char/ and /dev/block/ directories. However,
589           matching devices by major/minor is generally not recommended as
590           assignments are neither stable nor portable between systems or
591           different kernel versions.
592
593           Examples: /dev/sda5 is a path to a device node, referring to an ATA
594           or SCSI block device.  "char-pts" and "char-alsa" are specifiers
595           for all pseudo TTYs and all ALSA sound devices, respectively.
596           "char-cpu/*" is a specifier matching all CPU related device groups.
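
           A minimal sketch of how such specifiers might be combined with
           access modes (the choice of devices and modes is illustrative
           only):

               [Service]
               # Read and write access to one block device node
               DeviceAllow=/dev/sda5 rw
               # Read and write access to all pseudo TTYs
               DeviceAllow=char-pts rw
               # Read-only access to all ALSA sound devices
               DeviceAllow=char-alsa r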
597
598           Note that allow lists defined this way should only reference device
599           groups which are resolvable at the time the unit is started. Any
600           device groups not resolvable then are not added to the device allow
601           list. In order to work around this limitation, consider extending
602           service units with a pair of After=modprobe@xyz.service and
603           Wants=modprobe@xyz.service lines that load the necessary kernel
604           module implementing the device group if missing. Example:
605
606               ...
607               [Unit]
608               Wants=modprobe@loop.service
609               After=modprobe@loop.service
610
611               [Service]
612               DeviceAllow=block-loop
613               DeviceAllow=/dev/loop-control
614               ...
615
616       DevicePolicy=auto|closed|strict
617           Control the policy for allowing device access:
618
619           strict
620               means to only allow types of access that are explicitly
621               specified.
622
623           closed
624               in addition, allows access to standard pseudo devices including
625               /dev/null, /dev/zero, /dev/full, /dev/random, and /dev/urandom.
626
627           auto
628               in addition, allows access to all devices if no explicit
629               DeviceAllow= is present. This is the default.
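
           For illustration, a unit that should only have access to the
           standard pseudo devices plus one additional device node might be
           configured roughly as follows (the choice of /dev/loop-control is
           an assumption made for the example):

               [Service]
               DevicePolicy=closed
               DeviceAllow=/dev/loop-control rw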
630
631       Slice=
632           The name of the slice unit to place the unit in. Defaults to
633           system.slice for all non-instantiated units of all unit types
634           (except for slice units themselves, see below). Instance units are
635           by default placed in a subslice of system.slice that is named after
636           the template name.
637
638           This option may be used to arrange systemd units in a hierarchy of
639           slices, each of which might have resource settings applied.
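
           As a sketch, a service might be placed in a dedicated slice (the
           name example.slice is hypothetical), whose own unit file could
           then carry the resource settings for it:

               [Service]
               Slice=example.slice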
640
641           For units of type slice, the only accepted value for this setting
642           is the parent slice. Since the name of a slice unit implies the
643           parent slice, it is hence redundant to ever set this parameter
644           directly for slice units.
645
646           Special care should be taken when relying on the default slice
647           assignment in templated service units that have
648           DefaultDependencies=no set, see systemd.service(5), section
649           "Default Dependencies" for details.
650
651       Delegate=
652           Turns on delegation of further resource control partitioning to
653           processes of the unit. Units where this is enabled may create and
654           manage their own private subhierarchy of control groups below the
655           control group of the unit itself. For unprivileged services (i.e.
656           those using the User= setting) the unit's control group will be
657           made accessible to the relevant user. When enabled the service
658           manager will refrain from manipulating control groups or moving
659           processes below the unit's control group, so that a clear concept
660           of ownership is established: the control group tree above the
661           unit's control group (i.e. towards the root control group) is owned
662           and managed by the service manager of the host, while the control
663           group tree below the unit's control group is owned and managed by
664           the unit itself. Takes either a boolean argument or a list of
665           control group controller names. If true, delegation is turned on,
666           and all supported controllers are enabled for the unit, making them
667           available to the unit's processes for management. If false,
668           delegation is turned off entirely (and no additional controllers
669           are enabled). If set to a list of controllers, delegation is turned
670           on, and the specified controllers are enabled for the unit. Note
671           that controllers other than the ones specified might be made
672           available as well, depending on configuration of the containing
673           slice unit or other units contained in it. Note that assigning the
674           empty string will enable delegation, but reset the list of
675           controllers; all assignments prior to this will have no effect.
676           Defaults to false.
677
678           Note that controller delegation to less privileged code is only
679           safe on the unified control group hierarchy. Accordingly, access to
680           the specified controllers will not be granted to unprivileged
681           services on the legacy hierarchy, even when requested.
682
683           The following controller names may be specified: cpu, cpuacct,
684           cpuset, io, blkio, memory, devices, pids, bpf-firewall, and
685           bpf-devices.
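
           A minimal sketch of the two forms this setting takes. Delegation
           may be turned on with all supported controllers enabled:

               [Service]
               Delegate=yes

           Alternatively, it may be turned on with an explicit list of
           controllers (the selection shown is arbitrary; names are
           separated by spaces):

               [Service]
               Delegate=cpu memory pids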
686
687           Not all of these controllers are available on all kernels however,
688           and some are specific to the unified hierarchy while others are
689           specific to the legacy hierarchy. Also note that the kernel might
690           support further controllers, which aren't covered here yet as
691           delegation is either not supported at all for them or not defined
692           cleanly.
693
694           For further details on the delegation model consult Control Group
695           APIs and Delegation[10].
696
697       DisableControllers=
698           Disables controllers from being enabled for a unit's children. If a
699           controller listed is already in use in its subtree, the controller
700           will be removed from the subtree. This can be used to avoid child
701           units being able to implicitly or explicitly enable a controller.
702           Defaults to not disabling any controllers.
703
704           It may not be possible to successfully disable a controller if the
705           unit or any child of the unit in question delegates controllers to
706           its children, as any delegated subtree of the cgroup hierarchy is
707           unmanaged by systemd.
708
709           Multiple controllers may be specified, separated by spaces. You may
710           also pass DisableControllers= multiple times, in which case each
711           new instance adds another controller to disable. Passing
712           DisableControllers= by itself with no controller name present
713           resets the disabled controller list.
714
715           The following controller names may be specified: cpu, cpuacct,
716           cpuset, io, blkio, memory, devices, pids, bpf-firewall, and
717           bpf-devices.
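
           For example, the following assignments (a sketch with an
           arbitrary choice of controllers) accumulate, so that cpu, io,
           and memory are all disabled for the children of the slice:

               [Slice]
               DisableControllers=cpu
               DisableControllers=io memory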
718
719       ManagedOOMSwap=auto|kill, ManagedOOMMemoryPressure=auto|kill
720           Specifies how systemd-oomd.service(8) will act on this unit's
721           cgroups. Defaults to auto.
722
723           When set to kill, systemd-oomd will actively monitor this unit's
724           cgroup metrics to decide whether it needs to act. If the cgroup
725           passes the limits set by oomd.conf(5) or its overrides,
726           systemd-oomd will send a SIGKILL to all of the processes under the
727           chosen candidate cgroup. Note that only descendant cgroups can be
728           eligible candidates for killing; the unit that set its property to
729           kill is not a candidate (unless one of its ancestors set their
730           property to kill). You can find more details on candidates and kill
731           behavior at systemd-oomd.service(8) and oomd.conf(5). Setting
732           either of these properties to kill also makes the unit acquire
733           After= and Wants= dependencies on systemd-oomd.service, unless
734           DefaultDependencies=no is set.
735
736           When set to auto, systemd-oomd will not actively use this cgroup's
737           data for monitoring and detection. However, if an ancestor cgroup
738           has one of these properties set to kill, a unit with auto can still
739           be an eligible candidate for systemd-oomd to act on.
740
741       ManagedOOMMemoryPressureLimit=
742           Overrides the default memory pressure limit set by oomd.conf(5) for
743           this unit (cgroup). Takes a percentage value between 0% and 100%,
744           inclusive. This property is ignored unless
745           ManagedOOMMemoryPressure=kill. Defaults to 0%, which means to use
746           the default set by oomd.conf(5).
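
           As an illustration, a slice might opt its descendant cgroups in
           to killing based on memory pressure, with a unit-specific limit
           (the 50% value is an arbitrary example, not a recommendation):

               [Slice]
               ManagedOOMMemoryPressure=kill
               ManagedOOMMemoryPressureLimit=50%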
747
748       ManagedOOMPreference=none|avoid|omit
749           Allows deprioritizing or omitting this unit's cgroup as a candidate
750           when systemd-oomd needs to act. Requires support for extended
751           attributes (see xattr(7)) in order to use avoid or omit.
752           Additionally, systemd-oomd will ignore these extended attributes if
753           the unit's cgroup is not owned by the root user.
754
755           If this property is set to avoid, the service manager will convey
756           this to systemd-oomd, which will only select this cgroup if there
757           are no other viable candidates.
758
759           If this property is set to omit, the service manager will convey
760           this to systemd-oomd, which will ignore this cgroup as a candidate
761           and will not perform any actions on it.
762
763           It is recommended to use avoid and omit sparingly, as they can
764           adversely affect systemd-oomd's kill behavior. Also note that these
765           extended attributes are not applied recursively to cgroups under
766           this unit's cgroup.
767
768           Defaults to none, which means systemd-oomd will rank this unit's
769           cgroup as defined in systemd-oomd.service(8) and oomd.conf(5).
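
           For example, a service whose cgroup should only be selected when
           no other viable candidates exist might set:

               [Service]
               ManagedOOMPreference=avoid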
770

DEPRECATED OPTIONS

772       The following options are deprecated. Use the indicated superseding
773       options instead:
774
775       CPUShares=weight, StartupCPUShares=weight
776           Assign the specified CPU time share weight to the processes
777           executed. These options take an integer value and control the
778           "cpu.shares" control group attribute. The allowed range is 2 to
779           262144. Defaults to 1024. For details about this control group
780           attribute, see CFS Scheduler[4]. The available CPU time is split up
781           among all units within one slice relative to their CPU time share
782           weight.
783
784           While StartupCPUShares= only applies to the startup phase of the
785           system, CPUShares= applies to normal runtime of the system, and if
786           the former is not set also to the startup phase. Using
787           StartupCPUShares= allows prioritizing specific services at boot-up
788           differently than during normal runtime.
789
790           Implies "CPUAccounting=yes".
791
792           These settings are deprecated. Use CPUWeight= and StartupCPUWeight=
793           instead.
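
           A migration sketch. The two settings use different scales, so
           values cannot be copied over unchanged. The replacement value
           below assumes proportional scaling relative to the respective
           defaults (1024 for CPUShares=, 100 for CPUWeight=):

               [Service]
               # Deprecated: twice the default share weight of 1024
               #CPUShares=2048
               # Replacement: twice the default weight of 100 (assumed
               # to be equivalent)
               CPUWeight=200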
794
795       MemoryLimit=bytes
796           Specify the limit on maximum memory usage of the executed
797           processes. The limit specifies how much process and kernel memory
798           can be used by tasks in this unit. Takes a memory size in bytes. If
799           the value is suffixed with K, M, G or T, the specified memory size
800           is parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with
801           the base 1024), respectively. Alternatively, a percentage value may
802           be specified, which is taken relative to the installed physical
803           memory on the system. If assigned the special value "infinity", no
804           memory limit is applied. This controls the "memory.limit_in_bytes"
805           control group attribute. For details about this control group
806           attribute, see Memory Resource Controller[11].
807
808           Implies "MemoryAccounting=yes".
809
810           This setting is deprecated. Use MemoryMax= instead.
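
           A migration sketch (the 512M value is an arbitrary example):

               [Service]
               # Deprecated
               #MemoryLimit=512M
               # Replacement
               MemoryMax=512M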
811
812       BlockIOAccounting=
813           Turn on Block I/O accounting for this unit, if the legacy control
814           group hierarchy is used on the system. Takes a boolean argument.
815           Note that turning on block I/O accounting for one unit will also
816           implicitly turn it on for all units contained in the same slice and
817           for all its parent slices and the units contained therein. The
818           system default for this setting may be controlled with
819           DefaultBlockIOAccounting= in systemd-system.conf(5).
820
821           This setting is deprecated. Use IOAccounting= instead.
822
823       BlockIOWeight=weight, StartupBlockIOWeight=weight
824           Set the default overall block I/O weight for the executed
825           processes, if the legacy control group hierarchy is used on the
826           system. Takes a single weight value (between 10 and 1000) to set
827           the default block I/O weight. This controls the "blkio.weight"
828           control group attribute, which defaults to 500. For details about
829           this control group attribute, see Block IO Controller[12]. The
830           available I/O bandwidth is split up among all units within one
831           slice relative to their block I/O weight.
832
833           While StartupBlockIOWeight= only applies to the startup phase of
834           the system, BlockIOWeight= applies to the later runtime of the
835           system, and if the former is not set also to the startup phase.
836           This allows prioritizing specific services at boot-up differently
837           than during runtime.
838
839           Implies "BlockIOAccounting=yes".
840
841           These settings are deprecated. Use IOWeight= and StartupIOWeight=
842           instead.
843
844       BlockIODeviceWeight=device weight
845           Set the per-device overall block I/O weight for the executed
846           processes, if the legacy control group hierarchy is used on the
847           system. Takes a space-separated pair of a file path and a weight
848           value to specify the device specific weight value, between 10 and
849           1000. (Example: "/dev/sda 500"). The file path may be specified as
850           a path to a block device node or as any other file, in which case the
851           backing block device of the file system of the file is determined.
852           This controls the "blkio.weight_device" control group attribute,
853           which defaults to 1000. Use this option multiple times to set
854           weights for multiple devices. For details about this control group
855           attribute, see Block IO Controller[12].
856
857           Implies "BlockIOAccounting=yes".
858
859           This setting is deprecated. Use IODeviceWeight= instead.
860
861       BlockIOReadBandwidth=device bytes, BlockIOWriteBandwidth=device bytes
862           Set the per-device overall block I/O bandwidth limit for the
863           executed processes, if the legacy control group hierarchy is used
864           on the system. Takes a space-separated pair of a file path and a
865           bandwidth value (in bytes per second) to specify the device
866           specific bandwidth. The file path may be a path to a block device
867           node or any other file, in which case the backing block device of
868           the file system of the file is used. If the bandwidth is
869           suffixed with K, M, G, or T, the specified bandwidth is parsed as
870           Kilobytes, Megabytes, Gigabytes, or Terabytes, respectively, to the
871           base of 1000. (Example:
872           "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 5M"). This
873           controls the "blkio.throttle.read_bps_device" and
874           "blkio.throttle.write_bps_device" control group attributes. Use
875           this option multiple times to set bandwidth limits for multiple
876           devices. For details about these control group attributes, see
877           Block IO Controller[12].
878
879           Implies "BlockIOAccounting=yes".
880
881           These settings are deprecated. Use IOReadBandwidthMax= and
882           IOWriteBandwidthMax= instead.
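
           A migration sketch for the bandwidth settings (the device path
           and the 5M limit are arbitrary examples):

               [Service]
               # Deprecated
               #BlockIOReadBandwidth=/dev/sda 5M
               #BlockIOWriteBandwidth=/dev/sda 5M
               # Replacements
               IOReadBandwidthMax=/dev/sda 5M
               IOWriteBandwidthMax=/dev/sda 5M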
883

SEE ALSO

885       systemd(1), systemd-system.conf(5), systemd.unit(5),
886       systemd.service(5), systemd.slice(5), systemd.scope(5),
887       systemd.socket(5), systemd.mount(5), systemd.swap(5), systemd.exec(5),
888       systemd.directives(7), systemd.special(7), systemd-oomd.service(8),
889       and the documentation for control groups and specific controllers in
890       the Linux kernel: Control Groups v2[2].
891

NOTES

893        1. New Control Group Interfaces
894           https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
895
896        2. Control Groups v2
897           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html
898
899        3. Control Groups version 1
900           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/
901
902        4. CFS Scheduler
903           https://www.kernel.org/doc/html/latest/scheduler/sched-design-CFS.html
904
905        5. sched-bwc.txt
906           https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt
907
908        6. Memory Interface Files
909           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#memory-interface-files
910
911        7. Process Number Controller
912           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/pids.html
913
914        8. IO Interface Files
915           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#io-interface-files
916
917        9. Device Whitelist Controller
918           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/devices.html
919
920       10. Control Group APIs and Delegation
921           https://systemd.io/CGROUP_DELEGATION
922
923       11. Memory Resource Controller
924           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/memory.html
925
926       12. Block IO Controller
927           https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/blkio-controller.html
928
929
930
931systemd 248                                        SYSTEMD.RESOURCE-CONTROL(5)