SCHED(7)                   Linux Programmer's Manual                  SCHED(7)
2
3
4
NAME
       sched - overview of CPU scheduling
7
DESCRIPTION
       Since Linux 2.6.23, the default scheduler is CFS, the "Completely Fair
       Scheduler".  The CFS scheduler replaced the earlier "O(1)" scheduler.
11
12 API summary
13 Linux provides the following system calls for controlling the CPU
14 scheduling behavior, policy, and priority of processes (or, more pre‐
15 cisely, threads).
16
17 nice(2)
18 Set a new nice value for the calling thread, and return the new
19 nice value.
20
21 getpriority(2)
22 Return the nice value of a thread, a process group, or the set
23 of threads owned by a specified user.
24
25 setpriority(2)
26 Set the nice value of a thread, a process group, or the set of
27 threads owned by a specified user.
28
29 sched_setscheduler(2)
30 Set the scheduling policy and parameters of a specified thread.
31
32 sched_getscheduler(2)
33 Return the scheduling policy of a specified thread.
34
35 sched_setparam(2)
36 Set the scheduling parameters of a specified thread.
37
38 sched_getparam(2)
39 Fetch the scheduling parameters of a specified thread.
40
41 sched_get_priority_max(2)
42 Return the maximum priority available in a specified scheduling
43 policy.
44
45 sched_get_priority_min(2)
46 Return the minimum priority available in a specified scheduling
47 policy.
48
49 sched_rr_get_interval(2)
50 Fetch the quantum used for threads that are scheduled under the
51 "round-robin" scheduling policy.
52
53 sched_yield(2)
              Cause the caller to relinquish the CPU, so that some other
              thread can be executed.
56
57 sched_setaffinity(2)
58 (Linux-specific) Set the CPU affinity of a specified thread.
59
60 sched_getaffinity(2)
61 (Linux-specific) Get the CPU affinity of a specified thread.
62
63 sched_setattr(2)
64 Set the scheduling policy and parameters of a specified thread.
65 This (Linux-specific) system call provides a superset of the
66 functionality of sched_setscheduler(2) and sched_setparam(2).
67
68 sched_getattr(2)
69 Fetch the scheduling policy and parameters of a specified
70 thread. This (Linux-specific) system call provides a superset
71 of the functionality of sched_getscheduler(2) and sched_get‐
72 param(2).
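
       As an illustration of some of these interfaces, the following minimal
       sketch pins the calling thread to CPU 0 using sched_setaffinity(2) and
       then reads the affinity mask back (CPU 0 is an arbitrary choice, and
       error handling is abbreviated):

           #define _GNU_SOURCE
           #include <sched.h>
           #include <stdio.h>

           int
           main(void)
           {
               cpu_set_t set;

               CPU_ZERO(&set);
               CPU_SET(0, &set);                /* run only on CPU 0 */
               if (sched_setaffinity(0, sizeof(set), &set) == -1)
                   perror("sched_setaffinity");

               if (sched_getaffinity(0, sizeof(set), &set) == 0)
                   printf("CPU 0 allowed? %s\n",
                          CPU_ISSET(0, &set) ? "yes" : "no");
               return 0;
           }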
73
74 Scheduling policies
75 The scheduler is the kernel component that decides which runnable
76 thread will be executed by the CPU next. Each thread has an associated
77 scheduling policy and a static scheduling priority, sched_priority.
78 The scheduler makes its decisions based on knowledge of the scheduling
79 policy and static priority of all threads on the system.
80
81 For threads scheduled under one of the normal scheduling policies
82 (SCHED_OTHER, SCHED_IDLE, SCHED_BATCH), sched_priority is not used in
83 scheduling decisions (it must be specified as 0).
84
85 Processes scheduled under one of the real-time policies (SCHED_FIFO,
86 SCHED_RR) have a sched_priority value in the range 1 (low) to 99
87 (high). (As the numbers imply, real-time threads always have higher
88 priority than normal threads.) Note well: POSIX.1 requires an imple‐
       mentation to support only a minimum of 32 distinct priority levels for the
90 real-time policies, and some systems supply just this minimum. Porta‐
91 ble programs should use sched_get_priority_min(2) and sched_get_prior‐
92 ity_max(2) to find the range of priorities supported for a particular
93 policy.
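
       For example, the priority range for SCHED_FIFO could be determined as
       in the following minimal sketch:

           #include <sched.h>
           #include <stdio.h>

           int
           main(void)
           {
               int min = sched_get_priority_min(SCHED_FIFO);
               int max = sched_get_priority_max(SCHED_FIFO);

               if (min == -1 || max == -1)
                   perror("sched_get_priority_min/max");
               else
                   printf("SCHED_FIFO priorities: %d..%d\n", min, max);
               return 0;
           }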
94
95 Conceptually, the scheduler maintains a list of runnable threads for
96 each possible sched_priority value. In order to determine which thread
97 runs next, the scheduler looks for the nonempty list with the highest
98 static priority and selects the thread at the head of this list.
99
100 A thread's scheduling policy determines where it will be inserted into
101 the list of threads with equal static priority and how it will move
102 inside this list.
103
104 All scheduling is preemptive: if a thread with a higher static priority
105 becomes ready to run, the currently running thread will be preempted
106 and returned to the wait list for its static priority level. The
107 scheduling policy determines the ordering only within the list of
108 runnable threads with equal static priority.
109
110 SCHED_FIFO: First in-first out scheduling
111 SCHED_FIFO can be used only with static priorities higher than 0, which
       means that when a SCHED_FIFO thread becomes runnable, it will always
113 immediately preempt any currently running SCHED_OTHER, SCHED_BATCH, or
114 SCHED_IDLE thread. SCHED_FIFO is a simple scheduling algorithm without
115 time slicing. For threads scheduled under the SCHED_FIFO policy, the
116 following rules apply:
117
118 1) A running SCHED_FIFO thread that has been preempted by another
119 thread of higher priority will stay at the head of the list for its
120 priority and will resume execution as soon as all threads of higher
121 priority are blocked again.
122
123 2) When a blocked SCHED_FIFO thread becomes runnable, it will be
124 inserted at the end of the list for its priority.
125
126 3) If a call to sched_setscheduler(2), sched_setparam(2),
127 sched_setattr(2), pthread_setschedparam(3), or pthread_setsched‐
128 prio(3) changes the priority of the running or runnable SCHED_FIFO
          thread identified by pid, the effect on the thread's position in
          the list depends on the direction of the change to the thread's
          priority:
131
132 · If the thread's priority is raised, it is placed at the end of
133 the list for its new priority. As a consequence, it may preempt
134 a currently running thread with the same priority.
135
136 · If the thread's priority is unchanged, its position in the run
137 list is unchanged.
138
139 · If the thread's priority is lowered, it is placed at the front of
140 the list for its new priority.
141
142 According to POSIX.1-2008, changes to a thread's priority (or pol‐
143 icy) using any mechanism other than pthread_setschedprio(3) should
144 result in the thread being placed at the end of the list for its
145 priority.
146
147 4) A thread calling sched_yield(2) will be put at the end of the list.
148
149 No other events will move a thread scheduled under the SCHED_FIFO pol‐
150 icy in the wait list of runnable threads with equal static priority.
151
152 A SCHED_FIFO thread runs until either it is blocked by an I/O request,
153 it is preempted by a higher priority thread, or it calls
154 sched_yield(2).
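
       As an illustration, the following minimal sketch requests SCHED_FIFO
       at priority 50 for the calling thread (the priority value is an
       arbitrary example; the call needs suitable privileges or resource
       limits, as described under "Privileges and resource limits" below):

           #include <sched.h>
           #include <stdio.h>

           int
           main(void)
           {
               struct sched_param sp = { .sched_priority = 50 };

               /* A pid of 0 means "the calling thread" */
               if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
                   perror("sched_setscheduler");
                   return 1;
               }
               /* The thread now runs under SCHED_FIFO at priority 50 */
               return 0;
           }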
155
156 SCHED_RR: Round-robin scheduling
157 SCHED_RR is a simple enhancement of SCHED_FIFO. Everything described
158 above for SCHED_FIFO also applies to SCHED_RR, except that each thread
159 is allowed to run only for a maximum time quantum. If a SCHED_RR
160 thread has been running for a time period equal to or longer than the
161 time quantum, it will be put at the end of the list for its priority.
162 A SCHED_RR thread that has been preempted by a higher priority thread
163 and subsequently resumes execution as a running thread will complete
164 the unexpired portion of its round-robin time quantum. The length of
165 the time quantum can be retrieved using sched_rr_get_interval(2).
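
       For example, the quantum for the calling thread (pid 0) could be
       retrieved as in the following minimal sketch:

           #include <sched.h>
           #include <stdio.h>
           #include <time.h>

           int
           main(void)
           {
               struct timespec quantum;

               if (sched_rr_get_interval(0, &quantum) == -1) {
                   perror("sched_rr_get_interval");
                   return 1;
               }
               printf("RR quantum: %ld.%09ld seconds\n",
                      (long) quantum.tv_sec, (long) quantum.tv_nsec);
               return 0;
           }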
166
167 SCHED_DEADLINE: Sporadic task model deadline scheduling
168 Since version 3.14, Linux provides a deadline scheduling policy
169 (SCHED_DEADLINE). This policy is currently implemented using GEDF
170 (Global Earliest Deadline First) in conjunction with CBS (Constant
171 Bandwidth Server). To set and fetch this policy and associated
172 attributes, one must use the Linux-specific sched_setattr(2) and
173 sched_getattr(2) system calls.
174
175 A sporadic task is one that has a sequence of jobs, where each job is
176 activated at most once per period. Each job also has a relative dead‐
177 line, before which it should finish execution, and a computation time,
178 which is the CPU time necessary for executing the job. The moment when
179 a task wakes up because a new job has to be executed is called the
180 arrival time (also referred to as the request time or release time).
181 The start time is the time at which a task starts its execution. The
182 absolute deadline is thus obtained by adding the relative deadline to
183 the arrival time.
184
185 The following diagram clarifies these terms:
186
           arrival/wakeup                    absolute deadline
                |    start time                    |
                |        |                         |
                v        v                         v
           -----x--------xooooooooooooooooo--------x--------x---
                         |<- comp. time ->|
                |<------- relative deadline ------>|
                |<-------------- period ------------------->|
195
196 When setting a SCHED_DEADLINE policy for a thread using
197 sched_setattr(2), one can specify three parameters: Runtime, Deadline,
198 and Period. These parameters do not necessarily correspond to the
199 aforementioned terms: usual practice is to set Runtime to something
200 bigger than the average computation time (or worst-case execution time
201 for hard real-time tasks), Deadline to the relative deadline, and
202 Period to the period of the task. Thus, for SCHED_DEADLINE scheduling,
203 we have:
204
           arrival/wakeup                    absolute deadline
                |    start time                    |
                |        |                         |
                v        v                         v
           -----x--------xooooooooooooooooo--------x--------x---
                         |<-- Runtime ------->|
                |<----------- Deadline ----------->|
                |<-------------- Period ------------------->|
213
214 The three deadline-scheduling parameters correspond to the sched_run‐
215 time, sched_deadline, and sched_period fields of the sched_attr struc‐
216 ture; see sched_setattr(2). These fields express values in nanosec‐
217 onds. If sched_period is specified as 0, then it is made the same as
218 sched_deadline.
219
220 The kernel requires that:
221
222 sched_runtime <= sched_deadline <= sched_period
223
224 In addition, under the current implementation, all of the parameter
225 values must be at least 1024 (i.e., just over one microsecond, which is
226 the resolution of the implementation), and less than 2^63. If any of
227 these checks fails, sched_setattr(2) fails with the error EINVAL.
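
       If the C library does not provide a wrapper for sched_setattr(2), the
       call can be made via syscall(2).  The following minimal sketch
       requests SCHED_DEADLINE with a Runtime of 10 ms, a Deadline of 30 ms,
       and a Period of 100 ms (illustrative values only; the structure layout
       follows sched_setattr(2), and the fallback value of SCHED_DEADLINE is
       taken from the kernel UAPI headers):

           #define _GNU_SOURCE
           #include <stdint.h>
           #include <stdio.h>
           #include <unistd.h>
           #include <sys/syscall.h>

           #ifndef SCHED_DEADLINE
           #define SCHED_DEADLINE 6          /* kernel UAPI value */
           #endif

           struct sched_attr {               /* layout per sched_setattr(2) */
               uint32_t size;
               uint32_t sched_policy;
               uint64_t sched_flags;
               int32_t  sched_nice;
               uint32_t sched_priority;
               uint64_t sched_runtime;       /* nanoseconds */
               uint64_t sched_deadline;      /* nanoseconds */
               uint64_t sched_period;        /* nanoseconds */
           };

           int
           main(void)
           {
               struct sched_attr attr = {
                   .size           = sizeof(attr),
                   .sched_policy   = SCHED_DEADLINE,
                   .sched_runtime  =  10 * 1000 * 1000,    /*  10 ms */
                   .sched_deadline =  30 * 1000 * 1000,    /*  30 ms */
                   .sched_period   = 100 * 1000 * 1000,    /* 100 ms */
               };

               /* pid 0: the calling thread; requires CAP_SYS_NICE */
               if (syscall(SYS_sched_setattr, 0, &attr, 0) == -1) {
                   perror("sched_setattr");
                   return 1;
               }
               return 0;
           }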
228
229 The CBS guarantees non-interference between tasks, by throttling
230 threads that attempt to over-run their specified Runtime.
231
232 To ensure deadline scheduling guarantees, the kernel must prevent situ‐
233 ations where the set of SCHED_DEADLINE threads is not feasible (schedu‐
234 lable) within the given constraints. The kernel thus performs an
       admission test when setting or changing SCHED_DEADLINE policy and
236 attributes. This admission test calculates whether the change is fea‐
237 sible; if it is not, sched_setattr(2) fails with the error EBUSY.
238
239 For example, it is required (but not necessarily sufficient) for the
240 total utilization to be less than or equal to the total number of CPUs
241 available, where, since each thread can maximally run for Runtime per
242 Period, that thread's utilization is its Runtime divided by its Period.
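
       For example (with illustrative numbers), a SCHED_DEADLINE thread with
       a Runtime of 10 ms and a Period of 100 ms has a utilization of
       10/100 = 0.1.  On a system with 2 CPUs, a set of 21 such threads has a
       total utilization of 2.1, which exceeds 2, so the admission test must
       reject it; a smaller set is not necessarily admitted, since this
       condition is necessary but not sufficient (and the bandwidth reserved
       for non-real-time processes, described below, further reduces what is
       available).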
243
244 In order to fulfill the guarantees that are made when a thread is
245 admitted to the SCHED_DEADLINE policy, SCHED_DEADLINE threads are the
246 highest priority (user controllable) threads in the system; if any
247 SCHED_DEADLINE thread is runnable, it will preempt any thread scheduled
248 under one of the other policies.
249
250 A call to fork(2) by a thread scheduled under the SCHED_DEADLINE policy
251 fails with the error EAGAIN, unless the thread has its reset-on-fork
252 flag set (see below).
253
254 A SCHED_DEADLINE thread that calls sched_yield(2) will yield the cur‐
255 rent job and wait for a new period to begin.
256
257 SCHED_OTHER: Default Linux time-sharing scheduling
       SCHED_OTHER can be used only at static priority 0 (i.e., threads under
259 real-time policies always have priority over SCHED_OTHER processes).
260 SCHED_OTHER is the standard Linux time-sharing scheduler that is
261 intended for all threads that do not require the special real-time
262 mechanisms.
263
264 The thread to run is chosen from the static priority 0 list based on a
265 dynamic priority that is determined only inside this list. The dynamic
       priority is based on the nice value (see below) and is increased for
       each time quantum during which the thread is ready to run but is
       denied the CPU by the scheduler.  This ensures fair progress among all
       SCHED_OTHER threads.
269
270 In the Linux kernel source code, the SCHED_OTHER policy is actually
271 named SCHED_NORMAL.
272
273 The nice value
274 The nice value is an attribute that can be used to influence the CPU
275 scheduler to favor or disfavor a process in scheduling decisions. It
276 affects the scheduling of SCHED_OTHER and SCHED_BATCH (see below) pro‐
277 cesses. The nice value can be modified using nice(2), setpriority(2),
278 or sched_setattr(2).
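
       For example, the calling thread's nice value could be read and then
       raised by 5 (lowering its priority) as in the following minimal
       sketch; note that getpriority(2) can legitimately return -1, so errno
       must be checked:

           #include <errno.h>
           #include <stdio.h>
           #include <sys/resource.h>

           int
           main(void)
           {
               int nice_val;

               errno = 0;
               nice_val = getpriority(PRIO_PROCESS, 0);  /* 0: the caller */
               if (nice_val == -1 && errno != 0) {
                   perror("getpriority");
                   return 1;
               }

               /* Become "nicer" by 5; out-of-range values are clamped */
               if (setpriority(PRIO_PROCESS, 0, nice_val + 5) == -1) {
                   perror("setpriority");
                   return 1;
               }
               return 0;
           }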
279
280 According to POSIX.1, the nice value is a per-process attribute; that
281 is, the threads in a process should share a nice value. However, on
282 Linux, the nice value is a per-thread attribute: different threads in
283 the same process may have different nice values.
284
285 The range of the nice value varies across UNIX systems. On modern
286 Linux, the range is -20 (high priority) to +19 (low priority). On some
       other systems, the range is -20..20.  Very early Linux kernels (before
       Linux 2.0) had the range -infinity..15.
289
290 The degree to which the nice value affects the relative scheduling of
291 SCHED_OTHER processes likewise varies across UNIX systems and across
292 Linux kernel versions.
293
294 With the advent of the CFS scheduler in kernel 2.6.23, Linux adopted an
295 algorithm that causes relative differences in nice values to have a
296 much stronger effect. In the current implementation, each unit of dif‐
297 ference in the nice values of two processes results in a factor of 1.25
298 in the degree to which the scheduler favors the higher priority
       process.  This causes low-priority nice values (+19) to truly provide
       little CPU to a process whenever there is any other higher priority
       load on the system, and makes high-priority nice values (-20) deliver
       most of the CPU to applications that require it (e.g., some audio
       applications).
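
       As a rough illustration: two CPU-bound SCHED_OTHER processes whose
       nice values differ by 5 units are weighted approximately in the ratio
       1.25^5 (about 3 to 1), so the more favored process receives roughly
       75% of the CPU and the less favored process roughly 25%.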
303
304 On Linux, the RLIMIT_NICE resource limit can be used to define a limit
305 to which an unprivileged process's nice value can be raised; see setr‐
306 limit(2) for details.
307
308 For further details on the nice value, see the subsections on the auto‐
309 group feature and group scheduling, below.
310
311 SCHED_BATCH: Scheduling batch processes
312 (Since Linux 2.6.16.) SCHED_BATCH can be used only at static priority
313 0. This policy is similar to SCHED_OTHER in that it schedules the
314 thread according to its dynamic priority (based on the nice value).
315 The difference is that this policy will cause the scheduler to always
316 assume that the thread is CPU-intensive. Consequently, the scheduler
317 will apply a small scheduling penalty with respect to wakeup behavior,
318 so that this thread is mildly disfavored in scheduling decisions.
319
320 This policy is useful for workloads that are noninteractive, but do not
321 want to lower their nice value, and for workloads that want a determin‐
322 istic scheduling policy without interactivity causing extra preemptions
323 (between the workload's tasks).
324
325 SCHED_IDLE: Scheduling very low priority jobs
326 (Since Linux 2.6.23.) SCHED_IDLE can be used only at static priority
327 0; the process nice value has no influence for this policy.
328
329 This policy is intended for running jobs at extremely low priority
330 (lower even than a +19 nice value with the SCHED_OTHER or SCHED_BATCH
331 policies).
332
333 Resetting scheduling policy for child processes
334 Each thread has a reset-on-fork scheduling flag. When this flag is
335 set, children created by fork(2) do not inherit privileged scheduling
336 policies. The reset-on-fork flag can be set by either:
337
338 * ORing the SCHED_RESET_ON_FORK flag into the policy argument when
339 calling sched_setscheduler(2) (since Linux 2.6.32); or
340
341 * specifying the SCHED_FLAG_RESET_ON_FORK flag in attr.sched_flags
342 when calling sched_setattr(2).
343
344 Note that the constants used with these two APIs have different names.
345 The state of the reset-on-fork flag can analogously be retrieved using
346 sched_getscheduler(2) and sched_getattr(2).
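
       For example, the flag could be set together with a SCHED_FIFO policy
       as in the following minimal sketch (the fallback definition of
       SCHED_RESET_ON_FORK takes its value from the kernel UAPI headers, for
       C libraries whose <sched.h> does not define the constant):

           #include <sched.h>
           #include <stdio.h>

           #ifndef SCHED_RESET_ON_FORK
           #define SCHED_RESET_ON_FORK 0x40000000   /* kernel UAPI value */
           #endif

           int
           main(void)
           {
               struct sched_param sp = { .sched_priority = 10 };

               /* SCHED_FIFO for this thread; children created by fork(2)
                  fall back to SCHED_OTHER and a non-negative nice value */
               if (sched_setscheduler(0, SCHED_FIFO | SCHED_RESET_ON_FORK,
                                      &sp) == -1) {
                   perror("sched_setscheduler");
                   return 1;
               }
               return 0;
           }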
347
348 The reset-on-fork feature is intended for media-playback applications,
       and can be used to prevent applications from evading the RLIMIT_RTTIME
350 resource limit (see getrlimit(2)) by creating multiple child processes.
351
352 More precisely, if the reset-on-fork flag is set, the following rules
353 apply for subsequently created children:
354
355 * If the calling thread has a scheduling policy of SCHED_FIFO or
356 SCHED_RR, the policy is reset to SCHED_OTHER in child processes.
357
358 * If the calling process has a negative nice value, the nice value is
359 reset to zero in child processes.
360
361 After the reset-on-fork flag has been enabled, it can be reset only if
362 the thread has the CAP_SYS_NICE capability. This flag is disabled in
363 child processes created by fork(2).
364
365 Privileges and resource limits
366 In Linux kernels before 2.6.12, only privileged (CAP_SYS_NICE) threads
       could set a nonzero static priority (i.e., set a real-time scheduling
       policy).  The only change that an unprivileged thread could make was
       to set the SCHED_OTHER policy, and this could be done only if the
       effective user ID of the caller matched the real or effective user ID
       of the target thread (i.e., the thread specified by pid) whose policy
       was being changed.
373
374 A thread must be privileged (CAP_SYS_NICE) in order to set or modify a
375 SCHED_DEADLINE policy.
376
377 Since Linux 2.6.12, the RLIMIT_RTPRIO resource limit defines a ceiling
378 on an unprivileged thread's static priority for the SCHED_RR and
379 SCHED_FIFO policies. The rules for changing scheduling policy and pri‐
380 ority are as follows:
381
382 * If an unprivileged thread has a nonzero RLIMIT_RTPRIO soft limit,
383 then it can change its scheduling policy and priority, subject to
384 the restriction that the priority cannot be set to a value higher
385 than the maximum of its current priority and its RLIMIT_RTPRIO soft
386 limit.
387
388 * If the RLIMIT_RTPRIO soft limit is 0, then the only permitted
389 changes are to lower the priority, or to switch to a non-real-time
390 policy.
391
392 * Subject to the same rules, another unprivileged thread can also make
393 these changes, as long as the effective user ID of the thread making
394 the change matches the real or effective user ID of the target
395 thread.
396
397 * Special rules apply for the SCHED_IDLE policy. In Linux kernels
398 before 2.6.39, an unprivileged thread operating under this policy
399 cannot change its policy, regardless of the value of its
400 RLIMIT_RTPRIO resource limit. In Linux kernels since 2.6.39, an
401 unprivileged thread can switch to either the SCHED_BATCH or the
402 SCHED_OTHER policy so long as its nice value falls within the range
403 permitted by its RLIMIT_NICE resource limit (see getrlimit(2)).
404
405 Privileged (CAP_SYS_NICE) threads ignore the RLIMIT_RTPRIO limit; as
406 with older kernels, they can make arbitrary changes to scheduling pol‐
407 icy and priority. See getrlimit(2) for further information on
408 RLIMIT_RTPRIO.
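
       For example, the limit could be inspected with getrlimit(2) as in the
       following minimal sketch:

           #include <stdio.h>
           #include <sys/resource.h>

           int
           main(void)
           {
               struct rlimit rlim;

               if (getrlimit(RLIMIT_RTPRIO, &rlim) == -1) {
                   perror("getrlimit");
                   return 1;
               }
               printf("RLIMIT_RTPRIO soft=%llu hard=%llu\n",
                      (unsigned long long) rlim.rlim_cur,
                      (unsigned long long) rlim.rlim_max);
               return 0;
           }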
409
410 Limiting the CPU usage of real-time and deadline processes
411 A nonblocking infinite loop in a thread scheduled under the SCHED_FIFO,
412 SCHED_RR, or SCHED_DEADLINE policy can potentially block all other
413 threads from accessing the CPU forever. Prior to Linux 2.6.25, the
414 only way of preventing a runaway real-time process from freezing the
415 system was to run (at the console) a shell scheduled under a higher
       static priority than the tested application.  This allowed an emergency
417 kill of tested real-time applications that do not block or terminate as
418 expected.
419
420 Since Linux 2.6.25, there are other techniques for dealing with runaway
421 real-time and deadline processes. One of these is to use the
422 RLIMIT_RTTIME resource limit to set a ceiling on the CPU time that a
423 real-time process may consume. See getrlimit(2) for details.
424
425 Since version 2.6.25, Linux also provides two /proc files that can be
426 used to reserve a certain amount of CPU time to be used by non-real-
427 time processes. Reserving CPU time in this fashion allows some CPU
428 time to be allocated to (say) a root shell that can be used to kill a
429 runaway process. Both of these files specify time values in microsec‐
430 onds:
431
432 /proc/sys/kernel/sched_rt_period_us
433 This file specifies a scheduling period that is equivalent to
434 100% CPU bandwidth. The value in this file can range from 1 to
435 INT_MAX, giving an operating range of 1 microsecond to around 35
436 minutes. The default value in this file is 1,000,000 (1 sec‐
437 ond).
438
439 /proc/sys/kernel/sched_rt_runtime_us
440 The value in this file specifies how much of the "period" time
441 can be used by all real-time and deadline scheduled processes on
442 the system. The value in this file can range from -1 to
443 INT_MAX-1. Specifying -1 makes the run time the same as the
444 period; that is, no CPU time is set aside for non-real-time pro‐
445 cesses (which was the Linux behavior before kernel 2.6.25). The
446 default value in this file is 950,000 (0.95 seconds), meaning
447 that 5% of the CPU time is reserved for processes that don't run
448 under a real-time or deadline scheduling policy.
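
       For example, the resulting reservation could be computed from these
       two files as in the following minimal sketch (which assumes the
       positive default values described above; a runtime of -1 is not
       handled):

           #include <stdio.h>

           int
           main(void)
           {
               long period = 0, runtime = -1;
               FILE *f;

               if ((f = fopen("/proc/sys/kernel/sched_rt_period_us", "r"))) {
                   if (fscanf(f, "%ld", &period) != 1)
                       period = 0;
                   fclose(f);
               }
               if ((f = fopen("/proc/sys/kernel/sched_rt_runtime_us", "r"))) {
                   if (fscanf(f, "%ld", &runtime) != 1)
                       runtime = -1;
                   fclose(f);
               }
               if (period > 0 && runtime >= 0)
                   printf("Real-time tasks may use %.1f%% of each period\n",
                          100.0 * runtime / period);
               return 0;
           }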
449
450 Response time
451 A blocked high priority thread waiting for I/O has a certain response
452 time before it is scheduled again. The device driver writer can
453 greatly reduce this response time by using a "slow interrupt" interrupt
454 handler.
455
456 Miscellaneous
457 Child processes inherit the scheduling policy and parameters across a
458 fork(2). The scheduling policy and parameters are preserved across
459 execve(2).
460
461 Memory locking is usually needed for real-time processes to avoid pag‐
462 ing delays; this can be done with mlock(2) or mlockall(2).
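
       For example, a real-time process might lock its address space at
       startup as in the following minimal sketch (the process needs
       CAP_IPC_LOCK or a sufficient RLIMIT_MEMLOCK limit):

           #include <stdio.h>
           #include <sys/mman.h>

           int
           main(void)
           {
               /* Lock current and future mappings so that the time-critical
                  path does not incur page faults and paging delays */
               if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
                   perror("mlockall");
                   return 1;
               }
               /* ... time-critical work ... */
               return 0;
           }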
463
464 The autogroup feature
465 Since Linux 2.6.38, the kernel provides a feature known as autogrouping
466 to improve interactive desktop performance in the face of multiprocess,
467 CPU-intensive workloads such as building the Linux kernel with large
468 numbers of parallel build processes (i.e., the make(1) -j flag).
469
470 This feature operates in conjunction with the CFS scheduler and
471 requires a kernel that is configured with CONFIG_SCHED_AUTOGROUP. On a
472 running system, this feature is enabled or disabled via the file
473 /proc/sys/kernel/sched_autogroup_enabled; a value of 0 disables the
474 feature, while a value of 1 enables it. The default value in this file
475 is 1, unless the kernel was booted with the noautogroup parameter.
476
477 A new autogroup is created when a new session is created via setsid(2);
478 this happens, for example, when a new terminal window is started. A
479 new process created by fork(2) inherits its parent's autogroup member‐
480 ship. Thus, all of the processes in a session are members of the same
481 autogroup. An autogroup is automatically destroyed when the last
482 process in the group terminates.
483
484 When autogrouping is enabled, all of the members of an autogroup are
485 placed in the same kernel scheduler "task group". The CFS scheduler
486 employs an algorithm that equalizes the distribution of CPU cycles
487 across task groups. The benefits of this for interactive desktop per‐
488 formance can be described via the following example.
489
490 Suppose that there are two autogroups competing for the same CPU (i.e.,
491 presume either a single CPU system or the use of taskset(1) to confine
492 all the processes to the same CPU on an SMP system). The first group
493 contains ten CPU-bound processes from a kernel build started with
494 make -j10. The other contains a single CPU-bound process: a video
495 player. The effect of autogrouping is that the two groups will each
496 receive half of the CPU cycles. That is, the video player will receive
497 50% of the CPU cycles, rather than just 9% of the cycles, which would
498 likely lead to degraded video playback. The situation on an SMP system
499 is more complex, but the general effect is the same: the scheduler dis‐
500 tributes CPU cycles across task groups such that an autogroup that con‐
501 tains a large number of CPU-bound processes does not end up hogging CPU
502 cycles at the expense of the other jobs on the system.
503
504 A process's autogroup (task group) membership can be viewed via the
505 file /proc/[pid]/autogroup:
506
507 $ cat /proc/1/autogroup
508 /autogroup-1 nice 0
509
510 This file can also be used to modify the CPU bandwidth allocated to an
511 autogroup. This is done by writing a number in the "nice" range to the
512 file to set the autogroup's nice value. The allowed range is from +19
513 (low priority) to -20 (high priority). (Writing values outside of this
514 range causes write(2) to fail with the error EINVAL.)
515
516 The autogroup nice setting has the same meaning as the process nice
517 value, but applies to distribution of CPU cycles to the autogroup as a
518 whole, based on the relative nice values of other autogroups. For a
519 process inside an autogroup, the CPU cycles that it receives will be a
520 product of the autogroup's nice value (compared to other autogroups)
521 and the process's nice value (compared to other processes in the same
       autogroup).
523
524 The use of the cgroups(7) CPU controller to place processes in cgroups
525 other than the root CPU cgroup overrides the effect of autogrouping.
526
527 The autogroup feature groups only processes scheduled under non-real-
528 time policies (SCHED_OTHER, SCHED_BATCH, and SCHED_IDLE). It does not
529 group processes scheduled under real-time and deadline policies. Those
530 processes are scheduled according to the rules described earlier.
531
532 The nice value and group scheduling
533 When scheduling non-real-time processes (i.e., those scheduled under
534 the SCHED_OTHER, SCHED_BATCH, and SCHED_IDLE policies), the CFS sched‐
535 uler employs a technique known as "group scheduling", if the kernel was
536 configured with the CONFIG_FAIR_GROUP_SCHED option (which is typical).
537
538 Under group scheduling, threads are scheduled in "task groups". Task
539 groups have a hierarchical relationship, rooted under the initial task
540 group on the system, known as the "root task group". Task groups are
541 formed in the following circumstances:
542
543 * All of the threads in a CPU cgroup form a task group. The parent of
544 this task group is the task group of the corresponding parent
545 cgroup.
546
547 * If autogrouping is enabled, then all of the threads that are
548 (implicitly) placed in an autogroup (i.e., the same session, as cre‐
549 ated by setsid(2)) form a task group. Each new autogroup is thus a
550 separate task group. The root task group is the parent of all such
551 autogroups.
552
553 * If autogrouping is enabled, then the root task group consists of all
554 processes in the root CPU cgroup that were not otherwise implicitly
555 placed into a new autogroup.
556
557 * If autogrouping is disabled, then the root task group consists of
558 all processes in the root CPU cgroup.
559
560 * If group scheduling was disabled (i.e., the kernel was configured
561 without CONFIG_FAIR_GROUP_SCHED), then all of the processes on the
562 system are notionally placed in a single task group.
563
564 Under group scheduling, a thread's nice value has an effect for sched‐
565 uling decisions only relative to other threads in the same task group.
566 This has some surprising consequences in terms of the traditional
567 semantics of the nice value on UNIX systems. In particular, if auto‐
568 grouping is enabled (which is the default in various distributions),
569 then employing setpriority(2) or nice(1) on a process has an effect
570 only for scheduling relative to other processes executed in the same
571 session (typically: the same terminal window).
572
573 Conversely, for two processes that are (for example) the sole CPU-bound
574 processes in different sessions (e.g., different terminal windows, each
575 of whose jobs are tied to different autogroups), modifying the nice
576 value of the process in one of the sessions has no effect in terms of
577 the scheduler's decisions relative to the process in the other session.
578 A possibly useful workaround here is to use a command such as the fol‐
579 lowing to modify the autogroup nice value for all of the processes in a
580 terminal session:
581
582 $ echo 10 > /proc/self/autogroup
583
584 Real-time features in the mainline Linux kernel
585 Since kernel version 2.6.18, Linux is gradually becoming equipped with
586 real-time capabilities, most of which are derived from the former real‐
587 time-preempt patch set. Until the patches have been completely merged
588 into the mainline kernel, they must be installed to achieve the best
589 real-time performance. These patches are named:
590
591 patch-kernelversion-rtpatchversion
592
593 and can be downloaded from ⟨http://www.kernel.org/pub/linux/kernel
594 /projects/rt/⟩.
595
596 Without the patches and prior to their full inclusion into the mainline
597 kernel, the kernel configuration offers only the three preemption
598 classes CONFIG_PREEMPT_NONE, CONFIG_PREEMPT_VOLUNTARY, and CONFIG_PRE‐
       EMPT_DESKTOP, which respectively provide no, some, and considerable
600 reduction of the worst-case scheduling latency.
601
602 With the patches applied or after their full inclusion into the main‐
603 line kernel, the additional configuration item CONFIG_PREEMPT_RT
604 becomes available. If this is selected, Linux is transformed into a
605 regular real-time operating system. The FIFO and RR scheduling poli‐
606 cies are then used to run a thread with true real-time priority and a
607 minimum worst-case scheduling latency.
608
NOTES
       The cgroups(7) CPU controller can be used to limit the CPU consumption
611 of groups of processes.
612
       Originally, Standard Linux was intended as a general-purpose operating
       system able to handle background processes, interactive applications,
       and less demanding real-time applications (applications that usually
       need to meet timing deadlines).  Although the Linux 2.6 kernel allowed
       for kernel preemption and the newly introduced O(1) scheduler ensured
       that the time needed to schedule is fixed and deterministic
       irrespective of the number of active tasks, true real-time computing
       was not possible up to kernel version 2.6.17.
621
SEE ALSO
       chcpu(1), chrt(1), lscpu(1), ps(1), taskset(1), top(1), getpriority(2),
624 mlock(2), mlockall(2), munlock(2), munlockall(2), nice(2),
625 sched_get_priority_max(2), sched_get_priority_min(2),
626 sched_getaffinity(2), sched_getparam(2), sched_getscheduler(2),
627 sched_rr_get_interval(2), sched_setaffinity(2), sched_setparam(2),
628 sched_setscheduler(2), sched_yield(2), setpriority(2),
629 pthread_getaffinity_np(3), pthread_getschedparam(3),
630 pthread_setaffinity_np(3), sched_getcpu(3), capabilities(7), cpuset(7)
631
632 Programming for the real world - POSIX.4 by Bill O. Gallmeister,
633 O'Reilly & Associates, Inc., ISBN 1-56592-074-0.
634
635 The Linux kernel source files Documentation/scheduler/sched-
636 deadline.txt, Documentation/scheduler/sched-rt-group.txt,
637 Documentation/scheduler/sched-design-CFS.txt, and
638 Documentation/scheduler/sched-nice-design.txt
639
COLOPHON
       This page is part of release 5.04 of the Linux man-pages project.  A
642 description of the project, information about reporting bugs, and the
643 latest version of this page, can be found at
644 https://www.kernel.org/doc/man-pages/.
645
646
647
Linux                            2019-08-02                          SCHED(7)