SCHED_SETAFFINITY(2)       Linux Programmer's Manual      SCHED_SETAFFINITY(2)

NAME
       sched_setaffinity, sched_getaffinity - set and get a thread's CPU
       affinity mask

SYNOPSIS
       #define _GNU_SOURCE             /* See feature_test_macros(7) */
       #include <sched.h>

       int sched_setaffinity(pid_t pid, size_t cpusetsize,
                             const cpu_set_t *mask);
       int sched_getaffinity(pid_t pid, size_t cpusetsize,
                             cpu_set_t *mask);

DESCRIPTION
       A thread's CPU affinity mask determines the set of CPUs on which it is
       eligible to run. On a multiprocessor system, setting the CPU affinity
       mask can be used to obtain performance benefits. For example, by
       dedicating one CPU to a particular thread (i.e., setting the affinity
       mask of that thread to specify a single CPU, and setting the affinity
       mask of all other threads to exclude that CPU), it is possible to
       ensure maximum execution speed for that thread. Restricting a thread
       to run on a single CPU also avoids the performance cost caused by the
       cache invalidation that occurs when a thread ceases to execute on one
       CPU and then recommences execution on a different CPU.

       A CPU affinity mask is represented by the cpu_set_t structure, a "CPU
       set", pointed to by mask. A set of macros for manipulating CPU sets is
       described in CPU_SET(3).
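
       As a minimal sketch of how the CPU_SET(3) macros are typically
       combined, the fragment below builds a small CPU set. The function name
       build_example_set() and the choice of CPUs 0 and 2 are illustrative
       assumptions, not part of the interface described on this page.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: fill *set with CPUs 0 and 2 and return the
   number of CPUs that the set then contains. */
int
build_example_set(cpu_set_t *set)
{
    assert(set != NULL);

    CPU_ZERO(set);              /* Start from an empty set */
    CPU_SET(0, set);            /* Add CPU 0 */
    CPU_SET(1, set);            /* Add CPU 1 ... */
    CPU_CLR(1, set);            /* ... and remove it again */
    CPU_SET(2, set);            /* Add CPU 2 */

    return CPU_COUNT(set);      /* Number of CPUs in the set */
}
```

       After the call, CPU_ISSET(0, set) and CPU_ISSET(2, set) are nonzero,
       while CPU_ISSET(1, set) is zero.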

       sched_setaffinity() sets the CPU affinity mask of the thread whose ID
       is pid to the value specified by mask. If pid is zero, then the
       calling thread is used. The argument cpusetsize is the length (in
       bytes) of the data pointed to by mask. Normally this argument would be
       specified as sizeof(cpu_set_t).

       If the thread specified by pid is not currently running on one of the
       CPUs specified in mask, then that thread is migrated to one of the
       CPUs specified in mask.

       sched_getaffinity() writes the affinity mask of the thread whose ID is
       pid into the cpu_set_t structure pointed to by mask. The cpusetsize
       argument specifies the size (in bytes) of mask. If pid is zero, then
       the mask of the calling thread is returned.
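
       The two calls are often used together. The following sketch reads the
       calling thread's current mask and then pins the thread to the
       lowest-numbered CPU found in it. The helper name
       pin_to_first_allowed_cpu() is hypothetical, and the code assumes that
       the kernel mask fits in a cpu_set_t.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: save the calling thread's current mask in
   *old_mask, then restrict the thread to the lowest-numbered CPU in
   that mask. Returns the chosen CPU number, or -1 on failure. */
int
pin_to_first_allowed_cpu(cpu_set_t *old_mask)
{
    cpu_set_t new_mask;

    assert(old_mask != NULL);

    /* A pid of 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(*old_mask), old_mask) == -1)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, old_mask)) {
            CPU_ZERO(&new_mask);
            CPU_SET(cpu, &new_mask);
            if (sched_setaffinity(0, sizeof(new_mask), &new_mask) == -1)
                return -1;
            return cpu;
        }
    }

    return -1;                  /* No CPU found in the mask */
}
```

       Because the previous mask is handed back to the caller, it can later
       be restored with a second sched_setaffinity() call.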

RETURN VALUE
       On success, sched_setaffinity() and sched_getaffinity() return 0 (but
       see "C library/kernel differences" below, which notes that the
       underlying sched_getaffinity() system call differs in its return
       value). On failure, -1 is returned, and errno is set to indicate the
       error.

ERRORS
       EFAULT A supplied memory address was invalid.

       EINVAL The affinity bit mask mask contains no processors that are
              currently physically present on the system and permitted to the
              thread according to any restrictions that may be imposed by
              cpuset cgroups or the "cpuset" mechanism described in
              cpuset(7).

       EINVAL (sched_getaffinity() and, in kernels before 2.6.9,
              sched_setaffinity()) cpusetsize is smaller than the size of the
              affinity mask used by the kernel.

       EPERM  (sched_setaffinity()) The calling thread does not have
              appropriate privileges. The caller needs an effective user ID
              equal to the real user ID or effective user ID of the thread
              identified by pid, or it must possess the CAP_SYS_NICE
              capability in the user namespace of the thread pid.

       ESRCH  The thread whose ID is pid could not be found.
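
       One possible way of distinguishing these errors in a caller is
       sketched below. The helper name try_setaffinity() and the wording of
       the diagnostics are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: try to apply *mask to the thread pid. Returns 0
   on success. On failure, prints a diagnostic on stderr and returns
   the errno value set by sched_setaffinity(). */
int
try_setaffinity(pid_t pid, const cpu_set_t *mask)
{
    assert(mask != NULL);

    if (sched_setaffinity(pid, sizeof(*mask), mask) == 0)
        return 0;

    int err = errno;            /* Save errno before calling stdio */

    switch (err) {
    case EINVAL:
        fprintf(stderr, "mask contains no usable CPUs\n");
        break;
    case EPERM:
        fprintf(stderr, "not permitted to change thread %ld\n", (long) pid);
        break;
    case ESRCH:
        fprintf(stderr, "no thread with ID %ld\n", (long) pid);
        break;
    default:
        fprintf(stderr, "sched_setaffinity: %s\n", strerror(err));
        break;
    }

    return err;
}
```

       For instance, passing a set that has just been cleared with CPU_ZERO()
       makes the call fail with EINVAL, since such a mask contains no
       processors at all.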

VERSIONS
       The CPU affinity system calls were introduced in Linux kernel 2.5.8.
       The system call wrappers were introduced in glibc 2.3. Initially, the
       glibc interfaces included a cpusetsize argument, typed as unsigned
       int. In glibc 2.3.3, the cpusetsize argument was removed, but was then
       restored in glibc 2.3.4, with type size_t.

CONFORMING TO
       These system calls are Linux-specific.

NOTES
       After a call to sched_setaffinity(), the set of CPUs on which the
       thread will actually run is the intersection of the set specified in
       the mask argument and the set of CPUs actually present on the system.
       The system may further restrict the set of CPUs on which the thread
       runs if the "cpuset" mechanism described in cpuset(7) is being used.
       These restrictions on the actual set of CPUs on which the thread will
       run are silently imposed by the kernel.

       There are various ways of determining the number of CPUs available on
       the system, including: inspecting the contents of /proc/cpuinfo; using
       sysconf(3) to obtain the values of the _SC_NPROCESSORS_CONF and
       _SC_NPROCESSORS_ONLN parameters; and inspecting the list of CPU
       directories under /sys/devices/system/cpu/.
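
       Of these approaches, the sysconf(3) one is the simplest to use from C.
       A minimal sketch follows; the helper name cpu_counts() is
       hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: store the number of configured and of online
   CPUs as reported by sysconf(3). Returns 0 on success, or -1 if
   either value is unavailable. */
int
cpu_counts(long *configured, long *online)
{
    assert(configured != NULL && online != NULL);

    *configured = sysconf(_SC_NPROCESSORS_CONF);
    *online = sysconf(_SC_NPROCESSORS_ONLN);

    return (*configured < 1 || *online < 1) ? -1 : 0;
}
```

       These values describe the system as a whole. The number of CPUs on
       which a particular thread may run can instead be obtained by applying
       CPU_COUNT(3) to the mask returned by sched_getaffinity().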

       sched(7) has a description of the Linux scheduling scheme.

       The affinity mask is a per-thread attribute that can be adjusted
       independently for each of the threads in a thread group. The value
       returned from a call to gettid(2) can be passed in the argument pid.
       Specifying pid as 0 will set the attribute for the calling thread, and
       passing the value returned from a call to getpid(2) will set the
       attribute for the main thread of the thread group. (If you are using
       the POSIX threads API, then use pthread_setaffinity_np(3) instead of
       sched_setaffinity().)
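
       The following sketch passes an explicit thread ID rather than 0. It
       invokes the raw gettid system call through syscall(2), because the
       glibc gettid() wrapper is available only since glibc 2.30. The helper
       name get_mask_by_tid() is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fetch the calling thread's affinity mask by
   naming the thread explicitly. Returns 0 on success, -1 on error. */
int
get_mask_by_tid(cpu_set_t *mask)
{
    assert(mask != NULL);

    pid_t tid = (pid_t) syscall(SYS_gettid);    /* Kernel thread ID */

    return sched_getaffinity(tid, sizeof(*mask), mask);
}
```

       The mask obtained this way is the same as the one returned when pid is
       specified as 0; the two can be compared with CPU_EQUAL(3).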

       The isolcpus boot option can be used to isolate one or more CPUs at
       boot time, so that no processes are scheduled onto those CPUs.
       Following the use of this boot option, the only way to schedule
       processes onto the isolated CPUs is via sched_setaffinity() or the
       cpuset(7) mechanism. For further information, see the kernel source
       file Documentation/admin-guide/kernel-parameters.txt. As noted in that
       file, isolcpus is the preferred mechanism of isolating CPUs (versus
       the alternative of manually setting the CPU affinity of all processes
       on the system).

       A child created via fork(2) inherits its parent's CPU affinity mask.
       The affinity mask is preserved across an execve(2).
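
       The inheritance across fork(2) can be observed with a sketch such as
       the following, in which the child compares its own mask with the copy
       of the parent's mask that it received at fork time. The helper name
       child_inherits_mask() and the exit-status convention are assumptions
       made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child and set *same to 1 if the child's
   affinity mask equals the parent's, or to 0 if it differs. Returns 0
   on success, -1 on error. */
int
child_inherits_mask(int *same)
{
    cpu_set_t parent_mask, child_mask;
    int status;
    pid_t child;

    assert(same != NULL);

    if (sched_getaffinity(0, sizeof(parent_mask), &parent_mask) == -1)
        return -1;

    child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {           /* Child: report result via exit status */
        if (sched_getaffinity(0, sizeof(child_mask), &child_mask) == -1)
            _exit(2);
        _exit(CPU_EQUAL(&parent_mask, &child_mask) ? 0 : 1);
    }

    /* Parent: collect the child's verdict */
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status)
            || WEXITSTATUS(status) > 1)
        return -1;

    *same = (WEXITSTATUS(status) == 0);
    return 0;
}
```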

   C library/kernel differences
       This manual page describes the glibc interface for the CPU affinity
       calls. The actual system call interface is slightly different, with
       the mask being typed as unsigned long *, reflecting the fact that the
       underlying implementation of CPU sets is a simple bit mask.

       On success, the raw sched_getaffinity() system call returns the number
       of bytes copied into the mask buffer; this will be the minimum of
       cpusetsize and the size (in bytes) of the cpumask_t data type that is
       used internally by the kernel to represent the CPU set bit mask.
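
       The difference in return value can be seen by invoking the system call
       directly through syscall(2), as in the sketch below. The helper name
       raw_getaffinity() is hypothetical, and the caller is assumed to supply
       a buffer that is large enough for the kernel's mask.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: call the raw system call for the calling
   thread, bypassing the glibc wrapper. buf must point to nlongs
   unsigned longs. Returns the number of bytes that the kernel copied
   into buf, or -1 on error. */
long
raw_getaffinity(unsigned long *buf, size_t nlongs)
{
    assert(buf != NULL);

    return syscall(SYS_sched_getaffinity, 0, nlongs * sizeof(*buf), buf);
}
```

       Unlike the glibc wrapper, which returns 0 on success, this helper
       returns a positive byte count that is no larger than the buffer size
       passed in.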

   Handling systems with large CPU affinity masks
       The underlying system calls (which represent CPU masks as bit masks of
       type unsigned long *) impose no restriction on the size of the CPU
       mask. However, the cpu_set_t data type used by glibc has a fixed size
       of 128 bytes, meaning that the maximum CPU number that can be
       represented is 1023. If the kernel CPU affinity mask is larger than
       1024 bits, then calls of the form:

           sched_getaffinity(pid, sizeof(cpu_set_t), &mask);

       fail with the error EINVAL, the error produced by the underlying
       system call for the case where the mask size specified in cpusetsize
       is smaller than the size of the affinity mask used by the kernel.
       (Depending on the system CPU topology, the kernel affinity mask can be
       substantially larger than the number of active CPUs in the system.)

       When working on systems with large kernel CPU affinity masks, one must
       dynamically allocate the mask argument (see CPU_ALLOC(3)). Currently,
       the only way to do this is by probing for the size of the required
       mask using sched_getaffinity() calls with increasing mask sizes (until
       the call does not fail with the error EINVAL).
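
       The probing approach just described might be sketched as follows. The
       helper name get_affinity_dynamic(), the doubling strategy, and the
       upper bound MAX_PROBE_CPUS are assumptions of this sketch rather than
       anything prescribed by the interface.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Arbitrary cap (an assumption of this sketch) so that the probing
   loop cannot grow without limit. */
#define MAX_PROBE_CPUS (1 << 20)

/* Hypothetical helper: return a dynamically allocated CPU set holding
   the calling thread's affinity mask, and store the size of the set
   (in bytes) in *setsize. The caller releases the set with CPU_FREE().
   Returns NULL on error. */
cpu_set_t *
get_affinity_dynamic(size_t *setsize)
{
    assert(setsize != NULL);

    for (int ncpus = CPU_SETSIZE; ncpus <= MAX_PROBE_CPUS; ncpus *= 2) {
        cpu_set_t *set = CPU_ALLOC(ncpus);
        if (set == NULL)
            return NULL;

        *setsize = CPU_ALLOC_SIZE(ncpus);
        CPU_ZERO_S(*setsize, set);

        if (sched_getaffinity(0, *setsize, set) == 0)
            return set;         /* The mask was large enough */

        int err = errno;        /* Save errno across CPU_FREE() */
        CPU_FREE(set);
        if (err != EINVAL)
            return NULL;        /* Some other failure: give up */

        /* EINVAL: mask too small, so retry with twice as many CPUs */
    }

    return NULL;
}
```

       A set obtained in this way must be examined with the "_S" variants of
       the CPU_SET(3) macros, for example CPU_COUNT_S(setsize, set).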

       Be aware that CPU_ALLOC(3) may allocate a slightly larger CPU set than
       requested (because CPU sets are implemented as bit masks allocated in
       units of sizeof(long)). Consequently, sched_getaffinity() can set bits
       beyond the requested allocation size, because the kernel sees a few
       additional bits. Therefore, the caller should iterate over the bits in
       the returned set, counting those which are set, and stop upon reaching
       the value returned by CPU_COUNT(3) (rather than iterating over the
       number of bits requested to be allocated).
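
       The iteration rule above could be implemented along the following
       lines. The helper name list_cpus() and its output-array interface are
       hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: write the numbers of the CPUs in set into
   out[] (at most max_out of them) and return how many were written.
   setsize is the size of the set in bytes. */
int
list_cpus(const cpu_set_t *set, size_t setsize, int *out, int max_out)
{
    assert(set != NULL && out != NULL);

    int total = CPU_COUNT_S(setsize, set);      /* Number of bits set */
    int found = 0;

    /* Stop once every set bit has been seen, rather than after the
       number of CPUs originally requested from CPU_ALLOC(). */
    for (int cpu = 0; found < total && found < max_out; cpu++) {
        if (CPU_ISSET_S(cpu, setsize, set))
            out[found++] = cpu;
    }

    return found;
}
```

       The loop terminates because CPU_COUNT_S() counts only bits that lie
       within setsize bytes, so every bit it reports is eventually visited.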

EXAMPLES
       The program below creates a child process. The parent and child then
       each assign themselves to a specified CPU and execute identical loops
       that consume some CPU time. Before terminating, the parent waits for
       the child to complete. The program takes three command-line arguments:
       the CPU number for the parent, the CPU number for the child, and the
       number of loop iterations that both processes should perform.

       As the sample runs below demonstrate, the amount of real and CPU time
       consumed when running the program will depend on intra-core caching
       effects and whether the processes are using the same CPU.

       We first employ lscpu(1) to determine that this (x86) system has two
       cores, each with two CPUs:

           $ lscpu | egrep -i 'core.*:|socket'
           Thread(s) per core:    2
           Core(s) per socket:    2
           Socket(s):             1

       We then time the operation of the example program for three cases:
       both processes running on the same CPU; both processes running on
       different CPUs on the same core; and both processes running on
       different CPUs on different cores.

           $ time -p ./a.out 0 0 100000000
           real 14.75
           user 3.02
           sys 11.73
           $ time -p ./a.out 0 1 100000000
           real 11.52
           user 3.98
           sys 19.06
           $ time -p ./a.out 0 3 100000000
           real 7.89
           user 3.29
           sys 12.07

   Program source

       #define _GNU_SOURCE
       #include <sched.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>
       #include <sys/wait.h>

       #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                                  } while (0)

       int
       main(int argc, char *argv[])
       {
           cpu_set_t set;
           int parentCPU, childCPU;
           int nloops;

           if (argc != 4) {
               fprintf(stderr, "Usage: %s parent-cpu child-cpu num-loops\n",
                       argv[0]);
               exit(EXIT_FAILURE);
           }

           parentCPU = atoi(argv[1]);
           childCPU = atoi(argv[2]);
           nloops = atoi(argv[3]);

           CPU_ZERO(&set);

           switch (fork()) {
           case -1:            /* Error */
               errExit("fork");

           case 0:             /* Child */
               CPU_SET(childCPU, &set);

               if (sched_setaffinity(getpid(), sizeof(set), &set) == -1)
                   errExit("sched_setaffinity");

               for (int j = 0; j < nloops; j++)
                   getppid();

               exit(EXIT_SUCCESS);

           default:            /* Parent */
               CPU_SET(parentCPU, &set);

               if (sched_setaffinity(getpid(), sizeof(set), &set) == -1)
                   errExit("sched_setaffinity");

               for (int j = 0; j < nloops; j++)
                   getppid();

               wait(NULL);     /* Wait for child to terminate */
               exit(EXIT_SUCCESS);
           }
       }

SEE ALSO
       lscpu(1), nproc(1), taskset(1), clone(2), getcpu(2), getpriority(2),
       gettid(2), nice(2), sched_get_priority_max(2),
       sched_get_priority_min(2), sched_getscheduler(2),
       sched_setscheduler(2), setpriority(2), CPU_SET(3), get_nprocs(3),
       pthread_setaffinity_np(3), sched_getcpu(3), capabilities(7),
       cpuset(7), sched(7), numactl(8)

COLOPHON
       This page is part of release 5.13 of the Linux man-pages project. A
       description of the project, information about reporting bugs, and the
       latest version of this page, can be found at
       https://www.kernel.org/doc/man-pages/.

Linux                             2021-03-22              SCHED_SETAFFINITY(2)