SCHED_SETAFFINITY(2) Linux Programmer's Manual SCHED_SETAFFINITY(2)

NAME
sched_setaffinity, sched_getaffinity - set and get a thread's CPU
affinity mask

SYNOPSIS
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <sched.h>

int sched_setaffinity(pid_t pid, size_t cpusetsize,
                      const cpu_set_t *mask);

int sched_getaffinity(pid_t pid, size_t cpusetsize,
                      cpu_set_t *mask);

DESCRIPTION
A thread's CPU affinity mask determines the set of CPUs on which it is
eligible to run. On a multiprocessor system, setting the CPU affinity
mask can be used to obtain performance benefits. For example, by
dedicating one CPU to a particular thread (i.e., setting the affinity
mask of that thread to specify a single CPU, and setting the affinity
mask of all other threads to exclude that CPU), it is possible to
ensure maximum execution speed for that thread. Restricting a thread to
run on a single CPU also avoids the performance cost caused by the
cache invalidation that occurs when a thread ceases to execute on one
CPU and then recommences execution on a different CPU.

A CPU affinity mask is represented by the cpu_set_t structure, a "CPU
set", pointed to by mask. A set of macros for manipulating CPU sets is
described in CPU_SET(3).

sched_setaffinity() sets the CPU affinity mask of the thread whose ID
is pid to the value specified by mask. If pid is zero, then the calling
thread is used. The argument cpusetsize is the length (in bytes) of the
data pointed to by mask. Normally this argument would be specified as
sizeof(cpu_set_t).

If the thread specified by pid is not currently running on one of the
CPUs specified in mask, then that thread is migrated to one of the CPUs
specified in mask.

sched_getaffinity() writes the affinity mask of the thread whose ID is
pid into the cpu_set_t structure pointed to by mask. The cpusetsize
argument specifies the size (in bytes) of mask. If pid is zero, then
the mask of the calling thread is returned.

RETURN VALUE
On success, sched_setaffinity() and sched_getaffinity() return 0. On
error, -1 is returned, and errno is set appropriately.

ERRORS
EFAULT A supplied memory address was invalid.

EINVAL The affinity bit mask, mask, contains no processors that are
       currently physically on the system and permitted to the thread
       according to any restrictions that may be imposed by cpuset
       cgroups or the "cpuset" mechanism described in cpuset(7).

EINVAL (sched_getaffinity() and, in kernels before 2.6.9,
       sched_setaffinity()) cpusetsize is smaller than the size of the
       affinity mask used by the kernel.

EPERM  (sched_setaffinity()) The calling thread does not have
       appropriate privileges. The caller needs an effective user ID
       equal to the real user ID or effective user ID of the thread
       identified by pid, or it must possess the CAP_SYS_NICE
       capability in the user namespace of the thread pid.

ESRCH  The thread whose ID is pid could not be found.

VERSIONS
The CPU affinity system calls were introduced in Linux kernel 2.5.8.
The system call wrappers were introduced in glibc 2.3. Initially, the
glibc interfaces included a cpusetsize argument, typed as unsigned int.
In glibc 2.3.3, the cpusetsize argument was removed, but was then
restored in glibc 2.3.4, with type size_t.

CONFORMING TO
These system calls are Linux-specific.

NOTES
After a call to sched_setaffinity(), the set of CPUs on which the
thread will actually run is the intersection of the set specified in
the mask argument and the set of CPUs actually present on the system.
The system may further restrict the set of CPUs on which the thread
runs if the "cpuset" mechanism described in cpuset(7) is being used.
These restrictions on the actual set of CPUs on which the thread will
run are silently imposed by the kernel.

There are various ways of determining the number of CPUs available on
the system, including: inspecting the contents of /proc/cpuinfo; using
sysconf(3) to obtain the values of the _SC_NPROCESSORS_CONF and
_SC_NPROCESSORS_ONLN parameters; and inspecting the list of CPU
directories under /sys/devices/system/cpu/.

sched(7) has a description of the Linux scheduling scheme.

The affinity mask is a per-thread attribute that can be adjusted
independently for each of the threads in a thread group. The value
returned from a call to gettid(2) can be passed in the argument pid.
Specifying pid as 0 will set the attribute for the calling thread, and
passing the value returned from a call to getpid(2) will set the
attribute for the main thread of the thread group. (If you are using
the POSIX threads API, then use pthread_setaffinity_np(3) instead of
sched_setaffinity().)

The isolcpus boot option can be used to isolate one or more CPUs at
boot time, so that no processes are scheduled onto those CPUs.
Following the use of this boot option, the only way to schedule
processes onto the isolated CPUs is via sched_setaffinity() or the
cpuset(7) mechanism. For further information, see the kernel source
file Documentation/admin-guide/kernel-parameters.txt. As noted in that
file, isolcpus is the preferred mechanism for isolating CPUs (versus
the alternative of manually setting the CPU affinity of all processes
on the system).

A child created via fork(2) inherits its parent's CPU affinity mask.
The affinity mask is preserved across an execve(2).

C library/kernel differences
This manual page describes the glibc interface for the CPU affinity
calls. The actual system call interface is slightly different, with
the mask being typed as unsigned long *, reflecting the fact that the
underlying implementation of CPU sets is a simple bit mask.

On success, the raw sched_getaffinity() system call returns the number
of bytes copied into the mask buffer; this will be the minimum of
cpusetsize and the size (in bytes) of the cpumask_t data type that is
used internally by the kernel to represent the CPU set bit mask.

Handling systems with large CPU affinity masks
The underlying system calls (which represent CPU masks as bit masks of
type unsigned long *) impose no restriction on the size of the CPU
mask. However, the cpu_set_t data type used by glibc has a fixed size
of 128 bytes, meaning that the maximum CPU number that can be
represented is 1023. If the kernel CPU affinity mask is larger than
1024 bits, then calls of the form:

    sched_getaffinity(pid, sizeof(cpu_set_t), &mask);

fail with the error EINVAL, the error produced by the underlying system
call for the case where the mask size specified in cpusetsize is
smaller than the size of the affinity mask used by the kernel.
(Depending on the system CPU topology, the kernel affinity mask can be
substantially larger than the number of active CPUs in the system.)

When working on systems with large kernel CPU affinity masks, one must
dynamically allocate the mask argument (see CPU_ALLOC(3)). Currently,
the only way to do this is by probing for the size of the required mask
using sched_getaffinity() calls with increasing mask sizes (until the
call does not fail with the error EINVAL).

Be aware that CPU_ALLOC(3) may allocate a slightly larger CPU set than
requested (because CPU sets are implemented as bit masks allocated in
units of sizeof(long)). Consequently, sched_getaffinity() can set bits
beyond the requested allocation size, because the kernel sees a few
additional bits. Therefore, the caller should iterate over the bits in
the returned set, counting those which are set, and stop upon reaching
the value returned by CPU_COUNT(3) (rather than iterating over the
number of bits requested to be allocated).

EXAMPLE
The program below creates a child process. The parent and child then
each assign themselves to a specified CPU and execute identical loops
that consume some CPU time. Before terminating, the parent waits for
the child to complete. The program takes three command-line arguments:
the CPU number for the parent, the CPU number for the child, and the
number of loop iterations that both processes should perform.

As the sample runs below demonstrate, the amount of real and CPU time
consumed when running the program will depend on intra-core caching
effects and whether the processes are using the same CPU.

We first employ lscpu(1) to determine that this (x86) system has two
cores, each with two CPUs:

    $ lscpu | egrep -i 'core.*:|socket'
    Thread(s) per core: 2
    Core(s) per socket: 2
    Socket(s): 1

We then time the operation of the example program for three cases: both
processes running on the same CPU; both processes running on different
CPUs on the same core; and both processes running on different CPUs on
different cores.

    $ time -p ./a.out 0 0 100000000
    real 14.75
    user 3.02
    sys 11.73
    $ time -p ./a.out 0 1 100000000
    real 11.52
    user 3.98
    sys 19.06
    $ time -p ./a.out 0 3 100000000
    real 7.89
    user 3.29
    sys 12.07

Program source

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    /* Print a message describing the last error and terminate */
    #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                            } while (0)

    int
    main(int argc, char *argv[])
    {
        cpu_set_t set;
        int parentCPU, childCPU;
        int nloops, j;

        if (argc != 4) {
            fprintf(stderr, "Usage: %s parent-cpu child-cpu num-loops\n",
                    argv[0]);
            exit(EXIT_FAILURE);
        }

        parentCPU = atoi(argv[1]);
        childCPU = atoi(argv[2]);
        nloops = atoi(argv[3]);

        /* Start from an empty set; each process adds its own CPU */
        CPU_ZERO(&set);

        switch (fork()) {
        case -1:            /* Error */
            errExit("fork");

        case 0:             /* Child */
            CPU_SET(childCPU, &set);

            if (sched_setaffinity(getpid(), sizeof(set), &set) == -1)
                errExit("sched_setaffinity");

            /* Consume CPU time with a cheap system call */
            for (j = 0; j < nloops; j++)
                getppid();

            exit(EXIT_SUCCESS);

        default:            /* Parent */
            CPU_SET(parentCPU, &set);

            if (sched_setaffinity(getpid(), sizeof(set), &set) == -1)
                errExit("sched_setaffinity");

            for (j = 0; j < nloops; j++)
                getppid();

            wait(NULL);     /* Wait for child to terminate */
            exit(EXIT_SUCCESS);
        }
    }

SEE ALSO
lscpu(1), nproc(1), taskset(1), clone(2), getcpu(2), getpriority(2),
gettid(2), nice(2), sched_get_priority_max(2),
sched_get_priority_min(2), sched_getscheduler(2),
sched_setscheduler(2), setpriority(2), CPU_SET(3), get_nprocs(3),
pthread_setaffinity_np(3), sched_getcpu(3), capabilities(7), cpuset(7),
sched(7), numactl(8)

COLOPHON
This page is part of release 5.02 of the Linux man-pages project. A
description of the project, information about reporting bugs, and the
latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.

Linux 2019-03-06 SCHED_SETAFFINITY(2)