MEMBARRIER(2)              Linux Programmer's Manual             MEMBARRIER(2)

NAME
membarrier - issue memory barriers on a set of threads

SYNOPSIS
    #include <linux/membarrier.h>

    int membarrier(int cmd, int flags);

DESCRIPTION
The membarrier() system call helps reduce the overhead of the memory
barrier instructions required to order memory accesses on multi-core
systems. However, this system call is heavier than a memory barrier,
so using it effectively is not as simple as replacing memory barriers
with this system call; it requires an understanding of the details
below.

When using memory barriers, keep in mind that each barrier must either
be matched with its counterpart barriers, or be used on an architecture
whose memory model does not require the matching barriers.

There are cases where one side of the matching barriers (which we will
refer to as the "fast side") is executed much more often than the other
(which we will refer to as the "slow side"). This is a prime target for
the use of membarrier(). The key idea is to replace, for these matching
barriers, the fast-side memory barriers with simple compiler barriers,
for example:

    asm volatile ("" : : : "memory")

and to replace the slow-side memory barriers with calls to membarrier().

This adds overhead to the slow side and removes overhead from the fast
side, resulting in an overall performance increase as long as the slow
side is infrequent enough that the overhead of the membarrier() calls
does not outweigh the performance gain on the fast side.

The cmd argument is one of the following:

MEMBARRIER_CMD_QUERY
    Query the set of supported commands. The return value of the
    call is a bit mask of supported commands. MEMBARRIER_CMD_QUERY,
    which has the value 0, is not itself included in this bit mask.
    This command is always supported (on kernels where membarrier()
    is provided).

MEMBARRIER_CMD_SHARED
    Ensure that all threads from all processes on the system pass
    through a state where all memory accesses to user-space
    addresses match program order between entry to and return from
    the membarrier() system call. All threads on the system are
    targeted by this command.

MEMBARRIER_CMD_PRIVATE_EXPEDITED (since Linux 4.14)
    Execute a memory barrier on each running thread belonging to the
    same process as the current thread. Upon return from the system
    call, the calling thread is assured that all its running thread
    siblings have passed through a state where all memory accesses
    to user-space addresses match program order between entry to and
    return from the system call (non-running threads are de facto in
    such a state). This covers only threads from the same process
    as the calling thread.

    The "expedited" commands complete faster than the non-expedited
    ones; they never block, but have the downside of causing extra
    overhead. A process must register its intent to use the private
    expedited command before using it.

MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED (since Linux 4.14)
    Register the process's intent to use
    MEMBARRIER_CMD_PRIVATE_EXPEDITED.
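
The registration requirement for the private expedited command can be
sketched as follows. This is a minimal illustration, assuming the
syscall(2) wrapper used in the example on this page;
private_expedited_init() and private_expedited_barrier() are
hypothetical helper names, not part of any library.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

/* Invoke the system call directly through syscall(2). */
static int
membarrier(int cmd, int flags)
{
    return syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper: call once at startup, before the first
 * MEMBARRIER_CMD_PRIVATE_EXPEDITED call.  Returns 0 on success, or -1
 * if the commands are unavailable or registration fails.
 */
static int
private_expedited_init(void)
{
    int need = MEMBARRIER_CMD_PRIVATE_EXPEDITED |
               MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED;
    int mask = membarrier(MEMBARRIER_CMD_QUERY, 0);

    /* A failed query or a missing bit means we cannot proceed. */
    if (mask < 0 || (mask & need) != need)
        return -1;

    if (membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) < 0)
        return -1;

    return 0;
}

/* Slow-side barrier; use only after private_expedited_init() returned 0. */
static int
private_expedited_barrier(void)
{
    return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
}
```

A program would call private_expedited_init() once, and on success use
private_expedited_barrier() wherever its slow side needs a barrier.
Skipping the registration step is what leads to the EPERM error.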

The flags argument is currently unused and must be specified as 0.

All memory accesses performed in program order from each targeted
thread are guaranteed to be ordered with respect to membarrier().

If we use the semantic barrier() to represent a compiler barrier forcing
memory accesses to be performed in program order across the barrier,
and smp_mb() to represent explicit memory barriers forcing full memory
ordering across the barrier, we have the following ordering table for
each pairing of barrier(), membarrier(), and smp_mb(). The pair
ordering is detailed as (O: ordered, X: not ordered):

                    barrier()   smp_mb()   membarrier()
    barrier()           X           X            O
    smp_mb()            X           O            O
    membarrier()        O           O            O

RETURN VALUE
On success, the MEMBARRIER_CMD_QUERY operation returns a bit mask of
supported commands, and the MEMBARRIER_CMD_SHARED,
MEMBARRIER_CMD_PRIVATE_EXPEDITED, and
MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED operations return zero. On
error, -1 is returned, and errno is set appropriately.

For a given command, with flags set to 0, this system call is guaranteed
to always return the same value until reboot. Further calls with the
same arguments will lead to the same result. Therefore, with flags set
to 0, error handling is required only for the first call to
membarrier().
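
Because the result for a given command does not change until reboot, a
program can query the supported commands once and reuse the answer. The
sketch below assumes the syscall(2) wrapper used in the example on this
page; mask_has_cmd() and cached_query_mask() are hypothetical helper
names, and the caching shown is not thread-safe as written.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

/* Invoke the system call directly through syscall(2). */
static int
membarrier(int cmd, int flags)
{
    return syscall(__NR_membarrier, cmd, flags);
}

/*
 * Return 1 if cmd is set in a mask returned by MEMBARRIER_CMD_QUERY,
 * 0 otherwise.  A negative mask (failed query) supports nothing.
 */
static int
mask_has_cmd(int mask, int cmd)
{
    return mask >= 0 && (mask & cmd) != 0;
}

/*
 * Hypothetical helper: issue MEMBARRIER_CMD_QUERY on the first call
 * only, and return the remembered result (a bit mask, or -1 on error)
 * on every later call.  Not safe for concurrent first use.
 */
static int
cached_query_mask(void)
{
    static int cached;
    static int done;

    if (!done) {
        cached = membarrier(MEMBARRIER_CMD_QUERY, 0);
        done = 1;
    }
    return cached;
}
```

A caller could then test
mask_has_cmd(cached_query_mask(), MEMBARRIER_CMD_SHARED) at any point
without repeating the system call or its error handling.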

ERRORS
EINVAL  cmd is invalid, or flags is nonzero, or the
        MEMBARRIER_CMD_SHARED command is disabled because the nohz_full
        CPU parameter has been set.

ENOSYS  The membarrier() system call is not implemented by this kernel.

EPERM   The current process was not registered prior to using private
        expedited commands.
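
One way a caller might turn these error codes into diagnostics is
sketched below. membarrier_error_hint() is a hypothetical helper, not
part of any library; its messages simply restate the causes listed
above.

```c
#include <errno.h>

/*
 * Hypothetical helper: map the errno value left by a failed
 * membarrier() call to a short description of the documented cause.
 */
static const char *
membarrier_error_hint(int err)
{
    switch (err) {
    case EINVAL:
        return "invalid cmd, nonzero flags, or "
               "MEMBARRIER_CMD_SHARED disabled by nohz_full";
    case ENOSYS:
        return "membarrier() is not implemented by this kernel";
    case EPERM:
        return "process not registered for private expedited commands";
    default:
        return "unexpected error";
    }
}
```

After a call that returns -1, a program could print
membarrier_error_hint(errno) and, for ENOSYS, keep using ordinary
memory barriers on both sides instead.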

VERSIONS
The membarrier() system call was added in Linux 4.3.

CONFORMING TO
membarrier() is Linux-specific.

NOTES
A memory barrier instruction is part of the instruction set of
architectures with weakly ordered memory models. It orders memory
accesses prior to the barrier and after the barrier with respect to
matching barriers on other cores. For instance, a load fence can order
loads prior to and following that fence with respect to stores ordered
by store fences.

Program order is the order in which instructions appear in the
program's assembly code.

Examples where membarrier() can be useful include implementations of
Read-Copy-Update libraries and garbage collectors.

EXAMPLE
Assuming a multithreaded application where "fast_path()" is executed
very frequently, and where "slow_path()" is executed infrequently, the
following code (x86) can be transformed using membarrier():

    #include <stdlib.h>

    static volatile int a, b;

    static void
    fast_path(int *read_b)
    {
        a = 1;
        asm volatile ("mfence" : : : "memory");
        *read_b = b;
    }

    static void
    slow_path(int *read_a)
    {
        b = 1;
        asm volatile ("mfence" : : : "memory");
        *read_a = a;
    }

    int
    main(int argc, char **argv)
    {
        int read_a, read_b;

        /*
         * Real applications would call fast_path() and slow_path()
         * from different threads. Call those from main() to keep
         * this example short.
         */

        slow_path(&read_a);
        fast_path(&read_b);

        /*
         * read_b == 0 implies read_a == 1 and
         * read_a == 0 implies read_b == 1.
         */

        if (read_b == 0 && read_a == 0)
            abort();

        exit(EXIT_SUCCESS);
    }

The code above transformed to use membarrier() becomes:

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/membarrier.h>

    static volatile int a, b;

    /* Invoke the system call directly through syscall(2). */
    static int
    membarrier(int cmd, int flags)
    {
        return syscall(__NR_membarrier, cmd, flags);
    }

    static int
    init_membarrier(void)
    {
        int ret;

        /* Check that membarrier() is supported. */

        ret = membarrier(MEMBARRIER_CMD_QUERY, 0);
        if (ret < 0) {
            perror("membarrier");
            return -1;
        }

        if (!(ret & MEMBARRIER_CMD_SHARED)) {
            fprintf(stderr,
                "membarrier does not support MEMBARRIER_CMD_SHARED\n");
            return -1;
        }

        return 0;
    }

    static void
    fast_path(int *read_b)
    {
        a = 1;
        /* Compiler barrier only; replaces the fast-side mfence. */
        asm volatile ("" : : : "memory");
        *read_b = b;
    }

    static void
    slow_path(int *read_a)
    {
        b = 1;
        /* Replaces the slow-side mfence. */
        membarrier(MEMBARRIER_CMD_SHARED, 0);
        *read_a = a;
    }

    int
    main(int argc, char **argv)
    {
        int read_a, read_b;

        if (init_membarrier())
            exit(EXIT_FAILURE);

        /*
         * Real applications would call fast_path() and slow_path()
         * from different threads. Call those from main() to keep
         * this example short.
         */

        slow_path(&read_a);
        fast_path(&read_b);

        /*
         * read_b == 0 implies read_a == 1 and
         * read_a == 0 implies read_b == 1.
         */

        if (read_b == 0 && read_a == 0)
            abort();

        exit(EXIT_SUCCESS);
    }

COLOPHON
This page is part of release 4.15 of the Linux man-pages project. A
description of the project, information about reporting bugs, and the
latest version of this page, can be found at
https://www.kernel.org/doc/man-pages/.

Linux                             2017-11-15                     MEMBARRIER(2)