libxdp(3)        libxdp - library for loading XDP programs        libxdp(3)

libxdp - library for attaching XDP programs and using AF_XDP sockets
7
This directory contains the files for the libxdp library for attaching
XDP programs to network interfaces and using AF_XDP sockets. The
library is fairly lightweight and relies on libbpf to do the heavy
lifting for processing eBPF object files etc.
13
14
Libxdp provides two primary features on top of libbpf. The first is the
ability to load multiple XDP programs in sequence on a single network
device (which is not natively supported by the kernel). This support
relies on the freplace functionality in the kernel, which makes it
possible to attach an eBPF program as a replacement for a global
function in another (already loaded) eBPF program. The second main
feature is helper functions for configuring AF_XDP sockets as well as
reading and writing packets from these sockets.
23
24
Some of the functionality provided by libxdp depends on particular
kernel features; see the "Kernel feature compatibility" section below
for details.
28
29
Using libxdp from an application
Basic usage of libxdp from an application is quite straightforward.
The following example attaches an XDP program to the 'lo' interface,
then detaches it again:

    #include <xdp/libxdp.h>

    #define IFINDEX 1   /* ifindex of the 'lo' interface */

    struct xdp_program *prog;
    int err;

    /* Opening can fail; the failure is encoded in the returned pointer. */
    prog = xdp_program__open_file("my-program.o", "section_name", NULL);
    err = libxdp_get_error(prog);
    if (err)
            return err;

    err = xdp_program__attach(prog, IFINDEX, XDP_MODE_NATIVE, 0);
    if (!err)
            xdp_program__detach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

    xdp_program__close(prog);
47
48
The xdp_program structure is an opaque structure that represents a
single XDP program. libxdp contains functions to create such a struct
either from a BPF object file on disk, from a libbpf BPF object, or from
an identifier of a program that is already loaded into the kernel:

    struct xdp_program *xdp_program__from_bpf_obj(struct bpf_object *obj,
                                                  const char *section_name);
    struct xdp_program *xdp_program__find_file(const char *filename,
                                               const char *section_name,
                                               struct bpf_object_open_opts *opts);
    struct xdp_program *xdp_program__open_file(const char *filename,
                                               const char *section_name,
                                               struct bpf_object_open_opts *opts);
    struct xdp_program *xdp_program__from_fd(int fd);
    struct xdp_program *xdp_program__from_id(__u32 prog_id);
    struct xdp_program *xdp_program__from_pin(const char *pin_path);
65
66
The functions that open a BPF object or file need the function name of
the XDP program as well as the file name or object, since an ELF file
can contain multiple XDP programs. The xdp_program__find_file()
function takes a filename without a path, and will look for the object
in LIBXDP_OBJECT_PATH, which defaults to /usr/lib/bpf (or
/usr/lib64/bpf on systems using a split library path). This is
convenient for applications shipping pre-compiled eBPF object files.
74
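The lookup described above can be modelled in a few lines of standard
C. This is a minimal sketch, not libxdp's implementation:
build_object_path() and find_object_path() are hypothetical names, and
reading LIBXDP_OBJECT_PATH from the environment is an assumption made
for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Default search directory named in the text above. */
#define DEFAULT_OBJECT_PATH "/usr/lib/bpf"

/*
 * Hypothetical helper (not part of libxdp): join a search directory and
 * a bare filename. A NULL or empty dir selects the default directory.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
int build_object_path(const char *dir, const char *filename,
                      char *buf, size_t len)
{
        int n;

        if (!dir || !*dir)
                dir = DEFAULT_OBJECT_PATH;

        n = snprintf(buf, len, "%s/%s", dir, filename);
        return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Assumption: LIBXDP_OBJECT_PATH is consulted as an environment variable. */
int find_object_path(const char *filename, char *buf, size_t len)
{
        return build_object_path(getenv("LIBXDP_OBJECT_PATH"), filename,
                                 buf, len);
}
```

An application would pass only the bare file name, for example
find_object_path("my-program.o", buf, sizeof(buf)), and leave the
choice of directory to the environment.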
75
The xdp_program__attach() function will attach the program to an
interface, building a dispatcher program to execute it. Multiple
programs can be attached at once with xdp_program__attach_multi(); they
will be sorted in order of their run priority, and execution from one
program to the next will proceed based on the chain call actions defined
for each program (see the Program metadata section below). Because the
loading process involves modifying the attach type of the program, the
attach functions only work with struct xdp_program objects that have
not yet been loaded into the kernel.
85
86
When using the attach functions to attach to an interface that already
has an XDP program loaded, libxdp will attempt to add the program to
the list of loaded programs. However, this may fail, either due to
missing kernel support, or because the already-attached program was not
loaded using a dispatcher compatible with libxdp. If the kernel support
for incremental attach (merged in kernel 5.10) is missing, the only way
to actually run multiple programs on a single interface is to attach
them all at the same time with xdp_program__attach_multi(). If the
existing program is not an XDP dispatcher, that program will have to be
detached from the interface before libxdp can attach a new one. This
can be done by calling xdp_program__detach() with a reference to the
loaded program, but note that this will break any application that
relies on that other XDP program being present.
100
101
Program metadata
To support multiple XDP programs on the same interface, libxdp uses two
pieces of metadata for each XDP program: run priority and chain call
actions.
106
107
Run priority
This is the priority of the program: a simple integer used to sort
programs when loading multiple programs onto the same interface.
Programs that wish to run early (such as a packet filter) should set
low values for this, while programs that want to run later (such as a
packet forwarder or counter) should set higher values. Note that a
later program is only run if every previous program ends with a return
code that is part of its chain call actions (see below). If not
specified, the default priority value is 50.
117
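The ordering rule can be illustrated with qsort(). This is only a
sketch under stated assumptions: struct prog_entry and both function
names are hypothetical stand-ins, and the order of programs with equal
priority is deliberately left unspecified here.

```c
#include <stdlib.h>

#define DEFAULT_RUN_PRIO 50   /* default priority stated above */

/* Hypothetical stand-in for a program and its run priority. */
struct prog_entry {
        const char *name;
        unsigned int run_prio;
};

/* qsort comparator: a lower run priority sorts (and so runs) first. */
int cmp_run_prio(const void *a, const void *b)
{
        const struct prog_entry *pa = a, *pb = b;

        if (pa->run_prio != pb->run_prio)
                return pa->run_prio < pb->run_prio ? -1 : 1;
        return 0;
}

/* Sort an array of programs into execution order. */
void sort_by_run_prio(struct prog_entry *progs, size_t n)
{
        qsort(progs, n, sizeof(*progs), cmp_run_prio);
}
```

With a filter at priority 10, a counter at the default 50 and a
forwarder at 90, the sorted order is filter, counter, forwarder.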
118
Chain call actions
These are the return codes that the program uses to indicate that a
packet should continue processing. If the program returns one of these
actions, later programs in the call chain will be run, whereas if it
returns any other action, processing will be interrupted, and the XDP
dispatcher will return the verdict immediately. If not set, this
defaults to just XDP_PASS, which is likely the value most programs
should use.
127
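One natural way to picture a set of chain call actions is as a bitmask
with one bit per return code. The sketch below assumes that
representation for illustration only; enum action is a local mirror of
the kernel's enum xdp_action values, and the helper names are
hypothetical rather than libxdp API.

```c
#include <stdbool.h>

/* Local mirror of the kernel's enum xdp_action values. */
enum action { ACT_ABORTED = 0, ACT_DROP, ACT_PASS, ACT_TX, ACT_REDIRECT };

/* Default per the text above: only XDP_PASS continues the chain. */
#define DEFAULT_CHAIN_CALL_ACTIONS (1U << ACT_PASS)

/* Return a copy of mask with the bit for act set or cleared. */
unsigned int set_chain_call(unsigned int mask, enum action act, bool enabled)
{
        return enabled ? (mask | (1U << act)) : (mask & ~(1U << act));
}

/* True if returning act lets the next program in the chain run. */
bool chain_call_enabled(unsigned int mask, enum action act)
{
        return (mask & (1U << act)) != 0;
}
```

Starting from the default mask, enabling ACT_DROP gives a program whose
dropped packets are still seen by later programs, which can be useful
for a counter that runs after a filter.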
128
Specifying metadata
The metadata outlined above is specified as BTF information embedded in
the ELF file containing the XDP program. The xdp_helpers.h file shipped
with libxdp contains helper macros to include this information, which
can be used as follows:

    #include <bpf/bpf_helpers.h>
    #include <xdp/xdp_helpers.h>

    struct {
            __uint(priority, 10);
            __uint(XDP_PASS, 1);
            __uint(XDP_DROP, 1);
    } XDP_RUN_CONFIG(my_xdp_func);
143
144
This example specifies that the XDP program in my_xdp_func should have
priority 10 and that its chain call actions are XDP_PASS and XDP_DROP.
In a source file containing multiple XDP programs, a definition like
the above can be included for each program (main XDP function). Any
program that does not specify any config information will use the
default values outlined above.
151
152
Inspecting and modifying metadata
libxdp exposes the following functions that an application can use to
inspect and modify the metadata on an XDP program. Modification is only
possible before a program is attached on an interface. These functions
won't modify the BTF information itself, but the new values will be
stored as part of the program attachment.

    unsigned int xdp_program__run_prio(const struct xdp_program *xdp_prog);
    int xdp_program__set_run_prio(struct xdp_program *xdp_prog,
                                  unsigned int run_prio);
    bool xdp_program__chain_call_enabled(const struct xdp_program *xdp_prog,
                                         enum xdp_action action);
    int xdp_program__set_chain_call_enabled(struct xdp_program *prog,
                                            unsigned int action,
                                            bool enabled);
    int xdp_program__print_chain_call_actions(const struct xdp_program *prog,
                                              char *buf,
                                              size_t buf_len);
171
172
To support multiple non-offloaded programs on the same network
interface, libxdp uses a dispatcher program, a small wrapper program
that calls each component program in turn, checks the return code, and
then chain calls to the next program based on the chain call actions of
the previous program (see the Program metadata section above).
179
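The control flow of such a dispatcher can be modelled in ordinary C.
The following is a sketch of the idea only, not the BPF dispatcher that
libxdp generates: struct component, dispatch() and the toy programs are
hypothetical, enum verdict mirrors the kernel's enum xdp_action values,
and returning a pass verdict for an empty chain is an assumption of
this model.

```c
#include <stddef.h>

/* Local mirror of the kernel's enum xdp_action values. */
enum verdict {
        VERDICT_ABORTED = 0,
        VERDICT_DROP,
        VERDICT_PASS,
        VERDICT_TX,
        VERDICT_REDIRECT,
};

/* Hypothetical component slot: a program plus its chain call bitmask. */
struct component {
        enum verdict (*run)(void *pkt);
        unsigned int chain_call_actions;
};

/*
 * Run the components in order and stop at the first return code that
 * is not one of that component's chain call actions. The last return
 * code seen is the final verdict.
 */
enum verdict dispatch(const struct component *c, size_t n, void *pkt)
{
        enum verdict ret = VERDICT_PASS;   /* assumed for an empty chain */
        size_t i;

        for (i = 0; i < n; i++) {
                ret = c[i].run(pkt);
                if (!(c[i].chain_call_actions & (1U << ret)))
                        break;
        }
        return ret;
}

/* Two toy components used for illustration. */
enum verdict toy_pass(void *pkt) { (void)pkt; return VERDICT_PASS; }
enum verdict toy_drop(void *pkt) { (void)pkt; return VERDICT_DROP; }
```

In a chain of toy_pass, toy_drop, toy_pass where every component only
chains on a pass verdict, the third program never runs and the packet
is dropped. Adding the drop verdict to the second component's chain
call actions lets the third program run and decide the final verdict.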
180
While applications using libxdp do not need to know the details of the
dispatcher program to just load an XDP program onto an interface,
libxdp does expose the dispatcher and its attached component programs,
which can be used to list the programs currently attached to an
interface.
186
187
The structure used for this is struct xdp_multiprog, which can only be
constructed from the programs loaded on an interface based on ifindex.
The API for getting a multiprog reference and iterating through the
attached programs looks like this:

    struct xdp_multiprog *xdp_multiprog__get_from_ifindex(int ifindex);
    struct xdp_program *xdp_multiprog__next_prog(const struct xdp_program *prog,
                                                 const struct xdp_multiprog *mp);
    void xdp_multiprog__close(struct xdp_multiprog *mp);
    int xdp_multiprog__detach(struct xdp_multiprog *mp, int ifindex);
    enum xdp_attach_mode xdp_multiprog__attach_mode(const struct xdp_multiprog *mp);
    struct xdp_program *xdp_multiprog__main_prog(const struct xdp_multiprog *mp);
    struct xdp_program *xdp_multiprog__hw_prog(const struct xdp_multiprog *mp);
    bool xdp_multiprog__is_legacy(const struct xdp_multiprog *mp);
202
203
If a non-offloaded program that libxdp doesn't recognise as a
dispatcher program is attached to the interface, an xdp_multiprog
structure will still be returned, and xdp_multiprog__is_legacy() will
return true for it (note that this also holds true if only an offloaded
program is loaded). A reference to that (regular) XDP program can be
obtained with xdp_multiprog__main_prog(). If the program attached to
the interface is a dispatcher program, xdp_multiprog__main_prog() will
return a reference to the dispatcher program itself, which is mainly
useful for obtaining other data about that program (such as the program
ID). A reference to an offloaded program can be acquired using
xdp_multiprog__hw_prog(). The xdp_multiprog__attach_mode() function
returns the attach mode of the non-offloaded program; whether an
offloaded program is attached should be checked through
xdp_multiprog__hw_prog().
217
218
Pinning in bpffs
The kernel will automatically detach component programs from the
dispatcher once the last reference to them disappears. To prevent this
from happening, libxdp will pin the component program references in
bpffs before attaching the dispatcher to the network interface. The
pathnames generated for pinning are as follows:
225
226
227 — /sys/fs/bpf/xdp/dispatch-IFINDEX-DID - dispatcher program for
228 IFINDEX with BPF program ID DID
229
230 — /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-prog - component program
231 0, program reference
232
233 — /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-link - component program
234 0, bpf_link reference
235
236 — /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-prog - component program
237 1, program reference
238
239 — /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-link - component program
240 1, bpf_link reference
241
242 — etc, up to ten component programs
243
244
If set, the LIBXDP_BPFFS environment variable will override the
location of bpffs, but the xdp subdirectory is always used. If no bpffs
is mounted, libxdp will consult the environment variable
LIBXDP_BPFFS_AUTOMOUNT. If this is set to 1, libxdp will attempt to
automount a bpffs. If not, libxdp will fall back to loading a single
program without a dispatcher, as if the kernel did not support the
features needed for multiprog attachment.
252
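The naming scheme listed above can be reproduced with snprintf(). The
helper below is a hypothetical illustration of the path layout only; it
does not pin anything, and its name and parameters are assumptions, not
libxdp API. The bpffs argument plays the role of the LIBXDP_BPFFS
override.

```c
#include <stdio.h>

#define DEFAULT_BPFFS "/sys/fs/bpf"

/*
 * Hypothetical helper: format the pin path of one component program
 * reference. kind is "prog" or "link"; a NULL bpffs selects the default
 * mount point. Returns 0 on success, -1 if buf is too small.
 */
int component_pin_path(char *buf, size_t len, const char *bpffs,
                       int ifindex, unsigned int dispatcher_id,
                       unsigned int slot, const char *kind)
{
        int n;

        if (!bpffs)
                bpffs = DEFAULT_BPFFS;

        /* The "xdp" subdirectory is used regardless of the mount point. */
        n = snprintf(buf, len, "%s/xdp/dispatch-%d-%u/prog%u-%s",
                     bpffs, ifindex, dispatcher_id, slot, kind);
        return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For ifindex 2 and a dispatcher with program ID 41, the program
reference of the first component would be formatted as
/sys/fs/bpf/xdp/dispatch-2-41/prog0-prog.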
253
Libxdp implements helper functions for configuring AF_XDP sockets as
well as reading and writing packets from these sockets. AF_XDP sockets
can be used to redirect packets to user-space at high rates from an XDP
program. Note that this functionality used to reside in libbpf, but has
been moved over to libxdp as it is a better fit for this library. As of
the 1.0 release of libbpf, the AF_XDP socket support has been removed
from libbpf, and all further development is performed in libxdp
instead.
262
263
For an overview of AF_XDP sockets, please refer to this Linux Plumbers
paper
(http://vger.kernel.org/lpc_net2018_talks/lpc18_pres_af_xdp_perf-v3.pdf)
and the documentation in the Linux kernel
(Documentation/networking/af_xdp.rst or
https://www.kernel.org/doc/html/latest/networking/af_xdp.html).
269
270
For an example of how to use the interface, take a look at the
AF_XDP-example and AF_XDP-forwarding programs in the bpf-examples
repository: https://github.com/xdp-project/bpf-examples.
274
275
Control path
Libxdp provides helper functions for creating and destroying umems and
sockets, as shown below. The first thing that a user generally wants to
do is to create a umem area. This is the area that will contain all
packets received and the ones that are going to be sent. After that,
AF_XDP sockets tied to this umem can be created. These can either have
exclusive ownership of the umem, through xsk_socket__create(), or share
it with other sockets, using xsk_socket__create_shared(). There is one
option called XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD that can be set in
the libxdp_flags field (also called libbpf_flags for compatibility
reasons). This will make libxdp not load any XDP program or set any BPF
maps, which is a must if users want to add their own XDP program.
289
290
If a socket has already been created with socket(AF_XDP, SOCK_RAW, 0)
and is neither bound nor tied to any umem, the file descriptor of this
socket can be used with the xsk_umem__create_with_fd() variant of the
umem creation function.

    int xsk_umem__create(struct xsk_umem **umem,
                         void *umem_area, __u64 size,
                         struct xsk_ring_prod *fill,
                         struct xsk_ring_cons *comp,
                         const struct xsk_umem_config *config);
    int xsk_umem__create_with_fd(struct xsk_umem **umem,
                                 int fd, void *umem_area, __u64 size,
                                 struct xsk_ring_prod *fill,
                                 struct xsk_ring_cons *comp,
                                 const struct xsk_umem_config *config);
    int xsk_socket__create(struct xsk_socket **xsk,
                           const char *ifname, __u32 queue_id,
                           struct xsk_umem *umem,
                           struct xsk_ring_cons *rx,
                           struct xsk_ring_prod *tx,
                           const struct xsk_socket_config *config);
    int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
                                  const char *ifname,
                                  __u32 queue_id, struct xsk_umem *umem,
                                  struct xsk_ring_cons *rx,
                                  struct xsk_ring_prod *tx,
                                  struct xsk_ring_prod *fill,
                                  struct xsk_ring_cons *comp,
                                  const struct xsk_socket_config *config);
    int xsk_umem__delete(struct xsk_umem *umem);
    void xsk_socket__delete(struct xsk_socket *xsk);
322
323
There are also two helper functions to get the file descriptor of a
umem or a socket. These are needed when using standard Linux syscalls
such as poll(), recvmsg(), sendto(), etc.

    int xsk_umem__fd(const struct xsk_umem *umem);
    int xsk_socket__fd(const struct xsk_socket *xsk);
330
331
The control path also provides two APIs for setting up AF_XDP sockets
when the process that is going to use the AF_XDP socket is
non-privileged. These two functions perform the operations that require
privileges and can be executed from some form of control process that
has the necessary privileges. The xsk_socket__create() call executed in
the non-privileged process will then skip these two steps. For an
example of how to use these, please take a look at the AF_XDP-example
program in the bpf-examples repository:
https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-example.

    int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
    int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);
344
345
To further reduce the required level of privileges, an AF_XDP socket
can be created beforehand with socket(AF_XDP, SOCK_RAW, 0) and passed to
a non-privileged process. This socket can be used in
xsk_umem__create_with_fd() and later in xsk_socket__create() with the
created umem. xsk_socket__create_shared() would still require
privileges for AF_XDP socket creation.
352
353
Data path
For performance reasons, all the data path functions are static inline
functions found in the xsk.h header file, so they can be optimized into
the target application binary for best possible performance. There are
four FIFO rings of two main types: producer rings (fill and Tx) and
consumer rings (Rx and completion). The producer rings use
xsk_ring_prod functions and consumer rings use xsk_ring_cons functions.
For producer rings, you start by reserving one or more slots in the
ring and then, once they have been filled in, you submit them so that
the kernel will act on them. For a consumer ring, you peek to see
whether there are any new packets in the ring and, if so, you can read
them from the ring. Once you are done reading them, you release them
back to the kernel so it can use them for new packets. There is also a
cancel operation for consumer rings if the application does not want to
consume all packets received with the peek operation.

    __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx);
    void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb);
    __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx);
    void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb);
    void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb);
375
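The reserve/submit and peek/release protocol can be illustrated with a
toy ring in plain C. This is a single-threaded model written for this
explanation, not the memory layout or code used by libxdp or the
kernel: struct toy_ring and all toy_ functions are hypothetical, and the
real rings are shared with the kernel and need memory barriers that are
omitted here.

```c
#include <stdint.h>

#define RING_SIZE 8u               /* must be a power of two */
#define RING_MASK (RING_SIZE - 1)

/* Toy ring with free-running 32-bit counters. */
struct toy_ring {
        uint32_t reserved;   /* slots handed out to the producer */
        uint32_t producer;   /* slots published to the consumer */
        uint32_t peeked;     /* slots handed out to the consumer */
        uint32_t consumer;   /* slots returned to the producer */
        uint64_t slots[RING_SIZE];
};

/* Reserve nb slots for writing; all-or-nothing. */
uint32_t toy_reserve(struct toy_ring *r, uint32_t nb, uint32_t *idx)
{
        uint32_t free_slots = RING_SIZE - (r->reserved - r->consumer);

        if (free_slots < nb)
                return 0;
        *idx = r->reserved;
        r->reserved += nb;
        return nb;
}

/* Publish nb previously reserved slots to the consumer. */
void toy_submit(struct toy_ring *r, uint32_t nb)
{
        r->producer += nb;
}

/* Peek at up to nb published slots; returns how many were obtained. */
uint32_t toy_peek(struct toy_ring *r, uint32_t nb, uint32_t *idx)
{
        uint32_t avail = r->producer - r->peeked;

        if (avail > nb)
                avail = nb;
        if (avail) {
                *idx = r->peeked;
                r->peeked += avail;
        }
        return avail;
}

/* Hand back the last nb peeked slots without consuming them. */
void toy_cancel(struct toy_ring *r, uint32_t nb)
{
        r->peeked -= nb;
}

/* Return nb consumed slots so that they can be reserved again. */
void toy_release(struct toy_ring *r, uint32_t nb)
{
        r->consumer += nb;
}

/* Slot accessor; idx is masked, so callers simply advance with idx++. */
uint64_t *toy_slot(struct toy_ring *r, uint32_t idx)
{
        return &r->slots[idx & RING_MASK];
}
```

A producer calls toy_reserve(), writes each slot through toy_slot()
while incrementing idx, and then calls toy_submit(). A consumer calls
toy_peek(), reads the slots, optionally gives some back with
toy_cancel(), and finishes with toy_release() for the entries it
actually consumed.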
376
The functions below are used for reading and writing the descriptors of
the rings. xsk_ring_prod__fill_addr() and xsk_ring_prod__tx_desc()
write entries in the fill and Tx rings respectively, while
xsk_ring_cons__comp_addr() and xsk_ring_cons__rx_desc() read entries
from the completion and Rx rings respectively. The idx is the parameter
returned in the xsk_ring_prod__reserve() or xsk_ring_cons__peek()
calls. To advance to the next entry, simply do idx++.

    __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill, __u32 idx);
    struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx, __u32 idx);
    const __u64 *xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp, __u32 idx);
    const struct xdp_desc *xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx);
389
390
The xsk_umem functions are used to get a pointer to the packet data
itself, always located inside the umem. In the default aligned mode, you
can get the addr variable straight from the Rx descriptor. But in
unaligned mode, you need to use the last three functions below, as the
offset used is carried in the upper 16 bits of the addr. Therefore, you
cannot use the addr straight from the descriptor in the unaligned case.

    void *xsk_umem__get_data(void *umem_area, __u64 addr);
    __u64 xsk_umem__extract_addr(__u64 addr);
    __u64 xsk_umem__extract_offset(__u64 addr);
    __u64 xsk_umem__add_offset_to_addr(__u64 addr);
402
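The bit layout used in unaligned mode can be shown with plain integer
arithmetic. The sketch below re-implements the idea with hypothetical
model_ names so that it compiles without libxdp; the shift of 48 follows
from the statement above that the offset occupies the upper 16 bits of a
64-bit addr.

```c
#include <stdint.h>

/* The offset lives in the upper 16 bits, the base address in the low 48. */
#define UNALIGNED_OFFSET_SHIFT 48
#define UNALIGNED_ADDR_MASK ((1ULL << UNALIGNED_OFFSET_SHIFT) - 1)

/* Base address of the buffer inside the umem. */
uint64_t model_extract_addr(uint64_t addr)
{
        return addr & UNALIGNED_ADDR_MASK;
}

/* Offset of the packet data relative to the base address. */
uint64_t model_extract_offset(uint64_t addr)
{
        return addr >> UNALIGNED_OFFSET_SHIFT;
}

/* Plain umem-relative address of the packet data. */
uint64_t model_add_offset_to_addr(uint64_t addr)
{
        return model_extract_addr(addr) + model_extract_offset(addr);
}

/* Pointer to the packet data, given the start of the umem area. */
void *model_get_data(void *umem_area, uint64_t addr)
{
        return (char *)umem_area + addr;
}
```

For a descriptor addr with base 0x2000 and offset 256, the data starts
0x2100 bytes into the umem area, so the pointer is obtained with
model_get_data(umem_area, model_add_offset_to_addr(addr)).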
403
There is one more function in the data path, which checks whether the
need_wakeup flag is set. Use of this flag is highly encouraged; it is
enabled by setting the XDP_USE_NEED_WAKEUP bit in the bind_flags field
of the configuration provided to the xsk_socket__create[_shared]()
calls. If this function returns true, you need to call recvmsg(),
sendto(), or poll(), depending on the situation: recvmsg() if you are
receiving, or sendto() if you are sending. poll() can be used in both
cases and also provides the ability to sleep, as with any other socket.
Note, however, that poll() is a slower operation than the other two.

    int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r);
416
417
For an example of how to use all these APIs, take a look at the
AF_XDP-example and AF_XDP-forwarding programs in the bpf-examples
repository: https://github.com/xdp-project/bpf-examples.
421
422
Kernel feature compatibility
The features exposed by libxdp rely on certain kernel versions and BPF
features to work. To get the full benefit of all features, libxdp needs
to be used with kernel 5.10 or newer, unless the commits mentioned
below have been backported. However, libxdp will probe the kernel and
transparently fall back to legacy loading procedures, so it is possible
to use the library with older versions, although some features will be
unavailable, as detailed below.
431
432
The ability to attach multiple BPF programs to a single interface
relies on the kernel "BPF program extension" feature, which was
introduced by commit be8704ff07d2 ("bpf: Introduce dynamic program
extensions") in the upstream kernel and first appeared in kernel release
5.6. To incrementally attach multiple programs, a further refinement
added by commit 4a1e7c0c63e0 ("bpf: Support attaching freplace programs
to multiple attach points") is needed; this first appeared in the
upstream kernel version 5.10. The functionality relies on the "BPF
trampolines" feature, which is unfortunately only available on the
x86_64 architecture. In other words, kernels before 5.6 can only attach
a single XDP program to each interface, kernels 5.6+ can attach multiple
programs if they are all attached at the same time, and kernels 5.10+
have full support for XDP multiprog on x86_64. On other architectures,
only a single program can be attached to each interface.
447
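The version thresholds above can be summarised as a small lookup. This
is purely illustrative: as noted earlier, libxdp probes the kernel
rather than comparing version numbers, and distributions may backport
the relevant commits, so a check like this would be unreliable in real
code. The enum and function names are hypothetical.

```c
#include <stdbool.h>

enum multiprog_level {
        MULTIPROG_NONE,          /* one program per interface */
        MULTIPROG_SIMULTANEOUS,  /* several programs, attached together */
        MULTIPROG_INCREMENTAL,   /* programs can be added one at a time */
};

/* Map an upstream kernel version to the support level described above. */
enum multiprog_level expected_multiprog_level(unsigned int major,
                                              unsigned int minor,
                                              bool is_x86_64)
{
        if (!is_x86_64)
                return MULTIPROG_NONE;
        if (major > 5 || (major == 5 && minor >= 10))
                return MULTIPROG_INCREMENTAL;
        if (major == 5 && minor >= 6)
                return MULTIPROG_SIMULTANEOUS;
        return MULTIPROG_NONE;
}
```

For example, an upstream 5.8 kernel on x86_64 maps to
MULTIPROG_SIMULTANEOUS, while the same version on another architecture
maps to MULTIPROG_NONE.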
448
To use AF_XDP sockets, kernel support for AF_XDP needs to be included
and enabled in the kernel build. In addition, when using AF_XDP
sockets, an XDP program is also loaded on the interface. The XDP
program used for this by libxdp requires the ability to do map lookups
into XSK maps, which was introduced with commit fada7fdc83c0 ("bpf:
Allow bpf_map_lookup_elem() on an xskmap") in kernel 5.3. This means
that the minimum required kernel version for using AF_XDP is kernel
5.3; however, for the AF_XDP XDP program to co-exist with other
programs, the same constraints for multiprog apply as outlined above.
458
459
Note that some Linux distributions backport features to earlier kernel
versions, especially in enterprise kernels; for instance, Red Hat
Enterprise Linux kernels include everything needed for libxdp to
function since RHEL 8.5.
464
465
Finally, XDP programs loaded using the multiprog facility must include
type information (using the BPF Type Format, BTF). To get this, compile
the programs with a recent version of Clang/LLVM (version 10+), and
enable debug information when compiling (using the -g option).
470
471
Please report any bugs on GitHub:
https://github.com/xdp-project/xdp-tools/issues
475
476
libxdp and this man page were written by Toke Høiland-Jørgensen. AF_XDP
support and documentation were contributed by Magnus Karlsson.
480
481
482
v1.4.1                       October 20, 2023                      libxdp(3)