libxdp(3)          libxdp - library for loading XDP programs         libxdp(3)

NAME

       libxdp - library for attaching XDP programs and using AF_XDP sockets

SYNOPSIS

       This directory contains the files for the libxdp library for attaching
       XDP programs to network interfaces and using AF_XDP sockets. The
       library is fairly lightweight and relies on libbpf to do the heavy
       lifting for processing eBPF object files etc.

       Libxdp provides two primary features on top of libbpf. The first is the
       ability to load multiple XDP programs in sequence on a single network
       device (which is not natively supported by the kernel). This support
       relies on the freplace functionality in the kernel, which makes it
       possible to attach an eBPF program as a replacement for a global
       function in another (already loaded) eBPF program. The second main
       feature is helper functions for configuring AF_XDP sockets as well as
       reading packets from and writing packets to these sockets.

       Some of the functionality provided by libxdp depends on particular
       kernel features; see the "Kernel and BPF program feature compatibility"
       section below for details.

   Using libxdp from an application
       Basic usage of libxdp from an application is quite straightforward.
       The following example loads, then unloads, an XDP program from the 'lo'
       interface:

              #include <xdp/libxdp.h>

              #define IFINDEX 1 /* 'lo' normally has ifindex 1 */

              struct xdp_program *prog;
              int err;

              prog = xdp_program__open_file("my-program.o", "section_name", NULL);
              err = libxdp_get_error(prog); /* non-zero if opening failed */
              if (err)
                  return err;

              err = xdp_program__attach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

              if (!err)
                  xdp_program__detach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

              xdp_program__close(prog);

       The xdp_program structure is an opaque structure that represents a
       single XDP program. libxdp contains functions to create such a struct
       either from a BPF object file on disk, from a libbpf BPF object, or from
       an identifier of a program that is already loaded into the kernel:

              struct xdp_program *xdp_program__from_bpf_obj(struct bpf_object *obj,
                                             const char *section_name);
              struct xdp_program *xdp_program__find_file(const char *filename,
                                          const char *section_name,
                                          struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__open_file(const char *filename,
                                          const char *section_name,
                                          struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__from_fd(int fd);
              struct xdp_program *xdp_program__from_id(__u32 prog_id);
              struct xdp_program *xdp_program__from_pin(const char *pin_path);

       The functions that open a BPF object or file need the function name of
       the XDP program as well as the file name or object, since an ELF file
       can contain multiple XDP programs. The xdp_program__find_file()
       function takes a filename without a path, and will look for the object
       in LIBXDP_OBJECT_PATH, which defaults to /usr/lib/bpf (or /usr/lib64/bpf
       on systems using a split library path). This is convenient for
       applications shipping pre-compiled eBPF object files.

       The xdp_program__attach() function will attach the program to an
       interface, building a dispatcher program to execute it. Multiple
       programs can be attached at once with xdp_program__attach_multi(); they
       will be sorted in order of their run priority, and execution from one
       program to the next will proceed based on the chain call actions defined
       for each program (see the Program metadata section below). Because the
       loading process involves modifying the attach type of the program, the
       attach functions only work with struct xdp_program objects that have
       not yet been loaded into the kernel.

       When using the attach functions to attach to an interface that already
       has an XDP program loaded, libxdp will attempt to add the program to
       the list of loaded programs. However, this may fail, either due to
       missing kernel support, or because the already-attached program was not
       loaded using a dispatcher compatible with libxdp. If the kernel support
       for incremental attach (merged in kernel 5.10) is missing, the only way
       to actually run multiple programs on a single interface is to attach
       them all at the same time with xdp_program__attach_multi(). If the
       existing program is not an XDP dispatcher, that program will have to be
       detached from the interface before libxdp can attach a new one. This
       can be done by calling xdp_program__detach() with a reference to the
       loaded program; but note that this will of course break any application
       relying on that other XDP program to be present.

Program metadata

       To support multiple XDP programs on the same interface, libxdp uses two
       pieces of metadata for each XDP program: run priority and chain call
       actions.

   Run priority
       This is the priority of the program and is a simple integer used to
       sort programs when loading multiple programs onto the same interface.
       Programs that wish to run early (such as a packet filter) should set
       low values for this, while programs that want to run later (such as a
       packet forwarder or counter) should set higher values. Note that a
       later program is only run if each previous program ends with a return
       code that is part of that program's chain call actions (see below). If
       not specified, the default priority value is 50.

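       As an illustration of the ordering rule, the following self-contained C
       sketch sorts a set of programs by run priority. struct prog_meta and
       sort_by_run_prio() are hypothetical names invented for this example and
       are not part of libxdp; how libxdp orders programs that share the same
       priority is not modelled here.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a program's metadata; not a libxdp type. */
struct prog_meta {
    const char *name;
    unsigned int run_prio; /* lower values run earlier; 50 is the default */
};

/* qsort() comparator: ascending run priority. */
static int cmp_run_prio(const void *a, const void *b)
{
    const struct prog_meta *pa = a;
    const struct prog_meta *pb = b;

    return (pa->run_prio > pb->run_prio) - (pa->run_prio < pb->run_prio);
}

/* Put the programs in the order in which they would be executed. */
static void sort_by_run_prio(struct prog_meta *progs, size_t n)
{
    qsort(progs, n, sizeof(*progs), cmp_run_prio);
}
```

       With a filter at priority 10, a forwarder at the default of 50 and a
       counter at 90, the sorted order is filter, forwarder, counter.
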
   Chain call actions
       These are the program return codes that the program indicates for
       packets that should continue processing. If the program returns one of
       these actions, later programs in the call chain will be run, whereas if
       it returns any other action, processing will be interrupted, and the
       XDP dispatcher will return the verdict immediately. If not set, this
       defaults to just XDP_PASS, which is likely the value most programs
       should use.

   Specifying metadata
       The metadata outlined above is specified as BTF information embedded in
       the ELF file containing the XDP program. The xdp_helpers.h file shipped
       with libxdp contains helper macros to include this information, which
       can be used as follows:

              #include <bpf/bpf_helpers.h>
              #include <xdp/xdp_helpers.h>

              struct {
                   __uint(priority, 10);
                   __uint(XDP_PASS, 1);
                   __uint(XDP_DROP, 1);
              } XDP_RUN_CONFIG(my_xdp_func);

       This example specifies that the XDP program in my_xdp_func should have
       priority 10 and that its chain call actions are XDP_PASS and XDP_DROP.
       In a source file with multiple XDP programs, a definition like the
       above can be included for each program (main XDP function). Any program
       that does not specify any config information will use the default
       values outlined above.

   Inspecting and modifying metadata
       libxdp exposes the following functions that an application can use to
       inspect and modify the metadata on an XDP program. Modification is only
       possible before a program is attached on an interface. These functions
       won't modify the BTF information itself, but the new values will be
       stored as part of the program attachment.

              unsigned int xdp_program__run_prio(const struct xdp_program *xdp_prog);
              int xdp_program__set_run_prio(struct xdp_program *xdp_prog,
                                   unsigned int run_prio);
              bool xdp_program__chain_call_enabled(const struct xdp_program *xdp_prog,
                                       enum xdp_action action);
              int xdp_program__set_chain_call_enabled(struct xdp_program *prog,
                                       unsigned int action,
                                       bool enabled);
              int xdp_program__print_chain_call_actions(const struct xdp_program *prog,
                                         char *buf,
                                         size_t buf_len);

The dispatcher program

       To support multiple non-offloaded programs on the same network
       interface, libxdp uses a dispatcher program, which is a small wrapper
       program that will call each component program in turn, inspect the
       return code, and then chain call to the next program based on the chain
       call actions of the previous program (see the Program metadata section
       above).

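       The control flow described above can be modelled in plain user-space C.
       This is only a sketch of the idea, not the dispatcher that libxdp
       actually loads: struct component, run_dispatcher() and the two sample
       programs are hypothetical, the chain call actions are represented as a
       bitmask indexed by return code, and returning a pass verdict when no
       component is present is an assumption made for the example. The action
       values mirror the kernel's enum xdp_action.

```c
#include <stddef.h>

/* Values mirror the kernel's enum xdp_action. */
enum { ACT_ABORTED = 0, ACT_DROP = 1, ACT_PASS = 2, ACT_TX = 3, ACT_REDIRECT = 4 };

typedef int (*prog_fn)(void *pkt);

/* Hypothetical model of one component program; not a libxdp type. */
struct component {
    prog_fn run;
    unsigned int chain_call_mask; /* bit N set: continue after return code N */
};

/* Two trivial sample programs used to exercise the model. */
static int always_pass(void *pkt) { (void)pkt; return ACT_PASS; }
static int always_drop(void *pkt) { (void)pkt; return ACT_DROP; }

/* Run components in order; stop at the first return code that is not one
 * of that component's chain call actions and return it as the verdict. */
static int run_dispatcher(const struct component *progs, size_t n, void *pkt)
{
    int ret = ACT_PASS; /* assumed verdict when there are no components */

    for (size_t i = 0; i < n; i++) {
        ret = progs[i].run(pkt);
        if (ret < 0 || ret >= 32)
            return ret; /* out-of-range code can never be a chain call action */
        if (!(progs[i].chain_call_mask & (1U << ret)))
            return ret;
    }
    return ret; /* verdict of the last program that ran */
}
```

       With the default mask of just ACT_PASS, a dropping program ends the
       chain at once. If ACT_DROP is added to its mask, the next program still
       runs and its verdict is the one returned.
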
       While applications using libxdp do not need to know the details of the
       dispatcher program to just load an XDP program onto an interface,
       libxdp does expose the dispatcher and its attached component programs,
       which can be used to list the programs currently attached to an
       interface.

       The structure used for this is struct xdp_multiprog, which can only be
       constructed from the programs loaded on an interface based on ifindex.
       The API for getting a multiprog reference and iterating through the
       attached programs looks like this:

              struct xdp_multiprog *xdp_multiprog__get_from_ifindex(int ifindex);
              struct xdp_program *xdp_multiprog__next_prog(const struct xdp_program *prog,
                                            const struct xdp_multiprog *mp);
              void xdp_multiprog__close(struct xdp_multiprog *mp);
              int xdp_multiprog__detach(struct xdp_multiprog *mp, int ifindex);
              enum xdp_attach_mode xdp_multiprog__attach_mode(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__main_prog(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__hw_prog(const struct xdp_multiprog *mp);
              bool xdp_multiprog__is_legacy(const struct xdp_multiprog *mp);

       If a non-offloaded program is attached to the interface which libxdp
       doesn't recognise as a dispatcher program, an xdp_multiprog structure
       will still be returned, and xdp_multiprog__is_legacy() will return true
       for that program (note that this also holds true if only an offloaded
       program is loaded). A reference to that (regular) XDP program can be
       obtained by xdp_multiprog__main_prog(). If the program attached to the
       interface is a dispatcher program, xdp_multiprog__main_prog() will
       return a reference to the dispatcher program itself, which is mainly
       useful for obtaining other data about that program (such as the program
       ID). A reference to an offloaded program can be acquired using
       xdp_multiprog__hw_prog(). The xdp_multiprog__attach_mode() function
       returns the attach mode of the non-offloaded program; whether an
       offloaded program is attached should be checked through
       xdp_multiprog__hw_prog().

   Pinning in bpffs
       The kernel will automatically detach component programs from the
       dispatcher once the last reference to them disappears. To prevent this
       from happening, libxdp will pin the component program references in
       bpffs before attaching the dispatcher to the network interface. The
       pathnames generated for pinning are as follows:

       —   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID - dispatcher program for
           IFINDEX with BPF program ID DID

       —   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-prog - component program
           0, program reference

       —   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-link - component program
           0, bpf_link reference

       —   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-prog - component program
           1, program reference

       —   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-link - component program
           1, bpf_link reference

       —   etc, up to ten component programs

       If set, the LIBXDP_BPFFS environment variable will override the
       location of bpffs, but the xdp subdirectory is always used. If no bpffs
       is mounted, libxdp will consult the environment variable
       LIBXDP_BPFFS_AUTOMOUNT. If this is set to 1, libxdp will attempt to
       automount a bpffs. If not, libxdp will fall back to loading a single
       program without a dispatcher, as if the kernel did not support the
       features needed for multiprog attachment.

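       For tools that want to locate these pins, the naming scheme listed
       above can be reproduced with a small helper. The function
       component_pin_path() below is a hypothetical example and is not exported
       by libxdp; it assumes the caller has already decided which bpffs mount
       point to use (the default, or the value of LIBXDP_BPFFS).

```c
#include <stdio.h>
#include <string.h>

/* Format the pin path of one component program reference.
 * bpffs:         bpffs mount point, e.g. "/sys/fs/bpf"
 * ifindex:       interface index the dispatcher is attached to
 * dispatcher_id: BPF program ID of the dispatcher (DID in the list above)
 * idx:           component program number (0, 1, ...)
 * is_link:       non-zero for the bpf_link reference, zero for the program
 * Returns the snprintf() result, i.e. the length the full path needs. */
static int component_pin_path(char *buf, size_t len, const char *bpffs,
                              int ifindex, unsigned int dispatcher_id,
                              unsigned int idx, int is_link)
{
    return snprintf(buf, len, "%s/xdp/dispatch-%d-%u/prog%u-%s",
                    bpffs, ifindex, dispatcher_id, idx,
                    is_link ? "link" : "prog");
}
```

       A return value greater than or equal to len means the path was
       truncated and should not be used.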

Using AF_XDP sockets

       Libxdp implements helper functions for configuring AF_XDP sockets as
       well as reading packets from and writing packets to these sockets.
       AF_XDP sockets can be used to redirect packets to user-space at high
       rates from an XDP program. Note that this functionality used to reside
       in libbpf, but has now been moved over to libxdp as it is a better fit
       for this library. As of the 1.0 release of libbpf, the AF_XDP socket
       support will be removed and all future development will be performed in
       libxdp instead.

       For an overview of AF_XDP sockets, please refer to this Linux Plumbers
       paper
       (http://vger.kernel.org/lpc_net2018_talks/lpc18_pres_af_xdp_perf-v3.pdf)
       and the documentation in the Linux kernel
       (Documentation/networking/af_xdp.rst or
       https://www.kernel.org/doc/html/latest/networking/af_xdp.html).

       For an example on how to use the interface, take a look at the
       AF_XDP-example and AF_XDP-forwarding programs in the bpf-examples
       repository: https://github.com/xdp-project/bpf-examples.

   Control path
       Libxdp provides helper functions for creating and destroying umems and
       sockets as shown below. The first thing that a user generally wants to
       do is to create a umem area. This is the area that will contain all
       packets received and the ones that are going to be sent. After that,
       AF_XDP sockets can be created tied to this umem. These can either be
       sockets that have exclusive ownership of that umem through
       xsk_socket__create() or shared with other sockets using
       xsk_socket__create_shared(). There is one option called
       XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD that can be set in the libxdp_flags
       field (also called libbpf_flags for compatibility reasons). This will
       make libxdp not load any XDP program or set up any BPF maps, which is a
       must if users want to add their own XDP program.

       If there is already a socket created with socket(AF_XDP, SOCK_RAW, 0)
       that is not bound and not tied to any umem, the file descriptor of this
       socket can be used in the xsk_umem__create_with_fd() variant of the umem
       creation function.

              int xsk_umem__create(struct xsk_umem **umem,
                             void *umem_area, __u64 size,
                             struct xsk_ring_prod *fill,
                             struct xsk_ring_cons *comp,
                             const struct xsk_umem_config *config);
              int xsk_umem__create_with_fd(struct xsk_umem **umem,
                                  int fd, void *umem_area, __u64 size,
                                  struct xsk_ring_prod *fill,
                                  struct xsk_ring_cons *comp,
                                  const struct xsk_umem_config *config);
              int xsk_socket__create(struct xsk_socket **xsk,
                               const char *ifname, __u32 queue_id,
                               struct xsk_umem *umem,
                               struct xsk_ring_cons *rx,
                               struct xsk_ring_prod *tx,
                               const struct xsk_socket_config *config);
              int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
                                   const char *ifname,
                                   __u32 queue_id, struct xsk_umem *umem,
                                   struct xsk_ring_cons *rx,
                                   struct xsk_ring_prod *tx,
                                   struct xsk_ring_prod *fill,
                                   struct xsk_ring_cons *comp,
                                   const struct xsk_socket_config *config);
              int xsk_umem__delete(struct xsk_umem *umem);
              void xsk_socket__delete(struct xsk_socket *xsk);

       There are also two helper functions to get the file descriptor of a
       umem or a socket. These are needed when using standard Linux syscalls
       such as poll(), recvmsg(), sendto(), etc.

              int xsk_umem__fd(const struct xsk_umem *umem);
              int xsk_socket__fd(const struct xsk_socket *xsk);

       The control path also provides two APIs for setting up AF_XDP sockets
       when the process that is going to use the AF_XDP socket is
       non-privileged. These two functions perform the operations that require
       privileges and can be executed from some form of control process that
       has the necessary privileges. The xsk_socket__create() call executed in
       the non-privileged process will then skip these two steps. For an
       example on how to use these, please take a look at the AF_XDP-example
       program in the bpf-examples repository:
       https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-example.

              int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
              int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);

       To further reduce the required level of privileges, an AF_XDP socket
       can be created beforehand with socket(AF_XDP, SOCK_RAW, 0) and passed to
       a non-privileged process. This socket can be used in
       xsk_umem__create_with_fd() and later in xsk_socket__create() with the
       created umem. xsk_socket__create_shared() would still require
       privileges for AF_XDP socket creation.

   Data path
       For performance reasons, all the data path functions are static inline
       functions found in the xsk.h header file so they can be optimized into
       the target application binary for best possible performance. There are
       four FIFO rings of two main types: producer rings (fill and Tx) and
       consumer rings (Rx and completion). The producer rings use
       xsk_ring_prod functions and consumer rings use xsk_ring_cons functions.
       For producer rings, you start by reserving one or more slots in the
       ring, and once they have been filled in, you submit them so that the
       kernel will act on them. For a consumer ring, you peek to see whether
       there are any new packets in the ring, and if so you can read them from
       the ring. Once you are done reading them, you release them back to the
       kernel so it can use them for new packets. There is also a cancel
       operation for consumer rings if the application does not want to consume
       all packets received with the peek operation.

              __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx);
              void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb);
              __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx);
              void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb);
              void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb);

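       The reserve/submit and peek/cancel/release protocol can be illustrated
       with a toy ring in plain C. This is a deliberately simplified,
       single-threaded model and not how xsk.h is implemented: struct
       ring_model and the model_*() functions are hypothetical, both ends of
       the ring live in one process, and there are no memory barriers or
       cached indices. It assumes that a reserve either grants all requested
       slots or none, while a peek may return fewer entries than requested.

```c
#include <stdint.h>

/* Toy FIFO ring; indices increase without bound and are masked on use. */
struct ring_model {
    uint32_t producer; /* entries submitted so far */
    uint32_t consumer; /* entries released so far */
    uint32_t reserved; /* reserved by the producer, not yet submitted */
    uint32_t peeked;   /* peeked by the consumer, not yet released */
    uint32_t size;     /* number of slots, must be a power of two */
};

/* Map a running index to a slot number inside the ring. */
static uint32_t model_slot(const struct ring_model *r, uint32_t idx)
{
    return idx & (r->size - 1);
}

/* Producer: reserve nb slots, all or nothing. *idx is the first slot index. */
static uint32_t model_reserve(struct ring_model *r, uint32_t nb, uint32_t *idx)
{
    uint32_t used = r->producer + r->reserved - r->consumer;

    if (r->size - used < nb)
        return 0;
    *idx = r->producer + r->reserved;
    r->reserved += nb;
    return nb;
}

/* Producer: make nb previously reserved slots visible to the consumer. */
static void model_submit(struct ring_model *r, uint32_t nb)
{
    r->reserved -= nb;
    r->producer += nb;
}

/* Consumer: look at up to nb submitted entries without releasing them. */
static uint32_t model_peek(struct ring_model *r, uint32_t nb, uint32_t *idx)
{
    uint32_t avail = r->producer - (r->consumer + r->peeked);

    if (nb > avail)
        nb = avail;
    if (nb) {
        *idx = r->consumer + r->peeked;
        r->peeked += nb;
    }
    return nb;
}

/* Consumer: give back the last nb peeked entries so they can be peeked again. */
static void model_cancel(struct ring_model *r, uint32_t nb)
{
    r->peeked -= nb;
}

/* Consumer: hand nb peeked entries back so the slots can be reused. */
static void model_release(struct ring_model *r, uint32_t nb)
{
    r->peeked -= nb;
    r->consumer += nb;
}
```

       The pattern carries over to the real rings: the caller walks idx,
       idx + 1, ... for the number of entries returned, then calls submit or
       release with that count.
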
       The functions below are used for reading and writing the descriptors of
       the rings. xsk_ring_prod__fill_addr() and xsk_ring_prod__tx_desc()
       write entries in the fill and Tx rings respectively, while
       xsk_ring_cons__comp_addr() and xsk_ring_cons__rx_desc() read entries
       from the completion and Rx rings respectively. The idx is the parameter
       returned by the xsk_ring_prod__reserve() or xsk_ring_cons__peek() calls.
       To advance to the next entry, simply do idx++.

              __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill, __u32 idx);
              struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx, __u32 idx);
              const __u64 *xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp, __u32 idx);
              const struct xdp_desc *xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx);

       The xsk_umem functions are used to get a pointer to the packet data
       itself, always located inside the umem. In the default aligned mode, you
       can get the addr variable straight from the Rx descriptor. But in
       unaligned mode, you need to use the last three functions below, as the
       offset used is carried in the upper 16 bits of the addr. Therefore, you
       cannot use the addr straight from the descriptor in the unaligned case.

              void *xsk_umem__get_data(void *umem_area, __u64 addr);
              __u64 xsk_umem__extract_addr(__u64 addr);
              __u64 xsk_umem__extract_offset(__u64 addr);
              __u64 xsk_umem__add_offset_to_addr(__u64 addr);

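       To make the unaligned-mode encoding concrete, the sketch below splits a
       64-bit descriptor address into its two parts using plain bit
       operations. The 48-bit shift follows from the statement that the offset
       occupies the upper 16 bits; the macro and function names are
       hypothetical and chosen so they do not clash with the xsk_umem helpers
       listed above, which should be preferred in real code.

```c
#include <stdint.h>

/* The offset lives in the upper 16 bits, the base address in the lower 48. */
#define SKETCH_OFFSET_SHIFT 48
#define SKETCH_ADDR_MASK ((1ULL << SKETCH_OFFSET_SHIFT) - 1)

/* Base address of the buffer inside the umem. */
static uint64_t sketch_extract_addr(uint64_t addr)
{
    return addr & SKETCH_ADDR_MASK;
}

/* Offset of the packet data relative to the base address. */
static uint64_t sketch_extract_offset(uint64_t addr)
{
    return addr >> SKETCH_OFFSET_SHIFT;
}

/* Address of the packet data: base plus offset. */
static uint64_t sketch_add_offset_to_addr(uint64_t addr)
{
    return sketch_extract_addr(addr) + sketch_extract_offset(addr);
}
```

       For example, a descriptor address with base 0x2000 and offset 0x100
       resolves to packet data at 0x2100 within the umem.
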
       There is one more function in the data path, which checks whether the
       need_wakeup flag is set. Use of this flag is highly encouraged and
       should be enabled by setting the XDP_USE_NEED_WAKEUP bit in the
       bind_flags field of the configuration that is provided to the
       xsk_socket__create[_shared]() calls. If this function returns true, then
       you need to call recvmsg(), sendto(), or poll() depending on the
       situation: recvmsg() if you are receiving, or sendto() if you are
       sending. poll() can be used for both cases and provides the ability to
       sleep too, as with any other socket. But note that poll() is a slower
       operation than the other two.

              int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r);

       For an example on how to use all these APIs, take a look at the
       AF_XDP-example and AF_XDP-forwarding programs in the bpf-examples
       repository: https://github.com/xdp-project/bpf-examples.

Kernel and BPF program feature compatibility

       The features exposed by libxdp rely on certain kernel versions and BPF
       features to work. To get the full benefit of all features, libxdp needs
       to be used with kernel 5.10 or newer, unless the commits mentioned below
       have been backported. However, libxdp will probe the kernel and
       transparently fall back to legacy loading procedures, so it is possible
       to use the library with older versions, although some features will be
       unavailable, as detailed below.

       The ability to attach multiple BPF programs to a single interface
       relies on the kernel "BPF program extension" feature which was
       introduced by commit be8704ff07d2 ("bpf: Introduce dynamic program
       extensions") in the upstream kernel and first appeared in kernel release
       5.6. To incrementally attach multiple programs, a further refinement
       added by commit 4a1e7c0c63e0 ("bpf: Support attaching freplace programs
       to multiple attach points") is needed; this first appeared in the
       upstream kernel version 5.10. The functionality relies on the "BPF
       trampolines" feature which is unfortunately only available on the x86_64
       architecture. In other words, kernels before 5.6 can only attach a
       single XDP program to each interface, kernels 5.6 and newer can attach
       multiple programs if they are all attached at the same time, and kernels
       5.10 and newer have full support for XDP multiprog on x86_64. On other
       architectures, only a single program can be attached to each interface.

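       The version rules in the previous paragraph can be summarised as a
       small decision function. This is only a restatement of the text for
       upstream kernels: the enum and function names are hypothetical, and the
       sketch ignores distribution backports. libxdp itself probes the kernel
       for the features rather than comparing version numbers.

```c
#include <stdbool.h>

/* Hypothetical summary of what an upstream kernel is expected to support. */
enum multiprog_support {
    SINGLE_PROG_ONLY,        /* one XDP program per interface */
    MULTI_SIMULTANEOUS_ONLY, /* several programs, attached at the same time */
    MULTI_INCREMENTAL        /* programs can also be added one by one */
};

static enum multiprog_support
multiprog_support_level(unsigned int major, unsigned int minor, bool is_x86_64)
{
    /* BPF trampolines are described above as x86_64 only. */
    if (!is_x86_64)
        return SINGLE_PROG_ONLY;
    /* Attaching freplace programs to multiple attach points: 5.10 and newer. */
    if (major > 5 || (major == 5 && minor >= 10))
        return MULTI_INCREMENTAL;
    /* BPF program extensions: 5.6 and newer. */
    if (major == 5 && minor >= 6)
        return MULTI_SIMULTANEOUS_ONLY;
    return SINGLE_PROG_ONLY;
}
```

       On a kernel in the middle category, all programs for an interface have
       to be attached together with xdp_program__attach_multi().
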
       To load AF_XDP programs, kernel support for AF_XDP sockets needs to be
       included and enabled in the kernel build. In addition, when using
       AF_XDP sockets, an XDP program is also loaded on the interface. The XDP
       program used for this by libxdp requires the ability to do map lookups
       into XSK maps, which was introduced with commit fada7fdc83c0 ("bpf:
       Allow bpf_map_lookup_elem() on an xskmap") in kernel 5.3. This means
       that the minimum required kernel version for using AF_XDP is kernel 5.3;
       however, for the AF_XDP XDP program to co-exist with other programs,
       the same constraints for multiprog apply as outlined above.

       Note that some Linux distributions backport features to earlier kernel
       versions, especially in enterprise kernels; for instance, Red Hat
       Enterprise Linux kernels include everything needed for libxdp to
       function since RHEL 8.5.

       Finally, XDP programs loaded using the multiprog facility must include
       type information (using the BPF Type Format, BTF). To get this, compile
       the programs with a recent version of Clang/LLVM (version 10+), and
       enable debug information when compiling (using the -g option).

BUGS

       Please report any bugs on GitHub:
       https://github.com/xdp-project/xdp-tools/issues

AUTHORS

       libxdp and this man page were written by Toke Høiland-Jørgensen. AF_XDP
       support and documentation were contributed by Magnus Karlsson.

v1.4.1                         October 20, 2023                      libxdp(3)