libxdp(3)          libxdp - library for loading XDP programs         libxdp(3)

NAME

       libxdp - library for attaching XDP programs and using AF_XDP sockets

SYNOPSIS

       This directory contains the files for the libxdp library for attaching
       XDP programs to network interfaces and using AF_XDP sockets. The
       library is fairly lightweight and relies on libbpf to do the heavy
       lifting of processing eBPF object files etc.

       Libxdp provides two primary features on top of libbpf. The first is the
       ability to load multiple XDP programs in sequence on a single network
       device (which is not natively supported by the kernel). This support
       relies on the freplace functionality in the kernel, which makes it
       possible to attach an eBPF program as a replacement for a global
       function in another (already loaded) eBPF program. The second main
       feature is a set of helper functions for configuring AF_XDP sockets as
       well as reading and writing packets from these sockets.

       Some of the functionality provided by libxdp depends on particular
       kernel features; see the "Kernel and BPF program feature compatibility"
       section below for details.

   Using libxdp from an application
       Basic usage of libxdp from an application is quite straightforward.
       The following example attaches an XDP program to the 'lo' interface
       and then detaches it again:

              #include <xdp/libxdp.h>

              #define IFINDEX 1 /* 'lo' normally has ifindex 1 */

              struct xdp_program *prog;
              int err;

              prog = xdp_program__open_file("my-program.o", "section_name", NULL);
              err = libxdp_get_error(prog);
              if (err)
                  return err; /* prog holds an error code, not a program */

              err = xdp_program__attach(prog, IFINDEX, XDP_MODE_NATIVE, 0);
              if (!err)
                  xdp_program__detach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

              xdp_program__close(prog);

       The xdp_program structure is an opaque structure that represents a
       single XDP program. libxdp contains functions to create such a struct
       either from a BPF object file on disk, from a libbpf BPF object, or
       from an identifier of a program that is already loaded into the kernel:

              struct xdp_program *xdp_program__from_bpf_obj(struct bpf_object *obj,
                      const char *section_name);
              struct xdp_program *xdp_program__find_file(const char *filename,
                      const char *section_name,
                      struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__open_file(const char *filename,
                      const char *section_name,
                      struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__from_fd(int fd);
              struct xdp_program *xdp_program__from_id(__u32 prog_id);
              struct xdp_program *xdp_program__from_pin(const char *pin_path);

       The functions that open a BPF object or file need the function name of
       the XDP program as well as the file name or object, since an ELF file
       can contain multiple XDP programs. The xdp_program__find_file()
       function takes a filename without a path and looks for the object in
       LIBXDP_OBJECT_PATH, which defaults to /usr/lib/bpf (or /usr/lib64/bpf
       on systems using a split library path). This is convenient for
       applications shipping pre-compiled eBPF object files.

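       The lookup rule for bare filenames can be sketched in plain C. This is
       only an illustration: resolve_object_path() is a hypothetical helper,
       not a libxdp function, and it assumes that LIBXDP_OBJECT_PATH is read
       from the environment with /usr/lib/bpf as the fallback.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libxdp: builds the path that a lookup
 * by bare filename would try. Assumes LIBXDP_OBJECT_PATH comes from the
 * environment and that /usr/lib/bpf is the default directory. */
static int resolve_object_path(const char *filename, char *buf, size_t len)
{
    const char *dir = getenv("LIBXDP_OBJECT_PATH");
    int n;

    if (!dir || !*dir)
        dir = "/usr/lib/bpf";

    n = snprintf(buf, len, "%s/%s", dir, filename);
    /* snprintf reports the untruncated length; treat truncation as failure */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

       With the variable unset, resolve_object_path("my-program.o", ...)
       yields /usr/lib/bpf/my-program.o in this sketch.
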
       The xdp_program__attach() function will attach the program to an
       interface, building a dispatcher program to execute it. Multiple
       programs can be attached at once with xdp_program__attach_multi(); they
       will be sorted in order of their run priority, and execution from one
       program to the next will proceed based on the chain call actions
       defined for each program (see the Program metadata section below).
       Because the loading process involves modifying the attach type of the
       program, the attach functions only work with struct xdp_program objects
       that have not yet been loaded into the kernel.

       When using the attach functions to attach to an interface that already
       has an XDP program loaded, libxdp will attempt to add the program to
       the list of loaded programs. However, this may fail, either due to
       missing kernel support, or because the already-attached program was not
       loaded using a dispatcher compatible with libxdp. If the kernel support
       for incremental attach (merged in kernel 5.10) is missing, the only way
       to actually run multiple programs on a single interface is to attach
       them all at the same time with xdp_program__attach_multi(). If the
       existing program is not an XDP dispatcher, that program will have to be
       detached from the interface before libxdp can attach a new one. This
       can be done by calling xdp_program__detach() with a reference to the
       loaded program, but note that doing so will break any application that
       relies on that other XDP program being present.

Program metadata

       To support multiple XDP programs on the same interface, libxdp uses two
       pieces of metadata for each XDP program: run priority and chain call
       actions.

   Run priority
       This is the priority of the program and is a simple integer used to
       sort programs when loading multiple programs onto the same interface.
       Programs that wish to run early (such as a packet filter) should set
       low values for this, while programs that want to run later (such as a
       packet forwarder or counter) should set higher values. Note that later
       programs are only run if the previous programs end with a return code
       that is part of their chain call actions (see below). If not specified,
       the default priority value is 50.

   Chain call actions
       These are the return codes with which the program indicates that a
       packet should continue processing. If the program returns one of these
       actions, later programs in the call chain will be run, whereas if it
       returns any other action, processing will be interrupted, and the XDP
       dispatcher will return the verdict immediately. If not set, this
       defaults to just XDP_PASS, which is likely the value most programs
       should use.

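       The interplay of the two pieces of metadata can be modelled in plain C.
       The sketch below is not the libxdp dispatcher: struct model_prog,
       model_dispatch() and the ACT_* names are made up for illustration, and
       the chain call actions are assumed to be stored as a bitmask. It also
       assumes that the verdict is XDP_PASS when every program chains through.

```c
#include <stdlib.h>

/* Action codes with the same numeric values as the kernel's enum xdp_action. */
enum { ACT_ABORTED = 0, ACT_DROP = 1, ACT_PASS = 2, ACT_TX = 3, ACT_REDIRECT = 4 };

/* Illustrative stand-in for one component program (not a libxdp type). */
struct model_prog {
    unsigned int run_prio;        /* lower values run earlier; default 50 */
    unsigned int chain_call_mask; /* bit N set: continue after action N */
    int (*run)(void);             /* stand-in for the XDP program itself */
};

static int cmp_prio(const void *a, const void *b)
{
    const struct model_prog *pa = a, *pb = b;

    return (pa->run_prio > pb->run_prio) - (pa->run_prio < pb->run_prio);
}

/* Sort by run priority, then run each program in turn. The first return
 * code that is not one of that program's chain call actions is the final
 * verdict. Assumption: if all programs chain through, return ACT_PASS.
 * Note that qsort() is not stable, so equal priorities have no fixed order. */
static int model_dispatch(struct model_prog *progs, size_t n)
{
    size_t i;

    qsort(progs, n, sizeof(*progs), cmp_prio);
    for (i = 0; i < n; i++) {
        int ret = progs[i].run();

        if (!(progs[i].chain_call_mask & (1U << ret)))
            return ret; /* later programs are skipped */
    }
    return ACT_PASS;
}

static int always_pass(void) { return ACT_PASS; }
static int always_drop(void) { return ACT_DROP; }
static int always_tx(void) { return ACT_TX; }
```

       In this model, a priority-10 filter that returns XDP_DROP stops the
       chain unless XDP_DROP is among its chain call actions, in which case
       the priority-50 program still runs.
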
   Specifying metadata
       The metadata outlined above is specified as BTF information embedded in
       the ELF file containing the XDP program. The xdp_helpers.h file shipped
       with libxdp contains helper macros to include this information, which
       can be used as follows:

              #include <bpf/bpf_helpers.h>
              #include <xdp/xdp_helpers.h>

              struct {
                   __uint(priority, 10); /* run before default (50) programs */
                   __uint(XDP_PASS, 1);  /* chain call on XDP_PASS */
                   __uint(XDP_DROP, 1);  /* chain call on XDP_DROP */
              } XDP_RUN_CONFIG(my_xdp_func);

       This example specifies that the XDP program in my_xdp_func should have
       priority 10 and that its chain call actions are XDP_PASS and XDP_DROP.
       In a source file containing multiple XDP programs, a definition like
       the one above can be included for each program (main XDP function).
       Any program that does not specify any config information will use the
       default values outlined above.

   Inspecting and modifying metadata
       libxdp exposes the following functions that an application can use to
       inspect and modify the metadata on an XDP program. Modification is only
       possible before a program is attached on an interface. These functions
       won't modify the BTF information itself, but the new values will be
       stored as part of the program attachment.

              unsigned int xdp_program__run_prio(const struct xdp_program *xdp_prog);
              int xdp_program__set_run_prio(struct xdp_program *xdp_prog,
                      unsigned int run_prio);
              bool xdp_program__chain_call_enabled(const struct xdp_program *xdp_prog,
                      enum xdp_action action);
              int xdp_program__set_chain_call_enabled(struct xdp_program *prog,
                      unsigned int action,
                      bool enabled);
              int xdp_program__print_chain_call_actions(const struct xdp_program *prog,
                      char *buf,
                      size_t buf_len);

The dispatcher program

       To support multiple non-offloaded programs on the same network
       interface, libxdp uses a dispatcher program, which is a small wrapper
       program that will call each component program in turn, inspect the
       return code, and then chain call to the next program based on the chain
       call actions of the previous program (see the Program metadata section
       above).

       While applications using libxdp do not need to know the details of the
       dispatcher program to just load an XDP program onto an interface,
       libxdp does expose the dispatcher and its attached component programs,
       which can be used to list the programs currently attached to an
       interface.

       The structure used for this is struct xdp_multiprog, which can only be
       constructed from the programs loaded on an interface based on ifindex.
       The API for getting a multiprog reference and iterating through the
       attached programs looks like this:

              struct xdp_multiprog *xdp_multiprog__get_from_ifindex(int ifindex);
              struct xdp_program *xdp_multiprog__next_prog(const struct xdp_program *prog,
                      const struct xdp_multiprog *mp);
              void xdp_multiprog__close(struct xdp_multiprog *mp);
              int xdp_multiprog__detach(struct xdp_multiprog *mp, int ifindex);
              enum xdp_attach_mode xdp_multiprog__attach_mode(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__main_prog(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__hw_prog(const struct xdp_multiprog *mp);
              bool xdp_multiprog__is_legacy(const struct xdp_multiprog *mp);

       If a non-offloaded program is attached to the interface which libxdp
       doesn't recognise as a dispatcher program, an xdp_multiprog structure
       will still be returned, and xdp_multiprog__is_legacy() will return true
       for that program (note that this also holds true if only an offloaded
       program is loaded). A reference to that (regular) XDP program can be
       obtained with xdp_multiprog__main_prog(). If the program attached to
       the interface is a dispatcher program, xdp_multiprog__main_prog() will
       return a reference to the dispatcher program itself, which is mainly
       useful for obtaining other data about that program (such as the program
       ID). A reference to an offloaded program can be acquired using
       xdp_multiprog__hw_prog(). The xdp_multiprog__attach_mode() function
       returns the attach mode of the non-offloaded program; whether an
       offloaded program is attached should be checked through
       xdp_multiprog__hw_prog().

   Pinning in bpffs
       The kernel will automatically detach component programs from the
       dispatcher once the last reference to them disappears. To prevent this
       from happening, libxdp will pin the component program references in
       bpffs before attaching the dispatcher to the network interface. The
       pathnames generated for pinning are as follows:

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID - dispatcher program for
           IFINDEX with BPF program ID DID

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-prog - component program
           0, program reference

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-link - component program
           0, bpf_link reference

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-prog - component program
           1, program reference

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-link - component program
           1, bpf_link reference

       •   etc., up to ten component programs

       If set, the LIBXDP_BPFFS environment variable will override the
       location of bpffs, but the xdp subdirectory is always used.

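       The naming scheme above can be reproduced with a few lines of C, for
       example to locate a pinned reference from a debugging tool. This is a
       sketch only: model_pin_path() is a hypothetical helper and not part of
       the libxdp API, and the limit of ten component programs is taken from
       the list above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libxdp: formats the pin path of one
 * component program reference according to the scheme listed above.
 * is_link selects the bpf_link reference instead of the program reference. */
static int model_pin_path(char *buf, size_t len, int ifindex,
                          unsigned int dispatcher_id, int prog_idx, int is_link)
{
    const char *bpffs = getenv("LIBXDP_BPFFS");
    int n;

    if (prog_idx < 0 || prog_idx >= 10) /* up to ten component programs */
        return -1;
    if (!bpffs || !*bpffs)
        bpffs = "/sys/fs/bpf";

    /* The xdp subdirectory is used regardless of the bpffs location. */
    n = snprintf(buf, len, "%s/xdp/dispatch-%d-%u/prog%d-%s", bpffs, ifindex,
                 dispatcher_id, prog_idx, is_link ? "link" : "prog");
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```
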

Using AF_XDP sockets

       Libxdp implements helper functions for configuring AF_XDP sockets as
       well as reading and writing packets from these sockets. AF_XDP sockets
       can be used to redirect packets to user-space at high rates from an XDP
       program. Note that this functionality used to reside in libbpf, but has
       now been moved over to libxdp as it is a better fit for this library.
       As of the 1.0 release of libbpf, the AF_XDP socket support will be
       removed and all future development will be performed in libxdp instead.

       For an overview of AF_XDP sockets, please refer to this Linux Plumbers
       paper
       (http://vger.kernel.org/lpc_net2018_talks/lpc18_pres_af_xdp_perf-v3.pdf)
       and the documentation in the Linux kernel
       (Documentation/networking/af_xdp.rst or
       https://www.kernel.org/doc/html/latest/networking/af_xdp.html).

       For an example of how to use the interface, take a look at the sample
       application in the Linux kernel source tree at
       samples/bpf/xdpsock_user.c.

   Control path
       Libxdp provides helper functions for creating and destroying umems and
       sockets as shown below. The first thing that a user generally wants to
       do is to create a umem area. This is the area that will contain all
       packets received and the ones that are going to be sent. After that,
       AF_XDP sockets can be created tied to this umem. These can either be
       sockets that have exclusive ownership of that umem, created through
       xsk_socket__create(), or sockets that share it with other sockets,
       created using xsk_socket__create_shared(). There is one option called
       XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD that can be set in the libxdp_flags
       field (also called libbpf_flags for compatibility reasons). This will
       make libxdp not load any XDP program or set up any BPF maps, which is a
       must if users want to add their own XDP program.

              int xsk_umem__create(struct xsk_umem **umem,
                      void *umem_area, __u64 size,
                      struct xsk_ring_prod *fill,
                      struct xsk_ring_cons *comp,
                      const struct xsk_umem_config *config);
              int xsk_socket__create(struct xsk_socket **xsk,
                      const char *ifname, __u32 queue_id,
                      struct xsk_umem *umem,
                      struct xsk_ring_cons *rx,
                      struct xsk_ring_prod *tx,
                      const struct xsk_socket_config *config);
              int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
                      const char *ifname,
                      __u32 queue_id, struct xsk_umem *umem,
                      struct xsk_ring_cons *rx,
                      struct xsk_ring_prod *tx,
                      struct xsk_ring_prod *fill,
                      struct xsk_ring_cons *comp,
                      const struct xsk_socket_config *config);
              int xsk_umem__delete(struct xsk_umem *umem);
              void xsk_socket__delete(struct xsk_socket *xsk);

       There are also two helper functions to get the file descriptor of a
       umem or a socket. These are needed when using standard Linux syscalls
       such as poll(), recvmsg(), sendto(), etc.

              int xsk_umem__fd(const struct xsk_umem *umem);
              int xsk_socket__fd(const struct xsk_socket *xsk);

       The control path also provides two APIs for setting up AF_XDP sockets
       when the process that is going to use the AF_XDP socket is
       non-privileged. These two functions perform the operations that require
       privileges and can be executed from some form of control process that
       has the necessary privileges. The xsk_socket__create() call executed in
       the non-privileged process will then skip these two steps. For an
       example of how to use these, please take a look at
       samples/bpf/xdpsock_user.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xdpsock_user.c)
       and samples/bpf/xdpsock_ctrl_proc.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xdpsock_ctrl_proc.c)
       in the Linux kernel source tree.

              int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
              int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);

   Data path
       For performance reasons, all the data path functions are static inline
       functions found in the xsk.h header file so they can be optimized into
       the target application binary for best possible performance. There are
       four FIFO rings of two main types: producer rings (fill and Tx) and
       consumer rings (Rx and completion). The producer rings use
       xsk_ring_prod functions and consumer rings use xsk_ring_cons functions.
       For a producer ring, you start by reserving one or more slots in the
       ring, and once they have been filled in, you submit them so that the
       kernel will act on them. For a consumer ring, you peek to see whether
       there are any new packets in the ring and, if so, you can read them
       from the ring. Once you are done reading them, you release them back to
       the kernel so it can use them for new packets. There is also a cancel
       operation for consumer rings if the application does not want to
       consume all packets received with the peek operation.

              __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx);
              void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb);
              __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx);
              void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb);
              void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb);

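       The reserve/submit and peek/cancel/release protocol can be illustrated
       with a single-threaded toy ring. This is a simplified model, not the
       code in xsk.h: struct model_ring and the model_*() functions are
       invented for this sketch, both ends live in one process, and the memory
       barriers a real shared ring needs are left out. The all-or-nothing
       behaviour of model_reserve() is an assumption of the model.

```c
#include <stdint.h>

#define RING_SIZE 8 /* must be a power of two */

/* Toy ring holding both the producer and the consumer state. */
struct model_ring {
    uint32_t producer;    /* made visible by the producer via submit */
    uint32_t consumer;    /* made visible by the consumer via release */
    uint32_t cached_prod; /* producer's private cursor, moved by reserve */
    uint32_t cached_cons; /* consumer's private cursor, moved by peek/cancel */
    uint64_t slots[RING_SIZE];
};

/* Reserve nb slots; returns nb on success or 0 if there is not enough room. */
static uint32_t model_reserve(struct model_ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t free_slots = RING_SIZE - (r->cached_prod - r->consumer);

    if (free_slots < nb)
        return 0;
    *idx = r->cached_prod;
    r->cached_prod += nb;
    return nb;
}

/* Publish nb previously reserved and filled slots to the consumer. */
static void model_submit(struct model_ring *r, uint32_t nb)
{
    r->producer += nb;
}

/* Look at up to nb published entries; returns how many are available. */
static uint32_t model_peek(struct model_ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t avail = r->producer - r->cached_cons;
    uint32_t n = avail < nb ? avail : nb;

    if (n > 0) {
        *idx = r->cached_cons;
        r->cached_cons += n;
    }
    return n;
}

/* Hand back nb peeked entries without consuming them. */
static void model_cancel(struct model_ring *r, uint32_t nb)
{
    r->cached_cons -= nb;
}

/* Return nb consumed entries so the producer can reuse the slots. */
static void model_release(struct model_ring *r, uint32_t nb)
{
    r->consumer += nb;
}

/* Map a running index onto a slot; advancing is simply idx++. */
static uint64_t *model_slot(struct model_ring *r, uint32_t idx)
{
    return &r->slots[idx & (RING_SIZE - 1)];
}
```

       In the model, entries written after model_reserve() stay invisible to
       model_peek() until model_submit() is called, which mirrors the order of
       operations described above.
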
       The functions below are used for reading and writing the descriptors of
       the rings. xsk_ring_prod__fill_addr() and xsk_ring_prod__tx_desc()
       write entries in the fill and Tx rings respectively, while
       xsk_ring_cons__comp_addr() and xsk_ring_cons__rx_desc() read entries
       from the completion and Rx rings respectively. The idx is the parameter
       returned in the xsk_ring_prod__reserve() or xsk_ring_cons__peek()
       calls. To advance to the next entry, simply do idx++.

              __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill, __u32 idx);
              struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx, __u32 idx);
              const __u64 *xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp, __u32 idx);
              const struct xdp_desc *xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx);

       The xsk_umem functions are used to get a pointer to the packet data
       itself, always located inside the umem. In the default aligned mode,
       you can get the addr variable straight from the Rx descriptor. But in
       unaligned mode, you need to use the last three functions below, as the
       offset used is carried in the upper 16 bits of the addr. Therefore, you
       cannot use the addr straight from the descriptor in the unaligned case.

              void *xsk_umem__get_data(void *umem_area, __u64 addr);
              __u64 xsk_umem__extract_addr(__u64 addr);
              __u64 xsk_umem__extract_offset(__u64 addr);
              __u64 xsk_umem__add_offset_to_addr(__u64 addr);

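       To make the unaligned-mode address layout concrete, the sketch below
       splits a descriptor address the way the text describes: offset in the
       upper 16 bits, base address in the remaining lower 48 bits. The
       model_*() names and the MODEL_* constants are made up for illustration;
       in an application, use the xsk_umem functions listed above.

```c
#include <stdint.h>

/* Assumed layout, taken from the description above: the upper 16 bits of
 * the 64-bit address carry an offset, the lower 48 bits the base address. */
#define MODEL_OFFSET_SHIFT 48
#define MODEL_ADDR_MASK ((1ULL << MODEL_OFFSET_SHIFT) - 1)

/* Base address within the umem, with the offset bits cleared. */
static uint64_t model_extract_addr(uint64_t addr)
{
    return addr & MODEL_ADDR_MASK;
}

/* Offset carried in the upper 16 bits. */
static uint64_t model_extract_offset(uint64_t addr)
{
    return addr >> MODEL_OFFSET_SHIFT;
}

/* Plain umem-relative address of the packet data: base plus offset. */
static uint64_t model_add_offset_to_addr(uint64_t addr)
{
    return model_extract_addr(addr) + model_extract_offset(addr);
}

/* Pointer to the packet data for a plain umem-relative address. */
static void *model_get_data(void *umem_area, uint64_t addr)
{
    return (char *)umem_area + addr;
}
```

       Passing the raw descriptor address to model_get_data() in the unaligned
       case would add the shifted offset bits to the pointer, which is why the
       address has to be converted first.
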
       There is one more function in the data path, which checks whether the
       need_wakeup flag is set. Use of this flag is highly encouraged; it is
       enabled by setting the XDP_USE_NEED_WAKEUP bit in the bind_flags field
       of the config that is provided to the xsk_socket__create[_shared]()
       calls. If this function returns true, then you need to call recvmsg(),
       sendto(), or poll() depending on the situation: recvmsg() if you are
       receiving, or sendto() if you are sending. poll() can be used for both
       cases and provides the ability to sleep too, as with any other socket.
       But note that poll() is a slower operation than the other two.

              int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r);

       For an example of how to use all these APIs, take a look at the sample
       applications in the Linux kernel source tree at
       samples/bpf/xdpsock_user.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xdpsock_user.c)
       and samples/bpf/xsk_fwd.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xsk_fwd.c).

Kernel and BPF program feature compatibility

       The features exposed by libxdp rely on certain kernel versions and BPF
       features to work. To get the full benefit of all features, libxdp needs
       to be used with kernel 5.10 or newer, unless the commits mentioned
       below have been backported. However, libxdp will probe the kernel and
       transparently fall back to legacy loading procedures, so it is possible
       to use the library with older versions, although some features will be
       unavailable, as detailed below.

       The ability to attach multiple BPF programs to a single interface
       relies on the kernel "BPF program extension" feature which was
       introduced by commit be8704ff07d2 ("bpf: Introduce dynamic program
       extensions") in the upstream kernel and first appeared in kernel
       release 5.6. To incrementally attach multiple programs, a further
       refinement added by commit 4a1e7c0c63e0 ("bpf: Support attaching
       freplace programs to multiple attach points") is needed; this first
       appeared in the upstream kernel version 5.10. The functionality relies
       on the "BPF trampolines" feature which is unfortunately only available
       on the x86_64 architecture. In other words, kernels before 5.6 can only
       attach a single XDP program to each interface, kernels 5.6+ can attach
       multiple programs if they are all attached at the same time, and
       kernels 5.10+ have full support for XDP multiprog on x86_64. On other
       architectures, only a single program can be attached to each interface.

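       The version tiers above can be summarised as a small decision function.
       This is purely illustrative: model_multiprog_support() and its enum are
       invented names, and comparing version numbers is not how libxdp decides
       (it probes the kernel, and backports can make older kernels capable).

```c
/* Illustrative capability tiers; these names are not part of libxdp. */
enum model_multiprog_support {
    SINGLE_PROG_ONLY,     /* one XDP program per interface */
    MULTI_ATTACH_AT_ONCE, /* multiple programs, attached together */
    MULTI_INCREMENTAL     /* programs can be added to a running dispatcher */
};

/* Maps an upstream kernel version to the tier described in the text.
 * Assumption: no backports, and trampolines only on x86_64. */
static enum model_multiprog_support
model_multiprog_support(unsigned int major, unsigned int minor, int is_x86_64)
{
    if (!is_x86_64)
        return SINGLE_PROG_ONLY;
    if (major > 5 || (major == 5 && minor >= 10))
        return MULTI_INCREMENTAL;
    if (major == 5 && minor >= 6)
        return MULTI_ATTACH_AT_ONCE;
    return SINGLE_PROG_ONLY;
}
```
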
       To use AF_XDP, kernel support for AF_XDP sockets needs to be included
       and enabled in the kernel build. In addition, when using AF_XDP
       sockets, an XDP program is also loaded on the interface. The XDP
       program used for this by libxdp requires the ability to do map lookups
       into XSK maps, which was introduced with commit fada7fdc83c0 ("bpf:
       Allow bpf_map_lookup_elem() on an xskmap") in kernel 5.3. This means
       that the minimum required kernel version for using AF_XDP is kernel
       5.3; however, for the AF_XDP XDP program to co-exist with other
       programs, the same constraints for multiprog apply as outlined above.

       Note that some Linux distributions backport features to earlier kernel
       versions, especially in enterprise kernels; for instance, Red Hat
       Enterprise Linux kernels include everything needed for libxdp to
       function since RHEL 8.5.

       Finally, XDP programs loaded using the multiprog facility must include
       type information (using the BPF Type Format, BTF). To get this, compile
       the programs with a recent version of Clang/LLVM (version 10+), and
       enable debug information when compiling (using the -g option).

BUGS

       Please report any bugs on GitHub:
       https://github.com/xdp-project/xdp-tools/issues

AUTHORS

       libxdp and this man page were written by Toke Høiland-Jørgensen. AF_XDP
       support and documentation were contributed by Magnus Karlsson.

v1.2.6                          August 16, 2022                      libxdp(3)