libxdp(3)          libxdp - library for loading XDP programs         libxdp(3)

libxdp - library for attaching XDP programs and using AF_XDP sockets

       This directory contains the files for the libxdp library for attaching
       XDP programs to network interfaces and using AF_XDP sockets. The
       library is fairly lightweight and relies on libbpf to do the heavy
       lifting of processing eBPF object files.

       Libxdp provides two primary features on top of libbpf. The first is the
       ability to load multiple XDP programs in sequence on a single network
       device (which is not natively supported by the kernel). This support
       relies on the freplace functionality in the kernel, which makes it
       possible to attach an eBPF program as a replacement for a global
       function in another (already loaded) eBPF program. The second main
       feature is helper functions for configuring AF_XDP sockets as well as
       reading packets from and writing packets to these sockets.

       Some of the functionality provided by libxdp depends on particular
       kernel features; see the "Kernel and BPF program feature compatibility"
       section below for details.

   Using libxdp from an application
       Basic usage of libxdp from an application is quite straightforward.
       The following example loads, then unloads, an XDP program from the 'lo'
       interface:

              #define IFINDEX 1 /* 'lo' normally has ifindex 1 */

              struct xdp_program *prog;
              int err;

              prog = xdp_program__open_file("my-program.o", "section_name", NULL);
              err = libxdp_get_error(prog); /* 0 if prog is a valid pointer */

              if (!err) {
                  err = xdp_program__attach(prog, IFINDEX, XDP_MODE_NATIVE, 0);
                  if (!err)
                      xdp_program__detach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

                  xdp_program__close(prog);
              }

       The xdp_program structure is an opaque structure that represents a
       single XDP program. libxdp contains functions to create such a struct
       either from a BPF object file on disk, from a libbpf BPF object, or from
       an identifier of a program that is already loaded into the kernel:

              struct xdp_program *xdp_program__from_bpf_obj(struct bpf_object *obj,
                                                            const char *section_name);
              struct xdp_program *xdp_program__find_file(const char *filename,
                                                         const char *section_name,
                                                         struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__open_file(const char *filename,
                                                         const char *section_name,
                                                         struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__from_fd(int fd);
              struct xdp_program *xdp_program__from_id(__u32 prog_id);
              struct xdp_program *xdp_program__from_pin(const char *pin_path);

       The functions that open a BPF object or file need the function name of
       the XDP program as well as the file name or object, since an ELF file
       can contain multiple XDP programs. The xdp_program__find_file()
       function takes a filename without a path, and will look for the object
       in LIBXDP_OBJECT_PATH, which defaults to /usr/lib/bpf (or
       /usr/lib64/bpf on systems using a split library path). This is
       convenient for applications shipping pre-compiled eBPF object files.

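       The lookup rule described above can be approximated in plain C. This
       is only a sketch of the idea, not libxdp's implementation: the helper
       name resolve_object_path() is hypothetical, and the sketch assumes that
       LIBXDP_OBJECT_PATH is read from the environment, with /usr/lib/bpf as
       the fallback directory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_OBJECT_PATH "/usr/lib/bpf" /* default named in the text */

/* Hypothetical helper: write "<dir>/<filename>" into buf.
 * Returns 0 on success, -1 if filename contains a path component
 * or the result does not fit into buf. */
static int resolve_object_path(const char *filename, char *buf, size_t len)
{
    /* Assumption: the override is taken from the environment. */
    const char *dir = getenv("LIBXDP_OBJECT_PATH");
    int n;

    assert(filename && buf);
    if (strchr(filename, '/'))  /* only bare file names are accepted */
        return -1;
    if (!dir || !*dir)
        dir = DEFAULT_OBJECT_PATH;

    n = snprintf(buf, len, "%s/%s", dir, filename);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

       An application would pass only the bare object name, for example
       resolve_object_path("my-program.o", buf, sizeof(buf)), and leave the
       choice of directory to the packager or administrator.
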
       The xdp_program__attach() function will attach the program to an
       interface, building a dispatcher program to execute it. Multiple
       programs can be attached at once with xdp_program__attach_multi(); they
       will be sorted in order of their run priority, and execution from one
       program to the next will proceed based on the chain call actions
       defined for each program (see the Program metadata section below).
       Because the loading process involves modifying the attach type of the
       program, the attach functions only work with struct xdp_program objects
       that have not yet been loaded into the kernel.

       When using the attach functions to attach to an interface that already
       has an XDP program loaded, libxdp will attempt to add the program to
       the list of loaded programs. However, this may fail, either due to
       missing kernel support, or because the already-attached program was not
       loaded using a dispatcher compatible with libxdp. If the kernel support
       for incremental attach (merged in kernel 5.10) is missing, the only way
       to actually run multiple programs on a single interface is to attach
       them all at the same time with xdp_program__attach_multi(). If the
       existing program is not an XDP dispatcher, that program will have to be
       detached from the interface before libxdp can attach a new one. This
       can be done by calling xdp_program__detach() with a reference to the
       loaded program; but note that this will break any application that
       relies on that other XDP program being present.

Program metadata

       To support multiple XDP programs on the same interface, libxdp uses two
       pieces of metadata for each XDP program: run priority and chain call
       actions.

   Run priority
       This is the priority of the program and is a simple integer used to
       sort programs when loading multiple programs onto the same interface.
       Programs that wish to run early (such as a packet filter) should set
       low values for this, while programs that want to run later (such as a
       packet forwarder or counter) should set higher values. Note that later
       programs are only run if each previous program ends with a return code
       that is part of its chain call actions (see below). If not specified,
       the default priority value is 50.

   Chain call actions
       These are the program return codes that the program indicates for
       packets that should continue processing. If the program returns one of
       these actions, later programs in the call chain will be run, whereas if
       it returns any other action, processing will be interrupted, and the
       XDP dispatcher will return the verdict immediately. If not set, this
       defaults to just XDP_PASS, which is likely the value most programs
       should use.

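       How the two pieces of metadata interact can be modelled in ordinary
       user-space C. The following is a toy model, not the dispatcher that
       libxdp generates: the types, the function names and the function
       pointers standing in for XDP programs are all hypothetical. The verdict
       returned when every program chains to the end (ACT_PASS here) is an
       assumption of the sketch, since the text above does not specify it.

```c
#include <assert.h>
#include <stdlib.h>

/* Mirrors the order of enum xdp_action; redefined here so that the
 * sketch builds without kernel headers. */
enum action { ACT_ABORTED, ACT_DROP, ACT_PASS, ACT_TX, ACT_REDIRECT };

struct component {
    unsigned int run_prio;      /* lower values run first; default is 50 */
    unsigned int chain_actions; /* bit N set: continue after action N */
    enum action (*run)(void);   /* stand-in for the XDP program */
};

/* Two trivial stand-in programs. */
static enum action always_pass(void) { return ACT_PASS; }
static enum action always_drop(void) { return ACT_DROP; }

static int by_prio(const void *a, const void *b)
{
    const struct component *x = a, *y = b;

    return (x->run_prio > y->run_prio) - (x->run_prio < y->run_prio);
}

/* Sort by run priority, then run the programs in order. Stop at the
 * first return code that is not one of that program's chain call
 * actions and return it as the verdict. */
static enum action dispatch(struct component *progs, size_t n)
{
    size_t i;

    assert(progs || n == 0);
    if (n == 0)
        return ACT_PASS; /* assumption: nothing attached means pass */

    qsort(progs, n, sizeof(*progs), by_prio);
    for (i = 0; i < n; i++) {
        enum action ret = progs[i].run();

        if (!(progs[i].chain_actions & (1U << ret)))
            return ret;
    }
    return ACT_PASS; /* assumption: every program chained onwards */
}
```

       With a dropping filter at priority 10 and a passing program at
       priority 50, both using the default chain call actions (only the
       ACT_PASS bit set), the filter runs first and its drop verdict is
       returned without the second program being called.
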
   Specifying metadata
       The metadata outlined above is specified as BTF information embedded in
       the ELF file containing the XDP program. The xdp_helpers.h file shipped
       with libxdp contains helper macros to include this information, which
       can be used as follows:

              #include <bpf/bpf_helpers.h>
              #include <xdp/xdp_helpers.h>

              struct {
                   __uint(priority, 10);
                   __uint(XDP_PASS, 1);
                   __uint(XDP_DROP, 1);
              } XDP_RUN_CONFIG(my_xdp_func);

       This example specifies that the XDP program in my_xdp_func should have
       priority 10 and that its chain call actions are XDP_PASS and XDP_DROP.
       In a source file with multiple XDP programs, a definition like the
       above can be included for each program (main XDP function). Any program
       that does not specify any config information will use the default
       values outlined above.

   Inspecting and modifying metadata
       libxdp exposes the following functions that an application can use to
       inspect and modify the metadata on an XDP program. Modification is only
       possible before a program is attached on an interface. These functions
       won't modify the BTF information itself, but the new values will be
       stored as part of the program attachment.

              unsigned int xdp_program__run_prio(const struct xdp_program *xdp_prog);
              int xdp_program__set_run_prio(struct xdp_program *xdp_prog,
                                            unsigned int run_prio);
              bool xdp_program__chain_call_enabled(const struct xdp_program *xdp_prog,
                                                   enum xdp_action action);
              int xdp_program__set_chain_call_enabled(struct xdp_program *prog,
                                                      unsigned int action,
                                                      bool enabled);
              int xdp_program__print_chain_call_actions(const struct xdp_program *prog,
                                                        char *buf,
                                                        size_t buf_len);

The dispatcher program

       To support multiple non-offloaded programs on the same network
       interface, libxdp uses a dispatcher program which is a small wrapper
       program that will call each component program in turn, inspect the
       return code, and then chain call to the next program based on the chain
       call actions of the previous program (see the Program metadata section
       above).

       While applications using libxdp do not need to know the details of the
       dispatcher program to just load an XDP program onto an interface,
       libxdp does expose the dispatcher and its attached component programs,
       which can be used to list the programs currently attached to an
       interface.

       The structure used for this is struct xdp_multiprog, which can only be
       constructed from the programs loaded on an interface based on ifindex.
       The API for getting a multiprog reference and iterating through the
       attached programs looks like this:

              struct xdp_multiprog *xdp_multiprog__get_from_ifindex(int ifindex);
              struct xdp_program *xdp_multiprog__next_prog(const struct xdp_program *prog,
                                                           const struct xdp_multiprog *mp);
              void xdp_multiprog__close(struct xdp_multiprog *mp);
              int xdp_multiprog__detach(struct xdp_multiprog *mp, int ifindex);
              enum xdp_attach_mode xdp_multiprog__attach_mode(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__main_prog(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__hw_prog(const struct xdp_multiprog *mp);
              bool xdp_multiprog__is_legacy(const struct xdp_multiprog *mp);

       If a non-offloaded program is attached to the interface which libxdp
       doesn't recognise as a dispatcher program, an xdp_multiprog structure
       will still be returned, and xdp_multiprog__is_legacy() will return true
       for that program (note that this also holds true if only an offloaded
       program is loaded). A reference to that (regular) XDP program can be
       obtained with xdp_multiprog__main_prog(). If the program attached to
       the interface is a dispatcher program, xdp_multiprog__main_prog() will
       return a reference to the dispatcher program itself, which is mainly
       useful for obtaining other data about that program (such as the program
       ID). A reference to an offloaded program can be acquired using
       xdp_multiprog__hw_prog(). The xdp_multiprog__attach_mode() function
       returns the attach mode of the non-offloaded program; whether an
       offloaded program is attached should be checked through
       xdp_multiprog__hw_prog().

   Pinning in bpffs
       The kernel will automatically detach component programs from the
       dispatcher once the last reference to them disappears. To prevent this
       from happening, libxdp will pin the component program references in
       bpffs before attaching the dispatcher to the network interface. The
       pathnames generated for pinning are as follows:

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID - dispatcher program for
           IFINDEX with BPF program ID DID

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-prog - component program
           0, program reference

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-link - component program
           0, bpf_link reference

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-prog - component program
           1, program reference

       •   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-link - component program
           1, bpf_link reference

       •   etc, up to ten component programs

       If set, the LIBXDP_BPFFS environment variable will override the
       location of bpffs, but the xdp subdirectory is always used.

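       The naming scheme listed above is regular enough to be reproduced with
       a single format string. The following sketch does so in plain C; the
       helper name component_pin_path() and its parameters are hypothetical
       and are not part of the libxdp API. It can be useful for a tool that
       wants to inspect the pinned references.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COMPONENTS 10 /* "up to ten component programs" */

/* Hypothetical helper: format the pin path of one component reference.
 * kind must be "prog" (program reference) or "link" (bpf_link reference).
 * Returns 0 on success, -1 on invalid input or truncation. */
static int component_pin_path(char *buf, size_t len, int ifindex,
                              unsigned int dispatcher_id, unsigned int idx,
                              const char *kind)
{
    const char *bpffs = getenv("LIBXDP_BPFFS"); /* overrides the bpffs location */
    int n;

    assert(buf && kind);
    if (idx >= MAX_COMPONENTS)
        return -1;
    if (strcmp(kind, "prog") != 0 && strcmp(kind, "link") != 0)
        return -1;
    if (!bpffs || !*bpffs)
        bpffs = "/sys/fs/bpf";

    /* The "xdp" subdirectory is used whatever the bpffs location is. */
    n = snprintf(buf, len, "%s/xdp/dispatch-%d-%u/prog%u-%s",
                 bpffs, ifindex, dispatcher_id, idx, kind);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

       For ifindex 1, dispatcher ID 42 and component 0, the "link" variant
       ends in xdp/dispatch-1-42/prog0-link below the bpffs mount point.
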

Using AF_XDP sockets

       Libxdp implements helper functions for configuring AF_XDP sockets as
       well as reading and writing packets from these sockets. AF_XDP sockets
       can be used to redirect packets to user-space at high rates from an XDP
       program. Note that this functionality used to reside in libbpf, but has
       now been moved over to libxdp as it is a better fit for this library.
       As of the 1.0 release of libbpf, the AF_XDP socket support will be
       removed and all future development will be performed in libxdp instead.

       For an overview of AF_XDP sockets, please refer to this Linux Plumbers
       paper
       (http://vger.kernel.org/lpc_net2018_talks/lpc18_pres_af_xdp_perf-v3.pdf)
       and the documentation in the Linux kernel
       (Documentation/networking/af_xdp.rst or
       https://www.kernel.org/doc/html/latest/networking/af_xdp.html).

       For an example on how to use the interface, take a look at the sample
       application in the Linux kernel source tree at
       samples/bpf/xdpsock_user.c.

   Control path
       Libxdp provides helper functions for creating and destroying umems and
       sockets as shown below. The first thing that a user generally wants to
       do is to create a umem area. This is the area that will contain all
       packets received and the ones that are going to be sent. After that,
       AF_XDP sockets tied to this umem can be created. These can either be
       sockets that have exclusive ownership of that umem through
       xsk_socket__create() or shared with other sockets using
       xsk_socket__create_shared(). There is one option called
       XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD that can be set in the libxdp_flags
       field (also called libbpf_flags for compatibility reasons). This will
       make libxdp not load any XDP program or set up any BPF maps, which is a
       must if users want to add their own XDP program.

              int xsk_umem__create(struct xsk_umem **umem,
                                   void *umem_area, __u64 size,
                                   struct xsk_ring_prod *fill,
                                   struct xsk_ring_cons *comp,
                                   const struct xsk_umem_config *config);
              int xsk_socket__create(struct xsk_socket **xsk,
                                     const char *ifname, __u32 queue_id,
                                     struct xsk_umem *umem,
                                     struct xsk_ring_cons *rx,
                                     struct xsk_ring_prod *tx,
                                     const struct xsk_socket_config *config);
              int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
                                            const char *ifname,
                                            __u32 queue_id, struct xsk_umem *umem,
                                            struct xsk_ring_cons *rx,
                                            struct xsk_ring_prod *tx,
                                            struct xsk_ring_prod *fill,
                                            struct xsk_ring_cons *comp,
                                            const struct xsk_socket_config *config);
              int xsk_umem__delete(struct xsk_umem *umem);
              void xsk_socket__delete(struct xsk_socket *xsk);

       There are also two helper functions to get the file descriptor of a
       umem or a socket. These are needed when using standard Linux syscalls
       such as poll(), recvmsg(), sendto(), etc.

              int xsk_umem__fd(const struct xsk_umem *umem);
              int xsk_socket__fd(const struct xsk_socket *xsk);

       The control path also provides two APIs for setting up AF_XDP sockets
       when the process that is going to use the AF_XDP socket is
       non-privileged. These two functions perform the operations that require
       privileges and can be executed from some form of control process that
       has the necessary privileges. The xsk_socket__create() call executed in
       the non-privileged process will then skip these two steps. For an
       example on how to use these, please take a look at
       samples/bpf/xdpsock_user.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xdpsock_user.c)
       and samples/bpf/xdpsock_ctrl_proc.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xdpsock_ctrl_proc.c)
       in the Linux kernel source tree.

              int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
              int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);

   Data path
       For performance reasons, all the data path functions are static inline
       functions found in the xsk.h header file so they can be optimized into
       the target application binary for best possible performance. There are
       four FIFO rings of two main types: producer rings (fill and Tx) and
       consumer rings (Rx and completion). The producer rings use
       xsk_ring_prod functions and consumer rings use xsk_ring_cons functions.
       For producer rings, you start by reserving one or more slots in the
       ring; once they have been filled in, you submit them so that the kernel
       will act on them. For a consumer ring, you peek to see whether there
       are any new packets in the ring and, if so, read them from the ring.
       Once you are done reading them, you release them back to the kernel so
       it can use them for new packets. There is also a cancel operation for
       consumer rings if the application does not want to consume all packets
       received with the peek operation.

              __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx);
              void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb);
              __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx);
              void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb);
              void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb);

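       The reserve/submit and peek/release protocol can be tried out on a toy
       ring that lives entirely in one process. The sketch below is a model
       for illustration only: struct toy_ring and the toy_*() functions are
       hypothetical and are not the xsk_ring_prod or xsk_ring_cons
       implementation, which shares its indices with the kernel. In this
       sketch a reserve either gets all requested slots or none, while a peek
       may return fewer entries than requested.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u /* power of two, so idx & (RING_SIZE - 1) wraps */

struct toy_ring {
    uint32_t prod;        /* entries submitted and visible to the consumer */
    uint32_t cons;        /* entries released back by the consumer */
    uint32_t cached_prod; /* entries reserved, possibly not yet submitted */
    uint32_t cached_cons; /* entries peeked, possibly not yet released */
    uint64_t slot[RING_SIZE];
};

/* Slot access; idx is the value handed out by reserve or peek. */
static uint64_t *toy_slot(struct toy_ring *r, uint32_t idx)
{
    return &r->slot[idx & (RING_SIZE - 1)];
}

/* Producer side: claim nb slots (all or nothing), then publish them. */
static uint32_t toy_reserve(struct toy_ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t free_slots = RING_SIZE - (r->cached_prod - r->cons);

    if (nb > free_slots)
        return 0;
    *idx = r->cached_prod;
    r->cached_prod += nb;
    return nb;
}

static void toy_submit(struct toy_ring *r, uint32_t nb)
{
    assert(r->cached_prod - r->prod >= nb); /* only submit what was reserved */
    r->prod += nb;
}

/* Consumer side: look at up to nb entries, then hand them back. */
static uint32_t toy_peek(struct toy_ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t avail = r->prod - r->cached_cons;

    if (nb > avail)
        nb = avail;
    *idx = r->cached_cons;
    r->cached_cons += nb;
    return nb;
}

static void toy_cancel(struct toy_ring *r, uint32_t nb)
{
    r->cached_cons -= nb; /* un-peek entries that were not consumed */
}

static void toy_release(struct toy_ring *r, uint32_t nb)
{
    r->cons += nb;
}
```

       A typical round trip reserves n slots, fills toy_slot(&r, idx++) for
       each of them, submits n, and later peeks, reads and releases on the
       consumer side. Entries that were peeked but not processed are returned
       with toy_cancel() and show up again on the next peek.
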
       The functions below are used for reading and writing the descriptors of
       the rings. xsk_ring_prod__fill_addr() and xsk_ring_prod__tx_desc()
       write entries in the fill and Tx rings respectively, while
       xsk_ring_cons__comp_addr() and xsk_ring_cons__rx_desc() read entries
       from the completion and Rx rings respectively. The idx is the parameter
       returned in the xsk_ring_prod__reserve() or xsk_ring_cons__peek()
       calls. To advance to the next entry, simply do idx++.

              __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill, __u32 idx);
              struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx, __u32 idx);
              const __u64 *xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp, __u32 idx);
              const struct xdp_desc *xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx);

       The xsk_umem functions are used to get a pointer to the packet data
       itself, always located inside the umem. In the default aligned mode,
       you can get the addr variable straight from the Rx descriptor. But in
       unaligned mode, you need to use the last three functions below as the
       offset used is carried in the upper 16 bits of the addr. Therefore, you
       cannot use the addr straight from the descriptor in the unaligned case.

              void *xsk_umem__get_data(void *umem_area, __u64 addr);
              __u64 xsk_umem__extract_addr(__u64 addr);
              __u64 xsk_umem__extract_offset(__u64 addr);
              __u64 xsk_umem__add_offset_to_addr(__u64 addr);

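       The bit layout used in unaligned mode can be illustrated with a few
       lines of arithmetic. The sketch below uses hypothetical toy_*() names
       rather than the real xsk_umem functions, and assumes exactly what the
       text states: the offset is carried in the upper 16 bits of the 64-bit
       addr, which leaves the lower 48 bits for the base address.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_SHIFT 48 /* the offset occupies the upper 16 bits */
#define ADDR_MASK ((UINT64_C(1) << OFFSET_SHIFT) - 1)

/* Base address of the chunk: the lower 48 bits. */
static uint64_t toy_extract_addr(uint64_t addr)
{
    return addr & ADDR_MASK;
}

/* Offset of the packet data within the chunk: the upper 16 bits. */
static uint64_t toy_extract_offset(uint64_t addr)
{
    return addr >> OFFSET_SHIFT;
}

/* Plain umem-relative address of the packet data. */
static uint64_t toy_add_offset_to_addr(uint64_t addr)
{
    return toy_extract_addr(addr) + toy_extract_offset(addr);
}

/* Pointer to the data, given the start of the umem area. */
static void *toy_get_data(void *umem_area, uint64_t addr)
{
    assert(umem_area);
    return (char *)umem_area + addr;
}
```

       For a descriptor address with base 0x40 and offset 0x10, the combined
       value is (0x10 << 48) | 0x40, and the packet data starts 0x50 bytes
       into the umem area. Passing the raw descriptor value to
       toy_get_data() would point far outside the umem, which is why the
       address must be converted first.
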
       There is one more function in the data path, which checks whether the
       need_wakeup flag is set. Use of this flag is highly encouraged and
       should be enabled by setting the XDP_USE_NEED_WAKEUP bit in the
       xdp_bind_flags field that is provided to the
       xsk_socket__create[_shared]() calls. If this function returns true,
       then you need to call recvmsg(), sendto(), or poll() depending on the
       situation: recvmsg() if you are receiving, or sendto() if you are
       sending. poll() can be used for both cases and provides the ability to
       sleep too, as with any other socket. But note that poll() is a slower
       operation than the other two.

              int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r);

       For an example on how to use all these APIs, take a look at the sample
       applications in the Linux kernel source tree at
       samples/bpf/xdpsock_user.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xdpsock_user.c)
       and samples/bpf/xsk_fwd.c
       (https://github.com/torvalds/linux/blob/master/samples/bpf/xsk_fwd.c).

Kernel and BPF program feature compatibility

       The features exposed by libxdp rely on certain kernel versions and BPF
       features to work. To get the full benefit of all features, libxdp needs
       to be used with kernel 5.10 or newer, unless the commits mentioned
       below have been backported. However, libxdp will probe the kernel and
       transparently fall back to legacy loading procedures, so it is possible
       to use the library with older versions, although some features will be
       unavailable, as detailed below.

       The ability to attach multiple BPF programs to a single interface
       relies on the kernel "BPF program extension" feature which was
       introduced by commit be8704ff07d2 ("bpf: Introduce dynamic program
       extensions") in the upstream kernel and first appeared in kernel
       release 5.6. To incrementally attach multiple programs, a further
       refinement added by commit 4a1e7c0c63e0 ("bpf: Support attaching
       freplace programs to multiple attach points") is needed; this first
       appeared in the upstream kernel version 5.10. The functionality relies
       on the "BPF trampolines" feature which is unfortunately only available
       on the x86_64 architecture. In other words, kernels before 5.6 can only
       attach a single XDP program to each interface, kernels 5.6+ can attach
       multiple programs if they are all attached at the same time, and
       kernels 5.10+ have full support for XDP multiprog on x86_64. On other
       architectures, only a single program can be attached to each interface.

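       The version thresholds above can be restated as a small lookup
       function. This is purely a summary of the preceding paragraph; the
       enum and function names are hypothetical. libxdp itself probes the
       kernel instead of comparing version numbers, and a version check like
       this one gives the wrong answer on kernels with backported features.

```c
#include <assert.h>
#include <stdbool.h>

enum multiprog_support {
    MULTIPROG_NONE,        /* a single XDP program per interface */
    MULTIPROG_AT_ONCE,     /* several programs, all attached together */
    MULTIPROG_INCREMENTAL, /* programs can be added one at a time */
};

/* Hypothetical summary of the upstream kernel thresholds named above. */
static enum multiprog_support multiprog_level(unsigned int major,
                                              unsigned int minor,
                                              bool is_x86_64)
{
    assert(major > 0);
    if (!is_x86_64) /* BPF trampolines are required */
        return MULTIPROG_NONE;
    if (major > 5 || (major == 5 && minor >= 10))
        return MULTIPROG_INCREMENTAL;
    if (major == 5 && minor >= 6)
        return MULTIPROG_AT_ONCE;
    return MULTIPROG_NONE;
}
```

       For example, multiprog_level(5, 6, true) yields MULTIPROG_AT_ONCE: on
       such a kernel all programs have to be attached in one call to
       xdp_program__attach_multi().
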
       To load AF_XDP programs, kernel support for AF_XDP sockets needs to be
       included and enabled in the kernel build. In addition, when using
       AF_XDP sockets, an XDP program is also loaded on the interface. The XDP
       program used for this by libxdp requires the ability to do map lookups
       into XSK maps, which was introduced with commit fada7fdc83c0 ("bpf:
       Allow bpf_map_lookup_elem() on an xskmap") in kernel 5.3. This means
       that the minimum required kernel version for using AF_XDP is kernel
       5.3; however, for the AF_XDP XDP program to co-exist with other
       programs, the same constraints for multiprog apply as outlined above.

       Note that some Linux distributions backport features to earlier kernel
       versions, especially in enterprise kernels; for instance, Red Hat
       Enterprise Linux kernels include everything needed for libxdp to
       function since RHEL 8.5.

       Finally, XDP programs loaded using the multiprog facility must include
       type information (using the BPF Type Format, BTF). To get this, compile
       the programs with a recent version of Clang/LLVM (version 10+), and
       enable debug information when compiling (using the -g option).

BUGS

       Please report any bugs on GitHub:
       https://github.com/xdp-project/xdp-tools/issues

AUTHORS

       libxdp and this man page were written by Toke Høiland-Jørgensen. AF_XDP
       support and documentation were contributed by Magnus Karlsson.

v1.2.3                         February 17, 2022                     libxdp(3)