libxdp(3)          libxdp - library for loading XDP programs         libxdp(3)

libxdp - library for attaching XDP programs and using AF_XDP sockets

       This directory contains the files for the libxdp library for attaching
       XDP programs to network interfaces and using AF_XDP sockets. The
       library is fairly lightweight and relies on libbpf to do the heavy
       lifting for processing eBPF object files etc.

       Libxdp provides two primary features on top of libbpf. The first is the
       ability to load multiple XDP programs in sequence on a single network
       device (which is not natively supported by the kernel). This support
       relies on the freplace functionality in the kernel, which makes it
       possible to attach an eBPF program as a replacement for a global
       function in another (already loaded) eBPF program. The second main
       feature is a set of helper functions for configuring AF_XDP sockets as
       well as for reading and writing packets on these sockets.

   Using libxdp from an application
       Basic usage of libxdp from an application is quite straightforward.
       The following example attaches an XDP program to the 'lo' interface and
       then detaches it again:

              #include <xdp/libxdp.h>

              #define IFINDEX 1   /* ifindex of 'lo' */

              struct xdp_program *prog;
              int err;

              prog = xdp_program__open_file("my-program.o", "section_name", NULL);
              err = libxdp_get_error(prog);   /* non-zero if opening failed */
              if (err)
                  return err;

              err = xdp_program__attach(prog, IFINDEX, XDP_MODE_NATIVE, 0);
              if (!err)
                  xdp_program__detach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

              xdp_program__close(prog);

       The xdp_program structure is an opaque structure that represents a
       single XDP program. libxdp contains functions to create such a struct
       either from a BPF object file on disk, from a libbpf BPF object, or from
       an identifier of a program that is already loaded into the kernel:

              struct xdp_program *xdp_program__from_bpf_obj(struct bpf_object *obj,
                      const char *section_name);
              struct xdp_program *xdp_program__find_file(const char *filename,
                      const char *section_name,
                      struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__open_file(const char *filename,
                      const char *section_name,
                      struct bpf_object_open_opts *opts);
              struct xdp_program *xdp_program__from_fd(int fd);
              struct xdp_program *xdp_program__from_id(__u32 prog_id);
              struct xdp_program *xdp_program__from_pin(const char *pin_path);

       The functions that open a BPF object or file need the function name of
       the XDP program as well as the file name or object, since an ELF file
       can contain multiple XDP programs. The xdp_program__find_file() function
       takes a filename without a path, and will look for the object in
       LIBXDP_OBJECT_PATH, which defaults to /usr/lib/bpf (or /usr/lib64/bpf on
       systems using a split library path). This is convenient for
       applications shipping pre-compiled eBPF object files.
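
       The lookup rule for xdp_program__find_file() can be illustrated with a
       small stand-alone sketch. This is not libxdp's implementation:
       build_object_path() is a hypothetical helper, the override directory is
       passed in by the caller (for instance the value of a LIBXDP_OBJECT_PATH
       environment variable, which is an assumption here), and only the
       /usr/lib/bpf default is modelled.

```c
#include <stdio.h>

/* Default directory named in the text; the lib64 variant is not modelled. */
#define DEFAULT_OBJECT_PATH "/usr/lib/bpf"

/*
 * Hypothetical helper, not part of libxdp: write "<dir>/<filename>" to buf.
 * dir_override stands in for the LIBXDP_OBJECT_PATH setting; pass NULL or ""
 * to use the default. Returns 0 on success, -1 if buf is too small.
 */
static int build_object_path(const char *dir_override, const char *filename,
                             char *buf, size_t buf_len)
{
    const char *dir = DEFAULT_OBJECT_PATH;
    int n;

    if (dir_override && *dir_override)
        dir = dir_override;

    n = snprintf(buf, buf_len, "%s/%s", dir, filename);
    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}
```

       A caller could pass getenv("LIBXDP_OBJECT_PATH") as the first argument
       and hand the resulting path to xdp_program__open_file().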
66
67
       The xdp_program__attach() function will attach the program to an
       interface, building a dispatcher program to execute it. Multiple
       programs can be attached at once with xdp_program__attach_multi(); they
       will be sorted in order of their run priority, and execution from one
       program to the next will proceed based on the chain call actions defined
       for each program (see the Program metadata section below). Because the
       loading process involves modifying the attach type of the program, the
       attach functions only work with struct xdp_program objects that have not
       yet been loaded into the kernel.

       When using the attach functions to attach to an interface that already
       has an XDP program loaded, libxdp will attempt to add the program to the
       list of loaded programs. However, this may fail, either due to missing
       kernel support, or because the already-attached program was not loaded
       using a dispatcher compatible with libxdp. If the kernel support for
       incremental attach (merged in kernel 5.10) is missing, the only way to
       actually run multiple programs on a single interface is to attach them
       all at the same time with xdp_program__attach_multi(). If the existing
       program is not an XDP dispatcher, that program will have to be detached
       from the interface before libxdp can attach a new one. This can be done
       by calling xdp_program__detach() with a reference to the loaded program;
       but note that this will of course break any application relying on that
       other XDP program to be present.

Program metadata

       To support multiple XDP programs on the same interface, libxdp uses two
       pieces of metadata for each XDP program: run priority and chain call
       actions.

   Run priority
       This is the priority of the program and is a simple integer used to
       sort programs when loading multiple programs onto the same interface.
       Programs that wish to run early (such as a packet filter) should set
       low values for this, while programs that want to run later (such as a
       packet forwarder or counter) should set higher values. Note that a
       later program is only run if each earlier program ends with a return
       code that is part of that program's chain call actions (see below). If
       not specified, the default priority value is 50.

   Chain call actions
       These are the return codes with which the program indicates that a
       packet should continue processing. If the program returns one of these
       actions, later programs in the call chain will be run, whereas if it
       returns any other action, processing will be interrupted, and the XDP
       dispatcher will return the verdict immediately. If not set, this
       defaults to just XDP_PASS, which is likely the value most programs
       should use.

   Specifying metadata
       The metadata outlined above is specified as BTF information embedded in
       the ELF file containing the XDP program. The xdp_helpers.h file shipped
       with libxdp contains helper macros to include this information, which
       can be used as follows:

              #include <bpf/bpf_helpers.h>
              #include <xdp/xdp_helpers.h>

              struct {
                   __uint(priority, 10);
                   __uint(XDP_PASS, 1);
                   __uint(XDP_DROP, 1);
              } XDP_RUN_CONFIG(my_xdp_func);

       This example specifies that the XDP program in my_xdp_func should have
       priority 10 and that its chain call actions are XDP_PASS and XDP_DROP.
       In a source file with multiple XDP programs, a definition like the one
       above can be included for each program (main XDP function). Any program
       that does not specify any config information will use the default
       values outlined above.

   Inspecting and modifying metadata
       libxdp exposes the following functions that an application can use to
       inspect and modify the metadata on an XDP program. Modification is only
       possible before a program is attached on an interface. These functions
       won't modify the BTF information itself, but the new values will be
       stored as part of the program attachment.

              unsigned int xdp_program__run_prio(const struct xdp_program *xdp_prog);
              int xdp_program__set_run_prio(struct xdp_program *xdp_prog,
                      unsigned int run_prio);
              bool xdp_program__chain_call_enabled(const struct xdp_program *xdp_prog,
                      enum xdp_action action);
              int xdp_program__set_chain_call_enabled(struct xdp_program *prog,
                      unsigned int action,
                      bool enabled);
              int xdp_program__print_chain_call_actions(const struct xdp_program *prog,
                      char *buf,
                      size_t buf_len);

The dispatcher program

       To support multiple non-offloaded programs on the same network
       interface, libxdp uses a dispatcher program which is a small wrapper
       program that will call each component program in turn, check the return
       code, and then chain call to the next program based on the chain call
       actions of the previous program (see the Program metadata section above).
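
       The dispatch rule can be modelled in ordinary user-space C. The sketch
       below is not the libxdp dispatcher (which is an eBPF program); struct
       component, dispatch() and the always_*() stand-ins are hypothetical
       names used only for illustration. The action values mirror enum
       xdp_action, and returning a pass verdict for an empty list is an
       assumption of this sketch.

```c
#include <stdlib.h>

/* Same numeric values as enum xdp_action in <linux/bpf.h>. */
enum action { ACT_ABORTED = 0, ACT_DROP, ACT_PASS, ACT_TX, ACT_REDIRECT };

struct component {
    const char *name;
    unsigned int run_prio;        /* lower values run earlier */
    unsigned int chain_call_mask; /* bit N set: continue after return code N */
    enum action (*run)(void);     /* stand-in for the XDP program */
};

/* Trivial stand-in programs that always return one verdict. */
static enum action always_pass(void) { return ACT_PASS; }
static enum action always_drop(void) { return ACT_DROP; }
static enum action always_tx(void)   { return ACT_TX; }

static int by_prio(const void *a, const void *b)
{
    const struct component *x = a, *y = b;

    return (x->run_prio > y->run_prio) - (x->run_prio < y->run_prio);
}

/* Sort by run priority, then run programs until one ends the chain. */
static enum action dispatch(struct component *progs, size_t n)
{
    enum action ret = ACT_PASS; /* assumed verdict when no program runs */
    size_t i;

    qsort(progs, n, sizeof(*progs), by_prio);
    for (i = 0; i < n; i++) {
        ret = progs[i].run();
        if (!(progs[i].chain_call_mask & (1U << ret)))
            break; /* not a chain call action: return the verdict now */
    }
    return ret;
}
```

       With a filter at priority 10 and a forwarder at priority 50, both using
       the default chain call action, the forwarder only runs when the filter
       returns the pass verdict.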
171
172
       While applications using libxdp do not need to know the details of the
       dispatcher program to just load an XDP program onto an interface,
       libxdp does expose the dispatcher and its attached component programs,
       which can be used to list the programs currently attached to an
       interface.

       The structure used for this is struct xdp_multiprog, which can only be
       constructed from the programs loaded on an interface based on ifindex.
       The API for getting a multiprog reference and iterating through the
       attached programs looks like this:

              struct xdp_multiprog *xdp_multiprog__get_from_ifindex(int ifindex);
              struct xdp_program *xdp_multiprog__next_prog(const struct xdp_program *prog,
                      const struct xdp_multiprog *mp);
              void xdp_multiprog__close(struct xdp_multiprog *mp);
              int xdp_multiprog__detach(struct xdp_multiprog *mp, int ifindex);
              enum xdp_attach_mode xdp_multiprog__attach_mode(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__main_prog(const struct xdp_multiprog *mp);
              struct xdp_program *xdp_multiprog__hw_prog(const struct xdp_multiprog *mp);
              bool xdp_multiprog__is_legacy(const struct xdp_multiprog *mp);

       If a non-offloaded program is attached to the interface which libxdp
       doesn't recognise as a dispatcher program, an xdp_multiprog structure
       will still be returned, and xdp_multiprog__is_legacy() will return true
       for that program (note that this also holds true if only an offloaded
       program is loaded). A reference to that (regular) XDP program can be
       obtained by xdp_multiprog__main_prog(). If the program attached to the
       interface is a dispatcher program, xdp_multiprog__main_prog() will
       return a reference to the dispatcher program itself, which is mainly
       useful for obtaining other data about that program (such as the program
       ID). A reference to an offloaded program can be acquired using
       xdp_multiprog__hw_prog(). The xdp_multiprog__attach_mode() function
       returns the attach mode of the non-offloaded program; whether an
       offloaded program is attached should be checked through
       xdp_multiprog__hw_prog().

   Pinning in bpffs
       The kernel will automatically detach component programs from the
       dispatcher once the last reference to them disappears. To prevent this
       from happening, libxdp will pin the component program references in
       bpffs before attaching the dispatcher to the network interface. The
       pathnames generated for pinning are as follows:

       -   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID - dispatcher program for
           IFINDEX with BPF program ID DID

       -   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-prog - component program
           0, program reference

       -   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-link - component program
           0, bpf_link reference

       -   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-prog - component program
           1, program reference

       -   /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-link - component program
           1, bpf_link reference

       -   etc, up to ten component programs

       If set, the LIBXDP_BPFFS environment variable will override the
       location of bpffs, but the xdp subdirectory is always used.
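
       The naming scheme listed above can be reproduced with a short
       formatting helper, which may be handy when inspecting bpffs by hand.
       This is only a sketch of the documented layout: component_pin_path()
       is a hypothetical function, not a libxdp API, and the bpffs location
       is passed in by the caller rather than read from LIBXDP_BPFFS.

```c
#include <stdio.h>

#define DEFAULT_BPFFS "/sys/fs/bpf"

/*
 * Hypothetical helper, not part of libxdp: format the pin path of one
 * component program reference. kind is "prog" or "link". Pass NULL or ""
 * as bpffs to use the default location. Returns 0 on success, -1 if the
 * path does not fit in buf.
 */
static int component_pin_path(char *buf, size_t len, const char *bpffs,
                              int ifindex, unsigned int dispatcher_id,
                              unsigned int prog_idx, const char *kind)
{
    int n;

    if (!bpffs || !*bpffs)
        bpffs = DEFAULT_BPFFS;

    /* The "xdp" subdirectory is always used, whatever the bpffs root is. */
    n = snprintf(buf, len, "%s/xdp/dispatch-%d-%u/prog%u-%s",
                 bpffs, ifindex, dispatcher_id, prog_idx, kind);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

       For ifindex 1 and a dispatcher with program ID 42, the bpf_link
       reference of the first component program would be formatted as
       /sys/fs/bpf/xdp/dispatch-1-42/prog0-link.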

Using AF_XDP sockets

       Libxdp implements helper functions for configuring AF_XDP sockets as
       well as for reading and writing packets on these sockets. AF_XDP
       sockets can be used to redirect packets to user-space at high rates from
       an XDP program. Note that this functionality used to reside in libbpf,
       but has now been moved over to libxdp as it is a better fit for this
       library. As of the 1.0 release of libbpf, the AF_XDP socket support will
       be removed and all future development will be performed in libxdp
       instead.

       For an overview of AF_XDP sockets, please refer to this Linux Plumbers
       paper
       (http://vger.kernel.org/lpc_net2018_talks/lpc18_pres_af_xdp_perf-v3.pdf)
       and the documentation in the Linux kernel
       (Documentation/networking/af_xdp.rst or
       https://www.kernel.org/doc/Documentation/networking/af_xdp.rst).

       For an example on how to use the interface, take a look at the sample
       application in the Linux kernel source tree at
       samples/bpf/xdpsock_user.c.

   Control path
       Libxdp provides helper functions for creating and destroying umems and
       sockets as shown below. The first thing that a user generally wants to
       do is to create a umem area. This is the area that will contain all
       packets received and the ones that are going to be sent. After that,
       AF_XDP sockets can be created tied to this umem. These can either be
       sockets that have exclusive ownership of that umem through
       xsk_socket__create() or shared with other sockets using
       xsk_socket__create_shared(). There is one option called
       XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD that can be set in the libxdp_flags
       field (also called libbpf_flags for compatibility reasons). This makes
       libxdp neither load an XDP program nor set up any BPF maps, which is a
       must if users want to add their own XDP program.

              int xsk_umem__create(struct xsk_umem **umem,
                      void *umem_area, __u64 size,
                      struct xsk_ring_prod *fill,
                      struct xsk_ring_cons *comp,
                      const struct xsk_umem_config *config);
              int xsk_socket__create(struct xsk_socket **xsk,
                      const char *ifname, __u32 queue_id,
                      struct xsk_umem *umem,
                      struct xsk_ring_cons *rx,
                      struct xsk_ring_prod *tx,
                      const struct xsk_socket_config *config);
              int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
                      const char *ifname,
                      __u32 queue_id, struct xsk_umem *umem,
                      struct xsk_ring_cons *rx,
                      struct xsk_ring_prod *tx,
                      struct xsk_ring_prod *fill,
                      struct xsk_ring_cons *comp,
                      const struct xsk_socket_config *config);
              int xsk_umem__delete(struct xsk_umem *umem);
              void xsk_socket__delete(struct xsk_socket *xsk);

       There are also two helper functions to get the file descriptor of a
       umem or a socket. These are needed when using standard Linux syscalls
       such as poll(), recvmsg(), sendto(), etc.

              int xsk_umem__fd(const struct xsk_umem *umem);
              int xsk_socket__fd(const struct xsk_socket *xsk);

       The control path also provides two APIs for setting up AF_XDP sockets
       when the process that is going to use the AF_XDP socket is
       non-privileged. These two functions perform the operations that require
       privileges and can be executed from some form of control process that
       has the necessary privileges. The xsk_socket__create() call executed in
       the non-privileged process will then skip these two steps. For an
       example on how to use these, please take a look at
       samples/bpf/xdpsock_user.c and samples/bpf/xdpsock_ctrl_proc.c in the
       Linux kernel source tree.

              int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
              int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);

   Data path
       For performance reasons, all the data path functions are static inline
       functions found in the xsk.h header file so they can be optimized into
       the target application binary for best possible performance. There are
       four FIFO rings of two main types: producer rings (fill and Tx) and
       consumer rings (Rx and completion). The producer rings use xsk_ring_prod
       functions and consumer rings use xsk_ring_cons functions. For producer
       rings, you start by reserving one or more slots in a producer ring and
       then, when they have been filled out, you submit them so that the kernel
       will act on them. For a consumer ring, you peek to see whether there are
       any new packets in the ring and, if so, you can read them from the ring.
       Once you are done reading them, you release them back to the kernel so
       it can use them for new packets. There is also a cancel operation for
       consumer rings if the application does not want to consume all packets
       received with the peek operation.

              __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx);
              void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb);
              __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx);
              void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb);
              void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb);
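
       The reserve/submit and peek/cancel/release steps can be pictured with a
       toy, single-threaded ring. This is a simplified model only: the real
       functions in xsk.h operate on memory shared with the kernel and use
       memory barriers, none of which is modelled here. struct toy_ring and
       all toy_*() names are hypothetical, and the all-or-nothing reserve is
       an assumption of this model.

```c
#include <stdint.h>

#define RING_SIZE 8u /* must be a power of two */

struct toy_ring {
    uint32_t producer; /* entries submitted so far */
    uint32_t consumer; /* entries released so far */
    uint32_t reserved; /* producer side: entries reserved so far */
    uint32_t peeked;   /* consumer side: entries peeked so far */
    uint64_t slots[RING_SIZE];
};

/* Map a free-running index to a slot in the ring. */
static uint64_t *toy_slot(struct toy_ring *r, uint32_t idx)
{
    return &r->slots[idx & (RING_SIZE - 1)];
}

/* Reserve nb slots; returns nb, or 0 if fewer than nb slots are free. */
static uint32_t toy_reserve(struct toy_ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t free_slots = RING_SIZE - (r->reserved - r->consumer);

    if (nb > free_slots)
        return 0;
    *idx = r->reserved;
    r->reserved += nb;
    return nb;
}

/* Publish nb previously reserved and filled slots to the consumer. */
static void toy_submit(struct toy_ring *r, uint32_t nb)
{
    r->producer += nb;
}

/* Peek at up to nb submitted entries; returns how many are available. */
static uint32_t toy_peek(struct toy_ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t avail = r->producer - r->peeked;

    if (nb > avail)
        nb = avail;
    if (nb) {
        *idx = r->peeked;
        r->peeked += nb;
    }
    return nb;
}

/* Hand back nb peeked entries without consuming them. */
static void toy_cancel(struct toy_ring *r, uint32_t nb)
{
    r->peeked -= nb;
}

/* Mark nb entries as consumed so their slots can be reused. */
static void toy_release(struct toy_ring *r, uint32_t nb)
{
    r->consumer += nb;
}
```

       A producer would call toy_reserve(), write through toy_slot() for each
       reserved index, and then toy_submit(); a consumer would call
       toy_peek(), read the entries, and finish with toy_release().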
342
343
       The functions below are used for reading and writing the descriptors of
       the rings. xsk_ring_prod__fill_addr() and xsk_ring_prod__tx_desc() write
       entries in the fill and Tx rings respectively, while
       xsk_ring_cons__comp_addr() and xsk_ring_cons__rx_desc() read entries
       from the completion and Rx rings respectively. The idx is the parameter
       returned in the xsk_ring_prod__reserve() or xsk_ring_cons__peek() calls.
       To advance to the next entry, simply do idx++.

              __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill, __u32 idx);
              struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx, __u32 idx);
              const __u64 *xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp, __u32 idx);
              const struct xdp_desc *xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx);

       The xsk_umem functions are used to get a pointer to the packet data
       itself, always located inside the umem. In the default aligned mode, you
       can get the addr variable straight from the Rx descriptor. But in
       unaligned mode, you need to use the last three functions below, as the
       offset used is carried in the upper 16 bits of the addr. Therefore, you
       cannot use the addr straight from the descriptor in the unaligned case.

              void *xsk_umem__get_data(void *umem_area, __u64 addr);
              __u64 xsk_umem__extract_addr(__u64 addr);
              __u64 xsk_umem__extract_offset(__u64 addr);
              __u64 xsk_umem__add_offset_to_addr(__u64 addr);
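
       The unaligned-mode address handling amounts to a little bit
       arithmetic. The sketch below assumes a 64-bit addr whose upper 16 bits
       carry the offset and whose lower 48 bits carry the base address, as
       the text implies. The sketch_*() functions are local stand-ins written
       for illustration, not the xsk_umem__*() helpers themselves.

```c
#include <stdint.h>

/* Assumed layout: offset in the upper 16 bits, base address in the rest. */
#define SKETCH_OFFSET_SHIFT 48
#define SKETCH_ADDR_MASK    ((1ULL << SKETCH_OFFSET_SHIFT) - 1)

/* Base address of the buffer inside the umem. */
static uint64_t sketch_extract_addr(uint64_t addr)
{
    return addr & SKETCH_ADDR_MASK;
}

/* Offset of the packet data relative to the base address. */
static uint64_t sketch_extract_offset(uint64_t addr)
{
    return addr >> SKETCH_OFFSET_SHIFT;
}

/* Fold the offset into the base address to get a plain umem offset. */
static uint64_t sketch_add_offset_to_addr(uint64_t addr)
{
    return sketch_extract_addr(addr) + sketch_extract_offset(addr);
}

/* Pointer to the data, given the start of the umem and a plain offset. */
static void *sketch_get_data(void *umem_area, uint64_t addr)
{
    return (char *)umem_area + addr;
}
```

       In unaligned mode, the value taken from a descriptor would first go
       through sketch_add_offset_to_addr() and only then be passed to
       sketch_get_data().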
369
370
       There is one more function in the data path, which checks whether the
       need_wakeup flag is set. Use of this flag is highly encouraged; enable
       it by setting the XDP_USE_NEED_WAKEUP bit in the xdp_bind_flags field
       that is provided to the xsk_socket__create[_shared]() calls. If this
       function returns true, you need to call recvmsg(), sendto(), or poll()
       depending on the situation: recvmsg() if you are receiving, or sendto()
       if you are sending. poll() can be used in both cases and also provides
       the ability to sleep, as with any other socket. But note that poll() is
       a slower operation than the other two.

              int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r);

       For an example on how to use all these APIs, take a look at the sample
       applications in the Linux kernel source tree at
       samples/bpf/xdpsock_user.c and samples/bpf/xsk_fwd.c.

BUGS

       Please report any bugs on GitHub:
       https://github.com/xdp-project/xdp-tools/issues

AUTHORS

       libxdp and this man page were written by Toke Høiland-Jørgensen. AF_XDP
       support and documentation were contributed by Magnus Karlsson.

v1.2.0                           July 24, 2021                       libxdp(3)