libxdp(3)          libxdp - library for loading XDP programs          libxdp(3)

This directory contains the files for the libxdp library for attaching
XDP programs to network interfaces and using AF_XDP sockets. The
library is fairly lightweight and relies on libbpf to do the heavy
lifting for processing eBPF object files etc.

Libxdp provides two primary features on top of libbpf. The first is the
ability to load multiple XDP programs in sequence on a single network
device (which is not natively supported by the kernel). This support
relies on the freplace functionality in the kernel, which makes it
possible to attach an eBPF program as a replacement for a global
function in another (already loaded) eBPF program. The second main
feature is helper functions for configuring AF_XDP sockets as well as
reading and writing packets from these sockets.

Using libxdp from an application
Basic usage of libxdp from an application is quite straightforward.
The following example loads, then unloads, an XDP program from the 'lo'
interface:

    #define IFINDEX 1    /* ifindex of the 'lo' interface */

    struct xdp_program *prog;
    int err;

    /* Open the object file and select the program to use. */
    prog = xdp_program__open_file("my-program.o", "section_name", NULL);
    err = xdp_program__attach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

    /* Only detach if the attach succeeded. */
    if (!err)
            xdp_program__detach(prog, IFINDEX, XDP_MODE_NATIVE, 0);

    xdp_program__close(prog);

The xdp_program structure is an opaque structure that represents a
single XDP program. libxdp contains functions to create such a struct
either from a BPF object file on disk, from a libbpf BPF object, or
from an identifier of a program that is already loaded into the kernel:

    struct xdp_program *xdp_program__from_bpf_obj(struct bpf_object *obj,
                                                  const char *section_name);
    struct xdp_program *xdp_program__find_file(const char *filename,
                                               const char *section_name,
                                               struct bpf_object_open_opts *opts);
    struct xdp_program *xdp_program__open_file(const char *filename,
                                               const char *section_name,
                                               struct bpf_object_open_opts *opts);
    struct xdp_program *xdp_program__from_fd(int fd);
    struct xdp_program *xdp_program__from_id(__u32 prog_id);
    struct xdp_program *xdp_program__from_pin(const char *pin_path);

The functions that open a BPF object or file need the function name of
the XDP program as well as the file name or object, since an ELF file
can contain multiple XDP programs. The xdp_program__find_file()
function takes a filename without a path, and will look for the object
in LIBXDP_OBJECT_PATH which defaults to /usr/lib/bpf (or /usr/lib64/bpf
on systems using a split library path). This is convenient for
applications shipping pre-compiled eBPF object files.

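The lookup rule described above can be sketched in plain C. This is an
illustration only, not libxdp's implementation: resolve_object_path() is
a hypothetical helper, and it assumes the caller obtains the directory
from an environment variable named LIBXDP_OBJECT_PATH and that
/usr/lib/bpf is the fallback.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libxdp: build the path that a
 * lookup by bare filename would try. env_dir is the value of
 * LIBXDP_OBJECT_PATH (assumed to come from the environment); pass
 * NULL or "" to fall back to the default directory. */
static int resolve_object_path(const char *env_dir, const char *filename,
                               char *buf, size_t len)
{
    const char *dir = env_dir;
    int n;

    if (!dir || !*dir)
        dir = "/usr/lib/bpf";   /* default named in the text above */

    n = snprintf(buf, len, "%s/%s", dir, filename);
    if (n < 0 || (size_t)n >= len)
        return -1;              /* path did not fit in buf */
    return 0;
}
```

An application would typically call this as
resolve_object_path(getenv("LIBXDP_OBJECT_PATH"), "my-program.o", buf,
sizeof(buf)); when using libxdp itself, xdp_program__find_file() does the
lookup for you.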
The xdp_program__attach() function will attach the program to an
interface, building a dispatcher program to execute it. Multiple
programs can be attached at once with xdp_program__attach_multi(); they
will be sorted in order of their run priority, and execution from one
program to the next will proceed based on the chain call actions
defined for each program (see the Program metadata section below).
Because the loading process involves modifying the attach type of the
program, the attach functions only work with struct xdp_program objects
that have not yet been loaded into the kernel.

When using the attach functions to attach to an interface that already
has an XDP program loaded, libxdp will attempt to add the program to
the list of loaded programs. However, this may fail, either due to
missing kernel support, or because the already-attached program was not
loaded using a dispatcher compatible with libxdp. If the kernel support
for incremental attach (merged in kernel 5.10) is missing, the only way
to actually run multiple programs on a single interface is to attach
them all at the same time with xdp_program__attach_multi(). If the
existing program is not an XDP dispatcher, that program will have to be
detached from the interface before libxdp can attach a new one. This
can be done by calling xdp_program__detach() with a reference to the
loaded program; but note that this will of course break any application
relying on that other XDP program to be present.

Program metadata
To support multiple XDP programs on the same interface, libxdp uses two
pieces of metadata for each XDP program: run priority and chain call
actions.

Run priority
This is the priority of the program and is a simple integer used to
sort programs when loading multiple programs onto the same interface.
Programs that wish to run early (such as a packet filter) should set
low values for this, while programs that want to run later (such as a
packet forwarder or counter) should set higher values. Note that a
later program is only run if each previous program ends with a return
code that is part of its own chain call actions (see below). If not
specified, the default priority value is 50.

Chain call actions
These are the program return codes that the program indicates for
packets that should continue processing. If the program returns one of
these actions, later programs in the call chain will be run, whereas if
it returns any other action, processing will be interrupted, and the
XDP dispatcher will return the verdict immediately. If not set, this
defaults to just XDP_PASS, which is likely the value most programs
should use.

Specifying metadata
The metadata outlined above is specified as BTF information embedded in
the ELF file containing the XDP program. The xdp_helpers.h file shipped
with libxdp contains helper macros to include this information, which
can be used as follows:

    #include <bpf/bpf_helpers.h>
    #include <xdp/xdp_helpers.h>

    struct {
            __uint(priority, 10);
            __uint(XDP_PASS, 1);
            __uint(XDP_DROP, 1);
    } XDP_RUN_CONFIG(my_xdp_func);

This example specifies that the XDP program in my_xdp_func should have
priority 10 and that its chain call actions are XDP_PASS and XDP_DROP.
In a source file with multiple XDP programs, a definition like the
above can be included for each program (main XDP function). Any program
that does not specify any config information will use the default
values outlined above.

Inspecting and modifying metadata
libxdp exposes the following functions that an application can use to
inspect and modify the metadata on an XDP program. Modification is only
possible before a program is attached on an interface. These functions
won't modify the BTF information itself, but the new values will be
stored as part of the program attachment.

    unsigned int xdp_program__run_prio(const struct xdp_program *xdp_prog);
    int xdp_program__set_run_prio(struct xdp_program *xdp_prog,
                                  unsigned int run_prio);
    bool xdp_program__chain_call_enabled(const struct xdp_program *xdp_prog,
                                         enum xdp_action action);
    int xdp_program__set_chain_call_enabled(struct xdp_program *prog,
                                            unsigned int action,
                                            bool enabled);
    int xdp_program__print_chain_call_actions(const struct xdp_program *prog,
                                              char *buf,
                                              size_t buf_len);

To support multiple non-offloaded programs on the same network
interface, libxdp uses a dispatcher program which is a small wrapper
program that will call each component program in turn, inspect the
return code, and then chain call to the next program based on the chain
call actions of the previous program (see the Program metadata section
above).

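The interplay between run priority and chain call actions can be
modelled in user space. The following is a simplified sketch, not the
dispatcher shipped with libxdp: struct component, dispatch() and the
always_*() callbacks are hypothetical names, the chain call actions are
assumed to be stored as a bitmask of (1 << action), and returning
XDP_PASS once every program has chained through is an assumption of
this model.

```c
#include <stdint.h>
#include <stdlib.h>

/* Action values as defined by enum xdp_action in <linux/bpf.h>. */
enum { XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };

/* Hypothetical user-space stand-in for one component program. */
struct component {
    unsigned int run_prio;          /* lower values run first */
    uint32_t chain_call_actions;    /* bitmask of (1 << action) */
    int (*run)(void *pkt);          /* returns an XDP action */
};

static int cmp_prio(const void *a, const void *b)
{
    const struct component *x = a, *y = b;

    return (x->run_prio > y->run_prio) - (x->run_prio < y->run_prio);
}

/* Sort by run priority, then call each program in turn. Stop as soon
 * as a program returns an action outside its chain call actions. */
static int dispatch(struct component *progs, size_t n, void *pkt)
{
    size_t i;
    int ret;

    qsort(progs, n, sizeof(*progs), cmp_prio);
    for (i = 0; i < n; i++) {
        ret = progs[i].run(pkt);
        if (!(progs[i].chain_call_actions & (1U << ret)))
            return ret;             /* verdict is final */
    }
    return XDP_PASS;                /* assumption: everything chained */
}

static int always_pass(void *pkt) { (void)pkt; return XDP_PASS; }
static int always_drop(void *pkt) { (void)pkt; return XDP_DROP; }
```

With the default chain call actions (XDP_PASS only), a priority-10
filter that returns XDP_DROP ends processing immediately, and a
priority-50 program behind it never sees the packet.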
While applications using libxdp do not need to know the details of the
dispatcher program to just load an XDP program onto an interface,
libxdp does expose the dispatcher and its attached component programs,
which can be used to list the programs currently attached to an
interface.

The structure used for this is struct xdp_multiprog, which can only be
constructed from the programs loaded on an interface based on ifindex.
The API for getting a multiprog reference and iterating through the
attached programs looks like this:

    struct xdp_multiprog *xdp_multiprog__get_from_ifindex(int ifindex);
    struct xdp_program *xdp_multiprog__next_prog(const struct xdp_program *prog,
                                                 const struct xdp_multiprog *mp);
    void xdp_multiprog__close(struct xdp_multiprog *mp);
    int xdp_multiprog__detach(struct xdp_multiprog *mp, int ifindex);
    enum xdp_attach_mode xdp_multiprog__attach_mode(const struct xdp_multiprog *mp);
    struct xdp_program *xdp_multiprog__main_prog(const struct xdp_multiprog *mp);
    struct xdp_program *xdp_multiprog__hw_prog(const struct xdp_multiprog *mp);
    bool xdp_multiprog__is_legacy(const struct xdp_multiprog *mp);

If a non-offloaded program is attached to the interface which libxdp
doesn't recognise as a dispatcher program, an xdp_multiprog structure
will still be returned, and xdp_multiprog__is_legacy() will return true
for that program (note that this also holds true if only an offloaded
program is loaded). A reference to that (regular) XDP program can be
obtained by xdp_multiprog__main_prog(). If the program attached to the
interface is a dispatcher program, xdp_multiprog__main_prog() will
return a reference to the dispatcher program itself, which is mainly
useful for obtaining other data about that program (such as the program
ID). A reference to an offloaded program can be acquired using
xdp_multiprog__hw_prog(). The function xdp_multiprog__attach_mode()
returns the attach mode of the non-offloaded program; whether an
offloaded program is attached should be checked through
xdp_multiprog__hw_prog().

Pinning in bpffs
The kernel will automatically detach component programs from the
dispatcher once the last reference to them disappears. To prevent this
from happening, libxdp will pin the component program references in
bpffs before attaching the dispatcher to the network interface. The
pathnames generated for pinning are as follows:

- /sys/fs/bpf/xdp/dispatch-IFINDEX-DID - dispatcher program for
  IFINDEX with BPF program ID DID

- /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-prog - component program
  0, program reference

- /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog0-link - component program
  0, bpf_link reference

- /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-prog - component program
  1, program reference

- /sys/fs/bpf/xdp/dispatch-IFINDEX-DID/prog1-link - component program
  1, bpf_link reference

- etc., up to ten component programs

If set, the LIBXDP_BPFFS environment variable will override the
location of bpffs, but the xdp subdirectory is always used.

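For tools that want to locate these pins, the naming scheme above can
be expressed as a format string. The sketch below is illustrative only:
component_pin_path() is a hypothetical helper rather than a libxdp
function, and it assumes /sys/fs/bpf as the bpffs location when no
override is given.

```c
#include <stdio.h>

/* Hypothetical helper, not a libxdp function: format the bpffs pin
 * path of one reference to a component program, following the layout
 * listed above. bpffs is the bpffs location (NULL or "" selects the
 * assumed default /sys/fs/bpf); kind is "prog" or "link". */
static int component_pin_path(char *buf, size_t len, const char *bpffs,
                              int ifindex, unsigned int dispatcher_id,
                              unsigned int idx, const char *kind)
{
    int n;

    if (!bpffs || !*bpffs)
        bpffs = "/sys/fs/bpf";

    /* The xdp subdirectory is always used, whatever the bpffs root. */
    n = snprintf(buf, len, "%s/xdp/dispatch-%d-%u/prog%u-%s",
                 bpffs, ifindex, dispatcher_id, idx, kind);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Passing the value of the LIBXDP_BPFFS environment variable as the bpffs
argument mirrors the override described above.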
Libxdp implements helper functions for configuring AF_XDP sockets as
well as reading and writing packets from these sockets. AF_XDP sockets
can be used to redirect packets to user-space at high rates from an XDP
program. Note that this functionality used to reside in libbpf, but has
now been moved over to libxdp as it is a better fit for this library.
As of the 1.0 release of libbpf, the AF_XDP socket support will be
removed and all future development will be performed in libxdp instead.

For an overview of AF_XDP sockets, please refer to this Linux Plumbers
paper
(http://vger.kernel.org/lpc_net2018_talks/lpc18_pres_af_xdp_perf-v3.pdf)
and the documentation in the Linux kernel
(Documentation/networking/af_xdp.rst or
https://www.kernel.org/doc/Documentation/networking/af_xdp.rst).

For an example on how to use the interface, take a look at the sample
application in the Linux kernel source tree at
samples/bpf/xdpsock_user.c.

Control path
Libxdp provides helper functions for creating and destroying umems and
sockets as shown below. The first thing that a user generally wants to
do is to create a umem area. This is the area that will contain all
packets received and the ones that are going to be sent. After that,
AF_XDP sockets can be created tied to this umem. These can either be
sockets that have exclusive ownership of that umem through
xsk_socket__create() or shared with other sockets using
xsk_socket__create_shared(). There is one option called
XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD that can be set in the libxdp_flags
field (also called libbpf_flags for compatibility reasons). This will
make libxdp not load any XDP program or set up any BPF maps, which is a
must if users want to add their own XDP program.

    int xsk_umem__create(struct xsk_umem **umem,
                         void *umem_area, __u64 size,
                         struct xsk_ring_prod *fill,
                         struct xsk_ring_cons *comp,
                         const struct xsk_umem_config *config);
    int xsk_socket__create(struct xsk_socket **xsk,
                           const char *ifname, __u32 queue_id,
                           struct xsk_umem *umem,
                           struct xsk_ring_cons *rx,
                           struct xsk_ring_prod *tx,
                           const struct xsk_socket_config *config);
    int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
                                  const char *ifname,
                                  __u32 queue_id, struct xsk_umem *umem,
                                  struct xsk_ring_cons *rx,
                                  struct xsk_ring_prod *tx,
                                  struct xsk_ring_prod *fill,
                                  struct xsk_ring_cons *comp,
                                  const struct xsk_socket_config *config);
    int xsk_umem__delete(struct xsk_umem *umem);
    void xsk_socket__delete(struct xsk_socket *xsk);

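The umem_area handed to xsk_umem__create() is ordinary memory that the
application allocates itself. As a rough sketch of that step, the code
below allocates an area and computes the umem-relative address of each
frame. FRAME_SIZE, NUM_FRAMES, alloc_umem_area() and frame_addr() are
hypothetical names chosen for this example; it assumes 4096-byte frames
that match the system page size, and that the area should start on a
page boundary.

```c
#include <stdint.h>
#include <stdlib.h>

#define FRAME_SIZE 4096u    /* assumed frame size, equal to the page size */
#define NUM_FRAMES 64u      /* assumed number of frames in the umem */

/* Allocate a page-aligned buffer that could be passed as umem_area.
 * The total size in bytes is stored in *size. Returns NULL on failure;
 * release the buffer with free() once the umem has been deleted. */
static void *alloc_umem_area(uint64_t *size)
{
    *size = (uint64_t)FRAME_SIZE * NUM_FRAMES;
    return aligned_alloc(FRAME_SIZE, (size_t)*size);
}

/* Address of frame i relative to the start of the umem area. In
 * aligned mode, values like this are what the rings carry. */
static uint64_t frame_addr(uint32_t i)
{
    return (uint64_t)i * FRAME_SIZE;
}
```

The returned pointer and size would then be supplied as the umem_area
and size arguments of xsk_umem__create().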
There are also two helper functions to get the file descriptor of a
umem or a socket. These are needed when using standard Linux syscalls
such as poll(), recvmsg(), sendto(), etc.

    int xsk_umem__fd(const struct xsk_umem *umem);
    int xsk_socket__fd(const struct xsk_socket *xsk);

The control path also provides two APIs for setting up AF_XDP sockets
when the process that is going to use the AF_XDP socket is
non-privileged. These two functions perform the operations that require
privileges and can be executed from some form of control process that
has the necessary privileges. The xsk_socket__create() call executed in
the non-privileged process will then skip these two steps. For an
example on how to use these, please take a look at
samples/bpf/xdpsock_user.c and samples/bpf/xdpsock_ctrl_proc.c in the
Linux kernel source tree.

    int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
    int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);

Data path
For performance reasons, all the data path functions are static inline
functions found in the xsk.h header file so they can be optimized into
the target application binary for best possible performance. There are
four FIFO rings of two main types: producer rings (fill and Tx) and
consumer rings (Rx and completion). The producer rings use
xsk_ring_prod functions and consumer rings use xsk_ring_cons functions.
For producer rings, you start with reserving one or more slots in a
producer ring and then when they have been filled out, you submit them
so that the kernel will act on them. For a consumer ring, you peek to
see if there are any new packets in the ring and if so you can read
them from the ring. Once you are done reading them, you release them
back to the kernel so it can use them for new packets. There is also a
cancel operation for consumer rings if the application does not want to
consume all packets received with the peek operation.

    __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx);
    void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb);
    __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx);
    void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb);
    void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb);

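The reserve/submit and peek/release pattern can be illustrated with a
small single-threaded model. This is not the code from xsk.h: the real
rings are shared with the kernel, use cached indices and memory
barriers, and have a cancel operation that is omitted here. struct ring
and the ring_*() functions are hypothetical names for this sketch, and
the model assumes at most one outstanding reservation at a time.

```c
#include <stdint.h>

#define RING_SIZE 8u    /* must be a power of two */

/* Simplified model of one ring holding 64-bit entries. */
struct ring {
    uint32_t producer;              /* count of entries submitted */
    uint32_t consumer;              /* count of entries released */
    uint64_t slots[RING_SIZE];
};

/* Producer: reserve nb slots. All-or-nothing, returns nb or 0. */
static uint32_t ring_reserve(struct ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t free_slots = RING_SIZE - (r->producer - r->consumer);

    if (free_slots < nb)
        return 0;
    *idx = r->producer;
    return nb;
}

/* Producer: make nb filled slots visible to the consumer. */
static void ring_submit(struct ring *r, uint32_t nb)
{
    r->producer += nb;
}

/* Consumer: see how many entries (up to nb) are ready to be read. */
static uint32_t ring_peek(struct ring *r, uint32_t nb, uint32_t *idx)
{
    uint32_t avail = r->producer - r->consumer;

    if (nb > avail)
        nb = avail;
    *idx = r->consumer;
    return nb;
}

/* Consumer: hand nb entries back so their slots can be reused. */
static void ring_release(struct ring *r, uint32_t nb)
{
    r->consumer += nb;
}

/* Access the slot for a free-running index; advance with idx++. */
static uint64_t *ring_slot(struct ring *r, uint32_t idx)
{
    return &r->slots[idx & (RING_SIZE - 1)];
}
```

The indices run freely and are masked only when a slot is accessed,
which is why advancing with idx++ works across wrap-around.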
The functions below are used for reading and writing the descriptors of
the rings. xsk_ring_prod__fill_addr() and xsk_ring_prod__tx_desc()
write entries in the fill and Tx rings respectively, while
xsk_ring_cons__comp_addr() and xsk_ring_cons__rx_desc() read entries
from the completion and Rx rings respectively. The idx is the parameter
returned in the xsk_ring_prod__reserve() or xsk_ring_cons__peek()
calls. To advance to the next entry, simply do idx++.

    __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill, __u32 idx);
    struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx, __u32 idx);
    const __u64 *xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp,
                                          __u32 idx);
    const struct xdp_desc *xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx,
                                                  __u32 idx);

The xsk_umem functions are used to get a pointer to the packet data
itself, always located inside the umem. In the default aligned mode,
you can get the addr variable straight from the Rx descriptor. But in
unaligned mode, you need to use the last three functions below, as the
offset used is carried in the upper 16 bits of the addr. Therefore, you
cannot use the addr straight from the descriptor in the unaligned case.

    void *xsk_umem__get_data(void *umem_area, __u64 addr);
    __u64 xsk_umem__extract_addr(__u64 addr);
    __u64 xsk_umem__extract_offset(__u64 addr);
    __u64 xsk_umem__add_offset_to_addr(__u64 addr);

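To make the unaligned-mode encoding concrete, here is a stand-alone
sketch of the bit manipulation involved. The function and macro names
are local to this example (the real helpers are the xsk_umem__*()
functions listed above); it assumes only what the text states, namely
that the offset occupies the upper 16 bits of the 64-bit addr and the
base address the lower 48.

```c
#include <stdint.h>

#define OFFSET_SHIFT 48     /* offset is carried in the upper 16 bits */
#define ADDR_MASK ((UINT64_C(1) << OFFSET_SHIFT) - 1)

/* Base address of the buffer within the umem (lower 48 bits). */
static uint64_t extract_addr(uint64_t addr)
{
    return addr & ADDR_MASK;
}

/* Offset of the packet data from that base (upper 16 bits). */
static uint64_t extract_offset(uint64_t addr)
{
    return addr >> OFFSET_SHIFT;
}

/* Umem-relative address where the packet data actually starts. */
static uint64_t add_offset_to_addr(uint64_t addr)
{
    return extract_addr(addr) + extract_offset(addr);
}

/* Pointer to the data, given the start of the umem area and a
 * umem-relative address that already includes the offset. */
static void *get_data(void *umem_area, uint64_t addr)
{
    return (char *)umem_area + addr;
}
```

In unaligned mode the descriptor addr would first go through
add_offset_to_addr() before being turned into a pointer with
get_data().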
There is one more function in the data path, which checks if the
need_wakeup flag is set. Use of this flag is highly encouraged and
should be enabled by setting the XDP_USE_NEED_WAKEUP bit in the
xdp_bind_flags field that is provided to the
xsk_socket__create[_shared]() calls. If this function returns true,
then you need to call recvmsg(), sendto(), or poll() depending on the
situation: use recvmsg() if you are receiving, or sendto() if you are
sending. poll() can be used for both cases and provides the ability to
sleep too, as with any other socket. But note that poll() is a slower
operation than the other two.

    int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r);

For an example on how to use all these APIs, take a look at the sample
applications in the Linux kernel source tree at
samples/bpf/xdpsock_user.c and samples/bpf/xsk_fwd.c.

Please report any bugs on GitHub:
https://github.com/xdp-project/xdp-tools/issues

libxdp and this man page were written by Toke Høiland-Jørgensen. AF_XDP
support and documentation were contributed by Magnus Karlsson.

v1.2.0                          July 24, 2021                          libxdp(3)