BPF-HELPERS(7)                                                  BPF-HELPERS(7)

NAME

       BPF-HELPERS - list of eBPF helper functions

DESCRIPTION

       The extended Berkeley Packet Filter (eBPF) subsystem consists of
       programs written in a pseudo-assembly language, then attached to one of
       several kernel hooks and run in reaction to specific events. This
       framework differs from the older, "classic" BPF (or "cBPF") in several
       aspects, one of them being the ability to call special functions (or
       "helpers") from within a program. These functions are restricted to a
       white-list of helpers defined in the kernel.
16
       These helpers are used by eBPF programs to interact with the system, or
       with the context in which they work. For instance, they can be used to
       print debugging messages, to get the time since the system was booted,
       to interact with eBPF maps, or to manipulate network packets. Since
       there are several eBPF program types, and they do not run in the same
       context, each program type can only call a subset of those helpers.
24
       Due to eBPF conventions, a helper cannot have more than five arguments.
27
       Internally, eBPF programs call directly into the compiled helper
       functions without requiring any foreign-function interface. As a
       result, calling helpers introduces no additional overhead, thus
       offering excellent performance.
32
       This document is an attempt to list and document the helpers available
       to eBPF developers. They are listed in chronological order (the oldest
       helpers in the kernel at the top).
36

HELPERS

38       void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
39
40              Description
41                     Perform a lookup in map for an entry associated to key.
42
43              Return Map  value  associated  to  key,  or NULL if no entry was
44                     found.
45
46       int bpf_map_update_elem(struct bpf_map *map,  const  void  *key,  const
47       void *value, u64 flags)
48
49              Description
50                     Add or update the value of the entry associated to key in
51                     map with value. flags is one of:
52
53                     BPF_NOEXIST
54                            The entry for key must not exist in the map.
55
56                     BPF_EXIST
57                            The entry for key must already exist in the map.
58
59                     BPF_ANY
60                            No condition on the existence  of  the  entry  for
61                            key.
62
                     Flag value BPF_NOEXIST cannot be used for maps of types
                     BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_PERCPU_ARRAY (all
                     elements always exist); the helper would return an error.
66
67              Return 0 on success, or a negative error in case of failure.
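
                     The three flag values can be illustrated with a small
                     user-space model. This is only a sketch: toy_slot and
                     toy_update() are hypothetical names, the numeric flag
                     values are assumed to match those in the kernel header
                     linux/bpf.h, and the specific error codes are an
                     assumption about hash-map behaviour (the text above only
                     promises "a negative error").

```c
#include <errno.h>
#include <stdint.h>

/* Flag values, assumed to mirror the kernel header linux/bpf.h. */
#define BPF_ANY     0 /* create a new entry or update an existing one */
#define BPF_NOEXIST 1 /* create a new entry only if none exists */
#define BPF_EXIST   2 /* update an existing entry only */

/* Hypothetical one-entry "map", used only to model the flag checks. */
struct toy_slot {
	int present;
	uint64_t value;
};

/* Model of the existence check made before the value is written. */
static int toy_update(struct toy_slot *slot, uint64_t value, uint64_t flags)
{
	if (flags > BPF_EXIST)
		return -EINVAL;   /* flag not handled by this model */
	if (flags == BPF_NOEXIST && slot->present)
		return -EEXIST;   /* the entry must not exist yet */
	if (flags == BPF_EXIST && !slot->present)
		return -ENOENT;   /* the entry must already exist */

	slot->present = 1;
	slot->value = value;
	return 0;
}
```

                     In this model, an update with BPF_EXIST on an empty slot
                     fails, a first update with BPF_NOEXIST succeeds, and a
                     second one fails because the entry is now present.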
68
69       int bpf_map_delete_elem(struct bpf_map *map, const void *key)
70
71              Description
72                     Delete entry with key from map.
73
74              Return 0 on success, or a negative error in case of failure.
75
76       int bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
77
78              Description
79                     For  tracing  programs, safely attempt to read size bytes
80                     from kernel space address unsafe_ptr and store  the  data
81                     in dst.
82
83                     Generally,       use       bpf_probe_read_user()       or
84                     bpf_probe_read_kernel() instead.
85
86              Return 0 on success, or a negative error in case of failure.
87
88       u64 bpf_ktime_get_ns(void)
89
90              Description
91                     Return the time elapsed since system  boot,  in  nanosec‐
92                     onds.   Does  not  include time the system was suspended.
93                     See: clock_gettime(CLOCK_MONOTONIC)
94
95              Return Current ktime.
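
                     For comparison, the user-space clock referenced above
                     can be read as follows. This is a sketch of the
                     user-space side only; monotonic_ns() is a hypothetical
                     name, and returning 0 on failure is a choice made for
                     this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() in strict -std= modes */
#endif
#include <stdint.h>
#include <time.h>

/* Read CLOCK_MONOTONIC and convert the result to nanoseconds. */
static uint64_t monotonic_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0; /* this sketch reports failure as 0 */

	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

                     Two successive calls never go backwards, which is the
                     property that makes such a clock suitable for measuring
                     durations.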
96
97       int bpf_trace_printk(const char *fmt, u32 fmt_size, ...)
98
99              Description
                     This helper is a "printk()-like" facility for debugging.
                     It prints a message defined by format fmt (of size
                     fmt_size) to file /sys/kernel/debug/tracing/trace from
                     DebugFS, if available. It can take up to three additional
                     u64 arguments (as with any eBPF helper, the total number
                     of arguments is limited to five).
106
                     Each time the helper is called, it appends a line to the
                     trace. Lines are discarded while
                     /sys/kernel/debug/tracing/trace is open; use
                     /sys/kernel/debug/tracing/trace_pipe to avoid this. The
                     format of the trace is customizable, and the exact output
                     one will get depends on the options set in
                     /sys/kernel/debug/tracing/trace_options (see also the
                     README file under the same directory). However, it
                     usually defaults to something like:
116
                        telnet-470   [001] .N.. 419421.045894: 0x00000001: <formatted msg>
118
119                     In the above:
120
121                        · telnet is the name of the current task.
122
123                        · 470 is the PID of the current task.
124
125                        · 001 is the CPU number on which the task is running.
126
127                        · In  .N..,  each character refers to a set of options
128                          (whether  irqs  are  enabled,  scheduling   options,
129                          whether  hard/softirqs  are  running,  level of pre‐
130                          empt_disabled   respectively).    N    means    that
131                          TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED are set.
132
133                        · 419421.045894 is a timestamp.
134
135                        · 0x00000001  is  a  fake  value  used  by BPF for the
136                          instruction pointer register.
137
138                        · <formatted msg> is the message formatted with fmt.
139
                     The conversion specifiers supported by fmt are similar
                     to, but more limited than, those of printk(). They are
                     %d, %i, %u, %x, %ld, %li, %lu, %lx, %lld, %lli, %llu,
                     %llx, %p, %s. No modifier (size of field, padding with
                     zeroes, etc.) is available, and the helper will return
                     -EINVAL (but print nothing) if it encounters an unknown
                     specifier.
146
                     Also, note that bpf_trace_printk() is slow, and should
                     only be used for debugging purposes. For this reason, a
                     notice block (spanning several lines) is printed to the
                     kernel logs the first time this helper is used (or more
                     precisely, when trace_printk() buffers are allocated),
                     stating that the helper should not be used "for
                     production use". For passing values to user space, perf
                     events should be preferred.
155
156              Return The number of bytes written to the buffer, or a  negative
157                     error in case of failure.
158
159       u32 bpf_get_prandom_u32(void)
160
161              Description
162                     Get a pseudo-random number.
163
164                     From  a  security point of view, this helper uses its own
165                     pseudo-random internal state, and cannot be used to infer
166                     the  seed  of  other random functions in the kernel. How‐
167                     ever, it is essential to note that the generator used  by
168                     the helper is not cryptographically secure.
169
170              Return A random 32-bit unsigned value.
171
172       u32 bpf_get_smp_processor_id(void)
173
174              Description
                     Get the SMP (symmetric multiprocessing) processor id.
                     Note that all programs run with preemption disabled,
                     which means that the SMP processor id is stable
                     throughout the execution of the program.
179
180              Return The SMP id of the processor running the program.
181
182       int bpf_skb_store_bytes(struct sk_buff *skb,  u32  offset,  const  void
183       *from, u32 len, u64 flags)
184
185              Description
186                     Store len bytes from address from into the packet associ‐
187                     ated to skb,  at  offset.  flags  are  a  combination  of
188                     BPF_F_RECOMPUTE_CSUM  (automatically recompute the check‐
189                     sum  for  the  packet  after  storing  the   bytes)   and
190                     BPF_F_INVALIDATE_HASH  (set  skb->hash,  skb->swhash  and
191                     skb->l4hash to 0).
192
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again, if the helper is used in combination
                     with direct packet access.
198
199              Return 0 on success, or a negative error in case of failure.
200
201       int bpf_l3_csum_replace(struct sk_buff *skb, u32 offset, u64 from,  u64
202       to, u64 size)
203
204              Description
205                     Recompute  the  layer 3 (e.g. IP) checksum for the packet
206                     associated to skb. Computation  is  incremental,  so  the
207                     helper  must  know  the  former value of the header field
208                     that was modified (from), the new  value  of  this  field
209                     (to),  and  the  number of bytes (2 or 4) for this field,
210                     stored in size.  Alternatively, it is possible  to  store
211                     the difference between the previous and the new values of
212                     the header field in to, by setting from and  size  to  0.
213                     For both methods, offset indicates the location of the IP
214                     checksum within the packet.
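
                     The incremental computation described above can be
                     illustrated in user space. The sketch below follows the
                     update rule of RFC 1624 for a 2-byte field and checks it
                     against a full recomputation. It is not the kernel
                     implementation: fold16(), csum_full() and
                     csum_replace2() are hypothetical names, and the header
                     is handled as an array of 16-bit words in host order for
                     simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit sum into 16 bits, adding the carries back in. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit words. */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~fold16(sum);
}

/* Incremental update, RFC 1624 eq. 3: HC' = ~(~HC + ~m + m'). */
static uint16_t csum_replace2(uint16_t check, uint16_t from, uint16_t to)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~from;
	sum += to;
	return (uint16_t)~fold16(sum);
}
```

                     Given the old checksum, the old field value and the new
                     field value, csum_replace2() yields the same result as
                     running csum_full() again over the modified header,
                     without touching the other words.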
215
216                     This helper works in  combination  with  bpf_csum_diff(),
217                     which  does  not update the checksum in-place, but offers
218                     more flexibility and can handle sizes larger than 2 or  4
219                     for the checksum to update.
220
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again, if the helper is used in combination
                     with direct packet access.
226
227              Return 0 on success, or a negative error in case of failure.
228
229       int  bpf_l4_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64
230       to, u64 flags)
231
232              Description
                     Recompute the layer 4 (e.g. TCP, UDP or ICMP) checksum
                     for the packet associated to skb. Computation is
                     incremental, so the helper must know the former value of
                     the header field that was modified (from), the new value
                     of this field (to), and the number of bytes (2 or 4) for
                     this field, stored in the lowest four bits of flags.
                     Alternatively, it is possible to store the difference
                     between the previous and the new values of the header
                     field in to, by setting from and the four lowest bits of
                     flags to 0. For both methods, offset indicates the
                     location of the layer 4 checksum within the packet. In
                     addition to the size of the field, actual flags can be
                     added to flags (bitwise OR). With BPF_F_MARK_MANGLED_0, a
                     null checksum is left untouched (unless
                     BPF_F_MARK_ENFORCE is added as well), and for updates
                     resulting in a null checksum the value is set to
                     CSUM_MANGLED_0 instead. Flag BPF_F_PSEUDO_HDR indicates
                     the checksum is to be computed against a pseudo-header.
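
                     The way the field size and the actual flags share the
                     flags argument can be sketched as follows. The numeric
                     values are assumed to mirror the kernel header
                     linux/bpf.h, and l4_csum_flags() is a hypothetical
                     helper name used only for this illustration.

```c
#include <stdint.h>

/* Values assumed to mirror the kernel header linux/bpf.h. */
#define BPF_F_HDR_FIELD_MASK 0xfULL        /* lowest four bits: field size */
#define BPF_F_PSEUDO_HDR     (1ULL << 4)
#define BPF_F_MARK_MANGLED_0 (1ULL << 5)
#define BPF_F_MARK_ENFORCE   (1ULL << 6)

/* Combine the size of the modified field (2 or 4) with actual flags. */
static uint64_t l4_csum_flags(uint64_t field_size, uint64_t extra_flags)
{
	return (field_size & BPF_F_HDR_FIELD_MASK) | extra_flags;
}
```

                     For instance, a 2-byte field covered by a pseudo-header
                     would be passed as l4_csum_flags(2, BPF_F_PSEUDO_HDR),
                     and the size can be recovered by masking the result with
                     BPF_F_HDR_FIELD_MASK.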
251
252                     This  helper  works  in combination with bpf_csum_diff(),
253                     which does not update the checksum in-place,  but  offers
254                     more  flexibility and can handle sizes larger than 2 or 4
255                     for the checksum to update.
256
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again, if the helper is used in combination
                     with direct packet access.
262
263              Return 0 on success, or a negative error in case of failure.
264
265       int bpf_tail_call(void *ctx, struct bpf_map *prog_array_map, u32 index)
266
267              Description
268                     This special helper is used to trigger a "tail call",  or
269                     in  other  words,  to jump into another eBPF program. The
270                     same stack frame is used (but values on stack and in reg‐
271                     isters  for the caller are not accessible to the callee).
272                     This mechanism allows for program  chaining,  either  for
273                     raising  the  maximum  number  of available eBPF instruc‐
274                     tions,  or  to  execute  given  programs  in  conditional
275                     blocks.  For security reasons, there is an upper limit to
276                     the number of successive tail  calls  that  can  be  per‐
277                     formed.
278
279                     Upon  call  of  this helper, the program attempts to jump
280                     into   a   program   referenced   at   index   index   in
281                     prog_array_map,     a     special     map     of     type
282                     BPF_MAP_TYPE_PROG_ARRAY, and passes ctx, a pointer to the
283                     context.
284
                     If the call succeeds, the kernel immediately runs the
                     first instruction of the new program. This is not a
                     function call, and it never returns to the previous
                     program. If the call fails, then the helper has no
                     effect, and the caller continues to run its subsequent
                     instructions. A call can fail if the destination program
                     for the jump does not exist (e.g. index is greater than
                     or equal to the number of entries in prog_array_map), or
                     if the maximum number of tail calls has been reached for
                     this chain of programs. This limit is defined in the
                     kernel by the macro MAX_TAIL_CALL_CNT (not accessible to
                     user space), which is currently set to 32.
297
298              Return 0 on success, or a negative error in case of failure.
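
                     The control flow described above (jump on success, fall
                     through on failure, bounded chain length) can be
                     modelled in user space. The sketch below is a toy model
                     only: every toy_* name is hypothetical, a "program" is
                     an ordinary C function that requests a jump through an
                     output parameter, and the kernel's exact accounting of
                     the limit is not modelled.

```c
#include <stddef.h>

#define MAX_TAIL_CALL_CNT 32 /* limit quoted in the description above */

/* Hypothetical per-chain state for this model. */
struct toy_ctx {
	unsigned int tail_calls; /* successful jumps in this chain */
	unsigned int runs;       /* programs executed, counted for the demo */
};

/*
 * A toy program sets *next to the slot it wants to jump to (or leaves it
 * at -1) and returns the value it would return if no jump takes place.
 */
typedef int (*toy_prog_fn)(struct toy_ctx *ctx, int *next);

static int toy_leaf(struct toy_ctx *ctx, int *next)
{
	(void)ctx; (void)next;
	return 42;               /* no tail call requested */
}

static int toy_to_leaf(struct toy_ctx *ctx, int *next)
{
	(void)ctx;
	*next = 2;               /* slot expected to hold toy_leaf */
	return 7;                /* only used if the jump fails */
}

static int toy_to_missing(struct toy_ctx *ctx, int *next)
{
	(void)ctx;
	*next = 1;               /* slot expected to be empty */
	return 7;
}

static int toy_loop(struct toy_ctx *ctx, int *next)
{
	(void)ctx;
	*next = 0;               /* slot expected to hold toy_loop itself */
	return 100;
}

/* Run a chain: follow jumps until one is not requested or fails. */
static int toy_run(toy_prog_fn *prog_array, size_t n, toy_prog_fn entry,
		   struct toy_ctx *ctx)
{
	toy_prog_fn cur = entry;

	for (;;) {
		int next = -1;
		int ret = cur(ctx, &next);

		ctx->runs++;
		if (next < 0)
			return ret;  /* plain return, no jump requested */
		if ((size_t)next >= n || !prog_array[next])
			return ret;  /* no such program: caller falls through */
		if (ctx->tail_calls >= MAX_TAIL_CALL_CNT)
			return ret;  /* chain limit reached: caller falls through */
		ctx->tail_calls++;
		cur = prog_array[next]; /* jump; control never comes back */
	}
}
```

                     With an array holding toy_loop in slot 0 and toy_leaf in
                     slot 2, toy_to_leaf ends with the value of toy_leaf,
                     toy_to_missing keeps its own return value, and toy_loop
                     stops once the limit on successive jumps is reached.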
299
300       int bpf_clone_redirect(struct sk_buff *skb, u32 ifindex, u64 flags)
301
302              Description
303                     Clone and  redirect  the  packet  associated  to  skb  to
304                     another  net  device  of  index ifindex. Both ingress and
305                     egress  interfaces  can  be  used  for  redirection.  The
306                     BPF_F_INGRESS value in flags is used to make the distinc‐
307                     tion (ingress path is selected if the  flag  is  present,
308                     egress  path otherwise).  This is the only flag supported
309                     for now.
310
                     In comparison with the bpf_redirect() helper,
                     bpf_clone_redirect() has the associated cost of
                     duplicating the packet buffer, but the redirection is
                     performed from within the eBPF program. Conversely,
                     bpf_redirect() is more efficient, but it is handled
                     through an action code where the redirection happens only
                     after the eBPF program has returned.
317
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again, if the helper is used in combination
                     with direct packet access.
323
324              Return 0 on success, or a negative error in case of failure.
325
326       u64 bpf_get_current_pid_tgid(void)
327
328              Return A 64-bit integer containing the current tgid and pid, and
329                     created  as  such:  current_task->tgid  <<  32   |   cur‐
330                     rent_task->pid.
331
332       u64 bpf_get_current_uid_gid(void)
333
334              Return A  64-bit integer containing the current GID and UID, and
335                     created as such: current_gid << 32 | current_uid.
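
              The return values of bpf_get_current_pid_tgid() and
              bpf_get_current_uid_gid() use the same layout: one 32-bit value
              in the upper half and one in the lower half. A minimal
              user-space sketch of that packing follows; pack_hi_lo(),
              upper32() and lower32() are hypothetical names.

```c
#include <stdint.h>

/* Pack two 32-bit values as "hi << 32 | lo". */
static uint64_t pack_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Upper half: the tgid (or the GID). */
static uint32_t upper32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Lower half: the pid (or the UID). */
static uint32_t lower32(uint64_t v)
{
	return (uint32_t)v;
}
```

              In an eBPF program, the same shift and truncation would be
              applied to the value returned by the helper to obtain each
              half.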
336
337       int bpf_get_current_comm(void *buf, u32 size_of_buf)
338
339              Description
340                     Copy the comm attribute of the current task into  buf  of
341                     size_of_buf.  The comm attribute contains the name of the
342                     executable (excluding the path) for the current task. The
343                     size_of_buf  must  be  strictly positive. On success, the
344                     helper makes sure that  the  buf  is  NUL-terminated.  On
345                     failure, it is filled with zeroes.
346
347              Return 0 on success, or a negative error in case of failure.
348
349       u32 bpf_get_cgroup_classid(struct sk_buff *skb)
350
351              Description
352                     Retrieve  the  classid for the current task, i.e. for the
353                     net_cls cgroup to which skb belongs.
354
355                     This helper can be used on TC egress  path,  but  not  on
356                     ingress.
357
358                     The  net_cls  cgroup provides an interface to tag network
359                     packets based on a user-provided identifier for all traf‐
360                     fic  coming  from  the  tasks  belonging  to  the related
361                     cgroup. See also the related kernel documentation, avail‐
362                     able   from   the   Linux   sources  in  file  Documenta‐
363                     tion/admin-guide/cgroup-v1/net_cls.rst.
364
365                     The Linux kernel has two versions for cgroups: there  are
366                     cgroups  v1  and cgroups v2. Both are available to users,
367                     who can use a mixture of them, but note that the  net_cls
368                     cgroup  is for cgroup v1 only. This makes it incompatible
369                     with  BPF  programs  run   on   cgroups,   which   is   a
370                     cgroup-v2-only  feature  (a socket can only hold data for
371                     one version of cgroups at a time).
372
                     This helper is only available if the kernel was compiled
                     with the CONFIG_CGROUP_NET_CLASSID configuration option
                     set to "y" or to "m".
376
377              Return The classid, or 0 for the default unconfigured classid.
378
379       int  bpf_skb_vlan_push(struct  sk_buff  *skb,  __be16  vlan_proto,  u16
380       vlan_tci)
381
382              Description
383                     Push  a vlan_tci (VLAN tag control information) of proto‐
384                     col vlan_proto to the  packet  associated  to  skb,  then
385                     update the checksum. Note that if vlan_proto is different
386                     from ETH_P_8021Q and ETH_P_8021AD, it is considered to be
387                     ETH_P_8021Q.
388
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again, if the helper is used in combination
                     with direct packet access.
394
395              Return 0 on success, or a negative error in case of failure.
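
              The vlan_tci argument carries the 16-bit tag control
              information. Assuming the usual IEEE 802.1Q layout (3-bit
              priority, 1-bit drop eligible indicator, 12-bit VLAN
              identifier), which the text above does not spell out, the value
              could be composed as in this sketch. vlan_tci_make() is a
              hypothetical name.

```c
#include <stdint.h>

/* Compose a TCI from priority (PCP), drop eligible bit (DEI) and VLAN id. */
static uint16_t vlan_tci_make(uint8_t pcp, uint8_t dei, uint16_t vid)
{
	return (uint16_t)(((pcp & 0x7u) << 13) |
			  ((dei & 0x1u) << 12) |
			  (vid & 0x0fffu));
}
```

              For example, VLAN 100 with priority 5 gives 0xa064, while VLAN
              100 with default priority is simply 100.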
396
397       int bpf_skb_vlan_pop(struct sk_buff *skb)
398
399              Description
400                     Pop a VLAN header from the packet associated to skb.
401
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again, if the helper is used in combination
                     with direct packet access.
407
408              Return 0 on success, or a negative error in case of failure.
409
410       int  bpf_skb_get_tunnel_key(struct  sk_buff *skb, struct bpf_tunnel_key
411       *key, u32 size, u64 flags)
412
413              Description
414                     Get tunnel metadata. This helper takes a pointer  key  to
415                     an  empty  struct  bpf_tunnel_key  of  size, that will be
416                     filled with tunnel metadata for the packet associated  to
417                     skb.   The  flags can be set to BPF_F_TUNINFO_IPV6, which
418                     indicates that the  tunnel  is  based  on  IPv6  protocol
419                     instead of IPv4.
420
421                     The  struct  bpf_tunnel_key is an object that generalizes
422                     the principal parameters used by various tunneling proto‐
423                     cols  into  a  single struct. This way, it can be used to
424                     easily make a decision  based  on  the  contents  of  the
425                     encapsulation  header,  "summarized"  in  this struct. In
426                     particular, it holds the IP address  of  the  remote  end
427                     (IPv4 or IPv6, depending on the case) in key->remote_ipv4
428                     or  key->remote_ipv6.  Also,  this  struct  exposes   the
429                     key->tunnel_id,  which is generally mapped to a VNI (Vir‐
430                     tual Network Identifier), making it programmable together
431                     with the bpf_skb_set_tunnel_key() helper.
432
433                     Let's  imagine  that the following code is part of a pro‐
434                     gram attached to the TC ingress interface, on one end  of
435                     a  GRE tunnel, and is supposed to filter out all messages
436                     coming from remote ends  with  IPv4  address  other  than
437                     10.0.0.1:
438
                        int ret;
                        struct bpf_tunnel_key key = {};

                        ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0);
                        if (ret < 0)
                                return TC_ACT_SHOT;     // drop packet

                        if (key.remote_ipv4 != 0x0a000001)  // not 10.0.0.1
                                return TC_ACT_SHOT;     // drop packet

                        return TC_ACT_OK;               // accept packet
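
                     The constant 0x0a000001 used in the comparison above is
                     the address 10.0.0.1 with its first octet in the most
                     significant byte. A small user-space sketch for building
                     such constants follows; ipv4_pack() is a hypothetical
                     name.

```c
#include <stdint.h>

/* Pack a dotted-quad a.b.c.d with a in the most significant byte. */
static uint32_t ipv4_pack(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}
```

                     With this helper, ipv4_pack(10, 0, 0, 1) evaluates to
                     0x0a000001.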
450
451                     This  interface  can  also be used with all encapsulation
452                     devices that can  operate  in  "collect  metadata"  mode:
453                     instead of having one network device per specific config‐
454                     uration, the "collect metadata" mode only requires a sin‐
455                     gle  device where the configuration can be extracted from
456                     this helper.
457
                     This can be used together with various tunnels such as
                     VXLAN, Geneve, GRE or IP in IP (IPIP).
460
461              Return 0 on success, or a negative error in case of failure.
462
463       int  bpf_skb_set_tunnel_key(struct  sk_buff *skb, struct bpf_tunnel_key
464       *key, u32 size, u64 flags)
465
466              Description
467                     Populate tunnel metadata for packet  associated  to  skb.
468                     The  tunnel  metadata  is  set to the contents of key, of
469                     size. The flags can be set to a combination of  the  fol‐
470                     lowing values:
471
472                     BPF_F_TUNINFO_IPV6
473                            Indicate that the tunnel is based on IPv6 protocol
474                            instead of IPv4.
475
476                     BPF_F_ZERO_CSUM_TX
477                            For IPv4 packets, add a flag  to  tunnel  metadata
478                            indicating  that  checksum  computation  should be
479                            skipped and checksum set to zeroes.
480
481                     BPF_F_DONT_FRAGMENT
482                            Add a flag to tunnel metadata indicating that  the
483                            packet should not be fragmented.
484
485                     BPF_F_SEQ_NUMBER
486                            Add  a  flag  to tunnel metadata indicating that a
487                            sequence number should be added to  tunnel  header
488                            before sending the packet. This flag was added for
489                            GRE encapsulation, but might be  used  with  other
490                            protocols as well in the future.
491
492                     Here is a typical usage on the transmit path:
493
                        struct bpf_tunnel_key key;
                        /* populate key ... */
                        bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0);
                        bpf_clone_redirect(skb, vxlan_dev_ifindex, 0);
498
499                     See  also the description of the bpf_skb_get_tunnel_key()
500                     helper for additional information.
501
502              Return 0 on success, or a negative error in case of failure.
503
504       u64 bpf_perf_event_read(struct bpf_map *map, u64 flags)
505
506              Description
507                     Read the value of  a  perf  event  counter.  This  helper
508                     relies  on  a  map of type BPF_MAP_TYPE_PERF_EVENT_ARRAY.
509                     The nature of the perf event counter is selected when map
510                     is  updated  with perf event file descriptors. The map is
511                     an array whose size is the number of available CPUs,  and
512                     each cell contains a value relative to one CPU. The value
513                     to retrieve is indicated  by  flags,  that  contains  the
514                     index    of   the   CPU   to   look   up,   masked   with
515                     BPF_F_INDEX_MASK. Alternatively,  flags  can  be  set  to
516                     BPF_F_CURRENT_CPU to indicate that the value for the cur‐
517                     rent CPU should be retrieved.
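
                     The two ways of filling flags can be sketched as
                     follows. The numeric values are assumed to mirror the
                     kernel header linux/bpf.h, and perf_flags_for_cpu() is a
                     hypothetical name used only for this illustration.

```c
#include <stdint.h>

/* Values assumed to mirror the kernel header linux/bpf.h. */
#define BPF_F_INDEX_MASK  0xffffffffULL
#define BPF_F_CURRENT_CPU BPF_F_INDEX_MASK

/* Build flags selecting the map cell of one given CPU. */
static uint64_t perf_flags_for_cpu(uint32_t cpu)
{
	return (uint64_t)cpu & BPF_F_INDEX_MASK;
}
```

                     A program would pass either perf_flags_for_cpu(n) to
                     read the counter of CPU n, or BPF_F_CURRENT_CPU to read
                     the counter of the CPU it is running on.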
518
                     Note that before Linux 4.13, only hardware perf events
                     can be retrieved.
521
522                     Also,     be     aware     that    the    newer    helper
523                     bpf_perf_event_read_value()    is    recommended     over
524                     bpf_perf_event_read() in general. The latter has some ABI
525                     quirks where error and counter value are used as a return
526                     code  (which  is  wrong  to do since ranges may overlap).
527                     This issue  is  fixed  with  bpf_perf_event_read_value(),
528                     which  at  the  same time provides more features over the
529                     bpf_perf_event_read()  interface.  Please  refer  to  the
530                     description of bpf_perf_event_read_value() for details.
531
532              Return The value of the perf event counter read from the map, or
533                     a negative error code in case of failure.
534
535       int bpf_redirect(u32 ifindex, u64 flags)
536
537              Description
538                     Redirect the  packet  to  another  net  device  of  index
539                     ifindex.     This   helper   is   somewhat   similar   to
540                     bpf_clone_redirect(),  except  that  the  packet  is  not
541                     cloned, which provides increased performance.
542
543                     Except for XDP, both ingress and egress interfaces can be
544                     used for redirection. The BPF_F_INGRESS value in flags is
545                     used to make the distinction (ingress path is selected if
546                     the flag is present, egress path  otherwise).  Currently,
547                     XDP  only  supports  redirection to the egress interface,
548                     and accepts no flag at all.
549
550                     The same effect  can  also  be  attained  with  the  more
551                     generic bpf_redirect_map(), which uses a BPF map to store
552                     the redirect target instead of providing it  directly  to
553                     the helper.
554
555              Return For  XDP,  the  helper returns XDP_REDIRECT on success or
556                     XDP_ABORTED on error. For other program types, the values
557                     are TC_ACT_REDIRECT on success or TC_ACT_SHOT on error.
558
559       u32 bpf_get_route_realm(struct sk_buff *skb)
560
561              Description
                     Retrieve the realm of the route, that is to say the
                     tclassid field of the destination for the skb. The
                     identifier retrieved is a user-provided tag, similar to
                     the one used with the net_cls cgroup (see description for
                     bpf_get_cgroup_classid() helper), but here this tag is
                     held by a route (a destination entry), not by a task.
568
569                     Retrieving this  identifier  works  with  the  clsact  TC
570                     egress  hook  (see  also  tc-bpf(8)), or alternatively on
571                     conventional  classful  egress  qdiscs,  but  not  on  TC
572                     ingress  path. In case of clsact TC egress hook, this has
573                     the advantage that, internally, the destination entry has
574                     not been dropped yet in the transmit path. Therefore, the
575                     destination entry does not need to be  artificially  held
576                     via  netif_keep_dst()  for a classful qdisc until the skb
577                     is freed.
578
579                     This helper is available only if the kernel was  compiled
580                     with CONFIG_IP_ROUTE_CLASSID configuration option.
581
582              Return The  realm of the route for the packet associated to skb,
583                     or 0 if none was found.
584
585       int bpf_perf_event_output(void *ctx, struct bpf_map  *map,  u64  flags,
586       void *data, u64 size)
587
588              Description
589                     Write raw data blob into a special BPF perf event held by
590                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
591                     event must have the following attributes: PERF_SAMPLE_RAW
592                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
593                     PERF_COUNT_SW_BPF_OUTPUT as config.
594
595                     The flags are used to indicate the index in map for which
596                     the value must  be  put,  masked  with  BPF_F_INDEX_MASK.
597                     Alternatively,  flags  can be set to BPF_F_CURRENT_CPU to
598                     indicate that the index of the current CPU core should be
599                     used.
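
                        As a minimal user-space sketch of the two ways  to  build
                        flags  described  above:  the constant values are assumed
                        to mirror the kernel UAPI header linux/bpf.h, and the two
                        function  names are hypothetical, introduced only for il‐
                        lustration.

```c
#include <stdint.h>

/* Values assumed to mirror <linux/bpf.h>; check them against your headers. */
#define BPF_F_INDEX_MASK  0xffffffffULL
#define BPF_F_CURRENT_CPU BPF_F_INDEX_MASK

/* Hypothetical helper: flags selecting an explicit index in the map. */
static uint64_t perf_output_flags_for_index(uint32_t index)
{
    /* Only the bits covered by BPF_F_INDEX_MASK carry the index. */
    return (uint64_t)index & BPF_F_INDEX_MASK;
}

/* Hypothetical helper: flags selecting the slot of the current CPU. */
static uint64_t perf_output_flags_current_cpu(void)
{
    return BPF_F_CURRENT_CPU;
}
```

                        In an eBPF program, either value would be passed  as  the
                        flags argument of bpf_perf_event_output().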
600
601                     The value to write, of size, is passed through eBPF stack
602                     and pointed by data.
603
604                     The context of the program, ctx, also needs to be  passed
605                     to the helper.
606
607                     In user space, a program willing to read the values needs
608                     to call perf_event_open() on the perf event  (either  for
609                     one  or  for  all  CPUs) and to store the file descriptor
610                     into the map. This must be done before the  eBPF  program
611                     can  send  data  into it. An example is available in file
612                     samples/bpf/trace_output_user.c  in  the   Linux   kernel
613                     source  tree  (the  eBPF  program  counterpart is in sam‐
614                     ples/bpf/trace_output_kern.c).
615
616                     bpf_perf_event_output() achieves better performance  than
617                     bpf_trace_printk()  for sharing data with user space, and
618                     is much better suited for streaming data from  eBPF  pro‐
619                     grams.
620
621                     Note  that  this  helper is not restricted to tracing use
622                     cases and can be used with programs attached to TC or XDP
623                     as  well,  where it allows for passing data to user space
624                     listeners. Data can be:
625
626                     · Only custom structs,
627
628                     · Only the packet payload, or
629
630                     · A combination of both.
631
632              Return 0 on success, or a negative error in case of failure.
633
634       int bpf_skb_load_bytes(const void *skb, u32 offset, void *to, u32 len)
635
636              Description
637                     This helper was provided as an easy way to load data from
638                     a  packet.  It  can be used to load len bytes from offset
639                     from the  packet  associated  to  skb,  into  the  buffer
640                     pointed by to.
641
642                     Since  Linux  4.7,  usage  of this helper has mostly been
643                     replaced by "direct packet access", enabling packet  data
644                     to be manipulated with skb->data and skb->data_end point‐
645                     ing respectively to the first byte of packet data and  to
646                     the  byte after the last byte of packet data. However, it
647                     remains useful if one wishes to read large quantities  of
648                     data at once from a packet into the eBPF stack.
649
650              Return 0 on success, or a negative error in case of failure.
651
652       int bpf_get_stackid(void *ctx, struct bpf_map *map, u64 flags)
653
654              Description
655                     Walk  a  user  or  a  kernel  stack and return its id. To
656                     achieve this, the helper needs ctx, which is a pointer to
657                     the context on which the tracing program is executed, and
658                     a pointer to a map of type BPF_MAP_TYPE_STACK_TRACE.
659
660                     The last argument,  flags,  holds  the  number  of  stack
661                     frames   to   skip   (from   0   to   255),  masked  with
662                     BPF_F_SKIP_FIELD_MASK. The next bits can be used to set a
663                     combination of the following flags:
664
665                     BPF_F_USER_STACK
666                            Collect  a  user  space  stack instead of a kernel
667                            stack.
668
669                     BPF_F_FAST_STACK_CMP
670                            Compare stacks by hash only.
671
672                     BPF_F_REUSE_STACKID
673                            If  two  different  stacks  hash  into  the   same
674                            stackid, discard the old one.
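
                        The layout of flags described above can be sketched in  a
                        few  lines  of user-space C. The constant values are as‐
                        sumed to match linux/bpf.h, and stackid_flags() is a  hy‐
                        pothetical name used only for this illustration.

```c
#include <stdint.h>

/* Values assumed to mirror <linux/bpf.h>; check them against your headers. */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_FAST_STACK_CMP  (1ULL << 9)
#define BPF_F_REUSE_STACKID   (1ULL << 10)

/* Hypothetical helper: combine a frame skip count (0 to 255) with
 * a set of BPF_F_* option bits into one flags value. */
static uint64_t stackid_flags(unsigned int skip, uint64_t options)
{
    /* The skip count lives in the low 8 bits; anything larger is cut off. */
    return ((uint64_t)skip & BPF_F_SKIP_FIELD_MASK) | options;
}
```

                        For  example,  stackid_flags(2, BPF_F_USER_STACK) asks for
                        a user space stack with the two innermost frames skipped.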
675
676                     The  stack  id  retrieved  is  a  32-bit  integer  handle
677                     which can be further combined with other data  (including
678                     other stack ids) and used as a key into maps. This can be
679                     useful for generating a variety of graphs (such as  flame
680                     graphs or off-cpu graphs).
681
682                     For  walking  a stack, this helper is an improvement over
683                     bpf_probe_read(), which can be used with  unrolled  loops
684                     but  is not efficient and consumes a lot of eBPF instruc‐
685                     tions.  Instead,  bpf_get_stackid()  can  collect  up  to
686                     PERF_MAX_STACK_DEPTH  kernel  and   user   frames.   Note
687                     that this limit can be controlled with  the  sysctl  pro‐
688                     gram,  and  that it should be manually increased in order
689                     to profile long user stacks (such as stacks for Java pro‐
690                     grams). To do so, use:
691
692                        # sysctl kernel.perf_event_max_stack=<new value>
693
694              Return The non-negative stack id on success, or  a  negative
695                     error in case of failure.
696
697       s64 bpf_csum_diff(__be32 *from, u32 from_size, __be32 *to, u32 to_size,
698       __wsum seed)
699
700              Description
701                     Compute  a  checksum  difference,  from  the  raw  buffer
702                     pointed by from, of length from_size (that must be a mul‐
703                     tiple  of  4),  towards  the raw buffer pointed by to, of
704                     size to_size (same remark). An optional seed can be added
705                     to  the  value  (this  can be cascaded, the seed may come
706                     from a previous call to the helper).
707
708                     This is flexible enough to be used in several ways:
709
710                     · With from_size == 0, to_size > 0 and seed set to check‐
711                       sum, it can be used when pushing new data.
712
713                     · With from_size > 0, to_size == 0 and seed set to check‐
714                       sum, it can be used when removing data from a packet.
715
716                     · With from_size > 0, to_size > 0 and seed set to  0,  it
717                       can  be used to compute a diff. Note that from_size and
718                       to_size do not need to be equal.
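
                        The three cases above can be explored with a  simplified
                        user-space  model.  This is only a sketch: it treats the
                        buffers as host-order 32-bit words and uses  plain  ones'
                        complement  addition,  whereas the kernel helper works on
                        __be32 data and returns an architecture-specific partial
                        sum.  All function names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Add two 32-bit values with end-around carry (ones' complement add). */
static uint32_t csum_add32(uint32_t a, uint32_t b)
{
    uint64_t s = (uint64_t)a + b;
    return (uint32_t)(s & 0xffffffffu) + (uint32_t)(s >> 32);
}

/* Simplified model of a checksum difference: subtract the words of
 * "from" (by adding their complement) and add the words of "to",
 * starting from seed. Sizes are in bytes and must be multiples of 4. */
static uint32_t csum_diff_model(const uint32_t *from, size_t from_size,
                                const uint32_t *to, size_t to_size,
                                uint32_t seed)
{
    uint32_t sum = seed;
    size_t i;

    for (i = 0; i < from_size / 4; i++)
        sum = csum_add32(sum, ~from[i]);
    for (i = 0; i < to_size / 4; i++)
        sum = csum_add32(sum, to[i]);
    return sum;
}

/* Fold a 32-bit partial sum down to 16 bits. */
static uint16_t csum_fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

                        In this model, adding the difference between an  old  and
                        a  new  word  to  the  sum of the old buffer folds to the
                        same 16-bit value as summing the new buffer from scratch,
                        which is the property incremental updates rely on.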
719
720                     This   helper   can   be   used   in   combination   with
721                     bpf_l3_csum_replace() and bpf_l4_csum_replace(), to which
722                     one  can   feed   in   the   difference   computed   with
723                     bpf_csum_diff().
724
725              Return The  checksum result, or a negative error code in case of
726                     failure.
727
728       int bpf_skb_get_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
729
730              Description
731                     Retrieve tunnel options metadata for the  packet  associ‐
732                     ated  to skb, and store the raw tunnel option data to the
733                     buffer opt of size.
734
735                     This helper can be used with encapsulation  devices  that
736                     can  operate  in "collect metadata" mode (please refer to
737                     the related note in the description  of  bpf_skb_get_tun‐
738                     nel_key()  for  more details). A particular example where
739                     this can be used is in combination with the Geneve encap‐
740                     sulation protocol, where it allows for pushing (with  the
741                     bpf_skb_set_tunnel_opt() helper) and retrieving arbitrary
742                     TLVs  (Type-Length-Value  headers) from the eBPF program.
743                     This allows for full customization of these headers.
744
745              Return The size of the option data retrieved.
746
747       int bpf_skb_set_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
748
749              Description
750                     Set tunnel options metadata for the packet associated  to
751                     skb to the option data contained in the raw buffer opt of
752                     size.
753
754                     See also the description of the  bpf_skb_get_tunnel_opt()
755                     helper for additional information.
756
757              Return 0 on success, or a negative error in case of failure.
758
759       int bpf_skb_change_proto(struct sk_buff *skb, __be16 proto, u64 flags)
760
761              Description
762                     Change  the  protocol of the skb to proto. Currently sup‐
763                     ported are transition from IPv4 to IPv6, and from IPv6 to
764                     IPv4.  The  helper  takes  care of the groundwork for the
765                     transition, including resizing  the  socket  buffer.  The
766                     eBPF program is expected to fill the new headers, if any,
767                     via bpf_skb_store_bytes() and to recompute checksums with
768                     bpf_l3_csum_replace() and bpf_l4_csum_replace(). The main
769                     use case for this helper is to perform  NAT64  operations
770                     from an eBPF program.
771
772                     Internally, the GSO type is marked as dodgy so that head‐
773                     ers are checked and  segments  are  recalculated  by  the
774                     GSO/GRO  engine.   The  size for GSO target is adapted as
775                     well.
776
777                     All values for flags are reserved for future  usage,  and
778                     must be left at zero.
779
780                     A call to this helper is susceptible to change the under‐
781                     lying packet buffer. Therefore, at load time, all  checks
782                     on  pointers  previously done by the verifier are invali‐
783                     dated and must be performed again, if the helper is  used
784                     in combination with direct packet access.
785
786              Return 0 on success, or a negative error in case of failure.
787
788       int bpf_skb_change_type(struct sk_buff *skb, u32 type)
789
790              Description
791                     Change  the packet type for the packet associated to skb.
792                     This comes down to setting skb->pkt_type to type,  except
793                     that   the   eBPF   program   has   no  write  access  to
794                     skb->pkt_type other than through this  helper.  Using   a
795                     helper here allows for graceful handling of errors.
796
797                     The  major  use  case  is  to  change  incoming  skbs  to
798                     PACKET_HOST  in a programmatic way instead of  having  to
799                     recirculate  via  redirect(..., BPF_F_INGRESS), for exam‐
800                     ple.
801
802                     Note that type only allows certain values. At this  time,
803                     they are:
804
805                     PACKET_HOST
806                            Packet is for us.
807
808                     PACKET_BROADCAST
809                            Send packet to all.
810
811                     PACKET_MULTICAST
812                            Send packet to group.
813
814                     PACKET_OTHERHOST
815                            Send packet to someone else.
816
817              Return 0 on success, or a negative error in case of failure.
818
819       int  bpf_skb_under_cgroup(struct sk_buff *skb, struct bpf_map *map, u32
820       index)
821
822              Description
823                     Check whether skb is a descendant of the cgroup2 held  by
824                     map of type BPF_MAP_TYPE_CGROUP_ARRAY, at index.
825
826              Return The  return  value depends on the result of the test, and
827                     can be:
828
829                     · 0, if the skb failed the cgroup2 descendant test.
830
831                     · 1, if the skb succeeded the cgroup2 descendant test.
832
833                     · A negative error code, if an error occurred.
834
835       u32 bpf_get_hash_recalc(struct sk_buff *skb)
836
837              Description
838                     Retrieve the hash of the packet, skb->hash. If it is  not
839                     set,  in  particular  if the hash was cleared due to man‐
840                     gling, recompute this hash. Later accesses  to  the  hash
841                     can be done directly with skb->hash.
842
843                     Calling  bpf_set_hash_invalid(), changing a packet proto‐
844                     col     with     bpf_skb_change_proto(),    or    calling
845                     bpf_skb_store_bytes() with the BPF_F_INVALIDATE_HASH flag
846                     are  actions  that  may  clear the hash and trigger a new
847                     computation      on      the      next      call       to
848                     bpf_get_hash_recalc().
849
850              Return The 32-bit hash.
851
852       u64 bpf_get_current_task(void)
853
854              Return A pointer to the current task struct.
855
856       int bpf_probe_write_user(void *dst, const void *src, u32 len)
857
858              Description
859                     Attempt in a safe way to write len bytes from the  buffer
860                     src  to dst in memory. It only works for threads that are
861                     in user context, and dst  must  be  a  valid  user  space
862                     address.
863
864                     This  helper  should not be used to implement any kind of
865                     security mechanism because of TOC-TOU attacks, but rather
866                     to  debug, divert, and manipulate execution of semi-coop‐
867                     erative processes.
868
869                     Keep in mind that this feature is meant for  experiments,
870                     and it has a risk of crashing the system and running pro‐
871                     grams.  Therefore, when an eBPF program using this helper
872                     is  attached, a warning including PID and process name is
873                     printed to kernel logs.
874
875              Return 0 on success, or a negative error in case of failure.
876
877       int bpf_current_task_under_cgroup(struct bpf_map *map, u32 index)
878
879              Description
880                     Check whether the probe is being run in the context of  a
881                     given  subset  of  the  cgroup2 hierarchy. The cgroup2 to
882                     test is held by map of type BPF_MAP_TYPE_CGROUP_ARRAY, at
883                     index.
884
885              Return The  return  value depends on the result of the test, and
886                     can be:
887
888                     · 0, if the current task does not belong to the cgroup2.
889
890                     · 1, if the current task belongs to the cgroup2.
891
892                     · A negative error code, if an error occurred.
893
894       int bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
895
896              Description
897                     Resize (trim or grow) the packet associated to skb to the
898                     new  len.  The  flags  are reserved for future usage, and
899                     must be left at zero.
900
901                     The basic idea is that the  helper  performs  the  needed
902                     work to change the size of the packet, then the eBPF pro‐
903                     gram    rewrites    the    rest    via    helpers    like
904                     bpf_skb_store_bytes(),             bpf_l3_csum_replace(),
905                     bpf_l4_csum_replace() and others. This helper is  a  slow
906                     path  utility intended for replies with control messages.
907                     And because it is targeted  for  slow  path,  the  helper
908                     itself  can  afford to be slow: it implicitly linearizes,
909                     unclones and drops offloads from the skb.
910
911                     A call to this helper is susceptible to change the under‐
912                     lying  packet buffer. Therefore, at load time, all checks
913                     on pointers previously done by the verifier  are  invali‐
914                     dated  and must be performed again, if the helper is used
915                     in combination with direct packet access.
916
917              Return 0 on success, or a negative error in case of failure.
918
919       int bpf_skb_pull_data(struct sk_buff *skb, u32 len)
920
921              Description
922                     Pull in non-linear data in case the skb is non-linear and
923                     not  all  of len are part of the linear section. Make len
924                     bytes from skb readable and writable. If a zero value  is
925                     passed  for  len,  then  the  whole  length of the skb is
926                     pulled.
927
928                     This helper is only needed for reading and  writing  with
929                     direct packet access.
930
931                     For  direct packet access, testing that offsets to access
932                     are within packet boundaries (test on skb->data_end)  may
933                     fail  if offsets are invalid, or if the requested data is
934                     in non-linear parts of the skb. On failure  the  program
935                     can just bail out or, in the case of a non-linear buffer,
936                     use   a   helper   to  make  the  data  available.   The
937                     bpf_skb_load_bytes() helper is a first solution to access
938                     the    data.    Another    one    consists    in    using
939                     bpf_skb_pull_data() to pull in the non-linear parts once,
940                     then retesting and finally accessing the data.
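
                        The "test, pull, retest, access" sequence  can  be  illus‐
                        trated  with a toy user-space model. Nothing here is ker‐
                        nel code: struct toy_skb and the toy_*()  functions  are
                        hypothetical  stand-ins,  with  linear_len  playing  the
                        role of the region bounded by skb->data_end.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an skb: only the first linear_len bytes can be
 * accessed directly; total_len is the full packet length. */
struct toy_skb {
    uint8_t buf[64];
    size_t linear_len;
    size_t total_len;
};

/* Bounds test, in the spirit of comparing against skb->data_end. */
static int toy_range_ok(const struct toy_skb *skb, size_t off, size_t len)
{
    return off <= skb->linear_len && len <= skb->linear_len - off;
}

/* Toy model of pulling data: make the first len bytes linear. */
static int toy_pull_data(struct toy_skb *skb, size_t len)
{
    if (skb->total_len > sizeof(skb->buf) || len > skb->total_len)
        return -1;
    if (len > skb->linear_len)
        skb->linear_len = len;
    return 0;
}

/* Read one byte: test, pull once on failure, retest, then access. */
static int toy_read_u8(struct toy_skb *skb, size_t off, uint8_t *out)
{
    if (!toy_range_ok(skb, off, 1)) {
        if (toy_pull_data(skb, off + 1) < 0)
            return -1;          /* offset lies outside the packet */
        if (!toy_range_ok(skb, off, 1))
            return -1;          /* still not accessible: bail out */
    }
    *out = skb->buf[off];
    return 0;
}
```

                        The second bounds test mirrors the fact that checks done
                        before the pull no longer count afterwards.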
941
942                     At the same  time,  this  also  makes  sure  the  skb  is
943                     uncloned,  which  is  a  necessary  condition  for direct
944                     write. As this needs to be an  invariant  for  the  write
945                     part  only,  the  verifier detects writes and adds a pro‐
946                     logue  that  calls  bpf_skb_pull_data()  to   effectively
947                     unclone  the  skb  from  the very beginning in case it is
948                     indeed cloned.
949
950                     A call to this helper is susceptible to change the under‐
951                     lying  packet buffer. Therefore, at load time, all checks
952                     on pointers previously done by the verifier  are  invali‐
953                     dated  and must be performed again, if the helper is used
954                     in combination with direct packet access.
955
956              Return 0 on success, or a negative error in case of failure.
957
958       s64 bpf_csum_update(struct sk_buff *skb, __wsum csum)
959
960              Description
961                     Add the checksum csum into skb->csum in case  the  driver
962                     has  supplied  a checksum for the entire packet into that
963                     field. Return an error otherwise. This helper is intended
964                     to  be  used in combination with bpf_csum_diff(), in par‐
965                     ticular when the checksum needs to be updated after  data
966                     has  been  written  into the packet through direct packet
967                     access.
968
969              Return The checksum on success, or a negative error code in case
970                     of failure.
971
972       void bpf_set_hash_invalid(struct sk_buff *skb)
973
974              Description
975                     Invalidate  the  current  skb->hash. It can be used after
976                     mangling on headers  through  direct  packet  access,  in
977                     order  to indicate that the hash is outdated and to trig‐
978                     ger a recalculation the next time  the  kernel  tries  to
979                     access this hash or when the bpf_get_hash_recalc() helper
980                     is called.
981
982       int bpf_get_numa_node_id(void)
983
984              Description
985                     Return the id of the current NUMA node. The  primary  use
986                     case  for this helper is the selection of sockets for the
987                     local NUMA node, when the program is attached to  sockets
988                     using   the  SO_ATTACH_REUSEPORT_EBPF  option  (see  also
989                     socket(7)), but the helper is  also  available  to  other
990                     eBPF  program  types,  similarly  to  bpf_get_smp_proces‐
991                     sor_id().
992
993              Return The id of current NUMA node.
994
995       int bpf_skb_change_head(struct sk_buff *skb, u32 len, u64 flags)
996
997              Description
998                     Grows headroom of packet associated to  skb  and  adjusts
999                     the  offset  of  the  MAC  header accordingly, adding len
1000                     bytes of space. It automatically extends and  reallocates
1001                     memory as required.
1002
1003                     This  helper  can  be used on a layer 3 skb to push a MAC
1004                     header for redirection into a layer 2 device.
1005
1006                     All values for flags are reserved for future  usage,  and
1007                     must be left at zero.
1008
1009                     A call to this helper is susceptible to change the under‐
1010                     lying packet buffer. Therefore, at load time, all  checks
1011                     on  pointers  previously done by the verifier are invali‐
1012                     dated and must be performed again, if the helper is  used
1013                     in combination with direct packet access.
1014
1015              Return 0 on success, or a negative error in case of failure.
1016
1017       int bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
1018
1019              Description
1020                     Adjust  (move)  xdp_md->data by delta bytes. Note that it
1021                     is possible to use  a  negative  value  for  delta.  This
1022                     helper  can  be used to prepare the packet for pushing or
1023                     popping headers.
1024
1025                     A call to this helper is susceptible to change the under‐
1026                     lying  packet buffer. Therefore, at load time, all checks
1027                     on pointers previously done by the verifier  are  invali‐
1028                     dated  and must be performed again, if the helper is used
1029                     in combination with direct packet access.
1030
1031              Return 0 on success, or a negative error in case of failure.
1032
1033       int bpf_probe_read_str(void *dst, u32 size, const void *unsafe_ptr)
1034
1035              Description
1036                     Copy a  NUL  terminated  string  from  an  unsafe  kernel
1037                     address   unsafe_ptr   to  dst.  See  bpf_probe_read_ker‐
1038                     nel_str() for more details.
1039
1040                     Generally,     use      bpf_probe_read_user_str()      or
1041                     bpf_probe_read_kernel_str() instead.
1042
1043              Return On  success,  the strictly positive length of the string,
1044                     including the trailing NUL character. On error,  a  nega‐
1045                     tive value.
1046
1047       u64 bpf_get_socket_cookie(struct sk_buff *skb)
1048
1049              Description
1050                     If  the struct sk_buff pointed by skb has a known socket,
1051                     retrieve the cookie (generated by  the  kernel)  of  this
1052                     socket.   If  no  cookie has been set yet, generate a new
1053                     cookie. Once generated, the socket cookie remains  stable
1054                     for the life of the socket. This helper can be useful for
1055                     monitoring per socket networking traffic statistics as it
1056                     provides  a  global socket identifier that can be assumed
1057                     unique.
1058
1059              Return An 8-byte long non-decreasing number on success, or 0 if
1060                     the socket field is missing inside skb.
1061
1062       u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
1063
1064              Description
1065                     Equivalent to bpf_get_socket_cookie() helper that accepts
1066                     skb, but gets socket from struct bpf_sock_addr context.
1067
1068              Return An 8-byte long non-decreasing number.
1069
1070       u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
1071
1072              Description
1073                     Equivalent to bpf_get_socket_cookie() helper that accepts
1074                     skb, but gets socket from struct bpf_sock_ops context.
1075
1076              Return An 8-byte long non-decreasing number.
1077
1078       u32 bpf_get_socket_uid(struct sk_buff *skb)
1079
1080              Return The  owner  UID  of  the socket associated to skb. If the
1081                     socket is NULL, or if it is not a full socket (i.e. if it
1082                     is  a time-wait or a request socket instead), overflowuid
1083                     value is returned (note that overflowuid  might  also  be
1084                     the actual UID value for the socket).
1085
1086       u32 bpf_set_hash(struct sk_buff *skb, u32 hash)
1087
1088              Description
1089                     Set  the  full  hash for skb (set the field skb->hash) to
1090                     value hash.
1091
1092              Return 0
1093
1094       int bpf_setsockopt(void *bpf_socket, int level, int optname, void *opt‐
1095       val, int optlen)
1096
1097              Description
1098                     Emulate  a  call to setsockopt() on the socket associated
1099                     to bpf_socket, which must be a full socket. The level  at
1100                     which  the  option  resides  and  the name optname of the
1101                     option must be  specified,  see  setsockopt(2)  for  more
1102                     information.   The  option  value  of  length  optlen  is
1103                     pointed by optval.
1104
1105                     bpf_socket should be one of the following:
1106
1107                     · struct bpf_sock_ops for BPF_PROG_TYPE_SOCK_OPS.
1108
1109                     · struct bpf_sock_addr for  BPF_CGROUP_INET4_CONNECT  and
1110                       BPF_CGROUP_INET6_CONNECT.
1111
1112                     This helper actually implements a subset of setsockopt().
1113                     It supports the following levels:
1114
1115                     · SOL_SOCKET,  which  supports  the  following  optnames:
1116                       SO_RCVBUF,  SO_SNDBUF, SO_MAX_PACING_RATE, SO_PRIORITY,
1117                       SO_RCVLOWAT, SO_MARK.
1118
1119                     · IPPROTO_TCP, which  supports  the  following  optnames:
1120                       TCP_CONGESTION, TCP_BPF_IW, TCP_BPF_SNDCWND_CLAMP.
1121
1122                     · IPPROTO_IP, which supports optname IP_TOS.
1123
1124                     · IPPROTO_IPV6, which supports optname IPV6_TCLASS.
1125
1126              Return 0 on success, or a negative error in case of failure.
1127
1128       int  bpf_skb_adjust_room(struct  sk_buff  *skb, s32 len_diff, u32 mode,
1129       u64 flags)
1130
1131              Description
1132                     Grow or shrink the room for data in the packet associated
1133                     to skb by len_diff, and according to the selected mode.
1134
1135                     By  default, the helper will reset any offloaded checksum
1136                     indicator of  the  skb  to  CHECKSUM_NONE.  This  can  be
1137                     avoided by the following flag:
1138
1139                     · BPF_F_ADJ_ROOM_NO_CSUM_RESET:  Do  not  reset offloaded
1140                       checksum data of the skb to CHECKSUM_NONE.
1141
1142                     There are two supported modes at this time:
1143
1144                     · BPF_ADJ_ROOM_MAC: Adjust room at the MAC layer (space is
1145                       added or removed between the layer 2 and layer 3 headers).
1146
1147                     · BPF_ADJ_ROOM_NET:  Adjust  room  at  the  network layer
1148                       (room space is added or removed between the  layer  3
1149                       and layer 4 headers).
1150
1151                     The following flags are supported at this time:
1152
1153                     · BPF_F_ADJ_ROOM_FIXED_GSO:   Do   not  adjust  gso_size.
1154                       Adjusting mss in this way is not allowed for datagrams.
1155
1156                     · BPF_F_ADJ_ROOM_ENCAP_L3_IPV4,
1157                       BPF_F_ADJ_ROOM_ENCAP_L3_IPV6: Any new space is reserved
1158                       to hold a tunnel header.   Configure  skb  offsets  and
1159                       other fields accordingly.
1160
1161                     · BPF_F_ADJ_ROOM_ENCAP_L4_GRE,
1162                       BPF_F_ADJ_ROOM_ENCAP_L4_UDP: Use with ENCAP_L3 flags to
1163                       further specify the tunnel type.
1164
1165                     · BPF_F_ADJ_ROOM_ENCAP_L2(len):   Use   with  ENCAP_L3/L4
1166                       flags to further specify the tunnel type;  len  is  the
1167                       length of the inner MAC header.
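
                         As an illustration of how the encapsulation  flags  above
                         combine,  the  following user-space sketch builds a flags
                         value for an IPv4/UDP tunnel carrying an inner MAC head‐
                         er.  The constant values and the macro are assumed to
                         mirror linux/bpf.h; encap_flags_udp_ipv4_l2() is  a  hy‐
                         pothetical name.

```c
#include <stdint.h>

/* Values assumed to mirror <linux/bpf.h>; check them against your headers. */
#define BPF_F_ADJ_ROOM_FIXED_GSO      (1ULL << 0)
#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4  (1ULL << 1)
#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6  (1ULL << 2)
#define BPF_F_ADJ_ROOM_ENCAP_L4_GRE   (1ULL << 3)
#define BPF_F_ADJ_ROOM_ENCAP_L4_UDP   (1ULL << 4)
#define BPF_ADJ_ROOM_ENCAP_L2_MASK    0xffULL
#define BPF_ADJ_ROOM_ENCAP_L2_SHIFT   56
#define BPF_F_ADJ_ROOM_ENCAP_L2(len) \
    (((uint64_t)(len) & BPF_ADJ_ROOM_ENCAP_L2_MASK) << BPF_ADJ_ROOM_ENCAP_L2_SHIFT)

/* Hypothetical helper: flags for an IPv4 + UDP tunnel header that is
 * followed by an inner MAC header of inner_mac_len bytes. */
static uint64_t encap_flags_udp_ipv4_l2(unsigned int inner_mac_len)
{
    return BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 |
           BPF_F_ADJ_ROOM_ENCAP_L4_UDP |
           BPF_F_ADJ_ROOM_ENCAP_L2(inner_mac_len);
}
```

                         With a 14-byte Ethernet header, the length ends up in the
                         top byte of flags and the L3 and L4 bits in the low byte.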
1168
1169                     A call to this helper is susceptible to change the under‐
1170                     lying packet buffer. Therefore, at load time, all  checks
1171                     on  pointers  previously done by the verifier are invali‐
1172                     dated and must be performed again, if the helper is  used
1173                     in combination with direct packet access.
1174
1175              Return 0 on success, or a negative error in case of failure.
1176
1177       int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
1178
1179              Description
1180                     Redirect  the packet to the endpoint referenced by map at
1181                     index key. Depending on its type, this  map  can  contain
1182                     references to net devices (for forwarding packets through
1183                     other ports), or to CPUs (for redirecting XDP  frames  to
1184                     another  CPU; but this is only implemented for native XDP
1185                     (with driver support) as of this writing).
1186
1187                     The lower two bits of flags are used as the  return  code
1188                     if the map lookup fails. This is so that the return value
1189                     can be one of the XDP program return codes up to  XDP_TX,
1190                     as  chosen  by  the  caller. Any higher bits in the flags
1191                     argument must be unset.
1192
1193                     See also bpf_redirect(), which only supports  redirecting
1194                     to an ifindex, but doesn't require a map to do so.
1195
1196              Return XDP_REDIRECT  on  success,  or the value of the two lower
1197                     bits of the flags argument on error.
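
                     The fallback behavior described above can be pictured
                     with a small userspace model. This is a sketch of the
                     documented semantics only, not kernel code: the action
                     values mirror enum xdp_action, while flags_are_valid()
                     and model_redirect_map_result() are hypothetical names.

```c
#include <stdint.h>

/* XDP action codes, mirrored from enum xdp_action in <linux/bpf.h>. */
enum {
    XDP_ABORTED  = 0,
    XDP_DROP     = 1,
    XDP_PASS     = 2,
    XDP_TX       = 3,
    XDP_REDIRECT = 4
};

/* Only the lower two bits of flags may be set. */
static int flags_are_valid(uint64_t flags)
{
    return (flags & ~3ULL) == 0;
}

/* Model of the documented return value: XDP_REDIRECT when the map
 * lookup succeeds, otherwise the caller-chosen code held in the two
 * lower bits of flags. */
static int model_redirect_map_result(int lookup_ok, uint64_t flags)
{
    return lookup_ok ? XDP_REDIRECT : (int)(flags & 3ULL);
}
```

                     Passing XDP_PASS as flags, for instance, lets packets
                     continue to the regular stack when the key has no entry
                     in the map.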
1198
1199       int bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map  *map,  u32
1200       key, u64 flags)
1201
1202              Description
1203                     Redirect  the  packet to the socket referenced by map (of
1204                     type BPF_MAP_TYPE_SOCKMAP) at index key. Both ingress and
1205                     egress  interfaces  can  be  used  for  redirection.  The
1206                     BPF_F_INGRESS value in flags is used to make the distinc‐
1207                     tion  (ingress  path  is selected if the flag is present,
1208                     egress path otherwise). This is the only  flag  supported
1209                     for now.
1210
1211              Return SK_PASS on success, or SK_DROP on error.
1212
1213       int  bpf_sock_map_update(struct  bpf_sock_ops  *skops,  struct  bpf_map
1214       *map, void *key, u64 flags)
1215
1216              Description
1217                     Add an entry to, or update a map referencing sockets. The
1218                     skops  is used as a new value for the entry associated to
1219                     key. flags is one of:
1220
1221                     BPF_NOEXIST
1222                            The entry for key must not exist in the map.
1223
1224                     BPF_EXIST
1225                            The entry for key must already exist in the map.
1226
1227                     BPF_ANY
1228                            No condition on the existence  of  the  entry  for
1229                            key.
1230
1231                     If  the map has eBPF programs (parser and verdict), those
1232                     will be inherited by  the  socket  being  added.  If  the
1233                     socket is already attached to eBPF programs, this results
1234                     in an error.
1235
1236              Return 0 on success, or a negative error in case of failure.
1237
1238       int bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
1239
1240              Description
1241                     Adjust the address pointed by xdp_md->data_meta by  delta
1242                     (which can be positive or negative). Note that this oper‐
1243                     ation modifies the address stored in xdp_md->data, so the
1244                     latter  must  be  loaded  only  after the helper has been
1245                     called.
1246
1247                     The use of xdp_md->data_meta is optional and programs are
1248                     not  required  to  use it. The rationale is that when the
1249                     packet is processed with XDP (e.g. as DoS filter), it  is
1250                     possible  to  push further meta data along with it before
1251                     passing to the stack, and to give the guarantee  that  an
1252                     ingress  eBPF  program attached as a TC classifier on the
1253                     same device can pick this up for further post-processing.
1254                     Since  TC  works with socket buffers, it remains possible
1255                     to set from XDP the mark or priority pointers,  or  other
1256                     pointers  for  the  socket  buffer.   Having this scratch
1257                     space generic and programmable allows for more  flexibil‐
1258                     ity  as the user is free to store whatever meta data they
1259                     need.
1260
1261                     A call to this helper is susceptible to change the under‐
1262                     lying  packet buffer. Therefore, at load time, all checks
1263                     on pointers previously done by the verifier  are  invali‐
1264                     dated  and must be performed again, if the helper is used
1265                     in combination with direct packet access.
1266
1267              Return 0 on success, or a negative error in case of failure.
1268
1269       int bpf_perf_event_read_value(struct bpf_map *map,  u64  flags,  struct
1270       bpf_perf_event_value *buf, u32 buf_size)
1271
1272              Description
1273                     Read the value of a perf event counter, and store it into
1274                     buf of size buf_size. This helper relies on a map of type
1275                     BPF_MAP_TYPE_PERF_EVENT_ARRAY.  The  nature  of  the perf
1276                     event counter is selected when map is updated  with  perf
1277                     event file descriptors. The map is an array whose size is
1278                     the number of available CPUs, and each  cell  contains  a
1279                     value relative to one CPU. The value to retrieve is
1280                     indicated by flags, which contains the index of the CPU to
1281                     look  up,  masked  with  BPF_F_INDEX_MASK. Alternatively,
1282                     flags can be set to BPF_F_CURRENT_CPU  to  indicate  that
1283                     the value for the current CPU should be retrieved.
1284
1285                     This    helper    behaves    in    a    way    close   to
1286                     bpf_perf_event_read() helper, save that instead  of  just
1287                     returning the value observed, it fills the buf structure.
1288                     This allows for additional data to be retrieved: in  par‐
1289                     ticular,  the  enabled and running times (in buf->enabled
1290                     and buf->running, respectively) are copied.  In  general,
1291                     bpf_perf_event_read_value()     is    recommended    over
1292                     bpf_perf_event_read(), which has some ABI issues and pro‐
1293                     vides fewer functionalities.
1294
1295                     These  values are interesting, because hardware PMU (Per‐
1296                     formance Monitoring Unit) counters are limited resources.
1297                     When there are more PMU-based perf events opened than
1298                     available counters, the kernel multiplexes these events
1299                     so that each event gets a certain percentage (but not
1300                     all) of the PMU time. When multiplexing happens, the
1301                     number of samples or the counter value does not match
1302                     what would be observed without multiplexing, which makes
1303                     comparison between different runs difficult. Typically,
1304                     the counter value should be normalized before comparing
1305                     it to other experiments. The usual normalization is done
1306                     as follows:
1307
1308                        normalized_counter = counter * t_enabled / t_running
1309
1310                     Here t_enabled is the time the event was enabled and
1311                     t_running is the time the event was running since the
1312                     last normalization. The enabled and running times are
1313                     accumulated since the perf event was opened. To obtain
1314                     a scaling factor between two invocations of an eBPF
1315                     program, users can use the CPU id as the key (which is
1316                     typical for the perf array usage model) to remember the
1317                     previous value and do the calculation in the eBPF program.
1318
1319              Return 0 on success, or a negative error in case of failure.
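
                     The normalization formula and the delta-based scaling
                     between two reads can be sketched in plain C as follows.
                     This is a userspace illustration: struct perf_value is a
                     local mirror of struct bpf_perf_event_value, the function
                     names are hypothetical, and floating point is used only
                     for readability (an eBPF program would have to use
                     integer arithmetic instead).

```c
#include <stdint.h>

/* Local mirror of struct bpf_perf_event_value. */
struct perf_value {
    uint64_t counter;
    uint64_t enabled;
    uint64_t running;
};

/* normalized_counter = counter * t_enabled / t_running.
 * Returns 0 when the event never ran, to avoid dividing by zero. */
static double normalize(uint64_t counter, uint64_t t_enabled,
                        uint64_t t_running)
{
    if (t_running == 0)
        return 0.0;
    return (double)counter * (double)t_enabled / (double)t_running;
}

/* Scale the counter increase between two reads, using the increase in
 * enabled and running time over the same interval. */
static double normalize_delta(const struct perf_value *prev,
                              const struct perf_value *cur)
{
    return normalize(cur->counter - prev->counter,
                     cur->enabled - prev->enabled,
                     cur->running - prev->running);
}
```

                     Here prev stands for the value remembered from the
                     previous invocation, for example stored in a map keyed
                     by CPU id as suggested above.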
1320
1321       int  bpf_perf_prog_read_value(struct  bpf_perf_event_data  *ctx, struct
1322       bpf_perf_event_value *buf, u32 buf_size)
1323
1324              Description
1325                     For an eBPF program attached to a  perf  event,  retrieve
1326                     the  value  of  the  event  counter associated to ctx and
1327                     store it in the structure pointed  by  buf  and  of  size
1328                     buf_size.  Enabled  and  running times are also stored in
1329                     the    structure    (see    description     of     helper
1330                     bpf_perf_event_read_value() for more details).
1331
1332              Return 0 on success, or a negative error in case of failure.
1333
1334       int bpf_getsockopt(void *bpf_socket, int level, int optname, void *opt‐
1335       val, int optlen)
1336
1337              Description
1338                     Emulate a call to getsockopt() on the  socket  associated
1339                     to  bpf_socket, which must be a full socket. The level at
1340                     which the option resides and  the  name  optname  of  the
1341                     option  must  be  specified,  see  getsockopt(2) for more
1342                     information. The retrieved value is stored in the
1343                     structure pointed to by optval and of length optlen.
1344
1345                     bpf_socket should be one of the following:
1346
1347                     · struct bpf_sock_ops for BPF_PROG_TYPE_SOCK_OPS.
1348
1349                     · struct  bpf_sock_addr  for BPF_CGROUP_INET4_CONNECT and
1350                       BPF_CGROUP_INET6_CONNECT.
1351
1352                     This helper actually implements a subset of getsockopt().
1353                     It supports the following levels:
1354
1355                     · IPPROTO_TCP, which supports optname TCP_CONGESTION.
1356
1357                     · IPPROTO_IP, which supports optname IP_TOS.
1358
1359                     · IPPROTO_IPV6, which supports optname IPV6_TCLASS.
1360
1361              Return 0 on success, or a negative error in case of failure.
1362
1363       int bpf_override_return(struct pt_regs *regs, u64 rc)
1364
1365              Description
1366                     Used  for  error  injection,  this helper uses kprobes to
1367                     override the return value of the probed function, and  to
1368                     set  it to rc.  The first argument is the context regs on
1369                     which the kprobe works.
1370
1371                     This helper works by setting the PC (program counter)  to
1372                     an  override function which is run in place of the origi‐
1373                     nal probed function. This means the  probed  function  is
1374                     not  run  at  all.  The replacement function just returns
1375                     with the required value.
1376
1377                     This helper has security implications, and thus  is  sub‐
1378                     ject  to restrictions. It is only available if the kernel
1379                     was compiled with the CONFIG_BPF_KPROBE_OVERRIDE configu‐
1380                     ration  option,  and  in this case it only works on func‐
1381                     tions tagged with  ALLOW_ERROR_INJECTION  in  the  kernel
1382                     code.
1383
1384                     Also,  the helper is only available for the architectures
1385                     having the CONFIG_FUNCTION_ERROR_INJECTION option. As  of
1386                     this writing, x86 architecture is the only one to support
1387                     this feature.
1388
1389              Return 0
1390
1391       int  bpf_sock_ops_cb_flags_set(struct   bpf_sock_ops   *bpf_sock,   int
1392       argval)
1393
1394              Description
1395                     Attempt  to  set  the  value of the bpf_sock_ops_cb_flags
1396                     field for the full TCP socket associated to bpf_sock to
1397                     argval.
1398
1399                     The  primary  use  of this field is to determine if there
1400                     should   be   calls   to   eBPF    programs    of    type
1401                     BPF_PROG_TYPE_SOCK_OPS at various points in the TCP code.
1402                     A program of the same type can change its value, per con‐
1403                     nection  and  as necessary, when the connection is estab‐
1404                     lished. This field is directly  accessible  for  reading,
1405                     but  this  helper  must  be  used for updates in order to
1406                     return an error if an eBPF program tries to set  a  call‐
1407                     back that is not supported in the current kernel.
1408
1409                     argval is a flag array which can combine these flags:
1410
1411                     · BPF_SOCK_OPS_RTO_CB_FLAG (retransmission time out)
1412
1413                     · BPF_SOCK_OPS_RETRANS_CB_FLAG (retransmission)
1414
1415                     · BPF_SOCK_OPS_STATE_CB_FLAG (TCP state change)
1416
1417                     · BPF_SOCK_OPS_RTT_CB_FLAG (every RTT)
1418
1419                     This function can therefore also be used to clear a
1420                     callback flag by setting the appropriate bit to zero. For
1421                     example, to disable the RTO callback:
1422
1423                     bpf_sock_ops_cb_flags_set(bpf_sock,
1424                            bpf_sock->bpf_sock_ops_cb_flags                  &
1425                            ~BPF_SOCK_OPS_RTO_CB_FLAG)
1426
1427                     Here are some examples of where one could call such an
1428                     eBPF program:
1429
1430                     · When RTO fires.
1431
1432                     · When a packet is retransmitted.
1433
1434                     · When the connection terminates.
1435
1436                     · When a packet is sent.
1437
1438                     · When a packet is received.
1439
1440              Return Code -EINVAL if the socket is not a full TCP socket; oth‐
1441                     erwise, a positive number containing the bits that  could
1442                     not be set is returned (which comes down to 0 if all bits
1443                     were set as required).
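
                     The return value convention (the bits that could not be
                     set) can be pictured with a toy userspace model. The
                     flag values mirror what linux/bpf.h is assumed to
                     define. struct toy_sock, toy_cb_flags_set() and the
                     supported parameter are hypothetical and only stand in
                     for the set of callbacks a given kernel implements.

```c
#include <stdint.h>

/* Local mirrors of the BPF_SOCK_OPS_*_CB_FLAG values (assumed). */
#define RTO_CB_FLAG     (1 << 0)
#define RETRANS_CB_FLAG (1 << 1)
#define STATE_CB_FLAG   (1 << 2)
#define RTT_CB_FLAG     (1 << 3)

/* Toy stand-in for the per-socket callback flags field. */
struct toy_sock {
    int cb_flags;
};

/* Model of the documented behavior: keep the bits this (hypothetical)
 * kernel supports, and return the bits that could not be set. */
static int toy_cb_flags_set(struct toy_sock *sk, int supported, int argval)
{
    sk->cb_flags = argval & supported;
    return argval & ~supported;
}
```

                     A non-zero result tells the program which requested
                     callbacks are unavailable, so that it can fall back to
                     another strategy.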
1444
1445       int bpf_msg_redirect_map(struct sk_msg_buff *msg, struct bpf_map  *map,
1446       u32 key, u64 flags)
1447
1448              Description
1449                     This  helper is used in programs implementing policies at
1450                     the socket level. If the message msg is allowed  to  pass
1451                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1452                     rect  it  to  the  socket  referenced  by  map  (of  type
1453                     BPF_MAP_TYPE_SOCKMAP)  at  index  key.  Both  ingress and
1454                     egress  interfaces  can  be  used  for  redirection.  The
1455                     BPF_F_INGRESS value in flags is used to make the distinc‐
1456                     tion (ingress path is selected if the  flag  is  present,
1457                     egress  path  otherwise). This is the only flag supported
1458                     for now.
1459
1460              Return SK_PASS on success, or SK_DROP on error.
1461
1462       int bpf_msg_apply_bytes(struct sk_msg_buff *msg, u32 bytes)
1463
1464              Description
1465                     For socket policies, apply the verdict of the  eBPF  pro‐
1466                     gram to the next bytes (number of bytes) of message msg.
1467
1468                     For  example,  this  helper  can be used in the following
1469                     cases:
1470
1471                     · A single sendmsg() or sendfile() system  call  contains
1472                       multiple logical messages that the eBPF program is sup‐
1473                       posed to read and for which it should apply a verdict.
1474
1475                     · An eBPF program only cares to read the first bytes of a
1476                       msg.  If  the message has a large payload, then setting
1477                       up and calling the  eBPF  program  repeatedly  for  all
1478                       bytes,  even though the verdict is already known, would
1479                       create unnecessary overhead.
1480
1481                     When called from within an eBPF program, the helper  sets
1482                     a  counter  internal  to  the BPF infrastructure, that is
1483                     used to apply the last verdict  to  the  next  bytes.  If
1484                     bytes  is  smaller  than the current data being processed
1485                     from a sendmsg() or sendfile()  system  call,  the  first
1486                     bytes  will  be  sent and the eBPF program will be re-run
1487                     with the pointer for start of data pointing to byte  num‐
1488                     ber  bytes  + 1. If bytes is larger than the current data
1489                     being processed, then the eBPF verdict will be applied to
1490                     multiple  sendmsg()  or  sendfile() calls until bytes are
1491                     consumed.
1492
1493                     Note that if a socket closes with  the  internal  counter
1494                     holding  a  non-zero value, this is not a problem because
1495                     data is not being buffered for bytes and is sent as it is
1496                     received.
1497
1498              Return 0
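
                     The counter semantics described above can be modeled in
                     userspace as follows. This is a toy reading of the text,
                     not the kernel implementation: struct toy_apply and
                     toy_feed() are hypothetical, and the model assumes the
                     program requests the same apply_bytes value each time it
                     runs.

```c
#include <stdint.h>

/* Toy state: bytes still covered by the last verdict. */
struct toy_apply {
    uint32_t remaining;
};

/* Feed one sendmsg() worth of data (len bytes) through the model.
 * Returns how many times the verdict program had to run. */
static unsigned int toy_feed(struct toy_apply *st, uint32_t len,
                             uint32_t apply_bytes)
{
    unsigned int runs = 0;

    while (len > 0) {
        uint32_t covered;

        if (st->remaining == 0) {
            /* No verdict pending: the program runs on the data. */
            runs++;
            if (apply_bytes == 0)
                return runs; /* helper unused: verdict covers the rest */
            st->remaining = apply_bytes;
        }
        /* The last verdict applies to the next st->remaining bytes. */
        covered = st->remaining < len ? st->remaining : len;
        st->remaining -= covered;
        len -= covered;
    }
    return runs;
}
```

                     With apply_bytes smaller than the data, the program is
                     re-run on the remainder. With apply_bytes larger, the
                     leftover count carries over to later calls, which are
                     then covered without running the program again.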
1499
1500       int bpf_msg_cork_bytes(struct sk_msg_buff *msg, u32 bytes)
1501
1502              Description
1503                     For socket policies, prevent the execution of the verdict
1504                     eBPF program for message msg until  bytes  (byte  number)
1505                     have been accumulated.
1506
1507                     This  can  be  used  when  one needs a specific number of
1508                     bytes before a verdict can be assigned, even if the  data
1509                     spans multiple sendmsg() or sendfile() calls. The extreme
1510                     case would be a user calling  sendmsg()  repeatedly  with
1511                     1-byte  long message segments. Obviously, this is bad for
1512                     performance, but it is still valid. If the  eBPF  program
1513                     needs  bytes  bytes to validate a header, this helper can
1514                     be used to prevent the eBPF program from being called
1515                     again until bytes have been accumulated.
1516
1517              Return 0
1518
1519       int  bpf_msg_pull_data(struct sk_msg_buff *msg, u32 start, u32 end, u64
1520       flags)
1521
1522              Description
1523                     For socket policies, pull in non-linear  data  from  user
1524                     space   for   msg   and   set   pointers   msg->data  and
1525                     msg->data_end to start and end bytes  offsets  into  msg,
1526                     respectively.
1527
1528                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg
1529                     it can only parse data that the (data, data_end) pointers
1530                     have already consumed. For sendmsg() hooks this is likely
1531                     the first scatterlist element. But for calls  relying  on
1532                     the  sendpage  handler (e.g. sendfile()) this will be the
1533                     range (0, 0) because the data is shared with  user  space
1534                     and  by  default  the objective is to avoid allowing user
1535                     space to modify data while (or  after)  eBPF  verdict  is
1536                     being  decided.  This  helper can be used to pull in data
1537                     and to set the start and end  pointer  to  given  values.
1538                     Data  will  be  copied if necessary (i.e. if data was not
1539                     linear and if start and end pointers do not point to  the
1540                     same chunk).
1541
1542                     A call to this helper is susceptible to change the under‐
1543                     lying packet buffer. Therefore, at load time, all  checks
1544                     on  pointers  previously done by the verifier are invali‐
1545                     dated and must be performed again, if the helper is  used
1546                     in combination with direct packet access.
1547
1548                     All  values  for flags are reserved for future usage, and
1549                     must be left at zero.
1550
1551              Return 0 on success, or a negative error in case of failure.
1552
1553       int bpf_bind(struct bpf_sock_addr  *ctx,  struct  sockaddr  *addr,  int
1554       addr_len)
1555
1556              Description
1557                     Bind  the socket associated to ctx to the address pointed
1558                     by addr, of length addr_len. This allows for making
1559                     outgoing connections from the desired IP address, which
1560                     can be useful, for example, when all processes inside a
1561                     cgroup should use a single IP address on a host that has
1562                     multiple IP addresses configured.
1563
1564                     This helper works for IPv4 and IPv6, TCP and UDP sockets.
1565                     The   domain   (addr->sa_family)   must  be  AF_INET  (or
1566                     AF_INET6). It is advised to pass a zero port (sin_port or
1567                     sin6_port), which triggers IP_BIND_ADDRESS_NO_PORT-like
1568                     behavior and lets the kernel efficiently pick an unused
1569                     port as long as the 4-tuple is unique. Passing a non-zero
1570                     port might lead to degraded performance.
1571
1572              Return 0 on success, or a negative error in case of failure.
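
                     The address layout recommended above (a fixed source
                     address with a zero port) can be sketched in userspace C.
                     make_bind_addr() is a hypothetical name. An eBPF program
                     cannot call inet_pton() and would instead fill
                     sin_addr.s_addr with a constant in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Fill sa with the IPv4 address given as text and a zero port, so the
 * kernel is left free to pick the source port.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
static int make_bind_addr(const char *ip, struct sockaddr_in *sa)
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(0);
    return inet_pton(AF_INET, ip, &sa->sin_addr) == 1 ? 0 : -1;
}
```

                     The resulting structure corresponds to what would be
                     passed as addr, with addr_len set to its size.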
1573
1574       int bpf_xdp_adjust_tail(struct xdp_buff *xdp_md, int delta)
1575
1576              Description
1577                     Adjust (move) xdp_md->data_end by delta bytes. It is
1578                     possible to both shrink and grow the packet tail.
1579                     Shrinking is done by passing a negative delta.
1580
1581                     A call to this helper is susceptible to change the under‐
1582                     lying  packet buffer. Therefore, at load time, all checks
1583                     on pointers previously done by the verifier  are  invali‐
1584                     dated  and must be performed again, if the helper is used
1585                     in combination with direct packet access.
1586
1587              Return 0 on success, or a negative error in case of failure.
1588
1589       int  bpf_skb_get_xfrm_state(struct  sk_buff  *skb,  u32  index,  struct
1590       bpf_xfrm_state *xfrm_state, u32 size, u64 flags)
1591
1592              Description
1593                     Retrieve the XFRM state (IP transform framework, see also
1594                     ip-xfrm(8)) at index in XFRM "security path" for skb.
1595
1596                     The   retrieved   value   is   stored   in   the   struct
1597                     bpf_xfrm_state pointed by xfrm_state and of length size.
1598
1599                     All  values  for flags are reserved for future usage, and
1600                     must be left at zero.
1601
1602                     This helper is available only if the kernel was  compiled
1603                     with CONFIG_XFRM configuration option.
1604
1605              Return 0 on success, or a negative error in case of failure.
1606
1607       int bpf_get_stack(void *ctx, void *buf, u32 size, u64 flags)
1608
1609              Description
1610                     Return a user or a kernel stack in a buffer provided by
1611                     the BPF program. To achieve this, the helper needs ctx,
1612                     which is a pointer to the context on which the tracing
1613                     program is executed. To store the stack trace, the BPF
1614                     program provides buf with a non-negative size.
1615
1616                     The  last  argument,  flags,  holds  the  number of stack
1617                     frames  to  skip   (from   0   to   255),   masked   with
1618                     BPF_F_SKIP_FIELD_MASK.  The  next bits can be used to set
1619                     the following flags:
1620
1621                     BPF_F_USER_STACK
1622                            Collect a user space stack  instead  of  a  kernel
1623                            stack.
1624
1625                     BPF_F_USER_BUILD_ID
1626                            Collect  buildid+offset  instead  of  ips for user
1627                            stack, only  valid  if  BPF_F_USER_STACK  is  also
1628                            specified.
1629
1630                     bpf_get_stack() can collect up to PERF_MAX_STACK_DEPTH
1631                     frames, kernel and user alike, given a sufficiently large
1632                     buffer. Note that this limit can be controlled with
1633                     the sysctl  program,  and  that  it  should  be  manually
1634                     increased  in  order to profile long user stacks (such as
1635                     stacks for Java programs). To do so, use:
1636
1637                        # sysctl kernel.perf_event_max_stack=<new value>
1638
1639              Return A non-negative value equal to or less than size  on  suc‐
1640                     cess, or a negative error in case of failure.
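
                     The layout of the flags argument (skip count in the low
                     byte, option bits above it) can be sketched as follows.
                     The constants are local mirrors of the values that
                     linux/bpf.h is assumed to define, and make_stack_flags()
                     is a hypothetical userspace helper that also enforces
                     the rule that BPF_F_USER_BUILD_ID requires
                     BPF_F_USER_STACK.

```c
#include <stdint.h>

/* Local mirrors of the UAPI values (assumed; check <linux/bpf.h>). */
#define SKIP_FIELD_MASK 0xffULL
#define F_USER_STACK    (1ULL << 8)
#define F_USER_BUILD_ID (1ULL << 11)

/* Build the flags word for a stack collection request.
 * Returns 0 and stores the result in *flags, or -1 if the combination
 * is invalid (skip out of range, or build id without user stack). */
static int make_stack_flags(unsigned int skip, int user_stack,
                            int build_id, uint64_t *flags)
{
    if (skip > SKIP_FIELD_MASK)
        return -1;
    if (build_id && !user_stack)
        return -1;

    *flags = (uint64_t)skip & SKIP_FIELD_MASK;
    if (user_stack)
        *flags |= F_USER_STACK;
    if (build_id)
        *flags |= F_USER_BUILD_ID;
    return 0;
}
```

                     Skipping two frames of a user stack, for example, gives
                     a flags value with the low byte set to 2 and the user
                     stack bit set.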
1641
1642       int  bpf_skb_load_bytes_relative(const void *skb, u32 offset, void *to,
1643       u32 len, u32 start_header)
1644
1645              Description
1646                     This helper is similar to bpf_skb_load_bytes() in that it
1647                     provides  an  easy way to load len bytes from offset from
1648                     the packet associated to skb, into the buffer pointed  by
1649                     to.  The  difference  to  bpf_skb_load_bytes()  is that a
1650                     fifth argument start_header exists in order to  select  a
1651                     base offset to start from. start_header can be one of:
1652
1653                     BPF_HDR_START_MAC
1654                            Base offset to load data from is skb's mac header.
1655
1656                     BPF_HDR_START_NET
1657                            Base  offset  to  load  data from is skb's network
1658                            header.
1659
1660                     In general,  "direct  packet  access"  is  the  preferred
1661                     method to access packet data. However, this helper is
1662                     particularly useful in socket filters where skb->data does
1663                     not always point to the start of the mac header and where
1664                     "direct packet access" is not available.
1665
1666              Return 0 on success, or a negative error in case of failure.
1667
1668       int bpf_fib_lookup(void *ctx, struct bpf_fib_lookup *params, int  plen,
1669       u32 flags)
1670
1671              Description
1672                     Do  FIB  lookup  in  kernel  tables  using  parameters in
1673                     params.  If lookup is successful and result shows  packet
1674                     is  to be forwarded, the neighbor tables are searched for
1675                     the nexthop. If successful (i.e., FIB lookup shows
1676                     forwarding and nexthop is resolved), the nexthop address is
1677                     returned in ipv4_dst or ipv6_dst based on family, smac is
1678                     set  to mac address of egress device, dmac is set to nex‐
1679                     thop mac address, rt_metric is set to metric  from  route
1680                     (IPv4/IPv6  only), and ifindex is set to the device index
1681                     of the nexthop from the FIB lookup.
1682
1683                     The plen argument is the size of the passed-in struct.
1684                     The flags argument can be a combination of one or more of
1685                     the following values:
1686
1687                     BPF_FIB_LOOKUP_DIRECT
1688                            Do a direct table lookup vs full lookup using  FIB
1689                            rules.
1690
1691                     BPF_FIB_LOOKUP_OUTPUT
1692                            Perform lookup from an egress perspective (default
1693                            is ingress).
1694
1695                     ctx is either struct xdp_md for XDP programs or struct
1696                     sk_buff for tc cls_act programs.
1697
1698              Return
1699
1700                     · < 0 if any input argument is invalid
1701
1702                     · 0  on  success  (packet  is forwarded, nexthop neighbor
1703                       exists)
1704
1705                     · > 0 one of BPF_FIB_LKUP_RET_ codes explaining  why  the
1706                       packet is not forwarded or needs assist from full stack
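
                     A caller typically branches on the three return ranges
                     listed above. The sketch below shows that branching in
                     plain C. The enum and classify_fib_result() are
                     hypothetical names, and what a program does in each case
                     (for example redirecting on success and handing the
                     packet to the stack otherwise) is left to the caller.

```c
/* Hypothetical classification of the helper's return value. */
enum fib_outcome {
    FIB_BAD_INPUT,     /* rc < 0: an input argument is invalid */
    FIB_FORWARD,       /* rc == 0: forwarded, nexthop neighbor exists */
    FIB_NOT_FORWARDED  /* rc > 0: one of the BPF_FIB_LKUP_RET_ codes */
};

static enum fib_outcome classify_fib_result(int rc)
{
    if (rc < 0)
        return FIB_BAD_INPUT;
    if (rc == 0)
        return FIB_FORWARD;
    return FIB_NOT_FORWARDED;
}
```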
1707
1708       int  bpf_sock_hash_update(struct  bpf_sock_ops  *skops,  struct bpf_map
1709       *map, void *key, u64 flags)
1710
1711              Description
1712                     Add an entry to, or update  a  sockhash  map  referencing
1713                     sockets.   The skops is used as a new value for the entry
1714                     associated to key. flags is one of:
1715
1716                     BPF_NOEXIST
1717                            The entry for key must not exist in the map.
1718
1719                     BPF_EXIST
1720                            The entry for key must already exist in the map.
1721
1722                     BPF_ANY
1723                            No condition on the existence  of  the  entry  for
1724                            key.
1725
1726                     If  the map has eBPF programs (parser and verdict), those
1727                     will be inherited by  the  socket  being  added.  If  the
1728                     socket is already attached to eBPF programs, this results
1729                     in an error.
1730
1731              Return 0 on success, or a negative error in case of failure.
1732
1733       int bpf_msg_redirect_hash(struct sk_msg_buff *msg, struct bpf_map *map,
1734       void *key, u64 flags)
1735
1736              Description
1737                     This  helper is used in programs implementing policies at
1738                     the socket level. If the message msg is allowed  to  pass
1739                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1740                     rect  it  to  the  socket  referenced  by  map  (of  type
1741                     BPF_MAP_TYPE_SOCKHASH)  using  hash key. Both ingress and
1742                     egress  interfaces  can  be  used  for  redirection.  The
1743                     BPF_F_INGRESS value in flags is used to make the distinc‐
1744                     tion (ingress path is selected if the  flag  is  present,
1745                     egress  path  otherwise). This is the only flag supported
1746                     for now.
1747
1748              Return SK_PASS on success, or SK_DROP on error.
1749
1750       int bpf_sk_redirect_hash(struct sk_buff *skb, struct bpf_map *map, void
1751       *key, u64 flags)
1752
1753              Description
1754                     This  helper is used in programs implementing policies at
1755                     the skb socket level. If the sk_buff skb  is  allowed  to
1756                     pass  (i.e.  if  the  verdict  eBPF   program   returns
1757                     SK_PASS), redirect it to the socket referenced by map (of
1758                     type  BPF_MAP_TYPE_SOCKHASH) using hash key. Both ingress
1759                     and egress interfaces can be used  for  redirection.  The
1760                     BPF_F_INGRESS value in flags is used to make the distinc‐
1761                     tion (ingress path is selected if the  flag  is  present,
1762                     egress  otherwise).  This  is the only flag supported for
1763                     now.
1764
1765              Return SK_PASS on success, or SK_DROP on error.
1766
1767       int bpf_lwt_push_encap(struct sk_buff *skb, u32 type,  void  *hdr,  u32
1768       len)
1769
1770              Description
1771                     Encapsulate the packet associated to skb within a Layer 3
1772                     protocol header. This header is provided in the buffer at
1773                     address  hdr,  with len its size in bytes. type indicates
1774                     the protocol of the header and can be one of:
1775
1776                     BPF_LWT_ENCAP_SEG6
1777                            IPv6 encapsulation  with  Segment  Routing  Header
1778                            (struct  ipv6_sr_hdr).  hdr only contains the SRH,
1779                            the IPv6 header is computed by the kernel.
1780
1781                     BPF_LWT_ENCAP_SEG6_INLINE
1782                            Only works if skb contains an IPv6 packet.  Insert
1783                            a  Segment  Routing  Header  (struct  ipv6_sr_hdr)
1784                            inside the IPv6 header.
1785
1786                     BPF_LWT_ENCAP_IP
1787                            IP  encapsulation  (GRE/GUE/IPIP/etc).  The  outer
1788                            header  must  be IPv4 or IPv6, followed by zero or
1789                            more additional headers, up  to  LWT_BPF_MAX_HEAD‐
1790                            ROOM  total bytes in all prepended headers. Please
1791                            note that if skb_is_gso(skb) is true, no more than
1792                            two  headers  can  be  prepended,  and  the  inner
1793                            header,  if  present,  should  be  either  GRE  or
1794                            UDP/GUE.
1795
1796                     BPF_LWT_ENCAP_SEG6*  types  can be called by BPF programs
1797                     of type BPF_PROG_TYPE_LWT_IN; BPF_LWT_ENCAP_IP  type  can
1798                     be  called  by BPF programs of types BPF_PROG_TYPE_LWT_IN
1799                     and BPF_PROG_TYPE_LWT_XMIT.
1800
1801                     A call to this helper is susceptible to change the under‐
1802                     lying  packet buffer. Therefore, at load time, all checks
1803                     on pointers previously done by the verifier  are  invali‐
1804                     dated  and must be performed again, if the helper is used
1805                     in combination with direct packet access.
1806
1807              Return 0 on success, or a negative error in case of failure.
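
                     As a rough illustration of the BPF_LWT_ENCAP_SEG6  case,
                     the sketch below computes the len value for an SRH that
                     carries a given number of segments and no TLVs.  struct
                     sketch_srh is a hypothetical user space mirror of the
                     fixed part of struct ipv6_sr_hdr, and the sizing rule (8
                     fixed bytes plus 16 bytes per segment, hdrlen counted in
                     8-byte units past the first 8 bytes) is assumed from the
                     Segment Routing Header format, not taken from this page.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the fixed part of struct ipv6_sr_hdr. */
struct sketch_srh {
	uint8_t  nexthdr;
	uint8_t  hdrlen;        /* length in 8-byte units, first 8 bytes excluded */
	uint8_t  type;          /* routing type, 4 for segment routing */
	uint8_t  segments_left;
	uint8_t  first_segment; /* index of the last entry in the segment list */
	uint8_t  flags;
	uint16_t tag;
};

/* Total SRH size in bytes for nsegs IPv6 segments and no TLVs. */
static uint32_t sketch_srh_len(uint8_t nsegs)
{
	return (uint32_t)sizeof(struct sketch_srh) + 16u * nsegs;
}

/* Fill the fixed fields so that they agree with the computed length.
 * Assumes 1 <= nsegs <= 127; nexthdr is not modelled and left at 0. */
static void sketch_srh_init(struct sketch_srh *srh, uint8_t nsegs)
{
	srh->nexthdr = 0;
	srh->hdrlen = (uint8_t)(2 * nsegs);   /* 16 bytes per segment = 2 units */
	srh->type = 4;
	srh->segments_left = (uint8_t)(nsegs - 1);
	srh->first_segment = (uint8_t)(nsegs - 1);
	srh->flags = 0;
	srh->tag = 0;
}
```

                     In a real program the segment addresses would follow the
                     fixed part in the buffer passed as hdr, and the value of
                     sketch_srh_len() would be passed as len.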
1808
1809       int bpf_lwt_seg6_store_bytes(struct sk_buff  *skb,  u32  offset,  const
1810       void *from, u32 len)
1811
1812              Description
1813                     Store len bytes from address from into the packet associ‐
1814                     ated to skb, at offset. Only  the  flags,  tag  and  TLVs
1815                     inside  the  outermost IPv6 Segment Routing Header can be
1816                     modified through this helper.
1817
1818                     A call to this helper is susceptible to change the under‐
1819                     lying  packet buffer. Therefore, at load time, all checks
1820                     on pointers previously done by the verifier  are  invali‐
1821                     dated  and must be performed again, if the helper is used
1822                     in combination with direct packet access.
1823
1824              Return 0 on success, or a negative error in case of failure.
1825
1826       int bpf_lwt_seg6_adjust_srh(struct sk_buff *skb, u32 offset, s32 delta)
1827
1828              Description
1829                     Adjust the size allocated to TLVs in the  outermost  IPv6
1830                     Segment Routing Header contained in the packet associated
1831                     to skb, at position offset by delta bytes.  Only  offsets
1832                     after  the  segments  are  accepted.  delta can be either
1833                     positive (growing) or negative (shrinking).
1834
1835                     A call to this helper is susceptible to change the under‐
1836                     lying  packet buffer. Therefore, at load time, all checks
1837                     on pointers previously done by the verifier  are  invali‐
1838                     dated  and must be performed again, if the helper is used
1839                     in combination with direct packet access.
1840
1841              Return 0 on success, or a negative error in case of failure.
1842
1843       int bpf_lwt_seg6_action(struct sk_buff *skb, u32 action,  void  *param,
1844       u32 param_len)
1845
1846              Description
1847                     Apply  an  IPv6  Segment Routing action of type action to
1848                     the packet associated to skb. Each action takes a parame‐
1849                     ter  contained  at address param, and of length param_len
1850                     bytes.  action can be one of:
1851
1852                     SEG6_LOCAL_ACTION_END_X
1853                            End.X action: Endpoint with Layer-3 cross-connect.
1854                            Type of param: struct in6_addr.
1855
1856                     SEG6_LOCAL_ACTION_END_T
1857                            End.T  action:  Endpoint  with specific IPv6 table
1858                            lookup.  Type of param: int.
1859
1860                     SEG6_LOCAL_ACTION_END_B6
1861                            End.B6 action: Endpoint bound to an  SRv6  policy.
1862                            Type of param: struct ipv6_sr_hdr.
1863
1864                     SEG6_LOCAL_ACTION_END_B6_ENCAP
1865                            End.B6.Encap  action:  Endpoint  bound  to an SRv6
1866                            encapsulation  policy.   Type  of  param:   struct
1867                            ipv6_sr_hdr.
1868
1869                     A call to this helper is susceptible to change the under‐
1870                     lying packet buffer. Therefore, at load time, all  checks
1871                     on  pointers  previously done by the verifier are invali‐
1872                     dated and must be performed again, if the helper is  used
1873                     in combination with direct packet access.
1874
1875              Return 0 on success, or a negative error in case of failure.
1876
1877       int bpf_rc_repeat(void *ctx)
1878
1879              Description
1880                     This helper is used in programs implementing IR decoding,
1881                     to report a successfully decoded repeat key message. This
1882                     delays the generation of a key up event for  the  previ‐
1883                     ously generated key down event.
1884
1885                     Some  IR  protocols,  such as NEC, have a special IR mes‐
1886                     sage for repeating the last button while it is held down.
1887
1888                     The  ctx  should  point to the lirc sample as passed into
1889                     the program.
1890
1891                     This helper is only available if the kernel was  compiled
1892                     with  the  CONFIG_BPF_LIRC_MODE2 configuration option set
1893                     to "y".
1894
1895              Return 0
1896
1897       int bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle)
1898
1899              Description
1900                     This helper is used in programs implementing IR decoding,
1901                     to report a successfully decoded key press with scancode,
1902                     toggle value in the given protocol. The scancode will  be
1903                     translated to a keycode using the rc keymap, and reported
1904                     as an input key down event. After a period a key up event
1905                     is  generated.  This  period  can  be extended by calling
1906                     either bpf_rc_keydown() again with the  same  values,  or
1907                     calling bpf_rc_repeat().
1908
1909                     Some  protocols  include a toggle bit, in case the button
1910                     was released and pressed again between consecutive  scan‐
1911                     codes.
1912
1913                     The  ctx  should  point to the lirc sample as passed into
1914                     the program.
1915
1916                     The protocol is the decoded  protocol  number  (see  enum
1917                     rc_proto for some predefined values).
1918
1919                     This  helper is only available if the kernel was compiled
1920                     with the CONFIG_BPF_LIRC_MODE2 configuration  option  set
1921                     to "y".
1922
1923              Return 0
1924
1925       u64 bpf_skb_cgroup_id(struct sk_buff *skb)
1926
1927              Description
1928                     Return the cgroup v2 id of the socket associated with the
1929                     skb.  This is roughly similar to the bpf_get_cgroup_clas‐
1930                     sid() helper for cgroup v1: it provides a tag or  identi‐
1931                     fier  that can be matched on or used for map lookups, for
1932                     example to implement policy. The cgroup v2 id of a  given
1933                     path in the hierarchy is exposed in  user  space  through
1934                     the f_handle API in order to get to the same 64-bit id.
1935
1936                     This  helper  can  be  used on TC egress path, but not on
1937                     ingress, and is available only if the kernel was compiled
1938                     with the CONFIG_SOCK_CGROUP_DATA configuration option.
1939
1940              Return The  id  is  returned  or  0  in case the id could not be
1941                     retrieved.
1942
1943       u64 bpf_get_current_cgroup_id(void)
1944
1945              Return A 64-bit integer containing the current cgroup  id  based
1946                     on the cgroup within which the current task is running.
1947
1948       void *bpf_get_local_storage(void *map, u64 flags)
1949
1950              Description
1951                     Get  the pointer to the local storage area.  The type and
1952                     the size of the local storage is defined by the map argu‐
1953                     ment.   The  flags meaning is specific for each map type,
1954                     and has to be 0 for cgroup local storage.
1955
1956                     Depending on the BPF program type, a local  storage  area
1957                     can  be shared between multiple instances of the BPF pro‐
1958                     gram, running simultaneously.
1959
1960                     A user should care about the synchronization by  himself.
1961                     For  example,  by  using  the BPF_STX_XADD instruction to
1962                     alter the shared data.
1963
1964              Return A pointer to the local storage area.
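
                     A minimal sketch of the synchronization advice above.  It
                     assumes the usual way of requesting an atomic add from C,
                     the __sync_fetch_and_add() compiler builtin, and it runs
                     as ordinary user space C here.  The structure and function
                     names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical layout of the value kept in the local storage area. */
struct sketch_counters {
	uint64_t packets;
	uint64_t bytes;
};

/* Update counters that several program instances may share.
 * Each field is changed with an atomic add instead of a plain "+=". */
static void sketch_storage_account(struct sketch_counters *storage, uint64_t len)
{
	__sync_fetch_and_add(&storage->packets, 1);
	__sync_fetch_and_add(&storage->bytes, len);
}
```

                     Each add is atomic on its own; the two fields are not
                     updated as one unit, so a reader may briefly observe one
                     without the other.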
1965
1966       int  bpf_sk_select_reuseport(struct  sk_reuseport_md   *reuse,   struct
1967       bpf_map *map, void *key, u64 flags)
1968
1969              Description
1970                     Select  a  SO_REUSEPORT socket from a BPF_MAP_TYPE_REUSE‐
1971                     PORT_ARRAY map.  It  checks  that  the  selected  socket
1972                     matches the incoming request in the socket buffer.
1973
1974              Return 0 on success, or a negative error in case of failure.
1975
1976       u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level)
1977
1978              Description
1979                     Return the id of the cgroup v2 that is the ancestor,  at
1980                     ancestor_level, of the cgroup associated with  the  skb.
1981                     The root cgroup is at ancestor_level zero and each  step
1982                     down  the  hierarchy  increments  the  level.  If  ances‐
1983                     tor_level equals the level of the cgroup associated with
1984                     skb, the return value is that of bpf_skb_cgroup_id().
1985
1986                     The  helper  is  useful  to  implement policies based on
1987                     cgroups that are higher in the hierarchy than the immedi‐
1988                     ate cgroup associated with skb.
1989
1990                     The  format of the returned id and the helper limitations
1991                     are the same as in bpf_skb_cgroup_id().
1992
1993              Return The  id  is  returned  or  0  in case the id could not be
1994                     retrieved.
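
                     The level numbering described above can be modelled in a
                     few lines.  This is only a user space sketch: struct
                     sketch_cgroup and its ids[] path array are hypothetical
                     and do not reflect how the kernel stores cgroups.

```c
#include <stdint.h>

/* Hypothetical cgroup: ids[] lists the ids on the path from the root
 * (index 0) down to the cgroup itself (index level). */
struct sketch_cgroup {
	int level;
	uint64_t ids[8];
};

/* Return the id of the ancestor at ancestor_level, or 0 if there is none. */
static uint64_t sketch_ancestor_id(const struct sketch_cgroup *cg,
				   int ancestor_level)
{
	if (ancestor_level < 0 || ancestor_level > cg->level)
		return 0;
	return cg->ids[ancestor_level];
}
```

                     Asking for the cgroup's own level yields its own id, which
                     corresponds to the bpf_skb_cgroup_id() case.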
1995
1996       struct bpf_sock  *bpf_sk_lookup_tcp(void  *ctx,  struct  bpf_sock_tuple
1997       *tuple, u32 tuple_size, u64 netns, u64 flags)
1998
1999              Description
2000                     Look for TCP socket matching tuple, optionally in a child
2001                     network  namespace  netns.  The  return  value  must   be
2002                     checked, and if non-NULL, released via bpf_sk_release().
2003
2004                     The  ctx should point to the context of the program, such
2005                     as the skb or socket (depending on the hook in use). This
2006                     is  used  to determine the base network namespace for the
2007                     lookup.
2008
2009                     tuple_size must be one of:
2010
2011                     sizeof(tuple->ipv4)
2012                            Look for an IPv4 socket.
2013
2014                     sizeof(tuple->ipv6)
2015                            Look for an IPv6 socket.
2016
2017                     If the netns is a negative signed  32-bit  integer,  then
2018                     the  socket lookup table in the netns associated with the
2019                     ctx will be used. For the TC  hooks,  this  is  the
2020                     netns of the device in the skb. For socket hooks, this is
2021                     the netns of the socket.  If netns is  any  other  signed
2022                     32-bit value greater than or equal to zero then it speci‐
2023                     fies the ID of the netns relative to the netns associated
2024                     with  the  ctx.  netns  values beyond the range of 32-bit
2025                     integers are reserved for future use.
2026
2027                     All values for flags are reserved for future  usage,  and
2028                     must be left at zero.
2029
2030                     This  helper is available only if the kernel was compiled
2031                     with CONFIG_NET configuration option.
2032
2033              Return Pointer to struct bpf_sock, or NULL in case  of  failure.
2034                     For  sockets  with  reuseport option, the struct bpf_sock
2035                     result is from  reuse->socks[]  using  the  hash  of  the
2036                     tuple.
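
                     To make the two accepted tuple_size values concrete, the
                     sketch below uses struct sketch_sock_tuple, a hypothetical
                     user space mirror of struct bpf_sock_tuple (the field
                     layout is assumed, not quoted from this page), and picks
                     the size by address family.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct bpf_sock_tuple. */
struct sketch_sock_tuple {
	union {
		struct {
			uint32_t saddr, daddr;
			uint16_t sport, dport;
		} ipv4;
		struct {
			uint32_t saddr[4], daddr[4];
			uint16_t sport, dport;
		} ipv6;
	};
};

/* Value to pass as tuple_size for an IPv4 or an IPv6 lookup. */
static uint32_t sketch_tuple_size(int is_ipv6)
{
	struct sketch_sock_tuple *tuple = NULL;  /* used only inside sizeof */

	return is_ipv6 ? (uint32_t)sizeof(tuple->ipv6)
		       : (uint32_t)sizeof(tuple->ipv4);
}
```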
2037
2038       struct  bpf_sock  *bpf_sk_lookup_udp(void  *ctx,  struct bpf_sock_tuple
2039       *tuple, u32 tuple_size, u64 netns, u64 flags)
2040
2041              Description
2042                     Look for UDP socket matching tuple, optionally in a child
2043                     network   namespace  netns.  The  return  value  must  be
2044                     checked, and if non-NULL, released via bpf_sk_release().
2045
2046                     The ctx should point to the context of the program,  such
2047                     as the skb or socket (depending on the hook in use). This
2048                     is used to determine the base network namespace  for  the
2049                     lookup.
2050
2051                     tuple_size must be one of:
2052
2053                     sizeof(tuple->ipv4)
2054                            Look for an IPv4 socket.
2055
2056                     sizeof(tuple->ipv6)
2057                            Look for an IPv6 socket.
2058
2059                     If  the  netns  is a negative signed 32-bit integer, then
2060                     the socket lookup table in the netns associated with  the
2061                     ctx  will  be  used.  For the TC hooks, this is the
2062                     netns of the device in the skb. For socket hooks, this is
2063                     the  netns  of  the socket.  If netns is any other signed
2064                     32-bit value greater than or equal to zero then it speci‐
2065                     fies the ID of the netns relative to the netns associated
2066                     with the ctx. netns values beyond  the  range  of  32-bit
2067                     integers are reserved for future use.
2068
2069                     All  values  for flags are reserved for future usage, and
2070                     must be left at zero.
2071
2072                     This helper is available only if the kernel was  compiled
2073                     with CONFIG_NET configuration option.
2074
2075              Return Pointer  to  struct bpf_sock, or NULL in case of failure.
2076                     For sockets with reuseport option,  the  struct  bpf_sock
2077                     result  is  from  reuse->socks[]  using  the  hash of the
2078                     tuple.
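
                     The three ranges of the netns argument described above can
                     be written as a small classifier.  The sketch follows the
                     wording of this page rather than the kernel source, and
                     the enum and function names are hypothetical.

```c
#include <stdint.h>

enum sketch_netns_kind {
	SKETCH_NETNS_CURRENT,   /* use the netns associated with ctx */
	SKETCH_NETNS_BY_ID,     /* netns is an ID relative to that netns */
	SKETCH_NETNS_RESERVED   /* outside the signed 32-bit range */
};

static enum sketch_netns_kind sketch_classify_netns(uint64_t netns)
{
	/* 0 .. INT32_MAX: a non-negative signed 32-bit value. */
	if (netns <= (uint64_t)INT32_MAX)
		return SKETCH_NETNS_BY_ID;
	/* Sign-extended negative 32-bit values occupy the top of the
	 * 64-bit range, from 0xffffffff80000000 upwards. */
	if (netns >= (uint64_t)(int64_t)INT32_MIN)
		return SKETCH_NETNS_CURRENT;
	return SKETCH_NETNS_RESERVED;
}
```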
2079
2080       int bpf_sk_release(struct bpf_sock *sock)
2081
2082              Description
2083                     Release the reference  held  by  sock.  sock  must  be  a
2084                     non-NULL     pointer     that     was    returned    from
2085                     bpf_sk_lookup_xxx().
2086
2087              Return 0 on success, or a negative error in case of failure.
2088
2089       int bpf_map_push_elem(struct  bpf_map  *map,  const  void  *value,  u64
2090       flags)
2091
2092              Description
2093                     Push an element value in map. flags is one of:
2094
2095                     BPF_EXIST
2096                            If  the queue/stack is full, the oldest element is
2097                            removed to make room for this.
2098
2099              Return 0 on success, or a negative error in case of failure.
2100
2101       int bpf_map_pop_elem(struct bpf_map *map, void *value)
2102
2103              Description
2104                     Pop an element from map.
2105
2106              Return 0 on success, or a negative error in case of failure.
2107
2108       int bpf_map_peek_elem(struct bpf_map *map, void *value)
2109
2110              Description
2111                     Get an element from map without removing it.
2112
2113              Return 0 on success, or a negative error in case of failure.
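
                     The push, pop and peek semantics above can be illustrated
                     with a small user space model of a queue.  Everything in
                     it is a stand-in: SKETCH_F_EXIST plays the role of
                     BPF_EXIST, the capacity is arbitrary, and the specific
                     error codes are chosen for illustration only.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_CAP     3      /* hypothetical capacity of the map */
#define SKETCH_F_EXIST 2ULL   /* stands in for BPF_EXIST */

struct sketch_queue {
	uint32_t slots[SKETCH_CAP];
	unsigned int head;    /* index of the oldest element */
	unsigned int count;   /* number of stored elements */
};

/* Push value; when full, succeed only if the EXIST flag was passed. */
static int sketch_push(struct sketch_queue *q, uint32_t value, uint64_t flags)
{
	if (flags & ~SKETCH_F_EXIST)
		return -EINVAL;
	if (q->count == SKETCH_CAP) {
		if (!(flags & SKETCH_F_EXIST))
			return -E2BIG;
		q->head = (q->head + 1) % SKETCH_CAP;   /* drop the oldest */
		q->count--;
	}
	q->slots[(q->head + q->count) % SKETCH_CAP] = value;
	q->count++;
	return 0;
}

/* Copy the oldest element into *value without removing it. */
static int sketch_peek(const struct sketch_queue *q, uint32_t *value)
{
	if (q->count == 0)
		return -ENOENT;
	*value = q->slots[q->head];
	return 0;
}

/* Copy the oldest element into *value and remove it. */
static int sketch_pop(struct sketch_queue *q, uint32_t *value)
{
	int err = sketch_peek(q, value);

	if (err)
		return err;
	q->head = (q->head + 1) % SKETCH_CAP;
	q->count--;
	return 0;
}
```

                     A stack would differ only in which end pop and peek read
                     from.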
2114
2115       int bpf_msg_push_data(struct sk_msg_buff *msg, u32 start, u32 len,  u64
2116       flags)
2117
2118              Description
2119                     For  socket policies, insert len bytes into msg at offset
2120                     start.
2121
2122                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg
2123                     it  may  want to insert metadata or options into the msg.
2124                     This can later be read and used by any of the lower layer
2125                     BPF hooks.
2126
2127                     This helper may fail under memory pressure (if an alloca‐
2128                     tion fails); in that case the BPF program gets an  appro‐
2129                     priate error, which it needs to handle.
2130
2131              Return 0 on success, or a negative error in case of failure.
2132
2133       int  bpf_msg_pop_data(struct  sk_msg_buff *msg, u32 start, u32 len, u64
2134       flags)
2135
2136              Description
2137                     Remove  len  bytes from msg, starting at byte start. This
2138                     may result in ENOMEM errors under certain  situations  if
2139                     an  allocation  and  copy are required due to a full ring
2140                     buffer.  However, the helper will try to avoid doing  the
2141                     allocation  if  possible. Other errors can occur if input
2142                     parameters are invalid, either because the start byte  is
2143                     not  a  valid  part of the msg payload and/or because the
2144                     pop value is too large.
2145
2146              Return 0 on success, or a negative error in case of failure.
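
                     The effect of pushing and popping data at an offset can be
                     pictured on a flat byte buffer.  The sketch below is a
                     user space model only: struct sketch_msg, the fixed buffer
                     size, the zero fill of inserted bytes and the error codes
                     are all assumptions made for the example.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MSG_MAX 64     /* hypothetical buffer capacity */

struct sketch_msg {
	uint8_t data[SKETCH_MSG_MAX];
	uint32_t size;            /* bytes currently in use */
};

/* Insert len zero bytes at offset start, shifting the tail right. */
static int sketch_push_data(struct sketch_msg *m, uint32_t start, uint32_t len)
{
	if (start > m->size)
		return -EINVAL;
	if (len > SKETCH_MSG_MAX - m->size)
		return -ENOMEM;
	memmove(m->data + start + len, m->data + start, m->size - start);
	memset(m->data + start, 0, len);
	m->size += len;
	return 0;
}

/* Remove len bytes at offset start, shifting the tail left. */
static int sketch_pop_data(struct sketch_msg *m, uint32_t start, uint32_t len)
{
	if (start >= m->size || len > m->size - start)
		return -EINVAL;
	memmove(m->data + start, m->data + start + len,
		m->size - start - len);
	m->size -= len;
	return 0;
}
```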
2147
2148       int bpf_rc_pointer_rel(void *ctx, s32 rel_x, s32 rel_y)
2149
2150              Description
2151                     This helper is used in programs implementing IR decoding,
2152                     to report a successfully decoded pointer movement.
2153
2154                     The ctx should point to the lirc sample  as  passed  into
2155                     the program.
2156
2157                     This  helper is only available if the kernel was compiled
2158                     with the CONFIG_BPF_LIRC_MODE2 configuration  option  set
2159                     to "y".
2160
2161              Return 0
2162
2163       int bpf_spin_lock(struct bpf_spin_lock *lock)
2164
2165              Description
2166                     Acquire a spinlock represented by the pointer lock, which
2167                     is stored as part of a value of a map.  Taking  the  lock
2168                     makes it safe to update the rest of the  fields  in  that
2169                     value. The spinlock can (and must) later be released with
2170                     a call to bpf_spin_unlock(lock).
2171
2172                     Spinlocks  in BPF programs come with a number of restric‐
2173                     tions and constraints:
2174
2175                     · bpf_spin_lock objects are only allowed inside  maps  of
2176                       types  BPF_MAP_TYPE_HASH  and  BPF_MAP_TYPE_ARRAY (this
2177                       list could be extended in the future).
2178
2179                     · BTF description of the map is mandatory.
2180
2181                     · The BPF program can take ONE lock at a time, since tak‐
2182                       ing two or more could cause deadlocks.
2183
2184                     · Only  one  struct bpf_spin_lock is allowed per map ele‐
2185                       ment.
2186
2187                     · When the lock is taken, calls (either  BPF  to  BPF  or
2188                       helpers) are not allowed.
2189
2190                     · The  BPF_LD_ABS  and  BPF_LD_IND  instructions  are not
2191                       allowed inside a spinlock-ed region.
2192
2193                     · The BPF program MUST call bpf_spin_unlock() to  release
2194                       the lock, on all execution paths, before it returns.
2195
2196                     · The  BPF  program  can access struct bpf_spin_lock only
2197                       via the bpf_spin_lock() and bpf_spin_unlock()  helpers.
2198                       Loading  or  storing data into the struct bpf_spin_lock
2199                       lock; field of a map is not allowed.
2200
2201                     · To use the bpf_spin_lock() helper, the BTF  description
2202                       of  the  map  value  must  be  a struct and have struct
2203                       bpf_spin_lock anyname; field at the top level.   Nested
2204                       lock inside another struct is not allowed.
2205
2206                     · The struct bpf_spin_lock lock field in a map value must
2207                       be aligned on a multiple of 4 bytes in that value.
2208
2209                     · Syscall with command BPF_MAP_LOOKUP_ELEM does not  copy
2210                       the bpf_spin_lock field to user space.
2211
2212                     · Syscall  with  command  BPF_MAP_UPDATE_ELEM,  or update
2213                       from a BPF program, do  not  update  the  bpf_spin_lock
2214                       field.
2215
2216                     · bpf_spin_lock  cannot  be on the stack or inside a net‐
2217                       working packet (it can only be inside a map value).
2218
2219                     · bpf_spin_lock is available to root only.
2220
2221                     · Tracing programs and socket filter programs cannot  use
2222                       bpf_spin_lock()  due  to insufficient preemption checks
2223                       (but this may change in the future).
2224
2225                     · bpf_spin_lock  is  not  allowed  in   inner   maps   of
2226                       map-in-map.
2227
2228              Return 0
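
                     Several of the constraints above (a single lock per value,
                     placed at the top level of the value, 4-byte aligned, and
                     released on every path) can be seen in a small user space
                     model.  It is not BPF code: struct sketch_spin_lock stands
                     in for struct bpf_spin_lock, assumed here to be a single
                     32-bit word, and sketch_lock() and sketch_unlock() stand
                     in for the two helpers, using C11 atomics.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* User space stand-in for struct bpf_spin_lock. */
struct sketch_spin_lock {
	_Atomic uint32_t val;
};

/* Hypothetical map value: one lock, at the top level of the struct. */
struct sketch_value {
	struct sketch_spin_lock lock;
	uint64_t packets;
	uint64_t bytes;
};

_Static_assert(offsetof(struct sketch_value, lock) % 4 == 0,
	       "lock field must sit on a multiple of 4 bytes");

static void sketch_value_init(struct sketch_value *v)
{
	atomic_init(&v->lock.val, 0);
	v->packets = 0;
	v->bytes = 0;
}

static void sketch_lock(struct sketch_spin_lock *l)
{
	while (atomic_exchange(&l->val, 1) != 0)
		;   /* spin until the previous owner releases the lock */
}

static void sketch_unlock(struct sketch_spin_lock *l)
{
	atomic_store(&l->val, 0);
}

/* Update both fields as one unit; no calls are made while locked. */
static void sketch_update(struct sketch_value *v, uint64_t len)
{
	sketch_lock(&v->lock);
	v->packets += 1;
	v->bytes += len;
	sketch_unlock(&v->lock);
}
```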
2229
2230       int bpf_spin_unlock(struct bpf_spin_lock *lock)
2231
2232              Description
2233                     Release   the   lock  previously  locked  by  a  call  to
2234                     bpf_spin_lock(lock).
2235
2236              Return 0
2237
2238       struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)
2239
2240              Description
2241                     This helper gets a struct bpf_sock pointer such that  all
2242                     the fields in this bpf_sock can be accessed.
2243
2244              Return A  struct bpf_sock pointer on success, or NULL in case of
2245                     failure.
2246
2247       struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk)
2248
2249              Description
2250                     This helper gets a struct  bpf_tcp_sock  pointer  from  a
2251                     struct bpf_sock pointer.
2252
2253              Return A struct bpf_tcp_sock pointer on success, or NULL in case
2254                     of failure.
2255
2256       int bpf_skb_ecn_set_ce(struct sk_buff *skb)
2257
2258              Description
2259                     Set ECN (Explicit Congestion Notification)  field  of  IP
2260                     header to CE (Congestion Encountered) if current value is
2261                     ECT (ECN Capable Transport). Otherwise, do nothing. Works
2262                     with IPv6 and IPv4.
2263
2264              Return 1  if  the  CE  flag is set (either by the current helper
2265                     call or because it was already present), 0 if it  is  not
2266                     set.
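
                     The return value rule above can be sketched on the traffic
                     class byte alone.  The code below assumes the standard ECN
                     encoding in the two low-order bits (00 Not-ECT, 01 and 10
                     ECT, 11 CE), treats Not-ECT as the "do nothing" case, and
                     ignores the IPv4 checksum update that a real
                     implementation needs.  All names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_ECN_MASK    0x3u
#define SKETCH_ECN_NOT_ECT 0x0u
#define SKETCH_ECN_CE      0x3u

/* Returns 1 if the CE mark is present when the function returns,
 * 0 if it is not (the field was Not-ECT and is left untouched). */
static int sketch_ecn_set_ce(uint8_t *tos)
{
	uint8_t ecn = (uint8_t)(*tos & SKETCH_ECN_MASK);

	if (ecn == SKETCH_ECN_CE)
		return 1;            /* already marked */
	if (ecn == SKETCH_ECN_NOT_ECT)
		return 0;            /* not ECN capable: do nothing */
	*tos = (uint8_t)(*tos | SKETCH_ECN_CE);
	return 1;
}
```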
2267
2268       struct bpf_sock *bpf_get_listener_sock(struct bpf_sock *sk)
2269
2270              Description
2271                     Return  a  struct  bpf_sock  pointer in TCP_LISTEN state.
2272                     bpf_sk_release() is unnecessary and not allowed.
2273
2274              Return A struct bpf_sock pointer on success, or NULL in case  of
2275                     failure.
2276
2277       struct  bpf_sock  *bpf_skc_lookup_tcp(void  *ctx, struct bpf_sock_tuple
2278       *tuple, u32 tuple_size, u64 netns, u64 flags)
2279
2280              Description
2281                     Look for TCP socket matching tuple, optionally in a child
2282                     network   namespace  netns.  The  return  value  must  be
2283                     checked, and if non-NULL, released via bpf_sk_release().
2284
2285                     This function is identical to bpf_sk_lookup_tcp(), except
2286                     that  it  also  returns  timewait or request sockets. Use
2287                     bpf_sk_fullsock() or bpf_tcp_sock() to  access  the  full
2288                     structure.
2289
2290                     This  helper is available only if the kernel was compiled
2291                     with CONFIG_NET configuration option.
2292
2293              Return Pointer to struct bpf_sock, or NULL in case  of  failure.
2294                     For  sockets  with  reuseport option, the struct bpf_sock
2295                     result is from  reuse->socks[]  using  the  hash  of  the
2296                     tuple.
2297
2298       int   bpf_tcp_check_syncookie(struct   bpf_sock  *sk,  void  *iph,  u32
2299       iph_len, struct tcphdr *th, u32 th_len)
2300
2301              Description
2302                     Check whether iph and th contain a valid SYN  cookie  ACK
2303                     for the listening socket in sk.
2304
2305                     iph points to the start of the IPv4 or IPv6 header, while
2306                     iph_len contains sizeof(struct  iphdr)  or  sizeof(struct
2307                     ipv6hdr).
2308
2309                     th  points  to  the start of the TCP header, while th_len
2310                     contains sizeof(struct tcphdr).
2311
2312              Return 0 if iph and th are a valid SYN cookie ACK, or a negative
2313                     error otherwise.
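
                     One way to choose iph_len is to look at the version field
                     in the first byte of the IP header.  The sketch below
                     assumes the fixed header sizes of 20 bytes for IPv4 and 40
                     bytes for IPv6 and does not account for IPv4 options; the
                     function name is hypothetical.

```c
#include <stdint.h>

/* Return the value to pass as iph_len, or 0 for an unknown version. */
static uint32_t sketch_iph_len(const uint8_t *iph)
{
	switch (iph[0] >> 4) {        /* IP version is the high nibble */
	case 4:
		return 20;            /* fixed IPv4 header */
	case 6:
		return 40;            /* fixed IPv6 header */
	default:
		return 0;
	}
}
```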
2314
2315       int  bpf_sysctl_get_name(struct  bpf_sysctl  *ctx,  char  *buf,  size_t
2316       buf_len, u64 flags)
2317
2318              Description
2319                     Get  the name of the sysctl in /proc/sys/ and copy it into
2320                     the buffer buf of size buf_len provided by the program.
2321
2322                     The   buffer   is  always  NUL  terminated,  unless  it's
2323                     zero-sized.
2324
2325                     If flags is zero, full name (e.g. "net/ipv4/tcp_mem")  is
2326                     copied. Use BPF_F_SYSCTL_BASE_NAME flag to copy base name
2327                     only (e.g. "tcp_mem").
2328
2329              Return Number of characters copied (not including the  trailing
2330                     NUL).
2331
2332                     -E2BIG  if the buffer wasn't big enough (buf will contain
2333                     truncated name in this case).
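
                     The copy, truncation and return value rules above can be
                     modelled in user space as follows.  SKETCH_F_BASE_NAME
                     stands in for BPF_F_SYSCTL_BASE_NAME, the base name is
                     taken to be the last path component, and the handling of
                     a zero-sized buffer is an assumption of the sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_F_BASE_NAME 1ULL   /* stands in for BPF_F_SYSCTL_BASE_NAME */

/* Copy the full or base name into buf; return the number of characters
 * copied, or -E2BIG if the name had to be truncated. */
static int sketch_get_name(const char *full, char *buf, size_t buf_len,
			   uint64_t flags)
{
	const char *src = full;
	size_t len, n;

	if (flags & SKETCH_F_BASE_NAME) {
		const char *slash = strrchr(full, '/');

		if (slash)
			src = slash + 1;
	}
	if (buf_len == 0)
		return -E2BIG;        /* nothing can be stored, not even NUL */
	len = strlen(src);
	n = len < buf_len ? len : buf_len - 1;
	memcpy(buf, src, n);
	buf[n] = '\0';                /* always NUL terminated */
	return len < buf_len ? (int)n : -E2BIG;
}
```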
2334
2335       int bpf_sysctl_get_current_value(struct  bpf_sysctl  *ctx,  char  *buf,
2336       size_t buf_len)
2337
2338              Description
2339                     Get  current  value  of  sysctl  as  it  is  presented in
2340                     /proc/sys (incl. newline, etc), and copy it as  a  string
2341                     into the program-provided buffer buf of size buf_len.
2342
2343                     The  whole  value is copied, no matter what file position
2344                     user space issued e.g. sys_read at.
2345
2346                     The  buffer  is  always  NUL  terminated,   unless   it's
2347                     zero-sized.
2348
2349              Return Number  of  characters copied (not including the trailing
2350                     NUL).
2351
2352                     -E2BIG if the buffer wasn't big enough (buf will  contain
2353                     truncated value in this case).
2354
2355                     -EINVAL  if  current  value was unavailable, e.g. because
2356                     sysctl is uninitialized and read returns -EIO for it.
2357
2358       int bpf_sysctl_get_new_value(struct bpf_sysctl *ctx, char *buf,  size_t
2359       buf_len)
2360
2361              Description
2362                     Get  new  value  being  written  by  user space to sysctl
2363                     (before the actual write happens) and copy it as a string
2364                     into the program-provided buffer buf of size buf_len.
2365
2366                     User space may write new value at file position > 0.
2367
2368                     The   buffer   is  always  NUL  terminated,  unless  it's
2369                     zero-sized.
2370
2371              Return Number of characters copied (not  including  the  trailing
2372                     NUL).
2373
2374                     -E2BIG  if the buffer wasn't big enough (buf will contain
2375                     truncated value in this case).
2376
2377                     -EINVAL if sysctl is being read.
2378
2379       int bpf_sysctl_set_new_value(struct bpf_sysctl *ctx, const  char  *buf,
2380       size_t buf_len)
2381
2382              Description
2383                     Override the new value being written by user space to
2384                     sysctl with the value provided by the program in buffer
2385                     buf of size buf_len.
2386
2387                     buf should contain a string in the same form as provided
2388                     by user space on sysctl write.
2389
2390                     User space may write the new value at file position > 0.
2391                     To override the whole sysctl value, the file position
2392                     should be set to zero.
2393
2394              Return 0 on success.
2395
2396                     -E2BIG if buf_len is too big.
2397
2398                     -EINVAL if sysctl is being read.
2399
2400       int bpf_strtol(const char *buf, size_t buf_len, u64 flags, long *res)
2401
2402              Description
2403                     Convert the initial part of the string from buffer buf of
2404                     size  buf_len  to  a  long integer according to the given
2405                     base and save the result in res.
2406
2407                     The string may begin with an arbitrary  amount  of  white
2408                     space  (as determined by isspace(3)) followed by a single
2409                     optional '-' sign.
2410
2411                     Five least significant bits of flags encode  base,  other
2412                     bits are currently unused.
2413
2414                     Base must be 8, 10, 16, or 0; with 0 the base is detected
2415                     automatically, as with user space strtol(3).
2416
2417              Return Number of characters consumed on success. Must  be  posi‐
2418                     tive but no more than buf_len.
2419
2420                     -EINVAL if no valid digits were found or unsupported base
2421                     was provided.
2422
2423                     -ERANGE if resulting value was out of range.
2424
2425       int bpf_strtoul(const char *buf, size_t buf_len,  u64  flags,  unsigned
2426       long *res)
2427
2428              Description
2429                     Convert the initial part of the string from buffer buf of
2430                     size buf_len to an unsigned long integer according to the
2431                     given base and save the result in res.
2432
2433                     The  string  may  begin with an arbitrary amount of white
2434                     space (as determined by isspace(3)).
2435
2436                     Five least significant bits of flags encode  base,  other
2437                     bits are currently unused.
2438
2439                     Base must be 8, 10, 16, or 0; with 0 the base is detected
2440                     automatically, as with user space strtoul(3).
2441
2442              Return Number of characters consumed on success. Must  be  posi‐
2443                     tive but no more than buf_len.
2444
2445                     -EINVAL if no valid digits were found or unsupported base
2446                     was provided.
2447
2448                     -ERANGE if resulting value was out of range.
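
                     The base encoding in flags, shared by bpf_strtol() and
                     bpf_strtoul(), can be sketched in plain user-space C.
                     This is only an illustration, not kernel code: the names
                     STRTOX_BASE_MASK and strtox_base() are hypothetical, and
                     the sketch does not model how the kernel treats the
                     unused upper bits of flags.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical name: mask covering the five least significant bits. */
#define STRTOX_BASE_MASK 0x1fULL

/*
 * Extract the base from flags and check it against the documented set
 * {0, 8, 10, 16}. Returns the base, or -EINVAL if it is unsupported.
 */
static int strtox_base(uint64_t flags)
{
	int base = (int)(flags & STRTOX_BASE_MASK);

	if (base != 0 && base != 8 && base != 10 && base != 16)
		return -EINVAL;
	return base;
}
```

                     In an eBPF program, the base is simply passed as the
                     flags argument, for example 10 for decimal input.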
2449
2450       void *bpf_sk_storage_get(struct bpf_map *map, struct bpf_sock *sk, void
2451       *value, u64 flags)
2452
2453              Description
2454                     Get a bpf-local-storage from a sk.
2455
2456                     Logically, it can be thought of as getting the value from
2457                     a map with sk as the key. From this perspective, the
2458                     usage is not much different from bpf_map_lookup_elem(map,
2459                     &sk), except that this helper enforces that the key must
2460                     be a full socket and that the map must be of type
2461                     BPF_MAP_TYPE_SK_STORAGE.
2462
2463                     Underneath, the value is stored locally at sk instead  of
2464                     the  map.   The  map  is  used  as  the bpf-local-storage
2465                     "type". The bpf-local-storage "type" (i.e.  the  map)  is
2466                     searched against all bpf-local-storages residing at sk.
2467
2468                     An optional flag (BPF_SK_STORAGE_GET_F_CREATE) can be
2469                     used so that a new bpf-local-storage is created if one
2470                     does not exist. value can be used together with
2471                     BPF_SK_STORAGE_GET_F_CREATE to specify the initial value
2472                     of a bpf-local-storage. If value is NULL, the new
2473                     bpf-local-storage will be zero initialized.
2474
2475              Return A bpf-local-storage pointer is returned on success.
2476
2477                     NULL if not found or there was an error in adding  a  new
2478                     bpf-local-storage.
2479
2480       int bpf_sk_storage_delete(struct bpf_map *map, struct bpf_sock *sk)
2481
2482              Description
2483                     Delete a bpf-local-storage from a sk.
2484
2485              Return 0 on success.
2486
2487                     -ENOENT if the bpf-local-storage cannot be found.
2488
2489       int bpf_send_signal(u32 sig)
2490
2491              Description
2492                     Send  signal sig to the process of the current task.  The
2493                     signal may be delivered to any of this process's threads.
2494
2495              Return 0 on success or successfully queued.
2496
2497                     -EBUSY if work queue under nmi is full.
2498
2499                     -EINVAL if sig is invalid.
2500
2501                     -EPERM if no permission to send the sig.
2502
2503                     -EAGAIN if bpf program can try again.
2504
2505       s64 bpf_tcp_gen_syncookie(struct bpf_sock *sk, void *iph, u32  iph_len,
2506       struct tcphdr *th, u32 th_len)
2507
2508              Description
2509                     Try to issue a SYN cookie for the packet with correspond‐
2510                     ing IP/TCP headers, iph and th, on the  listening  socket
2511                     in sk.
2512
2513                     iph points to the start of the IPv4 or IPv6 header, while
2514                     iph_len contains sizeof(struct iphdr) or sizeof(struct
2515                     ipv6hdr).
2516
2517                     th  points  to  the start of the TCP header, while th_len
2518                     contains the length of the TCP header.
2519
2520              Return On success, the lower 32 bits hold the generated SYN
2521                     cookie, followed by 16 bits which hold the MSS value for
2522                     that cookie, and the top 16 bits are unused.
2523
2524                     On failure, the returned value is one of the following:
2525
2526                     -EINVAL SYN cookie cannot be issued due to error
2527
2528                     -ENOENT SYN cookie should not be issued (no SYN flood)
2529
2530                     -EOPNOTSUPP kernel  configuration  does  not  enable  SYN
2531                     cookies
2532
2533                     -EPROTONOSUPPORT IP packet version is not 4 or 6
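
                     The layout of the return value described above can be
                     unpacked as in the following user-space sketch. The
                     struct and function names are hypothetical and exist
                     only for this illustration.

```c
#include <stdint.h>

/* Hypothetical container for the two fields packed in the return value. */
struct syncookie_result {
	uint32_t cookie; /* lower 32 bits */
	uint16_t mss;    /* next 16 bits */
};

/*
 * Split a bpf_tcp_gen_syncookie() return value. Returns 0 and fills *out
 * on success, or passes the negative error code through unchanged.
 */
static int decode_syncookie(int64_t ret, struct syncookie_result *out)
{
	if (ret < 0)
		return (int)ret;

	out->cookie = (uint32_t)((uint64_t)ret & 0xffffffffULL);
	out->mss = (uint16_t)(((uint64_t)ret >> 32) & 0xffffULL);
	return 0;
}
```

                     Checking for a negative value first matters, because
                     the error codes share the same 64-bit return value.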
2534
2535       int  bpf_skb_output(void  *ctx,  struct  bpf_map  *map, u64 flags, void
2536       *data, u64 size)
2537
2538              Description
2539                     Write raw data blob into a special BPF perf event held by
2540                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
2541                     event must have the following attributes: PERF_SAMPLE_RAW
2542                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
2543                     PERF_COUNT_SW_BPF_OUTPUT as config.
2544
2545                     The flags are used to indicate the index in map for which
2546                     the  value  must  be  put,  masked with BPF_F_INDEX_MASK.
2547                     Alternatively, flags can be set to  BPF_F_CURRENT_CPU  to
2548                     indicate that the index of the current CPU core should be
2549                     used.
2550
2551                     The value to write, size bytes long, is passed through
2552                     the eBPF stack and pointed to by data.
2553
2554                     ctx is a pointer to in-kernel struct sk_buff.
2555
2556                     This  helper  is  similar  to bpf_perf_event_output() but
2557                     restricted to raw_tracepoint bpf programs.
2558
2559              Return 0 on success, or a negative error in case of failure.
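
                     How flags selects the map index can be modelled in a few
                     lines of user-space C. This is a sketch under stated
                     assumptions: the two constants are believed to match
                     include/uapi/linux/bpf.h, while perf_output_index() and
                     its current_cpu parameter are hypothetical.

```c
#include <stdint.h>

/* Values believed to match include/uapi/linux/bpf.h. */
#define BPF_F_INDEX_MASK  0xffffffffULL
#define BPF_F_CURRENT_CPU BPF_F_INDEX_MASK

/*
 * Hypothetical helper: work out which map index a given flags value
 * selects. current_cpu stands in for the CPU the program runs on.
 */
static uint64_t perf_output_index(uint64_t flags, uint32_t current_cpu)
{
	uint64_t index = flags & BPF_F_INDEX_MASK;

	if (index == BPF_F_CURRENT_CPU)
		return current_cpu;
	return index;
}
```

                     Passing BPF_F_CURRENT_CPU is the common choice when the
                     map holds one perf event per CPU.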
2560
2561       int bpf_probe_read_user(void *dst, u32 size, const void *unsafe_ptr)
2562
2563              Description
2564                     Safely attempt to read size bytes from user space address
2565                     unsafe_ptr and store the data in dst.
2566
2567              Return 0 on success, or a negative error in case of failure.
2568
2569       int bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
2570
2571              Description
2572                     Safely  attempt  to  read  size  bytes  from kernel space
2573                     address unsafe_ptr and store the data in dst.
2574
2575              Return 0 on success, or a negative error in case of failure.
2576
2577       int   bpf_probe_read_user_str(void   *dst,   u32   size,   const   void
2578       *unsafe_ptr)
2579
2580              Description
2581                     Copy  a NUL terminated string from an unsafe user address
2582                     unsafe_ptr to dst. The size should include the  terminat‐
2583                     ing  NUL  byte. In case the string length is smaller than
2584                     size, the target is not padded with further NUL bytes. If
2585                     the  string length is larger than size, just size-1 bytes
2586                     are copied and the last byte is set to NUL.
2587
2588                     On success, the length of the copied string is returned.
2589                     This makes this helper useful in tracing programs for
2590                     reading strings, and more importantly for getting their
2591                     length at runtime. See the following snippet:
2592
2593                        SEC("kprobe/sys_open")
2594                        int bpf_sys_open(struct pt_regs *ctx)
2595                        {
2596                                char buf[PATHLEN]; // PATHLEN is defined to 256
2597                                int res = bpf_probe_read_user_str(buf, sizeof(buf),
2598                                                        (const void *)ctx->di);
2599                                // Consume buf, for example push it to
2600                                // userspace via bpf_perf_event_output(); we
2601                                // can use res (the string length) as event
2602                                // size, after checking its boundaries.
2603                                return 0;
2604                        }
2605
2606                     In comparison, using the bpf_probe_read_user() helper
2607                     here instead to read the string would require estimating
2608                     the length at compile time, and would often result in
2609                     copying more memory than necessary.
2610
2611                     Another useful use case is parsing individual process
2612                     arguments or individual environment variables while
2613                     navigating current->mm->arg_start and
2614                     current->mm->env_start: using this helper and the return
2615                     value, one can quickly iterate at the right offset of the
2616                     memory area.
2617
2618              Return On  success,  the strictly positive length of the string,
2619                     including the trailing NUL character. On error,  a  nega‐
2620                     tive value.
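
                     The copy semantics and the iteration pattern described
                     above can be modelled in user space. This is a sketch,
                     not the kernel implementation: model_read_str() and
                     count_strings() are hypothetical names, returning 0 for
                     a zero size is an assumption, and every string is
                     assumed to fit in the 64-byte scratch buffer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * User-space model of the copy semantics: copy at most size - 1 bytes,
 * always NUL terminate, never pad, and return the length including the
 * trailing NUL. Returning 0 for size == 0 is an assumption of this sketch.
 */
static int model_read_str(char *dst, uint32_t size, const char *src)
{
	uint32_t i;

	if (size == 0)
		return 0;

	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return (int)(i + 1);
}

/*
 * Walk an area of len bytes holding back-to-back NUL terminated strings
 * (as an argument or environment area does), using each return value as
 * the offset of the next string. Returns the number of strings seen.
 * Assumes every string fits in buf, so no read is truncated.
 */
static int count_strings(const char *area, size_t len)
{
	char buf[64];
	size_t off = 0;
	int n = 0;

	while (off < len) {
		int res = model_read_str(buf, sizeof(buf), area + off);

		if (res <= 0)
			break;
		off += (size_t)res;
		n++;
	}
	return n;
}
```

                     Because the return value counts the trailing NUL, adding
                     it to the current offset lands exactly on the first byte
                     of the next string.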
2621
2622       int   bpf_probe_read_kernel_str(void   *dst,   u32   size,  const  void
2623       *unsafe_ptr)
2624
2625              Description
2626                     Copy a  NUL  terminated  string  from  an  unsafe  kernel
2627                     address   unsafe_ptr  to  dst.  Same  semantics  as  with
2628                     bpf_probe_read_user_str() apply.
2629
2630              Return On success, the strictly positive length of  the  string,
2631                     including  the  trailing NUL character. On error, a nega‐
2632                     tive value.
2633
2634       int bpf_tcp_send_ack(void *tp, u32 rcv_nxt)
2635
2636              Description
2637                     Send out a tcp-ack. tp is the in-kernel struct  tcp_sock.
2638                     rcv_nxt is the ack_seq to be sent out.
2639
2640              Return 0 on success, or a negative error in case of failure.
2641
2642       int bpf_send_signal_thread(u32 sig)
2643
2644              Description
2645                     Send  signal  sig to the thread corresponding to the cur‐
2646                     rent task.
2647
2648              Return 0 on success or successfully queued.
2649
2650                     -EBUSY if work queue under nmi is full.
2651
2652                     -EINVAL if sig is invalid.
2653
2654                     -EPERM if no permission to send the sig.
2655
2656                     -EAGAIN if bpf program can try again.
2657
2658       u64 bpf_jiffies64(void)
2659
2660              Description
2661                     Obtain the 64-bit jiffies.
2662
2663              Return The 64-bit jiffies.
2664
2665       int bpf_read_branch_records(struct bpf_perf_event_data *ctx, void *buf,
2666       u32 size, u64 flags)
2667
2668              Description
2669                     For an eBPF program attached to a perf event, retrieve
2670                     the branch records (struct perf_branch_entry) associated
2671                     with ctx and store them in the buffer pointed to by buf,
2672                     up to size bytes.
2673
2674              Return On success, number of bytes written to buf. On  error,  a
2675                     negative value.
2676
2677                     The  flags can be set to BPF_F_GET_BRANCH_RECORDS_SIZE to
2678                     instead return the number of bytes required to store  all
2679                     the branch entries. If this flag is set, buf may be NULL.
2680
2681                     -EINVAL if arguments are invalid or size is not a multiple
2682                     of sizeof(struct perf_branch_entry).
2683
2684                     -ENOENT if architecture does not support branch records.
2685
2686       int bpf_get_ns_current_pid_tgid(u64 dev, u64 ino, struct bpf_pidns_info
2687       *nsdata, u32 size)
2688
2689              Description
2690                     On success, the values for pid and tgid as seen from the
2691                     current namespace are returned in nsdata.
2692
2693              Return 0 on success, or one of the following in case of failure:
2694
2695                     -EINVAL if the dev and ino supplied don't match the dev_t
2696                     and inode number of the nsfs of the current task, or if
2697                     the conversion of dev to dev_t lost high bits.
2698
2699                     -ENOENT if pidns does not exist for the current task.
2700
2701       int bpf_xdp_output(void *ctx, struct  bpf_map  *map,  u64  flags,  void
2702       *data, u64 size)
2703
2704              Description
2705                     Write raw data blob into a special BPF perf event held by
2706                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
2707                     event must have the following attributes: PERF_SAMPLE_RAW
2708                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
2709                     PERF_COUNT_SW_BPF_OUTPUT as config.
2710
2711                     The flags are used to indicate the index in map for which
2712                     the value must  be  put,  masked  with  BPF_F_INDEX_MASK.
2713                     Alternatively,  flags  can be set to BPF_F_CURRENT_CPU to
2714                     indicate that the index of the current CPU core should be
2715                     used.
2716
2717                     The value to write, size bytes long, is passed through
2718                     the eBPF stack and pointed to by data.
2719
2720                     ctx is a pointer to in-kernel struct xdp_buff.
2721
2722                     This helper is similar to bpf_perf_event_output() but
2723                     restricted to raw_tracepoint bpf programs.
2724
2725              Return 0 on success, or a negative error in case of failure.
2726
2727       u64 bpf_get_netns_cookie(void *ctx)
2728
2729              Description
2730                     Retrieve the cookie (generated by the kernel) of the net‐
2731                     work namespace the input ctx is associated with. The net‐
2732                     work namespace cookie remains stable for its lifetime and
2733                     provides a global identifier that can be assumed  unique.
2734                     If  ctx  is  NULL, then the helper returns the cookie for
2735                     the initial network namespace. The cookie itself is  very
2736                     similar  to  that  of bpf_get_socket_cookie() helper, but
2737                     for network namespaces instead of sockets.
2738
2739              Return An 8-byte long opaque number.
2740
2741       u64 bpf_get_current_ancestor_cgroup_id(int ancestor_level)
2742
2743              Description
2744                     Return id of cgroup v2 that is  ancestor  of  the  cgroup
2745                     associated  with  the current task at the ancestor_level.
2746                     The root cgroup is at ancestor_level zero and  each  step
2747                     down  the  hierarchy  increments  the  level.  If  ances‐
2748                     tor_level == level of cgroup associated with the  current
2749                     task,  then  return  value  will  be  the same as that of
2750                     bpf_get_current_cgroup_id().
2751
2752                     The helper is useful for implementing policies based on
2753                     cgroups that are higher in the hierarchy than the
2754                     immediate cgroup associated with the current task.
2755
2756                     The format of the returned id and the helper limitations
2757                     are the same as in bpf_get_current_cgroup_id().
2758
2759              Return The  id  is  returned  or  0  in case the id could not be
2760                     retrieved.
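
                     The level numbering can be pictured with a small
                     user-space model. It is only a sketch: the function name
                     and the path array are hypothetical, and treating a
                     level deeper than the task's own cgroup as "could not be
                     retrieved" is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical model: path[] holds the cgroup ids from the root (index 0)
 * down to the task's own cgroup (index level). Returns the id of the
 * ancestor at ancestor_level, or 0 if there is no such ancestor.
 */
static uint64_t model_ancestor_cgroup_id(const uint64_t *path, int level,
					 int ancestor_level)
{
	if (ancestor_level < 0 || ancestor_level > level)
		return 0;
	return path[ancestor_level];
}
```

                     With ancestor_level equal to level, the model returns
                     the task's own cgroup id, matching the equivalence with
                     bpf_get_current_cgroup_id() noted above.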
2761
2762       int bpf_sk_assign(struct sk_buff *skb, struct bpf_sock *sk, u64 flags)
2763
2764              Description
2765                     Assign the sk to the skb. When combined with  appropriate
2766                     routing  configuration  to receive the packet towards the
2767                     socket, will cause skb to be delivered to  the  specified
2768                     socket.   Subsequent  redirection  of  skb via  bpf_redi‐
2769                     rect(), bpf_clone_redirect() or other methods outside  of
2770                     BPF may interfere with successful delivery to the socket.
2771
2772                     This operation is only valid from TC ingress path.
2773
2774                     The flags argument must be zero.
2775
2776              Return 0 on success, or a negative error in case of failure:
2777
2778                     -EINVAL if specified flags are not supported.
2779
2780                     -ENOENT if the socket is unavailable for assignment.
2781
2782                     -ENETUNREACH if the socket is unreachable (wrong netns).
2783
2784                     -EOPNOTSUPP  if the operation is not supported, for exam‐
2785                     ple a call from outside of TC ingress.
2786
2787                     -ESOCKTNOSUPPORT if the  socket  type  is  not  supported
2788                     (reuseport).
2789
2790       u64 bpf_ktime_get_boot_ns(void)
2791
2792              Description
2793                     Return the time elapsed since system boot, in
2794                     nanoseconds. This includes the time the system was
2795                     suspended. See: clock_gettime(CLOCK_BOOTTIME)
2796
2797              Return Current ktime.
2798
2799       int  bpf_seq_printf(struct  seq_file *m, const char *fmt, u32 fmt_size,
2800       const void *data, u32 data_len)
2801
2802              Description
2803                     bpf_seq_printf() uses seq_file seq_printf() to print  out
2804                     the  format  string.   The m represents the seq_file. The
2805                     fmt and fmt_size are for the format  string  itself.  The
2806                     data  and  data_len are format string arguments. The data
2807                     are a u64 array and corresponding  format  string  values
2808                     are  stored  in the array. For strings and pointers where
2809                     pointees are accessed, only the pointer values are stored
2810                     in  the  data array.  The data_len is the size of data in
2811                     bytes.
2812
2813                     Formats %s and %p{i,I}{4,6} require reading kernel memory.
2814                     Reading kernel memory may fail due to either an invalid
2815                     address or a valid address that requires a major memory
2816                     fault. If reading kernel memory fails, the string for %s
2817                     will be an empty string, and the IP address for
2818                     %p{i,I}{4,6} will be 0. Not returning an error to the bpf
2819                     program is consistent with what bpf_trace_printk() does
2820                     for now.
2821
2822              Return 0 on success, or a negative error in case of failure:
2823
2824                     -EBUSY if the per-CPU memory copy buffer is busy; the bpf
2825                     program can try again by returning 1.
2826
2827                     -EINVAL  if  arguments  are  invalid,  or   if   fmt   is
2828                     invalid/unsupported.
2829
2830                     -E2BIG if fmt contains too many format specifiers.
2831
2832                     -EOVERFLOW  if an overflow happened: The same object will
2833                     be tried again.
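
                     How the data array is laid out for bpf_seq_printf() can
                     be shown with a user-space sketch. The helper name
                     pack_seq_args() and the example format "%u %s" are
                     hypothetical; the point is only that each argument
                     occupies one u64 slot and data_len is given in bytes.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fill a u64 array for a format such as "%u %s".
 * The integer is stored by value; for the string, only the pointer
 * value is stored. Returns data_len, the size of the array in bytes.
 */
static uint32_t pack_seq_args(uint64_t args[2], uint32_t id, const char *name)
{
	args[0] = id;
	args[1] = (uint64_t)(uintptr_t)name;
	return (uint32_t)(2 * sizeof(uint64_t));
}
```

                     Two arguments therefore give a data_len of 16, not 2.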
2834
2835       int bpf_seq_write(struct seq_file *m, const void *data, u32 len)
2836
2837              Description
2838                     bpf_seq_write() uses seq_file seq_write()  to  write  the
2839                     data.   The  m  represents the seq_file. The data and len
2840                     represent the data to write in bytes.
2841
2842              Return 0 on success, or a negative error in case of failure:
2843
2844                     -EOVERFLOW if an overflow happened: The same object  will
2845                     be tried again.
2846
2847       u64 bpf_sk_cgroup_id(struct bpf_sock *sk)
2848
2849              Description
2850                     Return the cgroup v2 id of the socket sk.
2851
2852                     sk  must be a non-NULL pointer to a full socket, e.g. one
2853                     returned  from  bpf_sk_lookup_xxx(),   bpf_sk_fullsock(),
2854                     etc.   The   format   of   returned  id  is  same  as  in
2855                     bpf_skb_cgroup_id().
2856
2857                     This helper is available only if the kernel was  compiled
2858                     with the CONFIG_SOCK_CGROUP_DATA configuration option.
2859
2860              Return The  id  is  returned  or  0  in case the id could not be
2861                     retrieved.
2862
2863       u64 bpf_sk_ancestor_cgroup_id(struct bpf_sock *sk, int ancestor_level)
2864
2865              Description
2866                     Return id of cgroup v2 that is ancestor of cgroup associ‐
2867                     ated  with the sk at the ancestor_level.  The root cgroup
2868                     is at ancestor_level zero and each step down the  hierar‐
2869                     chy  increments  the level. If ancestor_level == level of
2870                     cgroup associated with sk, then return value will be same
2871                     as that of bpf_sk_cgroup_id().
2872
2873                     The helper is useful for implementing policies based on
2874                     cgroups that are higher in the hierarchy than the
2875                     immediate cgroup associated with sk.
2876
2877                     The format of the returned id and the helper limitations
2878                     are the same as in bpf_sk_cgroup_id().
2879
2880              Return The id is returned or 0 in  case  the  id  could  not  be
2881                     retrieved.
2882
2883       int bpf_ringbuf_output(void *ringbuf, void *data, u64 size, u64
2884       flags)
2885
2886              Description
2887                     Copy size bytes from data into a ring buffer ringbuf. If
2888                     BPF_RB_NO_WAKEUP is specified in flags, no notification
2889                     of new data availability is sent. If BPF_RB_FORCE_WAKEUP
2890                     is specified in flags, notification of new data
2891                     availability is sent unconditionally.
2892
2893              Return 0, on success; < 0, on error.
2894
2895       void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags)
2896
2897              Description
2898                     Reserve size bytes of payload in a ring buffer ringbuf.
2899
2900              Return Valid pointer with size bytes of memory available;  NULL,
2901                     otherwise.
2902
2903       void bpf_ringbuf_submit(void *data, u64 flags)
2904
2905              Description
2906                     Submit reserved ring buffer sample, pointed to by data.
2907                     If BPF_RB_NO_WAKEUP is specified in flags, no
2908                     notification of new data availability is sent. If
2909                     BPF_RB_FORCE_WAKEUP is specified in flags, notification
2910                     of new data availability is sent unconditionally.
2911
2912              Return Nothing. Always succeeds.
2913
2914       void bpf_ringbuf_discard(void *data, u64 flags)
2915
2916              Description
2917                     Discard reserved ring buffer sample, pointed to by data.
2918                     If BPF_RB_NO_WAKEUP is specified in flags, no
2919                     notification of new data availability is sent. If
2920                     BPF_RB_FORCE_WAKEUP is specified in flags, notification
2921                     of new data availability is sent unconditionally.
2922
2923              Return Nothing. Always succeeds.
2924
2925       u64 bpf_ringbuf_query(void *ringbuf, u64 flags)
2926
2927              Description
2928                     Query various characteristics of the provided ring buffer.
2929                     What exactly is queried is determined by flags:
2930
2934                        · BPF_RB_AVAIL_DATA - amount of data not yet consumed;
2935
2936                        · BPF_RB_RING_SIZE - the size of ring buffer;
2937
2938                        · BPF_RB_CONS_POS  -  consumer  position   (can   wrap
2939                          around);
2940
2941                        · BPF_RB_PROD_POS  -  producer(s)  position  (can wrap
2942                          around);
2943
2948                     Data returned is just a momentary snapshot of actual
2949                     values and could be inaccurate, so this facility should
2950                     be used to power heuristics and for reporting, not to
2951                     make 100% correct calculations.
2952
2953              Return Requested value, or 0, if flags are not recognized.
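
                     Why wrapping positions still yield a usable amount of
                     unconsumed data can be seen in a one-line user-space
                     model. The function name is hypothetical, and computing
                     the amount as the difference of the two positions is an
                     assumption of this sketch, not a statement about the
                     kernel code.

```c
#include <stdint.h>

/*
 * Assumption of this sketch: unconsumed data is the distance from the
 * consumer position to the producer position. Unsigned subtraction keeps
 * the result correct after the positions wrap around.
 */
static uint64_t model_avail_data(uint64_t prod_pos, uint64_t cons_pos)
{
	return prod_pos - cons_pos;
}
```

                     As noted above, such a value is only a snapshot and
                     should feed heuristics, not exact accounting.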
2954
2955       int bpf_csum_level(struct sk_buff *skb, u64 level)
2956
2957              Description
2958                     Change the skb's checksum level by one layer up or down,
2959                     or  reset  it entirely to none in order to have the stack
2960                     perform checksum validation. The level is  applicable  to
2961                     the  following  protocols: TCP, UDP, GRE, SCTP, FCOE. For
2962                     example, a decap of | ETH | IP | UDP | GUE | IP |  TCP  |
2963                     into  |  ETH  |  IP | TCP | through bpf_skb_adjust_room()
2964                     helper with passing in BPF_F_ADJ_ROOM_NO_CSUM_RESET  flag
2965                     would   require   one   call   to  bpf_csum_level()  with
2966                     BPF_CSUM_LEVEL_DEC since the UDP header is removed. Simi‐
2967                     larly,  an  encap  of the latter into the former could be
2968                     accompanied by a helper  call  to  bpf_csum_level()  with
2969                     BPF_CSUM_LEVEL_INC  if  the  skb  is still intended to be
2970                     processed in higher layers of the stack instead  of  just
2971                     egressing at tc.
2972
2973                     There are four supported level settings at this time:
2974
2975                     · BPF_CSUM_LEVEL_INC:  Increases skb->csum_level for skbs
2976                       with CHECKSUM_UNNECESSARY.
2977
2978                     · BPF_CSUM_LEVEL_DEC: Decreases skb->csum_level for  skbs
2979                       with CHECKSUM_UNNECESSARY.
2980
2981                     · BPF_CSUM_LEVEL_RESET:  Resets  skb->csum_level to 0 and
2982                       sets CHECKSUM_NONE to force checksum validation by  the
2983                       stack.
2984
2985                     · BPF_CSUM_LEVEL_QUERY:   No-op,   returns   the  current
2986                       skb->csum_level.
2987
2988              Return 0 on success, or a negative error in case of failure.  In
2989                     the    case    of   BPF_CSUM_LEVEL_QUERY,   the   current
2990                     skb->csum_level is returned or the error code -EACCES  in
2991                     case the skb is not subject to CHECKSUM_UNNECESSARY.
2992

EXAMPLES

2994       Examples of usage for most of the eBPF helpers listed in this manual
2995       page are available within the Linux kernel sources, at the following
2996       locations:
2997
2998       · samples/bpf/
2999
3000       · tools/testing/selftests/bpf/
3001

LICENSE

3003       eBPF  programs  can  have  an associated license, passed along with the
3004       bytecode instructions to the kernel when the programs are  loaded.  The
3005       format for that string is identical to the one in use for kernel
3006       modules (Dual licenses, such as "Dual BSD/GPL", may be used). Some
3007       helper functions are only accessible to programs that are compatible
3008       with the GNU General Public License (GPL).
3009
3010       In order to use such helpers, the eBPF program must be loaded with  the
3011       correct  license string passed (via attr) to the bpf() system call, and
3012       this generally translates into the C source code of  the  program  con‐
3013       taining a line similar to the following:
3014
3015          char ____license[] __attribute__((section("license"), used)) = "GPL";
3016

IMPLEMENTATION

3018       This  manual  page  is  an  effort to document the existing eBPF helper
3019       functions.  But as of this writing, the BPF sub-system is  under  heavy
3020       development.  New  eBPF  program or map types are added, along with new
3021       helper functions. Some helpers  are  occasionally  made  available  for
3022       additional  program types. So in spite of the efforts of the community,
3023       this page might not be up-to-date. If you want to check for yourself
3024       what helper functions exist in your kernel, or what types of programs
3025       they can support, here are some files in the kernel tree that you may
3026       be interested in:
3027
3028       · include/uapi/linux/bpf.h is the main BPF header. It contains the full
3029         list of all helper functions, as well as many other  BPF  definitions
3030         including  most  of  the  flags,  structs  or  constants  used by the
3031         helpers.
3032
3033       · net/core/filter.c contains the  definition  of  most  network-related
3034         helper  functions,  and the list of program types from which they can
3035         be used.
3036
3037       · kernel/trace/bpf_trace.c is the  equivalent  for  most  tracing  pro‐
3038         gram-related helpers.
3039
3040       · kernel/bpf/verifier.c contains the functions used to check that valid
3041         types of eBPF maps are used with a given helper function.
3042
3043       · The kernel/bpf/ directory contains other files in which additional
3044         helpers are defined (for cgroups, sockmaps, etc.).
3045
3046       · The  bpftool  utility can be used to probe the availability of helper
3047         functions on the system (as well as supported program and map  types,
3048         and  a  number  of  other  parameters). To do so, run bpftool feature
3049         probe (see bpftool-feature(8) for details). Add the unprivileged key‐
3050         word to list features available to unprivileged users.
3051
3052       Compatibility  between helper functions and program types can generally
3053       be found in the files where helper functions are defined. Look for  the
3054       struct  bpf_func_proto  objects and for functions returning them: these
3055       functions contain a list of helpers that a given program type can call.
3056       Note that the default: label of the switch ... case used to filter
3057       helpers can call other functions, which in turn allow access to
3058       additional helpers. The GPL license requirement is also recorded in
3059       those struct bpf_func_proto objects (in the gpl_only field).
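
           The filtering pattern described above can be sketched with a
           simplified model. This is not kernel code: struct toy_func_proto,
           the TOY_FUNC_* identifiers and the toy_*() functions are
           hypothetical stand-ins that keep only the helper name and a
           gpl_only flag, and the split between "base" and "network"
           helpers is chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct bpf_func_proto: only what this sketch needs. */
struct toy_func_proto {
    const char *name;
    bool gpl_only;      /* restricted to GPL-compatible programs */
};

enum toy_func_id {
    TOY_FUNC_map_lookup_elem,
    TOY_FUNC_trace_printk,
    TOY_FUNC_skb_store_bytes,
    TOY_FUNC_unlisted,
};

static const struct toy_func_proto map_lookup_elem_proto = { "bpf_map_lookup_elem", false };
static const struct toy_func_proto trace_printk_proto    = { "bpf_trace_printk",    true  };
static const struct toy_func_proto skb_store_bytes_proto = { "bpf_skb_store_bytes", false };

/* Helpers shared by several program types. */
static const struct toy_func_proto *toy_base_func_proto(enum toy_func_id id)
{
    switch (id) {
    case TOY_FUNC_map_lookup_elem: return &map_lookup_elem_proto;
    case TOY_FUNC_trace_printk:    return &trace_printk_proto;
    default:                       return NULL; /* not callable */
    }
}

/* Filter for one program type: its own helpers are listed first, and the
 * default: label delegates to another function that allows more helpers. */
static const struct toy_func_proto *toy_net_func_proto(enum toy_func_id id)
{
    switch (id) {
    case TOY_FUNC_skb_store_bytes: return &skb_store_bytes_proto;
    default:                       return toy_base_func_proto(id);
    }
}

/* A helper is usable if the program type lists it and, when the helper is
 * gpl_only, the program was loaded with a GPL-compatible license. */
bool toy_helper_allowed(enum toy_func_id id, bool gpl_compatible)
{
    const struct toy_func_proto *proto = toy_net_func_proto(id);

    if (proto == NULL)
        return false;
    return gpl_compatible || !proto->gpl_only;
}
```

           When reading the real sources, following the default: branch
           from one such function to the next gives the full set of helpers
           for a program type.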
3060
3061       Compatibility between helper functions and map types can  be  found  in
3062       the  check_map_func_compatibility()  function  in file kernel/bpf/veri‐
3063       fier.c.
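
           A rough model of such a check is sketched below, assuming a
           two-way test: first from the point of view of the map type, then
           from the point of view of the helper. The enum values and the
           function toy_map_func_compatible() are hypothetical, and the
           rules are deliberately reduced to two well-known pairings
           (program arrays with bpf_tail_call(), perf event arrays with
           bpf_perf_event_output()); the real function covers many more
           cases.

```c
#include <stdbool.h>

enum toy_map_type {
    TOY_MAP_HASH,
    TOY_MAP_PROG_ARRAY,
    TOY_MAP_PERF_EVENT_ARRAY,
};

enum toy_helper {
    TOY_HELPER_map_lookup_elem,
    TOY_HELPER_tail_call,
    TOY_HELPER_perf_event_output,
};

bool toy_map_func_compatible(enum toy_map_type map, enum toy_helper func)
{
    /* Map perspective: some map types accept only specific helpers. */
    switch (map) {
    case TOY_MAP_PROG_ARRAY:
        if (func != TOY_HELPER_tail_call)
            return false;
        break;
    case TOY_MAP_PERF_EVENT_ARRAY:
        if (func != TOY_HELPER_perf_event_output)
            return false;
        break;
    default:
        break;
    }

    /* Helper perspective: some helpers accept only specific map types. */
    switch (func) {
    case TOY_HELPER_tail_call:
        return map == TOY_MAP_PROG_ARRAY;
    case TOY_HELPER_perf_event_output:
        return map == TOY_MAP_PERF_EVENT_ARRAY;
    default:
        return true;
    }
}
```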
3064
3065       Helper functions that invalidate the checks on data and data_end
3066       pointers for network processing are listed in function
3067       bpf_helper_changes_pkt_data() in file net/core/filter.c.
3068

SEE ALSO

3070       bpf(2), bpftool(8), cgroups(7), ip(8), perf_event_open(2),  sendmsg(2),
3071       socket(7), tc-bpf(8)
3072

COLOPHON

3074       This  page  is  part of release 5.07 of the Linux man-pages project.  A
3075       description of the project, information about reporting bugs,  and  the
3076       latest     version     of     this    page,    can    be    found    at
3077       https://www.kernel.org/doc/man-pages/.
3078
3079
3080
3081                                                                BPF-HELPERS(7)