BPF-HELPERS(7)         Miscellaneous Information Manual         BPF-HELPERS(7)

NAME

       BPF-HELPERS - list of eBPF helper functions

DESCRIPTION

       The extended Berkeley Packet Filter (eBPF) subsystem consists of
       programs written in a pseudo-assembly language, then attached to one
       of the several kernel hooks and run in reaction to specific events.
       This framework differs from the older, "classic" BPF (or "cBPF") in
       several aspects, one of them being the ability to call special
       functions (or "helpers") from within a program. These functions are
       restricted to a white-list of helpers defined in the kernel.
16
       These helpers are used by eBPF programs to interact with the system,
       or with the context in which they work. For instance, they can be used
       to print debugging messages, to get the time since the system was
       booted, to interact with eBPF maps, or to manipulate network packets.
       Since there are several eBPF program types, and they do not run in the
       same context, each program type can only call a subset of those
       helpers.
24
       Due to eBPF conventions, a helper cannot have more than five
       arguments.
27
28       Internally, eBPF programs call directly into the compiled helper  func‐
29       tions  without  requiring  any foreign-function interface. As a result,
30       calling helpers introduces no overhead, thus offering excellent perfor‐
31       mance.
32
33       This  document is an attempt to list and document the helpers available
34       to eBPF developers. They are sorted by chronological order (the  oldest
35       helpers in the kernel at the top).
36

HELPERS

38       void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)
39
40              Description
41                     Perform a lookup in map for an entry associated to key.
42
43              Return Map  value  associated  to  key,  or NULL if no entry was
44                     found.
45
46       long bpf_map_update_elem(struct bpf_map *map, const  void  *key,  const
47       void *value, u64 flags)
48
49              Description
50                     Add or update the value of the entry associated to key in
51                     map with value. flags is one of:
52
53                     BPF_NOEXIST
54                            The entry for key must not exist in the map.
55
56                     BPF_EXIST
57                            The entry for key must already exist in the map.
58
59                     BPF_ANY
60                            No condition on the existence  of  the  entry  for
61                            key.
62
                     Flag value BPF_NOEXIST cannot be used for maps of types
                     BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_PERCPU_ARRAY, because
                     all elements always exist; the helper would return an
                     error.
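
                     The flag rules above can be modeled in plain user-space
                     C. The following is a minimal sketch, not kernel code:
                     struct toy_slot and toy_update() are hypothetical names,
                     the flag values are the ones defined in linux/bpf.h, and
                     the specific error codes are an assumption based on what
                     hash maps typically return.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in the kernel UAPI header linux/bpf.h. */
#define BPF_ANY     0ULL
#define BPF_NOEXIST 1ULL
#define BPF_EXIST   2ULL

/* Hypothetical one-entry "map", used only to illustrate the flag rules. */
struct toy_slot {
	bool present;
	uint64_t value;
};

/* Check the existence condition requested by flags, then store value. */
static int toy_update(struct toy_slot *slot, uint64_t value, uint64_t flags)
{
	if (flags > BPF_EXIST)
		return -EINVAL;         /* unknown flag (assumed error code) */
	if (flags == BPF_NOEXIST && slot->present)
		return -EEXIST;         /* entry must not exist, but does */
	if (flags == BPF_EXIST && !slot->present)
		return -ENOENT;         /* entry must exist, but does not */

	slot->present = true;
	slot->value = value;
	return 0;
}
```

                     With an array map, every slot would start out as present,
                     which is why BPF_NOEXIST can never succeed there.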
66
67              Return 0 on success, or a negative error in case of failure.
68
69       long bpf_map_delete_elem(struct bpf_map *map, const void *key)
70
71              Description
72                     Delete entry with key from map.
73
74              Return 0 on success, or a negative error in case of failure.
75
76       long bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)
77
78              Description
79                     For  tracing  programs, safely attempt to read size bytes
80                     from kernel space address unsafe_ptr and store  the  data
81                     in dst.
82
83                     Generally,       use       bpf_probe_read_user()       or
84                     bpf_probe_read_kernel() instead.
85
86              Return 0 on success, or a negative error in case of failure.
87
88       u64 bpf_ktime_get_ns(void)
89
90              Description
91                     Return the time elapsed since system  boot,  in  nanosec‐
92                     onds.   Does  not  include time the system was suspended.
93                     See: clock_gettime(CLOCK_MONOTONIC)
94
95              Return Current ktime.
96
97       long bpf_trace_printk(const char *fmt, u32 fmt_size, ...)
98
99              Description
                     This helper is a "printk()-like" facility for debugging.
                     It prints a message defined by format fmt (of size
                     fmt_size) to file /sys/kernel/debug/tracing/trace from
                     DebugFS, if available. It can take up to three additional
                     u64 arguments (as with any eBPF helper, the total number
                     of arguments is limited to five).
106
                     Each time the helper is called, it appends a line to the
                     trace. Lines are discarded while
                     /sys/kernel/debug/tracing/trace is open; use
                     /sys/kernel/debug/tracing/trace_pipe to avoid this. The
                     format of the trace is customizable, and the exact output
                     one will get depends on the options set in
                     /sys/kernel/debug/tracing/trace_options (see also the
                     README file under the same directory). However, it
                     usually defaults to something like:
115
                        telnet-470   [001] .N.. 419421.045894: 0x00000001: <fmt>

                     In the above:

                        • telnet is the name of the current task.

                        • 470 is the PID of the current task.

                        • 001 is the CPU number on which the task is running.

                        • In .N.., each character refers to a set of options
                          (whether irqs are enabled, scheduling options,
                          whether hard/softirqs are running, level of
                          preempt_disabled respectively). N means that
                          TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED are set.

                        • 419421.045894 is a timestamp.

                        • 0x00000001 is a fake value used by BPF for the
                          instruction pointer register.

                        • <fmt> is the message formatted with fmt.
138
                     The conversion specifiers supported by fmt are similar
                     to, but more limited than, those of printk(). They are
                     %d, %i, %u, %x, %ld, %li, %lu, %lx, %lld, %lli, %llu,
                     %llx, %p, %s. No modifier (size of field, padding with
                     zeroes, etc.) is available, and the helper will return
                     -EINVAL (but print nothing) if it encounters an unknown
                     specifier.
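
                     As an illustration of these restrictions, the sketch
                     below checks a format string in user space before it is
                     embedded in a program. trace_fmt_ok() is a hypothetical
                     helper, not a kernel or libbpf function. It encodes only
                     the rules stated here (the listed specifiers, no
                     modifiers, at most three arguments) and treats anything
                     else, including "%%", as unsupported, which is an
                     assumption about the kernel parser.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if fmt uses only the conversions listed in this page
 * (%d %i %u %x, their l/ll variants, %p and %s) and no more than
 * three of them. Hypothetical user-space check, not the kernel parser.
 */
static bool trace_fmt_ok(const char *fmt)
{
	int nargs = 0;

	for (const char *p = fmt; *p; p++) {
		if (*p != '%')
			continue;
		p++;

		/* Optional length prefix: "l" or "ll". */
		int longs = 0;
		while (*p == 'l' && longs < 2) {
			p++;
			longs++;
		}

		if (*p == '\0')
			return false;           /* dangling '%' */
		if (longs == 0 && !strchr("diuxps", *p))
			return false;           /* modifier or unknown specifier */
		if (longs > 0 && !strchr("diux", *p))
			return false;           /* e.g. "%ls" or "%llp" */
		if (++nargs > 3)
			return false;           /* helper takes at most 3 values */
	}
	return true;
}
```

                     For example, "pid %d comm %s\n" passes, while "%5d" is
                     rejected because of the field width.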
145
146                     Also, note that bpf_trace_printk() is  slow,  and  should
147                     only  be  used for debugging purposes. For this reason, a
148                     notice block (spanning several lines) is printed to  ker‐
149                     nel  logs  and  states that the helper should not be used
150                     "for production use" the first time this helper  is  used
151                     (or more precisely, when trace_printk() buffers are allo‐
152                     cated). For passing values to  user  space,  perf  events
153                     should be preferred.
154
155              Return The  number of bytes written to the buffer, or a negative
156                     error in case of failure.
157
158       u32 bpf_get_prandom_u32(void)
159
160              Description
161                     Get a pseudo-random number.
162
163                     From a security point of view, this helper uses  its  own
164                     pseudo-random internal state, and cannot be used to infer
165                     the seed of other random functions in  the  kernel.  How‐
166                     ever,  it is essential to note that the generator used by
167                     the helper is not cryptographically secure.
168
169              Return A random 32-bit unsigned value.
170
171       u32 bpf_get_smp_processor_id(void)
172
173              Description
174                     Get the SMP  (symmetric  multiprocessing)  processor  id.
175                     Note that all programs run with migration disabled, which
176                     means that the SMP processor id is stable during all  the
177                     execution of the program.
178
179              Return The SMP id of the processor running the program.
180
181       long  bpf_skb_store_bytes(struct  sk_buff  *skb, u32 offset, const void
182       *from, u32 len, u64 flags)
183
184              Description
185                     Store len bytes from address from into the packet associ‐
186                     ated  to  skb,  at  offset.  flags  are  a combination of
187                     BPF_F_RECOMPUTE_CSUM (automatically recompute the  check‐
188                     sum for the packet after storing the bytes) and BPF_F_IN‐
189                     VALIDATE_HASH (set skb->hash, skb->swhash and skb->l4hash
190                     to 0).
191
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
197
198              Return 0 on success, or a negative error in case of failure.
199
200       long bpf_l3_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64
201       to, u64 size)
202
203              Description
204                     Recompute the layer 3 (e.g. IP) checksum for  the  packet
205                     associated  to  skb.  Computation  is incremental, so the
206                     helper must know the former value  of  the  header  field
207                     that  was  modified  (from),  the new value of this field
208                     (to), and the number of bytes (2 or 4)  for  this  field,
209                     stored  in  size.  Alternatively, it is possible to store
210                     the difference between the previous and the new values of
211                     the  header  field  in to, by setting from and size to 0.
212                     For both methods, offset indicates the location of the IP
213                     checksum within the packet.
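
                     The incremental computation described above can be
                     sketched in user space. This is an illustration of the
                     arithmetic (following RFC 1624) for a 2-byte field, not
                     the kernel implementation; fold16(), csum_full() and
                     csum_replace2() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement sum into 16 bits (end-around carry). */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit words; the checksum word must be 0. */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~fold16(sum);
}

/*
 * Incremental update for one 16-bit field changing from "from" to "to":
 * new = ~(~old + ~from + to), all in one's-complement arithmetic.
 */
static uint16_t csum_replace2(uint16_t check, uint16_t from, uint16_t to)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~from;
	sum += to;
	return (uint16_t)~fold16(sum);
}
```

                     Only the old checksum and the old and new field values
                     are needed, which is why the helper asks for from and to
                     rather than re-reading the whole header.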
214
215                     This  helper  works  in combination with bpf_csum_diff(),
216                     which does not update the checksum in-place,  but  offers
217                     more  flexibility and can handle sizes larger than 2 or 4
218                     for the checksum to update.
219
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
225
226              Return 0 on success, or a negative error in case of failure.
227
228       long bpf_l4_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64
229       to, u64 flags)
230
231              Description
                     Recompute the layer 4 (e.g. TCP, UDP or ICMP) checksum
                     for the packet associated to skb. Computation is
                     incremental, so the helper must know the former value of
                     the header field that was modified (from), the new value
                     of this field (to), and the number of bytes (2 or 4) for
                     this field, stored in the lowest four bits of flags.
                     Alternatively, it is possible to store the difference
                     between the previous and the new values of the header
                     field in to, by setting from and the four lowest bits of
                     flags to 0. For both methods, offset indicates the
                     location of the layer 4 checksum within the packet. In
                     addition to the size of the field, actual flags can be
                     combined into flags with a bitwise OR. With
                     BPF_F_MARK_MANGLED_0, a null checksum is left untouched
                     (unless BPF_F_MARK_ENFORCE is added as well), and for
                     updates resulting in a null checksum the value is set to
                     CSUM_MANGLED_0 instead. Flag BPF_F_PSEUDO_HDR indicates
                     that the checksum is to be computed against a
                     pseudo-header.
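
                     Because the field size and the actual flags share one
                     argument, it can help to see how the value is composed.
                     The sketch below is a user-space illustration;
                     l4_csum_flags() is a hypothetical helper, and the
                     constants are the values defined in linux/bpf.h.

```c
#include <stdint.h>

/* Values as defined in the kernel UAPI header linux/bpf.h. */
#define BPF_F_HDR_FIELD_MASK  0xfULL
#define BPF_F_PSEUDO_HDR      (1ULL << 4)
#define BPF_F_MARK_MANGLED_0  (1ULL << 5)
#define BPF_F_MARK_ENFORCE    (1ULL << 6)

/*
 * Hypothetical helper: put the field size (0, 2 or 4) in the lowest
 * four bits and OR in the remaining flag bits.
 */
static uint64_t l4_csum_flags(uint64_t field_size, uint64_t extra)
{
	return (field_size & BPF_F_HDR_FIELD_MASK) |
	       (extra & ~BPF_F_HDR_FIELD_MASK);
}
```

                     For a 4-byte address rewrite covered by the TCP
                     pseudo-header, a program would typically pass
                     l4_csum_flags(4, BPF_F_PSEUDO_HDR), or equivalently
                     BPF_F_PSEUDO_HDR | 4.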
250
251                     This helper works in  combination  with  bpf_csum_diff(),
252                     which  does  not update the checksum in-place, but offers
253                     more flexibility and can handle sizes larger than 2 or  4
254                     for the checksum to update.
255
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
261
262              Return 0 on success, or a negative error in case of failure.
263
264       long  bpf_tail_call(void  *ctx, struct bpf_map *prog_array_map, u32 in‐
265       dex)
266
267              Description
268                     This special helper is used to trigger a "tail call",  or
269                     in  other  words,  to jump into another eBPF program. The
270                     same stack frame is used (but values on stack and in reg‐
271                     isters  for the caller are not accessible to the callee).
272                     This mechanism allows for program  chaining,  either  for
273                     raising  the  maximum  number  of available eBPF instruc‐
274                     tions,  or  to  execute  given  programs  in  conditional
275                     blocks.  For security reasons, there is an upper limit to
276                     the number of successive tail  calls  that  can  be  per‐
277                     formed.
278
279                     Upon  call  of  this helper, the program attempts to jump
280                     into a program referenced  at  index  index  in  prog_ar‐
281                     ray_map,  a  special map of type BPF_MAP_TYPE_PROG_ARRAY,
282                     and passes ctx, a pointer to the context.
283
284                     If the call succeeds, the  kernel  immediately  runs  the
285                     first instruction of the new program. This is not a func‐
286                     tion call, and it never returns to the previous  program.
                     If the call fails, then the helper has no effect, and the
                     caller continues to run its subsequent instructions. A
                     call can fail if the destination program for the jump
                     does not exist (i.e. index is greater than or equal to
                     the number of entries in prog_array_map), or if the
                     maximum number of tail calls has been reached for this
                     chain of programs. This limit is defined in the kernel by
                     the macro MAX_TAIL_CALL_CNT (not accessible to user
                     space), which is currently set to 33.
296
297              Return 0 on success, or a negative error in case of failure.
298
299       long bpf_clone_redirect(struct sk_buff *skb, u32 ifindex, u64 flags)
300
301              Description
302                     Clone  and  redirect  the packet associated to skb to an‐
303                     other net device  of  index  ifindex.  Both  ingress  and
304                     egress  interfaces  can  be  used  for  redirection.  The
305                     BPF_F_INGRESS value in flags is used to make the distinc‐
306                     tion  (ingress  path  is selected if the flag is present,
307                     egress path otherwise).  This is the only flag  supported
308                     for now.
309
                     In comparison with the bpf_redirect() helper,
                     bpf_clone_redirect() has the associated cost of
                     duplicating the packet buffer, but the redirection is
                     carried out from within the eBPF program. Conversely,
                     bpf_redirect() is more efficient, but it is handled
                     through an action code where the redirection happens only
                     after the eBPF program has returned.
316
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
322
323              Return 0 on success, or a negative error in case of failure.
324
325       u64 bpf_get_current_pid_tgid(void)
326
327              Description
328                     Get the current pid and tgid.
329
330              Return A 64-bit integer containing the current tgid and pid, and
331                     created   as   such:  current_task->tgid  <<  32  |  cur‐
332                     rent_task->pid.
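
                     Splitting the returned value back into its two halves
                     can be sketched as follows. tgid_of() and pid_of() are
                     hypothetical names used for illustration. Note that the
                     kernel tgid is what user space usually calls the process
                     ID, while the kernel pid identifies the thread.

```c
#include <stdint.h>

/* Upper 32 bits: thread group id (the process ID seen from user space). */
static uint32_t tgid_of(uint64_t pid_tgid)
{
	return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: kernel pid (the thread ID seen from user space). */
static uint32_t pid_of(uint64_t pid_tgid)
{
	return (uint32_t)pid_tgid;
}
```

                     In an eBPF program, the argument would be the value
                     returned by bpf_get_current_pid_tgid().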
333
334       u64 bpf_get_current_uid_gid(void)
335
336              Description
337                     Get the current uid and gid.
338
339              Return A 64-bit integer containing the current GID and UID,  and
340                     created as such: current_gid << 32 | current_uid.
341
342       long bpf_get_current_comm(void *buf, u32 size_of_buf)
343
344              Description
345                     Copy  the  comm attribute of the current task into buf of
346                     size_of_buf. The comm attribute contains the name of  the
347                     executable (excluding the path) for the current task. The
348                     size_of_buf must be strictly positive.  On  success,  the
349                     helper  makes  sure  that  the  buf is NUL-terminated. On
350                     failure, it is filled with zeroes.
351
352              Return 0 on success, or a negative error in case of failure.
353
354       u32 bpf_get_cgroup_classid(struct sk_buff *skb)
355
356              Description
357                     Retrieve the classid for the current task, i.e.  for  the
358                     net_cls cgroup to which skb belongs.
359
360                     This  helper  can  be  used on TC egress path, but not on
361                     ingress.
362
363                     The net_cls cgroup provides an interface to  tag  network
364                     packets based on a user-provided identifier for all traf‐
365                     fic coming  from  the  tasks  belonging  to  the  related
366                     cgroup. See also the related kernel documentation, avail‐
367                     able from the Linux  sources  in  file  Documentation/ad‐
368                     min-guide/cgroup-v1/net_cls.rst.
369
370                     The  Linux kernel has two versions for cgroups: there are
371                     cgroups v1 and cgroups v2. Both are available  to  users,
372                     who  can use a mixture of them, but note that the net_cls
373                     cgroup is for cgroup v1 only. This makes it  incompatible
374                     with   BPF   programs   run   on   cgroups,  which  is  a
375                     cgroup-v2-only feature (a socket can only hold  data  for
376                     one version of cgroups at a time).
377
                     This helper is only available if the kernel was compiled
                     with the CONFIG_CGROUP_NET_CLASSID configuration option
                     set to "y" or to "m".
381
382              Return The classid, or 0 for the default unconfigured classid.
383
384       long  bpf_skb_vlan_push(struct  sk_buff  *skb,  __be16  vlan_proto, u16
385       vlan_tci)
386
387              Description
388                     Push a vlan_tci (VLAN tag control information) of  proto‐
389                     col  vlan_proto to the packet associated to skb, then up‐
390                     date the checksum. Note that if vlan_proto  is  different
391                     from ETH_P_8021Q and ETH_P_8021AD, it is considered to be
392                     ETH_P_8021Q.
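
                     The two arguments can be illustrated with a small
                     user-space sketch. make_vlan_tci() and
                     effective_vlan_proto() are hypothetical names; the TCI
                     layout (3-bit priority, 1-bit drop eligible indicator,
                     12-bit VLAN ID) follows IEEE 802.1Q. For simplicity the
                     protocol values are compared in host byte order here,
                     whereas the helper itself takes vlan_proto as a __be16.

```c
#include <stdint.h>

#define ETH_P_8021Q  0x8100  /* 802.1Q VLAN tag */
#define ETH_P_8021AD 0x88A8  /* 802.1ad service VLAN tag */

/* Build a tag control information value: PCP(3) | DEI(1) | VID(12). */
static uint16_t make_vlan_tci(uint8_t pcp, uint8_t dei, uint16_t vid)
{
	return (uint16_t)(((pcp & 0x7u) << 13) |
			  ((dei & 0x1u) << 12) |
			  (vid & 0x0fffu));
}

/* Model of the fallback rule: unknown protocols are treated as 802.1Q. */
static uint16_t effective_vlan_proto(uint16_t proto)
{
	if (proto != ETH_P_8021Q && proto != ETH_P_8021AD)
		return ETH_P_8021Q;
	return proto;
}
```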
393
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
399
400              Return 0 on success, or a negative error in case of failure.
401
402       long bpf_skb_vlan_pop(struct sk_buff *skb)
403
404              Description
405                     Pop a VLAN header from the packet associated to skb.
406
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
412
413              Return 0 on success, or a negative error in case of failure.
414
415       long bpf_skb_get_tunnel_key(struct sk_buff *skb, struct  bpf_tunnel_key
416       *key, u32 size, u64 flags)
417
418              Description
419                     Get  tunnel  metadata. This helper takes a pointer key to
420                     an empty struct bpf_tunnel_key  of  size,  that  will  be
421                     filled  with tunnel metadata for the packet associated to
422                     skb.  The flags can be set to  BPF_F_TUNINFO_IPV6,  which
423                     indicates  that  the tunnel is based on IPv6 protocol in‐
424                     stead of IPv4.
425
426                     The struct bpf_tunnel_key is an object  that  generalizes
427                     the principal parameters used by various tunneling proto‐
428                     cols into a single struct. This way, it can  be  used  to
429                     easily  make  a decision based on the contents of the en‐
430                     capsulation header, "summarized" in this struct. In  par‐
431                     ticular,  it holds the IP address of the remote end (IPv4
432                     or IPv6, depending on the case)  in  key->remote_ipv4  or
433                     key->remote_ipv6. Also, this struct exposes the key->tun‐
434                     nel_id, which is generally mapped to a VNI (Virtual  Net‐
435                     work  Identifier),  making  it programmable together with
436                     the bpf_skb_set_tunnel_key() helper.
437
438                     Let's imagine that the following code is part of  a  pro‐
439                     gram  attached to the TC ingress interface, on one end of
440                     a GRE tunnel, and is supposed to filter out all  messages
441                     coming  from  remote  ends  with  IPv4 address other than
442                     10.0.0.1:
443
                        int ret;
                        struct bpf_tunnel_key key = {};

                        ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0);
                        if (ret < 0)
                                return TC_ACT_SHOT;     // drop packet

                        if (key.remote_ipv4 != 0x0a000001)
                                return TC_ACT_SHOT;     // drop packet

                        return TC_ACT_OK;               // accept packet
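
                     The constant 0x0a000001 in the comparison is the address
                     10.0.0.1 written as a 32-bit integer in host byte order.
                     The sketch below shows how such a constant can be
                     derived; ipv4_host_order() is a hypothetical helper for
                     illustration only.

```c
#include <stdint.h>

/* Pack a dotted-quad address a.b.c.d into a host-order 32-bit value. */
static uint32_t ipv4_host_order(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}
```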
455
456                     This interface can also be used  with  all  encapsulation
457                     devices  that can operate in "collect metadata" mode: in‐
458                     stead of having one network device per specific  configu‐
459                     ration,  the "collect metadata" mode only requires a sin‐
460                     gle device where the configuration can be extracted  from
461                     this helper.
462
                     This can be used together with various tunnels such as
                     VXLAN, Geneve, GRE or IP in IP (IPIP).
465
466              Return 0 on success, or a negative error in case of failure.
467
468       long bpf_skb_set_tunnel_key(struct sk_buff *skb, struct  bpf_tunnel_key
469       *key, u32 size, u64 flags)
470
471              Description
472                     Populate  tunnel  metadata  for packet associated to skb.
473                     The tunnel metadata is set to the  contents  of  key,  of
474                     size.  The  flags can be set to a combination of the fol‐
475                     lowing values:
476
477                     BPF_F_TUNINFO_IPV6
478                            Indicate that the tunnel is based on IPv6 protocol
479                            instead of IPv4.
480
481                     BPF_F_ZERO_CSUM_TX
482                            For  IPv4  packets,  add a flag to tunnel metadata
483                            indicating that  checksum  computation  should  be
484                            skipped and checksum set to zeroes.
485
486                     BPF_F_DONT_FRAGMENT
487                            Add  a flag to tunnel metadata indicating that the
488                            packet should not be fragmented.
489
490                     BPF_F_SEQ_NUMBER
491                            Add a flag to tunnel metadata  indicating  that  a
492                            sequence  number  should be added to tunnel header
493                            before sending the packet. This flag was added for
494                            GRE  encapsulation,  but  might be used with other
495                            protocols as well in the future.
496
497                     Here is a typical usage on the transmit path:
498
                        struct bpf_tunnel_key key;
                        /* populate key ... */
                        bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0);
                        bpf_clone_redirect(skb, vxlan_dev_ifindex, 0);
503
504                     See also the description of the  bpf_skb_get_tunnel_key()
505                     helper for additional information.
506
507              Return 0 on success, or a negative error in case of failure.
508
509       u64 bpf_perf_event_read(struct bpf_map *map, u64 flags)
510
511              Description
512                     Read  the  value of a perf event counter. This helper re‐
513                     lies on a map of type BPF_MAP_TYPE_PERF_EVENT_ARRAY.  The
514                     nature  of the perf event counter is selected when map is
515                     updated with perf event file descriptors. The map  is  an
516                     array  whose  size  is  the number of available CPUs, and
                     each cell contains a value relative to one CPU. The value
                     to retrieve is indicated by flags, which contains the
                     index of the CPU to look up, masked with
                     BPF_F_INDEX_MASK. Alternatively, flags can be set to
                     BPF_F_CURRENT_CPU to indicate that the value for the
                     current CPU should be retrieved.
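
                     How flags selects a CPU can be sketched in user space.
                     perf_flags_for_cpu() and perf_flags_is_current() are
                     hypothetical helpers; the two constants carry the values
                     defined in linux/bpf.h, where BPF_F_CURRENT_CPU is the
                     same bit pattern as BPF_F_INDEX_MASK.

```c
#include <stdint.h>

/* Values as defined in the kernel UAPI header linux/bpf.h. */
#define BPF_F_INDEX_MASK  0xffffffffULL
#define BPF_F_CURRENT_CPU BPF_F_INDEX_MASK

/* Hypothetical helper: flags value selecting an explicit CPU index. */
static uint64_t perf_flags_for_cpu(uint32_t cpu)
{
	return (uint64_t)cpu & BPF_F_INDEX_MASK;
}

/* Hypothetical helper: nonzero if flags asks for the current CPU. */
static int perf_flags_is_current(uint64_t flags)
{
	return (flags & BPF_F_INDEX_MASK) == BPF_F_CURRENT_CPU;
}
```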
523
                     Note that before Linux 4.13, only hardware perf events
                     can be retrieved.
526
527                     Also,    be    aware    that     the     newer     helper
528                     bpf_perf_event_read_value()     is    recommended    over
529                     bpf_perf_event_read() in general. The latter has some ABI
530                     quirks where error and counter value are used as a return
531                     code (which is wrong to do  since  ranges  may  overlap).
532                     This  issue  is  fixed  with bpf_perf_event_read_value(),
533                     which at the same time provides more  features  over  the
534                     bpf_perf_event_read()  interface. Please refer to the de‐
535                     scription of bpf_perf_event_read_value() for details.
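
                     The overlap between error codes and counter values can
                     be made concrete with a short user-space sketch.
                     looks_like_error() is a hypothetical check of the kind a
                     caller might write, and the assumption that errors lie in
                     the range -4095 to -1 follows the usual kernel errno
                     convention rather than anything stated in this page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: error codes occupy -4095..-1 (kernel errno convention). */
#define TOY_MAX_ERRNO 4095

/* Guess whether a value returned by bpf_perf_event_read() is an error. */
static bool looks_like_error(uint64_t ret)
{
	int64_t s = (int64_t)ret;

	return s < 0 && s >= -TOY_MAX_ERRNO;
}
```

                     A genuine counter value such as 0xfffffffffffffffe is
                     indistinguishable from the error -2 under this check,
                     which is the ambiguity that bpf_perf_event_read_value()
                     avoids by returning the counter through a separate
                     buffer.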
536
537              Return The value of the perf event counter read from the map, or
538                     a negative error code in case of failure.
539
540       long bpf_redirect(u32 ifindex, u64 flags)
541
542              Description
543                     Redirect  the  packet  to  another  net  device  of index
544                     ifindex.    This   helper   is   somewhat   similar    to
545                     bpf_clone_redirect(),  except  that  the  packet  is  not
546                     cloned, which provides increased performance.
547
548                     Except for XDP, both ingress and egress interfaces can be
549                     used for redirection. The BPF_F_INGRESS value in flags is
550                     used to make the distinction (ingress path is selected if
551                     the  flag  is present, egress path otherwise). Currently,
552                     XDP only supports redirection to  the  egress  interface,
553                     and accepts no flag at all.
554
555                     The  same  effect  can  also  be  attained  with the more
556                     generic bpf_redirect_map(), which uses a BPF map to store
557                     the  redirect  target instead of providing it directly to
558                     the helper.
559
560              Return For XDP, the helper returns XDP_REDIRECT  on  success  or
561                     XDP_ABORTED on error. For other program types, the values
562                     are TC_ACT_REDIRECT on success or TC_ACT_SHOT on error.
563
564       u32 bpf_get_route_realm(struct sk_buff *skb)
565
566              Description
                     Retrieve the realm of the route, that is to say the
                     tclassid field of the destination for the skb. The
                     identifier retrieved is a user-provided tag, similar to
                     the one used with the net_cls cgroup (see description for
                     bpf_get_cgroup_classid() helper), but here this tag is
                     held by a route (a destination entry), not by a task.
573
574                     Retrieving  this  identifier  works  with  the  clsact TC
575                     egress hook (see also  tc-bpf(8)),  or  alternatively  on
576                     conventional  classful  egress  qdiscs,  but  not  on  TC
577                     ingress path. In case of clsact TC egress hook, this  has
578                     the advantage that, internally, the destination entry has
579                     not been dropped yet in the transmit path. Therefore, the
580                     destination  entry  does not need to be artificially held
581                     via netif_keep_dst() for a classful qdisc until  the  skb
582                     is freed.
583
584                     This  helper is available only if the kernel was compiled
585                     with CONFIG_IP_ROUTE_CLASSID configuration option.
586
587              Return The realm of the route for the packet associated to  skb,
588                     or 0 if none was found.
589
590       long  bpf_perf_event_output(void  *ctx, struct bpf_map *map, u64 flags,
591       void *data, u64 size)
592
593              Description
594                     Write raw data blob into a special BPF perf event held by
595                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
596                     event must have the following attributes: PERF_SAMPLE_RAW
597                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
598                     PERF_COUNT_SW_BPF_OUTPUT as config.
599
600                     The flags are used to indicate the index in map for which
601                     the value must be put, masked with BPF_F_INDEX_MASK.  Al‐
602                     ternatively, flags can be set to BPF_F_CURRENT_CPU to in‐
603                     dicate  that  the index of the current CPU core should be
604                     used.
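
                     As a sketch of how the flags argument described above can
                     be composed, the following user-space C models the index
                     field. The two constants carry the values found in
                     include/uapi/linux/bpf.h at the time of writing (worth
                     checking against the headers in use); the perf_flags_*()
                     functions are hypothetical names introduced only for
                     illustration and are not part of any kernel or library
                     API.

```c
#include <assert.h>
#include <stdint.h>

/* Values as found in include/uapi/linux/bpf.h (verify for your kernel). */
#define BPF_F_INDEX_MASK  0xffffffffULL
#define BPF_F_CURRENT_CPU BPF_F_INDEX_MASK

/* Hypothetical: build flags that select an explicit index in the map. */
uint64_t perf_flags_for_index(uint32_t index)
{
    /* The all-ones index is reserved to mean "current CPU". */
    assert(index != (uint32_t)BPF_F_CURRENT_CPU);
    return (uint64_t)index & BPF_F_INDEX_MASK;
}

/* Hypothetical: recover the index field from a flags value. */
uint32_t perf_flags_index(uint64_t flags)
{
    return (uint32_t)(flags & BPF_F_INDEX_MASK);
}

/* Hypothetical: does this flags value ask for the current CPU's slot? */
int perf_flags_is_current_cpu(uint64_t flags)
{
    return (flags & BPF_F_INDEX_MASK) == BPF_F_CURRENT_CPU;
}
```

                     In an eBPF program the resulting value would simply be
                     passed as the third argument of bpf_perf_event_output();
                     most programs are expected to pass BPF_F_CURRENT_CPU
                     directly.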
605
606                     The value to write, of size bytes, is passed through the
607                     eBPF stack and pointed to by data.
608
609                     The context of the program, ctx, must also be passed  to
610                     the helper.
611
612                     In user space, a program wishing to read the values needs
613                     to  call  perf_event_open() on the perf event (either for
614                     one or for all CPUs) and to  store  the  file  descriptor
615                     into  the  map. This must be done before the eBPF program
616                     can send data into it. An example is  available  in  file
617                     samples/bpf/trace_output_user.c   in   the  Linux  kernel
618                     source tree (the eBPF  program  counterpart  is  in  sam‐
619                     ples/bpf/trace_output_kern.c).
620
621                     bpf_perf_event_output()  achieves better performance than
622                     bpf_trace_printk() for sharing data with user space,  and
623                     is much better suited to streaming data from eBPF
624                     programs.
625
626                     Note that this helper is not restricted  to  tracing  use
627                     cases and can be used with programs attached to TC or XDP
628                     as well, where it allows for passing data to  user  space
629                     listeners. Data can be:
630
631                     • Only custom structs,
632
633                     • Only the packet payload, or
634
635                     • A combination of both.
636
637              Return 0 on success, or a negative error in case of failure.
638
639       long bpf_skb_load_bytes(const void *skb, u32 offset, void *to, u32 len)
640
641              Description
642                     This helper was provided as an easy way to load data from
643                     a packet. It can be used to load len  bytes  from  offset
644                     from  the  packet  associated  to  skb,  into  the buffer
645                     pointed by to.
646
647                     Since Linux 4.7, usage of this helper has mostly been re‐
648                     placed by "direct packet access", enabling packet data to
649                     be manipulated with skb->data and skb->data_end  pointing
650                     respectively  to the first byte of packet data and to the
651                     byte after the last byte of packet data. However, it  re‐
652                     mains  useful  if  one wishes to read large quantities of
653                     data at once from a packet into the eBPF stack.
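
                     The bounds rule behind both approaches can be modeled in
                     ordinary user-space C. This is only an illustration under
                     stated assumptions: struct pkt_view and pkt_load_bytes()
                     are hypothetical names, the real helper operates on a
                     struct sk_buff (and can also reach non-linear data), and
                     it reports failure with a negative error code rather than
                     the placeholder -1 used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the data/data_end pair seen by a program. */
struct pkt_view {
    const uint8_t *data;     /* first byte of packet data */
    const uint8_t *data_end; /* one past the last byte of packet data */
};

/*
 * Copy len bytes found at offset into the buffer to.
 * Returns 0 on success, -1 if the range is not fully inside the packet.
 */
int pkt_load_bytes(const struct pkt_view *pkt, size_t offset,
                   void *to, size_t len)
{
    size_t avail;

    assert(pkt->data <= pkt->data_end);
    avail = (size_t)(pkt->data_end - pkt->data);

    /* Written this way so that offset + len cannot overflow. */
    if (offset > avail || len > avail - offset)
        return -1;

    memcpy(to, pkt->data + offset, len);
    return 0;
}
```

                     With direct packet access, the eBPF program performs the
                     equivalent comparison against skb->data_end itself, and
                     the verifier rejects programs that omit it.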
654
655              Return 0 on success, or a negative error in case of failure.
656
657       long bpf_get_stackid(void *ctx, struct bpf_map *map, u64 flags)
658
659              Description
660                     Walk a user or a kernel  stack  and  return  its  id.  To
661                     achieve this, the helper needs ctx, which is a pointer to
662                     the context on which the tracing program is executed, and
663                     a pointer to a map of type BPF_MAP_TYPE_STACK_TRACE.
664
665                     The  last  argument,  flags,  holds  the  number of stack
666                     frames  to  skip   (from   0   to   255),   masked   with
667                     BPF_F_SKIP_FIELD_MASK. The next bits can be used to set a
668                     combination of the following flags:
669
670                     BPF_F_USER_STACK
671                            Collect a user space stack  instead  of  a  kernel
672                            stack.
673
674                     BPF_F_FAST_STACK_CMP
675                            Compare stacks by hash only.
676
677                     BPF_F_REUSE_STACKID
678                            If   two  different  stacks  hash  into  the  same
679                            stackid, discard the old one.
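
                     A minimal sketch of how such a flags value can be
                     assembled is given below as user-space C. The constants
                     carry the values found in include/uapi/linux/bpf.h at the
                     time of writing; stackid_flags() is a hypothetical
                     function introduced for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Values as found in include/uapi/linux/bpf.h (verify for your kernel). */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_FAST_STACK_CMP  (1ULL << 9)
#define BPF_F_REUSE_STACKID   (1ULL << 10)

/*
 * Hypothetical: combine a frame skip count (0 to 255) with option bits.
 * The skip count lives in the low byte, the options in the bits above it.
 */
uint64_t stackid_flags(unsigned int skip, uint64_t options)
{
    assert(skip <= 255);
    assert((options & BPF_F_SKIP_FIELD_MASK) == 0);
    return ((uint64_t)skip & BPF_F_SKIP_FIELD_MASK) | options;
}
```

                     For example, stackid_flags(2, BPF_F_USER_STACK) describes
                     a user space stack walk that skips the two innermost
                     frames.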
680
681                     The stack id retrieved is a 32 bit  long  integer  handle
682                     which  can be further combined with other data (including
683                     other stack ids) and used as a key into maps. This can be
684                     useful  for generating a variety of graphs (such as flame
685                     graphs or off-cpu graphs).
686
687                     For walking a stack, this helper is an  improvement  over
688                     bpf_probe_read(),  which  can be used with unrolled loops
689                     but is inefficient and consumes many eBPF instructions.
690                     Instead, bpf_get_stackid() can collect up to
691                     PERF_MAX_STACK_DEPTH frames, for kernel and user stacks
692                     alike. Note that this limit can be controlled with the
693                     sysctl program, and that it should be manually increased
694                     in order to profile long user stacks (such as stacks for
695                     Java programs). To do so, use:
696
697                        # sysctl kernel.perf_event_max_stack=<new value>
698
699              Return The positive or null stack id on success, or  a  negative
700                     error in case of failure.
701
702       s64 bpf_csum_diff(__be32 *from, u32 from_size, __be32 *to, u32 to_size,
703       __wsum seed)
704
705              Description
706                     Compute  a  checksum  difference,  from  the  raw  buffer
707                     pointed by from, of length from_size (that must be a mul‐
708                     tiple of 4), towards the raw buffer  pointed  by  to,  of
709                     size to_size (same remark). An optional seed can be added
710                     to the value (this can be cascaded,  the  seed  may  come
711                     from a previous call to the helper).
712
713                     This is flexible enough to be used in several ways:
714
715                     • With from_size == 0, to_size > 0 and seed set to check‐
716                       sum, it can be used when pushing new data.
717
718                     • With from_size > 0, to_size == 0 and seed set to check‐
719                       sum, it can be used when removing data from a packet.
720
721                     • With  from_size  > 0, to_size > 0 and seed set to 0, it
722                       can be used to compute a diff. Note that from_size  and
723                       to_size do not need to be equal.
724
725                     This   helper   can   be   used   in   combination   with
726                     bpf_l3_csum_replace() and bpf_l4_csum_replace(), to which
727                     one   can   feed   in   the   difference   computed  with
728                     bpf_csum_diff().
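
                     The arithmetic can be illustrated with a user-space
                     model. This is a sketch under assumptions, not the kernel
                     implementation: it sums 32-bit words with end-around
                     carry (one's complement addition), takes sizes as word
                     counts instead of byte counts, ignores byte order, and
                     has no error path. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's complement addition of two 32-bit partial sums. */
uint32_t csum_add32(uint32_t a, uint32_t b)
{
    uint64_t s = (uint64_t)a + b;
    return (uint32_t)(s & 0xffffffffULL) + (uint32_t)(s >> 32);
}

/* Sum n words of buf on top of seed. */
uint32_t csum_words(const uint32_t *buf, size_t n, uint32_t seed)
{
    uint32_t sum = seed;
    for (size_t i = 0; i < n; i++)
        sum = csum_add32(sum, buf[i]);
    return sum;
}

/*
 * Model of the difference: subtract the "from" words (by adding their
 * complement), add the "to" words, start from seed.
 */
uint32_t csum_diff_model(const uint32_t *from, size_t from_words,
                         const uint32_t *to, size_t to_words, uint32_t seed)
{
    uint32_t sum = seed;

    assert(from != NULL || from_words == 0);
    assert(to != NULL || to_words == 0);
    for (size_t i = 0; i < from_words; i++)
        sum = csum_add32(sum, ~from[i]);
    for (size_t i = 0; i < to_words; i++)
        sum = csum_add32(sum, to[i]);
    return sum;
}

/* Fold a 32-bit partial sum down to 16 bits (not complemented). */
uint16_t csum_fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

                     In this model, adding the difference for one replaced
                     word to the sum of the old buffer folds to the same
                     16-bit value as summing the new buffer from scratch,
                     which is the property the incremental update relies on.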
729
730              Return The checksum result, or a negative error code in case  of
731                     failure.
732
733       long bpf_skb_get_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
734
735              Description
736                     Retrieve  tunnel  options metadata for the packet associ‐
737                     ated to skb, and store the raw tunnel option data to  the
738                     buffer opt of size.
739
740                     This helper can be used with encapsulation devices that
741                     can operate in "collect metadata" mode (please refer to
742                     the related note in the description of
743                     bpf_skb_get_tunnel_key() for more details). A particular
744                     example where this can be used is in combination with the
745                     Geneve encapsulation protocol, where it allows for pushing
746                     (with the bpf_skb_set_tunnel_opt() helper) and retrieving
747                     arbitrary TLVs (Type-Length-Value headers) from the eBPF
748                     program. This allows for full customization of these headers.
749
750              Return The size of the option data retrieved.
751
752       long bpf_skb_set_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
753
754              Description
755                     Set  tunnel options metadata for the packet associated to
756                     skb to the option data contained in the raw buffer opt of
757                     size.
758
759                     See  also the description of the bpf_skb_get_tunnel_opt()
760                     helper for additional information.
761
762              Return 0 on success, or a negative error in case of failure.
763
764       long bpf_skb_change_proto(struct sk_buff *skb, __be16 proto, u64 flags)
765
766              Description
767                     Change the protocol of the skb to proto. Currently
768                     supported are transitions from IPv4 to IPv6, and from IPv6
769                     to IPv4. The helper takes care of the groundwork for the
770                     transition, including resizing the socket buffer. The
771                     eBPF program is expected to fill the new headers, if any,
772                     via bpf_skb_store_bytes() and to recompute the checksums
773                     with bpf_l3_csum_replace() and bpf_l4_csum_replace(). The
774                     main use case for this helper is to perform NAT64
775                     operations from an eBPF program.
776
777                     Internally, the GSO type is marked as dodgy so that head‐
778                     ers  are  checked  and  segments  are recalculated by the
779                     GSO/GRO engine.  The size for GSO target  is  adapted  as
780                     well.
781
782                     All  values  for flags are reserved for future usage, and
783                     must be left at zero.
784
785                     A call to this helper is susceptible to change the under‐
786                     lying  packet buffer. Therefore, at load time, all checks
787                     on pointers previously done by the verifier  are  invali‐
788                     dated  and must be performed again, if the helper is used
789                     in combination with direct packet access.
790
791              Return 0 on success, or a negative error in case of failure.
792
793       long bpf_skb_change_type(struct sk_buff *skb, u32 type)
794
795              Description
796                     Change the packet type for the packet associated to  skb.
797                     This  comes down to setting skb->pkt_type to type, except
798                     that the eBPF program has no write access to
799                     skb->pkt_type other than through this helper. Using a
800                     helper here allows for graceful handling of errors.
801
802                     The major use case is to change incoming skbs to
803                     PACKET_HOST in a programmatic way instead of having to
804                     recirculate via redirect(..., BPF_F_INGRESS), for
805                     example.
806
807                     Note  that type only allows certain values. At this time,
808                     they are:
809
810                     PACKET_HOST
811                            Packet is for us.
812
813                     PACKET_BROADCAST
814                            Send packet to all.
815
816                     PACKET_MULTICAST
817                            Send packet to group.
818
819                     PACKET_OTHERHOST
820                            Send packet to someone else.
821
822              Return 0 on success, or a negative error in case of failure.
823
824       long bpf_skb_under_cgroup(struct sk_buff *skb, struct bpf_map *map, u32
825       index)
826
827              Description
828                     Check  whether skb is a descendant of the cgroup2 held by
829                     map of type BPF_MAP_TYPE_CGROUP_ARRAY, at index.
830
831              Return The return value depends on the result of the  test,  and
832                     can be:
833
834                     • 0, if the skb failed the cgroup2 descendant test.
835
836                     • 1, if the skb succeeded the cgroup2 descendant test.
837
838                     • A negative error code, if an error occurred.
839
840       u32 bpf_get_hash_recalc(struct sk_buff *skb)
841
842              Description
843                     Retrieve  the hash of the packet, skb->hash. If it is not
844                     set, in particular if the hash was cleared  due  to  man‐
845                     gling,  recompute  this  hash. Later accesses to the hash
846                     can be done directly with skb->hash.
847
848                     Calling bpf_set_hash_invalid(), changing the packet
849                     protocol with bpf_skb_change_proto(), or calling
850                     bpf_skb_store_bytes() with the BPF_F_INVALIDATE_HASH flag
851                     are actions that may clear the hash and trigger a new
852                     computation on the next call to
853                     bpf_get_hash_recalc().
854
855              Return The 32-bit hash.
856
857       u64 bpf_get_current_task(void)
858
859              Description
860                     Get the current task.
861
862              Return A pointer to the current task struct.
863
864       long bpf_probe_write_user(void *dst, const void *src, u32 len)
865
866              Description
867                     Attempt  in a safe way to write len bytes from the buffer
868                     src to dst in memory. It only works for threads that  are
869                     in  user  context, and dst must be a valid user space ad‐
870                     dress.
871
872                     This helper should not be used to implement any  kind  of
873                     security mechanism because of TOC-TOU attacks, but rather
874                     to debug, divert, and manipulate execution of  semi-coop‐
875                     erative processes.
876
877                     Keep  in mind that this feature is meant for experiments,
878                     and it has a risk of crashing the system and running pro‐
879                     grams.  Therefore, when an eBPF program using this helper
880                     is attached, a warning including PID and process name  is
881                     printed to kernel logs.
882
883              Return 0 on success, or a negative error in case of failure.
884
885       long bpf_current_task_under_cgroup(struct bpf_map *map, u32 index)
886
887              Description
888                     Check  whether the probe is being run in the context of a
889                     given subset of the cgroup2  hierarchy.  The  cgroup2  to
890                     test is held by map of type BPF_MAP_TYPE_CGROUP_ARRAY, at
891                     index.
892
893              Return The return value depends on the result of the  test,  and
894                     can be:
895
896                     • 1, if current task belongs to the cgroup2.
897
898                     • 0, if current task does not belong to the cgroup2.
899
900                     • A negative error code, if an error occurred.
901
902       long bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
903
904              Description
905                     Resize (trim or grow) the packet associated to skb to the
906                     new len. The flags are reserved  for  future  usage,  and
907                     must be left at zero.
908
909                     The basic idea is that the helper performs the needed
910                     work to change the size of the packet, then the eBPF
911                     program rewrites the rest via helpers like
912                     bpf_skb_store_bytes(), bpf_l3_csum_replace(),
913                     bpf_l4_csum_replace() and others. This helper is a slow
914                     path utility intended for replies with control messages.
915                     And because it is targeted for slow path, the helper
916                     itself can afford to be slow: it implicitly linearizes,
917                     unclones and drops offloads from the skb.
918
919                     A call to this helper is susceptible to change the under‐
920                     lying packet buffer. Therefore, at load time, all  checks
921                     on  pointers  previously done by the verifier are invali‐
922                     dated and must be performed again, if the helper is  used
923                     in combination with direct packet access.
924
925              Return 0 on success, or a negative error in case of failure.
926
927       long bpf_skb_pull_data(struct sk_buff *skb, u32 len)
928
929              Description
930                     Pull in non-linear data in case the skb is non-linear and
931                     not all of len are part of the linear section.  Make  len
932                     bytes  from skb readable and writable. If a zero value is
933                     passed for len, then all bytes in the linear part of  skb
934                     will be made readable and writable.
935
936                     This  helper  is only needed for reading and writing with
937                     direct packet access.
938
939                     For direct packet access, testing that offsets to access
940                     are within packet boundaries (test on skb->data_end) may
941                     fail if offsets are invalid, or if the requested data is
942                     in non-linear parts of the skb. On failure the program
943                     can just bail out, or in the case of a non-linear buffer,
944                     use a helper to make the data available. The
945                     bpf_skb_load_bytes() helper is a first solution to access
946                     the data. Another one consists in using
947                     bpf_skb_pull_data() to pull in the non-linear parts once,
948                     then retesting and finally accessing the data.
949
950                     At the same time, this also makes sure the skb is
951                     uncloned, which is a necessary condition for direct write.
952                     As this needs to be an invariant for the write part only,
953                     the verifier detects writes and adds a prologue that
954                     calls bpf_skb_pull_data() to effectively unclone the skb
955                     from the very beginning in case it is indeed cloned.
956
957                     A call to this helper is susceptible to change the under‐
958                     lying  packet buffer. Therefore, at load time, all checks
959                     on pointers previously done by the verifier  are  invali‐
960                     dated  and must be performed again, if the helper is used
961                     in combination with direct packet access.
962
963              Return 0 on success, or a negative error in case of failure.
964
965       s64 bpf_csum_update(struct sk_buff *skb, __wsum csum)
966
967              Description
968                     Add the checksum csum into skb->csum in case  the  driver
969                     has  supplied  a checksum for the entire packet into that
970                     field. Return an error otherwise. This helper is intended
971                     to  be  used in combination with bpf_csum_diff(), in par‐
972                     ticular when the checksum needs to be updated after  data
973                     has  been  written  into the packet through direct packet
974                     access.
975
976              Return The checksum on success, or a negative error code in case
977                     of failure.
978
979       void bpf_set_hash_invalid(struct sk_buff *skb)
980
981              Description
982                     Invalidate  the  current  skb->hash. It can be used after
983                     mangling on headers through direct packet access, in  or‐
984                     der  to indicate that the hash is outdated and to trigger
985                     a recalculation the next time the kernel tries to  access
986                     this  hash  or  when  the bpf_get_hash_recalc() helper is
987                     called.
988
989              Return void.
990
991       long bpf_get_numa_node_id(void)
992
993              Description
994                     Return the id of the current NUMA node. The  primary  use
995                     case  for this helper is the selection of sockets for the
996                     local NUMA node, when the program is attached to  sockets
997                     using   the  SO_ATTACH_REUSEPORT_EBPF  option  (see  also
998                     socket(7)), but the helper is  also  available  to  other
999                     eBPF  program  types,  similarly  to  bpf_get_smp_proces‐
1000                     sor_id().
1001
1002              Return The id of current NUMA node.
1003
1004       long bpf_skb_change_head(struct sk_buff *skb, u32 len, u64 flags)
1005
1006              Description
1007                     Grow the headroom of the packet associated to skb and
1008                     adjust the offset of the MAC header accordingly, adding
1009                     len bytes of space. It automatically extends and
1010                     reallocates memory as required.
1011
1012                     This  helper  can  be used on a layer 3 skb to push a MAC
1013                     header for redirection into a layer 2 device.
1014
1015                     All values for flags are reserved for future  usage,  and
1016                     must be left at zero.
1017
1018                     A call to this helper is susceptible to change the under‐
1019                     lying packet buffer. Therefore, at load time, all  checks
1020                     on  pointers  previously done by the verifier are invali‐
1021                     dated and must be performed again, if the helper is  used
1022                     in combination with direct packet access.
1023
1024              Return 0 on success, or a negative error in case of failure.
1025
1026       long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
1027
1028              Description
1029                     Adjust  (move)  xdp_md->data by delta bytes. Note that it
1030                     is possible to use  a  negative  value  for  delta.  This
1031                     helper  can  be used to prepare the packet for pushing or
1032                     popping headers.
1033
1034                     A call to this helper is susceptible to change the under‐
1035                     lying  packet buffer. Therefore, at load time, all checks
1036                     on pointers previously done by the verifier  are  invali‐
1037                     dated  and must be performed again, if the helper is used
1038                     in combination with direct packet access.
1039
1040              Return 0 on success, or a negative error in case of failure.
1041
1042       long bpf_probe_read_str(void *dst, u32 size, const void *unsafe_ptr)
1043
1044              Description
1045                     Copy a NUL terminated string from an  unsafe  kernel  ad‐
1046                     dress  unsafe_ptr to dst. See bpf_probe_read_kernel_str()
1047                     for more details.
1048
1049                     Generally,     use      bpf_probe_read_user_str()      or
1050                     bpf_probe_read_kernel_str() instead.
1051
1052              Return On  success,  the strictly positive length of the string,
1053                     including the trailing NUL character. On error,  a  nega‐
1054                     tive value.
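
                     The return convention can be modeled in user space as
                     follows. This sketch assumes the truncation behavior
                     given in the description of bpf_probe_read_kernel_str()
                     (at most size - 1 bytes are copied and the result is
                     always NUL terminated); model_read_str() is a
                     hypothetical name, src is an ordinary pointer here, and
                     the fault handling of the real helper is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * User-space model only. Copies a NUL terminated string into dst, which
 * holds size bytes, and returns the number of bytes written including the
 * trailing NUL.
 */
long model_read_str(char *dst, size_t size, const char *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (size == 0)
        return 0;

    /* Leave room for the terminator: copy at most size - 1 bytes. */
    for (i = 0; i + 1 < size && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';

    return (long)(i + 1);
}
```

                     A caller can therefore use the return value directly as
                     the length of the data to forward, for example to a perf
                     event, without scanning the buffer again.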
1055
1056       u64 bpf_get_socket_cookie(struct sk_buff *skb)
1057
1058              Description
1059                     If  the struct sk_buff pointed by skb has a known socket,
1060                     retrieve the cookie (generated by  the  kernel)  of  this
1061                     socket.   If  no  cookie has been set yet, generate a new
1062                     cookie. Once generated, the socket cookie remains  stable
1063                     for the life of the socket. This helper can be useful for
1064                     monitoring per socket networking traffic statistics as it
1065                     provides  a  global socket identifier that can be assumed
1066                     unique.
1067
1068              Return An 8-byte long unique number on success, or  0  if  the
1069                     socket field is missing inside skb.
1070
1071       u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
1072
1073              Description
1074                     Equivalent to bpf_get_socket_cookie() helper that accepts
1075                     skb, but gets socket from struct bpf_sock_addr context.
1076
1077              Return An 8-byte long unique number.
1078
1079       u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
1080
1081              Description
1082                     Equivalent to bpf_get_socket_cookie() helper that accepts
1083                     skb, but gets socket from struct bpf_sock_ops context.
1084
1085              Return An 8-byte long unique number.
1086
1087       u64 bpf_get_socket_cookie(struct sock *sk)
1088
1089              Description
1090                     Equivalent to bpf_get_socket_cookie() helper that accepts
1091                     skb, but gets socket from a BTF struct sock. This helper
1092                     also works for sleepable programs.
1093
1094              Return An 8-byte long unique number, or 0 if sk is NULL.
1095
1096       u32 bpf_get_socket_uid(struct sk_buff *skb)
1097
1098              Description
1099                     Get the owner UID of the socket associated to skb.
1100
1101              Return The  owner  UID  of  the socket associated to skb. If the
1102                     socket is NULL, or if it is not a full socket (i.e. if it
1103                     is  a time-wait or a request socket instead), overflowuid
1104                     value is returned (note that overflowuid  might  also  be
1105                     the actual UID value for the socket).
1106
1107       long bpf_set_hash(struct sk_buff *skb, u32 hash)
1108
1109              Description
1110                     Set  the  full  hash for skb (set the field skb->hash) to
1111                     value hash.
1112
1113              Return 0
1114
1115       long bpf_setsockopt(void *bpf_socket,  int  level,  int  optname,  void
1116       *optval, int optlen)
1117
1118              Description
1119                     Emulate a call to setsockopt() on the socket associated
1120                     to bpf_socket, which must be a full socket. The level at
1121                     which the option resides and the name optname of the
1122                     option must be specified; see setsockopt(2) for more
1123                     information. The option value, of length optlen, is
1124                     pointed to by optval.
1125
1126                     bpf_socket should be one of the following:
1127
1128                     • struct bpf_sock_ops for BPF_PROG_TYPE_SOCK_OPS.
1129
1130                     • struct bpf_sock_addr for BPF_CGROUP_INET4_CONNECT and
1131                       BPF_CGROUP_INET6_CONNECT.
1132
1133                     This helper actually implements a subset of setsockopt().
1134                     It supports the following levels:
1135
1136                     • SOL_SOCKET, which supports the following optnames:
1137                       SO_RCVBUF, SO_SNDBUF, SO_MAX_PACING_RATE, SO_PRIORITY,
1138                       SO_RCVLOWAT, SO_MARK, SO_BINDTODEVICE, SO_KEEPALIVE,
1139                       SO_REUSEADDR, SO_REUSEPORT, SO_BINDTOIFINDEX,
1140                       SO_TXREHASH.
1141
1142                     • IPPROTO_TCP, which supports the following optnames:
1143                       TCP_CONGESTION, TCP_BPF_IW, TCP_BPF_SNDCWND_CLAMP,
1144                       TCP_SAVE_SYN, TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT,
1145                       TCP_SYNCNT, TCP_USER_TIMEOUT, TCP_NOTSENT_LOWAT,
1146                       TCP_NODELAY, TCP_MAXSEG, TCP_WINDOW_CLAMP,
1147                       TCP_THIN_LINEAR_TIMEOUTS, TCP_BPF_DELACK_MAX,
1148                       TCP_BPF_RTO_MIN.
1149
1150                     • IPPROTO_IP, which supports optname IP_TOS.
1151
1152                     • IPPROTO_IPV6, which supports the following optnames:
1153                       IPV6_TCLASS, IPV6_AUTOFLOWLABEL.
1154
1155              Return 0 on success, or a negative error in case of failure.
1156
1157       long  bpf_skb_adjust_room(struct  sk_buff *skb, s32 len_diff, u32 mode,
1158       u64 flags)
1159
1160              Description
1161                     Grow or shrink the room for data in the packet associated
1162                     to skb by len_diff, and according to the selected mode.
1163
1164                     By  default, the helper will reset any offloaded checksum
1165                     indicator of  the  skb  to  CHECKSUM_NONE.  This  can  be
1166                     avoided by the following flag:
1167
1168                     • BPF_F_ADJ_ROOM_NO_CSUM_RESET: Do not reset offloaded
1169                       checksum data of the skb to CHECKSUM_NONE.
1170
1171                     There are two supported modes at this time:
1172
1173                     • BPF_ADJ_ROOM_MAC: Adjust room at the MAC layer (room
1174                       space is added or removed between the layer 2 and layer
1175                       3 headers).
1176
1177                     • BPF_ADJ_ROOM_NET: Adjust room at the network layer
1178                       (room space is added or removed between the layer 3 and
1179                       layer 4 headers).
1180
1181                     The following flags are supported at this time:
1182
1183                     • BPF_F_ADJ_ROOM_FIXED_GSO: Do not adjust gso_size.
1184                       Adjusting mss in this way is not allowed for datagrams.
1185
1186                     • BPF_F_ADJ_ROOM_ENCAP_L3_IPV4,
1187                       BPF_F_ADJ_ROOM_ENCAP_L3_IPV6: Any new space is reserved
1188                       to hold a tunnel header. Configure skb offsets and other
1189                       fields accordingly.
1190
1191                     • BPF_F_ADJ_ROOM_ENCAP_L4_GRE,
1192                       BPF_F_ADJ_ROOM_ENCAP_L4_UDP: Use with ENCAP_L3 flags to
1193                       further specify the tunnel type.
1194
1195                     • BPF_F_ADJ_ROOM_ENCAP_L2(len): Use with ENCAP_L3/L4
1196                       flags to further specify the tunnel type; len is the
1197                       length of the inner MAC header.
1198
1199                     • BPF_F_ADJ_ROOM_ENCAP_L2_ETH: Use with
1200                       BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the L2
1201                       type as Ethernet.
1202
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
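
                     As an illustration of how the encapsulation flags
                     combine, the following user-space sketch builds the
                     flags value for an IPv4 + UDP tunnel that carries an
                     inner Ethernet frame. The EX_* names are hypothetical
                     local stand-ins; their values, and the mask/shift
                     encoding of the inner MAC header length, are assumed to
                     mirror the kernel's UAPI header and should be checked
                     against it.

```c
#include <stdint.h>

/* Hypothetical stand-ins, assumed to mirror the kernel UAPI values. */
#define EX_ADJ_ROOM_ENCAP_L3_IPV4  (1ULL << 1)
#define EX_ADJ_ROOM_ENCAP_L4_UDP   (1ULL << 4)
#define EX_ADJ_ROOM_ENCAP_L2_ETH   (1ULL << 6)
#define EX_ADJ_ROOM_ENCAP_L2_MASK  0xffULL
#define EX_ADJ_ROOM_ENCAP_L2_SHIFT 56
#define EX_ADJ_ROOM_ENCAP_L2(len) \
	(((uint64_t)(len) & EX_ADJ_ROOM_ENCAP_L2_MASK) << EX_ADJ_ROOM_ENCAP_L2_SHIFT)

/* Flags for an IPv4 + UDP tunnel with an inner Ethernet header of
 * inner_mac_len bytes (assumed encoding: length in the top byte). */
static uint64_t ex_udp_eth_encap_flags(uint8_t inner_mac_len)
{
	return EX_ADJ_ROOM_ENCAP_L3_IPV4 |
	       EX_ADJ_ROOM_ENCAP_L4_UDP |
	       EX_ADJ_ROOM_ENCAP_L2_ETH |
	       EX_ADJ_ROOM_ENCAP_L2(inner_mac_len);
}
```

                     In an eBPF program, a value built this way would be
                     passed as the flags argument, together with
                     BPF_ADJ_ROOM_MAC as mode and a positive len_diff equal
                     to the total size of the headers being added.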
1208
1209              Return 0 on success, or a negative error in case of failure.
1210
1211       long bpf_redirect_map(struct bpf_map *map, u64 key, u64 flags)
1212
1213              Description
1214                     Redirect the packet to the endpoint referenced by map  at
1215                     index  key.  Depending  on its type, this map can contain
1216                     references to net devices (for forwarding packets through
1217                     other  ports),  or to CPUs (for redirecting XDP frames to
1218                     another CPU; but this is only implemented for native  XDP
1219                     (with driver support) as of this writing).
1220
1221                     The  lower  two bits of flags are used as the return code
1222                     if the map lookup fails. This is so that the return value
1223                     can  be one of the XDP program return codes up to XDP_TX,
1224                     as chosen by the caller. The higher bits of flags can  be
1225                     set  to  BPF_F_BROADCAST  or BPF_F_EXCLUDE_INGRESS as de‐
1226                     fined below.
1227
                     With BPF_F_BROADCAST, the packet is broadcast to all the
                     interfaces in the map. With BPF_F_EXCLUDE_INGRESS, the
                     ingress interface is excluded from the broadcast.
1232
1233                     See  also bpf_redirect(), which only supports redirecting
1234                     to an ifindex, but doesn't require a map to do so.
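
                     The layout of flags described above can be sketched in
                     plain C as follows. This is a user-space model, not
                     kernel code: the EX_* names are hypothetical, and their
                     values are assumed to mirror the XDP return codes and
                     flag bits in the kernel's UAPI header.

```c
#include <stdint.h>

/* Hypothetical stand-ins, assumed to mirror the kernel UAPI values. */
enum { EX_XDP_ABORTED, EX_XDP_DROP, EX_XDP_PASS, EX_XDP_TX, EX_XDP_REDIRECT };
#define EX_F_BROADCAST       (1ULL << 3)
#define EX_F_EXCLUDE_INGRESS (1ULL << 4)

/* Fallback action goes in the two lower bits, options in higher bits. */
static uint64_t ex_redirect_flags(unsigned int fallback, uint64_t options)
{
	return ((uint64_t)fallback & 0x3) | options;
}

/* Model of the documented return value: XDP_REDIRECT on success,
 * otherwise the two lower bits of flags. */
static long ex_redirect_result(int lookup_ok, uint64_t flags)
{
	return lookup_ok ? (long)EX_XDP_REDIRECT : (long)(flags & 0x3);
}
```

                     With EX_XDP_PASS as the fallback, a failed map lookup in
                     this model hands the packet to the regular stack instead
                     of aborting.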
1235
1236              Return XDP_REDIRECT on success, or the value of  the  two  lower
1237                     bits of the flags argument on error.
1238
1239       long  bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32
1240       key, u64 flags)
1241
1242              Description
1243                     Redirect the packet to the socket referenced by  map  (of
1244                     type BPF_MAP_TYPE_SOCKMAP) at index key. Both ingress and
1245                     egress  interfaces  can  be  used  for  redirection.  The
1246                     BPF_F_INGRESS value in flags is used to make the distinc‐
1247                     tion (ingress path is selected if the  flag  is  present,
1248                     egress  path  otherwise). This is the only flag supported
1249                     for now.
1250
1251              Return SK_PASS on success, or SK_DROP on error.
1252
1253       long bpf_sock_map_update(struct  bpf_sock_ops  *skops,  struct  bpf_map
1254       *map, void *key, u64 flags)
1255
1256              Description
1257                     Add an entry to, or update a map referencing sockets. The
1258                     skops is used as a new value for the entry associated  to
1259                     key. flags is one of:
1260
1261                     BPF_NOEXIST
1262                            The entry for key must not exist in the map.
1263
1264                     BPF_EXIST
1265                            The entry for key must already exist in the map.
1266
1267                     BPF_ANY
1268                            No  condition  on  the  existence of the entry for
1269                            key.
1270
1271                     If the map has eBPF programs (parser and verdict),  those
1272                     will  be  inherited  by  the  socket  being added. If the
1273                     socket is already attached to eBPF programs, this results
1274                     in an error.
1275
1276              Return 0 on success, or a negative error in case of failure.
1277
1278       long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
1279
1280              Description
1281                     Adjust  the address pointed by xdp_md->data_meta by delta
1282                     (which can be positive or negative). Note that this oper‐
1283                     ation modifies the address stored in xdp_md->data, so the
1284                     latter must be loaded only  after  the  helper  has  been
1285                     called.
1286
1287                     The use of xdp_md->data_meta is optional and programs are
1288                     not required to use it. The rationale is  that  when  the
1289                     packet  is processed with XDP (e.g. as DoS filter), it is
1290                     possible to push further meta data along with  it  before
1291                     passing  to  the stack, and to give the guarantee that an
1292                     ingress eBPF program attached as a TC classifier  on  the
1293                     same device can pick this up for further post-processing.
1294                     Since TC works with socket buffers, it  remains  possible
1295                     to  set  from XDP the mark or priority pointers, or other
1296                     pointers for the  socket  buffer.   Having  this  scratch
1297                     space  generic and programmable allows for more flexibil‐
1298                     ity as the user is free to store whatever meta data  they
1299                     need.
1300
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
1306
1307              Return 0 on success, or a negative error in case of failure.
1308
1309       long  bpf_perf_event_read_value(struct  bpf_map *map, u64 flags, struct
1310       bpf_perf_event_value *buf, u32 buf_size)
1311
1312              Description
1313                     Read the value of a perf event counter, and store it into
1314                     buf of size buf_size. This helper relies on a map of type
1315                     BPF_MAP_TYPE_PERF_EVENT_ARRAY. The  nature  of  the  perf
1316                     event  counter  is selected when map is updated with perf
1317                     event file descriptors. The map is an array whose size is
1318                     the  number  of  available CPUs, and each cell contains a
1319                     value relative to one CPU. The value to retrieve is indi‐
1320                     cated  by  flags,  that  contains the index of the CPU to
1321                     look up,  masked  with  BPF_F_INDEX_MASK.  Alternatively,
1322                     flags  can  be  set to BPF_F_CURRENT_CPU to indicate that
1323                     the value for the current CPU should be retrieved.
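
                     The two ways of filling flags can be sketched as below.
                     The EX_* names are hypothetical; the mask value, and the
                     assumption that BPF_F_CURRENT_CPU equals the full index
                     mask, are taken to mirror the kernel's UAPI header and
                     should be verified there.

```c
#include <stdint.h>

/* Hypothetical stand-ins, assumed to mirror the kernel UAPI values. */
#define EX_F_INDEX_MASK  0xffffffffULL
#define EX_F_CURRENT_CPU EX_F_INDEX_MASK

/* flags selecting the counter of one explicit CPU. */
static uint64_t ex_perf_flags_for_cpu(uint32_t cpu)
{
	return (uint64_t)cpu & EX_F_INDEX_MASK;
}

/* True if flags asks for the CPU the program is running on. */
static int ex_is_current_cpu(uint64_t flags)
{
	return (flags & EX_F_INDEX_MASK) == EX_F_CURRENT_CPU;
}
```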
1324
1325                     This   helper    behaves    in    a    way    close    to
1326                     bpf_perf_event_read()  helper,  save that instead of just
1327                     returning the value observed, it fills the buf structure.
1328                     This  allows for additional data to be retrieved: in par‐
1329                     ticular, the enabled and running times  (in  buf->enabled
1330                     and  buf->running,  respectively) are copied. In general,
1331                     bpf_perf_event_read_value()    is    recommended     over
1332                     bpf_perf_event_read(), which has some ABI issues and pro‐
1333                     vides fewer functionalities.
1334
                     These values are interesting because hardware PMU
                     (Performance Monitoring Unit) counters are limited
                     resources. When there are more PMU-based perf events
                     opened than available counters, the kernel multiplexes
                     these events so that each event gets a certain
                     percentage (but not all) of the PMU time. When
                     multiplexing happens, the number of samples or the
                     counter value does not match what would be observed
                     without multiplexing, which makes comparison between
                     different runs difficult. Typically, the counter value
                     should be normalized before comparing it to other
                     experiments. The usual normalization is done as follows.
1347
1348                        normalized_counter = counter * t_enabled / t_running
1349
                     Where t_enabled is the time the event was enabled and
                     t_running is the time the event was running since the
                     last normalization. The enabled and running times are
                     accumulated since the perf event was opened. To obtain
                     the scaling factor between two invocations of an eBPF
                     program, users can use the CPU id as the key (which is
                     typical for the perf array usage model) to remember the
                     previous value and do the calculation inside the eBPF
                     program.
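
                     A minimal sketch of that calculation, written as plain
                     user-space C for clarity. struct ex_perf_value is a
                     hypothetical stand-in for struct bpf_perf_event_value,
                     and floating point is used only to keep the sketch
                     short; an eBPF program would need integer arithmetic.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_perf_event_value. */
struct ex_perf_value {
	uint64_t counter;
	uint64_t enabled;
	uint64_t running;
};

/* Scale the counter increase between two readings by the ratio of
 * enabled time to running time over the same interval. Returns 0 if
 * the event did not run at all during the interval. */
static uint64_t ex_scaled_delta(const struct ex_perf_value *prev,
				const struct ex_perf_value *cur)
{
	uint64_t d_counter = cur->counter - prev->counter;
	uint64_t d_enabled = cur->enabled - prev->enabled;
	uint64_t d_running = cur->running - prev->running;

	if (d_running == 0)
		return 0;
	return (uint64_t)((double)d_counter * (double)d_enabled /
			  (double)d_running);
}
```

                     For instance, if the counter grew by 2000 while the
                     event was enabled for 200 time units but running for
                     only 100 of them, the normalized increase is 4000.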
1358
1359              Return 0 on success, or a negative error in case of failure.
1360
1361       long bpf_perf_prog_read_value(struct bpf_perf_event_data  *ctx,  struct
1362       bpf_perf_event_value *buf, u32 buf_size)
1363
1364              Description
                     For an eBPF program attached to a perf event, retrieve
1366                     the value of the event  counter  associated  to  ctx  and
1367                     store  it  in  the  structure  pointed by buf and of size
1368                     buf_size. Enabled and running times are  also  stored  in
1369                     the     structure     (see    description    of    helper
1370                     bpf_perf_event_read_value() for more details).
1371
1372              Return 0 on success, or a negative error in case of failure.
1373
1374       long bpf_getsockopt(void *bpf_socket,  int  level,  int  optname,  void
1375       *optval, int optlen)
1376
1377              Description
                     Emulate a call to getsockopt() on the socket associated
                     to bpf_socket, which must be a full socket. The level at
                     which the option resides and the name optname of the
                     option must be specified; see getsockopt(2) for more
                     information. The retrieved value is stored in the
                     structure pointed by optval and of length optlen.
1384
1385                     bpf_socket should be one of the following:
1386
                     • struct bpf_sock_ops for BPF_PROG_TYPE_SOCK_OPS.

                     • struct bpf_sock_addr for BPF_CGROUP_INET4_CONNECT and
                       BPF_CGROUP_INET6_CONNECT.
1391
                     This helper actually implements a subset of getsockopt().
                     It supports the same set of optnames that is supported by
                     the bpf_setsockopt() helper, with two exceptions:
                     TCP_BPF_* options are bpf_setsockopt() only, and
                     TCP_SAVED_SYN is bpf_getsockopt() only.
1397
1398              Return 0 on success, or a negative error in case of failure.
1399
1400       long bpf_override_return(struct pt_regs *regs, u64 rc)
1401
1402              Description
1403                     Used  for  error  injection,  this helper uses kprobes to
1404                     override the return value of the probed function, and  to
1405                     set  it to rc.  The first argument is the context regs on
1406                     which the kprobe works.
1407
1408                     This helper works by setting the PC (program counter)  to
1409                     an  override function which is run in place of the origi‐
1410                     nal probed function. This means the  probed  function  is
1411                     not  run  at  all.  The replacement function just returns
1412                     with the required value.
1413
1414                     This helper has security implications, and thus  is  sub‐
1415                     ject  to restrictions. It is only available if the kernel
1416                     was compiled with the CONFIG_BPF_KPROBE_OVERRIDE configu‐
1417                     ration  option,  and  in this case it only works on func‐
1418                     tions tagged with  ALLOW_ERROR_INJECTION  in  the  kernel
1419                     code.
1420
1421                     Also,  the helper is only available for the architectures
1422                     having the CONFIG_FUNCTION_ERROR_INJECTION option. As  of
1423                     this writing, x86 architecture is the only one to support
1424                     this feature.
1425
1426              Return 0
1427
1428       long  bpf_sock_ops_cb_flags_set(struct  bpf_sock_ops   *bpf_sock,   int
1429       argval)
1430
1431              Description
                     Attempt to set the value of the bpf_sock_ops_cb_flags
                     field for the full TCP socket associated to bpf_sock to
                     argval.
1435
1436                     The  primary  use  of this field is to determine if there
1437                     should   be   calls   to   eBPF    programs    of    type
1438                     BPF_PROG_TYPE_SOCK_OPS at various points in the TCP code.
1439                     A program of the same type can change its value, per con‐
1440                     nection  and  as necessary, when the connection is estab‐
1441                     lished. This field is directly  accessible  for  reading,
1442                     but  this helper must be used for updates in order to re‐
1443                     turn an error if an eBPF program tries to set a  callback
1444                     that is not supported in the current kernel.
1445
1446                     argval is a flag array which can combine these flags:
1447
                     • BPF_SOCK_OPS_RTO_CB_FLAG (retransmission time out)

                     • BPF_SOCK_OPS_RETRANS_CB_FLAG (retransmission)

                     • BPF_SOCK_OPS_STATE_CB_FLAG (TCP state change)

                     • BPF_SOCK_OPS_RTT_CB_FLAG (every RTT)
1455
                     Therefore, this function can be used to clear a callback
                     flag by setting the appropriate bit to zero, for example
                     to disable the RTO callback:
1459
                     bpf_sock_ops_cb_flags_set(bpf_sock,
                            bpf_sock->bpf_sock_ops_cb_flags &
                            ~BPF_SOCK_OPS_RTO_CB_FLAG)
1463
                     Here are some examples of where one could call such an
                     eBPF program:
1466
1467                     • When RTO fires.
1468
1469                     • When a packet is retransmitted.
1470
1471                     • When the connection terminates.
1472
1473                     • When a packet is sent.
1474
1475                     • When a packet is received.
1476
1477              Return Code -EINVAL if the socket is not a full TCP socket; oth‐
1478                     erwise, a positive number containing the bits that  could
1479                     not be set is returned (which comes down to 0 if all bits
1480                     were set as required).
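
                     The return value described above can be modeled in
                     user-space C as follows. This is only a sketch of the
                     documented behavior, not the kernel implementation; the
                     EX_* names and bit positions are hypothetical stand-ins
                     assumed to mirror the UAPI flags, and supported stands
                     for the set of callbacks known to the running kernel.

```c
/* Hypothetical stand-ins, assumed to mirror the kernel UAPI values. */
#define EX_RTO_CB_FLAG     (1 << 0)
#define EX_RETRANS_CB_FLAG (1 << 1)
#define EX_STATE_CB_FLAG   (1 << 2)
#define EX_RTT_CB_FLAG     (1 << 3)

/* Store the supported bits of argval in *cb_flags and return the bits
 * that could not be set (0 if every requested bit was accepted). */
static int ex_cb_flags_set(int *cb_flags, int argval, int supported)
{
	*cb_flags = argval & supported;
	return argval & ~supported;
}
```

                     A caller can therefore detect an unsupported callback by
                     checking for a non-zero return value.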
1481
1482       long bpf_msg_redirect_map(struct sk_msg_buff *msg, struct bpf_map *map,
1483       u32 key, u64 flags)
1484
1485              Description
1486                     This  helper is used in programs implementing policies at
1487                     the socket level. If the message msg is allowed  to  pass
1488                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1489                     rect  it  to  the  socket  referenced  by  map  (of  type
1490                     BPF_MAP_TYPE_SOCKMAP)  at  index  key.  Both  ingress and
1491                     egress  interfaces  can  be  used  for  redirection.  The
1492                     BPF_F_INGRESS value in flags is used to make the distinc‐
1493                     tion (ingress path is selected if the  flag  is  present,
1494                     egress  path  otherwise). This is the only flag supported
1495                     for now.
1496
1497              Return SK_PASS on success, or SK_DROP on error.
1498
1499       long bpf_msg_apply_bytes(struct sk_msg_buff *msg, u32 bytes)
1500
1501              Description
1502                     For socket policies, apply the verdict of the  eBPF  pro‐
1503                     gram to the next bytes (number of bytes) of message msg.
1504
1505                     For  example,  this  helper  can be used in the following
1506                     cases:
1507
1508                     • A single sendmsg() or sendfile() system  call  contains
1509                       multiple logical messages that the eBPF program is sup‐
1510                       posed to read and for which it should apply a verdict.
1511
1512                     • An eBPF program only cares to read the first bytes of a
1513                       msg.  If  the message has a large payload, then setting
1514                       up and calling the  eBPF  program  repeatedly  for  all
1515                       bytes,  even though the verdict is already known, would
1516                       create unnecessary overhead.
1517
1518                     When called from within an eBPF program, the helper  sets
1519                     a  counter  internal  to  the BPF infrastructure, that is
1520                     used to apply the last verdict  to  the  next  bytes.  If
1521                     bytes  is  smaller  than the current data being processed
1522                     from a sendmsg() or sendfile()  system  call,  the  first
1523                     bytes  will  be  sent and the eBPF program will be re-run
1524                     with the pointer for start of data pointing to byte  num‐
1525                     ber  bytes  + 1. If bytes is larger than the current data
1526                     being processed, then the eBPF verdict will be applied to
1527                     multiple  sendmsg()  or  sendfile() calls until bytes are
1528                     consumed.
1529
1530                     Note that if a socket closes with  the  internal  counter
1531                     holding  a  non-zero value, this is not a problem because
1532                     data is not being buffered for bytes and is sent as it is
1533                     received.
1534
1535              Return 0
1536
1537       long bpf_msg_cork_bytes(struct sk_msg_buff *msg, u32 bytes)
1538
1539              Description
1540                     For socket policies, prevent the execution of the verdict
1541                     eBPF program for message msg until  bytes  (byte  number)
1542                     have been accumulated.
1543
                     This can be used when one needs a specific number of
                     bytes before a verdict can be assigned, even if the data
                     spans multiple sendmsg() or sendfile() calls. The extreme
                     case would be a user calling sendmsg() repeatedly with
                     1-byte long message segments. Obviously, this is bad for
                     performance, but it is still valid. If the eBPF program
                     needs bytes bytes to validate a header, this helper can
                     be used to prevent the eBPF program from being called
                     again until bytes have been accumulated.
1553
1554              Return 0
1555
1556       long bpf_msg_pull_data(struct sk_msg_buff *msg, u32 start, u32 end, u64
1557       flags)
1558
1559              Description
                     For socket policies, pull in non-linear data from user
                     space for msg and set pointers msg->data and
                     msg->data_end to the start and end byte offsets into
                     msg, respectively.
1564
1565                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg
1566                     it can only parse data that the (data, data_end) pointers
1567                     have already consumed. For sendmsg() hooks this is likely
1568                     the first scatterlist element. But for calls  relying  on
1569                     the  sendpage  handler (e.g. sendfile()) this will be the
1570                     range (0, 0) because the data is shared with  user  space
1571                     and  by  default  the objective is to avoid allowing user
1572                     space to modify data while (or after) eBPF verdict is be‐
1573                     ing  decided. This helper can be used to pull in data and
1574                     to set the start and end pointer to  given  values.  Data
1575                     will  be copied if necessary (i.e. if data was not linear
1576                     and if start and end pointers do not point  to  the  same
1577                     chunk).
1578
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
1584
1585                     All  values  for flags are reserved for future usage, and
1586                     must be left at zero.
1587
1588              Return 0 on success, or a negative error in case of failure.
1589
1590       long bpf_bind(struct bpf_sock_addr *ctx,  struct  sockaddr  *addr,  int
1591       addr_len)
1592
1593              Description
                     Bind the socket associated to ctx to the address pointed
                     by addr, of length addr_len. This allows for making
                     outgoing connections from the desired IP address, which
                     can be useful for example when all processes inside a
                     cgroup should use one single IP address on a host that
                     has multiple IP addresses configured.
1600
                     This helper works for IPv4 and IPv6, TCP and UDP sockets.
                     The domain (addr->sa_family) must be AF_INET (or
                     AF_INET6). It is advised to pass a zero port (sin_port
                     or sin6_port), which triggers
                     IP_BIND_ADDRESS_NO_PORT-like behavior and lets the kernel
                     efficiently pick an unused port as long as the 4-tuple
                     is unique. Passing a non-zero port might lead to
                     degraded performance.
1608
1609              Return 0 on success, or a negative error in case of failure.
1610
1611       long bpf_xdp_adjust_tail(struct xdp_buff *xdp_md, int delta)
1612
1613              Description
                     Adjust (move) xdp_md->data_end by delta bytes. It is
                     possible to both shrink and grow the packet tail.
                     Shrinking is done by passing a negative delta.
1617
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
1623
1624              Return 0 on success, or a negative error in case of failure.
1625
1626       long bpf_skb_get_xfrm_state(struct  sk_buff  *skb,  u32  index,  struct
1627       bpf_xfrm_state *xfrm_state, u32 size, u64 flags)
1628
1629              Description
1630                     Retrieve the XFRM state (IP transform framework, see also
1631                     ip-xfrm(8)) at index in XFRM "security path" for skb.
1632
1633                     The   retrieved   value   is   stored   in   the   struct
1634                     bpf_xfrm_state pointed by xfrm_state and of length size.
1635
1636                     All  values  for flags are reserved for future usage, and
1637                     must be left at zero.
1638
1639                     This helper is available only if the kernel was  compiled
1640                     with CONFIG_XFRM configuration option.
1641
1642              Return 0 on success, or a negative error in case of failure.
1643
1644       long bpf_get_stack(void *ctx, void *buf, u32 size, u64 flags)
1645
1646              Description
                     Return a user or a kernel stack in a buffer provided by
                     the bpf program. To achieve this, the helper needs ctx,
                     which is a pointer to the context on which the tracing
                     program is executed. To store the stacktrace, the bpf
                     program provides buf with a nonnegative size.
1652
1653                     The  last  argument,  flags,  holds  the  number of stack
1654                     frames  to  skip   (from   0   to   255),   masked   with
1655                     BPF_F_SKIP_FIELD_MASK.  The  next bits can be used to set
1656                     the following flags:
1657
1658                     BPF_F_USER_STACK
1659                            Collect a user space stack  instead  of  a  kernel
1660                            stack.
1661
1662                     BPF_F_USER_BUILD_ID
1663                            Collect (build_id, file_offset) instead of ips for
1664                            user stack, only valid if BPF_F_USER_STACK is also
1665                            specified.
1666
1667                            file_offset is an offset relative to the beginning
1668                            of the executable or shared  object  file  backing
1669                            the vma which the ip falls in. It is not an offset
1670                            relative to that object's  base  address.  Accord‐
1671                            ingly,  it  must  be adjusted by adding (sh_addr -
1672                            sh_offset), where sh_{addr,offset}  correspond  to
1673                            the  executable  section containing file_offset in
1674                            the object, for comparisons to  symbols'  st_value
1675                            to be valid.
1676
                     bpf_get_stack() can collect up to PERF_MAX_STACK_DEPTH
                     frames, for both kernel and user stacks, subject to a
                     sufficiently large buffer size. Note that this limit can
                     be controlled with the sysctl program, and that it
                     should be manually increased in order to profile long
                     user stacks (such as stacks for Java programs). To do
                     so, use:
1683
1684                        # sysctl kernel.perf_event_max_stack=<new value>
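
                     The flags encoding and the file_offset adjustment
                     described above can be sketched in user-space C as
                     follows. The EX_* names are hypothetical, and their
                     values are assumed to mirror the kernel's UAPI header.

```c
#include <stdint.h>

/* Hypothetical stand-ins, assumed to mirror the kernel UAPI values. */
#define EX_F_SKIP_FIELD_MASK 0xffULL
#define EX_F_USER_STACK      (1ULL << 8)
#define EX_F_USER_BUILD_ID   (1ULL << 11)

/* Number of frames to skip in the low bits, option flags above them.
 * The build-id flag is only added together with the user-stack flag. */
static uint64_t ex_stack_flags(unsigned int skip, int user, int build_id)
{
	uint64_t flags = (uint64_t)skip & EX_F_SKIP_FIELD_MASK;

	if (user) {
		flags |= EX_F_USER_STACK;
		if (build_id)
			flags |= EX_F_USER_BUILD_ID;
	}
	return flags;
}

/* Convert a file_offset into a value comparable with a symbol's
 * st_value, using the containing section's sh_addr and sh_offset. */
static uint64_t ex_offset_to_vaddr(uint64_t file_offset,
				   uint64_t sh_addr, uint64_t sh_offset)
{
	return file_offset + (sh_addr - sh_offset);
}
```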
1685
1686              Return The non-negative copied buf length equal to or less  than
1687                     size on success, or a negative error in case of failure.
1688
1689       long bpf_skb_load_bytes_relative(const void *skb, u32 offset, void *to,
1690       u32 len, u32 start_header)
1691
1692              Description
1693                     This helper is similar to bpf_skb_load_bytes() in that it
1694                     provides  an  easy way to load len bytes from offset from
1695                     the packet associated to skb, into the buffer pointed  by
1696                     to.  The  difference  to  bpf_skb_load_bytes()  is that a
1697                     fifth argument start_header exists in order to  select  a
1698                     base offset to start from. start_header can be one of:
1699
1700                     BPF_HDR_START_MAC
1701                            Base offset to load data from is skb's mac header.
1702
1703                     BPF_HDR_START_NET
1704                            Base  offset  to  load  data from is skb's network
1705                            header.
1706
                     In general, "direct packet access" is the preferred
                     method to access packet data. However, this helper is
                     particularly useful in socket filters, where skb->data
                     does not always point to the start of the mac header and
                     where "direct packet access" is not available.
1712
1713              Return 0 on success, or a negative error in case of failure.
1714
1715       long bpf_fib_lookup(void *ctx, struct bpf_fib_lookup *params, int plen,
1716       u32 flags)
1717
1718              Description
                     Do a FIB lookup in kernel tables using parameters in
                     params. If the lookup is successful and the result shows
                     that the packet is to be forwarded, the neighbor tables
                     are searched for the nexthop. If successful (i.e., the
                     FIB lookup shows forwarding and the nexthop is resolved),
                     the nexthop address is returned in ipv4_dst or ipv6_dst
                     based on family, smac is set to the mac address of the
                     egress device, dmac is set to the nexthop mac address,
                     rt_metric is set to the metric from the route (IPv4/IPv6
                     only), and ifindex is set to the device index of the
                     nexthop from the FIB lookup.
1729
1730                     plen argument is the size of the passed in struct.  flags
1731                     argument  can be a combination of one or more of the fol‐
1732                     lowing values:
1733
1734                     BPF_FIB_LOOKUP_DIRECT
                            Do a direct table lookup instead of a full lookup
                            using FIB rules.
1737
1738                     BPF_FIB_LOOKUP_OUTPUT
1739                            Perform lookup from an egress perspective (default
1740                            is ingress).
1741
                     ctx is either struct xdp_md for XDP programs or struct
                     sk_buff for tc cls_act programs.
1744
1745              Return
1746
1747                     • < 0 if any input argument is invalid
1748
1749                     • 0 on success (packet is forwarded, nexthop neighbor ex‐
1750                       ists)
1751
                     • > 0 one of BPF_FIB_LKUP_RET_ codes explaining why the
                       packet is not forwarded or needs assistance from the
                       full stack
1754
1755                     If  lookup  fails with BPF_FIB_LKUP_RET_FRAG_NEEDED, then
1756                     the MTU was exceeded and output  params->mtu_result  con‐
1757                     tains the MTU.
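
                     A caller typically branches on the three classes of
                     return value listed above. The sketch below shows one
                     way to do so in plain C; the enum and function names are
                     hypothetical and only illustrate the classification.

```c
/* Hypothetical classification of a bpf_fib_lookup() return value. */
enum ex_fib_action {
	EX_FIB_ERROR,         /* < 0: an input argument was invalid */
	EX_FIB_FORWARD,       /* 0: forward, nexthop neighbor exists */
	EX_FIB_NOT_FORWARDED, /* > 0: a BPF_FIB_LKUP_RET_ code applies */
};

static enum ex_fib_action ex_classify_fib_ret(long ret)
{
	if (ret < 0)
		return EX_FIB_ERROR;
	if (ret == 0)
		return EX_FIB_FORWARD;
	return EX_FIB_NOT_FORWARDED;
}
```

                     In the last case, a program would usually hand the
                     packet to the full stack rather than drop it.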
1758
1759       long  bpf_sock_hash_update(struct  bpf_sock_ops  *skops, struct bpf_map
1760       *map, void *key, u64 flags)
1761
1762              Description
1763                     Add an entry to, or update  a  sockhash  map  referencing
1764                     sockets.   The skops is used as a new value for the entry
1765                     associated to key. flags is one of:
1766
1767                     BPF_NOEXIST
1768                            The entry for key must not exist in the map.
1769
1770                     BPF_EXIST
1771                            The entry for key must already exist in the map.
1772
1773                     BPF_ANY
1774                            No condition on the existence  of  the  entry  for
1775                            key.
1776
1777                     If  the map has eBPF programs (parser and verdict), those
1778                     will be inherited by  the  socket  being  added.  If  the
1779                     socket is already attached to eBPF programs, this results
1780                     in an error.
1781
1782              Return 0 on success, or a negative error in case of failure.
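
                     The effect of the three flags can be modelled in plain
                     user-space C as below. This is only a sketch: the function
                     name is hypothetical, and the specific error codes
                     (-EEXIST, -ENOENT, -EINVAL) are assumptions chosen for
                     illustration, since the text above only promises "a
                     negative error".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as found in the Linux UAPI header <linux/bpf.h>. */
#define BPF_ANY     0ULL
#define BPF_NOEXIST 1ULL
#define BPF_EXIST   2ULL

/* Hypothetical model: decide whether an update may proceed, given
 * whether an entry for the key is already present in the map. */
static int model_check_update_flags(bool entry_exists, uint64_t flags)
{
    switch (flags) {
    case BPF_ANY:
        return 0;                            /* no condition */
    case BPF_NOEXIST:
        return entry_exists ? -EEXIST : 0;   /* key must be absent */
    case BPF_EXIST:
        return entry_exists ? 0 : -ENOENT;   /* key must be present */
    default:
        return -EINVAL;                      /* assumed: unknown flag */
    }
}
```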
1783
1784       long  bpf_msg_redirect_hash(struct  sk_msg_buff  *msg,  struct  bpf_map
1785       *map, void *key, u64 flags)
1786
1787              Description
1788                     This  helper is used in programs implementing policies at
1789                     the socket level. If the message msg is allowed  to  pass
1790                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1791                     rect  it  to  the  socket  referenced  by  map  (of  type
1792                     BPF_MAP_TYPE_SOCKHASH)  using  hash key. Both ingress and
1793                     egress  interfaces  can  be  used  for  redirection.  The
1794                     BPF_F_INGRESS value in flags is used to make the distinc‐
1795                     tion (ingress path is selected if the  flag  is  present,
1796                     egress  path  otherwise). This is the only flag supported
1797                     for now.
1798
1799              Return SK_PASS on success, or SK_DROP on error.
1800
1801       long bpf_sk_redirect_hash(struct sk_buff  *skb,  struct  bpf_map  *map,
1802       void *key, u64 flags)
1803
1804              Description
1805                     This  helper is used in programs implementing policies at
1806                     the skb socket level. If the sk_buff skb  is  allowed  to
1807                     pass (i.e.  if the verdict eBPF program returns SK_PASS),
1808                     redirect it to the socket  referenced  by  map  (of  type
1809                     BPF_MAP_TYPE_SOCKHASH)  using  hash key. Both ingress and
1810                     egress  interfaces  can  be  used  for  redirection.  The
1811                     BPF_F_INGRESS value in flags is used to make the distinc‐
1812                     tion (ingress path is selected if the  flag  is  present,
1813                     egress  otherwise).  This  is the only flag supported for
1814                     now.
1815
1816              Return SK_PASS on success, or SK_DROP on error.
1817
1818       long bpf_lwt_push_encap(struct sk_buff *skb, u32 type, void  *hdr,  u32
1819       len)
1820
1821              Description
1822                     Encapsulate the packet associated to skb within a Layer 3
1823                     protocol header. This header is provided in the buffer at
1824                     address  hdr,  with len its size in bytes. type indicates
1825                     the protocol of the header and can be one of:
1826
1827                     BPF_LWT_ENCAP_SEG6
1828                            IPv6 encapsulation  with  Segment  Routing  Header
1829                            (struct  ipv6_sr_hdr).  hdr only contains the SRH,
1830                            the IPv6 header is computed by the kernel.
1831
1832                     BPF_LWT_ENCAP_SEG6_INLINE
1833                            Only works if skb contains an IPv6 packet.  Insert
1834                            a  Segment Routing Header (struct ipv6_sr_hdr) in‐
1835                            side the IPv6 header.
1836
1837                     BPF_LWT_ENCAP_IP
1838                            IP  encapsulation  (GRE/GUE/IPIP/etc).  The  outer
1839                            header  must  be IPv4 or IPv6, followed by zero or
1840                            more additional headers, up  to  LWT_BPF_MAX_HEAD‐
1841                            ROOM  total bytes in all prepended headers. Please
1842                            note that if skb_is_gso(skb) is true, no more than
1843                            two  headers  can  be  prepended,  and  the  inner
1844                            header,  if  present,  should  be  either  GRE  or
1845                            UDP/GUE.
1846
1847                     BPF_LWT_ENCAP_SEG6*  types  can be called by BPF programs
1848                     of type BPF_PROG_TYPE_LWT_IN; BPF_LWT_ENCAP_IP  type  can
                     be called by BPF programs of types BPF_PROG_TYPE_LWT_IN
1850                     and BPF_PROG_TYPE_LWT_XMIT.
1851
1852                     A call to this helper is susceptible to change the under‐
1853                     lying  packet buffer. Therefore, at load time, all checks
1854                     on pointers previously done by the verifier  are  invali‐
1855                     dated  and must be performed again, if the helper is used
1856                     in combination with direct packet access.
1857
1858              Return 0 on success, or a negative error in case of failure.
1859
1860       long bpf_lwt_seg6_store_bytes(struct sk_buff *skb,  u32  offset,  const
1861       void *from, u32 len)
1862
1863              Description
1864                     Store len bytes from address from into the packet associ‐
1865                     ated to skb, at offset. Only the flags, tag and TLVs  in‐
1866                     side  the  outermost  IPv6  Segment Routing Header can be
1867                     modified through this helper.
1868
1869                     A call to this helper is susceptible to change the under‐
1870                     lying  packet buffer. Therefore, at load time, all checks
1871                     on pointers previously done by the verifier  are  invali‐
1872                     dated  and must be performed again, if the helper is used
1873                     in combination with direct packet access.
1874
1875              Return 0 on success, or a negative error in case of failure.
1876
1877       long  bpf_lwt_seg6_adjust_srh(struct  sk_buff  *skb,  u32  offset,  s32
1878       delta)
1879
1880              Description
1881                     Adjust  the  size allocated to TLVs in the outermost IPv6
1882                     Segment Routing Header contained in the packet associated
1883                     to  skb,  at position offset by delta bytes. Only offsets
                     after the segments are accepted. delta can be either
                     positive (growing) or negative (shrinking).
1886
1887                     A call to this helper is susceptible to change the under‐
1888                     lying packet buffer. Therefore, at load time, all  checks
1889                     on  pointers  previously done by the verifier are invali‐
1890                     dated and must be performed again, if the helper is  used
1891                     in combination with direct packet access.
1892
1893              Return 0 on success, or a negative error in case of failure.
1894
1895       long  bpf_lwt_seg6_action(struct sk_buff *skb, u32 action, void *param,
1896       u32 param_len)
1897
1898              Description
1899                     Apply an IPv6 Segment Routing action of  type  action  to
1900                     the packet associated to skb. Each action takes a parame‐
1901                     ter contained at address param, and of  length  param_len
1902                     bytes.  action can be one of:
1903
1904                     SEG6_LOCAL_ACTION_END_X
1905                            End.X action: Endpoint with Layer-3 cross-connect.
1906                            Type of param: struct in6_addr.
1907
1908                     SEG6_LOCAL_ACTION_END_T
1909                            End.T action: Endpoint with  specific  IPv6  table
1910                            lookup.  Type of param: int.
1911
1912                     SEG6_LOCAL_ACTION_END_B6
1913                            End.B6  action:  Endpoint bound to an SRv6 policy.
1914                            Type of param: struct ipv6_sr_hdr.
1915
1916                     SEG6_LOCAL_ACTION_END_B6_ENCAP
1917                            End.B6.Encap action: Endpoint bound to an SRv6 en‐
1918                            capsulation   policy.    Type   of  param:  struct
1919                            ipv6_sr_hdr.
1920
1921                     A call to this helper is susceptible to change the under‐
1922                     lying  packet buffer. Therefore, at load time, all checks
1923                     on pointers previously done by the verifier  are  invali‐
1924                     dated  and must be performed again, if the helper is used
1925                     in combination with direct packet access.
1926
1927              Return 0 on success, or a negative error in case of failure.
1928
1929       long bpf_rc_repeat(void *ctx)
1930
1931              Description
1932                     This helper is used in programs implementing IR decoding,
1933                     to report a successfully decoded repeat key message. This
                     delays the generation of a key up event for the previously
                     generated key down event.

                     Some IR protocols like NEC have a special IR message for
                     repeating the last button, for when a button is held down.
1939
1940                     The ctx should point to the lirc sample  as  passed  into
1941                     the program.
1942
                     This helper is only available if the kernel was compiled
1944                     with the CONFIG_BPF_LIRC_MODE2 configuration  option  set
1945                     to "y".
1946
1947              Return 0
1948
1949       long bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle)
1950
1951              Description
1952                     This helper is used in programs implementing IR decoding,
1953                     to report a successfully decoded key press with scancode,
1954                     toggle  value in the given protocol. The scancode will be
1955                     translated to a keycode using the rc keymap, and reported
1956                     as an input key down event. After a period a key up event
1957                     is generated. This period can be extended by calling  ei‐
1958                     ther  bpf_rc_keydown()  again  with  the  same values, or
1959                     calling bpf_rc_repeat().
1960
1961                     Some protocols include a toggle bit, in case  the  button
1962                     was  released and pressed again between consecutive scan‐
1963                     codes.
1964
1965                     The ctx should point to the lirc sample  as  passed  into
1966                     the program.
1967
1968                     The  protocol  is  the  decoded protocol number (see enum
1969                     rc_proto for some predefined values).
1970
                     This helper is only available if the kernel was compiled
1972                     with  the  CONFIG_BPF_LIRC_MODE2 configuration option set
1973                     to "y".
1974
1975              Return 0
1976
1977       u64 bpf_skb_cgroup_id(struct sk_buff *skb)
1978
1979              Description
1980                     Return the cgroup v2 id of the socket associated with the
                     skb. This is roughly similar to the
                     bpf_get_cgroup_classid() helper for cgroup v1: it provides
                     a tag or identifier that can be matched on or used for map
                     lookups, e.g. to implement policy. The cgroup v2 id of a given
1985                     path  in  the  hierarchy is exposed in user space through
1986                     the f_handle API in order to get to the same 64-bit id.
1987
1988                     This helper can be used on TC egress  path,  but  not  on
1989                     ingress, and is available only if the kernel was compiled
1990                     with the CONFIG_SOCK_CGROUP_DATA configuration option.
1991
1992              Return The id is returned or 0 in case the id could not  be  re‐
1993                     trieved.
1994
1995       u64 bpf_get_current_cgroup_id(void)
1996
1997              Description
1998                     Get  the  current  cgroup  id  based on the cgroup within
1999                     which the current task is running.
2000
2001              Return A 64-bit integer containing the current cgroup  id  based
2002                     on the cgroup within which the current task is running.
2003
2004       void *bpf_get_local_storage(void *map, u64 flags)
2005
2006              Description
2007                     Get  the pointer to the local storage area.  The type and
2008                     the size of the local storage is defined by the map argu‐
2009                     ment.   The  flags meaning is specific for each map type,
                     The user is responsible for synchronization, for example
                     by using BPF_ATOMIC instructions to alter the shared
                     data.
2013                     can  be shared between multiple instances of the BPF pro‐
2014                     gram, running simultaneously.
2015
2016                     A user should care about the synchronization by  himself.
2017                     For  example, by using the BPF_ATOMIC instructions to al‐
2018                     ter the shared data.
2019
2020              Return A pointer to the local storage area.
2021
2022       long  bpf_sk_select_reuseport(struct  sk_reuseport_md  *reuse,   struct
2023       bpf_map *map, void *key, u64 flags)
2024
2025              Description
                     Select a SO_REUSEPORT socket from a
                     BPF_MAP_TYPE_REUSEPORT_SOCKARRAY map. It checks that the
                     selected socket matches the incoming request in the socket
                     buffer.
2029
2030              Return 0 on success, or a negative error in case of failure.
2031
2032       u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level)
2033
2034              Description
                     Return the id of the cgroup v2 that is the ancestor, at
                     ancestor_level, of the cgroup associated with the skb.
                     The root cgroup is at ancestor_level zero and each step
                     down the hierarchy increments the level. If ancestor_level
                     equals the level of the cgroup associated with skb, the
                     return value is the same as that of bpf_skb_cgroup_id().

                     The helper is useful to implement policies based on
                     cgroups that are higher in the hierarchy than the immediate
                     cgroup associated with skb.

                     The format of the returned id and the helper limitations
                     are the same as in bpf_skb_cgroup_id().
2048
2049              Return The  id  is returned or 0 in case the id could not be re‐
2050                     trieved.
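
                     The level numbering can be illustrated with a small
                     user-space model. This is a sketch under assumptions: the
                     function name and the array representation of the cgroup
                     path are hypothetical, and returning 0 for a level deeper
                     than the cgroup's own level is assumed from the "could not
                     be retrieved" case above.

```c
#include <stdint.h>

/* Hypothetical model: ids_from_root[0] is the root cgroup id and
 * ids_from_root[own_level] is the id of the cgroup tied to the skb. */
static uint64_t model_ancestor_id(const uint64_t *ids_from_root,
                                  int own_level, int ancestor_level)
{
    /* Assumed: out-of-range levels yield 0 (id not retrieved). */
    if (ancestor_level < 0 || ancestor_level > own_level)
        return 0;
    return ids_from_root[ancestor_level];
}
```

                     With a path of ids {1, 10, 42} and own_level 2, level 0
                     yields the root id 1 and level 2 yields 42, the id of the
                     cgroup itself.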
2051
2052       struct bpf_sock  *bpf_sk_lookup_tcp(void  *ctx,  struct  bpf_sock_tuple
2053       *tuple, u32 tuple_size, u64 netns, u64 flags)
2054
2055              Description
2056                     Look for TCP socket matching tuple, optionally in a child
2057                     network  namespace  netns.  The  return  value  must   be
2058                     checked, and if non-NULL, released via bpf_sk_release().
2059
2060                     The  ctx should point to the context of the program, such
2061                     as the skb or socket (depending on the hook in use). This
2062                     is  used  to determine the base network namespace for the
2063                     lookup.
2064
2065                     tuple_size must be one of:
2066
2067                     sizeof(tuple->ipv4)
2068                            Look for an IPv4 socket.
2069
2070                     sizeof(tuple->ipv6)
2071                            Look for an IPv6 socket.
2072
2073                     If the netns is a negative signed  32-bit  integer,  then
2074                     the  socket lookup table in the netns associated with the
2075                     ctx will be used. For the TC hooks, this is the netns  of
2076                     the  device  in  the  skb.  For socket hooks, this is the
2077                     netns of the socket.  If netns is any other signed 32-bit
2078                     value greater than or equal to zero then it specifies the
2079                     ID of the netns relative to the netns associated with the
2080                     ctx. netns values beyond the range of 32-bit integers are
2081                     reserved for future use.
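
                     The three ranges of netns described above can be sketched
                     as a classification function in plain user-space C. The
                     enum and function names are hypothetical, and the sketch
                     assumes that the sign test is applied to the low 32 bits
                     of the 64-bit argument.

```c
#include <stdint.h>

/* Hypothetical labels for how the netns argument is interpreted. */
enum netns_choice {
    NETNS_OF_CTX,    /* negative s32: use the netns associated with ctx */
    NETNS_BY_ID,     /* 0..INT32_MAX: netns ID relative to ctx's netns */
    NETNS_RESERVED   /* beyond 32-bit range: reserved for future use */
};

static enum netns_choice model_classify_netns(uint64_t netns)
{
    uint32_t low = (uint32_t)netns;

    /* Assumed: the value is negative when bit 31 of the low word is set. */
    if (low & 0x80000000u)
        return NETNS_OF_CTX;
    if (netns <= (uint64_t)INT32_MAX)
        return NETNS_BY_ID;
    return NETNS_RESERVED;
}
```

                     Passing (uint64_t)-1 therefore selects the netns of ctx,
                     while small non-negative values are treated as netns IDs.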
2082
2083                     All values for flags are reserved for future  usage,  and
2084                     must be left at zero.
2085
2086                     This  helper is available only if the kernel was compiled
2087                     with CONFIG_NET configuration option.
2088
2089              Return Pointer to struct bpf_sock, or NULL in case  of  failure.
2090                     For  sockets  with  reuseport option, the struct bpf_sock
2091                     result is from reuse->socks[] using the hash of  the  tu‐
2092                     ple.
2093
2094       struct  bpf_sock  *bpf_sk_lookup_udp(void  *ctx,  struct bpf_sock_tuple
2095       *tuple, u32 tuple_size, u64 netns, u64 flags)
2096
2097              Description
2098                     Look for UDP socket matching tuple, optionally in a child
2099                     network   namespace  netns.  The  return  value  must  be
2100                     checked, and if non-NULL, released via bpf_sk_release().
2101
2102                     The ctx should point to the context of the program,  such
2103                     as the skb or socket (depending on the hook in use). This
2104                     is used to determine the base network namespace  for  the
2105                     lookup.
2106
2107                     tuple_size must be one of:
2108
2109                     sizeof(tuple->ipv4)
2110                            Look for an IPv4 socket.
2111
2112                     sizeof(tuple->ipv6)
2113                            Look for an IPv6 socket.
2114
2115                     If  the  netns  is a negative signed 32-bit integer, then
2116                     the socket lookup table in the netns associated with  the
2117                     ctx  will be used. For the TC hooks, this is the netns of
2118                     the device in the skb. For  socket  hooks,  this  is  the
2119                     netns of the socket.  If netns is any other signed 32-bit
2120                     value greater than or equal to zero then it specifies the
2121                     ID of the netns relative to the netns associated with the
2122                     ctx. netns values beyond the range of 32-bit integers are
2123                     reserved for future use.
2124
2125                     All  values  for flags are reserved for future usage, and
2126                     must be left at zero.
2127
2128                     This helper is available only if the kernel was  compiled
2129                     with CONFIG_NET configuration option.
2130
2131              Return Pointer  to  struct bpf_sock, or NULL in case of failure.
2132                     For sockets with reuseport option,  the  struct  bpf_sock
2133                     result  is  from reuse->socks[] using the hash of the tu‐
2134                     ple.
2135
2136       long bpf_sk_release(void *sock)
2137
2138              Description
2139                     Release the reference  held  by  sock.  sock  must  be  a
2140                     non-NULL     pointer     that     was    returned    from
2141                     bpf_sk_lookup_xxx().
2142
2143              Return 0 on success, or a negative error in case of failure.
2144
2145       long bpf_map_push_elem(struct bpf_map  *map,  const  void  *value,  u64
2146       flags)
2147
2148              Description
2149                     Push an element value in map. flags is one of:
2150
2151                     BPF_EXIST
                            If the queue/stack is full, the oldest element is
                            removed to make room for the new one.
2154
2155              Return 0 on success, or a negative error in case of failure.
2156
2157       long bpf_map_pop_elem(struct bpf_map *map, void *value)
2158
2159              Description
2160                     Pop an element from map.
2161
2162              Return 0 on success, or a negative error in case of failure.
2163
2164       long bpf_map_peek_elem(struct bpf_map *map, void *value)
2165
2166              Description
2167                     Get an element from map without removing it.
2168
2169              Return 0 on success, or a negative error in case of failure.
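
              The push, pop and peek semantics described above can be modelled
              with a small FIFO in user-space C. This is a sketch only: the
              model_* names and the fixed capacity are hypothetical, and the
              error codes (-E2BIG when full, -ENOENT when empty) are
              assumptions chosen for illustration.

```c
#include <errno.h>
#include <stdint.h>

#define BPF_EXIST      2ULL  /* value from the UAPI header <linux/bpf.h> */
#define MODEL_CAPACITY 2     /* hypothetical, tiny on purpose */

struct model_queue {
    int slots[MODEL_CAPACITY];
    unsigned head;   /* index of the oldest element */
    unsigned count;  /* number of stored elements */
};

/* Push value; when full, BPF_EXIST drops the oldest element first. */
static int model_push(struct model_queue *q, int value, uint64_t flags)
{
    if (q->count == MODEL_CAPACITY) {
        if (!(flags & BPF_EXIST))
            return -E2BIG;                      /* assumed error code */
        q->head = (q->head + 1) % MODEL_CAPACITY;
        q->count--;
    }
    q->slots[(q->head + q->count) % MODEL_CAPACITY] = value;
    q->count++;
    return 0;
}

/* Read the oldest element without removing it. */
static int model_peek(const struct model_queue *q, int *value)
{
    if (q->count == 0)
        return -ENOENT;                         /* assumed error code */
    *value = q->slots[q->head];
    return 0;
}

/* Read and remove the oldest element. */
static int model_pop(struct model_queue *q, int *value)
{
    int err = model_peek(q, value);

    if (err)
        return err;
    q->head = (q->head + 1) % MODEL_CAPACITY;
    q->count--;
    return 0;
}
```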
2170
2171       long bpf_msg_push_data(struct sk_msg_buff *msg, u32 start, u32 len, u64
2172       flags)
2173
2174              Description
2175                     For  socket policies, insert len bytes into msg at offset
2176                     start.
2177
2178                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg
2179                     it  may  want to insert metadata or options into the msg.
2180                     This can later be read and used by any of the lower layer
2181                     BPF hooks.
2182
                     This helper may fail under memory pressure (when an
                     allocation fails). In these cases BPF programs will get an
                     appropriate error and will need to handle it.
2186
2187              Return 0 on success, or a negative error in case of failure.
2188
2189       long  bpf_msg_pop_data(struct sk_msg_buff *msg, u32 start, u32 len, u64
2190       flags)
2191
2192              Description
                     Remove len bytes from msg, starting at byte start. This
                     may result in ENOMEM errors under certain situations if an
                     allocation and copy are required due to a full ring
                     buffer. However, the helper will try to avoid doing the
                     allocation if possible. Other errors can occur if input
                     parameters are invalid, because the start byte is not a
                     valid part of the msg payload and/or the pop value is too
                     large.
2201
2202              Return 0 on success, or a negative error in case of failure.
2203
2204       long bpf_rc_pointer_rel(void *ctx, s32 rel_x, s32 rel_y)
2205
2206              Description
2207                     This helper is used in programs implementing IR decoding,
2208                     to report a successfully decoded pointer movement.
2209
2210                     The ctx should point to the lirc sample  as  passed  into
2211                     the program.
2212
                     This helper is only available if the kernel was compiled
2214                     with the CONFIG_BPF_LIRC_MODE2 configuration  option  set
2215                     to "y".
2216
2217              Return 0
2218
2219       long bpf_spin_lock(struct bpf_spin_lock *lock)
2220
2221              Description
2222                     Acquire a spinlock represented by the pointer lock, which
                     is stored as part of a value of a map. Taking the lock
                     allows the program to safely update the rest of the fields
                     in that value. The spinlock can (and must) later be
                     released with a call to bpf_spin_unlock(lock).
2227
2228                     Spinlocks  in BPF programs come with a number of restric‐
2229                     tions and constraints:
2230
                     • bpf_spin_lock objects are only allowed inside maps of
2232                       types  BPF_MAP_TYPE_HASH  and  BPF_MAP_TYPE_ARRAY (this
2233                       list could be extended in the future).
2234
2235                     • BTF description of the map is mandatory.
2236
                     • The BPF program can take ONE lock at a time, since
                       taking two or more could cause deadlocks.
2239
2240                     • Only  one  struct bpf_spin_lock is allowed per map ele‐
2241                       ment.
2242
2243                     • When the lock is taken, calls (either  BPF  to  BPF  or
2244                       helpers) are not allowed.
2245
2246                     • The  BPF_LD_ABS and BPF_LD_IND instructions are not al‐
2247                       lowed inside a spinlock-ed region.
2248
2249                     • The BPF program MUST call bpf_spin_unlock() to  release
2250                       the lock, on all execution paths, before it returns.
2251
2252                     • The  BPF  program  can access struct bpf_spin_lock only
2253                       via the bpf_spin_lock() and bpf_spin_unlock()  helpers.
2254                       Loading  or  storing data into the struct bpf_spin_lock
2255                       lock; field of a map is not allowed.
2256
                     • To use the bpf_spin_lock() helper, the BTF description
                       of the map value must be a struct and have a struct
                       bpf_spin_lock anyname; field at the top level. A lock
                       nested inside another struct is not allowed.
2261
2262                     • The struct bpf_spin_lock lock field in a map value must
2263                       be aligned on a multiple of 4 bytes in that value.
2264
2265                     • Syscall with command BPF_MAP_LOOKUP_ELEM does not  copy
2266                       the bpf_spin_lock field to user space.
2267
2268                     • Syscall  with  command  BPF_MAP_UPDATE_ELEM,  or update
2269                       from a BPF program, do  not  update  the  bpf_spin_lock
2270                       field.
2271
                     • bpf_spin_lock cannot be on the stack or inside a
                       networking packet (it can only be inside a map value).
2274
                     • bpf_spin_lock is available to root only.
2276
2277                     • Tracing programs and socket filter programs cannot  use
2278                       bpf_spin_lock()  due  to insufficient preemption checks
2279                       (but this may change in the future).
2280
                     • bpf_spin_lock is not allowed in inner maps of
2282                       map-in-map.
2283
2284              Return 0
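
              The layout constraints listed above (one lock per element, at
              the top level of the value struct, aligned on 4 bytes) can be
              illustrated with a map value definition. This is a sketch in
              plain C: struct model_spin_lock is a hypothetical stand-in for
              struct bpf_spin_lock so that the example compiles without kernel
              headers, and the counter fields are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_spin_lock (assumed 4 bytes). */
struct model_spin_lock {
    uint32_t val;
};

/* Example map value: exactly one lock, placed at the top level. */
struct model_counter_value {
    struct model_spin_lock lock;  /* protects the fields below */
    uint64_t packets;
    uint64_t bytes;
};

/* The lock field must sit at a multiple of 4 bytes within the value. */
_Static_assert(offsetof(struct model_counter_value, lock) % 4 == 0,
               "lock field must be 4-byte aligned");
```

              In a real BPF program the stand-in would be replaced by struct
              bpf_spin_lock, and the counters would only be updated between
              bpf_spin_lock() and bpf_spin_unlock() calls on that field.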
2285
2286       long bpf_spin_unlock(struct bpf_spin_lock *lock)
2287
2288              Description
2289                     Release   the   lock  previously  locked  by  a  call  to
2290                     bpf_spin_lock(lock).
2291
2292              Return 0
2293
2294       struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)
2295
2296              Description
2297                     This helper gets a struct bpf_sock pointer such that  all
2298                     the fields in this bpf_sock can be accessed.
2299
2300              Return A  struct bpf_sock pointer on success, or NULL in case of
2301                     failure.
2302
2303       struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk)
2304
2305              Description
2306                     This helper gets a struct  bpf_tcp_sock  pointer  from  a
2307                     struct bpf_sock pointer.
2308
2309              Return A struct bpf_tcp_sock pointer on success, or NULL in case
2310                     of failure.
2311
2312       long bpf_skb_ecn_set_ce(struct sk_buff *skb)
2313
2314              Description
2315                     Set ECN (Explicit Congestion Notification)  field  of  IP
2316                     header to CE (Congestion Encountered) if current value is
2317                     ECT (ECN Capable Transport). Otherwise, do nothing. Works
2318                     with IPv6 and IPv4.
2319
2320              Return 1  if  the  CE  flag is set (either by the current helper
2321                     call or because it was already present), 0 if it  is  not
2322                     set.
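
                     The rule above can be modelled on the IPv4 TOS (or IPv6
                     traffic class) byte, whose two low-order bits carry the
                     ECN field. This is a user-space sketch: the function and
                     macro names are hypothetical, and the checksum update that
                     a real IPv4 header modification requires is left out.

```c
#include <stdint.h>

#define ECN_MASK    0x03u  /* two low-order bits of the TOS byte */
#define ECN_NOT_ECT 0x00u  /* sender is not ECN capable */
#define ECN_CE      0x03u  /* Congestion Encountered */

/* Hypothetical model: mark CE if the packet is ECN capable.
 * Returns 1 if CE is set after the call, 0 otherwise. */
static int model_ecn_set_ce(uint8_t *tos)
{
    if ((*tos & ECN_MASK) == ECN_NOT_ECT)
        return 0;            /* not ECT: do nothing */
    *tos |= ECN_CE;          /* ECT(0), ECT(1) or CE all end up as CE */
    return 1;
}
```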
2323
2324       struct bpf_sock *bpf_get_listener_sock(struct bpf_sock *sk)
2325
2326              Description
2327                     Return  a  struct  bpf_sock  pointer in TCP_LISTEN state.
2328                     bpf_sk_release() is unnecessary and not allowed.
2329
2330              Return A struct bpf_sock pointer on success, or NULL in case  of
2331                     failure.
2332
2333       struct  bpf_sock  *bpf_skc_lookup_tcp(void  *ctx, struct bpf_sock_tuple
2334       *tuple, u32 tuple_size, u64 netns, u64 flags)
2335
2336              Description
2337                     Look for TCP socket matching tuple, optionally in a child
2338                     network   namespace  netns.  The  return  value  must  be
2339                     checked, and if non-NULL, released via bpf_sk_release().
2340
2341                     This function is identical to bpf_sk_lookup_tcp(), except
2342                     that  it  also  returns  timewait or request sockets. Use
2343                     bpf_sk_fullsock() or bpf_tcp_sock() to  access  the  full
2344                     structure.
2345
2346                     This  helper is available only if the kernel was compiled
2347                     with CONFIG_NET configuration option.
2348
2349              Return Pointer to struct bpf_sock, or NULL in case  of  failure.
2350                     For  sockets  with  reuseport option, the struct bpf_sock
2351                     result is from reuse->socks[] using the hash of  the  tu‐
2352                     ple.
2353
2354       long  bpf_tcp_check_syncookie(void  *sk, void *iph, u32 iph_len, struct
2355       tcphdr *th, u32 th_len)
2356
2357              Description
2358                     Check whether iph and th contain a valid SYN  cookie  ACK
2359                     for the listening socket in sk.
2360
2361                     iph points to the start of the IPv4 or IPv6 header, while
2362                     iph_len contains sizeof(struct  iphdr)  or  sizeof(struct
2363                     ipv6hdr).
2364
2365                     th  points  to  the start of the TCP header, while th_len
2366                     contains  the  length  of  the  TCP  header   (at   least
2367                     sizeof(struct tcphdr)).
2368
2369              Return 0 if iph and th are a valid SYN cookie ACK, or a negative
2370                     error otherwise.
2371
2372       long bpf_sysctl_get_name(struct  bpf_sysctl  *ctx,  char  *buf,  size_t
2373       buf_len, u64 flags)
2374
2375              Description
                     Get name of sysctl in /proc/sys/ and copy it into the
                     program-provided buffer buf of size buf_len.
2378
2379                     The  buffer  is  always  NUL  terminated,   unless   it's
2380                     zero-sized.
2381
                     If flags is zero, the full name (e.g. "net/ipv4/tcp_mem")
                     is copied. Use the BPF_F_SYSCTL_BASE_NAME flag to copy the
                     base name only (e.g. "tcp_mem").
2385
              Return Number of characters copied (not including the trailing
2387                     NUL).
2388
2389                     -E2BIG if the buffer wasn't big enough (buf will  contain
2390                     truncated name in this case).
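
                     The copy, truncation and base-name rules above can be
                     sketched in user-space C as follows. The function name is
                     hypothetical, and returning -E2BIG for a zero-sized buffer
                     is an assumption made for this model.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define BPF_F_SYSCTL_BASE_NAME (1ULL << 0)  /* from <linux/bpf.h> */

/* Hypothetical model: copy a sysctl name into buf, NUL terminated
 * unless buf_len is zero. Returns the number of characters copied,
 * or -E2BIG if the name had to be truncated. */
static long model_sysctl_get_name(const char *full_name, char *buf,
                                  size_t buf_len, uint64_t flags)
{
    const char *name = full_name;
    size_t len;

    if (flags & BPF_F_SYSCTL_BASE_NAME) {
        const char *slash = strrchr(full_name, '/');

        if (slash)
            name = slash + 1;        /* keep only the last component */
    }
    len = strlen(name);
    if (buf_len == 0)
        return -E2BIG;               /* assumed: nothing can be stored */
    if (len >= buf_len) {
        memcpy(buf, name, buf_len - 1);
        buf[buf_len - 1] = '\0';     /* truncated but still terminated */
        return -E2BIG;
    }
    memcpy(buf, name, len + 1);      /* includes the trailing NUL */
    return (long)len;
}
```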
2391
2392       long  bpf_sysctl_get_current_value(struct  bpf_sysctl  *ctx, char *buf,
2393       size_t buf_len)
2394
2395              Description
                     Get the current value of the sysctl as it is presented in
                     /proc/sys (including the newline, etc.), and copy it as a
                     string into the program-provided buffer buf of size
                     buf_len.

                     The whole value is copied, no matter at what file position
                     user space issued e.g. sys_read.
2402
2403                     The   buffer   is  always  NUL  terminated,  unless  it's
2404                     zero-sized.
2405
              Return Number of characters copied (not including the trailing
                     NUL).

                     -E2BIG if the buffer wasn't big enough (buf will contain
                     the truncated value in this case).

                     -EINVAL if the current value was unavailable, e.g. because
                     the sysctl is uninitialized and read returns -EIO for it.
2414
2415       long bpf_sysctl_get_new_value(struct bpf_sysctl *ctx, char *buf, size_t
2416       buf_len)
2417
2418              Description
                     Get the new value being written by user space to the
                     sysctl (before the actual write happens) and copy it as a
                     string into the program-provided buffer buf of size
                     buf_len.

                     User space may write the new value at file position > 0.
2424
2425                     The  buffer  is  always  NUL  terminated,   unless   it's
2426                     zero-sized.
2427
              Return Number of characters copied (not including the trailing
                     NUL).

                     -E2BIG if the buffer wasn't big enough (buf will contain
                     the truncated value in this case).

                     -EINVAL if the sysctl is being read.
2435
2436       long  bpf_sysctl_set_new_value(struct bpf_sysctl *ctx, const char *buf,
2437       size_t buf_len)
2438
2439              Description
                     Override the new value being written by user space to the
                     sysctl with the value provided by the program in buffer
                     buf of size buf_len.

                     buf should contain a string in the same form as provided
                     by user space on sysctl write.

                     User space may write the new value at file position > 0.
                     To override the whole sysctl value, the file position
                     should be set to zero.
2450
2451              Return 0 on success.
2452
                     -E2BIG if buf_len is too big.

                     -EINVAL if the sysctl is being read.
2456
2457       long bpf_strtol(const char *buf, size_t buf_len, u64 flags, long *res)
2458
2459              Description
2460                     Convert the initial part of the string from buffer buf of
2461                     size buf_len to a long integer  according  to  the  given
2462                     base and save the result in res.
2463
2464                     The  string  may  begin with an arbitrary amount of white
2465                     space (as determined by isspace(3)) followed by a  single
2466                     optional '-' sign.
2467
                     The five least significant bits of flags encode the base;
                     the other bits are currently unused.

                     The base must be 8, 10, 16, or 0; with 0, the base is
                     detected automatically, similar to user space strtol(3).
2473
2474              Return Number  of  characters consumed on success. Must be posi‐
2475                     tive but no more than buf_len.
2476
2477                     -EINVAL if no valid digits were found or unsupported base
2478                     was provided.
2479
2480                     -ERANGE if resulting value was out of range.
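
                     Because the return value is the number of characters
                     consumed, successive calls can walk a buffer holding
                     several numbers, such as the value of net/ipv4/tcp_mem.
                     The user-space sketch below models that calling pattern
                     on top of strtol(3). model_strtol() and parse_longs()
                     are hypothetical names; the model inherits strtol(3)
                     details (such as accepting a leading '+') that the
                     helper may not share, and it only looks at the first 63
                     bytes of the buffer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_BASE_MASK 0x1fULL /* five least significant bits of flags */

/* Stand-in for bpf_strtol(): same argument order and the return
 * convention described above (characters consumed, or -errno). */
static long model_strtol(const char *buf, size_t buf_len, uint64_t flags,
                         long *res)
{
    char tmp[64];
    char *end;
    unsigned int base = (unsigned int)(flags & MODEL_BASE_MASK);
    long val;

    if (base != 0 && base != 8 && base != 10 && base != 16)
        return -EINVAL;
    if (buf_len >= sizeof(tmp))
        buf_len = sizeof(tmp) - 1; /* model limitation */
    memcpy(tmp, buf, buf_len);
    tmp[buf_len] = '\0'; /* bound the parse to buf_len bytes */

    errno = 0;
    val = strtol(tmp, &end, (int)base);
    if (end == tmp)
        return -EINVAL; /* no valid digits */
    if (errno == ERANGE)
        return -ERANGE;
    *res = val;
    return (long)(end - tmp);
}

/* Parse count whitespace-separated decimal numbers from buf by
 * advancing the offset with each return value. */
static int parse_longs(const char *buf, size_t buf_len, long *out, int count)
{
    size_t off = 0;
    int i;

    for (i = 0; i < count; i++) {
        long used = model_strtol(buf + off, buf_len - off, 10, &out[i]);

        if (used < 0)
            return (int)used;
        off += (size_t)used;
    }
    return 0;
}
```

                     For the input "4096 87380 6291456", the first call
                     consumes 4 characters and each later call also consumes
                     the separating space in front of its number.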
2481
2482       long  bpf_strtoul(const  char *buf, size_t buf_len, u64 flags, unsigned
2483       long *res)
2484
2485              Description
2486                     Convert the initial part of the string from buffer buf of
2487                     size buf_len to an unsigned long integer according to the
2488                     given base and save the result in res.
2489
2490                     The string may begin with an arbitrary  amount  of  white
2491                     space (as determined by isspace(3)).
2492
                     The five least significant bits of flags encode the base;
                     the other bits are currently unused.

                     The base must be 8, 10, 16, or 0; with 0, the base is
                     detected automatically, similar to user space strtoul(3).
2498
2499              Return Number  of  characters consumed on success. Must be posi‐
2500                     tive but no more than buf_len.
2501
2502                     -EINVAL if no valid digits were found or unsupported base
2503                     was provided.
2504
2505                     -ERANGE if resulting value was out of range.
2506
2507       void  *bpf_sk_storage_get(struct  bpf_map  *map, void *sk, void *value,
2508       u64 flags)
2509
2510              Description
2511                     Get a bpf-local-storage from a sk.
2512
                     Logically, this can be thought of as getting the value
                     from a map with sk as the key. From this perspective, the
                     usage is not much different from bpf_map_lookup_elem(map,
                     &sk), except that this helper enforces that the key is a
                     full socket and that the map is a
                     BPF_MAP_TYPE_SK_STORAGE.
2519
2520                     Underneath,  the value is stored locally at sk instead of
2521                     the map.   The  map  is  used  as  the  bpf-local-storage
2522                     "type".  The  bpf-local-storage  "type" (i.e. the map) is
2523                     searched against all bpf-local-storages residing at sk.
2524
                     sk is a kernel struct sock pointer for LSM programs. sk
                     is a struct bpf_sock pointer for other program types.

                     An optional flag (BPF_SK_STORAGE_GET_F_CREATE) can be set
                     in flags so that a new bpf-local-storage is created if
                     one does not exist. value can be used together with
                     BPF_SK_STORAGE_GET_F_CREATE to specify the initial value
                     of the bpf-local-storage. If value is NULL, the new
                     bpf-local-storage will be zero initialized.
2534
2535              Return A bpf-local-storage pointer is returned on success.
2536
2537                     NULL if not found or there was an error in adding  a  new
2538                     bpf-local-storage.
2539
2540       long bpf_sk_storage_delete(struct bpf_map *map, void *sk)
2541
2542              Description
2543                     Delete a bpf-local-storage from a sk.
2544
2545              Return 0 on success.
2546
                     -ENOENT if the bpf-local-storage cannot be found.

                     -EINVAL if sk is not a fullsock (e.g. a request_sock).
2549
2550       long bpf_send_signal(u32 sig)
2551
2552              Description
2553                     Send signal sig to the process of the current task.   The
2554                     signal may be delivered to any of this process's threads.
2555
2556              Return 0 on success or successfully queued.
2557
2558                     -EBUSY if work queue under nmi is full.
2559
2560                     -EINVAL if sig is invalid.
2561
2562                     -EPERM if no permission to send the sig.
2563
2564                     -EAGAIN if bpf program can try again.
2565
2566       s64  bpf_tcp_gen_syncookie(void  *sk,  void  *iph,  u32 iph_len, struct
2567       tcphdr *th, u32 th_len)
2568
2569              Description
2570                     Try to issue a SYN cookie for the packet with correspond‐
2571                     ing  IP/TCP  headers, iph and th, on the listening socket
2572                     in sk.
2573
2574                     iph points to the start of the IPv4 or IPv6 header, while
2575                     iph_len  contains  sizeof(struct  iphdr) or sizeof(struct
2576                     ipv6hdr).
2577
2578                     th points to the start of the TCP  header,  while  th_len
2579                     contains  the  length  of the TCP header with options (at
2580                     least sizeof(struct tcphdr)).
2581
              Return On success, the lower 32 bits hold the generated SYN
                     cookie, followed by 16 bits which hold the MSS value for
                     that cookie; the top 16 bits are unused.
2585
2586                     On failure, the returned value is one of the following:
2587
2588                     -EINVAL SYN cookie cannot be issued due to error
2589
2590                     -ENOENT SYN cookie should not be issued (no SYN flood)
2591
2592                     -EOPNOTSUPP kernel  configuration  does  not  enable  SYN
2593                     cookies
2594
2595                     -EPROTONOSUPPORT IP packet version is not 4 or 6
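
                     The packed return value can be taken apart as in the
                     following sketch. The three function names are
                     hypothetical; only the bit layout comes from the
                     description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* A negative return value is an error code, not a packed cookie. */
static bool syncookie_failed(int64_t ret)
{
    return ret < 0;
}

/* The lower 32 bits hold the generated SYN cookie. */
static uint32_t syncookie_value(int64_t ret)
{
    return (uint32_t)ret;
}

/* The next 16 bits hold the MSS value chosen for that cookie. */
static uint16_t syncookie_mss(int64_t ret)
{
    return (uint16_t)(ret >> 32);
}
```

                     Callers should test for failure first, since the two
                     accessors are meaningless on a negative value.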
2596
2597       long  bpf_skb_output(void  *ctx,  struct  bpf_map *map, u64 flags, void
2598       *data, u64 size)
2599
2600              Description
2601                     Write raw data blob into a special BPF perf event held by
2602                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
2603                     event must have the following attributes: PERF_SAMPLE_RAW
2604                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
2605                     PERF_COUNT_SW_BPF_OUTPUT as config.
2606
2607                     The flags are used to indicate the index in map for which
2608                     the value must be put, masked with BPF_F_INDEX_MASK.  Al‐
2609                     ternatively, flags can be set to BPF_F_CURRENT_CPU to in‐
2610                     dicate  that  the index of the current CPU core should be
2611                     used.
2612
2613                     The value to write, of size, is passed through eBPF stack
2614                     and pointed by data.
2615
2616                     ctx is a pointer to in-kernel struct sk_buff.
2617
2618                     This helper is similar to bpf_perf_event_output() but re‐
2619                     stricted to raw_tracepoint bpf programs.
2620
2621              Return 0 on success, or a negative error in case of failure.
2622
2623       long bpf_probe_read_user(void *dst, u32 size, const void *unsafe_ptr)
2624
2625              Description
2626                     Safely attempt to read size bytes from user space address
2627                     unsafe_ptr and store the data in dst.
2628
2629              Return 0 on success, or a negative error in case of failure.
2630
2631       long bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
2632
2633              Description
2634                     Safely  attempt  to read size bytes from kernel space ad‐
2635                     dress unsafe_ptr and store the data in dst.
2636
2637              Return 0 on success, or a negative error in case of failure.
2638
2639       long bpf_probe_read_user_str(void  *dst,  u32  size,  const  void  *un‐
2640       safe_ptr)
2641
2642              Description
2643                     Copy  a NUL terminated string from an unsafe user address
2644                     unsafe_ptr to dst. The size should include the  terminat‐
2645                     ing  NUL  byte. In case the string length is smaller than
2646                     size, the target is not padded with further NUL bytes. If
2647                     the  string length is larger than size, just size-1 bytes
2648                     are copied and the last byte is set to NUL.
2649
                     On success, returns the number of bytes that were written,
                     including the terminating NUL. This makes this helper
                     useful in tracing programs for reading strings and, more
                     importantly, for getting their length at runtime. See the
                     following snippet:
2655
                        SEC("kprobe/sys_open")
                        int bpf_sys_open(struct pt_regs *ctx)
                        {
                                char buf[PATHLEN]; // PATHLEN is defined to 256
                                int res;

                                // ctx->di holds the first argument (the file
                                // name pointer) on x86-64.
                                res = bpf_probe_read_user_str(buf, sizeof(buf),
                                                (const void *)ctx->di);

                                // Consume buf, for example push it to
                                // userspace via bpf_perf_event_output(); we
                                // can use res (the string length) as event
                                // size, after checking its boundaries.

                                return 0;
                        }
2670
                     In comparison, using the bpf_probe_read_user() helper here
                     instead to read the string would require estimating the
                     length at compile time, and would often result in copying
                     more memory than necessary.

                     Another useful use case is parsing individual process
                     arguments or individual environment variables by
                     navigating current->mm->arg_start and
                     current->mm->env_start: using this helper and its return
                     value, one can quickly iterate at the right offset of the
                     memory area.
2682
2683              Return On success, the strictly positive length  of  the  output
2684                     string, including the trailing NUL character. On error, a
2685                     negative value.
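
                     The iteration over a NUL-separated memory area described
                     above can be modelled in user space as follows. This is
                     a sketch: model_read_str() is a hypothetical stand-in
                     that follows the documented copy and return rules, its
                     rejection of a zero size is a choice of the model, and
                     strings longer than the local buffer would need extra
                     handling that is left out here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for bpf_probe_read_user_str(): copies at most size - 1
 * bytes plus a NUL and returns the bytes written, NUL included. */
static long model_read_str(char *dst, uint32_t size, const char *src)
{
    uint32_t i;

    if (size == 0)
        return -EINVAL; /* model choice, see the text above */
    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return (long)i + 1;
}

/* Count the NUL-terminated strings stored back to back in
 * area[0..area_len), as in an argument or environment block. */
static int count_strings(const char *area, size_t area_len)
{
    char buf[64];
    size_t off = 0;
    int n = 0;

    while (off < area_len) {
        long res = model_read_str(buf, sizeof(buf), area + off);

        if (res <= 0)
            break;
        off += (size_t)res; /* the NUL is counted: next string starts here */
        n++;
    }
    return n;
}
```

                     In a BPF program the loop would need a constant upper
                     bound on the number of iterations to satisfy the
                     verifier.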
2686
2687       long bpf_probe_read_kernel_str(void *dst, u32  size,  const  void  *un‐
2688       safe_ptr)
2689
2690              Description
2691                     Copy  a  NUL  terminated string from an unsafe kernel ad‐
2692                     dress  unsafe_ptr  to  dst.  Same   semantics   as   with
2693                     bpf_probe_read_user_str() apply.
2694
2695              Return On  success,  the strictly positive length of the string,
2696                     including the trailing NUL character. On error,  a  nega‐
2697                     tive value.
2698
2699       long bpf_tcp_send_ack(void *tp, u32 rcv_nxt)
2700
2701              Description
2702                     Send  out a tcp-ack. tp is the in-kernel struct tcp_sock.
2703                     rcv_nxt is the ack_seq to be sent out.
2704
2705              Return 0 on success, or a negative error in case of failure.
2706
2707       long bpf_send_signal_thread(u32 sig)
2708
2709              Description
2710                     Send signal sig to the thread corresponding to  the  cur‐
2711                     rent task.
2712
2713              Return 0 on success or successfully queued.
2714
2715                     -EBUSY if work queue under nmi is full.
2716
2717                     -EINVAL if sig is invalid.
2718
2719                     -EPERM if no permission to send the sig.
2720
2721                     -EAGAIN if bpf program can try again.
2722
2723       u64 bpf_jiffies64(void)
2724
2725              Description
                     Obtain the 64-bit jiffies value.

              Return The 64-bit jiffies value.
2729
2730       long   bpf_read_branch_records(struct  bpf_perf_event_data  *ctx,  void
2731       *buf, u32 size, u64 flags)
2732
2733              Description
                     For an eBPF program attached to a perf event, retrieve the
                     branch records (struct perf_branch_entry) associated with
                     ctx and store them in the buffer pointed to by buf, up to
                     size bytes.
2738
2739              Return On  success,  number of bytes written to buf. On error, a
2740                     negative value.
2741
2742                     The flags can be set to BPF_F_GET_BRANCH_RECORDS_SIZE  to
2743                     instead  return the number of bytes required to store all
2744                     the branch entries. If this flag is set, buf may be NULL.
2745
2746                     -EINVAL if arguments invalid or size not  a  multiple  of
2747                     sizeof(struct perf_branch_entry).
2748
2749                     -ENOENT if architecture does not support branch records.
2750
2751       long    bpf_get_ns_current_pid_tgid(u64    dev,    u64    ino,   struct
2752       bpf_pidns_info *nsdata, u32 size)
2753
2754              Description
                     Store in nsdata the pid and tgid of the current task as
                     seen from its current PID namespace. dev and ino must
                     identify that namespace.
2757
2758              Return 0 on success, or one of the following in case of failure:
2759
                     -EINVAL if the dev and ino supplied don't match the dev_t
                     and inode number with nsfs of the current task, or if the
                     conversion of dev to dev_t lost high bits.

                     -ENOENT if the pidns does not exist for the current task.
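
                     The dev and ino values are typically collected in user
                     space with stat(2) on a namespace file such as
                     /proc/self/ns/pid and then handed to the BPF program,
                     for instance through a map. The sketch below shows only
                     the user-space half; ns_dev_ino() is a hypothetical
                     name, and the choice of path is an assumption of this
                     example.

```c
#include <stdint.h>
#include <sys/stat.h>

/* Fetch the device and inode numbers identifying a namespace file.
 * Returns 0 on success and -1 if the file cannot be stat'ed. */
static int ns_dev_ino(const char *ns_path, uint64_t *dev, uint64_t *ino)
{
    struct stat st;

    if (stat(ns_path, &st) != 0)
        return -1;
    *dev = (uint64_t)st.st_dev;
    *ino = (uint64_t)st.st_ino;
    return 0;
}
```

                     A loader would call ns_dev_ino("/proc/self/ns/pid", &dev,
                     &ino) and store both values where the BPF program can
                     read them before calling the helper.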
2765
2766       long  bpf_xdp_output(void  *ctx,  struct  bpf_map *map, u64 flags, void
2767       *data, u64 size)
2768
2769              Description
2770                     Write raw data blob into a special BPF perf event held by
2771                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
2772                     event must have the following attributes: PERF_SAMPLE_RAW
2773                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
2774                     PERF_COUNT_SW_BPF_OUTPUT as config.
2775
2776                     The flags are used to indicate the index in map for which
2777                     the value must be put, masked with BPF_F_INDEX_MASK.  Al‐
2778                     ternatively, flags can be set to BPF_F_CURRENT_CPU to in‐
2779                     dicate  that  the index of the current CPU core should be
2780                     used.
2781
2782                     The value to write, of size, is passed through eBPF stack
2783                     and pointed by data.
2784
2785                     ctx is a pointer to in-kernel struct xdp_buff.
2786
                     This helper is similar to bpf_perf_event_output() but
                     restricted to raw_tracepoint bpf programs.
2789
2790              Return 0 on success, or a negative error in case of failure.
2791
2792       u64 bpf_get_netns_cookie(void *ctx)
2793
2794              Description
2795                     Retrieve the cookie (generated by the kernel) of the net‐
2796                     work namespace the input ctx is associated with. The net‐
2797                     work namespace cookie remains stable for its lifetime and
2798                     provides  a global identifier that can be assumed unique.
2799                     If ctx is NULL, then the helper returns  the  cookie  for
2800                     the  initial network namespace. The cookie itself is very
2801                     similar to that of  bpf_get_socket_cookie()  helper,  but
2802                     for network namespaces instead of sockets.
2803
              Return An 8-byte long opaque number.
2805
2806       u64 bpf_get_current_ancestor_cgroup_id(int ancestor_level)
2807
2808              Description
2809                     Return id of cgroup v2 that is ancestor of the cgroup as‐
2810                     sociated with the current task at the ancestor_level. The
2811                     root  cgroup is at ancestor_level zero and each step down
2812                     the hierarchy increments the level. If ancestor_level  ==
2813                     level  of  cgroup  associated with the current task, then
2814                     return value will be the same  as  that  of  bpf_get_cur‐
2815                     rent_cgroup_id().
2816
                     The helper is useful to implement policies based on
                     cgroups that are higher in the hierarchy than the
                     immediate cgroup associated with the current task.

                     The format of the returned id and the helper limitations
                     are the same as in bpf_get_current_cgroup_id().
2823
2824              Return The id is returned or 0 in case the id could not  be  re‐
2825                     trieved.
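
                     The level numbering can be pictured with a small
                     user-space model in which the path from the root cgroup
                     to the task's own cgroup is an array of ids.
                     model_ancestor_id() is a hypothetical name, the ids are
                     made up, and treating a negative level as "not
                     retrievable" is an assumption of the model.

```c
#include <stdint.h>

/* ids[i] is the id of the cgroup at level i on the path from the
 * root (level 0) down to the task's own cgroup (level depth). */
static uint64_t model_ancestor_id(const uint64_t *ids, int depth,
                                  int ancestor_level)
{
    if (ancestor_level < 0 || ancestor_level > depth)
        return 0; /* no such ancestor: the id cannot be retrieved */
    return ids[ancestor_level];
}
```

                     With a path of depth 2, level 0 yields the root id, level
                     2 yields the task's own cgroup id, and level 3 yields 0.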
2826
2827       long bpf_sk_assign(struct sk_buff *skb, void *sk, u64 flags)
2828
2829              Description
2830                     Helper  is overloaded depending on BPF program type. This
2831                     description  applies   to   BPF_PROG_TYPE_SCHED_CLS   and
2832                     BPF_PROG_TYPE_SCHED_ACT programs.
2833
2834                     Assign  the sk to the skb. When combined with appropriate
2835                     routing configuration to receive the packet  towards  the
2836                     socket,  will  cause skb to be delivered to the specified
2837                     socket.  Subsequent redirection  of  skb  via   bpf_redi‐
2838                     rect(),  bpf_clone_redirect() or other methods outside of
2839                     BPF may interfere with successful delivery to the socket.
2840
2841                     This operation is only valid from TC ingress path.
2842
2843                     The flags argument must be zero.
2844
2845              Return 0 on success, or a negative error in case of failure:
2846
2847                     -EINVAL if specified flags are not supported.
2848
2849                     -ENOENT if the socket is unavailable for assignment.
2850
2851                     -ENETUNREACH if the socket is unreachable (wrong netns).
2852
2853                     -EOPNOTSUPP if the operation is not supported, for  exam‐
2854                     ple a call from outside of TC ingress.
2855
2856                     -ESOCKTNOSUPPORT  if  the  socket  type  is not supported
2857                     (reuseport).
2858
2859       long bpf_sk_assign(struct bpf_sk_lookup *ctx, struct bpf_sock *sk,  u64
2860       flags)
2861
2862              Description
2863                     Helper  is overloaded depending on BPF program type. This
2864                     description applies to BPF_PROG_TYPE_SK_LOOKUP programs.
2865
2866                     Select the sk as a result of a socket lookup.
2867
                     For the operation to succeed, the passed socket must be
                     compatible with the packet description provided by the ctx
                     object.

                     The L4 protocol (IPPROTO_TCP or IPPROTO_UDP) must be an
                     exact match, while the IP family (AF_INET or AF_INET6)
                     only has to be compatible: IPv6 sockets that are not
                     v6-only can be selected for IPv4 packets.
2876
2877                     Only TCP listeners and UDP unconnected sockets can be se‐
2878                     lected. sk can also be NULL to reset any previous  selec‐
2879                     tion.
2880
                     The flags argument can be a combination of the following
                     values:

                     • BPF_SK_LOOKUP_F_REPLACE to override the previous socket
                       selection, potentially done by a BPF program that ran
                       before us.

                     • BPF_SK_LOOKUP_F_NO_REUSEPORT to skip load-balancing
                       within the reuseport group for the socket being
                       selected.
2889
2890                     On success ctx->sk will point to the selected socket.
2891
2892              Return 0 on success, or a negative errno in case of failure.
2893
                     • -EAFNOSUPPORT if the socket family (sk->family) is not
                       compatible with the packet family (ctx->family).

                     • -EEXIST if a socket has already been selected,
                       potentially by another program, and the
                       BPF_SK_LOOKUP_F_REPLACE flag was not specified.

                     • -EINVAL if unsupported flags were specified.

                     • -EPROTOTYPE if the socket L4 protocol (sk->protocol)
                       doesn't match the packet protocol (ctx->protocol).

                     • -ESOCKTNOSUPPORT if the socket is not in an allowed
                       state (TCP listening or UDP unconnected).
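
                     The protocol and family rules can be summarised by the
                     user-space model below. It is a sketch, not the kernel
                     code: the struct and function names are hypothetical,
                     only two of the documented errors are modelled, and the
                     order of the two checks is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <netinet/in.h>
#include <sys/socket.h>

struct model_sock {
    int family;   /* AF_INET or AF_INET6 */
    int protocol; /* IPPROTO_TCP or IPPROTO_UDP */
    bool v6only;  /* only meaningful for AF_INET6 sockets */
};

struct model_pkt {
    int family;
    int protocol;
};

/* Return 0 if sk may be selected for the packet described by ctx. */
static int model_sk_lookup_compat(const struct model_sock *sk,
                                  const struct model_pkt *ctx)
{
    /* The L4 protocol must be an exact match. */
    if (sk->protocol != ctx->protocol)
        return -EPROTOTYPE;

    /* Families may differ only for an IPv6 socket that is not v6-only. */
    if (sk->family != ctx->family &&
        (sk->family == AF_INET || sk->v6only))
        return -EAFNOSUPPORT;

    return 0;
}
```

                     Under this model a dual-stack IPv6 TCP listener is
                     accepted for an IPv4 TCP packet, while the same listener
                     with v6only set is rejected with -EAFNOSUPPORT.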
2908
2909       u64 bpf_ktime_get_boot_ns(void)
2910
2911              Description
2912                     Return  the  time  elapsed since system boot, in nanosec‐
2913                     onds.  Does include the time the  system  was  suspended.
2914                     See: clock_gettime(CLOCK_BOOTTIME)
2915
2916              Return Current ktime.
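
                     User space can read the clock referenced above with
                     clock_gettime(2), which is convenient when correlating
                     timestamps produced by a BPF program with user-space
                     events. The sketch below is Linux-specific; boottime_ns()
                     is a hypothetical name.

```c
#define _GNU_SOURCE /* for CLOCK_BOOTTIME on some C library versions */
#include <stdint.h>
#include <time.h>

/* Nanoseconds since boot, including suspended time, or 0 on error. */
static uint64_t boottime_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_BOOTTIME, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

                     Values from the two sides are only loosely comparable,
                     since each reading is taken at a different instant.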
2917
2918       long  bpf_seq_printf(struct seq_file *m, const char *fmt, u32 fmt_size,
2919       const void *data, u32 data_len)
2920
2921              Description
2922                     bpf_seq_printf() uses seq_file seq_printf() to print  out
2923                     the  format  string.   The m represents the seq_file. The
2924                     fmt and fmt_size are for the format  string  itself.  The
2925                     data  and  data_len are format string arguments. The data
2926                     are a u64 array and corresponding  format  string  values
2927                     are  stored  in the array. For strings and pointers where
2928                     pointees are accessed, only the pointer values are stored
2929                     in  the  data array.  The data_len is the size of data in
2930                     bytes - must be a multiple of 8.
2931
                     The formats %s and %p{i,I}{4,6} require reading kernel
                     memory. Reading kernel memory may fail due to either an
                     invalid address or a valid address that requires a major
                     memory fault. If reading kernel memory fails, the string
                     for %s will be an empty string, and the IP address for
                     %p{i,I}{4,6} will be 0. Not returning an error to the BPF
                     program is consistent with what bpf_trace_printk() does
                     for now.
2940
2941              Return 0 on success, or a negative error in case of failure:
2942
2943                     -EBUSY  if  per-CPU  memory  copy buffer is busy, can try
2944                     again by returning 1 from bpf program.
2945
2946                     -EINVAL if arguments  are  invalid,  or  if  fmt  is  in‐
2947                     valid/unsupported.
2948
2949                     -E2BIG if fmt contains too many format specifiers.
2950
2951                     -EOVERFLOW  if an overflow happened: The same object will
2952                     be tried again.
2953
2954       long bpf_seq_write(struct seq_file *m, const void *data, u32 len)
2955
2956              Description
2957                     bpf_seq_write() uses seq_file seq_write()  to  write  the
2958                     data.   The  m  represents the seq_file. The data and len
2959                     represent the data to write in bytes.
2960
2961              Return 0 on success, or a negative error in case of failure:
2962
2963                     -EOVERFLOW if an overflow happened: The same object  will
2964                     be tried again.
2965
2966       u64 bpf_sk_cgroup_id(void *sk)
2967
2968              Description
2969                     Return the cgroup v2 id of the socket sk.
2970
2971                     sk  must  be a non-NULL pointer to a socket, e.g. one re‐
2972                     turned from bpf_sk_lookup_xxx(), bpf_sk_fullsock(),  etc.
2973                     The    format    of   returned   id   is   same   as   in
2974                     bpf_skb_cgroup_id().
2975
2976                     This helper is available only if the kernel was  compiled
2977                     with the CONFIG_SOCK_CGROUP_DATA configuration option.
2978
2979              Return The  id  is returned or 0 in case the id could not be re‐
2980                     trieved.
2981
2982       u64 bpf_sk_ancestor_cgroup_id(void *sk, int ancestor_level)
2983
2984              Description
2985                     Return id of cgroup v2 that is ancestor of cgroup associ‐
2986                     ated  with the sk at the ancestor_level.  The root cgroup
2987                     is at ancestor_level zero and each step down the  hierar‐
2988                     chy  increments  the level. If ancestor_level == level of
2989                     cgroup associated with sk, then return value will be same
2990                     as that of bpf_sk_cgroup_id().
2991
                     The helper is useful to implement policies based on
                     cgroups that are higher in the hierarchy than the
                     immediate cgroup associated with sk.

                     The format of the returned id and the helper limitations
                     are the same as in bpf_sk_cgroup_id().
2998
2999              Return The id is returned or 0 in case the id could not  be  re‐
3000                     trieved.
3001
3002       long bpf_ringbuf_output(void *ringbuf, void *data, u64 size, u64 flags)
3003
3004              Description
3005                     Copy size bytes from data into a ring buffer ringbuf.  If
3006                     BPF_RB_NO_WAKEUP is specified in flags,  no  notification
3007                     of new data availability is sent.  If BPF_RB_FORCE_WAKEUP
3008                     is specified in flags, notification of  new  data  avail‐
3009                     ability  is  sent  unconditionally.  If 0 is specified in
3010                     flags, an adaptive notification of new data  availability
3011                     is sent.
3012
3013                     An  adaptive notification is a notification sent whenever
3014                     the user-space process has caught  up  and  consumed  all
3015                     available  payloads.  In  case  the user-space process is
3016                     still processing a previous payload, then no notification
3017                     is  needed as it will process the newly added payload au‐
3018                     tomatically.
3019
3020              Return 0 on success, or a negative error in case of failure.
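
                     The three notification modes can be modelled as a small
                     decision function. This is a user-space sketch under
                     stated assumptions: the MODEL_ constants are local
                     stand-ins for the BPF_RB_* flags, and "the consumer has
                     caught up" is modelled as the consumer position being
                     equal to the position of the sample being committed.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_RB_NO_WAKEUP    (1ULL << 0) /* stands in for BPF_RB_NO_WAKEUP */
#define MODEL_RB_FORCE_WAKEUP (1ULL << 1) /* stands in for BPF_RB_FORCE_WAKEUP */

/* Decide whether committing a sample located at rec_pos should
 * notify the consumer, whose current position is cons_pos. */
static bool model_should_notify(uint64_t flags, uint64_t cons_pos,
                                uint64_t rec_pos)
{
    if (flags & MODEL_RB_FORCE_WAKEUP)
        return true;  /* unconditional notification */
    if (flags & MODEL_RB_NO_WAKEUP)
        return false; /* notification suppressed */

    /* Adaptive mode: notify only if the consumer has already
     * consumed everything in front of this sample. */
    return cons_pos == rec_pos;
}
```

                     In adaptive mode a consumer that is still behind gets no
                     notification, since it will reach the new sample while
                     draining the buffer anyway.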
3021
3022       void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags)
3023
3024              Description
3025                     Reserve size bytes of payload in a ring  buffer  ringbuf.
3026                     flags must be 0.
3027
3028              Return Valid  pointer with size bytes of memory available; NULL,
3029                     otherwise.
3030
3031       void bpf_ringbuf_submit(void *data, u64 flags)
3032
3033              Description
3034                     Submit reserved ring buffer sample, pointed to  by  data.
3035                     If  BPF_RB_NO_WAKEUP  is specified in flags, no notifica‐
3036                     tion   of   new   data   availability   is   sent.     If
3037                     BPF_RB_FORCE_WAKEUP  is  specified in flags, notification
3038                     of new data availability is sent unconditionally.   If  0
3039                     is  specified  in  flags, an adaptive notification of new
3040                     data availability is sent.
3041
3042                     See 'bpf_ringbuf_output()' for the definition of adaptive
3043                     notification.
3044
3045              Return Nothing. Always succeeds.
3046
3047       void bpf_ringbuf_discard(void *data, u64 flags)
3048
3049              Description
3050                     Discard  reserved ring buffer sample, pointed to by data.
3051                     If BPF_RB_NO_WAKEUP is specified in flags,  no  notifica‐
3052                     tion    of   new   data   availability   is   sent.    If
3053                     BPF_RB_FORCE_WAKEUP is specified in  flags,  notification
3054                     of  new  data availability is sent unconditionally.  If 0
3055                     is specified in flags, an adaptive  notification  of  new
3056                     data availability is sent.
3057
3058                     See 'bpf_ringbuf_output()' for the definition of adaptive
3059                     notification.
3060
3061              Return Nothing. Always succeeds.
3062
3063       u64 bpf_ringbuf_query(void *ringbuf, u64 flags)
3064
3065              Description
3066                     Query various characteristics of the provided ring
3067                     buffer. What exactly is queried is determined by flags:
3068
3069                     • BPF_RB_AVAIL_DATA: Amount of data not yet consumed.
3070
3071                     • BPF_RB_RING_SIZE: The size of the ring buffer.
3072
3073                     • BPF_RB_CONS_POS: Consumer position (can wrap around).
3074
3075                     • BPF_RB_PROD_POS: Producer(s) position (can wrap
3076                       around).
3077
3078                     The data returned is only a momentary snapshot of the
3079                     actual values and could be inaccurate, so this facility
3080                     should be used to power heuristics and for reporting, not
3081                     for calculations that must be exactly correct.
3082
3083              Return Requested value, or 0, if flags are not recognized.
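
                         Because the consumer and producer positions can wrap
                         around, a distance between them is best computed with
                         unsigned arithmetic. The sketch below is a user-space
                         model with hypothetical names. Treating the positions
                         as free-running 64-bit counters and clamping the result
                         to the ring size are both assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a BPF_RB_AVAIL_DATA style computation.
 * Unsigned subtraction yields the right distance even after either
 * counter has wrapped around.
 */
static uint64_t ringbuf_unconsumed(uint64_t prod_pos, uint64_t cons_pos,
                                   uint64_t ring_size)
{
    assert(ring_size > 0);

    uint64_t avail = prod_pos - cons_pos;   /* arithmetic modulo 2^64 */

    /* Snapshots may be momentarily inconsistent, so clamp defensively. */
    return avail > ring_size ? ring_size : avail;
}
```

                         A value obtained this way remains a snapshot, so it
                         suits heuristics and reporting rather than exact
                         accounting.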
3084
3085       long bpf_csum_level(struct sk_buff *skb, u64 level)
3086
3087              Description
3088                     Change the skb's checksum level by one layer up or down,
3089                     or reset it entirely to none in order to have the stack
3090                     perform checksum validation. The level is applicable to
3091                     the following protocols: TCP, UDP, GRE, SCTP, FCOE. For
3092                     example, a decap of | ETH | IP | UDP | GUE | IP | TCP |
3093                     into | ETH | IP | TCP | through the bpf_skb_adjust_room()
3094                     helper, called with the BPF_F_ADJ_ROOM_NO_CSUM_RESET
3095                     flag, would require one call to bpf_csum_level() with
3096                     BPF_CSUM_LEVEL_DEC since the UDP header is removed.
3097                     Similarly, an encap of the latter into the former could
3098                     be accompanied by a helper call to bpf_csum_level() with
3099                     BPF_CSUM_LEVEL_INC if the skb is still intended to be
3100                     processed in higher layers of the stack instead of just
3101                     egressing at tc.
3102
3103                     There are four supported level settings at this time:
3104
3105                     • BPF_CSUM_LEVEL_INC: Increases skb->csum_level for skbs
3106                       with CHECKSUM_UNNECESSARY.
3107
3108                     • BPF_CSUM_LEVEL_DEC: Decreases skb->csum_level for skbs
3109                       with CHECKSUM_UNNECESSARY.
3110
3111                     • BPF_CSUM_LEVEL_RESET: Resets skb->csum_level to 0 and
3112                       sets CHECKSUM_NONE to force checksum validation by the
3113                       stack.
3114
3115                     • BPF_CSUM_LEVEL_QUERY: No-op, returns the current
3116                       skb->csum_level.
3117
3118              Return 0  on success, or a negative error in case of failure. In
3119                     the   case   of   BPF_CSUM_LEVEL_QUERY,    the    current
3120                     skb->csum_level  is returned or the error code -EACCES in
3121                     case the skb is not subject to CHECKSUM_UNNECESSARY.
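
                         The level settings and return values can be illustrated
                         with a toy user-space model. struct toy_skb,
                         toy_csum_level() and TOY_MAX_CSUM_LEVEL are
                         hypothetical. The handling at the boundaries (level
                         already at 0 or at the maximum, or an skb without
                         CHECKSUM_UNNECESSARY) is a simplifying assumption and
                         may differ from the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum { CHECKSUM_NONE = 0, CHECKSUM_UNNECESSARY = 1 };

/* Level values as defined in the kernel UAPI header <linux/bpf.h>. */
enum {
    BPF_CSUM_LEVEL_QUERY,
    BPF_CSUM_LEVEL_INC,
    BPF_CSUM_LEVEL_DEC,
    BPF_CSUM_LEVEL_RESET,
};

#define TOY_MAX_CSUM_LEVEL 3   /* assumed upper bound for this sketch */

struct toy_skb {               /* toy stand-in, not struct sk_buff */
    int ip_summed;
    int csum_level;
};

static long toy_csum_level(struct toy_skb *skb, uint64_t level)
{
    assert(skb != NULL);

    switch (level) {
    case BPF_CSUM_LEVEL_INC:
        if (skb->ip_summed == CHECKSUM_UNNECESSARY &&
            skb->csum_level < TOY_MAX_CSUM_LEVEL)
            skb->csum_level++;
        return 0;
    case BPF_CSUM_LEVEL_DEC:
        if (skb->ip_summed == CHECKSUM_UNNECESSARY && skb->csum_level > 0)
            skb->csum_level--;
        return 0;
    case BPF_CSUM_LEVEL_RESET:
        skb->csum_level = 0;
        skb->ip_summed = CHECKSUM_NONE;   /* stack must validate again */
        return 0;
    case BPF_CSUM_LEVEL_QUERY:
        return skb->ip_summed == CHECKSUM_UNNECESSARY ? skb->csum_level
                                                      : -EACCES;
    default:
        return -EINVAL;
    }
}
```

                         In the decap example above, one BPF_CSUM_LEVEL_DEC call
                         accounts for the removed UDP header.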
3122
3123       struct tcp6_sock *bpf_skc_to_tcp6_sock(void *sk)
3124
3125              Description
3126                     Dynamically cast a sk pointer to a tcp6_sock pointer.
3127
3128              Return sk if casting is valid, or NULL otherwise.
3129
3130       struct tcp_sock *bpf_skc_to_tcp_sock(void *sk)
3131
3132              Description
3133                     Dynamically cast a sk pointer to a tcp_sock pointer.
3134
3135              Return sk if casting is valid, or NULL otherwise.
3136
3137       struct tcp_timewait_sock *bpf_skc_to_tcp_timewait_sock(void *sk)
3138
3139              Description
3140                     Dynamically cast a  sk  pointer  to  a  tcp_timewait_sock
3141                     pointer.
3142
3143              Return sk if casting is valid, or NULL otherwise.
3144
3145       struct tcp_request_sock *bpf_skc_to_tcp_request_sock(void *sk)
3146
3147              Description
3148                     Dynamically  cast  a  sk  pointer  to  a tcp_request_sock
3149                     pointer.
3150
3151              Return sk if casting is valid, or NULL otherwise.
3152
3153       struct udp6_sock *bpf_skc_to_udp6_sock(void *sk)
3154
3155              Description
3156                     Dynamically cast a sk pointer to a udp6_sock pointer.
3157
3158              Return sk if casting is valid, or NULL otherwise.
3159
3160       long bpf_get_task_stack(struct task_struct *task, void *buf, u32  size,
3161       u64 flags)
3162
3163              Description
3164                     Return  a  user or a kernel stack in bpf program provided
3165                     buffer.  To achieve this, the helper needs task, which is
3166                     a  valid  pointer  to  struct  task_struct.  To store the
3167                     stacktrace, the bpf program provides buf with a  nonnega‐
3168                     tive size.
3169
3170                     The  last  argument,  flags,  holds  the  number of stack
3171                     frames  to  skip   (from   0   to   255),   masked   with
3172                     BPF_F_SKIP_FIELD_MASK.  The  next bits can be used to set
3173                     the following flags:
3174
3175                     BPF_F_USER_STACK
3176                            Collect a user space stack  instead  of  a  kernel
3177                            stack.
3178
3179                     BPF_F_USER_BUILD_ID
3180                            Collect  buildid+offset  instead  of  ips for user
3181                            stack, only  valid  if  BPF_F_USER_STACK  is  also
3182                            specified.
3183
3184                     bpf_get_task_stack() can collect up to
3185                     PERF_MAX_STACK_DEPTH both kernel and user frames, provided
3186                     the buffer is sufficiently large. Note that this limit can
3187                     be controlled with the sysctl program, and that it should
3188                     be manually increased in order to profile long user
3189                     stacks (such as stacks for Java programs). To do so, use:
3190
3191                        # sysctl kernel.perf_event_max_stack=<new value>
3192
3193              Return The non-negative copied buf length equal to or less  than
3194                     size on success, or a negative error in case of failure.
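
                         As an illustration of how the flags argument is laid
                         out, the following sketch assembles it from a skip
                         count and the two flag bits. make_stack_flags() is a
                         hypothetical helper; the constants are reproduced from
                         the kernel UAPI header so that the example compiles on
                         its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the kernel UAPI header <linux/bpf.h>. */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_USER_BUILD_ID   (1ULL << 11)

/* Hypothetical helper: assemble the flags argument. */
static uint64_t make_stack_flags(unsigned int skip, bool user_stack,
                                 bool build_id)
{
    /* BPF_F_USER_BUILD_ID is only valid together with BPF_F_USER_STACK. */
    assert(!build_id || user_stack);

    /* The low bits carry the number of frames to skip (0 to 255). */
    uint64_t flags = (uint64_t)skip & BPF_F_SKIP_FIELD_MASK;

    if (user_stack)
        flags |= BPF_F_USER_STACK;
    if (build_id)
        flags |= BPF_F_USER_BUILD_ID;
    return flags;
}
```

                         For example, make_stack_flags(2, true, false) asks for
                         a user space stack with the two topmost frames skipped.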
3195
3196       long  bpf_load_hdr_opt(struct  bpf_sock_ops *skops, void *searchby_res,
3197       u32 len, u64 flags)
3198
3199              Description
3200                     Load header option.  Support  reading  a  particular  TCP
3201                     header option for bpf program (BPF_PROG_TYPE_SOCK_OPS).
3202
3203                     If  flags  is  0,  it  will  search  the  option from the
3204                     skops->skb_data.  The comment in struct bpf_sock_ops  has
3205                     details   on   what  skb_data  contains  under  different
3206                     skops->op.
3207
3208                     The first byte of the  searchby_res  specifies  the  kind
3209                     that it wants to search.
3210
3211                     If the kind to search for is an experimental kind (i.e.
3212                     253 or 254 according to RFC6994), the "magic", which is
3213                     either 2 bytes or 4 bytes long, must also be specified.
3214                     The size of the magic is then given through the 2nd
3215                     byte, which is the "kind-length" of a TCP header option.
3216                     As in a normal TCP header option, the "kind-length" also
3217                     counts the first 2 bytes, i.e. "kind" and "kind-length"
3218                     itself.
3219
3220                     For example, to search experimental kind 254 with 2  byte
3221                     magic  0xeB9F, the searchby_res should be [ 254, 4, 0xeB,
3222                     0x9F, 0, 0, .... 0 ].
3223
3224                     To search for the standard window scale option  (3),  the
3225                     searchby_res  should  be  [  3,  0,  0,  .... 0 ].  Note,
3226                     kind-length must be 0 for regular option.
3227
3228                     Searching for End-of-Option-List (0) and No-Op (1) is not
3229                     supported.
3230
3231                     len must be at least 2 bytes which is the minimal size of
3232                     a header option.
3233
3234                     Supported flags:
3235
3236                     • BPF_LOAD_HDR_OPT_TCP_SYN to search from the saved_syn
3237                       packet or the just-received syn packet.
3238
3239              Return >   0   when  found,  the  header  option  is  copied  to
3240                     searchby_res.  The  return  value  is  the  total  length
3241                     copied. On failure, a negative error code is returned:
3242
3243                     -EINVAL if a parameter is invalid.
3244
3245                     -ENOMSG if the option is not found.
3246
3247                     -ENOENT    if   no   syn   packet   is   available   when
3248                     BPF_LOAD_HDR_OPT_TCP_SYN is used.
3249
3250                     -ENOSPC if there is not enough space.  Only len number of
3251                     bytes are copied.
3252
3253                     -EFAULT  on  failure  to  parse the header options in the
3254                     packet.
3255
3256                     -EPERM if the helper cannot be  used  under  the  current
3257                     skops->op.
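
                         The layout of searchby_res described above can be
                         sketched as a small user-space routine that prepares
                         the buffer before the helper is called.
                         fill_searchby_res() and its -1 error value are
                         hypothetical, and the argument checks reflect this
                         sketch's reading of the rules above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: prepare a searchby_res buffer of len bytes.
 * magic_len is 0 for a regular option, or 2 or 4 for an experimental
 * kind (253 or 254), in which case magic points to magic_len bytes.
 * Returns 0 on success, -1 if the arguments do not fit together.
 */
static int fill_searchby_res(uint8_t *buf, size_t len, uint8_t kind,
                             const uint8_t *magic, size_t magic_len)
{
    assert(buf != NULL);

    int experimental = (kind == 253 || kind == 254);

    if (len < 2 || kind == 0 || kind == 1)
        return -1;      /* buffer too short, or unsupported kind */
    if (!experimental && magic_len != 0)
        return -1;      /* kind-length must be 0 for a regular option */
    if (experimental && ((magic_len != 2 && magic_len != 4) ||
                         magic == NULL || len < 2 + magic_len))
        return -1;      /* experimental kinds need a 2 or 4 byte magic */

    memset(buf, 0, len);
    buf[0] = kind;
    if (experimental) {
        buf[1] = (uint8_t)(2 + magic_len);  /* counts kind and kind-length */
        memcpy(&buf[2], magic, magic_len);
    }
    return 0;
}
```

                         Called with kind 254 and the 2 byte magic 0xeB 0x9F,
                         the routine yields the buffer from the example above,
                         [ 254, 4, 0xeB, 0x9F, 0, ... ].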
3258
3259       long  bpf_store_hdr_opt(struct  bpf_sock_ops  *skops, const void *from,
3260       u32 len, u64 flags)
3261
3262              Description
3263                     Store header option.  The data will be copied from buffer
3264                     from with length len to the TCP header.
3265
3266                     The  buffer  from  should  have the whole option that in‐
3267                     cludes the kind, kind-length, and the actual option data.
3268                     The   len   must  be  at  least  kind-length  long.   The
3269                     kind-length does not have to be 4 byte aligned.  The ker‐
3270                     nel will take care of the padding and setting the 4 bytes
3271                     aligned value to th->doff.
3272
3273                     This helper will check for duplicated option by searching
3274                     the same option in the outgoing skb.
3275
3276                     This     helper     can    only    be    called    during
3277                     BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
3278
3279              Return 0 on success, or negative error in case of failure:
3280
3281                     -EINVAL if a parameter is invalid.
3282
3283                     -ENOSPC if there is not enough space in the header.
3284                     Nothing has been written.
3285
3286                     -EEXIST if the option already exists.
3287
3288                     -EFAULT on failure to parse the existing header options.
3289
3290                     -EPERM  if  the  helper  cannot be used under the current
3291                     skops->op.
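
                         To make the buffer layout concrete, the sketch below
                         builds a whole option (kind, kind-length, data) in user
                         space and shows what rounding up to a 4 byte boundary
                         means. Both function names are hypothetical. The
                         padding itself is done by the kernel, so
                         tcp_opt_padded_len() only illustrates the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round an option length up to the next multiple of 4 bytes. */
static size_t tcp_opt_padded_len(size_t kind_length)
{
    return (kind_length + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: lay out a whole option in buf, in the form the
 * from argument expects. Returns the kind-length, or 0 if the option
 * fits neither in buf nor in the one-byte kind-length field.
 */
static size_t build_tcp_opt(uint8_t *buf, size_t buf_len, uint8_t kind,
                            const uint8_t *data, size_t data_len)
{
    assert(buf != NULL && (data != NULL || data_len == 0));

    size_t kind_length = 2 + data_len;   /* kind + kind-length + data */

    if (kind_length > buf_len || kind_length > 255)
        return 0;
    buf[0] = kind;
    buf[1] = (uint8_t)kind_length;
    if (data_len)
        memcpy(&buf[2], data, data_len);
    return kind_length;
}
```

                         An option with 3 data bytes has a kind-length of 5,
                         which occupies 8 bytes of header space once padded.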
3292
3293       long  bpf_reserve_hdr_opt(struct  bpf_sock_ops  *skops,  u32  len,  u64
3294       flags)
3295
3296              Description
3297                     Reserve  len  bytes for the bpf header option.  The space
3298                     will   be   used   by   bpf_store_hdr_opt()   later    in
3299                     BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
3300
3301                     If bpf_reserve_hdr_opt() is called multiple times, the
3302                     sum of the requested lengths will be reserved.
3303
3304                     This    helper    can    only    be     called     during
3305                     BPF_SOCK_OPS_HDR_OPT_LEN_CB.
3306
3307              Return 0 on success, or negative error in case of failure:
3308
3309                     -EINVAL if a parameter is invalid.
3310
3311                     -ENOSPC if there is not enough space in the header.
3312
3313                     -EPERM  if  the  helper  cannot be used under the current
3314                     skops->op.
3315
3316       void *bpf_inode_storage_get(struct  bpf_map  *map,  void  *inode,  void
3317       *value, u64 flags)
3318
3319              Description
3320                     Get a bpf_local_storage from an inode.
3321
3322                     Logically,  it  could  be thought of as getting the value
3323                     from a map with inode as the key.  From this perspective,
3324                     the     usage     is     not    much    different    from
3325                     bpf_map_lookup_elem(map, &inode) except this  helper  en‐
3326                     forces  the key must be an inode and the map must also be
3327                     a BPF_MAP_TYPE_INODE_STORAGE.
3328
3329                     Underneath, the value is stored locally at inode  instead
3330                     of  the  map.   The  map is used as the bpf-local-storage
3331                     "type". The bpf-local-storage "type" (i.e.  the  map)  is
3332                     searched against all bpf_local_storage residing at inode.
3333
3334                     The optional flag BPF_LOCAL_STORAGE_GET_F_CREATE can be
3335                     used such that a new bpf_local_storage will be created if
3336                     one  does  not  exist.   value  can be used together with
3337                     BPF_LOCAL_STORAGE_GET_F_CREATE  to  specify  the  initial
3338                     value  of a bpf_local_storage.  If value is NULL, the new
3339                     bpf_local_storage will be zero initialized.
3340
3341              Return A bpf_local_storage pointer is returned on success.
3342
3343                     NULL if not found or there was an error in adding  a  new
3344                     bpf_local_storage.
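
                         The lookup-or-create behaviour can be modelled in user
                         space as follows. All toy_ types and functions are
                         hypothetical and do not mirror the kernel's data
                         structures. The sketch only shows the storage living at
                         the inode, the map acting as its "type", and the effect
                         of BPF_LOCAL_STORAGE_GET_F_CREATE and value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Flag value as defined in the kernel UAPI header <linux/bpf.h>. */
#define BPF_LOCAL_STORAGE_GET_F_CREATE (1ULL << 0)

struct toy_map { size_t value_size; };
struct toy_storage {
    const struct toy_map *map;     /* the map acts as the storage "type" */
    struct toy_storage *next;
    unsigned char value[];
};
struct toy_inode { struct toy_storage *storages; };

static void *toy_storage_get(const struct toy_map *map,
                             struct toy_inode *inode,
                             const void *value, uint64_t flags)
{
    assert(map != NULL && inode != NULL);

    /* Search the storages residing at the inode for this map. */
    for (struct toy_storage *s = inode->storages; s; s = s->next)
        if (s->map == map)
            return s->value;

    if (!(flags & BPF_LOCAL_STORAGE_GET_F_CREATE))
        return NULL;                      /* not found, creation not asked */

    struct toy_storage *s = calloc(1, sizeof(*s) + map->value_size);
    if (!s)
        return NULL;                      /* error while adding */
    if (value)
        memcpy(s->value, value, map->value_size);  /* else stays zeroed */
    s->map = map;
    s->next = inode->storages;
    inode->storages = s;
    return s->value;
}
```

                         A second call for the same map and inode returns the
                         existing storage, whatever flags and value are passed.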
3345
3346       int bpf_inode_storage_delete(struct bpf_map *map, void *inode)
3347
3348              Description
3349                     Delete a bpf_local_storage from an inode.
3350
3351              Return 0 on success.
3352
3353                     -ENOENT if the bpf_local_storage cannot be found.
3354
3355       long bpf_d_path(struct path *path, char *buf, u32 sz)
3356
3357              Description
3358                     Return  full  path  for  given  struct path object, which
3359                     needs to be the kernel BTF path object. The path  is  re‐
3360                     turned  in the provided buffer buf of size sz and is zero
3361                     terminated.
3362
3363              Return On success, the strictly positive length of  the  string,
3364                     including  the  trailing NUL character. On error, a nega‐
3365                     tive value.
3366
3367       long bpf_copy_from_user(void *dst, u32 size, const void *user_ptr)
3368
3369              Description
3370                     Read size bytes from  user  space  address  user_ptr  and
3371                     store   the   data   in   dst.   This  is  a  wrapper  of
3372                     copy_from_user().
3373
3374              Return 0 on success, or a negative error in case of failure.
3375
3376       long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32
3377       btf_ptr_size, u64 flags)
3378
3379              Description
3380                     Use  BTF  to store a string representation of ptr->ptr in
3381                     str, using ptr->type_id.  This value should  specify  the
3382                     type      that      ptr->ptr      points     to.     LLVM
3383                     __builtin_btf_type_id(type, 1) can be used to look up vm‐
3384                     linux  BTF  type ids. Traversing the data structure using
3385                     BTF, the type information and values are  stored  in  the
3386                     first  str_size  -  1  bytes  of  str.   Safe copy of the
3387                     pointer data is carried out to avoid kernel crashes  dur‐
3388                     ing operation.  Smaller types can use string space on the
3389                     stack; larger programs can use  map  data  to  store  the
3390                     string representation.
3391
3392                     The  string can be subsequently shared with userspace via
3393                     bpf_perf_event_output()  or   ring   buffer   interfaces.
3394                     bpf_trace_printk()  is  to  be  avoided  as it places too
3395                     small a limit on string size to be useful.
3396
3397                     flags is a combination of
3398
3399                     BTF_F_COMPACT
3400                            no formatting around type information
3401
3402                     BTF_F_NONAME
3403                            no struct/union member names/types
3404
3405                     BTF_F_PTR_RAW
3406                            show raw (unobfuscated) pointer values; equivalent
3407                            to printk specifier %px.
3408
3409                     BTF_F_ZERO
3410                            show  zero-valued  struct/union  members; they are
3411                            not displayed by default
3412
3413              Return The number of bytes that were written (or would have been
3414                     written  if  output  had  to  be  truncated due to string
3415                     size), or a negative error in cases of failure.
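
                         The return convention resembles that of the C library's
                         snprintf(), which the sketch below uses as a stand-in to
                         show how a caller might detect truncation.
                         format_was_truncated() is a hypothetical name. Whether
                         bpf_snprintf_btf() counts the terminating NUL exactly
                         as snprintf() does is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format value into str the way snprintf() does and report truncation.
 * A return value of str_size or more means the output did not fit.
 */
static bool format_was_truncated(char *str, size_t str_size, long value)
{
    assert(str != NULL && str_size > 0);

    int would_write = snprintf(str, str_size, "%ld", value);

    return would_write < 0 || (size_t)would_write >= str_size;
}
```

                         A program that sees truncation can retry with a larger
                         buffer, for instance one backed by map data.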
3416
3417       long bpf_seq_printf_btf(struct seq_file *m, struct  btf_ptr  *ptr,  u32
3418       ptr_size, u64 flags)
3419
3420              Description
3421                     Use BTF to write to the seq_file m a string representation
3422                     of ptr->ptr, using ptr->type_id as per bpf_snprintf_btf().
3423                     flags are identical to those used for bpf_snprintf_btf().
3424
3425              Return 0 on success or a negative error in case of failure.
3426
3427       u64 bpf_skb_cgroup_classid(struct sk_buff *skb)
3428
3429              Description
3430                     See  bpf_get_cgroup_classid()  for  the main description.
3431                     This helper differs from bpf_get_cgroup_classid() in that
3432                     the  cgroup  v1  net_cls class is retrieved only from the
3433                     skb's associated socket instead of the current process.
3434
3435              Return The id is returned or 0 in case the id could not  be  re‐
3436                     trieved.
3437
3438       long  bpf_redirect_neigh(u32  ifindex,  struct bpf_redir_neigh *params,
3439       int plen, u64 flags)
3440
3441              Description
3442                     Redirect the  packet  to  another  net  device  of  index
3443                     ifindex and fill in L2 addresses from neighboring subsys‐
3444                     tem. This helper is somewhat similar  to  bpf_redirect(),
3445                     except  that  it populates L2 addresses as well, meaning,
3446                     internally, the helper relies on the neighbor lookup  for
3447                     the L2 address of the nexthop.
3448
3449                     The helper will perform a FIB lookup based on the skb's
3450                     networking header to get the address of the next hop,
3451                     unless this is supplied by the caller in the params
3452                     argument. The plen argument indicates the length of
3453                     params and should be set to 0 if params is NULL.
3454
3455                     The  flags argument is reserved and must be 0. The helper
3456                     is currently only supported for tc BPF program types, and
3457                     enabled for IPv4 and IPv6 protocols.
3458
3459              Return The   helper   returns   TC_ACT_REDIRECT  on  success  or
3460                     TC_ACT_SHOT on error.
3461
3462       void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu)
3463
3464              Description
3465                     Take a pointer to a percpu ksym, percpu_ptr, and return a
3466                     pointer  to  the percpu kernel variable on cpu. A ksym is
3467                     an extern variable decorated  with  '__ksym'.  For  ksym,
3468                     there  is  a global var (either static or global) defined
3469                     of the same name in the kernel. The ksym is percpu if the
3470                     global var is percpu.  The returned pointer points to the
3471                     global percpu var on cpu.
3472
3473                     bpf_per_cpu_ptr() has the same semantics as per_cpu_ptr()
3474                     in the kernel, except that bpf_per_cpu_ptr() may return
3475                     NULL. This happens if cpu is greater than or equal to
3476                     nr_cpu_ids. The caller of bpf_per_cpu_ptr() must check
3477                     the returned value.
3478
3479              Return A pointer pointing to the kernel percpu variable on  cpu,
3480                     or NULL, if cpu is invalid.
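
                         The need to check the returned pointer can be shown
                         with a user-space model in which a plain array stands
                         in for a percpu variable. All toy_ names are
                         hypothetical. Treating every index at or beyond
                         TOY_NR_CPU_IDS as invalid is an assumption of the
                         sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPU_IDS 4u   /* stand-in for the kernel's nr_cpu_ids */

/* Hypothetical percpu variable: one slot per possible CPU. */
static long toy_percpu_counter[TOY_NR_CPU_IDS];

static long *toy_per_cpu_ptr(long *percpu_base, uint32_t cpu)
{
    assert(percpu_base != NULL);

    if (cpu >= TOY_NR_CPU_IDS)
        return NULL;             /* invalid cpu: the caller must check */
    return &percpu_base[cpu];
}

/* Sum a percpu counter, checking every returned pointer. */
static long toy_percpu_sum(long *percpu_base, uint32_t max_cpu)
{
    long sum = 0;

    for (uint32_t cpu = 0; cpu < max_cpu; cpu++) {
        long *p = toy_per_cpu_ptr(percpu_base, cpu);

        if (!p)
            break;               /* ran past the last valid cpu */
        sum += *p;
    }
    return sum;
}
```

                         The loop stops at the first NULL instead of
                         dereferencing it, which is the kind of check expected
                         from callers of bpf_per_cpu_ptr().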
3481
3482       void *bpf_this_cpu_ptr(const void *percpu_ptr)
3483
3484              Description
3485                     Take a pointer to a percpu ksym, percpu_ptr, and return a
3486                     pointer to the percpu kernel variable on  this  cpu.  See
3487                     the description of 'ksym' in bpf_per_cpu_ptr().
3488
3489                     bpf_this_cpu_ptr() has the same semantics as
3490                     this_cpu_ptr() in the kernel. Unlike
3491                     bpf_per_cpu_ptr(), it never returns NULL.
3492
3493              Return A  pointer pointing to the kernel percpu variable on this
3494                     cpu.
3495
3496       long bpf_redirect_peer(u32 ifindex, u64 flags)
3497
3498              Description
3499                     Redirect the packet to another net device of index
3500                     ifindex. This helper is somewhat similar to
3501                     bpf_redirect(), except that the redirection happens to
3502                     the peer device of ifindex and the netns switch takes
3503                     place from ingress to ingress without going through the
3504                     CPU's backlog queue.
3505
3506                     The  flags argument is reserved and must be 0. The helper
3507                     is currently only supported for tc BPF program  types  at
3508                     the  ingress hook and for veth device types. The peer de‐
3509                     vice must reside in a different network namespace.
3510
3511              Return The  helper  returns  TC_ACT_REDIRECT   on   success   or
3512                     TC_ACT_SHOT on error.
3513
3514       void  *bpf_task_storage_get(struct  bpf_map  *map,  struct  task_struct
3515       *task, void *value, u64 flags)
3516
3517              Description
3518                     Get a bpf_local_storage from the task.
3519
3520                     Logically, it could be thought of as  getting  the  value
3521                     from  a map with task as the key.  From this perspective,
3522                     the    usage    is    not     much     different     from
3523                     bpf_map_lookup_elem(map,  &task)  except  this helper en‐
3524                     forces the key must be a task_struct  and  the  map  must
3525                     also be a BPF_MAP_TYPE_TASK_STORAGE.
3526
3527                     Underneath,  the  value is stored locally at task instead
3528                     of the map.  The map is  used  as  the  bpf-local-storage
3529                     "type".  The  bpf-local-storage  "type" (i.e. the map) is
3530                     searched against all bpf_local_storage residing at task.
3531
3532                     The optional flag BPF_LOCAL_STORAGE_GET_F_CREATE can be
3533                     used such that a new bpf_local_storage will be created if
3534                     one does not exist.  value  can  be  used  together  with
3535                     BPF_LOCAL_STORAGE_GET_F_CREATE  to  specify  the  initial
3536                     value of a bpf_local_storage.  If value is NULL, the  new
3537                     bpf_local_storage will be zero initialized.
3538
3539              Return A bpf_local_storage pointer is returned on success.
3540
3541                     NULL  if  not found or there was an error in adding a new
3542                     bpf_local_storage.
3543
3544       long bpf_task_storage_delete(struct bpf_map  *map,  struct  task_struct
3545       *task)
3546
3547              Description
3548                     Delete a bpf_local_storage from a task.
3549
3550              Return 0 on success.
3551
3552                     -ENOENT if the bpf_local_storage cannot be found.
3553
3554       struct task_struct *bpf_get_current_task_btf(void)
3555
3556              Description
3557                     Return a BTF pointer to the "current" task.  This pointer
3558                     can  also   be   used   in   helpers   that   accept   an
3559                     ARG_PTR_TO_BTF_ID of type task_struct.
3560
3561              Return Pointer to the current task.
3562
3563       long bpf_bprm_opts_set(struct linux_binprm *bprm, u64 flags)
3564
3565              Description
3566                     Set or clear certain options on bprm:
3567
3568                     BPF_F_BPRM_SECUREEXEC  Set  the secureexec bit which sets
3569                     the AT_SECURE auxv for glibc. The bit is cleared  if  the
3570                     flag is not specified.
3571
3572              Return -EINVAL if invalid flags are passed, zero otherwise.
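
                         The set-or-clear behaviour and the flag validation can
                         be modelled in user space as below. struct toy_bprm and
                         toy_bprm_opts_set() are hypothetical stand-ins. Only
                         the flag value is taken from the kernel UAPI header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value as defined in the kernel UAPI header <linux/bpf.h>. */
#define BPF_F_BPRM_SECUREEXEC (1ULL << 0)

struct toy_bprm { unsigned int secureexec; };   /* toy stand-in */

static long toy_bprm_opts_set(struct toy_bprm *bprm, uint64_t flags)
{
    assert(bprm != NULL);

    if (flags & ~BPF_F_BPRM_SECUREEXEC)
        return -EINVAL;                   /* unknown flag bits */

    /* Set when the flag is given, cleared when it is not. */
    bprm->secureexec = (flags & BPF_F_BPRM_SECUREEXEC) ? 1 : 0;
    return 0;
}
```

                         Passing 0 therefore clears the bit rather than leaving
                         it unchanged.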
3573
3574       u64 bpf_ktime_get_coarse_ns(void)
3575
3576              Description
3577                     Return a coarse-grained version of the time elapsed since
3578                     system boot, in nanoseconds. Does not  include  time  the
3579                     system was suspended.
3580
3581                     See: clock_gettime(CLOCK_MONOTONIC_COARSE)
3582
3583              Return Current ktime.
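
                         For comparison, the user-space clock referenced above
                         can be read as follows. coarse_monotonic_ns() is a
                         hypothetical name. The fallback to CLOCK_MONOTONIC on
                         systems without CLOCK_MONOTONIC_COARSE is a choice made
                         for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the coarse monotonic clock from user space, in nanoseconds. */
static uint64_t coarse_monotonic_ns(void)
{
    struct timespec ts;
#ifdef CLOCK_MONOTONIC_COARSE
    int rc = clock_gettime(CLOCK_MONOTONIC_COARSE, &ts);  /* Linux-specific */
#else
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);         /* fallback */
#endif
    assert(rc == 0);
    (void)rc;   /* keeps builds with NDEBUG free of unused warnings */

    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

                         The coarse clock trades resolution for a cheaper read,
                         so two consecutive calls may return the same value.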
3584
3585       long bpf_ima_inode_hash(struct inode *inode, void *dst, u32 size)
3586
3587              Description
3588                     Returns the stored IMA hash of the inode (if it's
3589                     available). If the hash is larger than size, then only
3590                     size bytes will be copied to dst.
3591
3592              Return The hash_algo is returned on success, -EOPNOTSUPP if IMA
3593                     is disabled, or -EINVAL if invalid arguments are passed.
3594
3595       struct socket *bpf_sock_from_file(struct file *file)
3596
3597              Description
3598                     If the given file represents a socket, returns the  asso‐
3599                     ciated socket.
3600
3601              Return A  pointer  to  a struct socket on success or NULL if the
3602                     file is not a socket.
3603
3604       long bpf_check_mtu(void *ctx, u32 ifindex, u32 *mtu_len, s32  len_diff,
3605       u64 flags)
3606
3607              Description
3608                     Check  packet  size  against  exceeding MTU of net device
3609                     (based on ifindex).  This helper will likely be  used  in
3610                     combination  with  helpers  that adjust/change the packet
3611                     size.
3612
3613                     The argument len_diff can be used for querying with a
3614                     planned size change. This allows checking the MTU prior
3615                     to changing the packet ctx. Providing a len_diff
3616                     adjustment that is larger than the actual packet size
3617                     (resulting in a negative packet size) will in principle
3618                     not exceed the MTU, which is why it is not considered a
3619                     failure. Other BPF helpers are needed for performing the
3620                     planned size change; therefore the responsibility for
3621                     catching a negative packet size belongs in those helpers.
3622
3623                     Specifying ifindex zero means the MTU check is  performed
3624                     against  the  current  net  device.  This is practical if
3625                     this isn't used prior to redirect.
3626
3627                     On input, mtu_len must be a valid pointer, else the
3628                     verifier rejects the BPF program. If the value mtu_len is
3629                     initialized to zero, then the ctx packet size is used.
3630                     When a value mtu_len is provided as input, it specifies
3631                     the L3 length that the MTU check is done against.
3632                     Remember that XDP and TC lengths operate at L2, but this
3633                     value is L3, as it correlates to MTU and IP-header
3634                     tot_len values, which are L3 (similar to bpf_fib_lookup).
3635
3636                     The Linux kernel route table can configure MTUs on a more
3637                     specific per route level, which is not provided  by  this
3638                     helper.    For   route   level   MTU   checks   use   the
3639                     bpf_fib_lookup() helper.
3640
3641                     ctx is either struct xdp_md for XDP  programs  or  struct
3642                     sk_buff for tc cls_act programs.
3643
3644                     The flags argument can be a combination of one or more of
3645                     the following values:
3646
3647                     BPF_MTU_CHK_SEGS
3648                            This flag only works for ctx struct sk_buff. If
3649                            the packet context contains extra packet segment
3650                            buffers (often known as GSO skb), then the MTU is
3651                            harder to check at this point, because in the
3652                            transmit path it is possible for the skb packet to
3653                            get re-segmented (depending on net device
3654                            features). This could still be an MTU violation,
3655                            so this flag enables performing the MTU check
3656                            against segments, with a different violation
3657                            return code to tell it apart. Cannot use len_diff.
3658
3659                     On return, the mtu_len pointer contains the MTU value of
3660                     the net device. The configured MTU is the L3 size, which
3661                     is returned here, whereas XDP and TC lengths operate at
3662                     L2. The helper takes this into account for you, but keep
3663                     it in mind when using the MTU value in your BPF code.
3664
3665              Return
3666
3667                     • 0  on  success,  and  populate  MTU  value  in  mtu_len
3668                       pointer.
3669
3670                     • < 0 if any input argument is invalid (mtu_len  not  up‐
3671                       dated)
3672
                     MTU violations return positive values, but also populate
                     the MTU value in the mtu_len pointer, as this can be
                     needed for implementing PMTU handling:

                     • BPF_MTU_CHK_RET_FRAG_NEEDED

                     • BPF_MTU_CHK_RET_SEGS_TOOBIG
3680
3681       long bpf_for_each_map_elem(struct bpf_map *map, void *callback_fn, void
3682       *callback_ctx, u64 flags)
3683
3684              Description
3685                     For each element in map, call callback_fn  function  with
3686                     map, callback_ctx and other map-specific parameters.  The
3687                     callback_fn should be a static  function  and  the  call‐
3688                     back_ctx  should be a pointer to the stack.  The flags is
3689                     used to control certain  aspects  of  the  helper.   Cur‐
3690                     rently, the flags must be 0.
3691
                     The following is a list of supported map types and their
                     respective expected callback signatures:
3694
3695                     BPF_MAP_TYPE_HASH,              BPF_MAP_TYPE_PERCPU_HASH,
3696                     BPF_MAP_TYPE_LRU_HASH,      BPF_MAP_TYPE_LRU_PERCPU_HASH,
3697                     BPF_MAP_TYPE_ARRAY, BPF_MAP_TYPE_PERCPU_ARRAY
3698
3699                     long (*callback_fn)(struct bpf_map *map, const void *key,
3700                     void *value, void *ctx);
3701
                     For per-CPU maps, the value passed to the callback is the
                     value on the CPU where the BPF program is running.

                     If callback_fn returns 0, the helper continues to the
                     next element.  If it returns 1, the helper skips the
                     remaining elements and returns.  Other return values are
                     not used now.
3709
3710              Return The number of traversed map elements for success, -EINVAL
3711                     for invalid flags.
3712
3713       long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64  *data,
3714       u32 data_len)
3715
3716              Description
                     Output a string into the str buffer of size str_size,
                     based on a format string stored in a read-only map and
                     pointed to by fmt.
3720
3721                     Each  format specifier in fmt corresponds to one u64 ele‐
3722                     ment in the data array. For strings  and  pointers  where
3723                     pointees are accessed, only the pointer values are stored
                     in the data array.  data_len is the size of data in
                     bytes and must be a multiple of 8.
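
                     The layout of the data array can be sketched in plain
                     user-space C as follows.  The function name
                     pack_snprintf_args and the two-argument format ("%d %s")
                     are assumptions chosen for illustration; the point is
                     only that each specifier takes one u64 slot, that a
                     string is passed as its pointer value, and that the
                     resulting data_len is a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical packing for a format such as "%d %s": one 64-bit slot per
 * format specifier. Returns the value to pass as data_len (in bytes).
 */
size_t pack_snprintf_args(uint64_t out[2], int pid, const char *comm)
{
    out[0] = (uint64_t)pid;              /* integer stored by value      */
    out[1] = (uint64_t)(uintptr_t)comm;  /* string stored as its pointer */
    return 2 * sizeof(uint64_t);         /* 16 bytes, a multiple of 8    */
}
```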
3726
                     Formats %s and %p{i,I}{4,6} require reading kernel
                     memory.  Reading kernel memory may fail because the
                     address is invalid, or because it is valid but requires
                     a major memory fault.  If reading kernel memory fails,
                     the string for %s will be an empty string, and the IP
                     address for %p{i,I}{4,6} will be 0.  Not returning an
                     error to the BPF program is consistent with what
                     bpf_trace_printk() does for now.
3735
3736              Return The strictly positive length of the formatted string, in‐
3737                     cluding  the trailing zero character. If the return value
3738                     is  greater  than  str_size,  str  contains  a  truncated
3739                     string,  guaranteed  to  be  zero-terminated  except when
3740                     str_size is 0.
3741
3742                     Or -EBUSY if the per-CPU memory copy buffer is busy.
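
                     The return convention above (length including the
                     trailing zero, truncation when the result exceeds
                     str_size) can be modelled in user space.  The sketch
                     below uses the standard C vsnprintf() as a stand-in, not
                     the helper itself; since vsnprintf() reports the length
                     without the trailing zero, the model adds 1.  The names
                     model_len_with_nul and was_truncated are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Format into str and return the length including the trailing zero. */
long model_len_with_nul(char *str, unsigned int str_size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(str, str_size, fmt, ap);
    va_end(ap);
    return n < 0 ? (long)n : (long)n + 1;
}

/* Truncation test as described above: the result did not fit in str. */
int was_truncated(long ret, unsigned int str_size)
{
    return ret > (long)str_size;
}
```

                     A caller would compare the positive return value with
                     str_size before trusting that str holds the full string.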
3743
3744       long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size)
3745
3746              Description
3747                     Execute bpf syscall with given arguments.
3748
3749              Return A syscall result.
3750
3751       long bpf_btf_find_by_name_kind(char *name, int name_sz, u32  kind,  int
3752       flags)
3753
3754              Description
3755                     Find  BTF type with given name and kind in vmlinux BTF or
3756                     in module's BTFs.
3757
3758              Return Returns btf_id and btf_obj_fd in lower and upper 32 bits.
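
                     Splitting the packed return value could look like the
                     following user-space sketch.  The struct and function
                     names are hypothetical, and a caller would presumably
                     check for a negative (error) return before splitting;
                     that check is not shown here.

```c
#include <stdint.h>

/* Hypothetical container for the two halves of the return value. */
struct btf_lookup {
    uint32_t btf_id;      /* lower 32 bits */
    uint32_t btf_obj_fd;  /* upper 32 bits */
};

struct btf_lookup split_btf_result(uint64_t ret)
{
    struct btf_lookup r;

    r.btf_id = (uint32_t)(ret & 0xffffffffu);
    r.btf_obj_fd = (uint32_t)(ret >> 32);
    return r;
}
```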
3759
3760       long bpf_sys_close(u32 fd)
3761
3762              Description
3763                     Execute close syscall for given FD.
3764
3765              Return A syscall result.
3766
3767       long bpf_timer_init(struct bpf_timer *timer, struct bpf_map  *map,  u64
3768       flags)
3769
3770              Description
                     Initialize the timer.  The low 4 bits of flags specify
                     the clockid.  Only CLOCK_MONOTONIC, CLOCK_REALTIME and
                     CLOCK_BOOTTIME are allowed.  All other bits of flags are
                     reserved.  The verifier will reject the program if timer
                     is not from the same map.
3776
              Return 0 on success.  -EBUSY if timer is already initialized.
                     -EINVAL if invalid flags are passed.  -EPERM if timer is
                     in a map that doesn't have any user references.  User
                     space should either hold a file descriptor to a map with
                     timers or pin such a map in bpffs.  When the map is
                     unpinned or the file descriptor is closed, all timers in
                     the map will be cancelled and freed.
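
                     The flags rule for bpf_timer_init() can be sketched as a
                     user-space check.  This reads the 4 clockid bits as the
                     4 least significant bits of flags, which is an
                     assumption of the sketch, and mirrors the clock ids
                     locally with their Linux values instead of including
                     system headers.  The name timer_flags_ok is
                     hypothetical.

```c
#include <stdint.h>

/* Clock ids mirrored locally with their Linux values (assumed ABI). */
#define SK_CLOCK_REALTIME  0
#define SK_CLOCK_MONOTONIC 1
#define SK_CLOCK_BOOTTIME  7

/* Returns 1 if flags would be acceptable under the rules described above. */
int timer_flags_ok(uint64_t flags)
{
    uint64_t clockid = flags & 0xf;  /* low 4 bits select the clock */

    if (flags >> 4)                  /* all other bits are reserved */
        return 0;
    return clockid == SK_CLOCK_REALTIME ||
           clockid == SK_CLOCK_MONOTONIC ||
           clockid == SK_CLOCK_BOOTTIME;
}
```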
3784
3785       long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn)
3786
3787              Description
3788                     Configure the timer to call callback_fn static function.
3789
              Return 0 on success.  -EINVAL if timer was not initialized with
                     bpf_timer_init() earlier.  -EPERM if timer is in a map
                     that doesn't have any user references.  User space
                     should either hold a file descriptor to a map with timers
                     or pin such a map in bpffs.  When the map is unpinned or
                     the file descriptor is closed, all timers in the map will
                     be cancelled and freed.
3797
3798       long bpf_timer_start(struct bpf_timer *timer, u64 nsecs, u64 flags)
3799
3800              Description
                     Set the timer expiration N nanoseconds from the current
                     time.  The configured callback will be invoked in soft
                     irq context on some CPU and will not repeat unless
                     another bpf_timer_start() call is made.  In that case the
                     next invocation can migrate to a different CPU.

                     Since struct bpf_timer is a field inside a map element,
                     the map owns the timer.  bpf_timer_set_callback()
                     increments the refcnt of the BPF program to make sure
                     that the callback_fn code stays valid.  When the user
                     space reference count of a map reaches zero, all timers
                     in the map are cancelled and the corresponding programs'
                     refcnts are decremented.  This is done to make sure that
                     Ctrl-C of a user process doesn't leave any timers
                     running.  If the map is pinned in bpffs, the callback_fn
                     can re-arm itself indefinitely.

                     The bpf_map_update_elem() and bpf_map_delete_elem()
                     helpers and user space sys_bpf commands cancel and free
                     the timer in the given map element.  A map can contain
                     timers that invoke callback_fn-s from different
                     programs.  The same callback_fn can serve different
                     timers from different maps if the key/value layout
                     matches across maps.  Every bpf_timer_set_callback() call
                     can use a different callback_fn.
3822
3823              Return 0  on success.  -EINVAL if timer was not initialized with
3824                     bpf_timer_init() earlier or invalid flags are passed.
3825
3826       long bpf_timer_cancel(struct bpf_timer *timer)
3827
3828              Description
3829                     Cancel the timer and wait for callback_fn to finish if it
3830                     was running.
3831
3832              Return 0  if  the  timer was not active.  1 if the timer was ac‐
3833                     tive.   -EINVAL  if  timer  was  not   initialized   with
3834                     bpf_timer_init()  earlier.  -EDEADLK if callback_fn tried
3835                     to call bpf_timer_cancel() on its own timer  which  would
3836                     have led to a deadlock otherwise.
3837
3838       u64 bpf_get_func_ip(void *ctx)
3839
3840              Description
3841                     Get  address  of  the  traced  function  (for tracing and
3842                     kprobe programs).
3843
3844              Return Address of the traced function.   0  for  kprobes  placed
3845                     within the function (not at the entry).
3846
3847       u64 bpf_get_attach_cookie(void *ctx)
3848
3849              Description
3850                     Get  bpf_cookie  value  provided  (optionally) during the
3851                     program attachment. It might be different for each  indi‐
3852                     vidual  attachment,  even  if  BPF  program itself is the
3853                     same.  Expects BPF program context ctx as a  first  argu‐
3854                     ment.
3855
3856                     Supported for the following program types:
3857
3858                            • kprobe/uprobe;
3859
3860                            • tracepoint;
3861
3862                            • perf_event.
3863
3864              Return Value  specified  by user at BPF link creation/attachment
3865                     time or 0, if it was not specified.
3866
3867       long bpf_task_pt_regs(struct task_struct *task)
3868
3869              Description
3870                     Get the struct pt_regs associated with task.
3871
3872              Return A pointer to struct pt_regs.
3873
3874       long bpf_get_branch_snapshot(void *entries, u32 size, u64 flags)
3875
3876              Description
                     Get a branch trace from hardware engines like Intel LBR.
                     The hardware engine is stopped shortly after the helper
                     is called.  Therefore, the user needs to filter branch
                     entries based on the actual use case.  To capture the
                     branch trace before the trigger point of the BPF
                     program, the helper should be called at the beginning of
                     the BPF program.
3884
3885                     The data is stored as struct perf_branch_entry into  out‐
3886                     put buffer entries. size is the size of entries in bytes.
3887                     flags is reserved for now and must be zero.
3888
              Return On success, the number of bytes written to entries.  On
                     error, a negative value.
3891
3892                     -EINVAL if flags is not zero.
3893
3894                     -ENOENT if architecture does not support branch records.
3895
3896       long bpf_trace_vprintk(const char *fmt, u32 fmt_size, const void *data,
3897       u32 data_len)
3898
3899              Description
3900                     Behaves like bpf_trace_printk() helper, but takes an  ar‐
3901                     ray of u64 to format and can handle more format args as a
3902                     result.
3903
3904                     Arguments are to be used as in bpf_seq_printf() helper.
3905
3906              Return The number of bytes written to the buffer, or a  negative
3907                     error in case of failure.
3908
3909       struct unix_sock *bpf_skc_to_unix_sock(void *sk)
3910
3911              Description
3912                     Dynamically cast a sk pointer to a unix_sock pointer.
3913
3914              Return sk if casting is valid, or NULL otherwise.
3915
3916       long bpf_kallsyms_lookup_name(const char *name, int name_sz, int flags,
3917       u64 *res)
3918
3919              Description
3920                     Get the address of a kernel symbol, returned in res.  res
3921                     is set to 0 if the symbol is not found.
3922
3923              Return On success, zero. On error, a negative value.
3924
3925                     -EINVAL if flags is not zero.
3926
3927                     -EINVAL if string name is not the same size as name_sz.
3928
3929                     -ENOENT if symbol is not found.
3930
3931                     -EPERM  if caller does not have permission to obtain ker‐
3932                     nel address.
3933
3934       long bpf_find_vma(struct  task_struct  *task,  u64  addr,  void  *call‐
3935       back_fn, void *callback_ctx, u64 flags)
3936
3937              Description
3938                     Find  vma  of  task  that contains addr, call callback_fn
3939                     function with task, vma,  and  callback_ctx.   The  call‐
3940                     back_fn  should be a static function and the callback_ctx
3941                     should be a pointer to the stack.  The flags is  used  to
3942                     control  certain  aspects  of the helper.  Currently, the
3943                     flags must be 0.
3944
3945                     The expected callback signature is
3946
3947                     long  (*callback_fn)(struct  task_struct  *task,   struct
3948                     vm_area_struct *vma, void *callback_ctx);
3949
              Return 0 on success.  -ENOENT if task->mm is NULL, or no vma
                     contains addr.  -EBUSY if the attempt to trylock
                     mmap_lock failed.  -EINVAL for invalid flags.
3953
3954       long  bpf_loop(u32 nr_loops, void *callback_fn, void *callback_ctx, u64
3955       flags)
3956
3957              Description
                     Call the callback_fn function nr_loops times, with
                     callback_ctx as the context parameter.  The callback_fn
                     should be a static function and the callback_ctx should
                     be a pointer to the stack.  The flags argument is used to
                     control certain aspects of the helper.  Currently, flags
                     must be 0, and nr_loops is limited to 1 << 23 (~8
                     million) loops.

                     The expected callback signature is:
3965
3966                     long (*callback_fn)(u32 index, void *ctx);
3967
3968                     where index is the current index in the loop.  The  index
3969                     is zero-indexed.
3970
3971                     If callback_fn returns 0, the helper will continue to the
3972                     next loop. If return value is 1, the helper will skip the
3973                     rest of the loops and return. Other return values are not
3974                     used now, and will be rejected by the verifier.
3975
3976              Return The number of loops performed, -EINVAL for invalid flags,
3977                     -E2BIG if nr_loops exceeds the maximum number of loops.
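
                     The callback contract described for bpf_loop() can be
                     modelled in user space as below.  This is a sketch of the
                     documented behaviour, not the kernel implementation: the
                     names model_bpf_loop, MODEL_MAX_LOOPS and sum_until_three
                     are hypothetical, and counting the iteration on which
                     the callback returns 1 as "performed" is an assumption
                     of the model.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_MAX_LOOPS (1u << 23)  /* documented upper bound on nr_loops */

typedef long (*loop_cb)(uint32_t index, void *ctx);

/* User-space model: call cb for index 0..nr_loops-1 until it returns non-zero. */
long model_bpf_loop(uint32_t nr_loops, loop_cb cb, void *ctx, uint64_t flags)
{
    uint32_t i;

    if (flags)
        return -EINVAL;
    if (nr_loops > MODEL_MAX_LOOPS)
        return -E2BIG;
    for (i = 0; i < nr_loops; i++) {
        if (cb(i, ctx) != 0)
            return (long)i + 1;  /* assumption: stopping iteration is counted */
    }
    return (long)i;
}

/* Example callback: add index to a running sum, stop once index reaches 3. */
long sum_until_three(uint32_t index, void *ctx)
{
    *(uint64_t *)ctx += index;
    return index == 3 ? 1 : 0;
}
```

                     Calling model_bpf_loop(10, sum_until_three, &sum, 0)
                     with sum starting at 0 visits indexes 0 through 3 and
                     then stops early.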
3978
3979       long bpf_strncmp(const char *s1, u32 s1_sz, const char *s2)
3980
3981              Description
3982                     Do  strncmp()  between  s1  and s2. s1 doesn't need to be
3983                     null-terminated and s1_sz is the maximum storage size  of
3984                     s1. s2 must be a read-only string.
3985
              Return An integer less than, equal to, or greater than zero if
                     the first s1_sz bytes of s1 are found to be less than, to
                     match, or to be greater than s2.
3989
3990       long bpf_get_func_arg(void *ctx, u32 n, u64 *value)
3991
3992              Description
3993                     Get  n-th  argument  register  (zero based) of the traced
3994                     function (for tracing programs) returned in value.
3995
3996              Return 0 on success.  -EINVAL if n >= argument register count of
3997                     traced function.
3998
3999       long bpf_get_func_ret(void *ctx, u64 *value)
4000
4001              Description
4002                     Get return value of the traced function (for tracing pro‐
4003                     grams) in value.
4004
4005              Return 0 on success.  -EOPNOTSUPP  for  tracing  programs  other
4006                     than BPF_TRACE_FEXIT or BPF_MODIFY_RETURN.
4007
4008       long bpf_get_func_arg_cnt(void *ctx)
4009
4010              Description
                     Get the number of registers of the traced function (for
                     tracing programs) in which function arguments are
                     stored.
4014
4015              Return The number of argument registers of the traced function.
4016
4017       int bpf_get_retval(void)
4018
4019              Description
4020                     Get  the BPF program's return value that will be returned
4021                     to the upper layers.
4022
4023                     This helper is currently supported by cgroup programs and
4024                     only by the hooks where BPF program's return value is re‐
4025                     turned to the userspace via errno.
4026
4027              Return The BPF program's return value.
4028
4029       int bpf_set_retval(int retval)
4030
4031              Description
4032                     Set the BPF program's return value that will be  returned
4033                     to the upper layers.
4034
4035                     This helper is currently supported by cgroup programs and
4036                     only by the hooks where BPF program's return value is re‐
4037                     turned to the userspace via errno.
4038
4039                     Note  that  there  is the following corner case where the
4040                     program exports an error via bpf_set_retval  but  signals
4041                     success via 'return 1':
4042                        bpf_set_retval(-EPERM); return 1;
4043
4044                     In  this  case,  the  BPF program's return value will use
4045                     helper's   -EPERM.   This   still    holds    true    for
4046                     cgroup/bind{4,6}  which supports extra 'return 3' success
4047                     case.
4048
4049              Return 0 on success, or a negative error in case of failure.
4050
4051       u64 bpf_xdp_get_buff_len(struct xdp_buff *xdp_md)
4052
4053              Description
                     Get the total size of a given xdp buff (linear and paged
                     area).
4056
4057              Return The total size of a given xdp buffer.
4058
4059       long bpf_xdp_load_bytes(struct xdp_buff *xdp_md, u32 offset, void *buf,
4060       u32 len)
4061
4062              Description
                     This helper is provided as an easy way to load data from
                     an xdp buffer.  It can be used to load len bytes from
                     offset from the frame associated to xdp_md, into the
                     buffer pointed to by buf.
4067
4068              Return 0 on success, or a negative error in case of failure.
4069
4070       long  bpf_xdp_store_bytes(struct  xdp_buff  *xdp_md,  u32  offset, void
4071       *buf, u32 len)
4072
4073              Description
4074                     Store len bytes from buffer buf into the frame associated
4075                     to xdp_md, at offset.
4076
4077              Return 0 on success, or a negative error in case of failure.
4078
4079       long bpf_copy_from_user_task(void *dst, u32 size, const void *user_ptr,
4080       struct task_struct *tsk, u64 flags)
4081
4082              Description
                     Read size bytes from user space address user_ptr in tsk's
                     address space, and store the data in dst.  flags is not
                     used yet and is provided for future extensibility.  This
                     helper can only be used by sleepable programs.
4087
4088              Return 0  on success, or a negative error in case of failure. On
4089                     error dst buffer is zeroed out.
4090
4091       long  bpf_skb_set_tstamp(struct   sk_buff   *skb,   u64   tstamp,   u32
4092       tstamp_type)
4093
4094              Description
                     Change the __sk_buff->tstamp_type to tstamp_type and set
                     __sk_buff->tstamp to tstamp at the same time.
4097
4098                     If there is no need to change the __sk_buff->tstamp_type,
4099                     the   tstamp   value   can   be   directly   written   to
4100                     __sk_buff->tstamp instead.
4101
4102                     BPF_SKB_TSTAMP_DELIVERY_MONO is the only tstamp that will
4103                     be  kept during bpf_redirect_*().  A non zero tstamp must
4104                     be    used    with    the    BPF_SKB_TSTAMP_DELIVERY_MONO
4105                     tstamp_type.
4106
4107                     A BPF_SKB_TSTAMP_UNSPEC tstamp_type can only be used with
4108                     a zero tstamp.
4109
4110                     Only IPv4 and IPv6 skb->protocol are supported.
4111
                     This function is most useful when a program needs to set
                     a mono delivery time in __sk_buff->tstamp and then
                     bpf_redirect_*() to the egress of an iface.  For example,
                     changing the (rcv) timestamp in __sk_buff->tstamp at
                     ingress to a mono delivery time and then
                     bpf_redirect_*() to sch_fq@phy-dev.
4118
              Return 0 on success.  -EINVAL for invalid input.  -EOPNOTSUPP
                     for unsupported protocol.
4121
4122       long bpf_ima_file_hash(struct file *file, void *dst, u32 size)
4123
4124              Description
                     Returns a calculated IMA hash of the file.  If the hash
                     is larger than size, then only size bytes will be copied
                     to dst.
4128
              Return The hash_algo is returned on success, -EOPNOTSUPP if the
                     hash calculation failed or -EINVAL if invalid arguments
                     are passed.
4132
4133       void *bpf_kptr_xchg(void *map_value, void *ptr)
4134
4135              Description
4136                     Exchange kptr at pointer map_value with ptr,  and  return
4137                     the  old  value.  ptr can be NULL, otherwise it must be a
4138                     referenced pointer  which  will  be  released  when  this
4139                     helper is called.
4140
              Return The old value of kptr (which can be NULL).  The returned
                     pointer, if not NULL, is a reference which must be
                     released using its corresponding release function, or
                     moved into a BPF map before program exit.
4145
4146       void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void  *key,
4147       u32 cpu)
4148
4149              Description
4150                     Perform a lookup in percpu map for an entry associated to
4151                     key on cpu.
4152
4153              Return Map value associated to key on cpu, or NULL if  no  entry
4154                     was found or cpu is invalid.
4155
4156       struct mptcp_sock *bpf_skc_to_mptcp_sock(void *sk)
4157
4158              Description
4159                     Dynamically cast a sk pointer to a mptcp_sock pointer.
4160
4161              Return sk if casting is valid, or NULL otherwise.
4162
4163       long  bpf_dynptr_from_mem(void  *data,  u32  size,  u64  flags,  struct
4164       bpf_dynptr *ptr)
4165
4166              Description
4167                     Get a dynptr to local memory data.
4168
4169                     data must be a ptr to a map value.  The maximum size sup‐
4170                     ported is DYNPTR_MAX_SIZE.  flags is currently unused.
4171
4172              Return 0 on success, -E2BIG if the size exceeds DYNPTR_MAX_SIZE,
4173                     -EINVAL if flags is not 0.
4174
4175       long bpf_ringbuf_reserve_dynptr(void *ringbuf,  u32  size,  u64  flags,
4176       struct bpf_dynptr *ptr)
4177
4178              Description
4179                     Reserve  size  bytes  of payload in a ring buffer ringbuf
4180                     through the dynptr interface. flags must be 0.
4181
4182                     Please  note  that   a   corresponding   bpf_ringbuf_sub‐
4183                     mit_dynptr  or  bpf_ringbuf_discard_dynptr must be called
4184                     on ptr, even if the reservation fails. This  is  enforced
4185                     by the verifier.
4186
4187              Return 0 on success, or a negative error in case of failure.
4188
4189       void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags)
4190
4191              Description
                     Submit reserved ring buffer sample, referenced by ptr,
                     through the dynptr interface.  This is a no-op if the
                     dynptr is invalid/null.
4195
4196                     For  more  information  on  flags,  please see 'bpf_ring‐
4197                     buf_submit'.
4198
4199              Return Nothing. Always succeeds.
4200
4201       void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags)
4202
4203              Description
4204                     Discard reserved ring buffer sample  through  the  dynptr
4205                     interface. This is a no-op if the dynptr is invalid/null.
4206
4207                     For  more  information  on  flags,  please see 'bpf_ring‐
4208                     buf_discard'.
4209
4210              Return Nothing. Always succeeds.
4211
4212       long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr  *src,
4213       u32 offset, u64 flags)
4214
4215              Description
4216                     Read  len  bytes  from src into dst, starting from offset
4217                     into src.  flags is currently unused.
4218
4219              Return 0 on success, -E2BIG if offset + len exceeds  the  length
4220                     of  src's data, -EINVAL if src is an invalid dynptr or if
4221                     flags is not 0.
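
                     The bounds rule for bpf_dynptr_read() can be modelled in
                     user space as below.  struct model_dynptr and
                     model_dynptr_read are hypothetical stand-ins (a real
                     struct bpf_dynptr is opaque to BPF programs), and a NULL
                     data pointer is used here to represent an invalid
                     dynptr.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a dynptr: a pointer plus the length of its data. */
struct model_dynptr {
    const unsigned char *data;  /* NULL models an invalid dynptr */
    uint32_t size;
};

long model_dynptr_read(void *dst, uint32_t len, const struct model_dynptr *src,
                       uint32_t offset, uint64_t flags)
{
    if (!src->data || flags)
        return -EINVAL;
    /* Widen to 64 bits so that offset + len cannot wrap around. */
    if ((uint64_t)offset + len > src->size)
        return -E2BIG;
    memcpy(dst, src->data + offset, len);
    return 0;
}
```

                     Doing the addition in 64 bits matters: with 32-bit
                     arithmetic a very large offset could wrap and pass the
                     check.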
4222
4223       long bpf_dynptr_write(const struct bpf_dynptr *dst,  u32  offset,  void
4224       *src, u32 len, u64 flags)
4225
4226              Description
4227                     Write  len  bytes from src into dst, starting from offset
4228                     into dst.  flags is currently unused.
4229
4230              Return 0 on success, -E2BIG if offset + len exceeds  the  length
4231                     of  dst's data, -EINVAL if dst is an invalid dynptr or if
4232                     dst is a read-only dynptr or if flags is not 0.
4233
4234       void *bpf_dynptr_data(const struct bpf_dynptr  *ptr,  u32  offset,  u32
4235       len)
4236
4237              Description
4238                     Get a pointer to the underlying dynptr data.
4239
4240                     len  must  be a statically known value. The returned data
4241                     slice is invalidated whenever the dynptr is invalidated.
4242
4243              Return Pointer to the underlying dynptr data, NULL if the dynptr
                     is read-only, if the dynptr is invalid, or if the offset
                     and length are out of bounds.
4246
4247       s64 bpf_tcp_raw_gen_syncookie_ipv4(struct  iphdr  *iph,  struct  tcphdr
4248       *th, u32 th_len)
4249
4250              Description
4251                     Try to issue a SYN cookie for the packet with correspond‐
4252                     ing IPv4/TCP headers, iph and th, without depending on  a
4253                     listening socket.
4254
4255                     iph points to the IPv4 header.
4256
4257                     th  points  to  the start of the TCP header, while th_len
4258                     contains  the  length  of  the  TCP  header   (at   least
4259                     sizeof(struct tcphdr)).
4260
              Return On success, the lower 32 bits hold the generated SYN
                     cookie, followed by 16 bits which hold the MSS value for
                     that cookie; the top 16 bits are unused.
4264
4265                     On failure, the returned value is one of the following:
4266
4267                     -EINVAL if th_len is invalid.
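
                     Unpacking the s64 return value could look like the
                     following user-space sketch.  The struct and function
                     names are hypothetical; the bit layout (cookie in bits
                     0-31, MSS in bits 32-47) follows the description above,
                     and the same layout is described for the IPv6 variant.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct syncookie_result {
    uint32_t cookie;  /* bits 0-31 of the return value  */
    uint16_t mss;     /* bits 32-47 of the return value */
};

/* Returns 0 and fills *out on success, -1 if ret is a negative error code. */
int decode_syncookie(int64_t ret, struct syncookie_result *out)
{
    uint64_t v;

    if (ret < 0)
        return -1;
    v = (uint64_t)ret;
    out->cookie = (uint32_t)(v & 0xffffffffu);
    out->mss = (uint16_t)((v >> 32) & 0xffffu);
    return 0;
}
```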
4268
4269       s64  bpf_tcp_raw_gen_syncookie_ipv6(struct  ipv6hdr *iph, struct tcphdr
4270       *th, u32 th_len)
4271
4272              Description
4273                     Try to issue a SYN cookie for the packet with correspond‐
4274                     ing  IPv6/TCP headers, iph and th, without depending on a
4275                     listening socket.
4276
4277                     iph points to the IPv6 header.
4278
4279                     th points to the start of the TCP  header,  while  th_len
4280                     contains   the   length  of  the  TCP  header  (at  least
4281                     sizeof(struct tcphdr)).
4282
              Return On success, the lower 32 bits hold the generated SYN
                     cookie, followed by 16 bits which hold the MSS value for
                     that cookie; the top 16 bits are unused.
4286
4287                     On failure, the returned value is one of the following:
4288
4289                     -EINVAL if th_len is invalid.
4290
4291                     -EPROTONOSUPPORT if CONFIG_IPV6 is not builtin.
4292
4293       long bpf_tcp_raw_check_syncookie_ipv4(struct iphdr *iph, struct  tcphdr
4294       *th)
4295
4296              Description
4297                     Check  whether  iph and th contain a valid SYN cookie ACK
4298                     without depending on a listening socket.
4299
4300                     iph points to the IPv4 header.
4301
4302                     th points to the TCP header.
4303
4304              Return 0 if iph and th are a valid SYN cookie ACK.
4305
4306                     On failure, the returned value is one of the following:
4307
4308                     -EACCES if the SYN cookie is not valid.
4309
4310       long  bpf_tcp_raw_check_syncookie_ipv6(struct  ipv6hdr   *iph,   struct
4311       tcphdr *th)
4312
4313              Description
4314                     Check  whether  iph and th contain a valid SYN cookie ACK
4315                     without depending on a listening socket.
4316
4317                     iph points to the IPv6 header.
4318
4319                     th points to the TCP header.
4320
4321              Return 0 if iph and th are a valid SYN cookie ACK.
4322
4323                     On failure, the returned value is one of the following:
4324
4325                     -EACCES if the SYN cookie is not valid.
4326
4327                     -EPROTONOSUPPORT if CONFIG_IPV6 is not builtin.
4328
4329       u64 bpf_ktime_get_tai_ns(void)
4330
4331              Description
                     Read a nonsettable system-wide clock derived from
                     wall-clock time but ignoring leap seconds. Unlike
                     CLOCK_REALTIME, this clock does not experience the
                     discontinuities and backwards jumps caused by NTP
                     inserting leap seconds.
4336
4337                     See: clock_gettime(CLOCK_TAI)
4338
4339              Return Current ktime.
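
                     For comparison, the same clock can be read from user
                     space with clock_gettime(CLOCK_TAI). The sketch below
                     assumes Linux with glibc; tai_now_ns() is a
                     hypothetical name, and the nanosecond conversion is
                     an assumption based on the _ns suffix of the helper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes CLOCK_TAI on glibc in strict ISO modes */
#endif
#include <stdint.h>
#include <time.h>

/*
 * Read CLOCK_TAI from user space and convert it to nanoseconds.
 * Returns 0 if the clock cannot be read.
 */
static uint64_t tai_now_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_TAI, &ts) != 0)
        return 0;

    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

                     Timestamps taken this way should be comparable with
                     those recorded by an eBPF program through the helper,
                     since both refer to the TAI clock.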
4340
4341       long  bpf_user_ringbuf_drain(struct  bpf_map  *map,  void *callback_fn,
4342       void *ctx, u64 flags)
4343
4344              Description
4345                     Drain samples from the specified user  ring  buffer,  and
4346                     invoke the provided callback for each such sample:
4347
4348                     long (*callback_fn)(const struct bpf_dynptr *dynptr, void
4349                     *ctx);
4350
4351                     If callback_fn returns 0, the helper will continue to try
4352                     and   drain   the   next  sample,  up  to  a  maximum  of
4353                     BPF_MAX_USER_RINGBUF_SAMPLES samples. If the return value
4354                     is  1,  the  helper will skip the rest of the samples and
                     return. Other return values are currently unused and
                     are rejected by the verifier.
4357
              Return The number of drained samples if no error was encountered
                     while draining samples, or 0 if no samples were present
                     in the ring buffer. If a user-space producer was
                     epoll-waiting on this map, and at least one sample was
                     drained, it will receive an event notification that
                     space is available in the ring buffer. If the
                     BPF_RB_NO_WAKEUP flag is passed to this function, no
                     wakeup notification will be sent. If the
                     BPF_RB_FORCE_WAKEUP flag is passed, a wakeup notification
                     will be sent even if no sample was drained.
4368
4369                     On failure, the returned value is one of the following:
4370
4371                     -EBUSY if the ring buffer is contended, and another call‐
4372                     ing context was concurrently draining the ring buffer.
4373
4374                     -EINVAL  if  user-space is not properly tracking the ring
4375                     buffer due to the producer position not being aligned  to
4376                     8  bytes,  a  sample not being aligned to 8 bytes, or the
4377                     producer position not matching the advertised length of a
4378                     sample.
4379
4380                     -E2BIG  if user-space has tried to publish a sample which
4381                     is larger than the size of the ring buffer, or which can‐
4382                     not fit within a struct bpf_dynptr.
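
                     The callback contract described above can be modelled
                     in plain user-space C. This is only a sketch of the
                     documented behaviour, not kernel code: model_drain(),
                     MODEL_MAX_SAMPLES and sum_until_negative() are
                     hypothetical names, samples are plain ints rather than
                     struct bpf_dynptr, and the model assumes that a sample
                     for which the callback returns 1 still counts as
                     drained.

```c
#include <stddef.h>

/* Stand-in for BPF_MAX_USER_RINGBUF_SAMPLES; the value is arbitrary. */
#define MODEL_MAX_SAMPLES 4

/* Same shape as callback_fn, with a plain pointer instead of a dynptr. */
typedef long (*model_sample_cb)(const void *sample, void *ctx);

/*
 * Hand samples to the callback one by one. Stop when the samples run
 * out, when the per-call cap is reached, or when the callback returns 1.
 * Returns the number of samples handed to the callback.
 */
static long model_drain(const int *samples, size_t n,
                        model_sample_cb cb, void *ctx)
{
    long drained = 0;

    for (size_t i = 0; i < n && drained < MODEL_MAX_SAMPLES; i++) {
        long ret = cb(&samples[i], ctx);

        drained++;
        if (ret == 1)
            break; /* skip the rest of the samples */
    }
    return drained;
}

/* Example callback: add up values, stop at the first negative one. */
static long sum_until_negative(const void *sample, void *ctx)
{
    int v = *(const int *)sample;

    if (v < 0)
        return 1;
    *(long *)ctx += v;
    return 0;
}
```

                     With the samples 1, 2, -1, 5 the model hands three
                     samples to the callback and leaves the fourth in
                     place for a later call.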
4383
4384       void  *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup,
4385       void *value, u64 flags)
4386
4387              Description
4388                     Get a bpf_local_storage from the cgroup.
4389
                     Logically, it could be thought of as getting the value
                     from a map with cgroup as the key. From this
                     perspective, the usage is not much different from
                     bpf_map_lookup_elem(map, &cgroup), except that this
                     helper enforces that the key must be a cgroup struct
                     and that the map must be a BPF_MAP_TYPE_CGRP_STORAGE.
4396
4397                     In  reality, the local-storage value is embedded directly
4398                     inside of the cgroup object itself, rather than being lo‐
4399                     cated  in the BPF_MAP_TYPE_CGRP_STORAGE map. When the lo‐
4400                     cal-storage value is queried for some map on a cgroup ob‐
4401                     ject,  the kernel will perform an O(n) iteration over all
4402                     of the live local-storage values for that  cgroup  object
4403                     until the local-storage value for the map is found.
4404
                     The optional flag BPF_LOCAL_STORAGE_GET_F_CREATE can be
                     set in flags so that a new bpf_local_storage is created
                     if one does not exist. value can be used together with
                     BPF_LOCAL_STORAGE_GET_F_CREATE to specify the initial
                     value of a bpf_local_storage. If value is NULL, the new
                     bpf_local_storage will be zero initialized.
4411
4412              Return A bpf_local_storage pointer is returned on success.
4413
4414                     NULL if not found or there was an error in adding  a  new
4415                     bpf_local_storage.
4416
4417       long   bpf_cgrp_storage_delete(struct   bpf_map   *map,  struct  cgroup
4418       *cgroup)
4419
4420              Description
4421                     Delete a bpf_local_storage from a cgroup.
4422
4423              Return 0 on success.
4424
4425                     -ENOENT if the bpf_local_storage cannot be found.
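
                     The get and delete semantics of the two cgroup
                     storage helpers above can be illustrated with a small
                     user-space model. It is a sketch under simplifying
                     assumptions, not the kernel implementation: every
                     model_* name and MODEL_F_CREATE are hypothetical, the
                     stored value is a single long, and the per-cgroup
                     storages are kept in a singly linked list to mirror
                     the O(n) lookup mentioned above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for BPF_LOCAL_STORAGE_GET_F_CREATE. */
#define MODEL_F_CREATE 1UL

struct model_elem {
    const void *map;         /* which map this value belongs to */
    long value;              /* the local-storage value itself */
    struct model_elem *next;
};

/* The values live inside the cgroup object, not inside the map. */
struct model_cgroup {
    struct model_elem *storages;
};

static long *model_storage_get(const void *map, struct model_cgroup *cg,
                               const long *init, unsigned long flags)
{
    struct model_elem *e;

    /* O(n) walk over all live values attached to this cgroup. */
    for (e = cg->storages; e != NULL; e = e->next)
        if (e->map == map)
            return &e->value;

    if (!(flags & MODEL_F_CREATE))
        return NULL;             /* not found, creation not requested */

    e = calloc(1, sizeof(*e));   /* zero initialized by default */
    if (e == NULL)
        return NULL;             /* error while adding a new storage */
    e->map = map;
    if (init != NULL)
        e->value = *init;        /* optional initial value */
    e->next = cg->storages;
    cg->storages = e;
    return &e->value;
}

static int model_storage_delete(const void *map, struct model_cgroup *cg)
{
    struct model_elem **pp;

    for (pp = &cg->storages; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->map == map) {
            struct model_elem *e = *pp;

            *pp = e->next;
            free(e);
            return 0;
        }
    }
    return -ENOENT;
}
```

                     In the model, a lookup without the create flag on an
                     empty cgroup yields NULL, and a second delete for the
                     same map yields -ENOENT.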
4426

EXAMPLES

       Examples of usage for most of the eBPF helpers listed in this manual
       page are available within the Linux kernel sources, at the following
       locations:
4431
       • samples/bpf/

       • tools/testing/selftests/bpf/
4435

LICENSE

       eBPF programs can have an associated license, passed along with the
       bytecode instructions to the kernel when the programs are loaded. The
       format for that string is identical to the one in use for kernel
       modules (dual licenses, such as "Dual BSD/GPL", may be used). Some
       helper functions are only accessible to programs that are compatible
       with the GNU General Public License (GPL).
4443
4444       In  order to use such helpers, the eBPF program must be loaded with the
4445       correct license string passed (via attr) to the bpf() system call,  and
4446       this  generally  translates  into the C source code of the program con‐
4447       taining a line similar to the following:
4448
4449          char ____license[] __attribute__((section("license"), used)) = "GPL";
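
       To illustrate what "compatible" means for such a string, the
       following user-space sketch performs an exact string comparison
       against a fixed list. The list is believed to match the one used by
       license_is_gpl_compatible() in the kernel's include/linux/license.h,
       but this is an assumption and should be checked against your kernel
       tree; license_looks_gpl_compatible() is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed to match the kernel's list of GPL-compatible license strings. */
static const char *const gpl_compatible_licenses[] = {
    "GPL",
    "GPL v2",
    "GPL and additional rights",
    "Dual BSD/GPL",
    "Dual MIT/GPL",
    "Dual MPL/GPL",
};

/* Exact, case-sensitive match against the list above. */
static bool license_looks_gpl_compatible(const char *license)
{
    size_t n = sizeof(gpl_compatible_licenses) /
               sizeof(gpl_compatible_licenses[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(license, gpl_compatible_licenses[i]) == 0)
            return true;
    return false;
}
```

       Because the comparison is exact, a string such as "gpl" would not
       qualify in this sketch.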
4450

IMPLEMENTATION

       This manual page is an effort to document the existing eBPF helper
       functions. But as of this writing, the BPF subsystem is under heavy
       development. New eBPF program or map types are added, along with new
       helper functions. Some helpers are occasionally made available for
       additional program types. So in spite of the efforts of the
       community, this page might not be up to date. If you want to check
       for yourself what helper functions exist in your kernel, or what
       types of programs they can support, here are some files in the
       kernel tree that you may be interested in:
4461
       • include/uapi/linux/bpf.h is the main BPF header. It contains the
         full list of all helper functions, as well as many other BPF
         definitions including most of the flags, structs or constants used
         by the helpers.

       • net/core/filter.c contains the definition of most network-related
         helper functions, and the list of program types from which they can
         be used.

       • kernel/trace/bpf_trace.c is the equivalent for most tracing
         program-related helpers.

       • kernel/bpf/verifier.c contains the functions used to check that
         valid types of eBPF maps are used with a given helper function.

       • kernel/bpf/ directory contains other files in which additional
         helpers are defined (for cgroups, sockmaps, etc.).
4479
4480       • The bpftool utility can be used to probe the availability  of  helper
4481         functions  on the system (as well as supported program and map types,
4482         and a number of other parameters). To  do  so,  run  bpftool  feature
4483         probe (see bpftool-feature(8) for details). Add the unprivileged key‐
4484         word to list features available to unprivileged users.
4485
       Compatibility between helper functions and program types can generally
       be found in the files where helper functions are defined. Look for the
       struct bpf_func_proto objects and for functions returning them: these
       functions contain a list of helpers that a given program type can call.
       Note that the default: label of the switch ... case used to filter
       helpers can call other functions, which in turn allow access to
       additional helpers. The GPL license requirement is also recorded in
       those struct bpf_func_proto objects.
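
       The pattern described above can be sketched as a user-space mock.
       Every mock_* and MOCK_* name below is hypothetical, and the set of
       helpers and their gpl_only values are illustrative only; the real
       definitions must be read from the kernel sources.

```c
#include <stdbool.h>
#include <stddef.h>

enum mock_func_id {
    MOCK_FUNC_map_lookup_elem,
    MOCK_FUNC_trace_printk,
    MOCK_FUNC_redirect,
    MOCK_FUNC_unknown,
};

/* Reduced stand-in for struct bpf_func_proto. */
struct mock_func_proto {
    const char *name;
    bool gpl_only; /* helper restricted to GPL-compatible programs */
};

static const struct mock_func_proto mock_map_lookup_elem_proto = {
    "map_lookup_elem", false
};
static const struct mock_func_proto mock_trace_printk_proto = {
    "trace_printk", true
};
static const struct mock_func_proto mock_redirect_proto = {
    "redirect", false
};

/* Helpers shared by several program types. */
static const struct mock_func_proto *
mock_base_func_proto(enum mock_func_id id)
{
    switch (id) {
    case MOCK_FUNC_map_lookup_elem:
        return &mock_map_lookup_elem_proto;
    case MOCK_FUNC_trace_printk:
        return &mock_trace_printk_proto;
    default:
        return NULL; /* helper not available */
    }
}

/* Helpers for one program type; default: chains to the shared list. */
static const struct mock_func_proto *
mock_xdp_func_proto(enum mock_func_id id)
{
    switch (id) {
    case MOCK_FUNC_redirect:
        return &mock_redirect_proto;
    default:
        return mock_base_func_proto(id);
    }
}

/* A helper is callable if it is listed and its license rule is met. */
static bool mock_helper_allowed(enum mock_func_id id, bool prog_is_gpl)
{
    const struct mock_func_proto *proto = mock_xdp_func_proto(id);

    return proto != NULL && (!proto->gpl_only || prog_is_gpl);
}
```

       Reading only the first switch would miss the helpers reached
       through the default: label, which is why the chained function has
       to be followed as well.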
4494
4495       Compatibility  between  helper  functions and map types can be found in
4496       the check_map_func_compatibility() function  in  file  kernel/bpf/veri‐
4497       fier.c.
4498
4499       Helper functions that invalidate the checks on data and data_end point‐
4500       ers    for    network    processing    are    listed    in     function
4501       bpf_helper_changes_pkt_data() in file net/core/filter.c.
4502

SEE ALSO

4504       bpf(2),  bpftool(8), cgroups(7), ip(8), perf_event_open(2), sendmsg(2),
4505       socket(7), tc-bpf(8)
4506
4507
4508
4509
4510Linux v6.2                        2023-04-11                    BPF-HELPERS(7)