BPF-HELPERS(7)         Miscellaneous Information Manual         BPF-HELPERS(7)

NAME

       BPF-HELPERS - list of eBPF helper functions

DESCRIPTION

       The extended Berkeley Packet Filter (eBPF) subsystem consists of
       programs written in a pseudo-assembly language, then attached to one of
       several kernel hooks and run in reaction to specific events. This
       framework differs from the older, "classic" BPF (or "cBPF") in several
       aspects, one of them being the ability to call special functions (or
       "helpers") from within a program. These functions are restricted to a
       white-list of helpers defined in the kernel.

       These helpers are used by eBPF programs to interact with the system, or
       with the context in which they work. For instance, they can be used to
       print debugging messages, to get the time since the system was booted,
       to interact with eBPF maps, or to manipulate network packets. Since
       there are several eBPF program types, and they do not all run in the
       same context, each program type can only call a subset of those
       helpers.

       Due to eBPF conventions, a helper cannot have more than five
       arguments.

       Internally, eBPF programs call directly into the compiled helper
       functions without requiring any foreign-function interface. As a
       result, calling a helper costs no more than an ordinary function call,
       which keeps helper overhead very low.

       This document is an attempt to list and document the helpers available
       to eBPF developers. They are sorted in chronological order (the oldest
       helpers in the kernel at the top).

HELPERS

       void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

              Description
                     Perform a lookup in map for an entry associated with key.

              Return Map value associated with key, or NULL if no entry was
                     found.

       long bpf_map_update_elem(struct bpf_map *map, const void *key, const
       void *value, u64 flags)

              Description
                     Add or update the value of the entry associated with key
                     in map with value. flags is one of:

                     BPF_NOEXIST
                            The entry for key must not exist in the map.

                     BPF_EXIST
                            The entry for key must already exist in the map.

                     BPF_ANY
                            No condition on the existence of the entry for
                            key.

                     Flag value BPF_NOEXIST cannot be used for maps of types
                     BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_PERCPU_ARRAY, because
                     all of their elements always exist; the helper would
                     return an error.
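
                     The flag semantics can be sketched in plain C. The
                     following is a toy user-space model, not kernel code:
                     struct toy_map, toy_find() and toy_update() are
                     hypothetical names, and the error codes (-EINVAL,
                     -EEXIST, -ENOENT, -E2BIG) are an assumption about what a
                     hash map would typically return.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as defined in the Linux UAPI header <linux/bpf.h>. */
#define BPF_ANY     0 /* create a new entry or update an existing one */
#define BPF_NOEXIST 1 /* create a new entry only if it does not exist */
#define BPF_EXIST   2 /* update an existing entry only */

#define TOY_MAP_SLOTS 8 /* hypothetical capacity of the toy map */

/* Hypothetical toy map: a fixed table of (key, value) pairs. */
struct toy_map {
	uint32_t keys[TOY_MAP_SLOTS];
	uint64_t values[TOY_MAP_SLOTS];
	int used[TOY_MAP_SLOTS];
};

/* Return the slot holding key, or -1 if the key is absent. */
static int toy_find(const struct toy_map *m, uint32_t key)
{
	for (int i = 0; i < TOY_MAP_SLOTS; i++)
		if (m->used[i] && m->keys[i] == key)
			return i;
	return -1;
}

/* Model of the flag checks; returns 0 or a negative errno value. */
static int toy_update(struct toy_map *m, uint32_t key, uint64_t value,
		      uint64_t flags)
{
	int slot = toy_find(m, key);

	if (flags > BPF_EXIST)
		return -EINVAL;		/* unknown flag */
	if (slot >= 0 && flags == BPF_NOEXIST)
		return -EEXIST;		/* entry must not exist, but does */
	if (slot < 0 && flags == BPF_EXIST)
		return -ENOENT;		/* entry must exist, but does not */

	if (slot < 0) {
		/* Pick the first free slot for the new entry. */
		for (slot = 0; slot < TOY_MAP_SLOTS && m->used[slot]; slot++)
			;
		if (slot == TOY_MAP_SLOTS)
			return -E2BIG;	/* toy map is full */
		m->used[slot] = 1;
		m->keys[slot] = key;
	}
	m->values[slot] = value;
	return 0;
}
```

                     In this model, BPF_EXIST on a missing key and BPF_NOEXIST
                     on a present key both fail without touching the map,
                     while BPF_ANY always stores the value if room is left.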

              Return 0 on success, or a negative error in case of failure.

       long bpf_map_delete_elem(struct bpf_map *map, const void *key)

              Description
                     Delete entry with key from map.

              Return 0 on success, or a negative error in case of failure.

       long bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr)

              Description
                     For tracing programs, safely attempt to read size bytes
                     from kernel space address unsafe_ptr and store the data
                     in dst.

                     Generally, use bpf_probe_read_user() or
                     bpf_probe_read_kernel() instead.

              Return 0 on success, or a negative error in case of failure.

       u64 bpf_ktime_get_ns(void)

              Description
                     Return the time elapsed since system boot, in
                     nanoseconds. Does not include time the system was
                     suspended. See: clock_gettime(CLOCK_MONOTONIC)
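
                     For comparison, a user-space program can read the clock
                     referenced above and flatten it to nanoseconds. This is
                     a minimal sketch; timespec_to_ns() and monotonic_now_ns()
                     are hypothetical helper names, and it assumes a POSIX
                     system that provides CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

/* Flatten a struct timespec into a single nanosecond count. */
static uint64_t timespec_to_ns(struct timespec ts)
{
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* User-space reading of CLOCK_MONOTONIC in nanoseconds, 0 on failure. */
static uint64_t monotonic_now_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0;
	return timespec_to_ns(ts);
}
```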

              Return Current ktime.

       long bpf_trace_printk(const char *fmt, u32 fmt_size, ...)

              Description
                     This helper is a "printk()-like" facility for debugging.
                     It prints a message defined by format fmt (of size
                     fmt_size) to file /sys/kernel/debug/tracing/trace from
                     DebugFS, if available. It can take up to three additional
                     u64 arguments (as with any eBPF helper, the total number
                     of arguments is limited to five).

                     Each time the helper is called, it appends a line to the
                     trace. Lines are discarded while
                     /sys/kernel/debug/tracing/trace is open; use
                     /sys/kernel/debug/tracing/trace_pipe to avoid this. The
                     format of the trace is customizable, and the exact output
                     one will get depends on the options set in
                     /sys/kernel/debug/tracing/trace_options (see also the
                     README file under the same directory). However, it
                     usually defaults to something like:

                        telnet-470   [001] .N.. 419421.045894: 0x00000001: <formatted msg>

                     In the above:

                        • telnet is the name of the current task.

                        • 470 is the PID of the current task.

                        • 001 is the CPU number on which the task is running.

                        • In .N.., each character refers to a set of options
                          (whether irqs are enabled, scheduling options,
                          whether hard/softirqs are running, level of
                          preempt_disabled respectively). N means that
                          TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED are set.

                        • 419421.045894 is a timestamp.

                        • 0x00000001 is a fake value used by BPF for the
                          instruction pointer register.

                        • <formatted msg> is the message formatted with fmt.

                     The conversion specifiers supported by fmt are similar
                     to, but more limited than, those of printk(). They are
                     %d, %i, %u, %x, %ld, %li, %lu, %lx, %lld, %lli, %llu,
                     %llx, %p, %s. No modifier (size of field, padding with
                     zeroes, etc.) is available, and the helper will return
                     -EINVAL (but print nothing) if it encounters an unknown
                     specifier.
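
                     The specifier rule can be illustrated with a small
                     user-space check. This is a sketch that only mirrors the
                     list above; it is not the kernel's parser, and
                     check_trace_fmt() is a hypothetical name.

```c
#include <errno.h>
#include <string.h>

/*
 * Illustrative check only: accept a format string if every conversion
 * is one of the specifiers listed above. Anything else after '%',
 * including modifiers and "%%", is reported as -EINVAL.
 */
static int check_trace_fmt(const char *fmt)
{
	static const char *const allowed[] = {
		"lld", "lli", "llu", "llx",	/* longest candidates first */
		"ld", "li", "lu", "lx",
		"d", "i", "u", "x", "p", "s",
	};
	const size_t n = sizeof(allowed) / sizeof(allowed[0]);

	for (const char *p = fmt; *p != '\0'; p++) {
		size_t i;

		if (*p != '%')
			continue;
		for (i = 0; i < n; i++) {
			size_t len = strlen(allowed[i]);

			if (strncmp(p + 1, allowed[i], len) == 0) {
				p += len;	/* skip the conversion */
				break;
			}
		}
		if (i == n)
			return -EINVAL;		/* unknown specifier */
	}
	return 0;
}
```

                     With this model, "pid %d comm %s\n" is accepted, while
                     "%5d\n" is rejected because of the field-size modifier.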

                     Also, note that bpf_trace_printk() is slow, and should
                     only be used for debugging purposes. For this reason,
                     the first time this helper is used (or more precisely,
                     when trace_printk() buffers are allocated), a notice
                     block spanning several lines is printed to the kernel
                     logs, stating that the helper should not be used "for
                     production use". For passing values to user space, perf
                     events should be preferred.

              Return The number of bytes written to the buffer, or a negative
                     error in case of failure.

       u32 bpf_get_prandom_u32(void)

              Description
                     Get a pseudo-random number.

                     From a security point of view, this helper uses its own
                     pseudo-random internal state, and cannot be used to infer
                     the seed of other random functions in the kernel.
                     However, it is essential to note that the generator used
                     by the helper is not cryptographically secure.

              Return A pseudo-random 32-bit unsigned value.

       u32 bpf_get_smp_processor_id(void)

              Description
                     Get the SMP (symmetric multiprocessing) processor id.
                     Note that all programs run with migration disabled, which
                     means that the SMP processor id is stable throughout the
                     execution of the program.

              Return The SMP id of the processor running the program.

       long bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void
       *from, u32 len, u64 flags)

              Description
                     Store len bytes from address from into the packet
                     associated with skb, at offset. flags are a combination
                     of BPF_F_RECOMPUTE_CSUM (automatically recompute the
                     checksum for the packet after storing the bytes) and
                     BPF_F_INVALIDATE_HASH (set skb->hash, skb->swhash and
                     skb->l4hash to 0).

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       long bpf_l3_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64
       to, u64 size)

              Description
                     Recompute the layer 3 (e.g. IP) checksum for the packet
                     associated with skb. Computation is incremental, so the
                     helper must know the former value of the header field
                     that was modified (from), the new value of this field
                     (to), and the number of bytes (2 or 4) for this field,
                     stored in size. Alternatively, it is possible to store
                     the difference between the previous and the new values of
                     the header field in to, by setting from and size to 0.
                     For both methods, offset indicates the location of the IP
                     checksum within the packet.
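
                     The idea of an incremental update can be sketched in user
                     space for a single 2-byte field. The code below follows
                     the well-known formula from RFC 1624 and is only a model
                     of the arithmetic, not the helper's implementation;
                     fold16(), csum_full() and csum_replace2() are
                     hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit sum into 16 bits using one's-complement addition. */
static uint16_t fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Full Internet checksum over n 16-bit words (reference computation). */
static uint16_t csum_full(const uint16_t *words, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += words[i];
	return (uint16_t)~fold16(sum);
}

/*
 * Incremental update for one 16-bit field, following RFC 1624:
 * new_csum = ~(~old_csum + ~from + to), in one's-complement arithmetic.
 */
static uint16_t csum_replace2(uint16_t old_csum, uint16_t from, uint16_t to)
{
	uint32_t sum = (uint16_t)~old_csum;

	sum += (uint16_t)~from;
	sum += to;
	return (uint16_t)~fold16(sum);
}
```

                     Updating one word this way gives the same result as
                     recomputing the checksum over the whole header, without
                     reading the other words again.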

                     This helper works in combination with bpf_csum_diff(),
                     which does not update the checksum in-place, but offers
                     more flexibility and can handle sizes larger than 2 or 4
                     for the checksum to update.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       long bpf_l4_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64
       to, u64 flags)

              Description
                     Recompute the layer 4 (e.g. TCP, UDP or ICMP) checksum
                     for the packet associated with skb. Computation is
                     incremental, so the helper must know the former value of
                     the header field that was modified (from), the new value
                     of this field (to), and the number of bytes (2 or 4) for
                     this field, stored in the lowest four bits of flags.
                     Alternatively, it is possible to store the difference
                     between the previous and the new values of the header
                     field in to, by setting from and the four lowest bits of
                     flags to 0. For both methods, offset indicates the
                     location of the layer 4 checksum within the packet. In
                     addition to the size of the field, actual flags can be
                     combined into flags with a bitwise OR. With
                     BPF_F_MARK_MANGLED_0, a null checksum is left untouched
                     (unless BPF_F_MARK_ENFORCE is added as well), and for
                     updates resulting in a null checksum the value is set to
                     CSUM_MANGLED_0 instead. Flag BPF_F_PSEUDO_HDR indicates
                     the checksum is to be computed against a pseudo-header.
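
                     The layout of flags described above can be sketched as
                     follows. The constants reproduce the values from the
                     Linux UAPI header <linux/bpf.h>; l4_csum_flags() and
                     l4_csum_field_size() are hypothetical helper names used
                     only for this illustration.

```c
#include <stdint.h>

/* Values as defined in the Linux UAPI header <linux/bpf.h>. */
#define BPF_F_HDR_FIELD_MASK	0xfULL		/* low four bits: field size */
#define BPF_F_PSEUDO_HDR	(1ULL << 4)
#define BPF_F_MARK_MANGLED_0	(1ULL << 5)
#define BPF_F_MARK_ENFORCE	(1ULL << 6)

/* Combine the field size (0, 2 or 4) with option bits. */
static uint64_t l4_csum_flags(uint64_t field_size, uint64_t options)
{
	return (field_size & BPF_F_HDR_FIELD_MASK) |
	       (options & ~BPF_F_HDR_FIELD_MASK);
}

/* Recover the field size stored in the low four bits. */
static uint64_t l4_csum_field_size(uint64_t flags)
{
	return flags & BPF_F_HDR_FIELD_MASK;
}
```

                     For a 4-byte field covered by a pseudo-header, a caller
                     would pass l4_csum_flags(4, BPF_F_PSEUDO_HDR) as flags.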

                     This helper works in combination with bpf_csum_diff(),
                     which does not update the checksum in-place, but offers
                     more flexibility and can handle sizes larger than 2 or 4
                     for the checksum to update.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       long bpf_tail_call(void *ctx, struct bpf_map *prog_array_map, u32
       index)

              Description
                     This special helper is used to trigger a "tail call", or
                     in other words, to jump into another eBPF program. The
                     same stack frame is used (but values on stack and in
                     registers for the caller are not accessible to the
                     callee). This mechanism allows for program chaining,
                     either for raising the maximum number of available eBPF
                     instructions, or to execute given programs in conditional
                     blocks. For security reasons, there is an upper limit to
                     the number of successive tail calls that can be
                     performed.

                     Upon call of this helper, the program attempts to jump
                     into a program referenced at index index in
                     prog_array_map, a special map of type
                     BPF_MAP_TYPE_PROG_ARRAY, and passes ctx, a pointer to the
                     context.

                     If the call succeeds, the kernel immediately runs the
                     first instruction of the new program. This is not a
                     function call, and it never returns to the previous
                     program. If the call fails, then the helper has no
                     effect, and the caller continues to run its subsequent
                     instructions. A call can fail if the destination program
                     for the jump does not exist (for instance, index is
                     greater than or equal to the number of entries in
                     prog_array_map), or if the maximum number of tail calls
                     has been reached for this chain of programs. This limit
                     is defined in the kernel by the macro MAX_TAIL_CALL_CNT
                     (not accessible to user space), which is currently set to
                     33.
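
                     The control flow can be pictured with a toy user-space
                     model. Everything below is hypothetical (struct toy_ctx,
                     toy_prog, toy_run() and the two example programs), and
                     it simplifies one point: when a jump fails, the toy
                     program simply ends, whereas a real caller would go on
                     with its remaining instructions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_TAIL_CALLS 33		/* limit quoted in the text above */
#define TOY_NO_TAIL_CALL   UINT32_MAX	/* hypothetical "no jump requested" */

/* Hypothetical per-run context shared by all programs in the chain. */
struct toy_ctx {
	uint32_t runs;		/* how many programs have executed */
};

/* A toy program returns the slot it wants to jump to. */
typedef uint32_t (*toy_prog)(struct toy_ctx *ctx);

/* Run the chain starting at entry; returns the number of jumps taken. */
static uint32_t toy_run(toy_prog const *progs, size_t n_progs,
			toy_prog entry, struct toy_ctx *ctx)
{
	toy_prog cur = entry;
	uint32_t jumps = 0;

	for (;;) {
		uint32_t next = cur(ctx);

		/* The jump fails if the slot is out of range or empty... */
		if (next >= n_progs || progs[next] == NULL)
			break;
		/* ...or if the chain already used up its tail calls. */
		if (jumps >= TOY_MAX_TAIL_CALLS)
			break;
		jumps++;
		cur = progs[next];	/* no return to the previous program */
	}
	return jumps;
}

/* Example program: always tries to jump back into slot 0 (itself). */
static uint32_t toy_loop_prog(struct toy_ctx *ctx)
{
	ctx->runs++;
	return 0;
}

/* Example program: does its work and requests no jump. */
static uint32_t toy_leaf_prog(struct toy_ctx *ctx)
{
	ctx->runs++;
	return TOY_NO_TAIL_CALL;
}
```

                     A program that keeps jumping into itself is stopped by
                     the limit instead of looping forever, which is the
                     purpose of the upper bound mentioned above.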

              Return 0 on success, or a negative error in case of failure.

       long bpf_clone_redirect(struct sk_buff *skb, u32 ifindex, u64 flags)

              Description
                     Clone and redirect the packet associated with skb to
                     another net device of index ifindex. Both ingress and
                     egress interfaces can be used for redirection. The
                     BPF_F_INGRESS value in flags is used to make the
                     distinction (ingress path is selected if the flag is
                     present, egress path otherwise). This is the only flag
                     supported for now.

                     In comparison with the bpf_redirect() helper,
                     bpf_clone_redirect() has the associated cost of
                     duplicating the packet buffer, but the redirection is
                     carried out from within the eBPF program. Conversely,
                     bpf_redirect() is more efficient, but it is handled
                     through an action code where the redirection happens only
                     after the eBPF program has returned.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       u64 bpf_get_current_pid_tgid(void)

              Description
                     Get the current pid and tgid.

              Return A 64-bit integer containing the current tgid and pid, and
                     created as such: current_task->tgid << 32 |
                     current_task->pid.

       u64 bpf_get_current_uid_gid(void)

              Description
                     Get the current uid and gid.

              Return A 64-bit integer containing the current GID and UID, and
                     created as such: current_gid << 32 | current_uid.
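
                     Both bpf_get_current_pid_tgid() and
                     bpf_get_current_uid_gid() use the same packing, so a
                     caller usually splits the result into two 32-bit halves.
                     A minimal sketch, with hypothetical helper names
                     pack_hi_lo(), upper32() and lower32():

```c
#include <stdint.h>

/* Build the 64-bit value: first argument in the upper 32 bits. */
static uint64_t pack_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Upper 32 bits: the tgid (or the GID for bpf_get_current_uid_gid()). */
static uint32_t upper32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Lower 32 bits: the pid (or the UID for bpf_get_current_uid_gid()). */
static uint32_t lower32(uint64_t v)
{
	return (uint32_t)v;
}
```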

       long bpf_get_current_comm(void *buf, u32 size_of_buf)

              Description
                     Copy the comm attribute of the current task into buf of
                     size_of_buf. The comm attribute contains the name of the
                     executable (excluding the path) for the current task. The
                     size_of_buf must be strictly positive. On success, the
                     helper makes sure that buf is NUL-terminated. On failure,
                     it is filled with zeroes.

              Return 0 on success, or a negative error in case of failure.

       u32 bpf_get_cgroup_classid(struct sk_buff *skb)

              Description
                     Retrieve the classid for the current task, i.e. for the
                     net_cls cgroup to which skb belongs.

                     This helper can be used on TC egress path, but not on
                     ingress.

                     The net_cls cgroup provides an interface to tag network
                     packets based on a user-provided identifier for all
                     traffic coming from the tasks belonging to the related
                     cgroup. See also the related kernel documentation,
                     available from the Linux sources in file
                     Documentation/admin-guide/cgroup-v1/net_cls.rst.

                     The Linux kernel has two versions for cgroups: there are
                     cgroups v1 and cgroups v2. Both are available to users,
                     who can use a mixture of them, but note that the net_cls
                     cgroup is for cgroup v1 only. This makes it incompatible
                     with BPF programs run on cgroups, which is a
                     cgroup-v2-only feature (a socket can only hold data for
                     one version of cgroups at a time).

                     This helper is only available if the kernel was compiled
                     with the CONFIG_CGROUP_NET_CLASSID configuration option
                     set to "y" or to "m".

              Return The classid, or 0 for the default unconfigured classid.

       long bpf_skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16
       vlan_tci)

              Description
                     Push a vlan_tci (VLAN tag control information) of
                     protocol vlan_proto to the packet associated with skb,
                     then update the checksum. Note that if vlan_proto is
                     different from ETH_P_8021Q and ETH_P_8021AD, it is
                     considered to be ETH_P_8021Q.
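
                     As an illustration, the tag control information follows
                     the standard 802.1Q layout (3-bit priority, 1-bit
                     drop-eligible indicator, 12-bit VLAN ID). The sketch
                     below builds such a value and models the protocol
                     fallback described above; vlan_tci() and
                     vlan_effective_proto() are hypothetical names, and
                     values are kept in host byte order for simplicity,
                     whereas the helper takes vlan_proto as a __be16.

```c
#include <stdint.h>

/* EtherType values from <linux/if_ether.h>, in host byte order. */
#define ETH_P_8021Q	0x8100
#define ETH_P_8021AD	0x88A8

/* Build a TCI: 3-bit priority, 1-bit drop-eligible flag, 12-bit VLAN ID. */
static uint16_t vlan_tci(uint8_t pcp, uint8_t dei, uint16_t vid)
{
	return (uint16_t)(((pcp & 0x7) << 13) | ((dei & 0x1) << 12) |
			  (vid & 0x0fff));
}

/* Model of the fallback: any other protocol is treated as ETH_P_8021Q. */
static uint16_t vlan_effective_proto(uint16_t proto)
{
	if (proto != ETH_P_8021Q && proto != ETH_P_8021AD)
		return ETH_P_8021Q;
	return proto;
}
```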

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       long bpf_skb_vlan_pop(struct sk_buff *skb)

              Description
                     Pop a VLAN header from the packet associated with skb.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       long bpf_skb_get_tunnel_key(struct sk_buff *skb, struct bpf_tunnel_key
       *key, u32 size, u64 flags)

              Description
                     Get tunnel metadata. This helper takes a pointer key to
                     an empty struct bpf_tunnel_key of size, which will be
                     filled with tunnel metadata for the packet associated
                     with skb. The flags can be set to BPF_F_TUNINFO_IPV6,
                     which indicates that the tunnel is based on IPv6 protocol
                     instead of IPv4.

                     The struct bpf_tunnel_key is an object that generalizes
                     the principal parameters used by various tunneling
                     protocols into a single struct. This way, it can be used
                     to easily make a decision based on the contents of the
                     encapsulation header, "summarized" in this struct. In
                     particular, it holds the IP address of the remote end
                     (IPv4 or IPv6, depending on the case) in key->remote_ipv4
                     or key->remote_ipv6. Also, this struct exposes the
                     key->tunnel_id, which is generally mapped to a VNI
                     (Virtual Network Identifier), making it programmable
                     together with the bpf_skb_set_tunnel_key() helper.

                     Let's imagine that the following code is part of a
                     program attached to the TC ingress interface, on one end
                     of a GRE tunnel, and is supposed to filter out all
                     messages coming from remote ends with IPv4 address other
                     than 10.0.0.1:

                        int ret;
                        struct bpf_tunnel_key key = {};

                        ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0);
                        if (ret < 0)
                                return TC_ACT_SHOT;     // drop packet

                        if (key.remote_ipv4 != 0x0a000001)
                                return TC_ACT_SHOT;     // drop packet

                        return TC_ACT_OK;               // accept packet
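
                     The constant 0x0a000001 in the code above is the address
                     10.0.0.1 written with its first octet in the most
                     significant byte. A small sketch of that conversion,
                     using a hypothetical helper ipv4_host():

```c
#include <stdint.h>

/* Assemble an IPv4 address value from its four dotted-quad octets. */
static uint32_t ipv4_host(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | d;
}
```

                     Writing the comparison as ipv4_host(10, 0, 0, 1) keeps
                     the dotted-quad form readable in the source.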

                     This interface can also be used with all encapsulation
                     devices that can operate in "collect metadata" mode:
                     instead of having one network device per specific
                     configuration, the "collect metadata" mode only requires
                     a single device where the configuration can be extracted
                     from this helper.

                     This can be used together with various tunnels such as
                     VXLAN, Geneve, GRE or IP in IP (IPIP).

              Return 0 on success, or a negative error in case of failure.

       long bpf_skb_set_tunnel_key(struct sk_buff *skb, struct bpf_tunnel_key
       *key, u32 size, u64 flags)

              Description
                     Populate tunnel metadata for packet associated with skb.
                     The tunnel metadata is set to the contents of key, of
                     size. The flags can be set to a combination of the
                     following values:

                     BPF_F_TUNINFO_IPV6
                            Indicate that the tunnel is based on IPv6 protocol
                            instead of IPv4.

                     BPF_F_ZERO_CSUM_TX
                            For IPv4 packets, add a flag to tunnel metadata
                            indicating that checksum computation should be
                            skipped and checksum set to zeroes.

                     BPF_F_DONT_FRAGMENT
                            Add a flag to tunnel metadata indicating that the
                            packet should not be fragmented.

                     BPF_F_SEQ_NUMBER
                            Add a flag to tunnel metadata indicating that a
                            sequence number should be added to tunnel header
                            before sending the packet. This flag was added for
                            GRE encapsulation, but might be used with other
                            protocols as well in the future.

                     Here is a typical usage on the transmit path:

                        struct bpf_tunnel_key key;
                        /* populate key ... */
                        bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0);
                        bpf_clone_redirect(skb, vxlan_dev_ifindex, 0);

                     See also the description of the bpf_skb_get_tunnel_key()
                     helper for additional information.

              Return 0 on success, or a negative error in case of failure.

       u64 bpf_perf_event_read(struct bpf_map *map, u64 flags)

              Description
                     Read the value of a perf event counter. This helper
                     relies on a map of type BPF_MAP_TYPE_PERF_EVENT_ARRAY.
                     The nature of the perf event counter is selected when map
                     is updated with perf event file descriptors. The map is
                     an array whose size is the number of available CPUs, and
                     each cell contains a value relative to one CPU. The value
                     to retrieve is indicated by flags, which contains the
                     index of the CPU to look up, masked with
                     BPF_F_INDEX_MASK. Alternatively, flags can be set to
                     BPF_F_CURRENT_CPU to indicate that the value for the
                     current CPU should be retrieved.
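
                     How flags selects a cell can be modeled in a few lines.
                     The two constants reproduce the values from the Linux
                     UAPI header <linux/bpf.h>; perf_index_from_flags() and
                     its current_cpu parameter are hypothetical and only
                     stand in for the kernel-side lookup.

```c
#include <stdint.h>

/* Values as defined in the Linux UAPI header <linux/bpf.h>. */
#define BPF_F_INDEX_MASK	0xffffffffULL
#define BPF_F_CURRENT_CPU	BPF_F_INDEX_MASK

/*
 * Model of the index selection: current_cpu stands in for the CPU the
 * program is running on (a hypothetical parameter for this sketch).
 */
static uint32_t perf_index_from_flags(uint64_t flags, uint32_t current_cpu)
{
	uint64_t index = flags & BPF_F_INDEX_MASK;

	if (index == BPF_F_CURRENT_CPU)
		return current_cpu;
	return (uint32_t)index;
}
```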

                     Note that before Linux 4.13, only hardware perf events
                     can be retrieved.

                     Also, be aware that the newer helper
                     bpf_perf_event_read_value() is recommended over
                     bpf_perf_event_read() in general. The latter has some ABI
                     quirks where error and counter value are used as a return
                     code (which is wrong to do since ranges may overlap).
                     This issue is fixed with bpf_perf_event_read_value(),
                     which at the same time provides more features over the
                     bpf_perf_event_read() interface. Please refer to the
                     description of bpf_perf_event_read_value() for details.

              Return The value of the perf event counter read from the map, or
                     a negative error code in case of failure.

       long bpf_redirect(u32 ifindex, u64 flags)

              Description
                     Redirect the packet to another net device of index
                     ifindex. This helper is somewhat similar to
                     bpf_clone_redirect(), except that the packet is not
                     cloned, which provides increased performance.

                     Except for XDP, both ingress and egress interfaces can be
                     used for redirection. The BPF_F_INGRESS value in flags is
                     used to make the distinction (ingress path is selected if
                     the flag is present, egress path otherwise). Currently,
                     XDP only supports redirection to the egress interface,
                     and accepts no flag at all.

                     The same effect can also be attained with the more
                     generic bpf_redirect_map(), which uses a BPF map to store
                     the redirect target instead of providing it directly to
                     the helper.

              Return For XDP, the helper returns XDP_REDIRECT on success or
                     XDP_ABORTED on error. For other program types, the values
                     are TC_ACT_REDIRECT on success or TC_ACT_SHOT on error.
563
564       u32 bpf_get_route_realm(struct sk_buff *skb)
565
566              Description
567                     Retrieve the realm or the  route,  that  is  to  say  the
568                     tclassid  field of the destination for the skb. The iden‐
569                     tifier retrieved is a user-provided tag, similar  to  the
570                     one  used  with  the  net_cls cgroup (see description for
571                     bpf_get_cgroup_classid() helper), but here  this  tag  is
572                     held by a route (a destination entry), not by a task.
573
574                     Retrieving  this  identifier  works  with  the  clsact TC
575                     egress hook (see also  tc-bpf(8)),  or  alternatively  on
576                     conventional  classful  egress  qdiscs,  but  not  on  TC
577                     ingress path. In case of clsact TC egress hook, this  has
578                     the advantage that, internally, the destination entry has
579                     not been dropped yet in the transmit path. Therefore, the
580                     destination  entry  does not need to be artificially held
581                     via netif_keep_dst() for a classful qdisc until  the  skb
582                     is freed.
583
584                     This  helper is available only if the kernel was compiled
585                     with CONFIG_IP_ROUTE_CLASSID configuration option.
586
587              Return The realm of the route for the packet associated to  skb,
588                     or 0 if none was found.
589
590       long  bpf_perf_event_output(void  *ctx, struct bpf_map *map, u64 flags,
591       void *data, u64 size)
592
593              Description
594                     Write raw data blob into a special BPF perf event held by
595                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
596                     event must have the following attributes: PERF_SAMPLE_RAW
597                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
598                     PERF_COUNT_SW_BPF_OUTPUT as config.
599
600                     The flags are used to indicate the index in map for which
601                     the value must be put, masked with BPF_F_INDEX_MASK.  Al‐
602                     ternatively, flags can be set to BPF_F_CURRENT_CPU to in‐
603                     dicate  that  the index of the current CPU core should be
604                     used.
605
606                     The value to write, of size, is passed through eBPF stack
607                     and pointed by data.
608
609                     The  context  of  the program ctx also needs to be passed
610                     to the helper.
611
612                     In user space, a program willing to read the values needs
613                     to  call  perf_event_open() on the perf event (either for
614                     one or for all CPUs) and to  store  the  file  descriptor
615                     into  the  map. This must be done before the eBPF program
616                     can send data into it. An example is  available  in  file
617                     samples/bpf/trace_output_user.c   in   the  Linux  kernel
618                     source tree (the eBPF  program  counterpart  is  in  sam‐
619                     ples/bpf/trace_output_kern.c).
620
621                     bpf_perf_event_output()  achieves better performance than
622                     bpf_trace_printk() for sharing data with user space,  and
623                     is much better suited for streaming data from  eBPF  pro‐
624                     grams.
625
626                     Note that this helper is not restricted  to  tracing  use
627                     cases and can be used with programs attached to TC or XDP
628                     as well, where it allows for passing data to  user  space
629                     listeners. Data can be:
630
631                     • Only custom structs,
632
633                     • Only the packet payload, or
634
635                     • A combination of both.
636
637              Return 0 on success, or a negative error in case of failure.
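
                     The layout of the flags argument described above can be
                     sketched in plain user-space C. This is only a model of
                     the bit layout, not BPF program code. The constants are
                     redefined locally with the values used by the kernel
                     UAPI header, perf_output_flags() is a hypothetical name,
                     and the payload length field (BPF_F_CTXLEN_MASK) is an
                     assumption taken from that header rather than from the
                     text above.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirrored from <linux/bpf.h>; redefined locally (assumption) so
 * that the sketch builds without kernel headers. */
#define BPF_F_INDEX_MASK   0xffffffffULL
#define BPF_F_CURRENT_CPU  BPF_F_INDEX_MASK
#define BPF_F_CTXLEN_MASK  (0xfffffULL << 32)

/* Hypothetical helper: build the flags argument for
 * bpf_perf_event_output(). index is either a slot in the perf event
 * array or BPF_F_CURRENT_CPU; payload_len is the number of packet bytes
 * to append after the custom data (0 for none). */
static uint64_t perf_output_flags(uint64_t index, uint32_t payload_len)
{
    uint64_t flags = index & BPF_F_INDEX_MASK;

    /* The length has to fit in the 20-bit field above the index. */
    assert(payload_len <= 0xfffff);
    flags |= ((uint64_t)payload_len << 32) & BPF_F_CTXLEN_MASK;
    return flags;
}
```

                     With this model, perf_output_flags(BPF_F_CURRENT_CPU, 0)
                     selects the slot of the current CPU and sends custom
                     data only, while a non-zero payload_len corresponds to
                     the "combination of both" case listed above.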
638
639       long bpf_skb_load_bytes(const void *skb, u32 offset, void *to, u32 len)
640
641              Description
642                     This helper was provided as an easy way to load data from
643                     a packet. It can be used to load len  bytes  from  offset
644                     from  the  packet  associated  to  skb,  into  the buffer
645                     pointed by to.
646
647                     Since Linux 4.7, usage of this helper has mostly been re‐
648                     placed by "direct packet access", enabling packet data to
649                     be manipulated with skb->data and skb->data_end  pointing
650                     respectively  to the first byte of packet data and to the
651                     byte after the last byte of packet data. However, it  re‐
652                     mains  useful  if  one wishes to read large quantities of
653                     data at once from a packet into the eBPF stack.
654
655              Return 0 on success, or a negative error in case of failure.
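
                     The relationship between data, data_end and a bounded
                     copy can be modelled in user space as follows. This is
                     a sketch, not the kernel implementation: struct
                     pkt_view and load_bytes() are hypothetical names, and
                     -EFAULT is used here simply as an example of a negative
                     error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a packet: data points to the first byte,
 * data_end to the byte after the last byte of packet data. */
struct pkt_view {
    const uint8_t *data;
    const uint8_t *data_end;
};

/* Copy len bytes starting at offset into to, refusing any access that
 * would cross data_end. */
static int load_bytes(const struct pkt_view *pkt, uint32_t offset,
                      void *to, uint32_t len)
{
    size_t avail;

    assert(pkt->data <= pkt->data_end);
    avail = (size_t)(pkt->data_end - pkt->data);

    /* Written this way so that offset + len cannot overflow. */
    if (offset > avail || len > avail - offset)
        return -EFAULT;

    memcpy(to, pkt->data + offset, len);
    return 0;
}
```

                     The same comparison against data_end is what a program
                     using direct packet access has to perform itself before
                     every read.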
656
657       long bpf_get_stackid(void *ctx, struct bpf_map *map, u64 flags)
658
659              Description
660                     Walk a user or a kernel  stack  and  return  its  id.  To
661                     achieve this, the helper needs ctx, which is a pointer to
662                     the context on which the tracing program is executed, and
663                     a pointer to a map of type BPF_MAP_TYPE_STACK_TRACE.
664
665                     The  last  argument,  flags,  holds  the  number of stack
666                     frames  to  skip   (from   0   to   255),   masked   with
667                     BPF_F_SKIP_FIELD_MASK. The next bits can be used to set a
668                     combination of the following flags:
669
670                     BPF_F_USER_STACK
671                            Collect a user space stack  instead  of  a  kernel
672                            stack.
673
674                     BPF_F_FAST_STACK_CMP
675                            Compare stacks by hash only.
676
677                     BPF_F_REUSE_STACKID
678                            If   two  different  stacks  hash  into  the  same
679                            stackid, discard the old one.
680
681                     The stack id retrieved  is  a  32-bit  integer  handle
682                     which  can be further combined with other data (including
683                     other stack ids) and used as a key into maps. This can be
684                     useful  for generating a variety of graphs (such as flame
685                     graphs or off-cpu graphs).
686
687                     For walking a stack, this helper is an  improvement  over
688                     bpf_probe_read(),  which  can be used with unrolled loops
689                     but is not efficient and consumes a lot of eBPF  instruc‐
690                     tions.   Instead,  bpf_get_stackid()  can  collect  up to
691                     PERF_MAX_STACK_DEPTH kernel  or  user  frames.  Note
692                     that  this  limit  can be controlled with the sysctl pro‐
693                     gram, and that it should be manually increased  in  order
694                     to profile long user stacks (such as stacks for Java pro‐
695                     grams). To do so, use:
696
697                        # sysctl kernel.perf_event_max_stack=<new value>
698
699              Return The non-negative stack id on success, or  a  negative
700                     error in case of failure.
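
                     As an illustration of how the flags argument is
                     assembled, here is a user-space sketch. The constants
                     are redefined locally with the values found in the
                     kernel UAPI header; stackid_flags() and struct
                     stack_key are hypothetical names introduced for this
                     example.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirrored from <linux/bpf.h>; redefined locally (assumption). */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_FAST_STACK_CMP  (1ULL << 9)
#define BPF_F_REUSE_STACKID   (1ULL << 10)

/* Combine the number of frames to skip (low 8 bits) with option bits. */
static uint64_t stackid_flags(uint32_t skip, uint64_t options)
{
    /* Option bits must not overlap the skip field. */
    assert((options & BPF_F_SKIP_FIELD_MASK) == 0);
    return ((uint64_t)skip & BPF_F_SKIP_FIELD_MASK) | options;
}

/* Hypothetical map key pairing a kernel stack id with a user stack id,
 * for example to count events per (kernel, user) stack pair. */
struct stack_key {
    uint32_t kernel_stack_id;
    uint32_t user_stack_id;
};
```

                     Note that the mask silently truncates skip values above
                     255, so a caller may prefer to reject them explicitly.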
701
702       s64 bpf_csum_diff(__be32 *from, u32 from_size, __be32 *to, u32 to_size,
703       __wsum seed)
704
705              Description
706                     Compute  a  checksum  difference,  from  the  raw  buffer
707                     pointed by from, of length from_size (that must be a mul‐
708                     tiple of 4), towards the raw buffer  pointed  by  to,  of
709                     size to_size (same remark). An optional seed can be added
710                     to the value (this can be cascaded,  the  seed  may  come
711                     from a previous call to the helper).
712
713                     This is flexible enough to be used in several ways:
714
715                     • With from_size == 0, to_size > 0 and seed set to check‐
716                       sum, it can be used when pushing new data.
717
718                     • With from_size > 0, to_size == 0 and seed set to check‐
719                       sum, it can be used when removing data from a packet.
720
721                     • With  from_size  > 0, to_size > 0 and seed set to 0, it
722                       can be used to compute a diff. Note that from_size  and
723                       to_size do not need to be equal.
724
725                     This   helper   can   be   used   in   combination   with
726                     bpf_l3_csum_replace() and bpf_l4_csum_replace(), to which
727                     one   can   feed   in   the   difference   computed  with
728                     bpf_csum_diff().
729
730              Return The checksum result, or a negative error code in case  of
731                     failure.
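
                     The three usages listed above can be checked with a
                     small user-space model of the arithmetic. This is a
                     sketch under assumptions, not the kernel code: words
                     are treated as host-order 32-bit integers, the sum is a
                     32-bit one's complement sum, and csum_diff_sketch(),
                     add32_ones() and csum_fold16() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit one's complement addition (end-around carry). */
static uint32_t add32_ones(uint32_t a, uint32_t b)
{
    uint64_t s = (uint64_t)a + b;

    s = (s & 0xffffffffu) + (s >> 32);
    return (uint32_t)s;
}

/* Model of the helper: subtract the words of from (by adding their
 * complement), add the words of to, starting from seed. */
static int64_t csum_diff_sketch(const uint32_t *from, uint32_t from_size,
                                const uint32_t *to, uint32_t to_size,
                                uint32_t seed)
{
    uint32_t acc = seed;
    uint32_t i;

    assert(from != NULL || from_size == 0);
    assert(to != NULL || to_size == 0);
    if ((from_size | to_size) & 3)
        return -EINVAL;     /* sizes must be multiples of 4 */

    for (i = 0; i < from_size / 4; i++)
        acc = add32_ones(acc, ~from[i]);
    for (i = 0; i < to_size / 4; i++)
        acc = add32_ones(acc, to[i]);
    return acc;
}

/* Fold a 32-bit sum into 16 bits, as done for Internet checksums. */
static uint16_t csum_fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

                     In this model, summing a whole buffer corresponds to
                     from_size == 0, and feeding that sum back as seed while
                     replacing one word gives, after folding, the same value
                     as summing the modified buffer from scratch.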
732
733       long bpf_skb_get_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
734
735              Description
736                     Retrieve  tunnel  options metadata for the packet associ‐
737                     ated to skb, and store the raw tunnel option data to  the
738                     buffer opt of size.
739
740                     This  helper  can be used with encapsulation devices that
741                     can operate in "collect metadata" mode (please  refer  to
742                     the  related  note in the description of bpf_skb_get_tun‐
743                     nel_key() for more details). A particular  example  where
744                     this can be used is in combination with the Geneve encap‐
745                     sulation protocol, where  it  allows  for  pushing  (with
746                     bpf_skb_set_tunnel_opt() helper) and retrieving arbitrary
747                     TLVs (Type-Length-Value headers) from the  eBPF  program.
748                     This allows for full customization of these headers.
749
750              Return The size of the option data retrieved.
751
752       long bpf_skb_set_tunnel_opt(struct sk_buff *skb, void *opt, u32 size)
753
754              Description
755                     Set  tunnel options metadata for the packet associated to
756                     skb to the option data contained in the raw buffer opt of
757                     size.
758
759                     See  also the description of the bpf_skb_get_tunnel_opt()
760                     helper for additional information.
761
762              Return 0 on success, or a negative error in case of failure.
763
764       long bpf_skb_change_proto(struct sk_buff *skb, __be16 proto, u64 flags)
765
766              Description
767                     Change the protocol of the skb to proto.  Currently  sup‐
768                     ported are transition from IPv4 to IPv6, and from IPv6 to
769                     IPv4. The helper takes care of  the  groundwork  for  the
770                     transition,  including  resizing  the  socket buffer. The
771                     eBPF program is expected to fill the new headers, if any,
772                     via  bpf_skb_store_bytes()  and to recompute the checksums
773                     with bpf_l3_csum_replace() and bpf_l4_csum_replace().  The
774                     main use case for this helper is to perform NAT64  opera‐
775                     tions from an eBPF program.
776
777                     Internally, the GSO type is marked as dodgy so that head‐
778                     ers  are  checked  and  segments  are recalculated by the
779                     GSO/GRO engine.  The size for GSO target  is  adapted  as
780                     well.
781
782                     All  values  for flags are reserved for future usage, and
783                     must be left at zero.
784
785                     A call to this helper is susceptible to change the under‐
786                     lying  packet buffer. Therefore, at load time, all checks
787                     on pointers previously done by the verifier  are  invali‐
788                     dated  and must be performed again, if the helper is used
789                     in combination with direct packet access.
790
791              Return 0 on success, or a negative error in case of failure.
792
793       long bpf_skb_change_type(struct sk_buff *skb, u32 type)
794
795              Description
796                     Change the packet type for the packet associated to  skb.
797                     This  comes down to setting skb->pkt_type to type, except
798                     that  the  eBPF  program  has   no   write   access   to
799                     skb->pkt_type other than through this helper.  Using  a
800                     helper here allows for graceful handling of errors.
801
802                     The  major  use  case  is  to  change  incoming  skbs  to
803                     PACKET_HOST  in a programmatic way instead of having to
804                     recirculate via  bpf_redirect(...,  BPF_F_INGRESS),  for
805                     example.
806
807                     Note  that type only allows certain values. At this time,
808                     they are:
809
810                     PACKET_HOST
811                            Packet is for us.
812
813                     PACKET_BROADCAST
814                            Send packet to all.
815
816                     PACKET_MULTICAST
817                            Send packet to group.
818
819                     PACKET_OTHERHOST
820                            Send packet to someone else.
821
822              Return 0 on success, or a negative error in case of failure.
823
824       long bpf_skb_under_cgroup(struct sk_buff *skb, struct bpf_map *map, u32
825       index)
826
827              Description
828                     Check  whether skb is a descendant of the cgroup2 held by
829                     map of type BPF_MAP_TYPE_CGROUP_ARRAY, at index.
830
831              Return The return value depends on the result of the  test,  and
832                     can be:
833
834                     • 0, if the skb failed the cgroup2 descendant test.
835
836                     • 1, if the skb passed the cgroup2 descendant test.
837
838                     • A negative error code, if an error occurred.
839
840       u32 bpf_get_hash_recalc(struct sk_buff *skb)
841
842              Description
843                     Retrieve  the hash of the packet, skb->hash. If it is not
844                     set, in particular if the hash was cleared  due  to  man‐
845                     gling,  recompute  this  hash. Later accesses to the hash
846                     can be done directly with skb->hash.
847
848                     Calling  bpf_set_hash_invalid(),  changing  the   packet
849                     protocol   with   bpf_skb_change_proto(),   or   calling
850                     bpf_skb_store_bytes() with the BPF_F_INVALIDATE_HASH flag
851                     are actions likely to clear the hash and to  trigger  a
852                     new computation on the  next  call  to  bpf_get_hash_re‐
853                     calc().
854
855              Return The 32-bit hash.
856
857       u64 bpf_get_current_task(void)
858
859              Description
860                     Get the current task.
861
862              Return A pointer to the current task struct.
863
864       long bpf_probe_write_user(void *dst, const void *src, u32 len)
865
866              Description
867                     Attempt  in a safe way to write len bytes from the buffer
868                     src to dst in memory. It only works for threads that  are
869                     in  user  context, and dst must be a valid user space ad‐
870                     dress.
871
872                     This helper should not be used to implement any  kind  of
873                     security mechanism because of TOC-TOU attacks, but rather
874                     to debug, divert, and manipulate execution of  semi-coop‐
875                     erative processes.
876
877                     Keep  in mind that this feature is meant for experiments,
878                     and it has a risk of crashing the system and running pro‐
879                     grams.  Therefore, when an eBPF program using this helper
880                     is attached, a warning including PID and process name  is
881                     printed to kernel logs.
882
883              Return 0 on success, or a negative error in case of failure.
884
885       long bpf_current_task_under_cgroup(struct bpf_map *map, u32 index)
886
887              Description
888                     Check  whether the probe is being run in the context of a
889                     given subset of the cgroup2  hierarchy.  The  cgroup2  to
890                     test is held by map of type BPF_MAP_TYPE_CGROUP_ARRAY, at
891                     index.
892
893              Return The return value depends on the result of the  test,  and
894                     can be:
895
896                     • 1, if current task belongs to the cgroup2.
897
898                     • 0, if current task does not belong to the cgroup2.
899
900                     • A negative error code, if an error occurred.
901
902       long bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
903
904              Description
905                     Resize (trim or grow) the packet associated to skb to the
906                     new len. The flags are reserved  for  future  usage,  and
907                     must be left at zero.
908
909                     The  basic  idea  is  that the helper performs the needed
910                     work to change the size of the packet, then the eBPF pro‐
911                     gram    rewrites    the    rest    via    helpers    like
912                     bpf_skb_store_bytes(),             bpf_l3_csum_replace(),
913                     bpf_l4_csum_replace()  and  others. This helper is a slow
914                     path utility intended for replies with control  messages.
915                     And  because it is targeted for slow path, the helper it‐
916                     self can afford to be slow: it implicitly linearizes, un‐
917                     clones and drops offloads from the skb.
918
919                     A call to this helper is susceptible to change the under‐
920                     lying packet buffer. Therefore, at load time, all  checks
921                     on  pointers  previously done by the verifier are invali‐
922                     dated and must be performed again, if the helper is  used
923                     in combination with direct packet access.
924
925              Return 0 on success, or a negative error in case of failure.
926
927       long bpf_skb_pull_data(struct sk_buff *skb, u32 len)
928
929              Description
930                     Pull in non-linear data in case the skb is non-linear and
931                     not all of len are part of the linear section.  Make  len
932                     bytes  from skb readable and writable. If a zero value is
933                     passed for len, then all bytes in the linear part of  skb
934                     will be made readable and writable.
935
936                     This  helper  is only needed for reading and writing with
937                     direct packet access.
938
939                     For direct packet access, testing that offsets to  access
940                     are  within  packet boundaries (test on skb->data_end) is
941                     susceptible to fail if offsets are invalid, or if the re‐
942                     quested  data is in non-linear parts of the skb. On fail‐
943                     ure the program can just bail out, or in the  case  of  a
944                     non-linear  buffer,  use a helper to make the data avail‐
945                     able. The bpf_skb_load_bytes() helper is a first solution
946                     to  access  the  data.  Another  one  consists  in  using
947                     bpf_skb_pull_data() to pull in the non-linear parts once,
948                     then retesting and eventually accessing the data.
949
950                     At  the  same  time,  this also makes sure the skb is un‐
951                     cloned, which is a necessary condition for direct  write.
952                     As this needs to be an invariant for the write part only,
953                     the verifier detects writes and adds a prologue  that  is
954                     calling  bpf_skb_pull_data()  to  effectively unclone the
955                     skb from the very beginning in case it is indeed cloned.
956
957                     A call to this helper is susceptible to change the under‐
958                     lying  packet buffer. Therefore, at load time, all checks
959                     on pointers previously done by the verifier  are  invali‐
960                     dated  and must be performed again, if the helper is used
961                     in combination with direct packet access.
962
963              Return 0 on success, or a negative error in case of failure.
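
                     The "test, pull, retest" sequence described above can
                     be modelled in user space. This is a simplified sketch:
                     struct skb_model tracks only the lengths of the linear
                     part and of the whole packet, all names are
                     hypothetical, and the error value is an assumption made
                     for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: only linear_len bytes are directly accessible. */
struct skb_model {
    uint32_t linear_len;
    uint32_t total_len;
};

/* Equivalent of the bounds test against skb->data_end. */
static bool in_linear(const struct skb_model *skb, uint32_t off, uint32_t len)
{
    return off <= skb->linear_len && len <= skb->linear_len - off;
}

/* Model of the pull: make the first len bytes part of the linear area. */
static int pull_data(struct skb_model *skb, uint32_t len)
{
    assert(skb->linear_len <= skb->total_len);
    if (len > skb->total_len)
        return -ENOMEM;     /* assumption: some negative error */
    if (len > skb->linear_len)
        skb->linear_len = len;
    return 0;
}

/* Test, pull once on failure, then test again before accessing. */
static int access_with_pull(struct skb_model *skb, uint32_t off, uint32_t len)
{
    uint64_t need = (uint64_t)off + len;

    if (in_linear(skb, off, len))
        return 0;
    if (need > skb->total_len || pull_data(skb, (uint32_t)need) < 0)
        return -1;          /* offsets are invalid: bail out */
    return in_linear(skb, off, len) ? 0 : -1;
}
```

                     The second call to in_linear() mirrors the requirement
                     that pointer checks be repeated after the helper has
                     been called.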
964
965       s64 bpf_csum_update(struct sk_buff *skb, __wsum csum)
966
967              Description
968                     Add the checksum csum into skb->csum in case  the  driver
969                     has  supplied  a checksum for the entire packet into that
970                     field. Return an error otherwise. This helper is intended
971                     to  be  used in combination with bpf_csum_diff(), in par‐
972                     ticular when the checksum needs to be updated after  data
973                     has  been  written  into the packet through direct packet
974                     access.
975
976              Return The checksum on success, or a negative error code in case
977                     of failure.
978
979       void bpf_set_hash_invalid(struct sk_buff *skb)
980
981              Description
982                     Invalidate  the  current  skb->hash. It can be used after
983                     mangling on headers through direct packet access, in  or‐
984                     der  to indicate that the hash is outdated and to trigger
985                     a recalculation the next time the kernel tries to  access
986                     this  hash  or  when  the bpf_get_hash_recalc() helper is
987                     called.
988
989              Return void.
990
991       long bpf_get_numa_node_id(void)
992
993              Description
994                     Return the id of the current NUMA node. The  primary  use
995                     case  for this helper is the selection of sockets for the
996                     local NUMA node, when the program is attached to  sockets
997                     using   the  SO_ATTACH_REUSEPORT_EBPF  option  (see  also
998                     socket(7)), but the helper is  also  available  to  other
999                     eBPF  program  types,  similarly  to  bpf_get_smp_proces‐
1000                     sor_id().
1001
1002              Return The id of current NUMA node.
1003
1004       long bpf_skb_change_head(struct sk_buff *skb, u32 len, u64 flags)
1005
1006              Description
1007                     Grows headroom of packet associated to  skb  and  adjusts
1008                     the  offset  of  the  MAC  header accordingly, adding len
1009                     bytes of space. It automatically extends and  reallocates
1010                     memory as required.
1011
1012                     This  helper  can  be used on a layer 3 skb to push a MAC
1013                     header for redirection into a layer 2 device.
1014
1015                     All values for flags are reserved for future  usage,  and
1016                     must be left at zero.
1017
1018                     A call to this helper is susceptible to change the under‐
1019                     lying packet buffer. Therefore, at load time, all  checks
1020                     on  pointers  previously done by the verifier are invali‐
1021                     dated and must be performed again, if the helper is  used
1022                     in combination with direct packet access.
1023
1024              Return 0 on success, or a negative error in case of failure.
1025
1026       long bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
1027
1028              Description
1029                     Adjust  (move)  xdp_md->data by delta bytes. Note that it
1030                     is possible to use  a  negative  value  for  delta.  This
1031                     helper  can  be used to prepare the packet for pushing or
1032                     popping headers.
1033
1034                     A call to this helper is susceptible to change the under‐
1035                     lying  packet buffer. Therefore, at load time, all checks
1036                     on pointers previously done by the verifier  are  invali‐
1037                     dated  and must be performed again, if the helper is used
1038                     in combination with direct packet access.
1039
1040              Return 0 on success, or a negative error in case of failure.
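
                     The effect of a positive or negative delta can be
                     illustrated with a user-space model that tracks offsets
                     inside a frame buffer. This is a sketch only: struct
                     frame_model and adjust_head() are hypothetical, and the
                     bounds applied here are simpler than whatever limits a
                     given kernel enforces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: packet data occupies [data_off, data_end_off)
 * inside a larger buffer; the bytes before data_off are headroom. */
struct frame_model {
    size_t data_off;
    size_t data_end_off;
};

/* Move the start of packet data by delta bytes. A negative delta grows
 * the packet at the front (room to push a header), a positive delta
 * shrinks it (pop a header). */
static int adjust_head(struct frame_model *f, int delta)
{
    long new_off;

    assert(f->data_off <= f->data_end_off);
    new_off = (long)f->data_off + delta;
    if (new_off < 0 || (size_t)new_off > f->data_end_off)
        return -EINVAL;     /* not enough headroom, or past the end */

    f->data_off = (size_t)new_off;
    return 0;
}
```

                     After a successful call the start of packet data has
                     moved, which is why previously validated pointers must
                     be checked again.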
1041
1042       long bpf_probe_read_str(void *dst, u32 size, const void *unsafe_ptr)
1043
1044              Description
1045                     Copy a NUL terminated string from an  unsafe  kernel  ad‐
1046                     dress  unsafe_ptr to dst. See bpf_probe_read_kernel_str()
1047                     for more details.
1048
1049                     Generally,     use      bpf_probe_read_user_str()      or
1050                     bpf_probe_read_kernel_str() instead.
1051
1052              Return On  success,  the strictly positive length of the string,
1053                     including the trailing NUL character. On error,  a  nega‐
1054                     tive value.
1055
1056       u64 bpf_get_socket_cookie(struct sk_buff *skb)
1057
1058              Description
1059                     If  the struct sk_buff pointed by skb has a known socket,
1060                     retrieve the cookie (generated by  the  kernel)  of  this
1061                     socket.   If  no  cookie has been set yet, generate a new
1062                     cookie. Once generated, the socket cookie remains  stable
1063                     for the life of the socket. This helper can be useful for
1064                     monitoring per socket networking traffic statistics as it
1065                     provides  a  global socket identifier that can be assumed
1066                     unique.
1067
1068              Return An 8-byte long unique number on  success,  or  0  if  the
1069                     socket field is missing inside skb.
1070
1071       u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
1072
1073              Description
1074                     Equivalent to bpf_get_socket_cookie() helper that accepts
1075                     skb, but gets socket from struct bpf_sock_addr context.
1076
1077              Return An 8-byte long unique number.
1078
1079       u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
1080
1081              Description
1082                     Equivalent to bpf_get_socket_cookie() helper that accepts
1083                     skb, but gets socket from struct bpf_sock_ops context.
1084
1085              Return An 8-byte long unique number.
1086
1087       u64 bpf_get_socket_cookie(struct sock *sk)
1088
1089              Description
1090                     Equivalent to bpf_get_socket_cookie() helper that accepts
1091                     skb, but gets socket from a BTF struct sock. This  helper
1092                     also works for sleepable programs.
1093
1094              Return An 8-byte long unique number or 0 if sk is NULL.
1095
1096       u32 bpf_get_socket_uid(struct sk_buff *skb)
1097
1098              Description
1099                     Get the owner UID of the socket associated to skb.
1100
1101              Return The  owner  UID  of  the socket associated to skb. If the
1102                     socket is NULL, or if it is not a full socket (i.e. if it
1103                     is  a time-wait or a request socket instead), overflowuid
1104                     value is returned (note that overflowuid  might  also  be
1105                     the actual UID value for the socket).
1106
1107       long bpf_set_hash(struct sk_buff *skb, u32 hash)
1108
1109              Description
1110                     Set  the  full  hash for skb (set the field skb->hash) to
1111                     value hash.
1112
1113              Return 0
1114
1115       long bpf_setsockopt(void *bpf_socket,  int  level,  int  optname,  void
1116       *optval, int optlen)
1117
1118              Description
1119                     Emulate  a  call to setsockopt() on the socket associated
1120                     to bpf_socket, which must be a full socket. The level  at
1121                     which  the option resides and the name optname of the op‐
1122                     tion must be specified, see setsockopt(2) for more infor‐
1123                     mation.   The option value of length optlen is pointed by
1124                     optval.
1125
1126                     bpf_socket should be one of the following:
1127
1128                     • struct bpf_sock_ops for BPF_PROG_TYPE_SOCK_OPS.
1129
1130                     • struct bpf_sock_addr for  BPF_CGROUP_INET4_CONNECT  and
1131                       BPF_CGROUP_INET6_CONNECT.
1132
1133                     This helper actually implements a subset of setsockopt().
1134                     It supports the following levels:
1135
1136                     • SOL_SOCKET,  which  supports  the  following  optnames:
1137                       SO_RCVBUF,  SO_SNDBUF, SO_MAX_PACING_RATE, SO_PRIORITY,
1138                       SO_RCVLOWAT, SO_MARK, SO_BINDTODEVICE, SO_KEEPALIVE.
1139
1140                     • IPPROTO_TCP, which  supports  the  following  optnames:
1141                       TCP_CONGESTION,    TCP_BPF_IW,   TCP_BPF_SNDCWND_CLAMP,
1142                       TCP_SAVE_SYN, TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT,
1143                       TCP_SYNCNT, TCP_USER_TIMEOUT, TCP_NOTSENT_LOWAT.
1144
1145                     • IPPROTO_IP, which supports optname IP_TOS.
1146
1147                     • IPPROTO_IPV6, which supports optname IPV6_TCLASS.
1148
1149              Return 0 on success, or a negative error in case of failure.
1150
1151       long  bpf_skb_adjust_room(struct  sk_buff *skb, s32 len_diff, u32 mode,
1152       u64 flags)
1153
1154              Description
1155                     Grow or shrink the room for data in the packet associated
1156                     to skb by len_diff, and according to the selected mode.
1157
1158                     By  default, the helper will reset any offloaded checksum
1159                     indicator of  the  skb  to  CHECKSUM_NONE.  This  can  be
1160                     avoided by the following flag:
1161
1162                     • BPF_F_ADJ_ROOM_NO_CSUM_RESET:  Do  not  reset offloaded
1163                       checksum data of the skb to CHECKSUM_NONE.
1164
1165                     There are two supported modes at this time:
1166
1167                     • BPF_ADJ_ROOM_MAC: Adjust room at the  mac  layer  (room
1168                       space is added or removed between the layer 2 and layer
1169                       3 headers).
1170
1171                     • BPF_ADJ_ROOM_NET: Adjust  room  at  the  network  layer
1172                       (room space is added or removed between the layer 3 and
1173                       layer 4 headers).
1174
1175                     The following flags are supported at this time:
1176
1177                     • BPF_F_ADJ_ROOM_FIXED_GSO: Do not adjust gso_size.   Ad‐
1178                       justing mss in this way is not allowed for datagrams.
1179
1180                     • BPF_F_ADJ_ROOM_ENCAP_L3_IPV4,        BPF_F_ADJ_ROOM_EN‐
1181                       CAP_L3_IPV6: Any new space is reserved to hold a tunnel
1182                       header.  Configure skb offsets and other fields accord‐
1183                       ingly.
1184
1185                     • BPF_F_ADJ_ROOM_ENCAP_L4_GRE,         BPF_F_ADJ_ROOM_EN‐
1186                       CAP_L4_UDP:  Use with ENCAP_L3 flags to further specify
1187                       the tunnel type.
1188
1189                     • BPF_F_ADJ_ROOM_ENCAP_L2(len):  Use   with   ENCAP_L3/L4
1190                       flags  to  further  specify the tunnel type; len is the
1191                       length of the inner MAC header.
1192
1193                     • BPF_F_ADJ_ROOM_ENCAP_L2_ETH:          Use          with
1194                       BPF_F_ADJ_ROOM_ENCAP_L2  flag to further specify the L2
1195                       type as Ethernet.
1196
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
1202
1203              Return 0 on success, or a negative error in case of failure.
1204
1205       long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
1206
1207              Description
                     Redirect the packet to the endpoint referenced by map at
                     index key. Depending on its type, this map can contain
                     references to net devices (for forwarding packets through
                     other ports), or to CPUs (for redirecting XDP frames to
                     another CPU). As of this writing, redirection to another
                     CPU is only implemented for native XDP, that is, with
                     driver support.
1214
1215                     The  lower  two bits of flags are used as the return code
1216                     if the map lookup fails. This is so that the return value
1217                     can  be one of the XDP program return codes up to XDP_TX,
1218                     as chosen by the caller. The higher bits of flags can  be
1219                     set  to  BPF_F_BROADCAST  or BPF_F_EXCLUDE_INGRESS as de‐
1220                     fined below.
1221
                     With BPF_F_BROADCAST the packet is broadcast to all the
                     interfaces in the map. With BPF_F_EXCLUDE_INGRESS the
                     ingress interface is excluded from the broadcast.
1226
1227                     See  also bpf_redirect(), which only supports redirecting
1228                     to an ifindex, but doesn't require a map to do so.
1229
1230              Return XDP_REDIRECT on success, or the value of  the  two  lower
1231                     bits of the flags argument on error.
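
                     The encoding of the fallback return code in the two lower
                     bits of flags can be sketched in user-space C as follows.
                     This is a toy model of the documented behavior for a
                     single-entry lookup, not the kernel implementation. The
                     numeric values are assumed to mirror
                     include/uapi/linux/bpf.h, and redirect_map_flags() and
                     modeled_redirect_result() are hypothetical names.

```c
#include <stdint.h>

/* XDP return codes and flag bits, assumed to mirror linux/bpf.h. */
enum { XDP_ABORTED = 0, XDP_DROP = 1, XDP_PASS = 2, XDP_TX = 3,
       XDP_REDIRECT = 4 };
#define BPF_F_BROADCAST        (1ULL << 3)
#define BPF_F_EXCLUDE_INGRESS  (1ULL << 4)

/* Build the flags argument: fallback action in bits 0-1, options above. */
static uint64_t redirect_map_flags(unsigned int fallback_action,
                                   uint64_t options)
{
        return ((uint64_t)fallback_action & 0x3) | options;
}

/* Toy model: XDP_REDIRECT if the lookup worked, else the two lower bits. */
static long modeled_redirect_result(int lookup_ok, uint64_t flags)
{
        return lookup_ok ? (long)XDP_REDIRECT : (long)(flags & 0x3);
}
```

                     With this model, a caller that wants packets to continue
                     up the stack when the map slot is empty would pass
                     XDP_PASS as the fallback action.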
1232
1233       long  bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32
1234       key, u64 flags)
1235
1236              Description
1237                     Redirect the packet to the socket referenced by  map  (of
1238                     type BPF_MAP_TYPE_SOCKMAP) at index key. Both ingress and
1239                     egress  interfaces  can  be  used  for  redirection.  The
1240                     BPF_F_INGRESS value in flags is used to make the distinc‐
1241                     tion (ingress path is selected if the  flag  is  present,
1242                     egress  path  otherwise). This is the only flag supported
1243                     for now.
1244
1245              Return SK_PASS on success, or SK_DROP on error.
1246
1247       long bpf_sock_map_update(struct  bpf_sock_ops  *skops,  struct  bpf_map
1248       *map, void *key, u64 flags)
1249
1250              Description
1251                     Add an entry to, or update a map referencing sockets. The
1252                     skops is used as a new value for the entry associated  to
1253                     key. flags is one of:
1254
1255                     BPF_NOEXIST
1256                            The entry for key must not exist in the map.
1257
1258                     BPF_EXIST
1259                            The entry for key must already exist in the map.
1260
1261                     BPF_ANY
1262                            No  condition  on  the  existence of the entry for
1263                            key.
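
                     The three flag values can be read as a precondition on
                     the entry for key. The user-space C sketch below models
                     that precondition only. The flag values are assumed to
                     mirror include/uapi/linux/bpf.h, the error codes follow
                     the usual eBPF map convention and are an assumption for
                     this map type, and modeled_update_check() is a
                     hypothetical name.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values assumed to mirror include/uapi/linux/bpf.h. */
#define BPF_ANY      0ULL
#define BPF_NOEXIST  1ULL
#define BPF_EXIST    2ULL

/*
 * Toy model of the existence check. Returns 0 if the update may proceed,
 * or a negative error (assumed values) if the precondition fails.
 */
static int modeled_update_check(bool entry_exists, uint64_t flags)
{
        if (flags > BPF_EXIST)
                return -EINVAL;         /* unknown flag value */
        if (flags == BPF_NOEXIST && entry_exists)
                return -EEXIST;         /* entry must not exist */
        if (flags == BPF_EXIST && !entry_exists)
                return -ENOENT;         /* entry must already exist */
        return 0;
}
```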
1264
1265                     If the map has eBPF programs (parser and verdict),  those
1266                     will  be  inherited  by  the  socket  being added. If the
1267                     socket is already attached to eBPF programs, this results
1268                     in an error.
1269
1270              Return 0 on success, or a negative error in case of failure.
1271
1272       long bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
1273
1274              Description
                     Adjust the address pointed to by xdp_md->data_meta by
                     delta (which can be positive or negative). Note that this
                     operation modifies the address stored in xdp_md->data, so
                     the latter must be loaded only after the helper has been
                     called.
1280
1281                     The use of xdp_md->data_meta is optional and programs are
1282                     not required to use it. The rationale is  that  when  the
1283                     packet  is processed with XDP (e.g. as DoS filter), it is
1284                     possible to push further meta data along with  it  before
1285                     passing  to  the stack, and to give the guarantee that an
1286                     ingress eBPF program attached as a TC classifier  on  the
1287                     same device can pick this up for further post-processing.
1288                     Since TC works with socket buffers, it  remains  possible
1289                     to  set  from XDP the mark or priority pointers, or other
1290                     pointers for the  socket  buffer.   Having  this  scratch
1291                     space  generic and programmable allows for more flexibil‐
1292                     ity as the user is free to store whatever meta data  they
1293                     need.
1294
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
1300
1301              Return 0 on success, or a negative error in case of failure.
1302
1303       long  bpf_perf_event_read_value(struct  bpf_map *map, u64 flags, struct
1304       bpf_perf_event_value *buf, u32 buf_size)
1305
1306              Description
1307                     Read the value of a perf event counter, and store it into
1308                     buf of size buf_size. This helper relies on a map of type
1309                     BPF_MAP_TYPE_PERF_EVENT_ARRAY. The  nature  of  the  perf
1310                     event  counter  is selected when map is updated with perf
1311                     event file descriptors. The map is an array whose size is
1312                     the  number  of  available CPUs, and each cell contains a
1313                     value relative to one CPU. The value to retrieve is indi‐
1314                     cated  by  flags,  that  contains the index of the CPU to
1315                     look up,  masked  with  BPF_F_INDEX_MASK.  Alternatively,
1316                     flags  can  be  set to BPF_F_CURRENT_CPU to indicate that
1317                     the value for the current CPU should be retrieved.
1318
1319                     This   helper    behaves    in    a    way    close    to
1320                     bpf_perf_event_read()  helper,  save that instead of just
1321                     returning the value observed, it fills the buf structure.
1322                     This  allows for additional data to be retrieved: in par‐
1323                     ticular, the enabled and running times  (in  buf->enabled
1324                     and  buf->running,  respectively) are copied. In general,
1325                     bpf_perf_event_read_value()    is    recommended     over
1326                     bpf_perf_event_read(), which has some ABI issues and pro‐
1327                     vides fewer functionalities.
1328
                     These values are interesting, because hardware PMU
                     (Performance Monitoring Unit) counters are limited
                     resources. When more PMU-based perf events are opened
                     than there are available counters, the kernel multiplexes
                     these events so that each event gets a certain percentage
                     (but not all) of the PMU time. When multiplexing happens,
                     the number of samples or the counter value does not match
                     what would be observed without multiplexing. This makes
                     comparison between different runs difficult. Typically,
                     the counter value should be normalized before comparing
                     it to other experiments. The usual normalization is done
                     as follows.

                        normalized_counter = counter * t_enabled / t_running

                     Where t_enabled is the time enabled for the event and
                     t_running is the time running for the event since the
                     last normalization. The enabled and running times are
                     accumulated since the perf event was opened. To compute
                     the scaling factor between two invocations of an eBPF
                     program, users can use the CPU id as the key (which is
                     typical for the perf array usage model) to remember the
                     previous value and do the calculation inside the eBPF
                     program.
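
                     The normalization over the interval between two readings
                     can be sketched in plain C as follows. struct perf_value
                     is a local stand-in assumed to have the same three 64-bit
                     fields as struct bpf_perf_event_value, and
                     normalized_delta() is a hypothetical name. The sketch
                     ignores overflow of the intermediate product.

```c
#include <stdint.h>

/* Local stand-in for struct bpf_perf_event_value (assumed layout). */
struct perf_value {
        uint64_t counter;
        uint64_t enabled;
        uint64_t running;
};

/*
 * normalized_counter = counter * t_enabled / t_running, computed over the
 * interval between an earlier reading (prev) and a later one (cur).
 */
static uint64_t normalized_delta(const struct perf_value *prev,
                                 const struct perf_value *cur)
{
        uint64_t counter = cur->counter - prev->counter;
        uint64_t enabled = cur->enabled - prev->enabled;
        uint64_t running = cur->running - prev->running;

        if (running == 0)       /* event never ran during the interval */
                return 0;
        return counter * enabled / running;
}
```

                     For example, if the counter advanced by 200 while the
                     event was enabled for 2000 time units but running for
                     only 1000 of them, the normalized value is 400.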
1352
1353              Return 0 on success, or a negative error in case of failure.
1354
1355       long bpf_perf_prog_read_value(struct bpf_perf_event_data  *ctx,  struct
1356       bpf_perf_event_value *buf, u32 buf_size)
1357
1358              Description
                     For an eBPF program attached to a perf event, retrieve
                     the value of the event counter associated to ctx and
                     store it in the structure pointed to by buf and of size
                     buf_size. Enabled and running times are also stored in
                     the structure (see description of helper
                     bpf_perf_event_read_value() for more details).
1365
1366              Return 0 on success, or a negative error in case of failure.
1367
1368       long bpf_getsockopt(void *bpf_socket,  int  level,  int  optname,  void
1369       *optval, int optlen)
1370
1371              Description
                     Emulate a call to getsockopt() on the socket associated
                     to bpf_socket, which must be a full socket. The level at
                     which the option resides and the name optname of the
                     option must be specified, see getsockopt(2) for more
                     information. The retrieved value is stored in the
                     structure pointed to by optval and of length optlen.
1378
1379                     bpf_socket should be one of the following:
1380
                     • struct bpf_sock_ops for BPF_PROG_TYPE_SOCK_OPS.

                     • struct bpf_sock_addr for BPF_CGROUP_INET4_CONNECT and
                       BPF_CGROUP_INET6_CONNECT.
1385
1386                     This helper actually implements a subset of getsockopt().
1387                     It supports the following levels:
1388
                     • IPPROTO_TCP, which supports optname TCP_CONGESTION.

                     • IPPROTO_IP, which supports optname IP_TOS.

                     • IPPROTO_IPV6, which supports optname IPV6_TCLASS.
1394
1395              Return 0 on success, or a negative error in case of failure.
1396
1397       long bpf_override_return(struct pt_regs *regs, u64 rc)
1398
1399              Description
1400                     Used for error injection, this  helper  uses  kprobes  to
1401                     override  the return value of the probed function, and to
1402                     set it to rc.  The first argument is the context regs  on
1403                     which the kprobe works.
1404
1405                     This  helper works by setting the PC (program counter) to
1406                     an override function which is run in place of the  origi‐
1407                     nal  probed  function.  This means the probed function is
1408                     not run at all. The  replacement  function  just  returns
1409                     with the required value.
1410
1411                     This  helper  has security implications, and thus is sub‐
1412                     ject to restrictions. It is only available if the  kernel
1413                     was compiled with the CONFIG_BPF_KPROBE_OVERRIDE configu‐
1414                     ration option, and in this case it only  works  on  func‐
1415                     tions  tagged  with  ALLOW_ERROR_INJECTION  in the kernel
1416                     code.
1417
1418                     Also, the helper is only available for the  architectures
1419                     having  the CONFIG_FUNCTION_ERROR_INJECTION option. As of
1420                     this writing, x86 architecture is the only one to support
1421                     this feature.
1422
1423              Return 0
1424
1425       long   bpf_sock_ops_cb_flags_set(struct   bpf_sock_ops  *bpf_sock,  int
1426       argval)
1427
1428              Description
1429                     Attempt to set the  value  of  the  bpf_sock_ops_cb_flags
1430                     field  for the full TCP socket associated to bpf_sock_ops
1431                     to argval.
1432
1433                     The primary use of this field is to  determine  if  there
1434                     should    be    calls    to   eBPF   programs   of   type
1435                     BPF_PROG_TYPE_SOCK_OPS at various points in the TCP code.
1436                     A program of the same type can change its value, per con‐
1437                     nection and as necessary, when the connection  is  estab‐
1438                     lished.  This  field  is directly accessible for reading,
1439                     but this helper must be used for updates in order to  re‐
1440                     turn  an error if an eBPF program tries to set a callback
1441                     that is not supported in the current kernel.
1442
1443                     argval is a flag array which can combine these flags:
1444
                     • BPF_SOCK_OPS_RTO_CB_FLAG (retransmission time out)

                     • BPF_SOCK_OPS_RETRANS_CB_FLAG (retransmission)

                     • BPF_SOCK_OPS_STATE_CB_FLAG (TCP state change)

                     • BPF_SOCK_OPS_RTT_CB_FLAG (every RTT)
1452
                     Therefore, this function can be used to clear a callback
                     flag by setting the appropriate bit to zero. For example,
                     to disable the RTO callback:

                     bpf_sock_ops_cb_flags_set(bpf_sock,
                            bpf_sock->bpf_sock_ops_cb_flags &
                            ~BPF_SOCK_OPS_RTO_CB_FLAG)
1460
                     Here are some examples of where one could call such an
                     eBPF program:
1463
1464                     • When RTO fires.
1465
1466                     • When a packet is retransmitted.
1467
1468                     • When the connection terminates.
1469
1470                     • When a packet is sent.
1471
1472                     • When a packet is received.
1473
1474              Return Code -EINVAL if the socket is not a full TCP socket; oth‐
1475                     erwise,  a positive number containing the bits that could
1476                     not be set is returned (which comes down to 0 if all bits
1477                     were set as required).
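
                     The meaning of the return value can be sketched with a
                     small user-space C model. It stores the bits that are
                     supported and returns the bits that were rejected. The
                     flag values are assumed to mirror
                     include/uapi/linux/bpf.h. modeled_cb_flags_set() and its
                     supported_mask parameter are hypothetical and stand in
                     for whatever set of callbacks the running kernel accepts.

```c
#include <stdint.h>

/* Flag values assumed to mirror include/uapi/linux/bpf.h. */
#define BPF_SOCK_OPS_RTO_CB_FLAG      (1 << 0)
#define BPF_SOCK_OPS_RETRANS_CB_FLAG  (1 << 1)
#define BPF_SOCK_OPS_STATE_CB_FLAG    (1 << 2)
#define BPF_SOCK_OPS_RTT_CB_FLAG      (1 << 3)

/*
 * Toy model: keep the supported bits in *cb_flags and return the bits of
 * argval that could not be set (0 if every requested bit was accepted).
 */
static int modeled_cb_flags_set(uint32_t *cb_flags, int argval,
                                int supported_mask)
{
        *cb_flags = (uint32_t)(argval & supported_mask);
        return argval & ~supported_mask;
}
```

                     A caller can therefore treat any non-zero positive return
                     value as the set of callbacks that the kernel does not
                     provide.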
1478
1479       long bpf_msg_redirect_map(struct sk_msg_buff *msg, struct bpf_map *map,
1480       u32 key, u64 flags)
1481
1482              Description
1483                     This helper is used in programs implementing policies  at
1484                     the  socket  level. If the message msg is allowed to pass
1485                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1486                     rect  it  to  the  socket  referenced  by  map  (of  type
1487                     BPF_MAP_TYPE_SOCKMAP) at  index  key.  Both  ingress  and
1488                     egress  interfaces  can  be  used  for  redirection.  The
1489                     BPF_F_INGRESS value in flags is used to make the distinc‐
1490                     tion  (ingress  path  is selected if the flag is present,
1491                     egress path otherwise). This is the only  flag  supported
1492                     for now.
1493
1494              Return SK_PASS on success, or SK_DROP on error.
1495
1496       long bpf_msg_apply_bytes(struct sk_msg_buff *msg, u32 bytes)
1497
1498              Description
1499                     For  socket  policies, apply the verdict of the eBPF pro‐
1500                     gram to the next bytes (number of bytes) of message msg.
1501
1502                     For example, this helper can be  used  in  the  following
1503                     cases:
1504
1505                     • A  single  sendmsg() or sendfile() system call contains
1506                       multiple logical messages that the eBPF program is sup‐
1507                       posed to read and for which it should apply a verdict.
1508
1509                     • An eBPF program only cares to read the first bytes of a
1510                       msg. If the message has a large payload,  then  setting
1511                       up  and  calling  the  eBPF  program repeatedly for all
1512                       bytes, even though the verdict is already known,  would
1513                       create unnecessary overhead.
1514
1515                     When  called from within an eBPF program, the helper sets
1516                     a counter internal to the  BPF  infrastructure,  that  is
1517                     used  to  apply  the  last  verdict to the next bytes. If
1518                     bytes is smaller than the current  data  being  processed
1519                     from  a  sendmsg()  or  sendfile() system call, the first
1520                     bytes will be sent and the eBPF program  will  be  re-run
1521                     with  the pointer for start of data pointing to byte num‐
1522                     ber bytes + 1. If bytes is larger than the  current  data
1523                     being processed, then the eBPF verdict will be applied to
1524                     multiple sendmsg() or sendfile() calls  until  bytes  are
1525                     consumed.
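
                     The interaction between bytes and the size of each system
                     call can be illustrated with a toy model in user-space C.
                     It counts how often the verdict program would run if
                     every run asked for the same number of bytes. This is a
                     simplification of the behavior described above, not the
                     kernel algorithm, and modeled_verdict_runs() is a
                     hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model: msg_len[] holds the size of each sendmsg() call, and every
 * run of the verdict program is assumed to request "apply" bytes.
 * Returns the number of times the program runs.
 */
static unsigned int modeled_verdict_runs(const uint32_t *msg_len, size_t n,
                                         uint32_t apply)
{
        unsigned int runs = 0;
        uint32_t covered = 0;   /* bytes still covered by the last verdict */

        if (apply == 0)
                return 0;       /* the model assumes a non-zero byte count */

        for (size_t i = 0; i < n; i++) {
                uint32_t left = msg_len[i];

                while (left > 0) {
                        if (covered == 0) {     /* verdict used up: rerun */
                                runs++;
                                covered = apply;
                        }
                        uint32_t step = left < covered ? left : covered;
                        left -= step;
                        covered -= step;
                }
        }
        return runs;
}
```

                     In this model a single 10-byte call with bytes set to 4
                     runs the program three times, while three 3-byte calls
                     with bytes set to 8 run it only twice.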
1526
1527                     Note  that  if  a socket closes with the internal counter
1528                     holding a non-zero value, this is not a  problem  because
1529                     data is not being buffered for bytes and is sent as it is
1530                     received.
1531
1532              Return 0
1533
1534       long bpf_msg_cork_bytes(struct sk_msg_buff *msg, u32 bytes)
1535
1536              Description
1537                     For socket policies, prevent the execution of the verdict
1538                     eBPF  program  for  message msg until bytes (byte number)
1539                     have been accumulated.
1540
                     This can be used when one needs a specific number of
                     bytes before a verdict can be assigned, even if the data
                     spans multiple sendmsg() or sendfile() calls. The extreme
                     case would be a user calling sendmsg() repeatedly with
                     1-byte long message segments. Obviously, this is bad for
                     performance, but it is still valid. If the eBPF program
                     needs bytes bytes to validate a header, this helper can
                     be used to prevent the eBPF program from being called
                     again until bytes have been accumulated.
1550
1551              Return 0
1552
1553       long bpf_msg_pull_data(struct sk_msg_buff *msg, u32 start, u32 end, u64
1554       flags)
1555
1556              Description
1557                     For  socket  policies,  pull in non-linear data from user
1558                     space  for   msg   and   set   pointers   msg->data   and
1559                     msg->data_end  to  start  and end bytes offsets into msg,
1560                     respectively.
1561
1562                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg
1563                     it can only parse data that the (data, data_end) pointers
1564                     have already consumed. For sendmsg() hooks this is likely
1565                     the  first  scatterlist element. But for calls relying on
1566                     the sendpage handler (e.g. sendfile()) this will  be  the
1567                     range  (0,  0) because the data is shared with user space
1568                     and by default the objective is to  avoid  allowing  user
1569                     space to modify data while (or after) eBPF verdict is be‐
1570                     ing decided. This helper can be used to pull in data  and
1571                     to  set  the  start and end pointer to given values. Data
1572                     will be copied if necessary (i.e. if data was not  linear
1573                     and  if  start  and end pointers do not point to the same
1574                     chunk).
1575
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
1581
1582                     All values for flags are reserved for future  usage,  and
1583                     must be left at zero.
1584
1585              Return 0 on success, or a negative error in case of failure.
1586
1587       long  bpf_bind(struct  bpf_sock_addr  *ctx,  struct sockaddr *addr, int
1588       addr_len)
1589
1590              Description
                     Bind the socket associated to ctx to the address pointed
                     to by addr, of length addr_len. This allows for making
                     outgoing connections from the desired IP address, which
                     can be useful for example when all processes inside a
                     cgroup should use one single IP address on a host that
                     has multiple IP addresses configured.

                     This helper works for IPv4 and IPv6, TCP and UDP sockets.
                     The domain (addr->sa_family) must be AF_INET (or
                     AF_INET6). It is advised to pass a zero port (sin_port or
                     sin6_port), which triggers IP_BIND_ADDRESS_NO_PORT-like
                     behavior and lets the kernel efficiently pick an unused
                     port as long as the 4-tuple is unique. Passing a non-zero
                     port might lead to degraded performance.
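
                     The shape of the address argument can be sketched in
                     user-space C as follows. The sketch fills a struct
                     sockaddr_in with a source address and a zero port, as
                     advised above. make_bind_addr() is a hypothetical name,
                     and inet_pton() is used only because this example runs in
                     user space; an eBPF program would typically set the
                     address from a constant instead.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Fill an IPv4 source address with port 0.
 * Returns 0 on success, -1 if ip is not a valid dotted-quad string.
 */
static int make_bind_addr(const char *ip, struct sockaddr_in *sa)
{
        memset(sa, 0, sizeof(*sa));
        sa->sin_family = AF_INET;
        sa->sin_port = 0;       /* let the kernel pick an unused port */
        return inet_pton(AF_INET, ip, &sa->sin_addr) == 1 ? 0 : -1;
}
```

                     The filled structure and its size correspond to the addr
                     and addr_len arguments of the helper.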
1605
1606              Return 0 on success, or a negative error in case of failure.
1607
1608       long bpf_xdp_adjust_tail(struct xdp_buff *xdp_md, int delta)
1609
1610              Description
                     Adjust (move) xdp_md->data_end by delta bytes. It is
                     possible to both shrink and grow the packet tail.
                     Shrinking is done by passing a negative delta.
1614
                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.
1620
1621              Return 0 on success, or a negative error in case of failure.
1622
1623       long  bpf_skb_get_xfrm_state(struct  sk_buff  *skb,  u32  index, struct
1624       bpf_xfrm_state *xfrm_state, u32 size, u64 flags)
1625
1626              Description
1627                     Retrieve the XFRM state (IP transform framework, see also
1628                     ip-xfrm(8)) at index in XFRM "security path" for skb.
1629
1630                     The   retrieved   value   is   stored   in   the   struct
1631                     bpf_xfrm_state pointed by xfrm_state and of length size.
1632
1633                     All values for flags are reserved for future  usage,  and
1634                     must be left at zero.
1635
1636                     This  helper is available only if the kernel was compiled
1637                     with CONFIG_XFRM configuration option.
1638
1639              Return 0 on success, or a negative error in case of failure.
1640
1641       long bpf_get_stack(void *ctx, void *buf, u32 size, u64 flags)
1642
1643              Description
                     Return a user or a kernel stack in a buffer provided by
                     the bpf program. To achieve this, the helper needs ctx,
                     which is a pointer to the context on which the tracing
                     program is executed. To store the stacktrace, the bpf
                     program provides buf with a nonnegative size.
1649
1650                     The last argument,  flags,  holds  the  number  of  stack
1651                     frames   to   skip   (from   0   to   255),  masked  with
1652                     BPF_F_SKIP_FIELD_MASK. The next bits can be used  to  set
1653                     the following flags:
1654
1655                     BPF_F_USER_STACK
1656                            Collect  a  user  space  stack instead of a kernel
1657                            stack.
1658
1659                     BPF_F_USER_BUILD_ID
1660                            Collect (build_id, file_offset) instead of ips for
1661                            user stack, only valid if BPF_F_USER_STACK is also
1662                            specified.
1663
1664                            file_offset is an offset relative to the beginning
1665                            of  the  executable  or shared object file backing
1666                            the vma which the ip falls in. It is not an offset
1667                            relative  to  that  object's base address. Accord‐
1668                            ingly, it must be adjusted by  adding  (sh_addr  -
1669                            sh_offset),  where  sh_{addr,offset} correspond to
1670                            the executable section containing  file_offset  in
1671                            the  object,  for comparisons to symbols' st_value
1672                            to be valid.
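
                     Both the flags layout and the file_offset adjustment
                     described above can be sketched in user-space C. The
                     numeric values are assumed to mirror
                     include/uapi/linux/bpf.h, and get_stack_flags() and
                     file_offset_to_vaddr() are hypothetical names used only
                     for this example.

```c
#include <stdint.h>

/* Values assumed to mirror include/uapi/linux/bpf.h. */
#define BPF_F_SKIP_FIELD_MASK  0xffULL
#define BPF_F_USER_STACK       (1ULL << 8)
#define BPF_F_USER_BUILD_ID    (1ULL << 11)

/* Combine a frame skip count (0 to 255) with option bits. */
static uint64_t get_stack_flags(unsigned int skip, uint64_t options)
{
        return ((uint64_t)skip & BPF_F_SKIP_FIELD_MASK) | options;
}

/*
 * Convert a file_offset into a value comparable with a symbol's st_value,
 * using sh_addr and sh_offset of the section that contains the offset.
 */
static uint64_t file_offset_to_vaddr(uint64_t file_offset, uint64_t sh_addr,
                                     uint64_t sh_offset)
{
        return file_offset + (sh_addr - sh_offset);
}
```

                     For instance, skipping two frames of a user stack gives
                     get_stack_flags(2, BPF_F_USER_STACK).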
1673
                     bpf_get_stack() can collect up to PERF_MAX_STACK_DEPTH
                     both kernel and user frames, provided that the buffer is
                     large enough. Note that this limit can be controlled with
                     the sysctl program, and that it should be manually
                     increased in order to profile long user stacks (such as
                     stacks for Java programs). To do so, use:
1680
1681                        # sysctl kernel.perf_event_max_stack=<new value>
1682
1683              Return The  non-negative copied buf length equal to or less than
1684                     size on success, or a negative error in case of failure.
1685
1686       long bpf_skb_load_bytes_relative(const void *skb, u32 offset, void *to,
1687       u32 len, u32 start_header)
1688
1689              Description
1690                     This helper is similar to bpf_skb_load_bytes() in that it
1691                     provides an easy way to load len bytes from  offset  from
1692                     the  packet associated to skb, into the buffer pointed by
1693                     to. The difference  to  bpf_skb_load_bytes()  is  that  a
1694                     fifth  argument  start_header exists in order to select a
1695                     base offset to start from. start_header can be one of:
1696
1697                     BPF_HDR_START_MAC
1698                            Base offset to load data from is skb's mac header.
1699
1700                     BPF_HDR_START_NET
1701                            Base offset to load data  from  is  skb's  network
1702                            header.
1703
1704                     In  general,  "direct  packet  access"  is  the preferred
1705                     method to access packet data, however, this helper is  in
1706                     particular  useful in socket filters where skb->data does
1707                     not always point to the start of the mac header and where
1708                     "direct packet access" is not available.
1709
1710              Return 0 on success, or a negative error in case of failure.
1711
1712       long bpf_fib_lookup(void *ctx, struct bpf_fib_lookup *params, int plen,
1713       u32 flags)
1714
1715              Description
                     Do a FIB lookup in kernel tables using parameters in
                     params. If the lookup is successful and the result shows
                     that the packet is to be forwarded, the neighbor tables
                     are searched for the nexthop. If successful (i.e., FIB
                     lookup shows forwarding and nexthop is resolved), the
                     nexthop address is returned in ipv4_dst or ipv6_dst based
                     on family, smac is set to the mac address of the egress
                     device, dmac is set to the nexthop mac address, rt_metric
                     is set to the metric from the route (IPv4/IPv6 only), and
                     ifindex is set to the device index of the nexthop from
                     the FIB lookup.
1726
1727                     plen argument is the size of the passed in struct.  flags
1728                     argument can be a combination of one or more of the  fol‐
1729                     lowing values:
1730
1731                     BPF_FIB_LOOKUP_DIRECT
1732                            Do  a direct table lookup vs full lookup using FIB
1733                            rules.
1734
1735                     BPF_FIB_LOOKUP_OUTPUT
1736                            Perform lookup from an egress perspective (default
1737                            is ingress).
1738
                     ctx is either struct xdp_md for XDP programs or struct
                     sk_buff for tc cls_act programs.
1741
1742              Return
1743
1744                     • < 0 if any input argument is invalid
1745
1746                     • 0 on success (packet is forwarded, nexthop neighbor ex‐
1747                       ists)
1748
                     • > 0 one of BPF_FIB_LKUP_RET_ codes explaining why the
                       packet is not forwarded or needs assistance from the
                       full stack
1751
1752                     If lookup fails with  BPF_FIB_LKUP_RET_FRAG_NEEDED,  then
1753                     the  MTU  was exceeded and output params->mtu_result con‐
1754                     tains the MTU.
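
                     The three classes of return value can be handled as in
                     the following user-space C sketch, which maps a result to
                     an XDP action. The mapping is one possible policy for a
                     forwarding program and is illustrative only. The numeric
                     values are assumed to mirror include/uapi/linux/bpf.h,
                     and xdp_action_for_fib_result() is a hypothetical name.

```c
/* Values assumed to mirror include/uapi/linux/bpf.h. */
enum { XDP_DROP = 1, XDP_PASS = 2, XDP_REDIRECT = 4 };
enum {
        BPF_FIB_LKUP_RET_SUCCESS     = 0,
        BPF_FIB_LKUP_RET_BLACKHOLE   = 1,
        BPF_FIB_LKUP_RET_UNREACHABLE = 2,
        BPF_FIB_LKUP_RET_PROHIBIT    = 3,
        BPF_FIB_LKUP_RET_FRAG_NEEDED = 8,
};

/* One possible policy for an XDP forwarder (illustrative choice). */
static int xdp_action_for_fib_result(long rc)
{
        if (rc < 0)
                return XDP_DROP;        /* invalid input argument */
        switch (rc) {
        case BPF_FIB_LKUP_RET_SUCCESS:
                return XDP_REDIRECT;    /* nexthop resolved, forward */
        case BPF_FIB_LKUP_RET_BLACKHOLE:
        case BPF_FIB_LKUP_RET_UNREACHABLE:
        case BPF_FIB_LKUP_RET_PROHIBIT:
                return XDP_DROP;        /* route says to discard */
        default:
                return XDP_PASS;        /* let the full stack handle it */
        }
}
```

                     Dropping on a negative return is a design choice of this
                     sketch; a program that prefers to fail open could return
                     XDP_PASS in that case instead.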
1755
1756       long bpf_sock_hash_update(struct bpf_sock_ops  *skops,  struct  bpf_map
1757       *map, void *key, u64 flags)
1758
1759              Description
1760                     Add  an  entry  to,  or update a sockhash map referencing
1761                     sockets.  The skops is used as a new value for the  entry
1762                     associated to key. flags is one of:
1763
1764                     BPF_NOEXIST
1765                            The entry for key must not exist in the map.
1766
1767                     BPF_EXIST
1768                            The entry for key must already exist in the map.
1769
1770                     BPF_ANY
1771                            No  condition  on  the  existence of the entry for
1772                            key.
1773
1774                     If the map has eBPF programs (parser and verdict),  those
1775                     will  be  inherited  by  the  socket  being added. If the
1776                     socket is already attached to eBPF programs, this results
1777                     in an error.
1778
1779              Return 0 on success, or a negative error in case of failure.
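
                     The effect of the three flags can be modelled in user
                     space as follows. This is a sketch only:
                     check_update_flags() is a hypothetical function, and the
                     specific error codes (-EEXIST, -ENOENT, -EINVAL) are an
                     assumption based on the usual map update conventions;
                     the text above only promises "a negative error".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as found in the kernel UAPI header linux/bpf.h. */
#define BPF_ANY     0
#define BPF_NOEXIST 1
#define BPF_EXIST   2

/* Decide whether an update is allowed, given whether an entry for
 * the key is already present in the map. */
static int check_update_flags(bool entry_exists, uint64_t flags)
{
    switch (flags) {
    case BPF_ANY:
        return 0;                             /* no condition */
    case BPF_NOEXIST:
        return entry_exists ? -EEXIST : 0;    /* must not exist */
    case BPF_EXIST:
        return entry_exists ? 0 : -ENOENT;    /* must exist */
    default:
        return -EINVAL;                       /* unknown flag value */
    }
}
```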
1780
1781       long  bpf_msg_redirect_hash(struct  sk_msg_buff  *msg,  struct  bpf_map
1782       *map, void *key, u64 flags)
1783
1784              Description
1785                     This helper is used in programs implementing policies  at
1786                     the  socket  level. If the message msg is allowed to pass
1787                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1788                     rect  it  to  the  socket  referenced  by  map  (of  type
1789                     BPF_MAP_TYPE_SOCKHASH) using hash key. Both  ingress  and
1790                     egress  interfaces  can  be  used  for  redirection.  The
1791                     BPF_F_INGRESS value in flags is used to make the distinc‐
1792                     tion  (ingress  path  is selected if the flag is present,
1793                     egress path otherwise). This is the only  flag  supported
1794                     for now.
1795
1796              Return SK_PASS on success, or SK_DROP on error.
1797
1798       long  bpf_sk_redirect_hash(struct  sk_buff  *skb,  struct bpf_map *map,
1799       void *key, u64 flags)
1800
1801              Description
1802                     This helper is used in programs implementing policies  at
1803                     the  skb  socket  level. If the sk_buff skb is allowed to
1804                     pass (i.e.  if the verdict eBPF program returns SK_PASS),
1805                     redirect  it  to  the  socket  referenced by map (of type
1806                     BPF_MAP_TYPE_SOCKHASH) using hash key. Both  ingress  and
1807                     egress  interfaces  can  be  used  for  redirection.  The
1808                     BPF_F_INGRESS value in flags is used to make the distinc‐
1809                     tion  (ingress  path  is selected if the flag is present,
1810                     egress otherwise). This is the only  flag  supported  for
1811                     now.
1812
1813              Return SK_PASS on success, or SK_DROP on error.
1814
1815       long  bpf_lwt_push_encap(struct  sk_buff *skb, u32 type, void *hdr, u32
1816       len)
1817
1818              Description
1819                     Encapsulate the packet associated to skb within a Layer 3
1820                     protocol header. This header is provided in the buffer at
1821                     address hdr, with len its size in bytes.  type  indicates
1822                     the protocol of the header and can be one of:
1823
1824                     BPF_LWT_ENCAP_SEG6
1825                            IPv6  encapsulation  with  Segment  Routing Header
1826                            (struct ipv6_sr_hdr). hdr only contains  the  SRH,
1827                            the IPv6 header is computed by the kernel.
1828
1829                     BPF_LWT_ENCAP_SEG6_INLINE
1830                            Only  works if skb contains an IPv6 packet. Insert
1831                            a Segment Routing Header (struct ipv6_sr_hdr)  in‐
1832                            side the IPv6 header.
1833
1834                     BPF_LWT_ENCAP_IP
1835                            IP  encapsulation  (GRE/GUE/IPIP/etc).  The  outer
1836                            header must be IPv4 or IPv6, followed by  zero  or
1837                            more  additional  headers, up to LWT_BPF_MAX_HEAD‐
1838                            ROOM total bytes in all prepended headers.  Please
1839                            note that if skb_is_gso(skb) is true, no more than
1840                            two  headers  can  be  prepended,  and  the  inner
1841                            header,  if  present,  should  be  either  GRE  or
1842                            UDP/GUE.
1843
1844                     BPF_LWT_ENCAP_SEG6* types can be called by  BPF  programs
1845                     of  type  BPF_PROG_TYPE_LWT_IN; BPF_LWT_ENCAP_IP type can
1846                     be called by BPF programs of  types  BPF_PROG_TYPE_LWT_IN
1847                     and BPF_PROG_TYPE_LWT_XMIT.
1848
1849                     A call to this helper may change the underlying packet
1850                     buffer. Therefore, at load time, all checks on pointers
1851                     previously done by the verifier are invalidated and must
1852                     be performed again, if the helper is used in combination
1853                     with direct packet access.
1854
1855              Return 0 on success, or a negative error in case of failure.
1856
1857       long  bpf_lwt_seg6_store_bytes(struct  sk_buff  *skb, u32 offset, const
1858       void *from, u32 len)
1859
1860              Description
1861                     Store len bytes from address from into the packet associ‐
1862                     ated  to skb, at offset. Only the flags, tag and TLVs in‐
1863                     side the outermost IPv6 Segment  Routing  Header  can  be
1864                     modified through this helper.
1865
1866                     A call to this helper may change the underlying packet
1867                     buffer. Therefore, at load time, all checks on pointers
1868                     previously done by the verifier are invalidated and must
1869                     be performed again, if the helper is used in combination
1870                     with direct packet access.
1871
1872              Return 0 on success, or a negative error in case of failure.
1873
1874       long  bpf_lwt_seg6_adjust_srh(struct  sk_buff  *skb,  u32  offset,  s32
1875       delta)
1876
1877              Description
1878                     Adjust the size allocated to TLVs in the outermost IPv6
1879                     Segment Routing Header contained in the packet associated
1880                     to skb, at position offset, by delta bytes. Only offsets
1881                     after the segments are accepted. delta can be either
1882                     positive (growing) or negative (shrinking).
1883
1884                     A call to this helper may change the underlying packet
1885                     buffer. Therefore, at load time, all checks on pointers
1886                     previously done by the verifier are invalidated and must
1887                     be performed again, if the helper is used in combination
1888                     with direct packet access.
1889
1890              Return 0 on success, or a negative error in case of failure.
1891
1892       long bpf_lwt_seg6_action(struct sk_buff *skb, u32 action, void  *param,
1893       u32 param_len)
1894
1895              Description
1896                     Apply  an  IPv6  Segment Routing action of type action to
1897                     the packet associated to skb. Each action takes a parame‐
1898                     ter  contained  at address param, and of length param_len
1899                     bytes.  action can be one of:
1900
1901                     SEG6_LOCAL_ACTION_END_X
1902                            End.X action: Endpoint with Layer-3 cross-connect.
1903                            Type of param: struct in6_addr.
1904
1905                     SEG6_LOCAL_ACTION_END_T
1906                            End.T  action:  Endpoint  with specific IPv6 table
1907                            lookup.  Type of param: int.
1908
1909                     SEG6_LOCAL_ACTION_END_B6
1910                            End.B6 action: Endpoint bound to an  SRv6  policy.
1911                            Type of param: struct ipv6_sr_hdr.
1912
1913                     SEG6_LOCAL_ACTION_END_B6_ENCAP
1914                            End.B6.Encap action: Endpoint bound to an SRv6 en‐
1915                            capsulation  policy.   Type   of   param:   struct
1916                            ipv6_sr_hdr.
1917
1918                     A call to this helper may change the underlying packet
1919                     buffer. Therefore, at load time, all checks on pointers
1920                     previously done by the verifier are invalidated and must
1921                     be performed again, if the helper is used in combination
1922                     with direct packet access.
1923
1924              Return 0 on success, or a negative error in case of failure.
1925
1926       long bpf_rc_repeat(void *ctx)
1927
1928              Description
1929                     This helper is used in programs implementing IR decoding,
1930                     to report a successfully decoded repeat key message. This
1931                     delays the generation of a key up event for the previously
1932                     generated key down event.
1933
1934                     Some IR protocols like NEC have a special IR message for
1935                     repeating the last button, for when a button is held down.
1936
1937                     The  ctx  should  point to the lirc sample as passed into
1938                     the program.
1939
1940                     This helper is only available if the kernel was compiled
1941                     with the CONFIG_BPF_LIRC_MODE2 configuration option set
1942                     to "y".
1943
1944              Return 0
1945
1946       long bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle)
1947
1948              Description
1949                     This helper is used in programs implementing IR decoding,
1950                     to report a successfully decoded key press with scancode,
1951                     toggle value in the given protocol. The scancode will  be
1952                     translated to a keycode using the rc keymap, and reported
1953                     as an input key down event. After a period a key up event
1954                     is  generated. This period can be extended by calling ei‐
1955                     ther bpf_rc_keydown() again  with  the  same  values,  or
1956                     calling bpf_rc_repeat().
1957
1958                     Some  protocols  include a toggle bit, in case the button
1959                     was released and pressed again between consecutive  scan‐
1960                     codes.
1961
1962                     The  ctx  should  point to the lirc sample as passed into
1963                     the program.
1964
1965                     The protocol is the decoded  protocol  number  (see  enum
1966                     rc_proto for some predefined values).
1967
1968                     This helper is only available if the kernel was compiled
1969                     with the CONFIG_BPF_LIRC_MODE2 configuration option set
1970                     to "y".
1971
1972              Return 0
1973
1974       u64 bpf_skb_cgroup_id(struct sk_buff *skb)
1975
1976              Description
1977                     Return the cgroup v2 id of the socket associated with the
1978                     skb. This is roughly similar to the
1979                     bpf_get_cgroup_classid() helper for cgroup v1: it provides
1980                     a tag or identifier that can be matched on or used for map
1981                     lookups, e.g. to implement policy. The cgroup v2 id of a
1982                     given path in the hierarchy is exposed in user space
1983                     through the f_handle API, which yields the same 64-bit id.
1984
1985                     This  helper  can  be  used on TC egress path, but not on
1986                     ingress, and is available only if the kernel was compiled
1987                     with the CONFIG_SOCK_CGROUP_DATA configuration option.
1988
1989              Return The  id  is returned or 0 in case the id could not be re‐
1990                     trieved.
1991
1992       u64 bpf_get_current_cgroup_id(void)
1993
1994              Description
1995                     Get the current cgroup id  based  on  the  cgroup  within
1996                     which the current task is running.
1997
1998              Return A  64-bit  integer containing the current cgroup id based
1999                     on the cgroup within which the current task is running.
2000
2001       void *bpf_get_local_storage(void *map, u64 flags)
2002
2003              Description
2004                     Get the pointer to the local storage area. The type and
2005                     the size of the local storage are defined by the map
2006                     argument. The meaning of flags is specific to each map
2007                     type, and it has to be 0 for cgroup local storage.
2008
2009                     Depending  on  the BPF program type, a local storage area
2010                     can be shared between multiple instances of the BPF  pro‐
2011                     gram, running simultaneously.
2012
2013                     The user is responsible for synchronization, for example
2014                     by using the BPF_ATOMIC instructions to alter the shared
2015                     data.
2016
2017              Return A pointer to the local storage area.
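
                     As an illustration of the synchronization requirement,
                     the sketch below updates shared counters with the
                     __sync_fetch_and_add() compiler builtin, which BPF
                     compilers commonly lower to an atomic add instruction.
                     The struct storage_counters layout and the function
                     account_packet() are hypothetical; the pointer argument
                     stands in for the value returned by
                     bpf_get_local_storage().

```c
#include <stdint.h>

/* Hypothetical layout of the data kept in the local storage area. */
struct storage_counters {
    uint64_t packets;
    uint64_t bytes;
};

/* Account one packet of pkt_len bytes. Each counter is updated
 * atomically, so concurrent program instances sharing the same
 * storage do not lose increments. */
static void account_packet(struct storage_counters *storage,
                           uint64_t pkt_len)
{
    __sync_fetch_and_add(&storage->packets, 1);
    __sync_fetch_and_add(&storage->bytes, pkt_len);
}
```

                     Note that the two counters are updated independently: a
                     reader may observe one increment before the other.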
2018
2019       long   bpf_sk_select_reuseport(struct  sk_reuseport_md  *reuse,  struct
2020       bpf_map *map, void *key, u64 flags)
2021
2022              Description
2023                     Select a SO_REUSEPORT socket from a BPF_MAP_TYPE_REUSE‐
2024                     PORT_SOCKARRAY map. It checks that the selected socket
2025                     matches the incoming request in the socket buffer.
2026
2027              Return 0 on success, or a negative error in case of failure.
2028
2029       u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level)
2030
2031              Description
2032                     Return the id of the cgroup v2 that is the ancestor, at
2033                     ancestor_level, of the cgroup associated with the skb. The
2034                     root cgroup is at ancestor_level zero and each step down
2035                     the hierarchy increments the level. If ancestor_level
2036                     equals the level of the cgroup associated with skb, the
2037                     return value is the same as that of bpf_skb_cgroup_id().
2038
2039                     The helper is useful to implement policies based on
2040                     cgroups that are higher in the hierarchy than the
2041                     immediate cgroup associated with skb.
2042
2043                     The format of the returned id and the helper limitations
2044                     are the same as in bpf_skb_cgroup_id().
2045
2046              Return The id is returned or 0 in case the id could not  be  re‐
2047                     trieved.
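
                     The level numbering can be illustrated with a small
                     user-space model. Everything here is hypothetical: struct
                     cgroup_path simply lists the ids on the path from the
                     root (index 0) to the cgroup associated with the skb, and
                     ancestor_cgroup_id() mimics the documented behaviour,
                     returning 0 when no ancestor exists at the requested
                     level.

```c
#include <stdint.h>

/* Hypothetical model of one path in the cgroup v2 hierarchy:
 * ids[0] is the root, ids[level] is the cgroup of the socket. */
struct cgroup_path {
    const uint64_t *ids;
    int level;           /* level of the socket's own cgroup */
};

/* Return the id of the ancestor at ancestor_level, or 0 if the
 * requested level does not exist on this path. */
static uint64_t ancestor_cgroup_id(const struct cgroup_path *path,
                                   int ancestor_level)
{
    if (ancestor_level < 0 || ancestor_level > path->level)
        return 0;
    return path->ids[ancestor_level];
}
```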
2048
2049       struct  bpf_sock  *bpf_sk_lookup_tcp(void  *ctx,  struct bpf_sock_tuple
2050       *tuple, u32 tuple_size, u64 netns, u64 flags)
2051
2052              Description
2053                     Look for a TCP socket matching tuple, optionally in a
2054                     child network namespace netns. The return value must be
2055                     checked, and if non-NULL, released via bpf_sk_release().
2056
2057                     The ctx should point to the context of the program,  such
2058                     as the skb or socket (depending on the hook in use). This
2059                     is used to determine the base network namespace  for  the
2060                     lookup.
2061
2062                     tuple_size must be one of:
2063
2064                     sizeof(tuple->ipv4)
2065                            Look for an IPv4 socket.
2066
2067                     sizeof(tuple->ipv6)
2068                            Look for an IPv6 socket.
2069
2070                     If  the  netns  is a negative signed 32-bit integer, then
2071                     the socket lookup table in the netns associated with  the
2072                     ctx  will be used. For the TC hooks, this is the netns of
2073                     the device in the skb. For  socket  hooks,  this  is  the
2074                     netns of the socket.  If netns is any other signed 32-bit
2075                     value greater than or equal to zero then it specifies the
2076                     ID of the netns relative to the netns associated with the
2077                     ctx. netns values beyond the range of 32-bit integers are
2078                     reserved for future use.
2079
2080                     All  values  for flags are reserved for future usage, and
2081                     must be left at zero.
2082
2083                     This helper is available only if the kernel was compiled
2084                     with the CONFIG_NET configuration option.
2085
2086              Return Pointer  to  struct bpf_sock, or NULL in case of failure.
2087                     For sockets with reuseport option,  the  struct  bpf_sock
2088                     result  is  from reuse->socks[] using the hash of the tu‐
2089                     ple.
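
                     The interpretation of the netns argument described above
                     can be sketched as a small classifier. This is a
                     user-space model of the wording in this page, not the
                     kernel's actual check; enum netns_kind and
                     classify_netns() are hypothetical names. It assumes the
                     usual two's complement conversion from u64 to a signed
                     64-bit integer.

```c
#include <stdint.h>

enum netns_kind {
    NETNS_CURRENT,   /* negative 32-bit value: netns of the ctx */
    NETNS_BY_ID,     /* 0..INT32_MAX: netns ID relative to the ctx */
    NETNS_RESERVED   /* outside the 32-bit range: reserved */
};

/* Classify a netns argument as passed to bpf_sk_lookup_tcp(). */
static enum netns_kind classify_netns(uint64_t netns)
{
    int64_t v = (int64_t)netns;

    if (v < INT32_MIN || v > INT32_MAX)
        return NETNS_RESERVED;
    return v < 0 ? NETNS_CURRENT : NETNS_BY_ID;
}
```

                     With this reading, passing (u64)-1 selects the network
                     namespace associated with the ctx.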
2090
2091       struct bpf_sock  *bpf_sk_lookup_udp(void  *ctx,  struct  bpf_sock_tuple
2092       *tuple, u32 tuple_size, u64 netns, u64 flags)
2093
2094              Description
2095                     Look for a UDP socket matching tuple, optionally in a
2096                     child network namespace netns. The return value must be
2097                     checked, and if non-NULL, released via bpf_sk_release().
2098
2099                     The  ctx should point to the context of the program, such
2100                     as the skb or socket (depending on the hook in use). This
2101                     is  used  to determine the base network namespace for the
2102                     lookup.
2103
2104                     tuple_size must be one of:
2105
2106                     sizeof(tuple->ipv4)
2107                            Look for an IPv4 socket.
2108
2109                     sizeof(tuple->ipv6)
2110                            Look for an IPv6 socket.
2111
2112                     If the netns is a negative signed  32-bit  integer,  then
2113                     the  socket lookup table in the netns associated with the
2114                     ctx will be used. For the TC hooks, this is the netns  of
2115                     the  device  in  the  skb.  For socket hooks, this is the
2116                     netns of the socket.  If netns is any other signed 32-bit
2117                     value greater than or equal to zero then it specifies the
2118                     ID of the netns relative to the netns associated with the
2119                     ctx. netns values beyond the range of 32-bit integers are
2120                     reserved for future use.
2121
2122                     All values for flags are reserved for future  usage,  and
2123                     must be left at zero.
2124
2125                     This helper is available only if the kernel was compiled
2126                     with the CONFIG_NET configuration option.
2127
2128              Return Pointer to struct bpf_sock, or NULL in case  of  failure.
2129                     For  sockets  with  reuseport option, the struct bpf_sock
2130                     result is from reuse->socks[] using the hash of  the  tu‐
2131                     ple.
2132
2133       long bpf_sk_release(void *sock)
2134
2135              Description
2136                     Release  the  reference  held  by  sock.  sock  must be a
2137                     non-NULL    pointer    that     was     returned     from
2138                     bpf_sk_lookup_xxx().
2139
2140              Return 0 on success, or a negative error in case of failure.
2141
2142       long  bpf_map_push_elem(struct  bpf_map  *map,  const  void *value, u64
2143       flags)
2144
2145              Description
2146                     Push an element value in map. flags is one of:
2147
2148                     BPF_EXIST
2149                            If the queue/stack is full, the oldest element is
2150                            removed to make room for the new one.
2151
2152              Return 0 on success, or a negative error in case of failure.
2153
2154       long bpf_map_pop_elem(struct bpf_map *map, void *value)
2155
2156              Description
2157                     Pop an element from map.
2158
2159              Return 0 on success, or a negative error in case of failure.
2160
2161       long bpf_map_peek_elem(struct bpf_map *map, void *value)
2162
2163              Description
2164                     Get an element from map without removing it.
2165
2166              Return 0 on success, or a negative error in case of failure.
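
                     The push, pop and peek semantics, including the effect of
                     BPF_EXIST on a full queue, can be modelled in user space
                     as below. This is an illustrative sketch of a FIFO queue
                     only: struct queue_model, QUEUE_CAP and the queue_*()
                     functions are hypothetical, and the error codes (-E2BIG
                     when full, -ENOENT when empty, -EINVAL for other flags)
                     are assumptions, since the text above only promises "a
                     negative error".

```c
#include <errno.h>
#include <stdint.h>

#define BPF_EXIST 2      /* value from the UAPI header linux/bpf.h */
#define QUEUE_CAP 3      /* hypothetical capacity for this sketch */

struct queue_model {
    uint32_t slots[QUEUE_CAP];
    unsigned int head;   /* index of the oldest element */
    unsigned int count;  /* number of stored elements */
};

static int queue_push(struct queue_model *q, uint32_t value, uint64_t flags)
{
    if (flags & ~(uint64_t)BPF_EXIST)
        return -EINVAL;
    if (q->count == QUEUE_CAP) {
        if (!(flags & BPF_EXIST))
            return -E2BIG;
        /* BPF_EXIST: drop the oldest element to make room. */
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
    }
    q->slots[(q->head + q->count) % QUEUE_CAP] = value;
    q->count++;
    return 0;
}

/* Copy the oldest element to *value without removing it. */
static int queue_peek(const struct queue_model *q, uint32_t *value)
{
    if (q->count == 0)
        return -ENOENT;
    *value = q->slots[q->head];
    return 0;
}

/* Copy the oldest element to *value and remove it. */
static int queue_pop(struct queue_model *q, uint32_t *value)
{
    int err = queue_peek(q, value);

    if (err)
        return err;
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 0;
}
```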
2167
2168       long bpf_msg_push_data(struct sk_msg_buff *msg, u32 start, u32 len, u64
2169       flags)
2170
2171              Description
2172                     For socket policies, insert len bytes into msg at  offset
2173                     start.
2174
2175                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg,
2176                     it may want to insert metadata or options into the msg.
2177                     This can later be read and used by any of the lower layer
2178                     BPF hooks.
2179
2180                     This helper may fail under memory pressure (when an
2181                     allocation fails). In that case it returns an appropriate
2182                     error, which the BPF program needs to handle.
2183
2184              Return 0 on success, or a negative error in case of failure.
2185
2186       long bpf_msg_pop_data(struct sk_msg_buff *msg, u32 start, u32 len,  u64
2187       flags)
2188
2189              Description
2190                     Remove len bytes from msg, starting at byte start. This
2191                     may result in ENOMEM errors under certain situations if an
2192                     allocation and copy are required due to a full ring
2193                     buffer. However, the helper will try to avoid doing the
2194                     allocation if possible. Other errors can occur if input
2195                     parameters are invalid, either because the start byte is
2196                     not a valid part of the msg payload and/or because the pop
2197                     value is too large.
2198
2199              Return 0 on success, or a negative error in case of failure.
2200
2201       long bpf_rc_pointer_rel(void *ctx, s32 rel_x, s32 rel_y)
2202
2203              Description
2204                     This helper is used in programs implementing IR decoding,
2205                     to report a successfully decoded pointer movement.
2206
2207                     The  ctx  should  point to the lirc sample as passed into
2208                     the program.
2209
2210                     This helper is only available if the kernel was compiled
2211                     with the CONFIG_BPF_LIRC_MODE2 configuration option set
2212                     to "y".
2213
2214              Return 0
2215
2216       long bpf_spin_lock(struct bpf_spin_lock *lock)
2217
2218              Description
2219                     Acquire a spinlock represented by the pointer lock, which
2220                     is stored as part of a value of a map. Taking the lock
2221                     allows the rest of the fields in that value to be updated
2222                     safely. The spinlock can (and must) later be released
2223                     with a call to bpf_spin_unlock(lock).
2224
2225                     Spinlocks in BPF programs come with a number of  restric‐
2226                     tions and constraints:
2227
2228                     • bpf_spin_lock  objects  are only allowed inside maps of
2229                       types BPF_MAP_TYPE_HASH  and  BPF_MAP_TYPE_ARRAY  (this
2230                       list could be extended in the future).
2231
2232                     • BTF description of the map is mandatory.
2233
2234                     • The BPF program can take ONE lock at a time, since tak‐
2235                       ing two or more could cause deadlocks.
2236
2237                     • Only one struct bpf_spin_lock is allowed per  map  ele‐
2238                       ment.
2239
2240                     • When  the  lock  is  taken, calls (either BPF to BPF or
2241                       helpers) are not allowed.
2242
2243                     • The BPF_LD_ABS and BPF_LD_IND instructions are not  al‐
2244                       lowed inside a spinlock-ed region.
2245
2246                     • The  BPF program MUST call bpf_spin_unlock() to release
2247                       the lock, on all execution paths, before it returns.
2248
2249                     • The BPF program can access  struct  bpf_spin_lock  only
2250                       via  the bpf_spin_lock() and bpf_spin_unlock() helpers.
2251                       Loading or storing data into the  struct  bpf_spin_lock
2252                       lock; field of a map is not allowed.
2253
2254                     • To  use the bpf_spin_lock() helper, the BTF description
2255                       of the map value must  be  a  struct  and  have  struct
2256                       bpf_spin_lock  anyname; field at the top level.  Nested
2257                       lock inside another struct is not allowed.
2258
2259                     • The struct bpf_spin_lock lock field in a map value must
2260                       be aligned on a multiple of 4 bytes in that value.
2261
2262                     • Syscall  with command BPF_MAP_LOOKUP_ELEM does not copy
2263                       the bpf_spin_lock field to user space.
2264
2265                     • Syscall with  command  BPF_MAP_UPDATE_ELEM,  or  update
2266                       from  a  BPF  program,  do not update the bpf_spin_lock
2267                       field.
2268
2269                     • bpf_spin_lock cannot be on the stack or inside a net‐
2270                       working packet (it can only be inside a map value).
2271
2272                     • bpf_spin_lock is available to root only.
2273
2274                     • Tracing  programs and socket filter programs cannot use
2275                       bpf_spin_lock() due to insufficient  preemption  checks
2276                       (but this may change in the future).
2277
2278                     • bpf_spin_lock is not allowed in inner maps of
2279                       map-in-map.
2280
2281              Return 0
2282
2283       long bpf_spin_unlock(struct bpf_spin_lock *lock)
2284
2285              Description
2286                     Release  the  lock  previously  locked  by  a   call   to
2287                     bpf_spin_lock(lock).
2288
2289              Return 0
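
                     The layout constraints and the lock/update/unlock pattern
                     described for bpf_spin_lock() can be sketched in user
                     space as follows. Everything here is a stand-in: struct
                     mock_bpf_spin_lock imitates the kernel's struct
                     bpf_spin_lock (assumed to be a single 32-bit field),
                     struct flow_stats is a hypothetical map value, and
                     mock_spin_lock() and mock_spin_unlock() replace the real
                     helpers with compiler builtins.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's struct bpf_spin_lock. */
struct mock_bpf_spin_lock {
    uint32_t val;
};

/* Hypothetical map value: exactly one lock, as a top-level field. */
struct flow_stats {
    struct mock_bpf_spin_lock lock;
    uint64_t packets;
    uint64_t bytes;
};

/* The lock field must sit at a multiple of 4 bytes in the value. */
_Static_assert(offsetof(struct flow_stats, lock) % 4 == 0,
               "lock must be 4-byte aligned");

/* Mock replacements for bpf_spin_lock() and bpf_spin_unlock(). */
static void mock_spin_lock(struct mock_bpf_spin_lock *lock)
{
    while (__sync_lock_test_and_set(&lock->val, 1)) {
        /* spin until the previous holder releases the lock */
    }
}

static void mock_spin_unlock(struct mock_bpf_spin_lock *lock)
{
    __sync_lock_release(&lock->val);
}

/* Update both counters as one consistent unit. No other function
 * is called while the lock is held, and the lock is released on
 * the only path out of the critical section. */
static void flow_stats_add(struct flow_stats *value, uint64_t len)
{
    mock_spin_lock(&value->lock);
    value->packets += 1;
    value->bytes += len;
    mock_spin_unlock(&value->lock);
}
```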
2290
2291       struct bpf_sock *bpf_sk_fullsock(struct bpf_sock *sk)
2292
2293              Description
2294                     This  helper gets a struct bpf_sock pointer such that all
2295                     the fields in this bpf_sock can be accessed.
2296
2297              Return A struct bpf_sock pointer on success, or NULL in case  of
2298                     failure.
2299
2300       struct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *sk)
2301
2302              Description
2303                     This  helper  gets  a  struct bpf_tcp_sock pointer from a
2304                     struct bpf_sock pointer.
2305
2306              Return A struct bpf_tcp_sock pointer on success, or NULL in case
2307                     of failure.
2308
2309       long bpf_skb_ecn_set_ce(struct sk_buff *skb)
2310
2311              Description
2312                     Set  ECN  (Explicit  Congestion Notification) field of IP
2313                     header to CE (Congestion Encountered) if current value is
2314                     ECT (ECN Capable Transport). Otherwise, do nothing. Works
2315                     with IPv6 and IPv4.
2316
2317              Return 1 if the CE flag is set (either  by  the  current  helper
2318                     call  or  because it was already present), 0 if it is not
2319                     set.
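
                     The decision made by this helper can be illustrated on a
                     bare IPv4 TOS (or IPv6 traffic class) byte, whose two
                     low-order bits carry the ECN field. This is a user-space
                     sketch: ecn_set_ce() and the ECN_* macros are
                     hypothetical names, and the IPv4 header checksum update
                     that a real implementation needs is deliberately left
                     out.

```c
#include <stdint.h>

/* ECN codepoints in the two low-order bits (RFC 3168). */
#define ECN_MASK    0x03
#define ECN_NOT_ECT 0x00   /* not ECN-capable */
#define ECN_ECT_1   0x01   /* ECN-capable transport, ECT(1) */
#define ECN_ECT_0   0x02   /* ECN-capable transport, ECT(0) */
#define ECN_CE      0x03   /* congestion encountered */

/* Set CE if the field is ECT(0), ECT(1) or already CE.
 * Returns 1 if CE is set afterwards, 0 if the field was left
 * untouched because the packet is not ECN-capable. */
static int ecn_set_ce(uint8_t *tos)
{
    if ((*tos & ECN_MASK) == ECN_NOT_ECT)
        return 0;
    *tos |= ECN_CE;
    return 1;
}
```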
2320
2321       struct bpf_sock *bpf_get_listener_sock(struct bpf_sock *sk)
2322
2323              Description
2324                     Return a struct bpf_sock  pointer  in  TCP_LISTEN  state.
2325                     bpf_sk_release() is unnecessary and not allowed.
2326
2327              Return A  struct bpf_sock pointer on success, or NULL in case of
2328                     failure.
2329
2330       struct bpf_sock *bpf_skc_lookup_tcp(void  *ctx,  struct  bpf_sock_tuple
2331       *tuple, u32 tuple_size, u64 netns, u64 flags)
2332
2333              Description
2334                     Look for a TCP socket matching tuple, optionally in a
2335                     child network namespace netns. The return value must be
2336                     checked, and if non-NULL, released via bpf_sk_release().
2337
2338                     This function is identical to bpf_sk_lookup_tcp(), except
2339                     that it also returns timewait  or  request  sockets.  Use
2340                     bpf_sk_fullsock()  or  bpf_tcp_sock()  to access the full
2341                     structure.
2342
2343                     This helper is available only if the kernel was compiled
2344                     with the CONFIG_NET configuration option.
2345
2346              Return Pointer  to  struct bpf_sock, or NULL in case of failure.
2347                     For sockets with reuseport option,  the  struct  bpf_sock
2348                     result  is  from reuse->socks[] using the hash of the tu‐
2349                     ple.
2350
2351       long bpf_tcp_check_syncookie(void *sk, void *iph, u32  iph_len,  struct
2352       tcphdr *th, u32 th_len)
2353
2354              Description
2355                     Check  whether  iph and th contain a valid SYN cookie ACK
2356                     for the listening socket in sk.
2357
2358                     iph points to the start of the IPv4 or IPv6 header, while
2359                     iph_len  contains  sizeof(struct  iphdr) or sizeof(struct
2360                     ipv6hdr).
2361
2362                     th points to the start of the TCP  header,  while  th_len
2363                     contains   the   length  of  the  TCP  header  (at  least
2364                     sizeof(struct tcphdr)).
2365
2366              Return 0 if iph and th are a valid SYN cookie ACK, or a negative
2367                     error otherwise.
2368
2369       long  bpf_sysctl_get_name(struct  bpf_sysctl  *ctx,  char  *buf, size_t
2370       buf_len, u64 flags)
2371
2372              Description
2373                     Get the name of the sysctl in /proc/sys/ and copy it into
2374                     the program-provided buffer buf of size buf_len.
2375
2376                     The buffer is always NUL terminated, unless it's
2377                     zero-sized.
2378
2379                     If flags is zero, the full name (e.g. "net/ipv4/tcp_mem")
2380                     is copied. Use the BPF_F_SYSCTL_BASE_NAME flag to copy the
2381                     base name only (e.g. "tcp_mem").
2382
2383              Return Number of characters copied (not including the trailing
2384                     NUL).
2385
2386                     -E2BIG if the buffer wasn't big enough (buf will contain
2387                     the truncated name in this case).
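
                     The copy, truncation and return value rules above can be
                     modelled in user space. The function copy_sysctl_name()
                     is hypothetical and takes the full name as a plain
                     string; the value of BPF_F_SYSCTL_BASE_NAME is assumed to
                     match the UAPI header linux/bpf.h.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define BPF_F_SYSCTL_BASE_NAME (1ULL << 0)

/* Copy the full or base sysctl name into buf, following the rules
 * documented for bpf_sysctl_get_name(). */
static long copy_sysctl_name(const char *full_name, char *buf,
                             size_t buf_len, uint64_t flags)
{
    const char *name = full_name;
    size_t len;

    if (flags & BPF_F_SYSCTL_BASE_NAME) {
        const char *slash = strrchr(full_name, '/');

        if (slash)
            name = slash + 1;        /* keep the last component only */
    }
    if (buf_len == 0)
        return -E2BIG;               /* nothing can be written */

    len = strlen(name);
    if (len >= buf_len) {
        /* Truncate, but still NUL terminate the buffer. */
        memcpy(buf, name, buf_len - 1);
        buf[buf_len - 1] = '\0';
        return -E2BIG;
    }
    memcpy(buf, name, len + 1);      /* includes the trailing NUL */
    return (long)len;
}
```

                     A caller that only needs to match on the base name can
                     therefore use a much smaller buffer than one matching on
                     the full path.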
2388
2389       long bpf_sysctl_get_current_value(struct bpf_sysctl  *ctx,  char  *buf,
2390       size_t buf_len)
2391
2392              Description
2393                     Get the current value of the sysctl as it is presented in
2394                     /proc/sys (incl. newline, etc), and copy it as a string
2395                     into the program-provided buffer buf of size buf_len.
2396
2397                     The whole value is copied, regardless of the file position
2398                     at which user space issued e.g. sys_read.
2399
2400                     The buffer is always NUL terminated, unless it's
2401                     zero-sized.
2402
2403              Return Number of characters copied (not including the trailing
2404                     NUL).
2405
2406                     -E2BIG if the buffer wasn't big enough (buf will contain
2407                     the truncated value in this case).
2408
2409                     -EINVAL  if  current  value was unavailable, e.g. because
2410                     sysctl is uninitialized and read returns -EIO for it.
2411
2412       long bpf_sysctl_get_new_value(struct bpf_sysctl *ctx, char *buf, size_t
2413       buf_len)
2414
2415              Description
                     Get the new value being written by user space to the
                     sysctl (before the actual write happens) and copy it as a
                     string into the program-provided buffer buf of size
                     buf_len.

                     User space may write the new value at file position > 0.

                     The buffer is always NUL terminated, unless it is
                     zero-sized.

              Return Number of characters copied (not including the trailing
                     NUL).

                     -E2BIG if the buffer wasn't big enough (buf will contain
                     the truncated value in this case).
2430
2431                     -EINVAL if sysctl is being read.
2432
2433       long bpf_sysctl_set_new_value(struct bpf_sysctl *ctx, const char  *buf,
2434       size_t buf_len)
2435
2436              Description
                     Override the new value being written by user space to the
                     sysctl with the value provided by the program in buffer
                     buf of size buf_len.

                     buf should contain a string in the same form as provided
                     by user space on sysctl write.

                     User space may write the new value at file position > 0.
                     To override the whole sysctl value, the file position
                     should be set to zero.

              Return 0 on success.

                     -E2BIG if buf_len is too big.
2451
2452                     -EINVAL if sysctl is being read.
2453
2454       long bpf_strtol(const char *buf, size_t buf_len, u64 flags, long *res)
2455
2456              Description
2457                     Convert the initial part of the string from buffer buf of
2458                     size  buf_len  to  a  long integer according to the given
2459                     base and save the result in res.
2460
2461                     The string may begin with an arbitrary  amount  of  white
2462                     space  (as determined by isspace(3)) followed by a single
2463                     optional '-' sign.
2464
                     The five least significant bits of flags encode the base;
                     the other bits are currently unused.

                     The base must be 8, 10, 16, or 0. A base of 0 requests
                     automatic detection, similar to user space strtol(3).

              Return Number of characters consumed on success. The value is
                     positive but no more than buf_len.

                     -EINVAL if no valid digits were found or an unsupported
                     base was provided.

                     -ERANGE if the resulting value was out of range.
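
                     As a rough illustration of how the base is carried in
                     flags, the following user-space sketch extracts the five
                     least significant bits and accepts only the documented
                     bases. MODEL_BASE_MASK and model_strtox_base() are
                     hypothetical names; how the kernel treats non-zero
                     unused bits is not modelled here.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative mask covering the five least significant bits of flags. */
#define MODEL_BASE_MASK 0x1fULL

/*
 * Return the base encoded in flags (0 meaning "detect automatically"),
 * or -EINVAL if the encoded base is not one of 0, 8, 10 or 16.
 */
static int model_strtox_base(uint64_t flags)
{
        unsigned int base = (unsigned int)(flags & MODEL_BASE_MASK);

        switch (base) {
        case 0:
        case 8:
        case 10:
        case 16:
                return (int)base;
        default:
                return -EINVAL;
        }
}
```

                     A caller wanting hexadecimal parsing would therefore
                     pass 16 as flags, and 0 to let the base be detected.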
2478
2479       long bpf_strtoul(const char *buf, size_t buf_len, u64  flags,  unsigned
2480       long *res)
2481
2482              Description
2483                     Convert the initial part of the string from buffer buf of
2484                     size buf_len to an unsigned long integer according to the
2485                     given base and save the result in res.
2486
2487                     The  string  may  begin with an arbitrary amount of white
2488                     space (as determined by isspace(3)).
2489
                     The five least significant bits of flags encode the base;
                     the other bits are currently unused.

                     The base must be 8, 10, 16, or 0. A base of 0 requests
                     automatic detection, similar to user space strtoul(3).

              Return Number of characters consumed on success. The value is
                     positive but no more than buf_len.

                     -EINVAL if no valid digits were found or an unsupported
                     base was provided.

                     -ERANGE if the resulting value was out of range.
2503
2504       void *bpf_sk_storage_get(struct bpf_map *map, void  *sk,  void  *value,
2505       u64 flags)
2506
2507              Description
2508                     Get a bpf-local-storage from a sk.
2509
                     Logically, it could be thought of as getting the value
                     from a map with sk as the key. From this perspective, the
                     usage is not much different from bpf_map_lookup_elem(map,
                     &sk), except that this helper enforces that the key must
                     be a full socket and the map must be of type
                     BPF_MAP_TYPE_SK_STORAGE.

                     Underneath, the value is stored locally at sk instead of
                     in the map. The map is used as the bpf-local-storage
                     "type". The bpf-local-storage "type" (i.e. the map) is
                     searched against all bpf-local-storages residing at sk.

                     sk is a kernel struct sock pointer for LSM programs. sk
                     is a struct bpf_sock pointer for other program types.

                     The optional flag BPF_SK_STORAGE_GET_F_CREATE can be used
                     so that a new bpf-local-storage is created if one does
                     not exist. value can be used together with
                     BPF_SK_STORAGE_GET_F_CREATE to specify the initial value
                     of the bpf-local-storage. If value is NULL, the new
                     bpf-local-storage will be zero initialized.
2531
2532              Return A bpf-local-storage pointer is returned on success.
2533
2534                     NULL  if  not found or there was an error in adding a new
2535                     bpf-local-storage.
2536
2537       long bpf_sk_storage_delete(struct bpf_map *map, void *sk)
2538
2539              Description
2540                     Delete a bpf-local-storage from a sk.
2541
2542              Return 0 on success.
2543
                     -ENOENT if the bpf-local-storage cannot be found.

                     -EINVAL if sk is not a fullsock (e.g. a request_sock).
2546
2547       long bpf_send_signal(u32 sig)
2548
2549              Description
2550                     Send  signal sig to the process of the current task.  The
2551                     signal may be delivered to any of this process's threads.
2552
              Return 0 on success or if the signal was successfully queued.

                     -EBUSY if the work queue under NMI is full.

                     -EINVAL if sig is invalid.

                     -EPERM if there is no permission to send sig.

                     -EAGAIN if the bpf program can try again.
2562
2563       s64 bpf_tcp_gen_syncookie(void *sk,  void  *iph,  u32  iph_len,  struct
2564       tcphdr *th, u32 th_len)
2565
2566              Description
2567                     Try to issue a SYN cookie for the packet with correspond‐
2568                     ing IP/TCP headers, iph and th, on the  listening  socket
2569                     in sk.
2570
2571                     iph points to the start of the IPv4 or IPv6 header, while
2572                     iph_len contains sizeof(struct  iphdr)  or  sizeof(struct
2573                     ipv6hdr).
2574
2575                     th  points  to  the start of the TCP header, while th_len
2576                     contains the length of the TCP header  with  options  (at
2577                     least sizeof(struct tcphdr)).
2578
              Return On success, the lower 32 bits hold the generated SYN
                     cookie, followed by 16 bits which hold the MSS value for
                     that cookie; the top 16 bits are unused.
2582
2583                     On failure, the returned value is one of the following:
2584
2585                     -EINVAL SYN cookie cannot be issued due to error
2586
2587                     -ENOENT SYN cookie should not be issued (no SYN flood)
2588
2589                     -EOPNOTSUPP  kernel  configuration  does  not  enable SYN
2590                     cookies
2591
2592                     -EPROTONOSUPPORT IP packet version is not 4 or 6
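
                     The packed return value can be split as in the sketch
                     below, written as plain C so it can be tried outside
                     the kernel. struct model_syncookie and
                     model_decode_syncookie() are hypothetical names used
                     only for this illustration.

```c
#include <stdint.h>

/* Illustrative container for the two fields packed into the return value. */
struct model_syncookie {
        uint32_t cookie;    /* lower 32 bits */
        uint16_t mss;       /* next 16 bits */
};

/*
 * Split a bpf_tcp_gen_syncookie()-style return value. A negative value is
 * an error code and is passed through unchanged; otherwise out is filled
 * and 0 is returned.
 */
static int model_decode_syncookie(int64_t ret, struct model_syncookie *out)
{
        if (ret < 0)
                return (int)ret;

        out->cookie = (uint32_t)(ret & 0xffffffff);
        out->mss = (uint16_t)((ret >> 32) & 0xffff);
        return 0;
}
```

                     Checking for a negative value first matters, because an
                     error code would otherwise be misread as a cookie.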
2593
2594       long bpf_skb_output(void *ctx, struct bpf_map  *map,  u64  flags,  void
2595       *data, u64 size)
2596
2597              Description
2598                     Write raw data blob into a special BPF perf event held by
2599                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
2600                     event must have the following attributes: PERF_SAMPLE_RAW
2601                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
2602                     PERF_COUNT_SW_BPF_OUTPUT as config.
2603
2604                     The flags are used to indicate the index in map for which
2605                     the value must be put, masked with BPF_F_INDEX_MASK.  Al‐
2606                     ternatively, flags can be set to BPF_F_CURRENT_CPU to in‐
2607                     dicate that the index of the current CPU core  should  be
2608                     used.
2609
                     The value to write, of size bytes, is passed through the
                     eBPF stack and pointed to by data.
2612
2613                     ctx is a pointer to in-kernel struct sk_buff.
2614
2615                     This helper is similar to bpf_perf_event_output() but re‐
2616                     stricted to raw_tracepoint bpf programs.
2617
2618              Return 0 on success, or a negative error in case of failure.
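
                     The way flags selects a slot in map can be sketched in
                     user-space C as follows. The MODEL_* constants are
                     assumed to mirror BPF_F_INDEX_MASK and
                     BPF_F_CURRENT_CPU from the kernel UAPI header, and
                     model_perf_index() is a hypothetical name; treat the
                     values as assumptions and check them against the
                     header in use.

```c
#include <stdint.h>

/* Assumed to mirror BPF_F_INDEX_MASK and BPF_F_CURRENT_CPU. */
#define MODEL_F_INDEX_MASK  0xffffffffULL
#define MODEL_F_CURRENT_CPU MODEL_F_INDEX_MASK

/*
 * Resolve the map index requested through flags: either an explicit
 * index in the low 32 bits, or the index of the current CPU.
 */
static uint32_t model_perf_index(uint64_t flags, uint32_t current_cpu)
{
        uint64_t index = flags & MODEL_F_INDEX_MASK;

        if (index == MODEL_F_CURRENT_CPU)
                return current_cpu;
        return (uint32_t)index;
}
```

                     Passing the current-CPU value is the usual choice when
                     user space opens one perf event per CPU.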
2619
2620       long bpf_probe_read_user(void *dst, u32 size, const void *unsafe_ptr)
2621
2622              Description
2623                     Safely attempt to read size bytes from user space address
2624                     unsafe_ptr and store the data in dst.
2625
2626              Return 0 on success, or a negative error in case of failure.
2627
2628       long bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
2629
2630              Description
2631                     Safely attempt to read size bytes from kernel  space  ad‐
2632                     dress unsafe_ptr and store the data in dst.
2633
2634              Return 0 on success, or a negative error in case of failure.
2635
2636       long  bpf_probe_read_user_str(void  *dst,  u32  size,  const  void *un‐
2637       safe_ptr)
2638
2639              Description
2640                     Copy a NUL terminated string from an unsafe user  address
2641                     unsafe_ptr  to dst. The size should include the terminat‐
2642                     ing NUL byte. In case the string length is  smaller  than
2643                     size, the target is not padded with further NUL bytes. If
2644                     the string length is larger than size, just size-1  bytes
2645                     are copied and the last byte is set to NUL.
2646
                     On success, returns the number of bytes that were
                     written, including the terminating NUL. This makes the
                     helper useful in tracing programs for reading strings
                     and, more importantly, for getting their length at
                     runtime. See the following snippet:
2652
                        SEC("kprobe/sys_open")
                        int bpf_sys_open(struct pt_regs *ctx)
                        {
                                char buf[PATHLEN]; // PATHLEN is defined to 256
                                // ctx->di holds the first argument on x86-64.
                                int res = bpf_probe_read_user_str(buf, sizeof(buf),
                                                                  (const void *)ctx->di);

                                // Consume buf, for example push it to
                                // userspace via bpf_perf_event_output(); we
                                // can use res (the string length) as event
                                // size, after checking its boundaries.
                                return 0;
                        }
2665
                     In comparison, using the bpf_probe_read_user() helper
                     here to read the string would require estimating the
                     length at compile time, and would often result in copying
                     more memory than necessary.

                     Another useful use case is parsing individual process
                     arguments or individual environment variables by
                     navigating current->mm->arg_start and
                     current->mm->env_start: using this helper and its return
                     value, one can quickly iterate at the right offset of the
                     memory area.
2677
2678              Return On  success,  the  strictly positive length of the output
2679                     string, including the trailing NUL character. On error, a
2680                     negative value.
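
                     The copy semantics and the argument-walking use case
                     can be modelled in user space as below. This is a
                     sketch of the documented behaviour only:
                     model_read_str() and model_count_args() are
                     hypothetical names, the -EINVAL for a zero size is a
                     placeholder, and faults on unsafe addresses are not
                     modelled.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the documented string copy: at most size - 1 bytes are copied,
 * the result is NUL terminated without further padding, and the return
 * value counts the bytes written including the NUL.
 */
static long model_read_str(char *dst, uint32_t size, const char *src)
{
        uint32_t i;

        if (size == 0)
                return -EINVAL;     /* placeholder error for the model */

        for (i = 0; i < size - 1 && src[i] != '\0'; i++)
                dst[i] = src[i];
        dst[i] = '\0';
        return (long)i + 1;
}

/*
 * Count the NUL-separated strings in area[0..len), advancing by the
 * return value each time, as one would between arg_start and arg_end.
 * The area must end with a NUL byte.
 */
static int model_count_args(const char *area, size_t len)
{
        char buf[64];
        size_t off = 0;
        int n = 0;

        while (off < len) {
                long res = model_read_str(buf, sizeof(buf), area + off);

                if (res <= 0)
                        break;
                off += (size_t)res; /* skip the string and its NUL */
                n++;
        }
        return n;
}
```

                     In this model a string longer than the scratch buffer
                     is split across iterations, so a real program would
                     need to size the buffer for the longest expected
                     argument.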
2681
2682       long  bpf_probe_read_kernel_str(void  *dst,  u32  size, const void *un‐
2683       safe_ptr)
2684
2685              Description
2686                     Copy a NUL terminated string from an  unsafe  kernel  ad‐
2687                     dress   unsafe_ptr   to   dst.  Same  semantics  as  with
2688                     bpf_probe_read_user_str() apply.
2689
2690              Return On success, the strictly positive length of  the  string,
2691                     including  the  trailing NUL character. On error, a nega‐
2692                     tive value.
2693
2694       long bpf_tcp_send_ack(void *tp, u32 rcv_nxt)
2695
2696              Description
2697                     Send out a tcp-ack. tp is the in-kernel struct  tcp_sock.
2698                     rcv_nxt is the ack_seq to be sent out.
2699
2700              Return 0 on success, or a negative error in case of failure.
2701
2702       long bpf_send_signal_thread(u32 sig)
2703
2704              Description
2705                     Send  signal  sig to the thread corresponding to the cur‐
2706                     rent task.
2707
              Return 0 on success or if the signal was successfully queued.

                     -EBUSY if the work queue under NMI is full.

                     -EINVAL if sig is invalid.

                     -EPERM if there is no permission to send sig.

                     -EAGAIN if the bpf program can try again.
2717
2718       u64 bpf_jiffies64(void)
2719
2720              Description
                     Obtain the 64-bit jiffies value.

              Return The 64-bit jiffies value.
2724
2725       long  bpf_read_branch_records(struct  bpf_perf_event_data  *ctx,   void
2726       *buf, u32 size, u64 flags)
2727
2728              Description
                     For an eBPF program attached to a perf event, retrieve
                     the branch records (struct perf_branch_entry) associated
                     with ctx and store them in the buffer pointed to by buf,
                     up to size bytes.
2733
2734              Return On success, number of bytes written to buf. On  error,  a
2735                     negative value.
2736
2737                     The  flags can be set to BPF_F_GET_BRANCH_RECORDS_SIZE to
2738                     instead return the number of bytes required to store  all
2739                     the branch entries. If this flag is set, buf may be NULL.
2740
                     -EINVAL if the arguments are invalid or size is not a
                     multiple of sizeof(struct perf_branch_entry).

                     -ENOENT if the architecture does not support branch
                     records.
2745
2746       long   bpf_get_ns_current_pid_tgid(u64    dev,    u64    ino,    struct
2747       bpf_pidns_info *nsdata, u32 size)
2748
2749              Description
                     On success, the pid and tgid values as seen from the
                     current namespace are returned in nsdata.

              Return 0 on success, or one of the following in case of failure:

                     -EINVAL if the supplied dev and ino don't match the dev_t
                     and inode number of the nsfs of the current task, or if
                     the conversion of dev to dev_t lost high bits.

                     -ENOENT if the pidns does not exist for the current task.
2760
2761       long bpf_xdp_output(void *ctx, struct bpf_map  *map,  u64  flags,  void
2762       *data, u64 size)
2763
2764              Description
2765                     Write raw data blob into a special BPF perf event held by
2766                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
2767                     event must have the following attributes: PERF_SAMPLE_RAW
2768                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
2769                     PERF_COUNT_SW_BPF_OUTPUT as config.
2770
2771                     The flags are used to indicate the index in map for which
2772                     the value must be put, masked with BPF_F_INDEX_MASK.  Al‐
2773                     ternatively, flags can be set to BPF_F_CURRENT_CPU to in‐
2774                     dicate that the index of the current CPU core  should  be
2775                     used.
2776
                     The value to write, of size bytes, is passed through the
                     eBPF stack and pointed to by data.

                     ctx is a pointer to in-kernel struct xdp_buff.

                     This helper is similar to bpf_perf_event_output() but
                     restricted to raw_tracepoint bpf programs.
2784
2785              Return 0 on success, or a negative error in case of failure.
2786
2787       u64 bpf_get_netns_cookie(void *ctx)
2788
2789              Description
2790                     Retrieve the cookie (generated by the kernel) of the net‐
2791                     work namespace the input ctx is associated with. The net‐
2792                     work namespace cookie remains stable for its lifetime and
2793                     provides a global identifier that can be assumed  unique.
2794                     If  ctx  is  NULL, then the helper returns the cookie for
2795                     the initial network namespace. The cookie itself is  very
2796                     similar  to  that  of bpf_get_socket_cookie() helper, but
2797                     for network namespaces instead of sockets.
2798
              Return An 8-byte long opaque number.
2800
2801       u64 bpf_get_current_ancestor_cgroup_id(int ancestor_level)
2802
2803              Description
                     Return the id of the cgroup v2 that is the ancestor, at
                     ancestor_level, of the cgroup associated with the current
                     task. The root cgroup is at ancestor_level zero and each
                     step down the hierarchy increments the level. If
                     ancestor_level equals the level of the cgroup associated
                     with the current task, the return value is the same as
                     that of bpf_get_current_cgroup_id().

                     The helper is useful for implementing policies based on
                     cgroups that are higher in the hierarchy than the
                     immediate cgroup associated with the current task.

                     The format of the returned id and the helper limitations
                     are the same as in bpf_get_current_cgroup_id().
2818
2819              Return The  id  is returned or 0 in case the id could not be re‐
2820                     trieved.
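
                     The level numbering can be pictured with a small
                     user-space model in which the path from the root cgroup
                     to the task's cgroup is an array of ids. The function
                     name, the array representation and the example ids are
                     hypothetical; returning 0 for a level outside the path
                     follows the Return description above.

```c
#include <stdint.h>

/*
 * path[0] is the id of the root cgroup and path[depth] is the id of the
 * cgroup the task belongs to. Return the ancestor id at ancestor_level,
 * or 0 if that level does not exist on the path.
 */
static uint64_t model_ancestor_cgroup_id(const uint64_t *path, int depth,
                                         int ancestor_level)
{
        if (ancestor_level < 0 || ancestor_level > depth)
                return 0;
        return path[ancestor_level];
}
```

                     For a task in /a/b, level 0 yields the root id, level 1
                     the id of /a, and level 2 the id of /a/b itself.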
2821
2822       long bpf_sk_assign(struct sk_buff *skb, void *sk, u64 flags)
2823
2824              Description
2825                     Helper is overloaded depending on BPF program type.  This
2826                     description   applies   to   BPF_PROG_TYPE_SCHED_CLS  and
2827                     BPF_PROG_TYPE_SCHED_ACT programs.
2828
                     Assign the sk to the skb. When combined with appropriate
                     routing configuration to receive the packet towards the
                     socket, this will cause skb to be delivered to the
                     specified socket. Subsequent redirection of skb via
                     bpf_redirect(), bpf_clone_redirect() or other methods
                     outside of BPF may interfere with successful delivery to
                     the socket.

                     This operation is only valid from the TC ingress path.
2837
2838                     The flags argument must be zero.
2839
2840              Return 0 on success, or a negative error in case of failure:
2841
2842                     -EINVAL if specified flags are not supported.
2843
2844                     -ENOENT if the socket is unavailable for assignment.
2845
2846                     -ENETUNREACH if the socket is unreachable (wrong netns).
2847
2848                     -EOPNOTSUPP  if the operation is not supported, for exam‐
2849                     ple a call from outside of TC ingress.
2850
2851                     -ESOCKTNOSUPPORT if the  socket  type  is  not  supported
2852                     (reuseport).
2853
2854       long  bpf_sk_assign(struct bpf_sk_lookup *ctx, struct bpf_sock *sk, u64
2855       flags)
2856
2857              Description
2858                     Helper is overloaded depending on BPF program type.  This
2859                     description applies to BPF_PROG_TYPE_SK_LOOKUP programs.
2860
2861                     Select the sk as a result of a socket lookup.
2862
                     For the operation to succeed, the passed socket must be
                     compatible with the packet description provided by the
                     ctx object.

                     The L4 protocol (IPPROTO_TCP or IPPROTO_UDP) must be an
                     exact match, while the IP family (AF_INET or AF_INET6)
                     must be compatible; that is, IPv6 sockets that are not
                     v6-only can be selected for IPv4 packets.

                     Only TCP listeners and UDP unconnected sockets can be
                     selected. sk can also be NULL to reset any previous
                     selection.

                     The flags argument can be a combination of the following
                     values:

                     • BPF_SK_LOOKUP_F_REPLACE to override the previous socket
                       selection, potentially done by a BPF program that ran
                       before us.

                     • BPF_SK_LOOKUP_F_NO_REUSEPORT to skip load-balancing
                       within the reuseport group for the socket being
                       selected.
2884
2885                     On success ctx->sk will point to the selected socket.
2886
2887              Return 0 on success, or a negative errno in case of failure.
2888
                     • -EAFNOSUPPORT if the socket family (sk->family) is not
                       compatible with the packet family (ctx->family).

                     • -EEXIST if a socket has already been selected,
                       potentially by another program, and the
                       BPF_SK_LOOKUP_F_REPLACE flag was not specified.

                     • -EINVAL if unsupported flags were specified.

                     • -EPROTOTYPE if the socket L4 protocol (sk->protocol)
                       doesn't match the packet protocol (ctx->protocol).

                     • -ESOCKTNOSUPPORT if the socket is not in an allowed
                       state (TCP listening or UDP unconnected).
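
                     The compatibility rules and error codes listed above
                     can be summarised by the following user-space model.
                     It is a sketch under assumptions: the structures, the
                     MODEL_* flag values and the order of the checks are
                     chosen for illustration and are not taken from the
                     kernel sources.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical stand-ins for the BPF_SK_LOOKUP_F_* flags. */
#define MODEL_F_REPLACE      (1ULL << 0)
#define MODEL_F_NO_REUSEPORT (1ULL << 1)

struct model_sock {
        int family;         /* AF_INET or AF_INET6 */
        int protocol;       /* IPPROTO_TCP or IPPROTO_UDP */
        bool v6only;        /* IPv6 socket restricted to IPv6 traffic */
        bool selectable;    /* TCP listener or unconnected UDP socket */
};

struct model_lookup {
        int family;                         /* packet family */
        int protocol;                       /* packet L4 protocol */
        const struct model_sock *selected;  /* current selection, if any */
};

/* Apply the documented rules; on success record sk as the selection. */
static int model_sk_assign(struct model_lookup *ctx,
                           const struct model_sock *sk, uint64_t flags)
{
        if (flags & ~(MODEL_F_REPLACE | MODEL_F_NO_REUSEPORT))
                return -EINVAL;
        if (sk && !sk->selectable)
                return -ESOCKTNOSUPPORT;
        if (ctx->selected && !(flags & MODEL_F_REPLACE))
                return -EEXIST;
        if (sk && sk->protocol != ctx->protocol)
                return -EPROTOTYPE;
        /* A dual-stack IPv6 socket may take IPv4 packets, not vice versa. */
        if (sk && sk->family != ctx->family &&
            (sk->family == AF_INET || sk->v6only))
                return -EAFNOSUPPORT;

        ctx->selected = sk;     /* NULL resets the selection */
        return 0;
}
```

                     In this model, resetting an existing selection with a
                     NULL sk also requires the replace flag; that detail is
                     an assumption of the sketch.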
2903
2904       u64 bpf_ktime_get_boot_ns(void)
2905
2906              Description
                     Return the time elapsed since system boot, in
                     nanoseconds. This includes the time the system was
                     suspended. See: clock_gettime(CLOCK_BOOTTIME)
2910
2911              Return Current ktime.
2912
2913       long bpf_seq_printf(struct seq_file *m, const char *fmt, u32  fmt_size,
2914       const void *data, u32 data_len)
2915
2916              Description
                     bpf_seq_printf() uses seq_file seq_printf() to print out
                     the format string. m represents the seq_file. fmt and
                     fmt_size describe the format string itself. data and
                     data_len describe the format string arguments. data is a
                     u64 array, and the values corresponding to the format
                     string are stored in the array. For strings and pointers
                     where pointees are accessed, only the pointer values are
                     stored in the data array. data_len is the size of data in
                     bytes and must be a multiple of 8.

                     The formats %s and %p{i,I}{4,6} require reading kernel
                     memory. Reading kernel memory may fail either because the
                     address is invalid or because the address is valid but
                     requires a major memory fault. If reading kernel memory
                     fails, the string for %s will be an empty string, and the
                     IP address for %p{i,I}{4,6} will be 0. Not returning an
                     error to the bpf program is consistent with what
                     bpf_trace_printk() does for now.
2935
2936              Return 0 on success, or a negative error in case of failure:
2937
                     -EBUSY if the per-CPU memory copy buffer is busy; the bpf
                     program can try again by returning 1.

                     -EINVAL if arguments are invalid, or if fmt is
                     invalid/unsupported.

                     -E2BIG if fmt contains too many format specifiers.

                     -EOVERFLOW if an overflow happened: the same object will
                     be tried again.
2948
2949       long bpf_seq_write(struct seq_file *m, const void *data, u32 len)
2950
2951              Description
2952                     bpf_seq_write()  uses  seq_file  seq_write() to write the
2953                     data.  The m represents the seq_file. The  data  and  len
2954                     represent the data to write in bytes.
2955
2956              Return 0 on success, or a negative error in case of failure:
2957
2958                     -EOVERFLOW  if an overflow happened: The same object will
2959                     be tried again.
2960
2961       u64 bpf_sk_cgroup_id(void *sk)
2962
2963              Description
2964                     Return the cgroup v2 id of the socket sk.
2965
                     sk must be a non-NULL pointer to a socket, e.g. one
                     returned from bpf_sk_lookup_xxx(), bpf_sk_fullsock(), etc.
                     The format of the returned id is the same as in
                     bpf_skb_cgroup_id().
2970
2971                     This  helper is available only if the kernel was compiled
2972                     with the CONFIG_SOCK_CGROUP_DATA configuration option.
2973
2974              Return The id is returned or 0 in case the id could not  be  re‐
2975                     trieved.
2976
2977       u64 bpf_sk_ancestor_cgroup_id(void *sk, int ancestor_level)
2978
2979              Description
                     Return the id of the cgroup v2 that is the ancestor, at
                     ancestor_level, of the cgroup associated with sk. The
                     root cgroup is at ancestor_level zero and each step down
                     the hierarchy increments the level. If ancestor_level
                     equals the level of the cgroup associated with sk, the
                     return value is the same as that of bpf_sk_cgroup_id().

                     The helper is useful for implementing policies based on
                     cgroups that are higher in the hierarchy than the
                     immediate cgroup associated with sk.

                     The format of the returned id and the helper limitations
                     are the same as in bpf_sk_cgroup_id().
2993
2994              Return The  id  is returned or 0 in case the id could not be re‐
2995                     trieved.
2996
2997       long bpf_ringbuf_output(void *ringbuf, void *data, u64 size, u64 flags)
2998
2999              Description
3000                     Copy size bytes from data into a ring buffer ringbuf.  If
3001                     BPF_RB_NO_WAKEUP  is  specified in flags, no notification
3002                     of new data availability is sent.  If BPF_RB_FORCE_WAKEUP
3003                     is  specified  in  flags, notification of new data avail‐
3004                     ability is sent unconditionally.  If 0  is  specified  in
3005                     flags,  an adaptive notification of new data availability
3006                     is sent.
3007
3008                     An adaptive notification is a notification sent  whenever
3009                     the  user-space  process  has  caught up and consumed all
3010                     available payloads. In case  the  user-space  process  is
3011                     still processing a previous payload, then no notification
3012                     is needed as it will process the newly added payload  au‐
3013                     tomatically.
3014
3015              Return 0 on success, or a negative error in case of failure.
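
                     The three notification modes can be summarised by a
                     small decision function. This is a user-space sketch:
                     the MODEL_* values stand in for BPF_RB_NO_WAKEUP and
                     BPF_RB_FORCE_WAKEUP, and expressing "the consumer has
                     caught up" as a comparison of two positions is an
                     assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the BPF_RB_* wakeup flags. */
#define MODEL_RB_NO_WAKEUP    (1ULL << 0)
#define MODEL_RB_FORCE_WAKEUP (1ULL << 1)

/*
 * Decide whether a notification is sent for a newly added record.
 * consumer_pos is how far user space has consumed; record_pos is where
 * the new record starts.
 */
static bool model_should_notify(uint64_t flags, uint64_t consumer_pos,
                                uint64_t record_pos)
{
        if (flags & MODEL_RB_FORCE_WAKEUP)
                return true;            /* unconditional notification */
        if (flags & MODEL_RB_NO_WAKEUP)
                return false;           /* notification suppressed */

        /* Adaptive: notify only if the consumer had already caught up. */
        return consumer_pos == record_pos;
}
```

                     A consumer that is still behind will reach the new
                     record on its own, which is why the adaptive mode stays
                     silent in that case.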
3016
3017       void *bpf_ringbuf_reserve(void *ringbuf, u64 size, u64 flags)
3018
3019              Description
3020                     Reserve  size  bytes of payload in a ring buffer ringbuf.
3021                     flags must be 0.
3022
3023              Return Valid pointer with size bytes of memory available;  NULL,
3024                     otherwise.
3025
3026       void bpf_ringbuf_submit(void *data, u64 flags)
3027
3028              Description
3029                     Submit  reserved  ring buffer sample, pointed to by data.
3030                     If BPF_RB_NO_WAKEUP is specified in flags,  no  notifica‐
3031                     tion    of   new   data   availability   is   sent.    If
3032                     BPF_RB_FORCE_WAKEUP is specified in  flags,  notification
3033                     of  new  data availability is sent unconditionally.  If 0
3034                     is specified in flags, an adaptive  notification  of  new
3035                     data availability is sent.
3036
3037                     See 'bpf_ringbuf_output()' for the definition of adaptive
3038                     notification.
3039
3040              Return Nothing. Always succeeds.
3041
3042       void bpf_ringbuf_discard(void *data, u64 flags)
3043
3044              Description
3045                     Discard reserved ring buffer sample, pointed to by  data.
3046                     If  BPF_RB_NO_WAKEUP  is specified in flags, no notifica‐
3047                     tion   of   new   data   availability   is   sent.     If
3048                     BPF_RB_FORCE_WAKEUP  is  specified in flags, notification
3049                     of new data availability is sent unconditionally.   If  0
3050                     is  specified  in  flags, an adaptive notification of new
3051                     data availability is sent.
3052
3053                     See 'bpf_ringbuf_output()' for the definition of adaptive
3054                     notification.
3055
3056              Return Nothing. Always succeeds.
3057
3058       u64 bpf_ringbuf_query(void *ringbuf, u64 flags)
3059
3060              Description
3061                     Query  various  characteristics  of provided ring buffer.
3062                     What exactly is queried is determined by flags:
3063
3064                     • BPF_RB_AVAIL_DATA: Amount of data not yet consumed.
3065
3066                     • BPF_RB_RING_SIZE: The size of ring buffer.
3067
3068                     • BPF_RB_CONS_POS: Consumer position (can wrap around).
3069
3070                     • BPF_RB_PROD_POS:   Producer(s)   position   (can   wrap
3071                       around).
3072
3073                     Data returned is just a momentary snapshot of actual val‐
3074                     ues and could be inaccurate, so this facility  should  be
3075                     used  to  power heuristics and for reporting, not to make
3076                     100% correct calculation.
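
                     Because the positions can wrap around, a difference
                     between two snapshots is best computed with unsigned
                     arithmetic. The sketch below assumes the positions are
                     byte counters that only grow (an assumption, not stated
                     above); all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes produced but not yet consumed, given two position snapshots.
 * Unsigned subtraction is defined modulo 2^64, so the result stays
 * correct even after the producer position has wrapped past zero.
 */
uint64_t rb_unconsumed(uint64_t prod_pos, uint64_t cons_pos)
{
    return prod_pos - cons_pos;
}

/* Snapshots are only hints: clamp the result to the ring size. */
uint64_t rb_unconsumed_clamped(uint64_t prod_pos, uint64_t cons_pos,
                               uint64_t ring_size)
{
    uint64_t avail = rb_unconsumed(prod_pos, cons_pos);

    return avail > ring_size ? ring_size : avail;
}

/* Usage example with the expected outcomes spelled out. */
void rb_positions_example(void)
{
    assert(rb_unconsumed(4096, 1024) == 3072);
    /* Producer has wrapped; consumer is 3 bytes below 2^64. */
    assert(rb_unconsumed(5, UINT64_MAX - 2) == 8);
    /* Inconsistent snapshot (consumer ahead): clamped to ring size. */
    assert(rb_unconsumed_clamped(100, 200, 4096) == 4096);
}
```

                     The clamp reflects the advice above: the values are
                     momentary snapshots and should feed heuristics only.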
3077
3078              Return Requested value, or 0, if flags are not recognized.
3079
3080       long bpf_csum_level(struct sk_buff *skb, u64 level)
3081
3082              Description
3083                     Change the skb's checksum level by one layer up or down,
3084                     or  reset  it entirely to none in order to have the stack
3085                     perform checksum validation. The level is  applicable  to
3086                     the  following  protocols: TCP, UDP, GRE, SCTP, FCOE. For
3087                     example, a decap of | ETH | IP | UDP | GUE | IP |  TCP  |
3088                     into  |  ETH  |  IP | TCP | through bpf_skb_adjust_room()
3089                     helper with passing in BPF_F_ADJ_ROOM_NO_CSUM_RESET  flag
3090                     would   require   one   call   to  bpf_csum_level()  with
3091                     BPF_CSUM_LEVEL_DEC since the UDP header is removed. Simi‐
3092                     larly,  an  encap  of the latter into the former could be
3093                     accompanied by a helper  call  to  bpf_csum_level()  with
3094                     BPF_CSUM_LEVEL_INC  if  the  skb  is still intended to be
3095                     processed in higher layers of the stack instead  of  just
3096                     egressing at tc.
3097
3098                     There are four supported level settings at this time:
3099
3100                     • BPF_CSUM_LEVEL_INC:  Increases skb->csum_level for skbs
3101                       with CHECKSUM_UNNECESSARY.
3102
3103                     • BPF_CSUM_LEVEL_DEC: Decreases skb->csum_level for  skbs
3104                       with CHECKSUM_UNNECESSARY.
3105
3106                     • BPF_CSUM_LEVEL_RESET:  Resets  skb->csum_level to 0 and
3107                       sets CHECKSUM_NONE to force checksum validation by  the
3108                       stack.
3109
3110                     • BPF_CSUM_LEVEL_QUERY:   No-op,   returns   the  current
3111                       skb->csum_level.
3112
3113              Return 0 on success, or a negative error in case of failure.  In
3114                     the    case    of   BPF_CSUM_LEVEL_QUERY,   the   current
3115                     skb->csum_level is returned or the error code -EACCES  in
3116                     case the skb is not subject to CHECKSUM_UNNECESSARY.
3117
3118       struct tcp6_sock *bpf_skc_to_tcp6_sock(void *sk)
3119
3120              Description
3121                     Dynamically cast a sk pointer to a tcp6_sock pointer.
3122
3123              Return sk if casting is valid, or NULL otherwise.
3124
3125       struct tcp_sock *bpf_skc_to_tcp_sock(void *sk)
3126
3127              Description
3128                     Dynamically cast a sk pointer to a tcp_sock pointer.
3129
3130              Return sk if casting is valid, or NULL otherwise.
3131
3132       struct tcp_timewait_sock *bpf_skc_to_tcp_timewait_sock(void *sk)
3133
3134              Description
3135                     Dynamically  cast  a  sk  pointer  to a tcp_timewait_sock
3136                     pointer.
3137
3138              Return sk if casting is valid, or NULL otherwise.
3139
3140       struct tcp_request_sock *bpf_skc_to_tcp_request_sock(void *sk)
3141
3142              Description
3143                     Dynamically cast  a  sk  pointer  to  a  tcp_request_sock
3144                     pointer.
3145
3146              Return sk if casting is valid, or NULL otherwise.
3147
3148       struct udp6_sock *bpf_skc_to_udp6_sock(void *sk)
3149
3150              Description
3151                     Dynamically cast a sk pointer to a udp6_sock pointer.
3152
3153              Return sk if casting is valid, or NULL otherwise.
3154
3155       long  bpf_get_task_stack(struct task_struct *task, void *buf, u32 size,
3156       u64 flags)
3157
3158              Description
3159                     Return a user or a kernel stack in bpf  program  provided
3160                     buffer.  To achieve this, the helper needs task, which is
3161                     a valid pointer  to  struct  task_struct.  To  store  the
3162                     stacktrace,  the bpf program provides buf with a nonnega‐
3163                     tive size.
3164
3165                     The last argument,  flags,  holds  the  number  of  stack
3166                     frames   to   skip   (from   0   to   255),  masked  with
3167                     BPF_F_SKIP_FIELD_MASK. The next bits can be used  to  set
3168                     the following flags:
3169
3170                     BPF_F_USER_STACK
3171                            Collect  a  user  space  stack instead of a kernel
3172                            stack.
3173
3174                     BPF_F_USER_BUILD_ID
3175                            Collect buildid+offset instead  of  ips  for  user
3176                            stack,  only  valid  if  BPF_F_USER_STACK  is also
3177                            specified.
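
                     The layout of flags (skip count in the low bits, option
                     bits above) can be sketched as follows. The constants
                     are local stand-ins whose values are assumed to match
                     linux/bpf.h and should be checked against your headers;
                     stack_flags_build() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror linux/bpf.h; verify against your headers. */
#define MODEL_SKIP_FIELD_MASK 0xffULL
#define MODEL_USER_STACK      (1ULL << 8)
#define MODEL_USER_BUILD_ID   (1ULL << 11)

/*
 * Compose a flags value from a frame skip count and the two options.
 * Returns false for combinations the text rules out.
 */
bool stack_flags_build(unsigned int skip, bool user_stack, bool build_id,
                       uint64_t *out)
{
    uint64_t flags;

    assert(out != NULL);
    if (skip > MODEL_SKIP_FIELD_MASK)
        return false;               /* skip count must be 0..255 */
    if (build_id && !user_stack)
        return false;               /* build ID needs a user stack */

    flags = skip;
    if (user_stack)
        flags |= MODEL_USER_STACK;
    if (build_id)
        flags |= MODEL_USER_BUILD_ID;
    *out = flags;
    return true;
}

/* Usage example with the expected outcomes spelled out. */
void stack_flags_example(void)
{
    uint64_t flags = 0;

    assert(stack_flags_build(2, true, false, &flags) && flags == 0x102);
    assert(stack_flags_build(0, true, true, &flags) && flags == 0x900);
    assert(!stack_flags_build(256, false, false, &flags));
    assert(!stack_flags_build(0, false, true, &flags));
}
```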
3178
3179                     bpf_get_task_stack()     can      collect      up      to
3180                     PERF_MAX_STACK_DEPTH both kernel and user frames, subject
3181                     to a sufficiently large buffer size. Note that this limit can
3182                     be controlled with the sysctl program, and that it should
3183                     be manually increased  in  order  to  profile  long  user
3184                     stacks (such as stacks for Java programs). To do so, use:
3185
3186                        # sysctl kernel.perf_event_max_stack=<new value>
3187
3188              Return The  non-negative copied buf length equal to or less than
3189                     size on success, or a negative error in case of failure.
3190
3191       long bpf_load_hdr_opt(struct bpf_sock_ops *skops,  void  *searchby_res,
3192       u32 len, u64 flags)
3193
3194              Description
3195                     Load  header  option.   Support  reading a particular TCP
3196                     header option for bpf program (BPF_PROG_TYPE_SOCK_OPS).
3197
3198                     If flags is  0,  it  will  search  the  option  from  the
3199                     skops->skb_data.   The comment in struct bpf_sock_ops has
3200                     details  on  what  skb_data  contains   under   different
3201                     skops->op.
3202
3203                     The  first  byte  of  the searchby_res specifies the kind
3204                     that it wants to search.
3205
3206                     If the kind being searched for is an experimental kind
3207                     (i.e. 253 or 254 according to RFC6994), searchby_res must
3208                     also specify the "magic", which is either 2 bytes or 4
3209                     bytes long. The size of the magic is conveyed through the
3210                     2nd byte, which is the "kind-length" of a TCP header
3211                     option. As in a normal TCP header option, the
3212                     "kind-length" also counts the first 2 bytes, "kind" and
3213                     "kind-length" themselves.
3214
3215                     For  example, to search experimental kind 254 with 2 byte
3216                     magic 0xeB9F, the searchby_res should be [ 254, 4,  0xeB,
3217                     0x9F, 0, 0, .... 0 ].
3218
3219                     To  search  for the standard window scale option (3), the
3220                     searchby_res should be [  3,  0,  0,  ....  0  ].   Note,
3221                     kind-length must be 0 for regular option.
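
                     The two searchby_res layouts above can be built with a
                     pair of small functions. This is a plain C sketch that
                     only prepares the buffer; the function names are
                     hypothetical and no BPF helper is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Regular option: kind, then kind-length 0, rest zeroed. */
int search_regular(uint8_t *buf, size_t len, uint8_t kind)
{
    assert(buf != NULL);
    if (len < 2)
        return -1;                  /* minimal header option size */
    memset(buf, 0, len);
    buf[0] = kind;
    return 0;
}

/* Experimental option: kind, kind-length, then a 2 or 4 byte magic. */
int search_experimental(uint8_t *buf, size_t len, uint8_t kind,
                        const uint8_t *magic, size_t magic_len)
{
    assert(buf != NULL && magic != NULL);
    if (kind != 253 && kind != 254)
        return -1;
    if (magic_len != 2 && magic_len != 4)
        return -1;
    if (len < 2 + magic_len)
        return -1;
    memset(buf, 0, len);
    buf[0] = kind;
    buf[1] = (uint8_t)(2 + magic_len);  /* counts kind and kind-length */
    memcpy(buf + 2, magic, magic_len);
    return 0;
}

/* Reproduces the two examples given in the text. */
void search_example(void)
{
    const uint8_t magic[2] = { 0xeB, 0x9F };
    const uint8_t want_exp[8] = { 254, 4, 0xeB, 0x9F, 0, 0, 0, 0 };
    const uint8_t want_reg[8] = { 3, 0, 0, 0, 0, 0, 0, 0 };
    uint8_t buf[8];

    assert(search_experimental(buf, sizeof(buf), 254, magic, 2) == 0);
    assert(memcmp(buf, want_exp, sizeof(buf)) == 0);
    assert(search_regular(buf, sizeof(buf), 3) == 0);
    assert(memcmp(buf, want_reg, sizeof(buf)) == 0);
}
```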
3222
3223                     Searching for End-of-Option-List (0) and No-Op (1) is
3224                     not supported.
3225
3226                     len must be at least 2 bytes which is the minimal size of
3227                     a header option.
3228
3229                     Supported flags:
3230
3231                     • BPF_LOAD_HDR_OPT_TCP_SYN  to  search from the saved_syn
3232                       packet or the just-received syn packet.
3233
3234              Return >  0  when  found,  the  header  option  is   copied   to
3235                     searchby_res.   The  return  value  is  the  total length
3236                     copied. On failure, a negative error code is returned:
3237
3238                     -EINVAL if a parameter is invalid.
3239
3240                     -ENOMSG if the option is not found.
3241
3242                     -ENOENT   if   no   syn   packet   is   available    when
3243                     BPF_LOAD_HDR_OPT_TCP_SYN is used.
3244
3245                     -ENOSPC if there is not enough space.  Only len number of
3246                     bytes are copied.
3247
3248                     -EFAULT on failure to parse the  header  options  in  the
3249                     packet.
3250
3251                     -EPERM  if  the  helper  cannot be used under the current
3252                     skops->op.
3253
3254       long bpf_store_hdr_opt(struct bpf_sock_ops *skops,  const  void  *from,
3255       u32 len, u64 flags)
3256
3257              Description
3258                     Store header option.  The data will be copied from buffer
3259                     from with length len to the TCP header.
3260
3261                     The buffer from should have the  whole  option  that  in‐
3262                     cludes the kind, kind-length, and the actual option data.
3263                     The  len  must  be  at  least  kind-length   long.    The
3264                     kind-length does not have to be 4 byte aligned.  The ker‐
3265                     nel will take care of the padding and setting the 4 bytes
3266                     aligned value to th->doff.
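
                     As an illustration of the option layout and of the
                     4-byte padding mentioned above, the sketch below builds
                     a whole option in a buffer and computes the aligned
                     size. The function names are hypothetical, and the
                     rounding formula is the usual round-up-to-multiple-of-4
                     idiom, assumed here rather than taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write kind, kind-length and data into buf.
 * Returns the kind-length, or 0 if the option does not fit.
 */
size_t opt_build(uint8_t *buf, size_t cap, uint8_t kind,
                 const uint8_t *data, size_t data_len)
{
    size_t kind_len = 2 + data_len;

    assert(buf != NULL);
    if (kind_len > 255 || kind_len > cap)
        return 0;
    buf[0] = kind;
    buf[1] = (uint8_t)kind_len;
    if (data_len > 0)
        memcpy(buf + 2, data, data_len);
    return kind_len;
}

/* Space the option occupies once padded to a 4-byte boundary. */
size_t opt_padded_len(size_t kind_len)
{
    return (kind_len + 3) & ~(size_t)3;
}

/* Kind 254, 2-byte magic 0xeB9F, one byte of payload. */
void opt_example(void)
{
    const uint8_t data[3] = { 0xeB, 0x9F, 0x01 };
    uint8_t buf[8] = { 0 };
    size_t len = opt_build(buf, sizeof(buf), 254, data, sizeof(data));

    assert(len == 5);
    assert(buf[0] == 254 && buf[1] == 5 && buf[4] == 0x01);
    assert(opt_padded_len(len) == 8);
}
```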
3267
3268                     This helper will check for duplicated option by searching
3269                     the same option in the outgoing skb.
3270
3271                     This    helper    can    only    be     called     during
3272                     BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
3273
3274              Return 0 on success, or negative error in case of failure:
3275
3276                     -EINVAL if a parameter is invalid.
3277
3278                     -ENOSPC  if  there  is  not  enough  space in the header.
3279                     Nothing has been written.
3280
3281                     -EEXIST if the option already exists.
3282
3283                     -EFAULT on failure to parse the existing header options.
3284
3285                     -EPERM if the helper cannot be  used  under  the  current
3286                     skops->op.
3287
3288       long  bpf_reserve_hdr_opt(struct  bpf_sock_ops  *skops,  u32  len,  u64
3289       flags)
3290
3291              Description
3292                     Reserve len bytes for the bpf header option.   The  space
3293                     will    be   used   by   bpf_store_hdr_opt()   later   in
3294                     BPF_SOCK_OPS_WRITE_HDR_OPT_CB.
3295
3296                     If bpf_reserve_hdr_opt() is called  multiple  times,  the
3297                     total number of bytes will be reserved.
3298
3299                     This     helper     can    only    be    called    during
3300                     BPF_SOCK_OPS_HDR_OPT_LEN_CB.
3301
3302              Return 0 on success, or negative error in case of failure:
3303
3304                     -EINVAL if a parameter is invalid.
3305
3306                     -ENOSPC if there is not enough space in the header.
3307
3308                     -EPERM if the helper cannot be  used  under  the  current
3309                     skops->op.
3310
3311       void  *bpf_inode_storage_get(struct  bpf_map  *map,  void  *inode, void
3312       *value, u64 flags)
3313
3314              Description
3315                     Get a bpf_local_storage from an inode.
3316
3317                     Logically, it could be thought of as  getting  the  value
3318                     from a map with inode as the key.  From this perspective,
3319                     the    usage    is    not     much     different     from
3320                     bpf_map_lookup_elem(map,  &inode)  except this helper en‐
3321                     forces the key must be an inode and the map must also  be
3322                     a BPF_MAP_TYPE_INODE_STORAGE.
3323
3324                     Underneath,  the value is stored locally at inode instead
3325                     of the map.  The map is  used  as  the  bpf-local-storage
3326                     "type".  The  bpf-local-storage  "type" (i.e. the map) is
3327                     searched against all bpf_local_storage residing at inode.
3328
3329                     An optional flags (BPF_LOCAL_STORAGE_GET_F_CREATE) can be
3330                     used such that a new bpf_local_storage will be created if
3331                     one does not exist.  value  can  be  used  together  with
3332                     BPF_LOCAL_STORAGE_GET_F_CREATE  to  specify  the  initial
3333                     value of a bpf_local_storage.  If value is NULL, the  new
3334                     bpf_local_storage will be zero initialized.
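
                     The lookup-or-create behaviour described above can be
                     modelled in user space. The sketch keeps the storages
                     on the owning object and uses the map pointer only as
                     the "type" to search for. All names, the slot limit and
                     the value size are hypothetical, and the flag value is
                     an assumed stand-in for BPF_LOCAL_STORAGE_GET_F_CREATE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_F_CREATE     1ULL   /* assumed stand-in for the flag */
#define MODEL_MAX_STORAGES 4      /* arbitrary limit for the model */
#define MODEL_VALUE_SIZE   8

struct model_storage {
    const void *map;                          /* the storage "type" */
    unsigned char value[MODEL_VALUE_SIZE];
};

/* Stands in for the inode: storages live on the object itself. */
struct model_owner {
    struct model_storage slots[MODEL_MAX_STORAGES];
    size_t used;
};

void *model_storage_get(const void *map, struct model_owner *owner,
                        const void *init, unsigned long long flags)
{
    struct model_storage *s;
    size_t i;

    assert(map != NULL && owner != NULL);
    for (i = 0; i < owner->used; i++)
        if (owner->slots[i].map == map)
            return owner->slots[i].value;     /* found */

    if (!(flags & MODEL_F_CREATE) || owner->used == MODEL_MAX_STORAGES)
        return NULL;                          /* not found, or error */

    s = &owner->slots[owner->used++];
    s->map = map;
    if (init)
        memcpy(s->value, init, MODEL_VALUE_SIZE);
    else
        memset(s->value, 0, MODEL_VALUE_SIZE); /* zero initialized */
    return s->value;
}

/* Usage example with the expected outcomes spelled out. */
void model_storage_example(void)
{
    struct model_owner inode = { 0 };
    int map_a = 0, map_b = 0;                 /* identities only */
    unsigned char *v;

    assert(model_storage_get(&map_a, &inode, NULL, 0) == NULL);
    v = model_storage_get(&map_a, &inode, NULL, MODEL_F_CREATE);
    assert(v != NULL && v[0] == 0);
    assert(model_storage_get(&map_a, &inode, NULL, 0) == v);
    assert(model_storage_get(&map_b, &inode, NULL, 0) == NULL);
}
```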
3335
3336              Return A bpf_local_storage pointer is returned on success.
3337
3338                     NULL  if  not found or there was an error in adding a new
3339                     bpf_local_storage.
3340
3341       int bpf_inode_storage_delete(struct bpf_map *map, void *inode)
3342
3343              Description
3344                     Delete a bpf_local_storage from an inode.
3345
3346              Return 0 on success.
3347
3348                     -ENOENT if the bpf_local_storage cannot be found.
3349
3350       long bpf_d_path(struct path *path, char *buf, u32 sz)
3351
3352              Description
3353                     Return full path for  given  struct  path  object,  which
3354                     needs  to  be the kernel BTF path object. The path is re‐
3355                     turned in the provided buffer buf of size sz and is  zero
3356                     terminated.
3357
3358              Return On  success,  the strictly positive length of the string,
3359                     including the trailing NUL character. On error,  a  nega‐
3360                     tive value.
3361
3362       long bpf_copy_from_user(void *dst, u32 size, const void *user_ptr)
3363
3364              Description
3365                     Read  size  bytes  from  user  space address user_ptr and
3366                     store  the  data  in  dst.   This   is   a   wrapper   of
3367                     copy_from_user().
3368
3369              Return 0 on success, or a negative error in case of failure.
3370
3371       long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32
3372       btf_ptr_size, u64 flags)
3373
3374              Description
3375                     Use BTF to store a string representation of  ptr->ptr  in
3376                     str,  using  ptr->type_id.  This value should specify the
3377                     type     that     ptr->ptr      points      to.      LLVM
3378                     __builtin_btf_type_id(type, 1) can be used to look up vm‐
3379                     linux BTF type ids. Traversing the data  structure  using
3380                     BTF,  the  type  information and values are stored in the
3381                     first str_size - 1  bytes  of  str.   Safe  copy  of  the
3382                     pointer  data is carried out to avoid kernel crashes dur‐
3383                     ing operation.  Smaller types can use string space on the
3384                     stack;  larger  programs  can  use  map data to store the
3385                     string representation.
3386
3387                     The string can be subsequently shared with userspace  via
3388                     bpf_perf_event_output()   or   ring   buffer  interfaces.
3389                     bpf_trace_printk() is to be  avoided  as  it  places  too
3390                     small a limit on string size to be useful.
3391
3392                     flags is a combination of
3393
3394                     BTF_F_COMPACT
3395                            no formatting around type information
3396
3397                     BTF_F_NONAME
3398                            no struct/union member names/types
3399
3400                     BTF_F_PTR_RAW
3401                            show raw (unobfuscated) pointer values; equivalent
3402                            to printk specifier %px.
3403
3404                     BTF_F_ZERO
3405                            show zero-valued struct/union  members;  they  are
3406                            not displayed by default
3407
3408              Return The number of bytes that were written (or would have been
3409                     written if output had  to  be  truncated  due  to  string
3410                     size), or a negative error in case of failure.
3411
3412       long  bpf_seq_printf_btf(struct  seq_file  *m, struct btf_ptr *ptr, u32
3413       ptr_size, u64 flags)
3414
3415              Description
3416                     Use BTF to write to seq_write a string representation  of
3417                     ptr->ptr,  using  ptr->type_id as per bpf_snprintf_btf().
3418                     flags are identical to those used for bpf_snprintf_btf().
3419
3420              Return 0 on success or a negative error in case of failure.
3421
3422       u64 bpf_skb_cgroup_classid(struct sk_buff *skb)
3423
3424              Description
3425                     See bpf_get_cgroup_classid() for  the  main  description.
3426                     This helper differs from bpf_get_cgroup_classid() in that
3427                     the cgroup v1 net_cls class is retrieved  only  from  the
3428                     skb's associated socket instead of the current process.
3429
3430              Return The  id  is returned or 0 in case the id could not be re‐
3431                     trieved.
3432
3433       long bpf_redirect_neigh(u32 ifindex,  struct  bpf_redir_neigh  *params,
3434       int plen, u64 flags)
3435
3436              Description
3437                     Redirect  the  packet  to  another  net  device  of index
3438                     ifindex and fill in L2 addresses from neighboring subsys‐
3439                     tem.  This  helper is somewhat similar to bpf_redirect(),
3440                     except that it populates L2 addresses as  well,  meaning,
3441                     internally,  the helper relies on the neighbor lookup for
3442                     the L2 address of the nexthop.
3443
3444                     The helper will perform a FIB lookup based on  the  skb's
3445                     networking header to get the address of the next hop, un‐
3446                     less this is supplied by the caller in the  params  argu‐
3447                     ment.  The  plen argument indicates the len of params and
3448                     should be set to 0 if params is NULL.
3449
3450                     The flags argument is reserved and must be 0. The  helper
3451                     is currently only supported for tc BPF program types, and
3452                     enabled for IPv4 and IPv6 protocols.
3453
3454              Return The  helper  returns  TC_ACT_REDIRECT   on   success   or
3455                     TC_ACT_SHOT on error.
3456
3457       void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu)
3458
3459              Description
3460                     Take a pointer to a percpu ksym, percpu_ptr, and return a
3461                     pointer to the percpu kernel variable on cpu. A  ksym  is
3462                     an  extern  variable  decorated  with '__ksym'. For ksym,
3463                     there is a global var (either static or  global)  defined
3464                     of the same name in the kernel. The ksym is percpu if the
3465                     global var is percpu.  The returned pointer points to the
3466                     global percpu var on cpu.
3467
3468                     bpf_per_cpu_ptr() has the same semantics as per_cpu_ptr()
3469                     in the kernel, except that bpf_per_cpu_ptr()  may  return
3470                     NULL.  This happens if cpu is greater than or equal to
3471                     nr_cpu_ids. The caller of bpf_per_cpu_ptr() must check the
3472                     returned value.
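
                     The need to check the result can be shown with a toy
                     model in which a per-CPU variable is a plain array
                     indexed by CPU number. The names and the CPU count are
                     hypothetical; real per-CPU storage is not an array.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPU_IDS 4u   /* arbitrary CPU count for the model */

/* Return the slot for cpu, or NULL for an out-of-range cpu. */
long *model_per_cpu_ptr(long *percpu_base, unsigned int cpu)
{
    assert(percpu_base != NULL);
    if (cpu >= MODEL_NR_CPU_IDS)
        return NULL;
    return &percpu_base[cpu];
}

/* Add up a per-CPU counter, checking every returned pointer. */
long model_percpu_sum(long *percpu_base, unsigned int cpus_to_visit)
{
    long sum = 0;
    unsigned int cpu;

    for (cpu = 0; cpu < cpus_to_visit; cpu++) {
        long *p = model_per_cpu_ptr(percpu_base, cpu);

        if (!p)
            break;            /* invalid cpu: stop instead of crashing */
        sum += *p;
    }
    return sum;
}

/* Usage example with the expected outcomes spelled out. */
void model_percpu_example(void)
{
    long counters[MODEL_NR_CPU_IDS] = { 1, 2, 3, 4 };

    assert(model_per_cpu_ptr(counters, 3) == &counters[3]);
    assert(model_per_cpu_ptr(counters, 4) == NULL);
    assert(model_percpu_sum(counters, 8) == 10);
}
```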
3473
3474              Return A  pointer pointing to the kernel percpu variable on cpu,
3475                     or NULL, if cpu is invalid.
3476
3477       void *bpf_this_cpu_ptr(const void *percpu_ptr)
3478
3479              Description
3480                     Take a pointer to a percpu ksym, percpu_ptr, and return a
3481                     pointer  to  the  percpu kernel variable on this cpu. See
3482                     the description of 'ksym' in bpf_per_cpu_ptr().
3483
3484                     bpf_this_cpu_ptr()   has   the    same    semantics    as
3485                     this_cpu_ptr() in the kernel. Unlike bpf_per_cpu_ptr(),
3486                     it never returns NULL.
3487
3488              Return A pointer pointing to the kernel percpu variable on  this
3489                     cpu.
3490
3491       long bpf_redirect_peer(u32 ifindex, u64 flags)
3492
3493              Description
3494                     Redirect  the  packet  to  another  net  device  of index
3495                     ifindex.  This helper is somewhat  similar  to  bpf_redi‐
3496                     rect(),  except  that  the  redirection  happens  to  the
3497                     ifindex's peer device and the netns switch  takes  place
3498                     from  ingress  to ingress without going through the CPU's
3499                     backlog queue.
3500
3501                     The flags argument is reserved and must be 0. The  helper
3502                     is  currently  only supported for tc BPF program types at
3503                     the ingress hook and for veth device types. The peer  de‐
3504                     vice must reside in a different network namespace.
3505
3506              Return The   helper   returns   TC_ACT_REDIRECT  on  success  or
3507                     TC_ACT_SHOT on error.
3508
3509       void  *bpf_task_storage_get(struct  bpf_map  *map,  struct  task_struct
3510       *task, void *value, u64 flags)
3511
3512              Description
3513                     Get a bpf_local_storage from the task.
3514
3515                     Logically,  it  could  be thought of as getting the value
3516                     from a map with task as the key.  From this  perspective,
3517                     the     usage     is     not    much    different    from
3518                     bpf_map_lookup_elem(map, &task) except  this  helper  en‐
3519                     forces  the  key  must  be a task_struct and the map must
3520                     also be a BPF_MAP_TYPE_TASK_STORAGE.
3521
3522                     Underneath, the value is stored locally at  task  instead
3523                     of  the  map.   The  map is used as the bpf-local-storage
3524                     "type". The bpf-local-storage "type" (i.e.  the  map)  is
3525                     searched against all bpf_local_storage residing at task.
3526
3527                     An optional flags (BPF_LOCAL_STORAGE_GET_F_CREATE) can be
3528                     used such that a new bpf_local_storage will be created if
3529                     one  does  not  exist.   value  can be used together with
3530                     BPF_LOCAL_STORAGE_GET_F_CREATE  to  specify  the  initial
3531                     value  of a bpf_local_storage.  If value is NULL, the new
3532                     bpf_local_storage will be zero initialized.
3533
3534              Return A bpf_local_storage pointer is returned on success.
3535
3536                     NULL if not found or there was an error in adding  a  new
3537                     bpf_local_storage.
3538
3539       long  bpf_task_storage_delete(struct  bpf_map  *map, struct task_struct
3540       *task)
3541
3542              Description
3543                     Delete a bpf_local_storage from a task.
3544
3545              Return 0 on success.
3546
3547                     -ENOENT if the bpf_local_storage cannot be found.
3548
3549       struct task_struct *bpf_get_current_task_btf(void)
3550
3551              Description
3552                     Return a BTF pointer to the "current" task.  This pointer
3553                     can   also   be   used   in   helpers   that   accept  an
3554                     ARG_PTR_TO_BTF_ID of type task_struct.
3555
3556              Return Pointer to the current task.
3557
3558       long bpf_bprm_opts_set(struct linux_binprm *bprm, u64 flags)
3559
3560              Description
3561                     Set or clear certain options on bprm:
3562
3563                     BPF_F_BPRM_SECUREEXEC Set the secureexec bit  which  sets
3564                     the  AT_SECURE  auxv for glibc. The bit is cleared if the
3565                     flag is not specified.
3566
3567              Return -EINVAL if invalid flags are passed, zero otherwise.
3568
3569       u64 bpf_ktime_get_coarse_ns(void)
3570
3571              Description
3572                     Return a coarse-grained version of the time elapsed since
3573                     system  boot,  in  nanoseconds. Does not include time the
3574                     system was suspended.
3575
3576                     See: clock_gettime(CLOCK_MONOTONIC_COARSE)
3577
3578              Return Current ktime.
3579
3580       long bpf_ima_inode_hash(struct inode *inode, void *dst, u32 size)
3581
3582              Description
3583                     Returns the stored IMA hash of the inode (if it's  avail‐
3584                     able).   If  the hash is larger than size, then only size
3585                     bytes will be copied to dst.
3586
3587              Return The hash_algo is returned on success, -EOPNOTSUPP if IMA
3588                     is disabled or -EINVAL if invalid arguments are passed.
3589
3590       struct socket *bpf_sock_from_file(struct file *file)
3591
3592              Description
3593                     If  the given file represents a socket, returns the asso‐
3594                     ciated socket.
3595
3596              Return A pointer to a struct socket on success or  NULL  if  the
3597                     file is not a socket.
3598
3599       long  bpf_check_mtu(void *ctx, u32 ifindex, u32 *mtu_len, s32 len_diff,
3600       u64 flags)
3601
3602              Description
3603                     Check packet size against exceeding  MTU  of  net  device
3604                     (based  on  ifindex).  This helper will likely be used in
3605                     combination with helpers that  adjust/change  the  packet
3606                     size.
3607
3608                     The  argument  len_diff  can  be used for querying with a
3609                     planned size change. This allows checking the MTU prior to
3610                     changing packet ctx. Providing a len_diff adjustment that
3611                     is larger than the actual packet size (resulting in nega‐
3612                     tive  packet  size) will in principle not exceed the MTU,
3613                     which is why it is not considered a failure.   Other  BPF
3614                     helpers  are  needed  for  performing  the  planned  size
3615                     change; therefore the responsibility for catching a nega‐
3616                     tive packet size belongs in those helpers.
3617
3618                     Specifying  ifindex zero means the MTU check is performed
3619                     against the current net device.   This  is  practical  if
3620                     the helper isn't used prior to a redirect.
3621
3622                     On input, mtu_len must be a valid pointer, or the verifier
3623                     will reject the BPF program. If the value at mtu_len is
3624                     initialized to zero, the ctx packet size is used. When a
3625                     value is provided in mtu_len as input, it specifies the L3
3626                     length that the MTU check is done against. Remember that
3627                     XDP and TC lengths operate at L2, but this value is L3, as
3628                     it correlates to the MTU and IP-header tot_len values,
3629                     which are L3 (similar behavior as bpf_fib_lookup()).
3630
3631                     The Linux kernel route table can configure MTUs on a more
3632                     specific  per  route level, which is not provided by this
3633                     helper.    For   route   level   MTU   checks   use   the
3634                     bpf_fib_lookup() helper.
3635
3636                     ctx  is  either  struct xdp_md for XDP programs or struct
3637                     sk_buff for tc cls_act programs.
3638
3639                     The flags argument can be a combination of one or more of
3640                     the following values:
3641
3642                     BPF_MTU_CHK_SEGS
3643                            This flag only works for ctx struct sk_buff. If the
3644                            packet context contains extra packet segment
3645                            buffers (often known as a GSO skb), then the MTU
3646                            is harder to check at this point, because in the
3647                            transmit path it is possible for the skb packet to
3648                            get re-segmented (depending on net device
3649                            features). This could still be an MTU violation, so
3650                            this flag enables performing the MTU check against
3651                            segments, with a different violation return code
3652                            to tell it apart. The check cannot use len_diff.
3653
3654                     On return, mtu_len contains the MTU value of the net
3655                     device. Remember that the configured MTU is the L3 size,
3656                     which is what is returned here, while XDP and TC lengths
3657                     operate at L2. The helper takes this into account, but
3658                     keep it in mind when using the MTU value in BPF code.
3659
3660              Return
3661
3662                     • 0  on  success,  and  populate  MTU  value  in  mtu_len
3663                       pointer.
3664
3665                     • <  0  if any input argument is invalid (mtu_len not up‐
3666                       dated)
3667
3668                     MTU violations return positive values, but also  populate
3669                     MTU  value  in mtu_len pointer, as this can be needed for
3670                     implementing PMTU handling:
3671
3672                     • BPF_MTU_CHK_RET_FRAG_NEEDED
3673
3674                     • BPF_MTU_CHK_RET_SEGS_TOOBIG
3675
3676       long bpf_for_each_map_elem(struct bpf_map *map, void *callback_fn, void
3677       *callback_ctx, u64 flags)
3678
3679              Description
                     For each element in map, call the callback_fn function
                     with map, callback_ctx and other map-specific parameters.
                     The callback_fn should be a static function and the
                     callback_ctx should be a pointer to the stack. The flags
                     argument is used to control certain aspects of the
                     helper. Currently, flags must be 0.
3686
                     The following is a list of supported map types and their
                     respective expected callback signatures:
3689
3690                     BPF_MAP_TYPE_HASH,              BPF_MAP_TYPE_PERCPU_HASH,
3691                     BPF_MAP_TYPE_LRU_HASH,      BPF_MAP_TYPE_LRU_PERCPU_HASH,
3692                     BPF_MAP_TYPE_ARRAY, BPF_MAP_TYPE_PERCPU_ARRAY
3693
3694                     long (*callback_fn)(struct bpf_map *map, const void *key,
3695                     void *value, void *ctx);
3696
                     For per-CPU maps, the value passed to callback_fn is the
                     value on the CPU where the BPF program is running.

                     If callback_fn returns 0, the helper continues to the
                     next element. If it returns 1, the helper skips the
                     remaining elements and returns. Other return values are
                     not currently used.
3704
3705              Return The number of traversed map elements for success, -EINVAL
3706                     for invalid flags.
3707
3708       long  bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data,
3709       u32 data_len)
3710
3711              Description
                     Outputs a string into the str buffer of size str_size
                     based on a format string stored in a read-only map
                     pointed to by fmt.
3715
3716                     Each format specifier in fmt corresponds to one u64  ele‐
3717                     ment  in  the  data array. For strings and pointers where
3718                     pointees are accessed, only the pointer values are stored
3719                     in  the  data  array. The data_len is the size of data in
3720                     bytes - must be a multiple of 8.
3721
                     Formats %s and %p{i,I}{4,6} require reading kernel
                     memory. Reading kernel memory may fail because the
                     address is invalid, or because it is valid but requires
                     a major memory fault. If reading kernel memory fails,
                     the string for %s will be an empty string, and the IP
                     address for %p{i,I}{4,6} will be 0. Not returning an
                     error to the BPF program is consistent with what
                     bpf_trace_printk() does for now.
3730
3731              Return The strictly positive length of the formatted string, in‐
3732                     cluding the trailing zero character. If the return  value
3733                     is  greater  than  str_size,  str  contains  a  truncated
3734                     string, guaranteed  to  be  zero-terminated  except  when
3735                     str_size is 0.
3736
3737                     Or -EBUSY if the per-CPU memory copy buffer is busy.
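
                     The size and return-value rules above can be sketched in
                     plain user-space C. This is only an illustration, not
                     kernel code: snprintf_data_len() and snprintf_truncated()
                     are hypothetical names, and the sketch assumes only what
                     the text states (one u64 per format specifier, and a
                     positive return value that counts the trailing zero).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bytes to pass as data_len for nr_args format
 * specifiers. Each specifier consumes one u64, so the result is always
 * a multiple of 8. */
static uint32_t snprintf_data_len(uint32_t nr_args)
{
    return nr_args * (uint32_t)sizeof(uint64_t);
}

/* Hypothetical helper: given a positive return value (formatted length
 * including the trailing zero), report whether str was truncated. */
static bool snprintf_truncated(long ret, uint32_t str_size)
{
    assert(ret > 0); /* negative values are errors, not lengths */
    return (uint64_t)ret > str_size;
}
```

                     A caller would check for a negative return value first,
                     and only then compare the length against str_size.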
3738
3739       long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size)
3740
3741              Description
3742                     Execute bpf syscall with given arguments.
3743
3744              Return A syscall result.
3745
3746       long  bpf_btf_find_by_name_kind(char  *name, int name_sz, u32 kind, int
3747       flags)
3748
3749              Description
3750                     Find BTF type with given name and kind in vmlinux BTF  or
3751                     in module's BTFs.
3752
3753              Return Returns btf_id and btf_obj_fd in lower and upper 32 bits.
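
                     Splitting the packed return value can be sketched in
                     user-space C as follows. The function names are
                     hypothetical, and the sketch assumes the value has
                     already been checked to be non-negative (a negative
                     value would be an error code, not a packed pair).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: btf_id lives in the lower 32 bits. */
static uint32_t btf_ret_id(int64_t ret)
{
    assert(ret >= 0);
    return (uint32_t)((uint64_t)ret & 0xffffffffu);
}

/* Hypothetical helper: btf_obj_fd lives in the upper 32 bits. */
static uint32_t btf_ret_obj_fd(int64_t ret)
{
    assert(ret >= 0);
    return (uint32_t)((uint64_t)ret >> 32);
}
```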
3754
3755       long bpf_sys_close(u32 fd)
3756
3757              Description
3758                     Execute close syscall for given FD.
3759
3760              Return A syscall result.
3761
3762       long  bpf_timer_init(struct  bpf_timer *timer, struct bpf_map *map, u64
3763       flags)
3764
3765              Description
3766                     Initialize the timer.  First  4  bits  of  flags  specify
3767                     clockid.      Only    CLOCK_MONOTONIC,    CLOCK_REALTIME,
3768                     CLOCK_BOOTTIME are allowed.  All other bits of flags  are
3769                     reserved.   The verifier will reject the program if timer
3770                     is not from the same map.
3771
3772              Return 0 on success.  -EBUSY if timer  is  already  initialized.
3773                     -EINVAL  if invalid flags are passed.  -EPERM if timer is
3774                     in a map that doesn't have any user references.  The user
3775                     space  should either hold a file descriptor to a map with
3776                     timers or pin such map in bpffs. When map is unpinned  or
3777                     file  descriptor  is closed all timers in the map will be
3778                     cancelled and freed.
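
                     The flags layout described above can be modelled in
                     user-space C. This sketch is not the kernel's validation
                     code: timer_flags_to_clockid() is a hypothetical name,
                     "first 4 bits" is assumed to mean the 4 least
                     significant bits, and the MODEL_CLOCK_* constants mirror
                     the Linux clock id values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Linux clock ids, renamed MODEL_* to avoid clashing with <time.h>. */
enum {
    MODEL_CLOCK_REALTIME  = 0,
    MODEL_CLOCK_MONOTONIC = 1,
    MODEL_CLOCK_BOOTTIME  = 7,
};

/* Hypothetical model of the documented layout: the low 4 bits carry the
 * clockid and all other bits are reserved (assumed to be required zero).
 * Stores the clockid in *clockid and returns 0, or returns -EINVAL. */
static int timer_flags_to_clockid(uint64_t flags, int *clockid)
{
    int id = (int)(flags & 0xf);

    assert(clockid != NULL);
    if (flags >> 4)
        return -EINVAL; /* reserved bits set */
    if (id != MODEL_CLOCK_REALTIME && id != MODEL_CLOCK_MONOTONIC &&
        id != MODEL_CLOCK_BOOTTIME)
        return -EINVAL; /* clock not in the allowed list */
    *clockid = id;
    return 0;
}
```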
3779
3780       long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn)
3781
3782              Description
3783                     Configure the timer to call callback_fn static function.
3784
3785              Return 0 on success.  -EINVAL if timer was not initialized  with
3786                     bpf_timer_init()  earlier.   -EPERM  if timer is in a map
3787                     that doesn't have any user references.   The  user  space
3788                     should either hold a file descriptor to a map with timers
3789                     or pin such map in bpffs. When map is  unpinned  or  file
3790                     descriptor  is  closed all timers in the map will be can‐
3791                     celled and freed.
3792
3793       long bpf_timer_start(struct bpf_timer *timer, u64 nsecs, u64 flags)
3794
3795              Description
                     Set timer expiration N nanoseconds from the current time.
                     The configured callback will be invoked in soft irq
                     context on some CPU and will not repeat unless another
                     bpf_timer_start() call is made. In that case the next
                     invocation can migrate to a different CPU.

                     Since struct bpf_timer is a field inside a map element,
                     the map owns the timer. bpf_timer_set_callback()
                     increments the refcnt of the BPF program to make sure
                     that the callback_fn code stays valid. When the user
                     space reference to a map reaches zero, all timers in the
                     map are cancelled and the corresponding programs'
                     refcnts are decremented. This is done to make sure that
                     Ctrl-C of a user process doesn't leave any timers
                     running. If the map is pinned in bpffs, the callback_fn
                     can re-arm itself indefinitely.

                     The bpf_map_update_elem() and bpf_map_delete_elem()
                     helpers and user space sys_bpf commands cancel and free
                     the timer in the given map element.

                     The map can contain timers that invoke callback_fn-s
                     from different programs. The same callback_fn can serve
                     different timers from different maps if the key/value
                     layout matches across maps. Every
                     bpf_timer_set_callback() can have a different
                     callback_fn.
3817
3818              Return 0 on success.  -EINVAL if timer was not initialized  with
3819                     bpf_timer_init() earlier or invalid flags are passed.
3820
3821       long bpf_timer_cancel(struct bpf_timer *timer)
3822
3823              Description
3824                     Cancel the timer and wait for callback_fn to finish if it
3825                     was running.
3826
3827              Return 0 if the timer was not active.  1 if the  timer  was  ac‐
3828                     tive.    -EINVAL   if  timer  was  not  initialized  with
3829                     bpf_timer_init() earlier.  -EDEADLK if callback_fn  tried
3830                     to  call  bpf_timer_cancel() on its own timer which would
3831                     have led to a deadlock otherwise.
3832
3833       u64 bpf_get_func_ip(void *ctx)
3834
3835              Description
3836                     Get address of  the  traced  function  (for  tracing  and
3837                     kprobe programs).
3838
3839              Return Address  of  the  traced  function.  0 for kprobes placed
3840                     within the function (not at the entry).
3841
3842       u64 bpf_get_attach_cookie(void *ctx)
3843
3844              Description
3845                     Get bpf_cookie value  provided  (optionally)  during  the
3846                     program  attachment. It might be different for each indi‐
3847                     vidual attachment, even if  BPF  program  itself  is  the
3848                     same.   Expects  BPF program context ctx as a first argu‐
3849                     ment.
3850
3851                     Supported for the following program types:
3852
3853                            • kprobe/uprobe;
3854
3855                            • tracepoint;
3856
3857                            • perf_event.
3858
3859              Return Value specified by user at BPF  link  creation/attachment
3860                     time or 0, if it was not specified.
3861
3862       long bpf_task_pt_regs(struct task_struct *task)
3863
3864              Description
3865                     Get the struct pt_regs associated with task.
3866
3867              Return A pointer to struct pt_regs.
3868
3869       long bpf_get_branch_snapshot(void *entries, u32 size, u64 flags)
3870
3871              Description
                     Get branch trace from hardware engines like Intel LBR.
                     The hardware engine is stopped shortly after the helper
                     is called. Therefore, the user needs to filter branch
                     entries based on the actual use case. To capture the
                     branch trace before the trigger point of the BPF
                     program, the helper should be called at the beginning of
                     the BPF program.
3879
3880                     The  data is stored as struct perf_branch_entry into out‐
3881                     put buffer entries. size is the size of entries in bytes.
3882                     flags is reserved for now and must be zero.
3883
              Return On success, the number of bytes written to entries. On
                     error, a negative value.
3886
3887                     -EINVAL if flags is not zero.
3888
3889                     -ENOENT if architecture does not support branch records.
3890
3891       long bpf_trace_vprintk(const char *fmt, u32 fmt_size, const void *data,
3892       u32 data_len)
3893
3894              Description
3895                     Behaves  like bpf_trace_printk() helper, but takes an ar‐
3896                     ray of u64 to format and can handle more format args as a
3897                     result.
3898
3899                     Arguments are to be used as in bpf_seq_printf() helper.
3900
3901              Return The  number of bytes written to the buffer, or a negative
3902                     error in case of failure.
3903
3904       struct unix_sock *bpf_skc_to_unix_sock(void *sk)
3905
3906              Description
3907                     Dynamically cast a sk pointer to a unix_sock pointer.
3908
3909              Return sk if casting is valid, or NULL otherwise.
3910
3911       long bpf_kallsyms_lookup_name(const char *name, int name_sz, int flags,
3912       u64 *res)
3913
3914              Description
3915                     Get  the address of a kernel symbol, returned in res. res
3916                     is set to 0 if the symbol is not found.
3917
3918              Return On success, zero. On error, a negative value.
3919
3920                     -EINVAL if flags is not zero.
3921
3922                     -EINVAL if string name is not the same size as name_sz.
3923
3924                     -ENOENT if symbol is not found.
3925
3926                     -EPERM if caller does not have permission to obtain  ker‐
3927                     nel address.
3928
3929       long  bpf_find_vma(struct  task_struct  *task,  u64  addr,  void *call‐
3930       back_fn, void *callback_ctx, u64 flags)
3931
3932              Description
                     Find the vma of task that contains addr, and call the
                     callback_fn function with task, vma, and callback_ctx.
                     The callback_fn should be a static function and the
                     callback_ctx should be a pointer to the stack. The flags
                     argument is used to control certain aspects of the
                     helper. Currently, flags must be 0.
3939
3940                     The expected callback signature is
3941
3942                     long   (*callback_fn)(struct  task_struct  *task,  struct
3943                     vm_area_struct *vma, void *callback_ctx);
3944
3945              Return 0 on success.  -ENOENT if task->mm is  NULL,  or  no  vma
3946                     contains  addr.   -EBUSY if failed to try lock mmap_lock.
3947                     -EINVAL for invalid flags.
3948
3949       long bpf_loop(u32 nr_loops, void *callback_fn, void *callback_ctx,  u64
3950       flags)
3951
3952              Description
                     Call callback_fn nr_loops times, with callback_ctx as the
                     context parameter. The callback_fn should be a static
                     function and the callback_ctx should be a pointer to the
                     stack. The flags argument is used to control certain
                     aspects of the helper. Currently, flags must be 0, and
                     nr_loops is limited to 1 << 23 (~8 million) loops.

                     callback_fn must have the following signature:
3960
3961                     long (*callback_fn)(u32 index, void *ctx);
3962
3963                     where  index  is the current index in the loop. The index
3964                     is zero-indexed.
3965
3966                     If callback_fn returns 0, the helper will continue to the
3967                     next loop. If return value is 1, the helper will skip the
3968                     rest of the loops and return. Other return values are not
3969                     used now, and will be rejected by the verifier.
3970
3971              Return The number of loops performed, -EINVAL for invalid flags,
3972                     -E2BIG if nr_loops exceeds the maximum number of loops.
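
                     The callback contract above can be modelled in user-space
                     C. This is a sketch of the documented behaviour, not the
                     kernel implementation: loop_model(), loop_cb_t and
                     sum_until_three() are hypothetical names, and the sketch
                     assumes that the iteration whose callback returns 1 is
                     counted in the number of loops performed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_LOOPS (1u << 23) /* limit quoted in the text above */

typedef long (*loop_cb_t)(uint32_t index, void *ctx);

/* Hypothetical user-space model of the documented bpf_loop() contract. */
static long loop_model(uint32_t nr_loops, loop_cb_t cb, void *ctx,
                       uint64_t flags)
{
    uint32_t i;

    assert(cb != NULL);
    if (flags)
        return -EINVAL;
    if (nr_loops > MODEL_MAX_LOOPS)
        return -E2BIG;
    for (i = 0; i < nr_loops; i++) {
        /* A return of 1 ends the loop; assume this iteration counts. */
        if (cb(i, ctx))
            return (long)i + 1;
    }
    return (long)i;
}

/* Example callback: add each index to *ctx and stop at index 3. */
static long sum_until_three(uint32_t index, void *ctx)
{
    *(uint64_t *)ctx += index;
    return index == 3 ? 1 : 0;
}
```

                     In a real BPF program the same shape applies: the
                     callback is a static function and ctx points to a
                     variable on the caller's stack.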
3973
3974       long bpf_strncmp(const char *s1, u32 s1_sz, const char *s2)
3975
3976              Description
3977                     Do strncmp() between s1 and s2. s1  doesn't  need  to  be
3978                     null-terminated  and s1_sz is the maximum storage size of
3979                     s1. s2 must be a read-only string.
3980
              Return An integer less than, equal to, or greater than zero if
                     the first s1_sz bytes of s1 are found to be less than,
                     to match, or to be greater than s2.
3984
3985       long bpf_get_func_arg(void *ctx, u32 n, u64 *value)
3986
3987              Description
3988                     Get n-th argument register (zero  based)  of  the  traced
3989                     function (for tracing programs) returned in value.
3990
3991              Return 0 on success.  -EINVAL if n >= argument register count of
3992                     traced function.
3993
3994       long bpf_get_func_ret(void *ctx, u64 *value)
3995
3996              Description
3997                     Get return value of the traced function (for tracing pro‐
3998                     grams) in value.
3999
4000              Return 0  on  success.   -EOPNOTSUPP  for tracing programs other
4001                     than BPF_TRACE_FEXIT or BPF_MODIFY_RETURN.
4002
4003       long bpf_get_func_arg_cnt(void *ctx)
4004
4005              Description
                     Get the number of registers of the traced function (for
                     tracing programs) in which function arguments are
                     stored.
4009
4010              Return The number of argument registers of the traced function.
4011
4012       int bpf_get_retval(void)
4013
4014              Description
4015                     Get the BPF program's return value that will be  returned
4016                     to the upper layers.
4017
4018                     This helper is currently supported by cgroup programs and
4019                     only by the hooks where BPF program's return value is re‐
4020                     turned to the userspace via errno.
4021
4022              Return The BPF program's return value.
4023
4024       int bpf_set_retval(int retval)
4025
4026              Description
4027                     Set  the BPF program's return value that will be returned
4028                     to the upper layers.
4029
4030                     This helper is currently supported by cgroup programs and
4031                     only by the hooks where BPF program's return value is re‐
4032                     turned to the userspace via errno.
4033
                     Note the following corner case, where the program
                     exports an error via bpf_set_retval() but signals
                     success via 'return 1':
                        bpf_set_retval(-EPERM); return 1;

                     In this case, the BPF program's return value will be
                     the -EPERM set through the helper. This still holds
                     true for cgroup/bind{4,6}, which supports an extra
                     'return 3' success case.
4043
4044              Return 0 on success, or a negative error in case of failure.
4045
4046       u64 bpf_xdp_get_buff_len(struct xdp_buff *xdp_md)
4047
4048              Description
                     Get the total size of a given xdp buff (linear and paged
                     area).
4051
4052              Return The total size of a given xdp buffer.
4053
4054       long bpf_xdp_load_bytes(struct xdp_buff *xdp_md, u32 offset, void *buf,
4055       u32 len)
4056
4057              Description
                     This helper is provided as an easy way to load data from
                     an xdp buffer. It can be used to load len bytes, starting
                     at offset, from the frame associated with xdp_md into
                     the buffer pointed to by buf.
4062
4063              Return 0 on success, or a negative error in case of failure.
4064
4065       long bpf_xdp_store_bytes(struct  xdp_buff  *xdp_md,  u32  offset,  void
4066       *buf, u32 len)
4067
4068              Description
4069                     Store len bytes from buffer buf into the frame associated
4070                     to xdp_md, at offset.
4071
4072              Return 0 on success, or a negative error in case of failure.
4073
4074       long bpf_copy_from_user_task(void *dst, u32 size, const void *user_ptr,
4075       struct task_struct *tsk, u64 flags)
4076
4077              Description
                     Read size bytes from user space address user_ptr in tsk's
                     address space, and store the data in dst. flags is not
                     used yet and is provided for future extensibility. This
                     helper can only be used by sleepable programs.
4082
4083              Return 0 on success, or a negative error in case of failure.  On
4084                     error dst buffer is zeroed out.
4085
4086       long   bpf_skb_set_tstamp(struct   sk_buff   *skb,   u64   tstamp,  u32
4087       tstamp_type)
4088
4089              Description
                     Change __sk_buff->tstamp_type to tstamp_type and set
                     __sk_buff->tstamp to tstamp at the same time.
4092
4093                     If there is no need to change the __sk_buff->tstamp_type,
4094                     the   tstamp   value   can   be   directly   written   to
4095                     __sk_buff->tstamp instead.
4096
4097                     BPF_SKB_TSTAMP_DELIVERY_MONO is the only tstamp that will
4098                     be kept during bpf_redirect_*().  A non zero tstamp  must
4099                     be    used    with    the    BPF_SKB_TSTAMP_DELIVERY_MONO
4100                     tstamp_type.
4101
4102                     A BPF_SKB_TSTAMP_UNSPEC tstamp_type can only be used with
4103                     a zero tstamp.
4104
4105                     Only IPv4 and IPv6 skb->protocol are supported.
4106
                     This helper is most useful when a program needs to set a
                     mono delivery time in __sk_buff->tstamp and then
                     bpf_redirect_*() to the egress of an iface. For example,
                     changing the (rcv) timestamp in __sk_buff->tstamp at
                     ingress to a mono delivery time and then
                     bpf_redirect_*() to sch_fq@phy-dev.
4113
              Return 0 on success. -EINVAL for invalid input. -EOPNOTSUPP for
                     unsupported protocol.
4116
4117       long bpf_ima_file_hash(struct file *file, void *dst, u32 size)
4118
4119              Description
                     Returns a calculated IMA hash of the file. If the hash
                     is larger than size, then only size bytes will be copied
                     to dst.
4123
              Return The hash_algo is returned on success, -EOPNOTSUPP if the
                     hash calculation failed or -EINVAL if invalid arguments
                     are passed.
4127
4128       void *bpf_kptr_xchg(void *map_value, void *ptr)
4129
4130              Description
4131                     Exchange  kptr  at pointer map_value with ptr, and return
4132                     the old value. ptr can be NULL, otherwise it  must  be  a
4133                     referenced  pointer  which  will  be  released  when this
4134                     helper is called.
4135
              Return The old value of kptr (which can be NULL). The returned
                     pointer, if not NULL, is a reference which must be
                     released using its corresponding release function, or
                     moved into a BPF map before program exit.
4140
4141       void  *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key,
4142       u32 cpu)
4143
4144              Description
4145                     Perform a lookup in percpu map for an entry associated to
4146                     key on cpu.
4147
4148              Return Map  value  associated to key on cpu, or NULL if no entry
4149                     was found or cpu is invalid.
4150
4151       struct mptcp_sock *bpf_skc_to_mptcp_sock(void *sk)
4152
4153              Description
4154                     Dynamically cast a sk pointer to a mptcp_sock pointer.
4155
4156              Return sk if casting is valid, or NULL otherwise.
4157
4158       long  bpf_dynptr_from_mem(void  *data,  u32  size,  u64  flags,  struct
4159       bpf_dynptr *ptr)
4160
4161              Description
4162                     Get a dynptr to local memory data.
4163
4164                     data must be a ptr to a map value.  The maximum size sup‐
4165                     ported is DYNPTR_MAX_SIZE.  flags is currently unused.
4166
4167              Return 0 on success, -E2BIG if the size exceeds DYNPTR_MAX_SIZE,
4168                     -EINVAL if flags is not 0.
4169
4170       long  bpf_ringbuf_reserve_dynptr(void  *ringbuf,  u32  size, u64 flags,
4171       struct bpf_dynptr *ptr)
4172
4173              Description
4174                     Reserve size bytes of payload in a  ring  buffer  ringbuf
4175                     through the dynptr interface. flags must be 0.
4176
4177                     Please   note   that   a  corresponding  bpf_ringbuf_sub‐
4178                     mit_dynptr or bpf_ringbuf_discard_dynptr must  be  called
4179                     on  ptr,  even if the reservation fails. This is enforced
4180                     by the verifier.
4181
4182              Return 0 on success, or a negative error in case of failure.
4183
4184       void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags)
4185
4186              Description
                     Submit reserved ring buffer sample, pointed to by ptr,
                     through the dynptr interface. This is a no-op if the
                     dynptr is invalid/null.
4190
4191                     For more information  on  flags,  please  see  'bpf_ring‐
4192                     buf_submit'.
4193
4194              Return Nothing. Always succeeds.
4195
4196       void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags)
4197
4198              Description
4199                     Discard  reserved  ring  buffer sample through the dynptr
4200                     interface. This is a no-op if the dynptr is invalid/null.
4201
4202                     For more information  on  flags,  please  see  'bpf_ring‐
4203                     buf_discard'.
4204
4205              Return Nothing. Always succeeds.
4206
4207       long  bpf_dynptr_read(void  *dst,  u32 len, struct bpf_dynptr *src, u32
4208       offset, u64 flags)
4209
4210              Description
4211                     Read len bytes from src into dst,  starting  from  offset
4212                     into src.  flags is currently unused.
4213
4214              Return 0  on  success, -E2BIG if offset + len exceeds the length
4215                     of src's data, -EINVAL if src is an invalid dynptr or  if
4216                     flags is not 0.
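
                     The bounds and flags checks listed above can be modelled
                     in user-space C. This is an illustrative sketch, not the
                     kernel implementation: struct model_dynptr and
                     dynptr_read_model() are hypothetical stand-ins, and an
                     "invalid dynptr" is modelled simply as a NULL pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a valid dynptr: a buffer and its length. */
struct model_dynptr {
    const uint8_t *data;
    uint32_t size;
};

/* Hypothetical model of the documented bpf_dynptr_read() return values. */
static long dynptr_read_model(void *dst, uint32_t len,
                              const struct model_dynptr *src,
                              uint32_t offset, uint64_t flags)
{
    assert(dst != NULL);
    if (src == NULL || src->data == NULL || flags)
        return -EINVAL;
    /* Use a 64-bit sum so that offset + len cannot wrap around. */
    if ((uint64_t)offset + len > src->size)
        return -E2BIG;
    memcpy(dst, src->data + offset, len);
    return 0;
}
```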
4217
4218       long  bpf_dynptr_write(struct  bpf_dynptr  *dst, u32 offset, void *src,
4219       u32 len, u64 flags)
4220
4221              Description
4222                     Write len bytes from src into dst, starting  from  offset
4223                     into dst.  flags is currently unused.
4224
4225              Return 0  on  success, -E2BIG if offset + len exceeds the length
4226                     of dst's data, -EINVAL if dst is an invalid dynptr or  if
4227                     dst is a read-only dynptr or if flags is not 0.
4228
4229       void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
4230
4231              Description
4232                     Get a pointer to the underlying dynptr data.
4233
4234                     len  must  be a statically known value. The returned data
4235                     slice is invalidated whenever the dynptr is invalidated.
4236
              Return Pointer to the underlying dynptr data, NULL if the dynptr
                     is read-only, if the dynptr is invalid, or if the offset
                     and length are out of bounds.
4240
4241       s64 bpf_tcp_raw_gen_syncookie_ipv4(struct  iphdr  *iph,  struct  tcphdr
4242       *th, u32 th_len)
4243
4244              Description
4245                     Try to issue a SYN cookie for the packet with correspond‐
4246                     ing IPv4/TCP headers, iph and th, without depending on  a
4247                     listening socket.
4248
4249                     iph points to the IPv4 header.
4250
4251                     th  points  to  the start of the TCP header, while th_len
4252                     contains  the  length  of  the  TCP  header   (at   least
4253                     sizeof(struct tcphdr)).
4254
              Return On success, the lower 32 bits hold the generated SYN
                     cookie, followed by 16 bits which hold the MSS value for
                     that cookie; the top 16 bits are unused.
4258
4259                     On failure, the returned value is one of the following:
4260
4261                     -EINVAL if th_len is invalid.
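
                     Decoding a successful return value can be sketched in
                     user-space C as follows. The function names are
                     hypothetical, and the sketch assumes the value has
                     already been checked to be non-negative, with the cookie
                     in bits 0-31 and the MSS in bits 32-47 as described
                     above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the SYN cookie is in the lower 32 bits. */
static uint32_t syncookie_ret_cookie(int64_t ret)
{
    assert(ret >= 0);
    return (uint32_t)((uint64_t)ret & 0xffffffffu);
}

/* Hypothetical helper: the MSS is in the 16 bits above the cookie. */
static uint16_t syncookie_ret_mss(int64_t ret)
{
    assert(ret >= 0);
    return (uint16_t)(((uint64_t)ret >> 32) & 0xffffu);
}
```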
4262
4263       s64  bpf_tcp_raw_gen_syncookie_ipv6(struct  ipv6hdr *iph, struct tcphdr
4264       *th, u32 th_len)
4265
4266              Description
4267                     Try to issue a SYN cookie for the packet with correspond‐
4268                     ing  IPv6/TCP headers, iph and th, without depending on a
4269                     listening socket.
4270
4271                     iph points to the IPv6 header.
4272
4273                     th points to the start of the TCP  header,  while  th_len
4274                     contains   the   length  of  the  TCP  header  (at  least
4275                     sizeof(struct tcphdr)).
4276
              Return On success, the lower 32 bits hold the generated SYN
                     cookie, followed by 16 bits which hold the MSS value for
                     that cookie; the top 16 bits are unused.
4280
4281                     On failure, the returned value is one of the following:
4282
4283                     -EINVAL if th_len is invalid.
4284
4285                     -EPROTONOSUPPORT if CONFIG_IPV6 is not builtin.
4286
4287       long bpf_tcp_raw_check_syncookie_ipv4(struct iphdr *iph, struct  tcphdr
4288       *th)
4289
4290              Description
4291                     Check  whether  iph and th contain a valid SYN cookie ACK
4292                     without depending on a listening socket.
4293
4294                     iph points to the IPv4 header.
4295
4296                     th points to the TCP header.
4297
4298              Return 0 if iph and th are a valid SYN cookie ACK.
4299
4300                     On failure, the returned value is one of the following:
4301
4302                     -EACCES if the SYN cookie is not valid.
4303
4304       long  bpf_tcp_raw_check_syncookie_ipv6(struct  ipv6hdr   *iph,   struct
4305       tcphdr *th)
4306
4307              Description
4308                     Check  whether  iph and th contain a valid SYN cookie ACK
4309                     without depending on a listening socket.
4310
4311                     iph points to the IPv6 header.
4312
4313                     th points to the TCP header.
4314
4315              Return 0 if iph and th are a valid SYN cookie ACK.
4316
4317                     On failure, the returned value is one of the following:
4318
4319                     -EACCES if the SYN cookie is not valid.
4320
4321                     -EPROTONOSUPPORT if CONFIG_IPV6 is not builtin.
4322
4323       u64 bpf_ktime_get_tai_ns(void)
4324
4325              Description
4326                     A nonsettable system-wide clock derived  from  wall-clock
4327                     time  but ignoring leap seconds.  This clock does not ex‐
4328                     perience discontinuities and backwards  jumps  caused  by
4329                     NTP inserting leap seconds as CLOCK_REALTIME does.
4330
4331                     See: clock_gettime(CLOCK_TAI)
4332
4333              Return Current ktime.

       long  bpf_user_ringbuf_drain(struct  bpf_map  *map,  void *callback_fn,
       void *ctx, u64 flags)

              Description
                     Drain samples from the specified user  ring  buffer,  and
                     invoke the provided callback for each such sample:

                     long   (*callback_fn)(struct   bpf_dynptr  *dynptr,  void
                     *ctx);

                     If callback_fn returns 0, the helper will continue to try
                     and   drain   the   next  sample,  up  to  a  maximum  of
                     BPF_MAX_USER_RINGBUF_SAMPLES samples. If the return value
                     is  1,  the  helper will skip the rest of the samples and
                     return. Other return values are not used now, and will be
                     rejected by the verifier.

              Return The number of drained samples if no error was encountered
                     while draining samples, or 0 if no samples  were  present
                     in   the  ring  buffer.  If  a  user-space  producer  was
                     epoll-waiting on this map, and at least  one  sample  was
                     drained,  they will receive an event notification notify‐
                     ing them of available space in the ring  buffer.  If  the
                     BPF_RB_NO_WAKEUP  flag  is  passed  to  this function, no
                     wakeup   notification    will    be    sent.    If    the
                     BPF_RB_FORCE_WAKEUP flag is passed, a wakeup notification
                     will be sent even if no sample was drained.

                     On failure, the returned value is one of the following:

                     -EBUSY if the ring buffer is contended, and another call‐
                     ing context was concurrently draining the ring buffer.

                     -EINVAL  if  user-space is not properly tracking the ring
                     buffer due to the producer position not being aligned  to
                     8  bytes,  a  sample not being aligned to 8 bytes, or the
                     producer position not matching the advertised length of a
                     sample.

                     -E2BIG  if user-space has tried to publish a sample which
                     is larger than the size of the ring buffer, or which can‐
                     not fit within a struct bpf_dynptr.
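
                     The callback contract described in this entry can be
                     modelled in plain user-space C. The sketch below is not
                     the kernel implementation: model_drain(),
                     model_callback_fn, MODEL_MAX_SAMPLES and the int samples
                     are hypothetical stand-ins for the helper, callback_fn,
                     BPF_MAX_USER_RINGBUF_SAMPLES and the dynptr-wrapped
                     samples. It also assumes that a sample whose callback
                     returns 1 still counts as drained.

```c
#include <stddef.h>

/* Stand-in for the kernel limit; name and value are assumptions of this model. */
#define MODEL_MAX_SAMPLES 4

/* Same contract as callback_fn: return 0 to continue, 1 to stop. */
typedef long (*model_callback_fn)(const int *sample, void *ctx);

/*
 * Model of the drain loop: hand each pending sample to cb and return how
 * many samples were consumed. The sample on which cb returns non-zero is
 * still counted (an assumption of this model).
 */
static long model_drain(const int *samples, size_t n,
			model_callback_fn cb, void *ctx)
{
	long drained = 0;

	for (size_t i = 0; i < n && i < MODEL_MAX_SAMPLES; i++) {
		long ret = cb(&samples[i], ctx);

		drained++;
		if (ret != 0)
			break;
	}
	return drained;
}

/* Example callback: add samples to *ctx, stop at the first negative one. */
static long sum_until_negative(const int *sample, void *ctx)
{
	long *sum = ctx;

	if (*sample < 0)
		return 1;
	*sum += *sample;
	return 0;
}
```

                     In this model, draining an empty buffer returns 0, which
                     matches the "no samples were present" case of the Return
                     paragraph. Wakeup flags and the error codes are left out.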

EXAMPLES

       Examples of usage for most of the eBPF helpers listed in this manual
       page are available within the Linux kernel sources, at the following
       locations:

       • samples/bpf/

       • tools/testing/selftests/bpf/

LICENSE

       eBPF  programs  can  have  an associated license, passed along with the
       bytecode instructions to the kernel when the programs are  loaded.  The
       format  for  that string is identical to the one in use for kernel mod‐
       ules (dual licenses, such as "Dual BSD/GPL", may be used). Some  helper
       functions  are only accessible to programs that are compatible with the
       GNU General Public License (GPL).

       In order to use such helpers, the eBPF program must be loaded with  the
       correct  license string passed (via attr) to the bpf() system call, and
       this generally translates into the C source code of  the  program  con‐
       taining a line similar to the following:

          char ____license[] __attribute__((section("license"), used)) = "GPL";
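
       Since the license string follows the kernel-module format, the check
       that gates GPL-only helpers can be pictured as a string comparison. The
       sketch below is a user-space illustration only:
       model_license_is_gpl_compatible() is a hypothetical name, and the list
       of accepted strings is an assumption that should be checked against
       include/linux/license.h in your kernel tree.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model of a GPL-compatibility check on a license string.
 * The accepted strings are an assumption; verify them against your kernel.
 */
static bool model_license_is_gpl_compatible(const char *license)
{
	static const char *const accepted[] = {
		"GPL", "GPL v2", "GPL and additional rights",
		"Dual BSD/GPL", "Dual MIT/GPL", "Dual MPL/GPL",
	};

	for (size_t i = 0; i < sizeof(accepted) / sizeof(accepted[0]); i++)
		if (strcmp(license, accepted[i]) == 0)
			return true;
	return false;
}
```

       The comparison in this model is exact and case-sensitive, so a string
       such as "gpl" would not unlock GPL-only helpers.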

IMPLEMENTATION

       This manual page is an effort to document the existing eBPF helper
       functions. But as of this writing, the BPF sub-system is under heavy
       development. New eBPF program or map types are added, along with new
       helper functions. Some helpers are occasionally made available for
       additional program types. So in spite of the efforts of the community,
       this page might not be up-to-date. If you want to check for yourself
       what helper functions exist in your kernel, or what types of programs
       they can support, here are some files in the kernel tree that you may
       be interested in:

       • include/uapi/linux/bpf.h is the main BPF header. It contains the full
         list of all helper functions, as well as many other  BPF  definitions
         including  most  of  the  flags,  structs  or  constants  used by the
         helpers.

       • net/core/filter.c contains the  definition  of  most  network-related
         helper  functions,  and the list of program types from which they can
         be used.

       • kernel/trace/bpf_trace.c is the  equivalent  for  most  tracing  pro‐
         gram-related helpers.

       • kernel/bpf/verifier.c contains the functions used to check that valid
         types of eBPF maps are used with a given helper function.

       • The kernel/bpf/ directory contains other files in which additional
         helpers are defined (for cgroups, sockmaps, etc.).

       • The  bpftool  utility can be used to probe the availability of helper
         functions on the system (as well as supported program and map  types,
         and  a  number  of  other  parameters). To do so, run bpftool feature
         probe (see bpftool-feature(8) for details). Add the unprivileged key‐
         word to list features available to unprivileged users.

       Compatibility between helper functions and program types can generally
       be found in the files where helper functions are defined. Look for the
       struct bpf_func_proto objects and for functions returning them: these
       functions contain a list of helpers that a given program type can call.
       Note that the default: label of the switch ... case used to filter
       helpers can call other functions, which themselves allow access to
       additional helpers. The GPL license requirement is also recorded in
       those struct bpf_func_proto objects.
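
       The lookup pattern described in the previous paragraph can be sketched
       in user-space C. Everything below is a simplified model, not kernel
       code: struct model_func_proto, enum model_func_id and the model_*
       functions are hypothetical stand-ins, and the gpl_only values are
       illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper ids; the kernel has its own, much longer list. */
enum model_func_id {
	MODEL_MAP_LOOKUP,
	MODEL_TRACE_PRINTK,
	MODEL_SKB_LOAD_BYTES,
	MODEL_UNKNOWN,
};

/* Stand-in for struct bpf_func_proto, reduced to two fields. */
struct model_func_proto {
	const char *name;
	bool gpl_only;
};

static const struct model_func_proto model_map_lookup_proto = { "map_lookup", false };
static const struct model_func_proto model_trace_printk_proto = { "trace_printk", true };
static const struct model_func_proto model_skb_load_bytes_proto = { "skb_load_bytes", false };

/* Helpers shared by several program types in this model. */
static const struct model_func_proto *model_base_func_proto(enum model_func_id id)
{
	switch (id) {
	case MODEL_MAP_LOOKUP:
		return &model_map_lookup_proto;
	case MODEL_TRACE_PRINTK:
		return &model_trace_printk_proto;
	default:
		return NULL; /* helper not available */
	}
}

/* Per-program-type filter: the default label defers to the shared list. */
static const struct model_func_proto *model_filter_func_proto(enum model_func_id id)
{
	switch (id) {
	case MODEL_SKB_LOAD_BYTES:
		return &model_skb_load_bytes_proto;
	default:
		return model_base_func_proto(id);
	}
}

/* A helper is callable if it is listed and its GPL requirement is met. */
static bool model_can_call(enum model_func_id id, bool prog_is_gpl)
{
	const struct model_func_proto *p = model_filter_func_proto(id);

	return p != NULL && (!p->gpl_only || prog_is_gpl);
}
```

       When reading the real sources, the same two questions apply: which
       function does the default label fall back to, and what does the
       returned object say about the GPL requirement.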

       Compatibility between helper functions and map types can  be  found  in
       the  check_map_func_compatibility()  function  in file kernel/bpf/veri‐
       fier.c.

       Helper functions that invalidate the checks on data and data_end point‐
       ers     for    network    processing    are    listed    in    function
       bpf_helper_changes_pkt_data() in file net/core/filter.c.

SEE ALSO

       bpf(2), bpftool(8), cgroups(7), ip(8), perf_event_open(2),  sendmsg(2),
       socket(7), tc-bpf(8)

Linux v6.1                        2022-09-26                    BPF-HELPERS(7)