BPF-HELPERS(7)             Linux Programmer's Manual            BPF-HELPERS(7)

NAME

       BPF-HELPERS - list of eBPF helper functions

DESCRIPTION

       The extended Berkeley Packet Filter (eBPF) subsystem consists of
       programs written in a pseudo-assembly language, then attached to one
       of several kernel hooks and run in reaction to specific events. This
       framework differs from the older, "classic" BPF (or "cBPF") in several
       aspects, one of them being the ability to call special functions (or
       "helpers") from within a program. These functions are restricted to a
       whitelist of helpers defined in the kernel.

       These helpers are used by eBPF programs to interact with the system, or
       with the context in which they work. For instance, they can be used to
       print debugging messages, to get the time since the system was booted,
       to interact with eBPF maps, or to manipulate network packets. Since
       there are several eBPF program types, and they do not all run in the
       same context, each program type can only call a subset of those
       helpers.

       Due to eBPF conventions, a helper cannot have more than five
       arguments.

       Internally, eBPF programs call directly into the compiled helper
       functions without requiring any foreign-function interface. As a
       result, calling helpers introduces no overhead beyond that of a
       regular function call, thus offering excellent performance.

       This document is an attempt to list and document the helpers available
       to eBPF developers. They are sorted in chronological order (the oldest
       helpers in the kernel at the top).

HELPERS

       void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

              Description
                     Perform a lookup in map for an entry associated with key.

              Return Map value associated with key, or NULL if no entry was
                     found.

       int bpf_map_update_elem(struct bpf_map *map, const void *key, const
       void *value, u64 flags)

              Description
                     Add or update the value of the entry associated with key
                     in map with value. flags is one of:

                     BPF_NOEXIST
                            The entry for key must not exist in the map.

                     BPF_EXIST
                            The entry for key must already exist in the map.

                     BPF_ANY
                            No condition on the existence of the entry for
                            key.

                     Flag value BPF_NOEXIST cannot be used for maps of types
                     BPF_MAP_TYPE_ARRAY or BPF_MAP_TYPE_PERCPU_ARRAY (all
                     elements always exist); the helper would return an error.

              Return 0 on success, or a negative error in case of failure.
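
                     The three flag values can be illustrated with a minimal
                     user-space sketch. This is not kernel code: toy_map and
                     toy_update are hypothetical names, and the error codes
                     (-EEXIST, -ENOENT, -E2BIG, -EINVAL) are assumptions
                     chosen to resemble what kernel hash maps return. The
                     numeric flag values are those of
                     include/uapi/linux/bpf.h.

```c
#include <errno.h>
#include <stdint.h>

/* Flag values as found in include/uapi/linux/bpf.h. */
#ifndef BPF_ANY
#define BPF_ANY     0   /* create a new entry or update an existing one */
#define BPF_NOEXIST 1   /* create a new entry only if it did not exist */
#define BPF_EXIST   2   /* update an existing entry only */
#endif

#define TOY_SLOTS 4     /* hypothetical capacity of the toy map */

struct toy_map {
    int      used[TOY_SLOTS];
    uint32_t keys[TOY_SLOTS];
    uint64_t vals[TOY_SLOTS];
};

/* Toy model of the update semantics: 0 on success, negative errno on error. */
int toy_update(struct toy_map *m, uint32_t key, uint64_t val, uint64_t flags)
{
    int i, hit = -1, free_slot = -1;

    if (flags > BPF_EXIST)
        return -EINVAL;

    /* Find the entry for key, and remember the first free slot. */
    for (i = 0; i < TOY_SLOTS; i++) {
        if (m->used[i] && m->keys[i] == key)
            hit = i;
        else if (!m->used[i] && free_slot < 0)
            free_slot = i;
    }

    if (hit >= 0 && flags == BPF_NOEXIST)
        return -EEXIST;         /* entry must not exist, but it does */
    if (hit < 0 && flags == BPF_EXIST)
        return -ENOENT;         /* entry must exist, but it does not */

    if (hit < 0) {
        if (free_slot < 0)
            return -E2BIG;      /* toy map is full */
        hit = free_slot;
        m->used[hit] = 1;
        m->keys[hit] = key;
    }
    m->vals[hit] = val;
    return 0;
}
```

                     In this model, BPF_ANY never fails because of the
                     presence or absence of the entry, only for lack of
                     room.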
 
       int bpf_map_delete_elem(struct bpf_map *map, const void *key)

              Description
                     Delete entry with key from map.

              Return 0 on success, or a negative error in case of failure.

       int bpf_map_push_elem(struct bpf_map *map, const void *value, u64
       flags)

              Description
                     Push an element value in map. flags is one of:

                     BPF_EXIST
                            If the queue/stack is full, the oldest element is
                            removed to make room for the new one.

              Return 0 on success, or a negative error in case of failure.

       int bpf_probe_read(void *dst, u32 size, const void *src)

              Description
                     For tracing programs, safely attempt to read size bytes
                     from address src and store the data in dst.

              Return 0 on success, or a negative error in case of failure.

       u64 bpf_ktime_get_ns(void)

              Description
                     Return the time elapsed since system boot, in
                     nanoseconds.

              Return Current ktime.
 
       int bpf_trace_printk(const char *fmt, u32 fmt_size, ...)

              Description
                     This helper is a "printk()-like" facility for debugging.
                     It prints a message defined by format fmt (of size
                     fmt_size) to file /sys/kernel/debug/tracing/trace from
                     DebugFS, if available. It can take up to three additional
                     u64 arguments (as with any eBPF helper, the total number
                     of arguments is limited to five).

                     Each time the helper is called, it appends a line to the
                     trace. The format of the trace is customizable, and the
                     exact output one will get depends on the options set in
                     /sys/kernel/debug/tracing/trace_options (see also the
                     README file under the same directory). However, it
                     usually defaults to something like:

                        telnet-470   [001] .N.. 419421.045894: 0x00000001: <formatted msg>

                     In the above:

                        · telnet is the name of the current task.

                        · 470 is the PID of the current task.

                        · 001 is the CPU number on which the task is running.

                        · In .N.., each character refers to a set of options
                          (whether irqs are enabled, scheduling options,
                          whether hard/softirqs are running, and the level of
                          preempt_disabled, respectively). N means that
                          TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED are set.

                        · 419421.045894 is a timestamp.

                        · 0x00000001 is a fake value used by BPF for the
                          instruction pointer register.

                        · <formatted msg> is the message formatted with fmt.

                     The conversion specifiers supported by fmt are similar
                     to, but more limited than, those of printk(). They are
                     %d, %i, %u, %x, %ld, %li, %lu, %lx, %lld, %lli, %llu,
                     %llx, %p, %s. No modifier (size of field, padding with
                     zeroes, etc.) is available, and the helper will return
                     -EINVAL (but print nothing) if it encounters an unknown
                     specifier.

                     Also, note that bpf_trace_printk() is slow, and should
                     only be used for debugging purposes. For this reason, the
                     first time this helper is used (or more precisely, when
                     trace_printk() buffers are allocated), a notice block
                     (spanning several lines) is printed to kernel logs,
                     stating that the helper should not be used "for
                     production use". For passing values to user space, perf
                     events should be preferred.

              Return The number of bytes written to the buffer, or a negative
                     error in case of failure.
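
                     For reading the trace from user space, the default line
                     layout described above could be split into its fields
                     roughly as follows. This is only a sketch: struct
                     trace_line and parse_trace_line() are hypothetical
                     names, the layout is assumed to be exactly the one
                     shown in the sample line, and task names containing a
                     "-" are not handled.

```c
#include <stdio.h>

/* Fields of one trace line, in the default layout shown above. */
struct trace_line {
    char   task[32];    /* name of the current task */
    int    pid;         /* PID of the current task */
    int    cpu;         /* CPU number */
    char   opts[8];     /* option characters, e.g. ".N.." */
    double timestamp;   /* timestamp as printed in the trace */
    char   msg[128];    /* formatted message */
};

/* Return 0 if the line matches the assumed layout, -1 otherwise. */
int parse_trace_line(const char *line, struct trace_line *out)
{
    /* %*x skips the fake instruction pointer value (0x00000001). */
    int n = sscanf(line, " %31[^-]-%d [%d] %7s %lf: %*x: %127[^\n]",
                   out->task, &out->pid, &out->cpu, out->opts,
                   &out->timestamp, out->msg);

    return n == 6 ? 0 : -1;
}
```

                     Since the trace format depends on trace_options, a
                     parser of this kind is best kept tolerant of lines it
                     does not recognize.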
 
       u32 bpf_get_prandom_u32(void)

              Description
                     Get a pseudo-random number.

                     From a security point of view, this helper uses its own
                     pseudo-random internal state, and cannot be used to infer
                     the seed of other random functions in the kernel.
                     However, it is essential to note that the generator used
                     by the helper is not cryptographically secure.

              Return A random 32-bit unsigned value.

       u32 bpf_get_smp_processor_id(void)

              Description
                     Get the SMP (symmetric multiprocessing) processor id.
                     Note that all programs run with preemption disabled,
                     which means that the SMP processor id is stable
                     throughout the execution of the program.

              Return The SMP id of the processor running the program.
 
       int bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void
       *from, u32 len, u64 flags)

              Description
                     Store len bytes from address from into the packet
                     associated with skb, at offset. flags are a combination
                     of BPF_F_RECOMPUTE_CSUM (automatically recompute the
                     checksum for the packet after storing the bytes) and
                     BPF_F_INVALIDATE_HASH (set skb->hash, skb->swhash and
                     skb->l4hash to 0).

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.
 
       int bpf_l3_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64
       to, u64 size)

              Description
                     Recompute the layer 3 (e.g. IP) checksum for the packet
                     associated with skb. Computation is incremental, so the
                     helper must know the former value of the header field
                     that was modified (from), the new value of this field
                     (to), and the number of bytes (2 or 4) for this field,
                     stored in size. Alternatively, it is possible to store
                     the difference between the previous and the new values of
                     the header field in to, by setting from and size to 0.
                     For both methods, offset indicates the location of the IP
                     checksum within the packet.

                     This helper works in combination with bpf_csum_diff(),
                     which does not update the checksum in-place, but offers
                     more flexibility and can handle sizes larger than 2 or 4
                     for the checksum to update.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.
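
                     The incremental computation mentioned above can be
                     sketched in user space with the update rule from
                     RFC 1624, HC' = ~(~HC + ~m + m'). This only illustrates
                     the arithmetic and is not the kernel's implementation:
                     the function names are hypothetical, and header words
                     are treated as plain 16-bit integers, ignoring byte
                     order.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full checksum over n 16-bit words; the checksum word itself must be 0. */
uint16_t csum_full(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~csum_fold(sum);
}

/*
 * Incremental update for a single 16-bit field changing from "from" to
 * "to", following RFC 1624: HC' = ~(~HC + ~m + m').
 */
uint16_t csum_update16(uint16_t check, uint16_t from, uint16_t to)
{
    uint32_t sum = (uint16_t)~check;

    sum += (uint16_t)~from;
    sum += to;
    return (uint16_t)~csum_fold(sum);
}
```

                     Updating the checksum this way gives the same result as
                     recomputing it over the whole header, while touching
                     only the modified field.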
 
       int bpf_l4_csum_replace(struct sk_buff *skb, u32 offset, u64 from, u64
       to, u64 flags)

              Description
                     Recompute the layer 4 (e.g. TCP, UDP or ICMP) checksum
                     for the packet associated with skb. Computation is
                     incremental, so the helper must know the former value of
                     the header field that was modified (from), the new value
                     of this field (to), and the number of bytes (2 or 4) for
                     this field, stored in the lowest four bits of flags.
                     Alternatively, it is possible to store the difference
                     between the previous and the new values of the header
                     field in to, by setting from and the four lowest bits of
                     flags to 0. For both methods, offset indicates the
                     location of the IP checksum within the packet. In
                     addition to the size of the field, actual flags can be
                     added to flags with a bitwise OR. With
                     BPF_F_MARK_MANGLED_0, a null checksum is left untouched
                     (unless BPF_F_MARK_ENFORCE is added as well), and for
                     updates resulting in a null checksum the value is set to
                     CSUM_MANGLED_0 instead. Flag BPF_F_PSEUDO_HDR indicates
                     the checksum is to be computed against a pseudo-header.

                     This helper works in combination with bpf_csum_diff(),
                     which does not update the checksum in-place, but offers
                     more flexibility and can handle sizes larger than 2 or 4
                     for the checksum to update.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.
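
                     The layout of flags described above, with the field size
                     in the lowest four bits and the actual flags ORed on
                     top, can be sketched as follows. The helper names
                     l4_csum_flags() and l4_csum_field_size() are
                     hypothetical; the constant values are copied from
                     include/uapi/linux/bpf.h and should be checked against
                     the headers in use.

```c
#include <stdint.h>

/* Values as found in include/uapi/linux/bpf.h (verify against your tree). */
#define BPF_F_HDR_FIELD_MASK  0xfULL
#define BPF_F_PSEUDO_HDR      (1ULL << 4)
#define BPF_F_MARK_MANGLED_0  (1ULL << 5)
#define BPF_F_MARK_ENFORCE    (1ULL << 6)

/* Combine the field size (0, 2 or 4) with the actual flags. */
uint64_t l4_csum_flags(unsigned int field_size, uint64_t extra)
{
    return ((uint64_t)field_size & BPF_F_HDR_FIELD_MASK) | extra;
}

/* Extract the field size back from a flags value. */
unsigned int l4_csum_field_size(uint64_t flags)
{
    return (unsigned int)(flags & BPF_F_HDR_FIELD_MASK);
}
```

                     For a 2-byte field covered by a pseudo-header checksum,
                     l4_csum_flags(2, BPF_F_PSEUDO_HDR) yields 0x12.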
 
       int bpf_tail_call(void *ctx, struct bpf_map *prog_array_map, u32 index)

              Description
                     This special helper is used to trigger a "tail call", or
                     in other words, to jump into another eBPF program. The
                     same stack frame is used (but values on stack and in
                     registers for the caller are not accessible to the
                     callee). This mechanism allows for program chaining,
                     either for raising the maximum number of available eBPF
                     instructions, or to execute given programs in conditional
                     blocks. For security reasons, there is an upper limit to
                     the number of successive tail calls that can be
                     performed.

                     Upon call of this helper, the program attempts to jump
                     into a program referenced at index index in
                     prog_array_map, a special map of type
                     BPF_MAP_TYPE_PROG_ARRAY, and passes ctx, a pointer to the
                     context.

                     If the call succeeds, the kernel immediately runs the
                     first instruction of the new program. This is not a
                     function call, and it never returns to the previous
                     program. If the call fails, then the helper has no
                     effect, and the caller continues to run its subsequent
                     instructions. A call can fail if the destination program
                     for the jump does not exist (i.e. index is greater than
                     or equal to the number of entries in prog_array_map), or
                     if the maximum number of tail calls has been reached for
                     this chain of programs. This limit is defined in the
                     kernel by the macro MAX_TAIL_CALL_CNT (not accessible to
                     user space), which is currently set to 32.

              Return 0 on success, or a negative error in case of failure.
 
       int bpf_clone_redirect(struct sk_buff *skb, u32 ifindex, u64 flags)

              Description
                     Clone and redirect the packet associated with skb to
                     another net device of index ifindex. Both ingress and
                     egress interfaces can be used for redirection. The
                     BPF_F_INGRESS value in flags is used to make the
                     distinction (ingress path is selected if the flag is
                     present, egress path otherwise). This is the only flag
                     supported for now.

                     In comparison with the bpf_redirect() helper,
                     bpf_clone_redirect() has the associated cost of
                     duplicating the packet buffer, but the redirection can be
                     carried out from within the eBPF program. Conversely,
                     bpf_redirect() is more efficient, but it is handled
                     through an action code where the redirection happens only
                     after the eBPF program has returned.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       u64 bpf_get_current_pid_tgid(void)

              Return A 64-bit integer containing the current tgid and pid, and
                     created as such: current_task->tgid << 32 |
                     current_task->pid.
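
                     The packed value can be taken apart again with a shift
                     and a truncation, as in the sketch below. The function
                     names are hypothetical and only illustrate the layout
                     given above.

```c
#include <stdint.h>

/* Pack tgid and pid the way the return value is described above. */
uint64_t pack_pid_tgid(uint32_t tgid, uint32_t pid)
{
    return (uint64_t)tgid << 32 | pid;
}

/* Upper 32 bits: the tgid. */
uint32_t tgid_of(uint64_t pid_tgid)
{
    return (uint32_t)(pid_tgid >> 32);
}

/* Lower 32 bits: the pid. */
uint32_t pid_of(uint64_t pid_tgid)
{
    return (uint32_t)pid_tgid;
}
```

                     Note that in kernel terms the tgid is what user space
                     usually calls the process ID, while the pid identifies
                     a single thread.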
 
       u64 bpf_get_current_uid_gid(void)

              Return A 64-bit integer containing the current GID and UID, and
                     created as such: current_gid << 32 | current_uid.

       int bpf_get_current_comm(char *buf, u32 size_of_buf)

              Description
                     Copy the comm attribute of the current task into buf of
                     size_of_buf. The comm attribute contains the name of the
                     executable (excluding the path) for the current task. The
                     size_of_buf must be strictly positive. On success, the
                     helper makes sure that buf is NUL-terminated. On failure,
                     it is filled with zeroes.

              Return 0 on success, or a negative error in case of failure.
 
       u32 bpf_get_cgroup_classid(struct sk_buff *skb)

              Description
                     Retrieve the classid for the current task, i.e. for the
                     net_cls cgroup to which skb belongs.

                     This helper can be used on TC egress path, but not on
                     ingress.

                     The net_cls cgroup provides an interface to tag network
                     packets based on a user-provided identifier for all
                     traffic coming from the tasks belonging to the related
                     cgroup. See also the related kernel documentation,
                     available from the Linux sources in file
                     Documentation/cgroup-v1/net_cls.txt.

                     The Linux kernel has two versions for cgroups: there are
                     cgroups v1 and cgroups v2. Both are available to users,
                     who can use a mixture of them, but note that the net_cls
                     cgroup is for cgroup v1 only. This makes it incompatible
                     with BPF programs run on cgroups, which is a
                     cgroup-v2-only feature (a socket can only hold data for
                     one version of cgroups at a time).

                     This helper is only available if the kernel was compiled
                     with the CONFIG_CGROUP_NET_CLASSID configuration option
                     set to "y" or to "m".

              Return The classid, or 0 for the default unconfigured classid.
 
       int bpf_skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16
       vlan_tci)

              Description
                     Push a vlan_tci (VLAN tag control information) of
                     protocol vlan_proto to the packet associated with skb,
                     then update the checksum. Note that if vlan_proto is
                     different from ETH_P_8021Q and ETH_P_8021AD, it is
                     considered to be ETH_P_8021Q.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.

       int bpf_skb_vlan_pop(struct sk_buff *skb)

              Description
                     Pop a VLAN header from the packet associated with skb.

                     A call to this helper may change the underlying packet
                     buffer. Therefore, at load time, all checks on pointers
                     previously done by the verifier are invalidated and must
                     be performed again if the helper is used in combination
                     with direct packet access.

              Return 0 on success, or a negative error in case of failure.
 
       int bpf_skb_get_tunnel_key(struct sk_buff *skb, struct bpf_tunnel_key
       *key, u32 size, u64 flags)

              Description
                     Get tunnel metadata. This helper takes a pointer key to
                     an empty struct bpf_tunnel_key of size, that will be
                     filled with tunnel metadata for the packet associated
                     with skb. The flags can be set to BPF_F_TUNINFO_IPV6,
                     which indicates that the tunnel is based on IPv6 protocol
                     instead of IPv4.

                     The struct bpf_tunnel_key is an object that generalizes
                     the principal parameters used by various tunneling
                     protocols into a single struct. This way, it can be used
                     to easily make a decision based on the contents of the
                     encapsulation header, "summarized" in this struct. In
                     particular, it holds the IP address of the remote end
                     (IPv4 or IPv6, depending on the case) in key->remote_ipv4
                     or key->remote_ipv6. Also, this struct exposes the
                     key->tunnel_id, which is generally mapped to a VNI
                     (Virtual Network Identifier), making it programmable
                     together with the bpf_skb_set_tunnel_key() helper.

                     Let's imagine that the following code is part of a
                     program attached to the TC ingress interface, on one end
                     of a GRE tunnel, and is supposed to filter out all
                     messages coming from remote ends with IPv4 address other
                     than 10.0.0.1:

                        int ret;
                        struct bpf_tunnel_key key = {};

                        ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0);
                        if (ret < 0)
                                return TC_ACT_SHOT;     // drop packet

                        if (key.remote_ipv4 != 0x0a000001)
                                return TC_ACT_SHOT;     // drop packet

                        return TC_ACT_OK;               // accept packet

                     This interface can also be used with all encapsulation
                     devices that can operate in "collect metadata" mode:
                     instead of having one network device per specific
                     configuration, the "collect metadata" mode only requires
                     a single device where the configuration can be extracted
                     from this helper.

                     This can be used together with various tunnels such as
                     VXLan, Geneve, GRE or IP in IP (IPIP).

              Return 0 on success, or a negative error in case of failure.
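
                     The constant 0x0a000001 in the filter above is the
                     address 10.0.0.1 written as a 32-bit integer with the
                     first octet in the most significant byte. The sketch
                     below shows that mapping. It assumes, as the comparison
                     in the example implies, that key.remote_ipv4 is held in
                     host byte order; ipv4_host_order() is a hypothetical
                     helper name.

```c
#include <stdint.h>

/* Build the host-byte-order integer for the IPv4 address a.b.c.d. */
uint32_t ipv4_host_order(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}
```

                     With this helper, ipv4_host_order(10, 0, 0, 1) equals
                     0x0a000001.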
 
       int bpf_skb_set_tunnel_key(struct sk_buff *skb, struct bpf_tunnel_key
       *key, u32 size, u64 flags)

              Description
                     Populate tunnel metadata for packet associated with skb.
                     The tunnel metadata is set to the contents of key, of
                     size. The flags can be set to a combination of the
                     following values:

                     BPF_F_TUNINFO_IPV6
                            Indicate that the tunnel is based on IPv6 protocol
                            instead of IPv4.

                     BPF_F_ZERO_CSUM_TX
                            For IPv4 packets, add a flag to tunnel metadata
                            indicating that checksum computation should be
                            skipped and checksum set to zeroes.

                     BPF_F_DONT_FRAGMENT
                            Add a flag to tunnel metadata indicating that the
                            packet should not be fragmented.

                     BPF_F_SEQ_NUMBER
                            Add a flag to tunnel metadata indicating that a
                            sequence number should be added to tunnel header
                            before sending the packet. This flag was added for
                            GRE encapsulation, but might be used with other
                            protocols as well in the future.

                     Here is a typical usage on the transmit path:

                        struct bpf_tunnel_key key;

                        /* populate key ... */
                        bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0);
                        bpf_clone_redirect(skb, vxlan_dev_ifindex, 0);

                     See also the description of the bpf_skb_get_tunnel_key()
                     helper for additional information.

              Return 0 on success, or a negative error in case of failure.
 
       u64 bpf_perf_event_read(struct bpf_map *map, u64 flags)

              Description
                     Read the value of a perf event counter. This helper
                     relies on a map of type BPF_MAP_TYPE_PERF_EVENT_ARRAY.
                     The nature of the perf event counter is selected when map
                     is updated with perf event file descriptors. The map is
                     an array whose size is the number of available CPUs, and
                     each cell contains a value relative to one CPU. The value
                     to retrieve is indicated by flags, which contains the
                     index of the CPU to look up, masked with
                     BPF_F_INDEX_MASK. Alternatively, flags can be set to
                     BPF_F_CURRENT_CPU to indicate that the value for the
                     current CPU should be retrieved.

                     Note that before Linux 4.13, only hardware perf events
                     can be retrieved.

                     Also, be aware that the newer helper
                     bpf_perf_event_read_value() is recommended over
                     bpf_perf_event_read() in general. The latter has some ABI
                     quirks where error and counter value are used as a return
                     code (which is wrong to do since ranges may overlap).
                     This issue is fixed with bpf_perf_event_read_value(),
                     which at the same time provides more features over the
                     bpf_perf_event_read() interface. Please refer to the
                     description of bpf_perf_event_read_value() for details.

              Return The value of the perf event counter read from the map, or
                     a negative error code in case of failure.
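
                     The sketch below illustrates two points from the
                     description: how a CPU index is placed in flags, and
                     why returning errors and counter values through the
                     same u64 is ambiguous. The function names are
                     hypothetical; the two constants are copied from
                     include/uapi/linux/bpf.h and should be checked against
                     the headers in use.

```c
#include <errno.h>
#include <stdint.h>

/* Values as found in include/uapi/linux/bpf.h (verify against your tree). */
#define BPF_F_INDEX_MASK   0xffffffffULL
#define BPF_F_CURRENT_CPU  BPF_F_INDEX_MASK

/* flags value selecting the counter of a given CPU index. */
uint64_t perf_flags_for_cpu(uint32_t cpu)
{
    return (uint64_t)cpu & BPF_F_INDEX_MASK;
}

/*
 * What a negative error looks like once returned as a u64: a very large
 * number that a genuine counter value could, in principle, also reach.
 */
uint64_t perf_error_as_u64(int err)
{
    return (uint64_t)(int64_t)(-err);
}
```

                     A caller that tests (s64)ret < 0 to detect an error
                     would misread any counter value with the top bit set,
                     which is the overlap mentioned above.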
 
       int bpf_redirect(u32 ifindex, u64 flags)

              Description
                     Redirect the packet to another net device of index
                     ifindex. This helper is somewhat similar to
                     bpf_clone_redirect(), except that the packet is not
                     cloned, which provides increased performance.

                     Except for XDP, both ingress and egress interfaces can be
                     used for redirection. The BPF_F_INGRESS value in flags is
                     used to make the distinction (ingress path is selected if
                     the flag is present, egress path otherwise). Currently,
                     XDP only supports redirection to the egress interface,
                     and accepts no flag at all.

                     The same effect can be attained with the more generic
                     bpf_redirect_map(), which requires specific maps to be
                     used but offers better performance.

              Return For XDP, the helper returns XDP_REDIRECT on success or
                     XDP_ABORTED on error. For other program types, the values
                     are TC_ACT_REDIRECT on success or TC_ACT_SHOT on error.
 
       u32 bpf_get_route_realm(struct sk_buff *skb)

              Description
                     Retrieve the realm of the route, that is to say the
                     tclassid field of the destination for the skb. The
                     identifier retrieved is a user-provided tag, similar to
                     the one used with the net_cls cgroup (see description for
                     bpf_get_cgroup_classid() helper), but here this tag is
                     held by a route (a destination entry), not by a task.

                     Retrieving this identifier works with the clsact TC
                     egress hook (see also tc-bpf(8)), or alternatively on
                     conventional classful egress qdiscs, but not on TC
                     ingress path. In case of clsact TC egress hook, this has
                     the advantage that, internally, the destination entry has
                     not been dropped yet in the transmit path. Therefore, the
                     destination entry does not need to be artificially held
                     via netif_keep_dst() for a classful qdisc until the skb
                     is freed.

                     This helper is available only if the kernel was compiled
                     with CONFIG_IP_ROUTE_CLASSID configuration option.

              Return The realm of the route for the packet associated with
                     skb, or 0 if none was found.
586
587       int bpf_perf_event_output(struct pt_regs *ctx, struct bpf_map *map, u64
588       flags, void *data, u64 size)
589
590              Description
591                     Write raw data blob into a special BPF perf event held by
592                     map  of  type  BPF_MAP_TYPE_PERF_EVENT_ARRAY.  This  perf
593                     event must have the following attributes: PERF_SAMPLE_RAW
594                     as   sample_type,   PERF_TYPE_SOFTWARE   as   type,   and
595                     PERF_COUNT_SW_BPF_OUTPUT as config.
596
597                     The flags are used to indicate the index in map for which
598                     the  value  must  be  put,  masked with BPF_F_INDEX_MASK.
599                     Alternatively, flags can be set to  BPF_F_CURRENT_CPU  to
600                     indicate that the index of the current CPU core should be
601                     used.
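
The flags encoding described above can be modelled in plain user space
C. This is only a sketch: the DEMO_ constants mirror the values found in
include/uapi/linux/bpf.h at the time of writing, and the two demo_
functions are hypothetical names introduced for illustration, not part
of any kernel or library API.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrored from include/uapi/linux/bpf.h; redefined with a DEMO_ prefix
 * so that the sketch builds without kernel headers. */
#define DEMO_BPF_F_INDEX_MASK  0xffffffffULL
#define DEMO_BPF_F_CURRENT_CPU DEMO_BPF_F_INDEX_MASK

/* Build a flags value that selects an explicit index in the map. */
static uint64_t demo_perf_flags_for_index(uint32_t index)
{
	/* The all-ones index is reserved: it means "current CPU". */
	assert(index != DEMO_BPF_F_CURRENT_CPU);
	return (uint64_t)index & DEMO_BPF_F_INDEX_MASK;
}

/* Resolve the map index a flags value refers to. cur_cpu stands in for
 * the id of the CPU the program would be running on. */
static uint32_t demo_perf_index_from_flags(uint64_t flags, uint32_t cur_cpu)
{
	uint64_t index = flags & DEMO_BPF_F_INDEX_MASK;

	return index == DEMO_BPF_F_CURRENT_CPU ? cur_cpu : (uint32_t)index;
}
```

In an eBPF program, the common choice is to pass BPF_F_CURRENT_CPU
directly, so that each CPU writes into its own perf event.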
602
603                     The value to write, of size, is passed through eBPF stack
604                     and pointed to by data.
605
606                     The context of the program, ctx, also needs to be passed
607                     to the helper.
608
609                     In user space, a program willing to read the values needs
610                     to  call  perf_event_open() on the perf event (either for
611                     one or for all CPUs) and to  store  the  file  descriptor
612                     into  the  map. This must be done before the eBPF program
613                     can send data into it. An example is  available  in  file
614                     samples/bpf/trace_output_user.c   in   the  Linux  kernel
615                     source tree (the eBPF  program  counterpart  is  in  sam‐
616                     ples/bpf/trace_output_kern.c).
617
618                     bpf_perf_event_output() achieves better performance than
619                     bpf_trace_printk() for sharing data with user space, and
620                     is much better suited for streaming data from eBPF
621                     programs.
622
623                     Note that this helper is not restricted  to  tracing  use
624                     cases and can be used with programs attached to TC or XDP
625                     as well, where it allows for passing data to  user  space
626                     listeners. Data can be:
627
628                     · Only custom structs,
629
630                     · Only the packet payload, or
631
632                     · A combination of both.
633
634              Return 0 on success, or a negative error in case of failure.
635
636       int bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void *to,
637       u32 len)
638
639              Description
640                     This helper was provided as an easy way to load data from
641                     a  packet.  It  can be used to load len bytes from offset
642                     from the  packet  associated  to  skb,  into  the  buffer
643                     pointed to by to.
644
645                     Since  Linux  4.7,  usage  of this helper has mostly been
646                     replaced by "direct packet access", enabling packet  data
647                     to be manipulated with skb->data and skb->data_end point‐
648                     ing respectively to the first byte of packet data and  to
649                     the  byte after the last byte of packet data. However, it
650                     remains useful if one wishes to read large quantities  of
651                     data at once from a packet into the eBPF stack.
652
653              Return 0 on success, or a negative error in case of failure.
654
655       int bpf_get_stackid(struct pt_regs *ctx, struct bpf_map *map, u64 flags)
656
657              Description
658                     Walk  a  user  or  a  kernel  stack and return its id. To
659                     achieve this, the helper needs ctx, which is a pointer to
660                     the context on which the tracing program is executed, and
661                     a pointer to a map of type BPF_MAP_TYPE_STACK_TRACE.
662
663                     The last argument,  flags,  holds  the  number  of  stack
664                     frames   to   skip   (from   0   to   255),  masked  with
665                     BPF_F_SKIP_FIELD_MASK. The next bits can be used to set a
666                     combination of the following flags:
667
668                     BPF_F_USER_STACK
669                            Collect  a  user  space  stack instead of a kernel
670                            stack.
671
672                     BPF_F_FAST_STACK_CMP
673                            Compare stacks by hash only.
674
675                     BPF_F_REUSE_STACKID
676                            If  two  different  stacks  hash  into  the   same
677                            stackid, discard the old one.
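
As a rough illustration of how the flags argument is composed, the
sketch below packs a skip count and a set of option bits into one
64-bit value. The DEMO_ constants mirror the values found in
include/uapi/linux/bpf.h at the time of writing; demo_stackid_flags()
is a hypothetical helper name used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrored from include/uapi/linux/bpf.h; redefined with a DEMO_ prefix
 * so that the sketch builds without kernel headers. */
#define DEMO_BPF_F_SKIP_FIELD_MASK 0xffULL
#define DEMO_BPF_F_USER_STACK      (1ULL << 8)
#define DEMO_BPF_F_FAST_STACK_CMP  (1ULL << 9)
#define DEMO_BPF_F_REUSE_STACKID   (1ULL << 10)

/* Combine the number of frames to skip (low 8 bits) with option bits. */
static uint64_t demo_stackid_flags(unsigned int skip, uint64_t options)
{
	assert(skip <= 255);
	/* Option bits must not overlap the skip field. */
	assert((options & DEMO_BPF_F_SKIP_FIELD_MASK) == 0);
	return ((uint64_t)skip & DEMO_BPF_F_SKIP_FIELD_MASK) | options;
}
```

For instance, skipping two frames of a user space stack while allowing
colliding entries to be replaced would use
demo_stackid_flags(2, DEMO_BPF_F_USER_STACK | DEMO_BPF_F_REUSE_STACKID).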
678
679                     The stack id retrieved is a 32-bit integer handle
680                     which can be further combined with other data  (including
681                     other stack ids) and used as a key into maps. This can be
682                     useful for generating a variety of graphs (such as  flame
683                     graphs or off-cpu graphs).
684
685                     For  walking  a stack, this helper is an improvement over
686                     bpf_probe_read(), which can be used with  unrolled  loops
687                     but is not efficient and consumes a lot of eBPF
688                     instructions. Instead, bpf_get_stackid() can collect up to
689                     PERF_MAX_STACK_DEPTH frames for both kernel and user
690                     stacks. Note that this limit can be controlled with the
691                     sysctl program, and that it should be manually increased
692                     in order to profile long user stacks (such as stacks for
693                     Java programs). To do so, use:
694
695                        # sysctl kernel.perf_event_max_stack=<new value>
696
697              Return The stack id (zero or positive) on success, or a negative
698                     error in case of failure.
699
700       s64 bpf_csum_diff(__be32 *from, u32 from_size, __be32 *to, u32 to_size,
701       __wsum seed)
702
703              Description
704                     Compute a checksum difference, from the raw buffer
705                     pointed to by from, of length from_size (that must be a
706                     multiple of 4), towards the raw buffer pointed to by to,
707                     of size to_size (same remark). An optional seed can be
708                     added to the value (this can be cascaded, the seed may
709                     come from a previous call to the helper).
710
711                     This is flexible enough to be used in several ways:
712
713                     · With from_size == 0, to_size > 0 and seed set to check‐
714                       sum, it can be used when pushing new data.
715
716                     · With from_size > 0, to_size == 0 and seed set to check‐
717                       sum, it can be used when removing data from a packet.
718
719                     · With from_size > 0, to_size > 0 and seed set to  0,  it
720                       can  be used to compute a diff. Note that from_size and
721                       to_size do not need to be equal.
722
723                     This   helper   can   be   used   in   combination   with
724                     bpf_l3_csum_replace() and bpf_l4_csum_replace(), to which
725                     one  can   feed   in   the   difference   computed   with
726                     bpf_csum_diff().
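
The arithmetic behind a checksum difference can be sketched in user
space as a ones' complement sum over the complemented "from" words
followed by the "to" words, starting from the seed. This is a simplified
model under stated assumptions, not the kernel implementation: the
demo_ functions are hypothetical names, sizes are given in 32-bit words
rather than bytes, and words are summed as stored in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit ones' complement addition (end-around carry). */
static uint32_t demo_csum_add(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)(s & 0xffffffffu) + (uint32_t)(s >> 32);
}

/* Model of a checksum difference: subtract the "from" words (by adding
 * their complement) and add the "to" words on top of seed. */
static uint32_t demo_csum_diff(const uint32_t *from, size_t from_words,
			       const uint32_t *to, size_t to_words,
			       uint32_t seed)
{
	uint32_t acc = seed;
	size_t i;

	assert(from != NULL || from_words == 0);
	assert(to != NULL || to_words == 0);
	for (i = 0; i < from_words; i++)
		acc = demo_csum_add(acc, ~from[i]);
	for (i = 0; i < to_words; i++)
		acc = demo_csum_add(acc, to[i]);
	return acc;
}

/* Fold a 32-bit sum into the usual 16-bit complemented checksum. */
static uint16_t demo_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffffu) + (sum >> 16);
	sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)~sum;
}
```

With this model, replacing one word and applying the difference to the
old sum yields the same folded checksum as summing the whole updated
buffer again, which is the property the three usage patterns above rely
on.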
727
728              Return The  checksum result, or a negative error code in case of
729                     failure.
730
731       int bpf_skb_get_tunnel_opt(struct sk_buff *skb, u8 *opt, u32 size)
732
733              Description
734                     Retrieve tunnel options metadata for the  packet  associ‐
735                     ated  to skb, and store the raw tunnel option data to the
736                     buffer opt of size.
737
738                     This helper can be used with encapsulation  devices  that
739                     can  operate  in "collect metadata" mode (please refer to
740                     the related note in the description  of  bpf_skb_get_tun‐
741                     nel_key()  for  more details). A particular example where
742                     this can be used is in combination with the Geneve
743                     encapsulation protocol, where it allows for pushing (with
744                     bpf_skb_set_tunnel_opt() helper) and retrieving arbitrary
745                     TLVs (Type-Length-Value headers) from the eBPF program.
746                     This allows for full customization of these headers.
747
748              Return The size of the option data retrieved.
749
750       int bpf_skb_set_tunnel_opt(struct sk_buff *skb, u8 *opt, u32 size)
751
752              Description
753                     Set tunnel options metadata for the packet associated  to
754                     skb to the option data contained in the raw buffer opt of
755                     size.
756
757                     See also the description of the  bpf_skb_get_tunnel_opt()
758                     helper for additional information.
759
760              Return 0 on success, or a negative error in case of failure.
761
762       int bpf_skb_change_proto(struct sk_buff *skb, __be16 proto, u64 flags)
763
764              Description
765                     Change  the  protocol of the skb to proto. Currently sup‐
766                     ported are transition from IPv4 to IPv6, and from IPv6 to
767                     IPv4.  The  helper  takes  care of the groundwork for the
768                     transition, including resizing  the  socket  buffer.  The
769                     eBPF program is expected to fill the new headers, if any,
770                     via bpf_skb_store_bytes() and to recompute the checksums with
771                     bpf_l3_csum_replace() and bpf_l4_csum_replace(). The main
772                     case for this helper is to perform NAT64  operations  out
773                     of an eBPF program.
774
775                     Internally, the GSO type is marked as dodgy so that head‐
776                     ers are checked and  segments  are  recalculated  by  the
777                     GSO/GRO  engine.   The  size for GSO target is adapted as
778                     well.
779
780                     All values for flags are reserved for future  usage,  and
781                     must be left at zero.
782
783                     A call to this helper may change the underlying packet
784                     buffer. Therefore, at load time, all checks on pointers
785                     previously done by the verifier are invalidated and must
786                     be performed again, if the helper is used in combination
787                     with direct packet access.
788
789              Return 0 on success, or a negative error in case of failure.
790
791       int bpf_skb_change_type(struct sk_buff *skb, u32 type)
792
793              Description
794                     Change  the packet type for the packet associated to skb.
795                     This comes down to setting skb->pkt_type to type,  except
796                     the  eBPF  program  does  not  have  a  write  access  to
797                     skb->pkt_type beside this helper.  Using  a  helper  here
798                     allows for graceful handling of errors.
799
800                     The major use case is to change incoming skbs to
801                     PACKET_HOST in a programmatic way instead of having to
802                     recirculate via redirect(..., BPF_F_INGRESS), for
803                     example.
804
805                     Note that type only allows certain values. At this  time,
806                     they are:
807
808                     PACKET_HOST
809                            Packet is for us.
810
811                     PACKET_BROADCAST
812                            Send packet to all.
813
814                     PACKET_MULTICAST
815                            Send packet to group.
816
817                     PACKET_OTHERHOST
818                            Send packet to someone else.
819
820              Return 0 on success, or a negative error in case of failure.
821
822       int  bpf_skb_under_cgroup(struct sk_buff *skb, struct bpf_map *map, u32
823       index)
824
825              Description
826                     Check whether skb is a descendant of the cgroup2 held  by
827                     map of type BPF_MAP_TYPE_CGROUP_ARRAY, at index.
828
829              Return The  return  value depends on the result of the test, and
830                     can be:
831
832                     · 0, if the skb failed the cgroup2 descendant test.
833
834                     · 1, if the skb succeeded the cgroup2 descendant test.
835
836                     · A negative error code, if an error occurred.
837
838       u32 bpf_get_hash_recalc(struct sk_buff *skb)
839
840              Description
841                     Retrieve the hash of the packet, skb->hash. If it is  not
842                     set,  in  particular  if the hash was cleared due to man‐
843                     gling, recompute this hash. Later accesses  to  the  hash
844                     can be done directly with skb->hash.
845
846                     Calling bpf_set_hash_invalid(), changing a packet
847                     protocol with bpf_skb_change_proto(), or calling
848                     bpf_skb_store_bytes() with the BPF_F_INVALIDATE_HASH flag
849                     are actions that may clear the hash and trigger a new
850                     computation on the next call to
851                     bpf_get_hash_recalc().
852
853              Return The 32-bit hash.
854
855       u64 bpf_get_current_task(void)
856
857              Return A pointer to the current task struct.
858
859       int bpf_probe_write_user(void *dst, const void *src, u32 len)
860
861              Description
862                     Attempt in a safe way to write len bytes from the  buffer
863                     src  to dst in memory. It only works for threads that are
864                     in user context, and dst  must  be  a  valid  user  space
865                     address.
866
867                     This  helper  should not be used to implement any kind of
868                     security mechanism because of TOC-TOU attacks, but rather
869                     to  debug, divert, and manipulate execution of semi-coop‐
870                     erative processes.
871
872                     Keep in mind that this feature is meant for  experiments,
873                     and it has a risk of crashing the system and running pro‐
874                     grams.  Therefore, when an eBPF program using this helper
875                     is  attached, a warning including PID and process name is
876                     printed to kernel logs.
877
878              Return 0 on success, or a negative error in case of failure.
879
880       int bpf_current_task_under_cgroup(struct bpf_map *map, u32 index)
881
882              Description
883                     Check whether the probe is being run in the context of a
884                     given  subset  of  the  cgroup2 hierarchy. The cgroup2 to
885                     test is held by map of type BPF_MAP_TYPE_CGROUP_ARRAY, at
886                     index.
887
888              Return The  return  value depends on the result of the test, and
889                     can be:
890
891                     · 0, if the current task does not belong to the cgroup2.
892
893                     · 1, if the current task belongs to the cgroup2.
894
895                     · A negative error code, if an error occurred.
896
897       int bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
898
899              Description
900                     Resize (trim or grow) the packet associated to skb to the
901                     new  len.  The  flags  are reserved for future usage, and
902                     must be left at zero.
903
904                     The basic idea is that the  helper  performs  the  needed
905                     work to change the size of the packet, then the eBPF
906                     program rewrites the rest via helpers like
907                     bpf_skb_store_bytes(), bpf_l3_csum_replace(),
908                     bpf_l4_csum_replace() and others. This helper is a slow
909                     path  utility intended for replies with control messages.
910                     And because it is targeted  for  slow  path,  the  helper
911                     itself  can  afford to be slow: it implicitly linearizes,
912                     unclones and drops offloads from the skb.
913
914                     A call to this helper may change the underlying packet
915                     buffer. Therefore, at load time, all checks on pointers
916                     previously done by the verifier are invalidated and must
917                     be performed again, if the helper is used in combination
918                     with direct packet access.
919
920              Return 0 on success, or a negative error in case of failure.
921
922       int bpf_skb_pull_data(struct sk_buff *skb, u32 len)
923
924              Description
925                     Pull in non-linear data in case the skb is non-linear and
926                     not  all  of len are part of the linear section. Make len
927                     bytes from skb readable and writable. If a zero value is
928                     passed for len, then all bytes in the linear part of
929                     skb are made readable and writable.
930
931                     This helper is only needed for reading and  writing  with
932                     direct packet access.
933
934                     For direct packet access, testing that offsets to access
935                     are within packet boundaries (test on skb->data_end) may
936                     fail if offsets are invalid, or if the requested data is
937                     in non-linear parts of the skb. On failure the program
938                     can just bail out, or in the case of a non-linear buffer,
939                     use a helper to make the data available. The
940                     bpf_skb_load_bytes() helper is a first solution to access
941                     the data. Another one consists in using
942                     bpf_skb_pull_data() to pull in the non-linear parts once,
943                     then retesting and eventually accessing the data.
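
The "test, pull, retest" sequence can be pictured with a toy user space
model. Everything here is an assumption made for illustration: struct
demo_skb and the demo_ functions are hypothetical, lengths replace the
real data and data_end pointers, and none of the kernel's actual
pulling logic is reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an skb: only the first linear_len bytes of the
 * total_len packet bytes can be accessed directly. */
struct demo_skb {
	size_t linear_len;
	size_t total_len;
};

/* Counterpart of the "data + off + len > data_end" bounds test. */
static int demo_in_linear(const struct demo_skb *skb, size_t off, size_t len)
{
	return off <= skb->linear_len && len <= skb->linear_len - off;
}

/* Toy model of pulling: grow the linear part so it covers len bytes. */
static int demo_pull(struct demo_skb *skb, size_t len)
{
	assert(len > 0);
	if (len > skb->total_len)
		return -1;
	if (len > skb->linear_len)
		skb->linear_len = len;
	return 0;
}

/* Test, pull on failure, then test again before any access. */
static int demo_prepare_access(struct demo_skb *skb, size_t off, size_t len)
{
	if (demo_in_linear(skb, off, len))
		return 0;
	if (len > SIZE_MAX - off || demo_pull(skb, off + len) < 0)
		return -1;
	return demo_in_linear(skb, off, len) ? 0 : -1;
}
```

The second bounds test is the important part: in a real program, the
pointers derived from skb->data must be reloaded and checked again
after the helper returns.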
944
945                     At the same  time,  this  also  makes  sure  the  skb  is
946                     uncloned,  which  is  a  necessary  condition  for direct
947                     write. As this needs to be an  invariant  for  the  write
948                     part  only,  the  verifier detects writes and adds a pro‐
949                     logue that is calling bpf_skb_pull_data() to  effectively
950                     unclone  the  skb  from  the very beginning in case it is
951                     indeed cloned.
952
953                     A call to this helper may change the underlying packet
954                     buffer. Therefore, at load time, all checks on pointers
955                     previously done by the verifier are invalidated and must
956                     be performed again, if the helper is used in combination
957                     with direct packet access.
958
959              Return 0 on success, or a negative error in case of failure.
960
961       s64 bpf_csum_update(struct sk_buff *skb, __wsum csum)
962
963              Description
964                     Add the checksum csum into skb->csum in case  the  driver
965                     has  supplied  a checksum for the entire packet into that
966                     field. Return an error otherwise. This helper is intended
967                     to  be  used in combination with bpf_csum_diff(), in par‐
968                     ticular when the checksum needs to be updated after  data
969                     has  been  written  into the packet through direct packet
970                     access.
971
972              Return The checksum on success, or a negative error code in case
973                     of failure.
974
975       void bpf_set_hash_invalid(struct sk_buff *skb)
976
977              Description
978                     Invalidate  the  current  skb->hash. It can be used after
979                     mangling on headers  through  direct  packet  access,  in
980                     order  to indicate that the hash is outdated and to trig‐
981                     ger a recalculation the next time  the  kernel  tries  to
982                     access this hash or when the bpf_get_hash_recalc() helper
983                     is called.
984
985       int bpf_get_numa_node_id(void)
986
987              Description
988                     Return the id of the current NUMA node. The  primary  use
989                     case  for this helper is the selection of sockets for the
990                     local NUMA node, when the program is attached to  sockets
991                     using   the  SO_ATTACH_REUSEPORT_EBPF  option  (see  also
992                     socket(7)), but the helper is  also  available  to  other
993                     eBPF  program  types,  similarly  to  bpf_get_smp_proces‐
994                     sor_id().
995
996              Return The id of current NUMA node.
997
998       int bpf_skb_change_head(struct sk_buff *skb, u32 len, u64 flags)
999
1000              Description
1001                     Grows headroom of packet associated to  skb  and  adjusts
1002                     the  offset  of  the  MAC  header accordingly, adding len
1003                     bytes of space. It automatically extends and  reallocates
1004                     memory as required.
1005
1006                     This  helper  can  be used on a layer 3 skb to push a MAC
1007                     header for redirection into a layer 2 device.
1008
1009                     All values for flags are reserved for future  usage,  and
1010                     must be left at zero.
1011
1012                     A call to this helper may change the underlying packet
1013                     buffer. Therefore, at load time, all checks on pointers
1014                     previously done by the verifier are invalidated and must
1015                     be performed again, if the helper is used in combination
1016                     with direct packet access.
1017
1018              Return 0 on success, or a negative error in case of failure.
1019
1020       int bpf_xdp_adjust_head(struct xdp_buff *xdp_md, int delta)
1021
1022              Description
1023                     Adjust  (move)  xdp_md->data by delta bytes. Note that it
1024                     is possible to use  a  negative  value  for  delta.  This
1025                     helper  can  be used to prepare the packet for pushing or
1026                     popping headers.
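
The effect of the sign of delta can be illustrated with a toy model of
a frame held in a larger buffer. This is a sketch only: struct
demo_frame and the demo_ functions are hypothetical names, and the
bounds enforced by the kernel are stricter than the ones modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy frame: packet bytes live in [data_off, end_off) of a buffer, and
 * the bytes before data_off are free headroom. */
struct demo_frame {
	size_t data_off;
	size_t end_off;
};

/* Move the start of the packet by delta bytes. A negative delta grows
 * the packet at the front (room to push a header), a positive delta
 * shrinks it (pop a header). */
static int demo_adjust_head(struct demo_frame *f, int delta)
{
	long new_off = (long)f->data_off + delta;

	assert(f->data_off <= f->end_off);
	if (new_off < 0 || (size_t)new_off >= f->end_off)
		return -1;	/* no headroom left, or nothing would remain */
	f->data_off = (size_t)new_off;
	return 0;
}

/* Current packet length in the toy model. */
static size_t demo_frame_len(const struct demo_frame *f)
{
	return f->end_off - f->data_off;
}
```

Starting from a 60-byte packet with 256 bytes of headroom, a delta of
-20 gives an 80-byte packet whose first 20 bytes are left for the
program to fill in.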
1027
1028                     A call to this helper may change the underlying packet
1029                     buffer. Therefore, at load time, all checks on pointers
1030                     previously done by the verifier are invalidated and must
1031                     be performed again, if the helper is used in combination
1032                     with direct packet access.
1033
1034              Return 0 on success, or a negative error in case of failure.
1035
1036       int bpf_probe_read_str(void *dst, int size, const void *unsafe_ptr)
1037
1038              Description
1039                     Copy a NUL  terminated  string  from  an  unsafe  address
1040                     unsafe_ptr  to dst. The size should include the terminat‐
1041                     ing NUL byte. In case the string length is  smaller  than
1042                     size, the target is not padded with further NUL bytes. If
1043                     the string length is larger than size, just size-1  bytes
1044                     are copied and the last byte is set to NUL.
1045
1046                     On  success, the length of the copied string is returned.
1047                     This makes this helper useful  in  tracing  programs  for
1048                     reading  strings,  and more importantly to get its length
1049                     at runtime. See the following snippet:
1050
                        SEC("kprobe/sys_open")
                        void bpf_sys_open(struct pt_regs *ctx)
                        {
                                char buf[PATHLEN]; // PATHLEN is defined to 256
                                int res = bpf_probe_read_str(buf, sizeof(buf),
                                                             (void *)ctx->di);

                                // Consume buf, for example push it to
                                // userspace via bpf_perf_event_output(); we
                                // can use res (the string length) as event
                                // size, after checking its boundaries.
                        }
1061
1062                     In comparison, using bpf_probe_read() helper here instead
1063                     to  read  the string would require to estimate the length
1064                     at compile time, and would often result in  copying  more
1065                     memory than necessary.
1066
1067                     Another  useful  use  case  is  when  parsing  individual
1068                     process arguments  or  individual  environment  variables
1069                     navigating      current->mm->arg_start      and      cur‐
1070                     rent->mm->env_start: using this  helper  and  the  return
1071                     value, one can quickly iterate at the right offset of the
1072                     memory area.
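
The copy semantics and the offset-based iteration can be modelled in
user space as follows. This is a sketch under stated assumptions:
demo_read_str() and demo_count_args() are hypothetical names, the model
reads ordinary memory and so has no fault handling, and it peeks at the
source buffer directly, which an eBPF program could not do.

```c
#include <assert.h>
#include <stddef.h>

/* Model of the copy rules: at most size - 1 bytes are copied, dst is
 * always NUL terminated, and the returned length counts the NUL. */
static int demo_read_str(char *dst, int size, const char *src)
{
	int i;

	assert(dst != NULL && src != NULL && size > 0);
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return i + 1;
}

/* Walk a memory area made of consecutive NUL terminated strings, using
 * the returned length to jump to the start of the next string. */
static int demo_count_args(const char *area, size_t area_len)
{
	char buf[64];
	size_t off = 0;
	int n = 0;

	assert(area_len > 0 && area[area_len - 1] == '\0');
	while (off < area_len) {
		off += (size_t)demo_read_str(buf, (int)sizeof(buf), area + off);
		/* A truncated read stops before the NUL: skip what is left. */
		while (area[off - 1] != '\0')
			off++;
		n++;
	}
	return n;
}
```

For an area holding "ls", "-l" and "/tmp" back to back, the successive
return values are 3, 3 and 5, which are exactly the offsets needed to
step from one argument to the next.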
1073
1074              Return On success, the strictly positive length of  the  string,
1075                     including  the  trailing NUL character. On error, a nega‐
1076                     tive value.
1077
1078       u64 bpf_get_socket_cookie(struct sk_buff *skb)
1079
1080              Description
1081                     If the struct sk_buff pointed to by skb has a known socket,
1082                     retrieve  the  cookie  (generated  by the kernel) of this
1083                     socket.  If no cookie has been set yet,  generate  a  new
1084                     cookie.  Once generated, the socket cookie remains stable
1085                     for the life of the socket. This helper can be useful for
1086                     monitoring per socket networking traffic statistics as it
1087                     provides a unique socket identifier per namespace.
1088
1089              Return An 8-byte long non-decreasing number on success, or 0 if
1090                     the socket field is missing inside skb.
1091
1092       u64 bpf_get_socket_cookie(struct bpf_sock_addr *ctx)
1093
1094              Description
1095                     Equivalent to bpf_get_socket_cookie() helper that accepts
1096                     skb, but gets socket from struct bpf_sock_addr context.
1097
1098              Return An 8-byte long non-decreasing number.
1099
1100       u64 bpf_get_socket_cookie(struct bpf_sock_ops *ctx)
1101
1102              Description
1103                     Equivalent to bpf_get_socket_cookie() helper that accepts
1104                     skb, but gets socket from struct bpf_sock_ops context.
1105
1106              Return An 8-byte long non-decreasing number.
1107
1108       u32 bpf_get_socket_uid(struct sk_buff *skb)
1109
1110              Return The  owner  UID  of  the socket associated to skb. If the
1111                     socket is NULL, or if it is not a full socket (i.e. if it
1112                     is  a time-wait or a request socket instead), overflowuid
1113                     value is returned (note that overflowuid  might  also  be
1114                     the actual UID value for the socket).
1115
1116       u32 bpf_set_hash(struct sk_buff *skb, u32 hash)
1117
1118              Description
1119                     Set  the  full  hash for skb (set the field skb->hash) to
1120                     value hash.
1121
1122              Return 0
1123
1124       int bpf_setsockopt(struct bpf_sock_ops *bpf_socket, int level, int opt‐
1125       name, char *optval, int optlen)
1126
1127              Description
1128                     Emulate  a  call to setsockopt() on the socket associated
1129                     to bpf_socket, which must be a full socket. The level  at
1130                     which  the  option  resides  and  the name optname of the
1131                     option must be  specified,  see  setsockopt(2)  for  more
1132                     information.   The  option  value  of  length  optlen  is
1133                     pointed to by optval.
1134
1135                     This helper actually implements a subset of setsockopt().
1136                     It supports the following levels:
1137
1138                     · SOL_SOCKET,  which  supports  the  following  optnames:
1139                       SO_RCVBUF, SO_SNDBUF, SO_MAX_PACING_RATE,  SO_PRIORITY,
1140                       SO_RCVLOWAT, SO_MARK.
1141
1142                     · IPPROTO_TCP,  which  supports  the  following optnames:
1143                       TCP_CONGESTION, TCP_BPF_IW, TCP_BPF_SNDCWND_CLAMP.
1144
1145                     · IPPROTO_IP, which supports optname IP_TOS.
1146
1147                     · IPPROTO_IPV6, which supports optname IPV6_TCLASS.
1148
1149              Return 0 on success, or a negative error in case of failure.
1150
1151       int bpf_skb_adjust_room(struct sk_buff *skb, s32  len_diff,  u32  mode,
1152       u64 flags)
1153
1154              Description
1155                     Grow or shrink the room for data in the packet associated
1156                     to skb by len_diff, and according to the selected mode.
1157
1158                     There is a single supported mode at this time:
1159
1160                     · BPF_ADJ_ROOM_NET: Adjust  room  at  the  network  layer
1161                       (room  space  is  added  or  removed  below the layer 3
1162                       header).
1163
1164                     All values for flags are reserved for future  usage,  and
1165                     must be left at zero.
1166
1167                     A call to this helper may change the underlying packet
1168                     buffer. Therefore, at load time, all checks on pointers
1169                     previously done by the verifier are invalidated and must
1170                     be performed again, if the helper is used in combination
1171                     with direct packet access.
1172
1173              Return 0 on success, or a negative error in case of failure.
1174
1175       int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
1176
1177              Description
1178                     Redirect  the packet to the endpoint referenced by map at
1179                     index key. Depending on its type, this  map  can  contain
1180                     references to net devices (for forwarding packets through
1181                     other ports), or to CPUs (for redirecting XDP  frames  to
1182                     another  CPU; but this is only implemented for native XDP
1183                     (with driver support) as of this writing).
1184
1185                     All values for flags are reserved for future  usage,  and
1186                     must be left at zero.
1187
1188                     When used to redirect packets to net devices, this helper
1189                     offers a significant performance gain over bpf_redirect().
1190                     This is due to various implementation details of the
1191                     underlying mechanisms, one of which is the fact that
1192                     bpf_redirect_map() tries to send packets in bulk to the
1193                     device.
1194
1195              Return XDP_REDIRECT on success, or XDP_ABORTED on error.
1196
1197       int bpf_sk_redirect_map(struct bpf_map *map, u32 key, u64 flags)
1198
1199              Description
1200                     Redirect the packet to the socket referenced by  map  (of
1201                     type BPF_MAP_TYPE_SOCKMAP) at index key. Both ingress and
1202                     egress  interfaces  can  be  used  for  redirection.  The
1203                     BPF_F_INGRESS value in flags is used to make the distinc‐
1204                     tion (ingress path is selected if the  flag  is  present,
1205                     egress  path  otherwise). This is the only flag supported
1206                     for now.
1207
1208              Return SK_PASS on success, or SK_DROP on error.
1209
1210       int  bpf_sock_map_update(struct  bpf_sock_ops  *skops,  struct  bpf_map
1211       *map, void *key, u64 flags)
1212
1213              Description
1214                     Add an entry to, or update a map referencing sockets. The
1215                     skops is used as a new value for the entry associated  to
1216                     key. flags is one of:
1217
1218                     BPF_NOEXIST
1219                            The entry for key must not exist in the map.
1220
1221                     BPF_EXIST
1222                            The entry for key must already exist in the map.
1223
1224                     BPF_ANY
1225                            No  condition  on  the  existence of the entry for
1226                            key.
1227
1228                     If the map has eBPF programs (parser and verdict),  those
1229                     will  be  inherited  by  the  socket  being added. If the
1230                     socket is already attached to eBPF programs, this results
1231                     in an error.
1232
1233              Return 0 on success, or a negative error in case of failure.
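
                         The three flag values can be read as preconditions on
                         the entry for key. The sketch below is a user-space
                         model of that check, not kernel code. The name
                         check_update_flags() is hypothetical, the constants
                         repeat the values found in <linux/bpf.h>, and the
                         specific error codes returned are an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Same values as in <linux/bpf.h>, repeated here so that the sketch
 * builds without kernel headers. */
#define BPF_ANY     0 /* create a new entry or update an existing one */
#define BPF_NOEXIST 1 /* create only: the entry must not exist */
#define BPF_EXIST   2 /* update only: the entry must already exist */

/* Hypothetical user-space model of the precondition check.
 * Returns 0 if the update may proceed, or a negative errno-style value.
 * The choice of error codes is an assumption of this sketch. */
static int check_update_flags(bool entry_exists, unsigned long long flags)
{
    if (flags > BPF_EXIST)
        return -EINVAL;            /* unknown flag value */
    if (flags == BPF_NOEXIST && entry_exists)
        return -EEXIST;            /* caller required a fresh entry */
    if (flags == BPF_EXIST && !entry_exists)
        return -ENOENT;            /* caller required an existing entry */
    return 0;                      /* BPF_ANY, or precondition satisfied */
}
```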
1234
1235       int bpf_xdp_adjust_meta(struct xdp_buff *xdp_md, int delta)
1236
1237              Description
1238                     Adjust the address pointed to by xdp_md->data_meta by delta
1239                     (which can be positive or negative). Note that this oper‐
1240                     ation modifies the address stored in xdp_md->data, so the
1241                     latter must be loaded only  after  the  helper  has  been
1242                     called.
1243
1244                     The use of xdp_md->data_meta is optional and programs are
1245                     not required to use it. The rationale is  that  when  the
1246                     packet is processed with XDP (e.g. as a DoS filter), it is
1247                     possible to push further meta data along with it before
1248                     passing it to the stack, and to give the guarantee that an
1249                     ingress eBPF program attached as a TC classifier  on  the
1250                     same device can pick this up for further post-processing.
1251                     Since TC works with socket buffers, it  remains  possible
1252                     to  set  from XDP the mark or priority pointers, or other
1253                     pointers for the  socket  buffer.   Having  this  scratch
1254                     space  generic and programmable allows for more flexibil‐
1255                     ity as the user is free to store whatever meta data  they
1256                     need.
1257
1258                     A call to this helper may change the underlying packet
1259                     buffer. Therefore, at load time, all checks on pointers
1260                     previously done by the verifier are invalidated and must
1261                     be performed again, if the helper is used in combina‐
1262                     tion with direct packet access.
1263
1264              Return 0 on success, or a negative error in case of failure.
1265
1266       int  bpf_perf_event_read_value(struct  bpf_map  *map, u64 flags, struct
1267       bpf_perf_event_value *buf, u32 buf_size)
1268
1269              Description
1270                     Read the value of a perf event counter, and store it into
1271                     buf of size buf_size. This helper relies on a map of type
1272                     BPF_MAP_TYPE_PERF_EVENT_ARRAY. The  nature  of  the  perf
1273                     event  counter  is selected when map is updated with perf
1274                     event file descriptors. The map is an array whose size is
1275                     the  number  of  available CPUs, and each cell contains a
1276                     value relative to one CPU. The value to retrieve is indi‐
1277                     cated by flags, which contains the index of the CPU to
1278                     look up,  masked  with  BPF_F_INDEX_MASK.  Alternatively,
1279                     flags  can  be  set to BPF_F_CURRENT_CPU to indicate that
1280                     the value for the current CPU should be retrieved.
1281
1282                     This   helper    behaves    in    a    way    close    to
1283                     bpf_perf_event_read()  helper,  save that instead of just
1284                     returning the value observed, it fills the buf structure.
1285                     This  allows for additional data to be retrieved: in par‐
1286                     ticular, the enabled and running times  (in  buf->enabled
1287                     and  buf->running,  respectively) are copied. In general,
1288                     bpf_perf_event_read_value()    is    recommended     over
1289                     bpf_perf_event_read(), which has some ABI issues and pro‐
1290                     vides fewer functionalities.
1291
1292                     These values are interesting, because hardware PMU (Per‐
1293                     formance Monitoring Unit) counters are limited resources.
1294                     When there are more PMU-based perf events opened than
1295                     available counters, the kernel multiplexes these events
1296                     so that each event gets a certain percentage (but not
1297                     all) of the PMU time. When multiplexing happens, the num‐
1298                     ber of samples or the counter value does not match what
1299                     would be observed without multiplexing. This makes com‐
1300                     parison between different runs difficult. Typically, the
1301                     counter value should be normalized before comparing to
1302                     other experiments. The usual normalization is done as
1303                     follows.
1304
1305                        normalized_counter = counter * t_enabled / t_running
1306
1307                     Where t_enabled is the time enabled for the event and
1308                     t_running is the time running for the event since the last
1309                     normalization. The enabled and running times are accumu‐
1310                     lated since the perf event was opened. To obtain the scal‐
1311                     ing factor between two invocations of an eBPF program,
1312                     users can use the CPU id as the key (which is typical for
1313                     the perf array usage model) to remember the previous
1314                     values and do the calculation inside the eBPF program.
1315
1316              Return 0 on success, or a negative error in case of failure.
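
                         The normalization between two readings can be
                         sketched in plain C as follows. This is a user-space
                         illustration only: struct perf_value is a stand-in
                         that mirrors the counter, enabled and running fields
                         of struct bpf_perf_event_value, and
                         normalized_delta() is a hypothetical name. Integer
                         arithmetic is used because eBPF programs have no
                         floating point; the multiplication can overflow for
                         very large deltas.

```c
#include <stdint.h>

/* Stand-in for struct bpf_perf_event_value (same three fields). */
struct perf_value {
    uint64_t counter;
    uint64_t enabled;
    uint64_t running;
};

/* Apply counter * t_enabled / t_running to the deltas between a
 * previously saved reading and the current one. Returns 0 when the
 * event did not run at all between the two readings. */
static uint64_t normalized_delta(const struct perf_value *prev,
                                 const struct perf_value *cur)
{
    uint64_t d_counter = cur->counter - prev->counter;
    uint64_t d_enabled = cur->enabled - prev->enabled;
    uint64_t d_running = cur->running - prev->running;

    if (d_running == 0)
        return 0; /* avoid dividing by zero */

    /* May overflow 64 bits if both deltas are very large. */
    return d_counter * d_enabled / d_running;
}
```

                         In an eBPF program, the previous reading would
                         typically be kept in a map keyed by CPU id, as
                         described above.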
1317
1318       int bpf_perf_prog_read_value(struct  bpf_perf_event_data  *ctx,  struct
1319       bpf_perf_event_value *buf, u32 buf_size)
1320
1321              Description
1322                     For an eBPF program attached to a perf event, retrieve
1323                     the value of the event counter associated to ctx and
1324                     store it in the structure pointed to by buf and of size
1325                     buf_size. Enabled and running times are  also  stored  in
1326                     the     structure     (see    description    of    helper
1327                     bpf_perf_event_read_value() for more details).
1328
1329              Return 0 on success, or a negative error in case of failure.
1330
1331       int bpf_getsockopt(struct bpf_sock_ops *bpf_socket, int level, int opt‐
1332       name, char *optval, int optlen)
1333
1334              Description
1335                     Emulate  a  call to getsockopt() on the socket associated
1336                     to bpf_socket, which must be a full socket. The level  at
1337                     which  the  option  resides  and  the name optname of the
1338                     option must be  specified,  see  getsockopt(2)  for  more
1339                     information. The retrieved value is stored in the struc‐
1340                     ture pointed to by optval and of length optlen.
1341
1342                     This helper actually implements a subset of getsockopt().
1343                     It supports the following levels:
1344
1345                     · IPPROTO_TCP, which supports optname TCP_CONGESTION.
1346
1347                     · IPPROTO_IP, which supports optname IP_TOS.
1348
1349                     · IPPROTO_IPV6, which supports optname IPV6_TCLASS.
1350
1351              Return 0 on success, or a negative error in case of failure.
1352
1353       int bpf_override_return(struct pt_regs *regs, u64 rc)
1354
1355              Description
1356                     Used  for  error  injection,  this helper uses kprobes to
1357                     override the return value of the probed function, and  to
1358                     set  it to rc.  The first argument is the context regs on
1359                     which the kprobe works.
1360
1361                     This helper works by setting the PC (program
1362                     counter) to an override function which is run in place of
1363                     the original probed function. This means the probed func‐
1364                     tion  is  not  run  at all. The replacement function just
1365                     returns with the required value.
1366
1367                     This helper has security implications, and thus  is  sub‐
1368                     ject  to restrictions. It is only available if the kernel
1369                     was compiled with the CONFIG_BPF_KPROBE_OVERRIDE configu‐
1370                     ration  option,  and  in this case it only works on func‐
1371                     tions tagged with  ALLOW_ERROR_INJECTION  in  the  kernel
1372                     code.
1373
1374                     Also,  the helper is only available for the architectures
1375                     having the CONFIG_FUNCTION_ERROR_INJECTION option. As  of
1376                     this writing, x86 architecture is the only one to support
1377                     this feature.
1378
1379              Return 0
1380
1381       int  bpf_sock_ops_cb_flags_set(struct   bpf_sock_ops   *bpf_sock,   int
1382       argval)
1383
1384              Description
1385                     Attempt to set the value of the bpf_sock_ops_cb_flags
1386                     field for the full TCP socket associated to bpf_sock
1387                     to argval.
1388
1389                     The  primary  use  of this field is to determine if there
1390                     should   be   calls   to   eBPF    programs    of    type
1391                     BPF_PROG_TYPE_SOCK_OPS at various points in the TCP code.
1392                     A program of the same type can change its value, per con‐
1393                     nection  and  as necessary, when the connection is estab‐
1394                     lished. This field is directly  accessible  for  reading,
1395                     but  this  helper  must  be  used for updates in order to
1396                     return an error if an eBPF program tries to set  a  call‐
1397                     back that is not supported in the current kernel.
1398
1399                     The  supported  callback  values  that argval can combine
1400                     are:
1401
1402                     · BPF_SOCK_OPS_RTO_CB_FLAG (retransmission time out)
1403
1404                     · BPF_SOCK_OPS_RETRANS_CB_FLAG (retransmission)
1405
1406                     · BPF_SOCK_OPS_STATE_CB_FLAG (TCP state change)
1407
1408                     Here are some examples of where one could call such an
1409                     eBPF program:
1410
1411                     · When RTO fires.
1412
1413                     · When a packet is retransmitted.
1414
1415                     · When the connection terminates.
1416
1417                     · When a packet is sent.
1418
1419                     · When a packet is received.
1420
1421              Return Code -EINVAL if the socket is not a full TCP socket; oth‐
1422                     erwise, a positive number containing the bits that  could
1423                     not be set is returned (which comes down to 0 if all bits
1424                     were set as required).
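
                         The non-negative return value can be understood as a
                         bit mask of what was rejected. The following is a
                         user-space model of that arithmetic, not the kernel
                         implementation: unsupported_cb_flags() and its
                         supported parameter are hypothetical, and the flag
                         constants repeat the values from <linux/bpf.h>. The
                         -EINVAL case for non-TCP sockets is not modelled.

```c
/* Same values as in <linux/bpf.h>, repeated so that the sketch builds
 * without kernel headers. */
#define BPF_SOCK_OPS_RTO_CB_FLAG     (1 << 0)
#define BPF_SOCK_OPS_RETRANS_CB_FLAG (1 << 1)
#define BPF_SOCK_OPS_STATE_CB_FLAG   (1 << 2)

/* Hypothetical model: given the callbacks requested in argval and the
 * mask of callbacks the running kernel supports, return the requested
 * bits that could not be set. A result of 0 means every requested
 * callback was accepted. */
static int unsupported_cb_flags(int argval, int supported)
{
    return argval & ~supported;
}
```

                         A program would usually compare the result with
                         zero and fall back to a reduced set of callbacks
                         when some bits are reported back.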
1425
1426       int bpf_msg_redirect_map(struct sk_msg_buff *msg, struct bpf_map  *map,
1427       u32 key, u64 flags)
1428
1429              Description
1430                     This  helper is used in programs implementing policies at
1431                     the socket level. If the message msg is allowed  to  pass
1432                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1433                     rect  it  to  the  socket  referenced  by  map  (of  type
1434                     BPF_MAP_TYPE_SOCKMAP)  at  index  key.  Both  ingress and
1435                     egress  interfaces  can  be  used  for  redirection.  The
1436                     BPF_F_INGRESS value in flags is used to make the distinc‐
1437                     tion (ingress path is selected if the  flag  is  present,
1438                     egress  path  otherwise). This is the only flag supported
1439                     for now.
1440
1441              Return SK_PASS on success, or SK_DROP on error.
1442
1443       int bpf_msg_apply_bytes(struct sk_msg_buff *msg, u32 bytes)
1444
1445              Description
1446                     For socket policies, apply the verdict of the  eBPF  pro‐
1447                     gram to the next bytes (number of bytes) of message msg.
1448
1449                     For  example,  this  helper  can be used in the following
1450                     cases:
1451
1452                     · A single sendmsg() or sendfile() system  call  contains
1453                       multiple logical messages that the eBPF program is sup‐
1454                       posed to read and for which it should apply a verdict.
1455
1456                     · An eBPF program only cares to read the first bytes of a
1457                       msg.  If  the message has a large payload, then setting
1458                       up and calling the  eBPF  program  repeatedly  for  all
1459                       bytes,  even though the verdict is already known, would
1460                       create unnecessary overhead.
1461
1462                     When called from within an eBPF program, the helper  sets
1463                     a  counter  internal  to  the BPF infrastructure, that is
1464                     used to apply the last verdict  to  the  next  bytes.  If
1465                     bytes  is  smaller  than the current data being processed
1466                     from a sendmsg() or sendfile()  system  call,  the  first
1467                     bytes  will  be  sent and the eBPF program will be re-run
1468                     with the pointer for start of data pointing to byte  num‐
1469                     ber  bytes  + 1. If bytes is larger than the current data
1470                     being processed, then the eBPF verdict will be applied to
1471                     multiple  sendmsg()  or  sendfile() calls until bytes are
1472                     consumed.
1473
1474                     Note that if a socket closes with  the  internal  counter
1475                     holding  a  non-zero value, this is not a problem because
1476                     data is not being buffered for bytes and is sent as it is
1477                     received.
1478
1479              Return 0
1480
1481       int bpf_msg_cork_bytes(struct sk_msg_buff *msg, u32 bytes)
1482
1483              Description
1484                     For socket policies, prevent the execution of the verdict
1485                     eBPF program for message msg until  bytes  (byte  number)
1486                     have been accumulated.
1487
1488                     This  can  be  used  when  one needs a specific number of
1489                     bytes before a verdict can be assigned, even if the  data
1490                     spans multiple sendmsg() or sendfile() calls. The extreme
1491                     case would be a user calling  sendmsg()  repeatedly  with
1492                     1-byte  long message segments. Obviously, this is bad for
1493                     performance, but it is still valid. If the  eBPF  program
1494                     needs  bytes  bytes to validate a header, this helper can
1495                     be used to prevent the eBPF program from being called
1496                     again until bytes have been accumulated.
1497
1498              Return 0
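
                         The accumulation described above can be pictured
                         with a small user-space model. It is not kernel
                         code and only follows the wording of this page:
                         segments_until_verdict() is a hypothetical function
                         that, given the sizes of successive sendmsg() or
                         sendfile() segments, reports after how many of them
                         the requested number of bytes is available.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of corking: sum the segment lengths in order and
 * return the 1-based count of segments needed to reach at least
 * `bytes`. Returns 0 if all segments together stay below `bytes`. */
static size_t segments_until_verdict(const uint32_t *seg_len, size_t nsegs,
                                     uint32_t bytes)
{
    uint64_t accumulated = 0;

    for (size_t i = 0; i < nsegs; i++) {
        accumulated += seg_len[i];
        if (accumulated >= bytes)
            return i + 1; /* the verdict program can run now */
    }
    return 0; /* still corked after the last segment */
}
```

                         With 1-byte segments and bytes set to 4, the model
                         reports that the verdict program would run after
                         the fourth segment.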
1499
1500       int  bpf_msg_pull_data(struct sk_msg_buff *msg, u32 start, u32 end, u64
1501       flags)
1502
1503              Description
1504                     For socket policies, pull in non-linear  data  from  user
1505                     space   for   msg   and   set   pointers   msg->data  and
1506                     msg->data_end to start and end byte offsets into msg,
1507                     respectively.
1508
1509                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg
1510                     it can only parse data that the (data, data_end) pointers
1511                     have already consumed. For sendmsg() hooks this is likely
1512                     the first scatterlist element. But for calls  relying  on
1513                     the  sendpage  handler (e.g. sendfile()) this will be the
1514                     range (0, 0) because the data is shared with  user  space
1515                     and  by  default  the objective is to avoid allowing user
1516                     space to modify data while (or  after)  eBPF  verdict  is
1517                     being  decided.  This  helper can be used to pull in data
1518                     and to set the start and end  pointer  to  given  values.
1519                     Data  will  be  copied if necessary (i.e. if data was not
1520                     linear and if start and end pointers do not point to  the
1521                     same chunk).
1522
1523                     A call to this helper may change the underlying packet
1524                     buffer. Therefore, at load time, all checks on pointers
1525                     previously done by the verifier are invalidated and must
1526                     be performed again, if the helper is used in combina‐
1527                     tion with direct packet access.
1528
1529                     All  values  for flags are reserved for future usage, and
1530                     must be left at zero.
1531
1532              Return 0 on success, or a negative error in case of failure.
1533
1534       int bpf_bind(struct bpf_sock_addr  *ctx,  struct  sockaddr  *addr,  int
1535       addr_len)
1536
1537              Description
1538                     Bind the socket associated to ctx to the address pointed
1539                     to by addr, of length addr_len. This allows for making
1540                     outgoing connections from the desired IP address, which
1541                     can be useful for example when all processes inside a
1542                     cgroup should use one single IP address on a host that
1543                     has multiple IP addresses configured.
1544
1545                     This helper works for IPv4 and IPv6, TCP and UDP sockets.
1546                     The   domain   (addr->sa_family)   must  be  AF_INET  (or
1547                     AF_INET6). Looking for a free port  to  bind  to  can  be
1548                     expensive, therefore binding to a port is not permitted
1549                     by the helper: addr->sin_port (or sin6_port, respectively)
1550                     must be set to zero.
1551
1552              Return 0 on success, or a negative error in case of failure.
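
                         The address constraints (AF_INET family, port left
                         at zero) can be illustrated by the way the sockaddr
                         is filled in. The sketch below is ordinary
                         user-space C, and make_bind_addr4() is a
                         hypothetical name. inet_pton() is used here for
                         readability only; it is not available inside an
                         eBPF program, where the address would be written as
                         a constant instead.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Fill `out` with an IPv4 address of the shape described above:
 * family AF_INET, the chosen source IP, and port zero.
 * Returns 0 on success, -1 if `ip` is not a valid dotted-quad string. */
static int make_bind_addr4(const char *ip, struct sockaddr_in *out)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = 0; /* binding to a specific port is not permitted */

    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```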
1553
1554       int bpf_xdp_adjust_tail(struct xdp_buff *xdp_md, int delta)
1555
1556              Description
1557                     Adjust (move) xdp_md->data_end by delta bytes. It is only
1558                     possible to shrink the packet as of this writing,  there‐
1559                     fore delta must be a negative integer.
1560
1561                     A call to this helper may change the underlying packet
1562                     buffer. Therefore, at load time, all checks on pointers
1563                     previously done by the verifier are invalidated and must
1564                     be performed again, if the helper is used in combina‐
1565                     tion with direct packet access.
1566
1567              Return 0 on success, or a negative error in case of failure.
1568
1569       int  bpf_skb_get_xfrm_state(struct  sk_buff  *skb,  u32  index,  struct
1570       bpf_xfrm_state *xfrm_state, u32 size, u64 flags)
1571
1572              Description
1573                     Retrieve the XFRM state (IP transform framework, see also
1574                     ip-xfrm(8)) at index in XFRM "security path" for skb.
1575
1576                     The   retrieved   value   is   stored   in   the   struct
1577                     bpf_xfrm_state pointed by xfrm_state and of length size.
1578
1579                     All values for flags are reserved for future  usage,  and
1580                     must be left at zero.
1581
1582                     This  helper is available only if the kernel was compiled
1583                     with CONFIG_XFRM configuration option.
1584
1585              Return 0 on success, or a negative error in case of failure.
1586
1587       int bpf_get_stack(struct pt_regs *regs, void *buf, u32 size, u64 flags)
1588
1589              Description
1590                     Return a user or a kernel stack in a buffer provided by
1591                     the bpf program. To achieve this, the helper needs regs,
1592                     which is a pointer to the context on which the tracing
1593                     program is executed. To store the stacktrace, the bpf
1594                     program provides buf with a nonnegative size.
1595
1596                     The last argument,  flags,  holds  the  number  of  stack
1597                     frames   to   skip   (from   0   to   255),  masked  with
1598                     BPF_F_SKIP_FIELD_MASK. The next bits can be used  to  set
1599                     the following flags:
1600
1601                     BPF_F_USER_STACK
1602                            Collect  a  user  space  stack instead of a kernel
1603                            stack.
1604
1605                     BPF_F_USER_BUILD_ID
1606                            Collect buildid+offset instead  of  ips  for  user
1607                            stack,  only  valid  if  BPF_F_USER_STACK  is also
1608                            specified.
1609
1610                     bpf_get_stack() can collect up to PERF_MAX_STACK_DEPTH
1611                     frames for both kernel and user stacks, provided that the
1612                     buffer is large enough. Note that this limit can be con‐
1613                     trolled with the sysctl program, and that it should be
1614                     manually increased in order to profile long user stacks
1615                     (such as stacks for Java programs). To do so, use:
1616
1617                        # sysctl kernel.perf_event_max_stack=<new value>
1618
1619              Return A  non-negative  value equal to or less than size on suc‐
1620                     cess, or a negative error in case of failure.
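
                         The layout of flags (skip count in the low bits,
                         option bits above it) can be sketched as below.
                         This is a user-space illustration: stack_flags() is
                         a hypothetical helper, and the constants repeat the
                         values from <linux/bpf.h> so that the sketch builds
                         without kernel headers.

```c
#include <stdint.h>

/* Same values as in <linux/bpf.h>. */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_USER_BUILD_ID   (1ULL << 11)

/* Build a flags value: the number of frames to skip is masked into the
 * low 8 bits, then the option bits are OR-ed in. The build ID option is
 * only applied together with the user stack option, since it is not
 * valid on its own. */
static uint64_t stack_flags(unsigned int skip, int user_stack, int build_id)
{
    uint64_t flags = (uint64_t)skip & BPF_F_SKIP_FIELD_MASK;

    if (user_stack) {
        flags |= BPF_F_USER_STACK;
        if (build_id)
            flags |= BPF_F_USER_BUILD_ID;
    }
    return flags;
}
```

                         For instance, skipping three frames of a user stack
                         with build IDs would give stack_flags(3, 1, 1).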
1621
1622       int bpf_skb_load_bytes_relative(const struct sk_buff *skb, u32  offset,
1623       void *to, u32 len, u32 start_header)
1624
1625              Description
1626                     This helper is similar to bpf_skb_load_bytes() in that it
1627                     provides an easy way to load len bytes from  offset  from
1628                     the  packet associated to skb, into the buffer pointed by
1629                     to. The difference  to  bpf_skb_load_bytes()  is  that  a
1630                     fifth  argument  start_header exists in order to select a
1631                     base offset to start from. start_header can be one of:
1632
1633                     BPF_HDR_START_MAC
1634                            Base offset to load data from is skb's mac header.
1635
1636                     BPF_HDR_START_NET
1637                            Base offset to load data  from  is  skb's  network
1638                            header.
1639
1640                     In general, "direct packet access" is the preferred
1641                     method to access packet data. However, this helper is
1642                     particularly useful in socket filters where skb->data does
1643                     not always point to the start of the mac header and where
1644                     "direct packet access" is not available.
1645
1646              Return 0 on success, or a negative error in case of failure.
1647
1648       int  bpf_fib_lookup(void *ctx, struct bpf_fib_lookup *params, int plen,
1649       u32 flags)
1650
1651              Description
1652                     Do FIB  lookup  in  kernel  tables  using  parameters  in
1653                     params.   If lookup is successful and result shows packet
1654                     is to be forwarded, the neighbor tables are searched  for
1655                     the nexthop. If successful (i.e., FIB lookup shows for‐
1656                     warding and nexthop is resolved), the nexthop address is
1657                     returned in ipv4_dst or ipv6_dst based on family, smac is
1658                     set to mac address of egress device, dmac is set to  nex‐
1659                     thop  mac  address, rt_metric is set to metric from route
1660                     (IPv4/IPv6 only), and ifindex is set to the device  index
1661                     of the nexthop from the FIB lookup.
1662
1663                     plen argument is the size of the passed in struct.  flags
1664                     argument can be a combination of one or more of the  fol‐
1665                     lowing values:
1666
1667                     BPF_FIB_LOOKUP_DIRECT
1668                            Do a direct table lookup instead of a full lookup
1669                            using FIB rules.
1670
1671                     BPF_FIB_LOOKUP_OUTPUT
1672                            Perform lookup from an egress perspective (default
1673                            is ingress).
1674
1675                     ctx is either struct xdp_md for XDP programs or struct
1676                     sk_buff for tc cls_act programs.
1677
1678              Return
1679
1680                     · < 0 if any input argument is invalid
1681
1682                     · 0 on success (packet  is  forwarded,  nexthop  neighbor
1683                       exists)
1684
1685                     · > 0 one of the BPF_FIB_LKUP_RET_ codes explaining why
1686                       the packet is not forwarded or needs help from the stack
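
                         The three ranges of the return value lend themselves
                         to a simple three-way test. The sketch below is
                         plain C with hypothetical names (enum fib_action,
                         classify_fib_result()); it does not call the helper
                         and only shows how a caller might branch on the
                         value.

```c
/* Hypothetical classification of the value returned by the helper. */
enum fib_action {
    FIB_INVALID_ARGS,  /* rc < 0: an input argument was invalid */
    FIB_FORWARD,       /* rc == 0: forward, nexthop neighbor exists */
    FIB_NOT_FORWARDED  /* rc > 0: one of the BPF_FIB_LKUP_RET_ codes */
};

/* Map the raw return code onto the three documented cases. */
static enum fib_action classify_fib_result(int rc)
{
    if (rc < 0)
        return FIB_INVALID_ARGS;
    if (rc == 0)
        return FIB_FORWARD;
    return FIB_NOT_FORWARDED;
}
```

                         Only in the FIB_FORWARD case are the output fields
                         of params (smac, dmac, ifindex) meant to be used
                         for forwarding.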
1687
1688       int  bpf_sock_hash_update(struct   bpf_sock_ops_kern   *skops,   struct
1689       bpf_map *map, void *key, u64 flags)
1690
1691              Description
1692                     Add  an  entry  to,  or update a sockhash map referencing
1693                     sockets.  The skops is used as a new value for the  entry
1694                     associated to key. flags is one of:
1695
1696                     BPF_NOEXIST
1697                            The entry for key must not exist in the map.
1698
1699                     BPF_EXIST
1700                            The entry for key must already exist in the map.
1701
1702                     BPF_ANY
1703                            No  condition  on  the  existence of the entry for
1704                            key.
1705
1706                     If the map has eBPF programs (parser and verdict),  those
1707                     will  be  inherited  by  the  socket  being added. If the
1708                     socket is already attached to eBPF programs, this results
1709                     in an error.
1710
1711              Return 0 on success, or a negative error in case of failure.
1712
1713       int bpf_msg_redirect_hash(struct sk_msg_buff *msg, struct bpf_map *map,
1714       void *key, u64 flags)
1715
1716              Description
1717                     This helper is used in programs implementing policies  at
1718                     the  socket  level. If the message msg is allowed to pass
1719                     (i.e. if the verdict eBPF program returns SK_PASS), redi‐
1720                     rect  it  to  the  socket  referenced  by  map  (of  type
1721                     BPF_MAP_TYPE_SOCKHASH) using hash key. Both  ingress  and
1722                     egress  interfaces  can  be  used  for  redirection.  The
1723                     BPF_F_INGRESS value in flags is used to make the distinc‐
1724                     tion  (ingress  path  is selected if the flag is present,
1725                     egress path otherwise). This is the only  flag  supported
1726                     for now.
1727
1728              Return SK_PASS on success, or SK_DROP on error.
1729
1730       int bpf_sk_redirect_hash(struct sk_buff *skb, struct bpf_map *map, void
1731       *key, u64 flags)
1732
1733              Description
1734                     This helper is used in programs implementing policies  at
1735                     the  skb  socket  level. If the sk_buff skb is allowed to
1736                     pass  (i.e.  if  the  verdict  eBPF   program   returns
1737                     SK_PASS), redirect it to the socket referenced by map (of
1738                     type BPF_MAP_TYPE_SOCKHASH) using hash key. Both  ingress
1739                     and  egress  interfaces  can be used for redirection. The
1740                     BPF_F_INGRESS value in flags is used to make the distinc‐
1741                     tion  (ingress  path  is selected if the flag is present,
1742                     egress path otherwise). This is the only flag supported
1743                     for now.
1744
1745              Return SK_PASS on success, or SK_DROP on error.
1746
1747       int  bpf_lwt_push_encap(struct  sk_buff  *skb, u32 type, void *hdr, u32
1748       len)
1749
1750              Description
1751                     Encapsulate the packet associated to skb within a Layer 3
1752                     protocol header. This header is provided in the buffer at
1753                     address hdr, with len its size in bytes.  type  indicates
1754                     the protocol of the header and can be one of:
1755
1756                     BPF_LWT_ENCAP_SEG6
1757                            IPv6  encapsulation  with  Segment  Routing Header
1758                            (struct ipv6_sr_hdr). hdr only contains  the  SRH,
1759                            the IPv6 header is computed by the kernel.
1760
1761                     BPF_LWT_ENCAP_SEG6_INLINE
1762                            Only  works if skb contains an IPv6 packet. Insert
1763                            a  Segment  Routing  Header  (struct  ipv6_sr_hdr)
1764                            inside the IPv6 header.
1765
1766                     A call to this helper may change the underlying packet
1767                     buffer. Therefore, at load time, all checks on pointers
1768                     previously done by the verifier are invalidated and must
1769                     be performed again if the helper is used in combination
1770                     with direct packet access.
1771
1772              Return 0 on success, or a negative error in case of failure.
1773
1774       int  bpf_lwt_seg6_store_bytes(struct  sk_buff  *skb,  u32 offset, const
1775       void *from, u32 len)
1776
1777              Description
1778                     Store len bytes from address from into the packet associ‐
1779                     ated  to  skb,  at  offset.  Only the flags, tag and TLVs
1780                     inside the outermost IPv6 Segment Routing Header  can  be
1781                     modified through this helper.
1782
1783                     A call to this helper may change the underlying packet
1784                     buffer. Therefore, at load time, all checks on pointers
1785                     previously done by the verifier are invalidated and must
1786                     be performed again if the helper is used in combination
1787                     with direct packet access.
1788
1789              Return 0 on success, or a negative error in case of failure.
1790
1791       int bpf_lwt_seg6_adjust_srh(struct sk_buff *skb, u32 offset, s32 delta)
1792
1793              Description
1794                     Adjust  the  size allocated to TLVs in the outermost IPv6
1795                     Segment Routing Header contained in the packet associated
1796                     to  skb,  at position offset by delta bytes. Only offsets
1797                     after the segments are accepted. delta can be either
1798                     positive (growing) or negative (shrinking).
1799
1800                     A call to this helper may change the underlying packet
1801                     buffer. Therefore, at load time, all checks on pointers
1802                     previously done by the verifier are invalidated and must
1803                     be performed again if the helper is used in combination
1804                     with direct packet access.
1805
1806              Return 0 on success, or a negative error in case of failure.
1807
1808       int  bpf_lwt_seg6_action(struct  sk_buff *skb, u32 action, void *param,
1809       u32 param_len)
1810
1811              Description
1812                     Apply an IPv6 Segment Routing action of  type  action  to
1813                     the packet associated to skb. Each action takes a parame‐
1814                     ter contained at address param, and of  length  param_len
1815                     bytes.  action can be one of:
1816
1817                     SEG6_LOCAL_ACTION_END_X
1818                            End.X action: Endpoint with Layer-3 cross-connect.
1819                            Type of param: struct in6_addr.
1820
1821                     SEG6_LOCAL_ACTION_END_T
1822                            End.T action: Endpoint with  specific  IPv6  table
1823                            lookup.  Type of param: int.
1824
1825                     SEG6_LOCAL_ACTION_END_B6
1826                            End.B6  action:  Endpoint bound to an SRv6 policy.
1827                            Type of param: struct ipv6_sr_hdr.
1828
1829                     SEG6_LOCAL_ACTION_END_B6_ENCAP
1830                            End.B6.Encap action: Endpoint  bound  to  an  SRv6
1831                            encapsulation   policy.   Type  of  param:  struct
1832                            ipv6_sr_hdr.
1833
1834                     A call to this helper may change the underlying packet
1835                     buffer. Therefore, at load time, all checks on pointers
1836                     previously done by the verifier are invalidated and must
1837                     be performed again if the helper is used in combination
1838                     with direct packet access.
1839
1840              Return 0 on success, or a negative error in case of failure.
1841
1842       int bpf_rc_keydown(void *ctx, u32 protocol, u64 scancode, u32 toggle)
1843
1844              Description
1845                     This helper is used in programs implementing IR decoding,
1846                     to report a successfully decoded key press with scancode,
1847                     toggle value in the given protocol. The scancode will  be
1848                     translated to a keycode using the rc keymap, and reported
1849                     as an input key down event. After a period a key up event
1850                     is  generated.  This  period  can  be extended by calling
1851                     either bpf_rc_keydown() again with the  same  values,  or
1852                     calling bpf_rc_repeat().
1853
1854                     Some  protocols  include a toggle bit, in case the button
1855                     was released and pressed again between consecutive  scan‐
1856                     codes.
1857
1858                     The  ctx  should  point to the lirc sample as passed into
1859                     the program.
1860
1861                     The protocol is the decoded  protocol  number  (see  enum
1862                     rc_proto for some predefined values).
1863
1864                     This helper is only available if the kernel was compiled
1865                     with the CONFIG_BPF_LIRC_MODE2 configuration  option  set
1866                     to "y".
1867
1868              Return 0
1869
1870       int bpf_rc_repeat(void *ctx)
1871
1872              Description
1873                     This helper is used in programs implementing IR decoding,
1874                     to report a successfully decoded repeat key message. This
1875                     delays the generation of a key up event for the
1876                     previously generated key down event.
1877
1878                     Some IR protocols, like NEC, have a special IR message
1879                     for repeating the last button when it is held down.
1880
1881                     The  ctx  should  point to the lirc sample as passed into
1882                     the program.
1883
1884                     This helper is only available if the kernel was compiled
1885                     with  the  CONFIG_BPF_LIRC_MODE2 configuration option set
1886                     to "y".
1887
1888              Return 0
1889
1890       u64 bpf_skb_cgroup_id(struct sk_buff *skb)
1891
1892              Description
1893                     Return the cgroup v2 id of the socket associated with the
1894                     skb. This is roughly similar to bpf_get_cgroup_classid()
1895                     for cgroup v1: it provides a tag or identifier that can be
1896                     matched on or used for map lookups, e.g. to implement
1897                     policy. The cgroup v2 id of a given path in the hierarchy
1898                     is exposed in user space through the f_handle API in order
1899                     to get to the same 64-bit id.
1900
1901                     This helper can be used on TC egress  path,  but  not  on
1902                     ingress, and is available only if the kernel was compiled
1903                     with the CONFIG_SOCK_CGROUP_DATA configuration option.
1904
1905              Return The id is returned or 0 in  case  the  id  could  not  be
1906                     retrieved.
1907
1908       u64 bpf_skb_ancestor_cgroup_id(struct sk_buff *skb, int ancestor_level)
1909
1910              Description
1911                     Return the id of the cgroup v2 that is the ancestor, at
1912                     ancestor_level, of the cgroup associated with the skb. The
1913                     root cgroup is at ancestor_level zero and each step down
1914                     the hierarchy increments the level. If ancestor_level ==
1915                     level of the cgroup associated with skb, then the return
1916                     value will be the same as that of bpf_skb_cgroup_id().
1917
1918                     The helper is useful to implement policies based on
1919                     cgroups that are higher in the hierarchy than the
1920                     immediate cgroup associated with skb.
1921
1922                     The format of the returned id and the helper limitations
1923                     are the same as in bpf_skb_cgroup_id().
1924
1925              Return The  id  is  returned  or  0  in case the id could not be
1926                     retrieved.
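
                         The level numbering used by this helper can be
                         illustrated with a small user-space model. This is
                         only a sketch: struct cgroup_path and ancestor_id()
                         are hypothetical names, and the array of ids stands in
                         for the kernel's cgroup hierarchy.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of one cgroup's position in the hierarchy:
 * ids[0] is the root cgroup and ids[level] is the cgroup itself.
 */
struct cgroup_path {
    const uint64_t *ids;
    int level;
};

/* Return the id of the ancestor at ancestor_level, or 0 if there is none. */
static uint64_t ancestor_id(const struct cgroup_path *cg, int ancestor_level)
{
    if (cg == NULL || ancestor_level < 0 || ancestor_level > cg->level)
        return 0;
    return cg->ids[ancestor_level];
}
```

                         With a cgroup at level 2, ancestor_level 0 yields the
                         root id, ancestor_level 2 yields the id of the cgroup
                         itself, and any deeper level yields 0.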
1927
1928       u64 bpf_get_current_cgroup_id(void)
1929
1930              Return A 64-bit integer containing the current cgroup  id  based
1931                     on the cgroup within which the current task is running.
1932
1933       void *bpf_get_local_storage(void *map, u64 flags)
1934
1935              Description
1936                     Get the pointer to the local storage area. The type and
1937                     the size of the local storage are defined by the map
1938                     argument. The meaning of flags is specific to each map
1939                     type, and flags has to be 0 for cgroup local storage.
1940
1941                     Depending on the BPF program type, a local  storage  area
1942                     can  be shared between multiple instances of the BPF pro‐
1943                     gram, running simultaneously.
1944
1945                     The user is responsible for synchronization, for example
1946                     by using the BPF_STX_XADD instruction to alter the shared
1947                     data.
1948
1949              Return A pointer to the local storage area.
1950
1951       int  bpf_sk_select_reuseport(struct  sk_reuseport_md   *reuse,   struct
1952       bpf_map *map, void *key, u64 flags)
1953
1954              Description
1955                     Select a SO_REUSEPORT socket from a BPF_MAP_TYPE_REUSE‐
1956                     PORT_ARRAY map. It checks that the selected socket matches
1957                     the incoming request in the socket buffer.
1958
1959              Return 0 on success, or a negative error in case of failure.
1960
1961       struct  bpf_sock  *bpf_sk_lookup_tcp(void  *ctx,  struct bpf_sock_tuple
1962       *tuple, u32 tuple_size, u64 netns, u64 flags)
1963
1964              Description
1965                     Look for a TCP socket matching tuple, optionally in a child
1966                     network   namespace  netns.  The  return  value  must  be
1967                     checked, and if non-NULL, released via bpf_sk_release().
1968
1969                     The ctx should point to the context of the program,  such
1970                     as the skb or socket (depending on the hook in use). This
1971                     is used to determine the base network namespace  for  the
1972                     lookup.
1973
1974                     tuple_size must be one of:
1975
1976                     sizeof(tuple->ipv4)
1977                            Look for an IPv4 socket.
1978
1979                     sizeof(tuple->ipv6)
1980                            Look for an IPv6 socket.
1981
1982                     If the netns is a negative signed 32-bit integer, then
1983                     the socket lookup table in the netns associated with the
1984                     ctx will be used. For the TC hooks, this is the netns of
1985                     the device in the skb. For socket hooks, this is the
1986                     netns of the socket. If netns is any other signed 32-bit
1987                     value greater than or equal to zero, then it specifies the
1988                     ID of the netns relative to the netns associated with the
1989                     ctx. netns values beyond the range of 32-bit integers are
1990                     reserved for future use.
1991
1992                     All  values  for flags are reserved for future usage, and
1993                     must be left at zero.
1994
1995                     This helper is available only if the kernel was compiled
1996                     with the CONFIG_NET configuration option.
1997
1998              Return Pointer  to  struct bpf_sock, or NULL in case of failure.
1999                     For sockets with reuseport option,  the  struct  bpf_sock
2000                     result  is  from  reuse->socks[]  using  the  hash of the
2001                     tuple.
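
                         The three ranges of the netns argument described
                         above can be written out as a small classifier. This
                         is a user-space sketch of a literal reading of that
                         paragraph, not the kernel's own check; the enum and
                         function names are hypothetical.

```c
#include <stdint.h>

enum netns_kind {
    NETNS_CURRENT,  /* use the netns associated with ctx */
    NETNS_BY_ID,    /* netns is an ID relative to the netns of ctx */
    NETNS_RESERVED  /* outside the 32-bit range: reserved for future use */
};

/* Classify a 64-bit netns argument according to the documented ranges. */
static enum netns_kind classify_netns(uint64_t netns)
{
    /* Assumes the usual two's complement conversion to a signed value. */
    int64_t v = (int64_t)netns;

    if (v >= INT32_MIN && v < 0)
        return NETNS_CURRENT;
    if (v >= 0 && v <= INT32_MAX)
        return NETNS_BY_ID;
    return NETNS_RESERVED;
}
```

                         Under this reading, passing (uint64_t)-1 requests the
                         netns of ctx, while 0 and other small non-negative
                         values are treated as netns IDs.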
2002
2003       struct bpf_sock  *bpf_sk_lookup_udp(void  *ctx,  struct  bpf_sock_tuple
2004       *tuple, u32 tuple_size, u64 netns, u64 flags)
2005
2006              Description
2007                     Look for a UDP socket matching tuple, optionally in a child
2008                     network  namespace  netns.  The  return  value  must   be
2009                     checked, and if non-NULL, released via bpf_sk_release().
2010
2011                     The  ctx should point to the context of the program, such
2012                     as the skb or socket (depending on the hook in use). This
2013                     is  used  to determine the base network namespace for the
2014                     lookup.
2015
2016                     tuple_size must be one of:
2017
2018                     sizeof(tuple->ipv4)
2019                            Look for an IPv4 socket.
2020
2021                     sizeof(tuple->ipv6)
2022                            Look for an IPv6 socket.
2023
2024                     If the netns is a negative signed 32-bit integer, then
2025                     the socket lookup table in the netns associated with the
2026                     ctx will be used. For the TC hooks, this is the netns of
2027                     the device in the skb. For socket hooks, this is the
2028                     netns of the socket. If netns is any other signed 32-bit
2029                     value greater than or equal to zero, then it specifies the
2030                     ID of the netns relative to the netns associated with the
2031                     ctx. netns values beyond the range of 32-bit integers are
2032                     reserved for future use.
2033
2034                     All values for flags are reserved for future  usage,  and
2035                     must be left at zero.
2036
2037                     This helper is available only if the kernel was compiled
2038                     with the CONFIG_NET configuration option.
2039
2040              Return Pointer to struct bpf_sock, or NULL in case  of  failure.
2041                     For  sockets  with  reuseport option, the struct bpf_sock
2042                     result is from  reuse->socks[]  using  the  hash  of  the
2043                     tuple.
2044
2045       int bpf_sk_release(struct bpf_sock *sock)
2046
2047              Description
2048                     Release  the  reference  held  by  sock.  sock  must be a
2049                     non-NULL    pointer    that     was     returned     from
2050                     bpf_sk_lookup_xxx().
2051
2052              Return 0 on success, or a negative error in case of failure.
2053
2054       int bpf_map_pop_elem(struct bpf_map *map, void *value)
2055
2056              Description
2057                     Pop an element from map.
2058
2059              Return 0 on success, or a negative error in case of failure.
2060
2061       int bpf_map_peek_elem(struct bpf_map *map, void *value)
2062
2063              Description
2064                     Get an element from map without removing it.
2065
2066              Return 0 on success, or a negative error in case of failure.
2067
2068       int bpf_msg_push_data(struct sk_msg_buff *msg, u32 start, u32 len, u64
2069       flags)
2070
2071              Description
2072                     For socket policies, insert len bytes into msg at  offset
2073                     start.
2074
2075                     If a program of type BPF_PROG_TYPE_SK_MSG is run on a msg
2076                     it may want to insert metadata or options into  the  msg.
2077                     This can later be read and used by any of the lower layer
2078                     BPF hooks.
2079
2080                     This helper may fail under memory pressure (if an
2081                     allocation fails). In that case the BPF program gets an
2082                     appropriate error and needs to handle it.
2083
2084              Return 0 on success, or a negative error in case of failure.
2085
2086       int bpf_msg_pop_data(struct sk_msg_buff *msg, u32 start, u32  pop,  u64
2087       flags)
2088
2089              Description
2090                     Remove pop bytes from msg, starting at byte start. This
2091                     may result in ENOMEM errors under certain situations if an
2092                     allocation and copy are required due to a full ring
2093                     buffer. However, the helper will try to avoid doing the
2094                     allocation if possible. Other errors can occur if input
2095                     parameters are invalid, either because the start byte is
2096                     not a valid part of the msg payload and/or because the pop
2097                     value is too large.
2098
2099              Return 0 on success, or a negative error in case of failure.
2100
2101       int bpf_rc_pointer_rel(void *ctx, s32 rel_x, s32 rel_y)
2102
2103              Description
2104                     This helper is used in programs implementing IR decoding,
2105                     to report a successfully decoded pointer movement.
2106
2107                     The  ctx  should  point to the lirc sample as passed into
2108                     the program.
2109
2110                     This helper is only available if the kernel was compiled
2111                     with  the  CONFIG_BPF_LIRC_MODE2 configuration option set
2112                     to "y".
2113
2114              Return 0
2115

EXAMPLES

2117       Examples of usage for most of the eBPF helpers listed in this manual
2118       page are available within the Linux kernel sources, at the following
2119       locations:
2120
2121       · samples/bpf/
2122
2123       · tools/testing/selftests/bpf/
2124

LICENSE

2126       eBPF programs can have an associated license,  passed  along  with  the
2127       bytecode  instructions  to the kernel when the programs are loaded. The
2128       format for that string is identical to the one in use for  kernel  mod‐
2129       ules  (Dual licenses, such as "Dual BSD/GPL", may be used). Some helper
2130       functions are only accessible to programs that are compatible with  the
2131       GNU General Public License (GPL).
2132
2133       In  order to use such helpers, the eBPF program must be loaded with the
2134       correct license string passed (via attr) to the bpf() system call,  and
2135       this  generally  translates  into the C source code of the program con‐
2136       taining a line similar to the following:
2137
2138          char ____license[] __attribute__((section("license"), used)) = "GPL";
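
           Since the license string follows the kernel module format, the
           compatibility test can be pictured as a plain string comparison.
           The sketch below is a user-space illustration only: the function
           name is hypothetical, and the list of accepted strings is assumed
           to mirror the kernel module rules (include/linux/license.h) and may
           differ between kernel versions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if license is one of the strings assumed to be GPL-compatible. */
static bool license_is_gpl_compatible_sketch(const char *license)
{
    /* Assumed list, modelled on the kernel module license strings. */
    static const char *const accepted[] = {
        "GPL",
        "GPL v2",
        "GPL and additional rights",
        "Dual BSD/GPL",
        "Dual MIT/GPL",
        "Dual MPL/GPL",
    };

    if (license == NULL)
        return false;
    for (size_t i = 0; i < sizeof(accepted) / sizeof(accepted[0]); i++) {
        if (strcmp(license, accepted[i]) == 0)
            return true;
    }
    return false;
}
```

           The comparison is exact, so a string such as "gpl" or "BSD" would
           not unlock the GPL-only helpers in this model.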
2139

IMPLEMENTATION

2141       This manual page is an effort to  document  the  existing  eBPF  helper
2142       functions.   But  as of this writing, the BPF sub-system is under heavy
2143       development. New eBPF program or map types are added,  along  with  new
2144       helper  functions.  Some  helpers  are  occasionally made available for
2145       additional program types. So in spite of the efforts of the  community,
2146       this page might not be up-to-date. If you want to check for yourself
2147       what helper functions exist in your kernel, or what types of programs
2148       they can support, here are some files in the kernel tree that you may
2149       be interested in:
2150
2151       · include/uapi/linux/bpf.h is the main BPF header. It contains the full
2152         list  of  all helper functions, as well as many other BPF definitions
2153         including most of  the  flags,  structs  or  constants  used  by  the
2154         helpers.
2155
2156       · net/core/filter.c  contains  the  definition  of most network-related
2157         helper functions, and the list of program types from which  they  can
2158         be used.
2159
2160       · kernel/trace/bpf_trace.c  is  the  equivalent  for  most tracing pro‐
2161         gram-related helpers.
2162
2163       · kernel/bpf/verifier.c contains the functions used to check that valid
2164         types of eBPF maps are used with a given helper function.
2165
2166       · kernel/bpf/  directory  contains  other  files  in  which  additional
2167         helpers are defined (for cgroups, sockmaps, etc.).
2168
2169       Compatibility between helper functions and program types can  generally
2170       be  found in the files where helper functions are defined. Look for the
2171       struct bpf_func_proto objects and for functions returning  them:  these
2172       functions contain a list of helpers that a given program type can call.
2173       Note that the default: label of the switch ... case used to filter
2174       helpers can call other functions, themselves allowing access to
2175       additional helpers. The GPL license requirement is also recorded in
2176       those struct bpf_func_proto objects.
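
           The filtering pattern described above can be pictured with the
           following user-space sketch. It is not kernel code: the struct is a
           reduced stand-in for struct bpf_func_proto that keeps only a name
           and the gpl_only marker, the enum values and function names are
           hypothetical, and the assignment of helpers to program types is
           illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced stand-in for struct bpf_func_proto (hypothetical). */
struct func_proto_sketch {
    const char *name;
    bool gpl_only;  /* true if only GPL-compatible programs may call it */
};

enum func_id_sketch {
    FUNC_MAP_LOOKUP,
    FUNC_TRACE_PRINTK,
    FUNC_SKB_LOAD_BYTES,
    FUNC_UNKNOWN
};

static const struct func_proto_sketch map_lookup_proto = { "map_lookup_elem", false };
static const struct func_proto_sketch trace_printk_proto = { "trace_printk", true };
static const struct func_proto_sketch skb_load_bytes_proto = { "skb_load_bytes", false };

/* Helpers shared by several program types. */
static const struct func_proto_sketch *base_func_proto(enum func_id_sketch id)
{
    switch (id) {
    case FUNC_MAP_LOOKUP:
        return &map_lookup_proto;
    case FUNC_TRACE_PRINTK:
        return &trace_printk_proto;
    default:
        return NULL;  /* helper not available */
    }
}

/* Helpers for one program type; default: falls through to the shared set. */
static const struct func_proto_sketch *sk_filter_func_proto(enum func_id_sketch id)
{
    switch (id) {
    case FUNC_SKB_LOAD_BYTES:
        return &skb_load_bytes_proto;
    default:
        return base_func_proto(id);
    }
}
```

           Reading such a function from top to bottom, and following the call
           made under default:, gives the full set of helpers a program type
           can use.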
2177
2178       Compatibility  between  helper  functions and map types can be found in
2179       the check_map_func_compatibility() function  in  file  kernel/bpf/veri‐
2180       fier.c.
2181
2182       Helper functions that invalidate the checks on data and data_end point‐
2183       ers    for    network    processing    are    listed    in     function
2184       bpf_helper_changes_pkt_data() in file net/core/filter.c.
2185

SEE ALSO

2187       bpf(2),  cgroups(7),  ip(8), perf_event_open(2), sendmsg(2), socket(7),
2188       tc-bpf(8)
2189

COLOPHON

2191       This page is part of release 5.02 of the Linux  man-pages  project.   A
2192       description  of  the project, information about reporting bugs, and the
2193       latest    version    of    this    page,    can     be     found     at
2194       https://www.kernel.org/doc/man-pages/.
2195
2196
2197
2198Linux                             2019-03-06                    BPF-HELPERS(7)