NFT(8)                                                                  NFT(8)

NAME
nft - Administration tool of the nftables framework for packet
filtering and classification

SYNOPSIS
nft [ -nNscaeSupyjtT ] [ -I directory ] [ -f filename | -i | cmd ...]
nft -h
nft -v

DESCRIPTION
nft is the command line tool used to set up, maintain and inspect
packet filtering and classification rules in the Linux kernel, in the
nftables framework. The Linux kernel subsystem is known as nf_tables,
and ‘nf’ stands for Netfilter.

OPTIONS
The command accepts several different options, which are documented here
in groups for a better understanding of their meaning. You can get
information about options by running nft --help.

General options:

-h, --help
    Show help message and all options.

-v, --version
    Show version.

-V
    Show long version information, including compile-time
    configuration.

Ruleset input handling options that specify how to load rulesets:

-f, --file filename
    Read input from filename. If filename is -, read from stdin.

-D, --define name=value
    Define a variable. You can only combine this option with -f.

-i, --interactive
    Read input from an interactive readline CLI. You can use quit to
    exit, or use the EOF marker, which is normally CTRL-D.

-I, --includepath directory
    Add directory to the list of directories to be searched for
    included files. This option may be specified multiple times.

-c, --check
    Check the validity of commands without actually applying the
    changes.

-o, --optimize
    Optimize your ruleset. You can combine this option with -c to
    inspect the proposed optimizations.

Ruleset list output formatting options that modify the output of the
list ruleset command:

-a, --handle
    Show object handles in output.

-s, --stateless
    Omit stateful information of rules and stateful objects.

-t, --terse
    Omit contents of sets from output.

-S, --service
    Translate ports to service names as defined by /etc/services.

-N, --reversedns
    Translate IP addresses to names via reverse DNS lookup. This may
    slow down your listing since it generates network traffic.

-u, --guid
    Translate numeric UID/GID to names as defined by /etc/passwd and
    /etc/group.

-n, --numeric
    Print fully numerical output.

-y, --numeric-priority
    Display base chain priority numerically.

-p, --numeric-protocol
    Display layer 4 protocol numerically.

-T, --numeric-time
    Show time, day and hour values in numeric format.

Command output formatting:

-e, --echo
    When inserting items into the ruleset using add, insert or replace
    commands, print notifications just like nft monitor.

-j, --json
    Format output in JSON. See libnftables-json(5) for a schema
    description.

-d, --debug level
    Enable debugging output. The debug level can be any of scanner,
    parser, eval, netlink, mnl, proto-ctx, segtree, all. You can
    combine several levels by separating them with a comma (,), for
    example -d eval,mnl.

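As an illustration of the -D/--define and -f options described above, here is a minimal sketch of a ruleset file that expects one variable from the command line. The file name demo.nft, the table name demo, the variable name wan_if and the interface eth0 are assumptions made up for this example.

```
# demo.nft (hypothetical file name)
# Load with:        nft -D wan_if=eth0 -f demo.nft
# Dry run only:     nft -c -D wan_if=eth0 -f demo.nft
table inet demo {
    chain input {
        type filter hook input priority filter; policy accept;
        # $wan_if is not defined in this file; it is supplied by -D
        iifname $wan_if tcp dport 22 accept
    }
}
```

Running the dry-run form first is a cheap way to catch syntax errors, since -c checks the commands without applying them.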
LEXICAL CONVENTIONS
Input is parsed line-wise. When the last character of a line, just
before the newline character, is a non-quoted backslash (\), the next
line is treated as a continuation. Multiple commands on the same line
can be separated using a semicolon (;).

A hash sign (#) begins a comment. All following characters on the same
line are ignored.

Identifiers begin with an alphabetic character (a-z,A-Z), followed by
zero or more alphanumeric characters (a-z,A-Z,0-9) and the characters
slash (/), backslash (\), underscore (_) and dot (.). Identifiers using
different characters or clashing with a keyword need to be enclosed in
double quotes (").

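The conventions above can be seen together in a short input file. This is only a sketch; the table name demo and the chain name input are hypothetical.

```
# Everything after a hash sign is ignored.
add table inet demo; add chain inet demo input   # two commands on one line

# The trailing backslash joins the next line to this one.
add rule inet demo input \
    tcp dport 22 accept
```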
INCLUDE FILES
include filename

Other files can be included by using the include statement. The
directories to be searched for include files can be specified using the
-I/--includepath option. You can override this behaviour either by
prepending ‘./’ to your path to force inclusion of files located in the
current working directory (i.e. relative path) or ‘/’ for a file
location expressed as an absolute path.

If -I/--includepath is not specified, then nft relies on the default
directory that is specified at compile time. You can retrieve this
default directory via the -h/--help option.

Include statements support the usual shell wildcard symbols (*,?,[]).
Having no matches for an include statement is not an error if wildcard
symbols are used in the include statement. This allows having
potentially empty include directories for statements like include
"/etc/firewall/rules/*". The wildcard matches are loaded in alphabetical
order. Files beginning with a dot (.) are not matched by include
statements.

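The three ways of locating an included file can be sketched as follows. All file names here are hypothetical.

```
include "defs.nft"                   # looked up in the -I or compile-time include path
include "./local.nft"                # forced to the current working directory
include "/etc/firewall/rules/*.nft"  # absolute path with a wildcard; no match is not an error
```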
SYMBOLIC VARIABLES
define variable = expr
undefine variable
redefine variable = expr
$variable

Symbolic variables can be defined using the define statement. Variable
references are expressions and can be used to initialize other
variables. The scope of a definition is the current block and all
blocks contained within. Symbolic variables can be undefined using the
undefine statement, and modified using the redefine statement.

Using symbolic variables.

    define int_if1 = eth0
    define int_if2 = eth1
    define int_ifs = { $int_if1, $int_if2 }
    redefine int_if2 = wlan0
    undefine int_if2

    filter input iif $int_ifs accept

ADDRESS FAMILIES
Address families determine the type of packets which are processed. For
each address family, the kernel contains so-called hooks at specific
stages of the packet processing paths, which invoke nftables if rules
for these hooks exist.

ip        IPv4 address family.

ip6       IPv6 address family.

inet      Internet (IPv4/IPv6) address family.

arp       ARP address family, handling IPv4 ARP packets.

bridge    Bridge address family, handling packets which traverse a
          bridge device.

netdev    Netdev address family, handling packets on ingress and
          egress.

All nftables objects exist in address family specific namespaces,
therefore all identifiers include an address family. If an identifier
is specified without an address family, the ip family is used by
default.

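Because every object lives in a per-family namespace, the same table name can exist once per family. A brief sketch (the table name filter is just an example):

```
add table filter           # no family given, so ip is assumed
add table ip filter        # refers to the same table as the line above
add table ip6 filter       # a separate table in the ip6 namespace
add table inet filter      # a third table, covering both IPv4 and IPv6
```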
IPV4/IPV6/INET ADDRESS FAMILIES
The IPv4/IPv6/Inet address families handle IPv4, IPv6 or both types of
packets. They contain five hooks at different packet processing stages
in the network stack; the Inet family additionally provides an ingress
hook.

Table 1. IPv4/IPv6/Inet address family hooks
┌────────────┬────────────────────────────┐
│Hook        │ Description                │
├────────────┼────────────────────────────┤
│prerouting  │ All packets entering the   │
│            │ system are processed by    │
│            │ the prerouting hook. It is │
│            │ invoked before the routing │
│            │ process and is used for    │
│            │ early filtering or         │
│            │ changing packet attributes │
│            │ that affect routing.       │
├────────────┼────────────────────────────┤
│input       │ Packets delivered to the   │
│            │ local system are processed │
│            │ by the input hook.         │
├────────────┼────────────────────────────┤
│forward     │ Packets forwarded to a     │
│            │ different host are         │
│            │ processed by the forward   │
│            │ hook.                      │
├────────────┼────────────────────────────┤
│output      │ Packets sent by local      │
│            │ processes are processed by │
│            │ the output hook.           │
├────────────┼────────────────────────────┤
│postrouting │ All packets leaving the    │
│            │ system are processed by    │
│            │ the postrouting hook.      │
├────────────┼────────────────────────────┤
│ingress     │ All packets entering the   │
│            │ system are processed by    │
│            │ this hook. It is invoked   │
│            │ before layer 3 protocol    │
│            │ handlers, hence before the │
│            │ prerouting hook, and it    │
│            │ can be used for filtering  │
│            │ and policing. Ingress is   │
│            │ only available for Inet    │
│            │ family (since Linux kernel │
│            │ 5.10).                     │
└────────────┴────────────────────────────┘

ARP ADDRESS FAMILY
The ARP address family handles ARP packets received and sent by the
system. It is commonly used to mangle ARP packets for clustering.

Table 2. ARP address family hooks
┌───────┬────────────────────────────┐
│Hook   │ Description                │
├───────┼────────────────────────────┤
│input  │ Packets delivered to the   │
│       │ local system are processed │
│       │ by the input hook.         │
├───────┼────────────────────────────┤
│output │ Packets sent by the local  │
│       │ system are processed by    │
│       │ the output hook.           │
└───────┴────────────────────────────┘

BRIDGE ADDRESS FAMILY
The bridge address family handles Ethernet packets traversing bridge
devices.

The list of supported hooks is identical to that of the IPv4/IPv6/Inet
address families above.

NETDEV ADDRESS FAMILY
The Netdev address family handles packets from the device ingress and
egress path. This family allows you to filter packets of any ethertype
such as ARP, VLAN 802.1q, VLAN 802.1ad (Q-in-Q) as well as IPv4 and
IPv6 packets.

Table 3. Netdev address family hooks
┌────────┬────────────────────────────┐
│Hook    │ Description                │
├────────┼────────────────────────────┤
│ingress │ All packets entering the   │
│        │ system are processed by    │
│        │ this hook. It is invoked   │
│        │ after the network taps     │
│        │ (e.g. tcpdump), right after│
│        │ tc ingress and before      │
│        │ layer 3 protocol handlers. │
│        │ It can be used for early   │
│        │ filtering and policing.    │
├────────┼────────────────────────────┤
│egress  │ All packets leaving the    │
│        │ system are processed by    │
│        │ this hook. It is invoked   │
│        │ after layer 3 protocol     │
│        │ handlers and before tc     │
│        │ egress. It can be used for │
│        │ late filtering and         │
│        │ policing.                  │
└────────┴────────────────────────────┘

Tunneled packets (such as vxlan) are processed by netdev family hooks
both in decapsulated and encapsulated (tunneled) form. So a packet can
be filtered on the overlay network as well as on the underlying
network.

Note that the order of netfilter and tc is mirrored on ingress versus
egress. This ensures symmetry for NAT and other packet mangling.

Ingress packets which are redirected out some other interface are only
processed by netfilter on egress if they have passed through netfilter
ingress processing before. Thus, ingress packets which are redirected
by tc are not subjected to netfilter, whereas packets redirected by
netfilter on ingress are. Conceptually, tc and netfilter can be thought
of as layers, with netfilter layered above tc: if the packet hasn’t
been passed up from the tc layer to the netfilter layer, it’s not
subjected to netfilter on egress.

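A netdev ingress chain is bound to one interface and sees every ethertype. The following is a minimal sketch under assumed names: the table early, the chain ingress_eth0 and the device eth0 are hypothetical.

```
table netdev early {
    chain ingress_eth0 {
        # netdev base chains need a device, here the assumed interface eth0
        type filter hook ingress device "eth0" priority filter; policy accept;
        ether type arp counter     # count ARP frames
        ether type ip counter      # count IPv4 frames
    }
}
```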
RULESET
{list | flush} ruleset [family]

The ruleset keyword is used to identify the whole set of tables,
chains, etc. currently in place in the kernel. The following ruleset
commands exist:

list      Print the ruleset in human-readable format.

flush     Clear the whole ruleset. Note that, unlike iptables, this
          will remove all tables and whatever they contain, effectively
          leading to an empty ruleset: no packet filtering will happen
          anymore, so the kernel accepts any valid packet it receives.

It is possible to limit list and flush to a specific address family
only. For a list of valid family names, see the section called “ADDRESS
FAMILIES” above.

By design, list ruleset command output may be used as input to nft -f.
Effectively, this is the nft-equivalent of iptables-save and
iptables-restore.

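A common way to use this save and restore property is a ruleset file that starts by clearing the old state. The sketch below assumes, to the best of my understanding, that a file loaded with nft -f is applied as a single transaction, so the flush and the new rules take effect together; the table and chain names are hypothetical.

```
# Load with: nft -f ruleset.nft   (hypothetical file name)
flush ruleset

table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        ct state established,related accept
    }
}
```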
TABLES
{add | create} table [family] table [{ [comment comment ;] [flags flags ;] }]
{delete | destroy | list | flush} table [family] table
list tables [family]
delete table [family] handle handle
destroy table [family] handle handle

Tables are containers for chains, sets and stateful objects. They are
identified by their address family and their name. The address family
must be one of ip, ip6, inet, arp, bridge, netdev. The inet address
family is a dummy family which is used to create hybrid IPv4/IPv6
tables. The meta expression nfproto keyword can be used to test which
family (ipv4 or ipv6) context the packet is being processed in. When no
address family is specified, ip is used by default. The only difference
between add and create is that the former will not return an error if
the specified table already exists while create will return an error.

Table 4. Table flags
┌────────┬────────────────────────────┐
│Flag    │ Description                │
├────────┼────────────────────────────┤
│dormant │ table is not evaluated any │
│        │ more (base chains are      │
│        │ unregistered).             │
└────────┴────────────────────────────┘

Add, change, delete a table.

    # start nft in interactive mode
    nft --interactive

    # create a new table.
    create table inet mytable

    # add a new base chain: get input packets
    add chain inet mytable myin { type filter hook input priority filter; }

    # add a single counter to the chain
    add rule inet mytable myin counter

    # disable the table temporarily -- rules are not evaluated anymore
    add table inet mytable { flags dormant; }

    # make table active again:
    add table inet mytable

add       Add a new table for the given family with the given name.

delete    Delete the specified table.

destroy   Delete the specified table; it does not fail if it does not
          exist.

list      List all chains and rules of the specified table.

flush     Flush all chains and rules of the specified table.

CHAINS
{add | create} chain [family] table chain [{ type type hook hook [device device] priority priority ; [policy policy ;] [comment comment ;] }]
{delete | destroy | list | flush} chain [family] table chain
list chains [family]
delete chain [family] table handle handle
destroy chain [family] table handle handle
rename chain [family] table chain newname

Chains are containers for rules. They exist in two kinds, base chains
and regular chains. A base chain is an entry point for packets from the
networking stack; a regular chain may be used as a jump target and is
used for better rule organization.

add       Add a new chain in the specified table. When a hook and
          priority value are specified, the chain is created as a base
          chain and hooked up to the networking stack.

create    Similar to the add command, but returns an error if the
          chain already exists.

delete    Delete the specified chain. The chain must not contain any
          rules or be used as a jump target.

destroy   Delete the specified chain; it does not fail if it does not
          exist. The chain must not contain any rules or be used as a
          jump target.

rename    Rename the specified chain.

list      List all rules of the specified chain.

flush     Flush all rules of the specified chain.

For base chains, type, hook and priority parameters are mandatory.

Table 5. Supported chain types
┌───────┬───────────────┬────────────────┬──────────────────┐
│Type   │ Families      │ Hooks          │ Description      │
├───────┼───────────────┼────────────────┼──────────────────┤
│filter │ all           │ all            │ Standard chain   │
│       │               │                │ type to use when │
│       │               │                │ in doubt.        │
├───────┼───────────────┼────────────────┼──────────────────┤
│nat    │ ip, ip6, inet │ prerouting,    │ Chains of this   │
│       │               │ input, output, │ type perform     │
│       │               │ postrouting    │ Network Address  │
│       │               │                │ Translation      │
│       │               │                │ based on         │
│       │               │                │ conntrack        │
│       │               │                │ entries. Only    │
│       │               │                │ the first packet │
│       │               │                │ of a connection  │
│       │               │                │ actually         │
│       │               │                │ traverses this   │
│       │               │                │ chain - its      │
│       │               │                │ rules usually    │
│       │               │                │ define details   │
│       │               │                │ of the created   │
│       │               │                │ conntrack entry  │
│       │               │                │ (NAT statements  │
│       │               │                │ for instance).   │
├───────┼───────────────┼────────────────┼──────────────────┤
│route  │ ip, ip6       │ output         │ If a packet has  │
│       │               │                │ traversed a      │
│       │               │                │ chain of this    │
│       │               │                │ type and is      │
│       │               │                │ about to be      │
│       │               │                │ accepted, a new  │
│       │               │                │ route lookup is  │
│       │               │                │ performed if     │
│       │               │                │ relevant parts   │
│       │               │                │ of the IP header │
│       │               │                │ have changed.    │
│       │               │                │ This allows one  │
│       │               │                │ to e.g.          │
│       │               │                │ implement policy │
│       │               │                │ routing          │
│       │               │                │ selectors in     │
│       │               │                │ nftables.        │
└───────┴───────────────┴────────────────┴──────────────────┘

Apart from the special cases illustrated above (e.g. nat type not
supporting forward hook or route type only supporting output hook),
there are three further quirks worth noting:

•   The netdev family supports merely two combinations, namely filter
    type with ingress hook and filter type with egress hook. Base
    chains in this family also require the device parameter to be
    present since they exist per interface only.

•   The arp family supports only the input and output hooks, both in
    chains of type filter.

•   The inet family also supports the ingress hook (since Linux kernel
    5.10), to filter IPv4 and IPv6 packets at the same location as the
    netdev ingress hook. This inet hook allows you to share sets and
    maps between the usual prerouting, input, forward, output,
    postrouting and this ingress hook.

The priority parameter accepts a signed integer value or a standard
priority name which specifies the order in which chains with the same
hook value are traversed. The ordering is ascending, i.e. lower
priority values have precedence over higher ones.

With nat type chains, priority values must be greater than -200,
because conntrack hooks at this priority and NAT requires it.

Standard priority values can be replaced with easily memorizable names.
Not all names make sense in every family with every hook (see the
compatibility matrices below) but their numerical value can still be
used for prioritizing chains.

These names and values are defined and made available based on what
priorities are used by xtables when registering their default chains.

Most of the families use the same values, but bridge uses different
ones from the others. See the following tables that describe the values
and compatibility.

Table 6. Standard priority names, family and hook compatibility matrix
┌─────────┬───────┬────────────────┬─────────────┐
│Name     │ Value │ Families       │ Hooks       │
├─────────┼───────┼────────────────┼─────────────┤
│raw      │ -300  │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│mangle   │ -150  │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│dstnat   │ -100  │ ip, ip6, inet  │ prerouting  │
├─────────┼───────┼────────────────┼─────────────┤
│filter   │ 0     │ ip, ip6, inet, │ all         │
│         │       │ arp, netdev    │             │
├─────────┼───────┼────────────────┼─────────────┤
│security │ 50    │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│srcnat   │ 100   │ ip, ip6, inet  │ postrouting │
└─────────┴───────┴────────────────┴─────────────┘

Table 7. Standard priority names and hook compatibility for the bridge
family
┌───────┬───────┬─────────────┐
│Name   │ Value │ Hooks       │
├───────┼───────┼─────────────┤
│dstnat │ -300  │ prerouting  │
├───────┼───────┼─────────────┤
│filter │ -200  │ all         │
├───────┼───────┼─────────────┤
│out    │ 100   │ output      │
├───────┼───────┼─────────────┤
│srcnat │ 300   │ postrouting │
└───────┴───────┴─────────────┘

Basic arithmetic expressions (addition and subtraction) can also be
used with these standard names to ease relative prioritizing, e.g.
mangle - 5 stands for -155. Values are also printed in this form as
long as they are no further than 10 from the standard value.

Base chains also allow one to set the chain’s policy, i.e. what happens
to packets not explicitly accepted or refused in contained rules.
Supported policy values are accept (which is the default) or drop.

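The chain concepts above (base versus regular chains, priority names with arithmetic, and policy) can be combined in one table. This is a sketch under assumed names: the table demo, the chains tcp_services, input and postnat, and the interface eth0 are hypothetical.

```
table inet demo {
    # regular chain: no hook, only reachable as a jump target
    chain tcp_services {
        tcp dport { 22, 443 } accept
    }

    # base chain: priority "filter - 5" resolves to -5, so it is traversed
    # before chains registered at plain "filter" on the same hook
    chain input {
        type filter hook input priority filter - 5; policy drop;
        ct state established,related accept
        iifname "lo" accept
        meta l4proto tcp jump tcp_services
    }

    # nat base chain using the standard srcnat priority name (100)
    chain postnat {
        type nat hook postrouting priority srcnat; policy accept;
        oifname "eth0" masquerade
    }
}
```

With policy drop on the input chain, any packet that no rule accepts is discarded, so the loopback and established-connection rules are there to avoid locking out local traffic.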
RULES
{add | insert} rule [family] table chain [handle handle | index index] statement ... [comment comment]
replace rule [family] table chain handle handle statement ... [comment comment]
{delete | reset} rule [family] table chain handle handle
destroy rule [family] table chain handle handle
reset rules [family]
reset rules table [family] table
reset rules chain [family] table [chain]

Rules are added to chains in the given table. If the family is not
specified, the ip family is used. Rules are constructed from two kinds
of components according to a set of grammatical rules: expressions and
statements.

The add and insert commands support an optional location specifier,
which is either a handle or the index (starting at zero) of an existing
rule. Internally, rule locations are always identified by handle and
the translation from index happens in userspace. This has two potential
implications in case a concurrent ruleset change happens after the
translation was done: the effective rule index might change if a rule
was inserted or deleted before the referred one. If the referred rule
was deleted, the command is rejected by the kernel just as if an
invalid handle was given.

A comment is a single word or a double-quoted (") multi-word string
which can be used to make notes regarding the actual rule. Note: If you
use bash for adding rules, you have to escape the quotation marks, e.g.
\"enable ssh for servers\".

add       Add a new rule described by the list of statements. The rule
          is appended to the given chain unless a location is
          specified, in which case the rule is inserted after the
          specified rule.

insert    Same as add except the rule is inserted at the beginning of
          the chain or before the specified rule.

replace   Similar to add, but the rule replaces the specified rule.

delete    Delete the specified rule.

destroy   Delete the specified rule; it does not fail if it does not
          exist.

reset     Reset rule-contained state, i.e. counter and quota statement
          values.

Add a rule to the output chain of the ip filter table.

    nft add rule filter output ip daddr 192.168.0.0/24 accept # 'ip filter' is assumed
    # same command, slightly more verbose
    nft add rule ip filter output ip daddr 192.168.0.0/24 accept

Delete a rule from an inet table.

    # nft -a list ruleset
    table inet filter {
            chain input {
                    type filter hook input priority filter; policy accept;
                    ct state established,related accept # handle 4
                    ip saddr 10.1.1.1 tcp dport ssh accept # handle 5
              ...
    # delete the rule with handle 5
    nft delete rule inet filter input handle 5

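The location specifiers described above can be sketched as a short command file. It assumes that the table inet filter with a chain input already exists and that a rule with handle 5 is present, as in the listing above; the added rules themselves are only illustrative.

```
# appended at the end of the chain, with a comment
add rule inet filter input tcp dport 80 accept comment "web"

# placed at the beginning of the chain
insert rule inet filter input ct state invalid drop

# placed directly after the rule with handle 5
add rule inet filter input handle 5 tcp dport 443 accept

# placed directly before the rule currently at index 0
insert rule inet filter input index 0 iifname "lo" accept

# the rule with handle 5 is replaced in place
replace rule inet filter input handle 5 ip saddr 10.1.1.0/24 tcp dport ssh accept
```

Handles are stable identifiers, whereas an index depends on the chain contents at the time of translation, so handles are the safer choice when other tools may change the ruleset concurrently.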
SETS
nftables offers two kinds of set concepts. Anonymous sets are sets that
have no specific name. The set members are enclosed in curly braces,
with commas to separate elements when creating the rule the set is used
in. Once that rule is removed, the set is removed as well. They cannot
be updated, i.e. once an anonymous set is declared it cannot be changed
anymore except by removing/altering the rule that uses the anonymous
set.

Using anonymous sets to accept particular subnets and ports.

    nft add rule filter input ip saddr { 10.0.0.0/8, 192.168.0.0/16 } tcp dport { 22, 443 } accept

Named sets are sets that need to be defined first before they can be
referenced in rules. Unlike anonymous sets, elements can be added to or
removed from a named set at any time. Sets are referenced from rules
using an @ prefixed to the set’s name.

Using named sets to accept addresses and ports.

    nft add rule filter input ip saddr @allowed_hosts tcp dport @allowed_ports accept

The sets allowed_hosts and allowed_ports need to be created first. The
next section describes nft set syntax in more detail.

add set [family] table set { type type | typeof expression ; [flags flags ;] [timeout timeout ;] [gc-interval gc-interval ;] [elements = { element[, ...] } ;] [size size ;] [comment comment ;] [policy policy ;] [auto-merge ;] }
{delete | destroy | list | flush} set [family] table set
list sets [family]
delete set [family] table handle handle
{add | delete | destroy } element [family] table set { element[, ...] }

Sets are element containers of a user-defined data type. They are
uniquely identified by a user-defined name and attached to tables.
Their behaviour can be tuned with the flags that can be specified at
set creation time.

add       Add a new set in the specified table. See the Set
          specification table below for more information about how to
          specify properties of a set.

delete    Delete the specified set.

destroy   Delete the specified set; it does not fail if it does not
          exist.

list      Display the elements in the specified set.

flush     Remove all elements from the specified set.

Table 8. Set specifications
┌────────────┬──────────────────────┬─────────────────────┐
│Keyword     │ Description          │ Type                │
├────────────┼──────────────────────┼─────────────────────┤
│type        │ data type of set     │ string: ipv4_addr,  │
│            │ elements             │ ipv6_addr,          │
│            │                      │ ether_addr,         │
│            │                      │ inet_proto,         │
│            │                      │ inet_service, mark  │
├────────────┼──────────────────────┼─────────────────────┤
│typeof      │ data type of set     │ expression to       │
│            │ element              │ derive the data     │
│            │                      │ type from           │
├────────────┼──────────────────────┼─────────────────────┤
│flags       │ set flags            │ string: constant,   │
│            │                      │ dynamic, interval,  │
│            │                      │ timeout             │
├────────────┼──────────────────────┼─────────────────────┤
│timeout     │ time an element      │ string, decimal     │
│            │ stays in the set,    │ followed by unit.   │
│            │ mandatory if set is  │ Units are: d, h, m, │
│            │ added to from the    │ s                   │
│            │ packet path          │                     │
│            │ (ruleset)            │                     │
├────────────┼──────────────────────┼─────────────────────┤
│gc-interval │ garbage collection   │ string, decimal     │
│            │ interval, only       │ followed by unit.   │
│            │ available when       │ Units are: d, h, m, │
│            │ timeout or flag      │ s                   │
│            │ timeout are active   │                     │
├────────────┼──────────────────────┼─────────────────────┤
│elements    │ elements contained   │ set data type       │
│            │ by the set           │                     │
├────────────┼──────────────────────┼─────────────────────┤
│size        │ maximum number of    │ unsigned integer    │
│            │ elements in the      │ (64 bit)            │
│            │ set, mandatory if    │                     │
│            │ set is added to      │                     │
│            │ from the packet      │                     │
│            │ path (ruleset)       │                     │
├────────────┼──────────────────────┼─────────────────────┤
│policy      │ set policy           │ string: performance │
│            │                      │ [default], memory   │
├────────────┼──────────────────────┼─────────────────────┤
│auto-merge  │ automatic merge of   │                     │
│            │ adjacent/overlapping │                     │
│            │ set elements (only   │                     │
│            │ for interval sets)   │                     │
└────────────┴──────────────────────┴─────────────────────┘

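The named sets referenced in the earlier example could be declared along these lines. This is a sketch only: the addresses, ports, and the choice of the interval flag with auto-merge are assumptions made for illustration.

```
table ip filter {
    # interval set, so prefixes and ranges are allowed as elements
    set allowed_hosts {
        type ipv4_addr
        flags interval
        auto-merge
        elements = { 10.0.0.0/8, 192.168.0.0/16 }
    }

    # plain set of port numbers
    set allowed_ports {
        type inet_service
        elements = { 22, 443 }
    }

    chain input {
        type filter hook input priority filter; policy accept;
        ip saddr @allowed_hosts tcp dport @allowed_ports accept
    }
}
```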
MAPS
add map [family] table map { type type | typeof expression ; [flags flags ;] [elements = { element[, ...] } ;] [size size ;] [comment comment ;] [policy policy ;] }
{delete | destroy | list | flush} map [family] table map
list maps [family]

Maps store data based on some specific key used as input. They are
uniquely identified by a user-defined name and attached to tables.

add              Add a new map in the specified table.

delete           Delete the specified map.

destroy          Delete the specified map; it does not fail if it does
                 not exist.

list             Display the elements in the specified map.

flush            Remove all elements from the specified map.

add element      Comma-separated list of elements to add into the
                 specified map.

delete element   Comma-separated list of element keys to delete from
                 the specified map.

Table 9. Map specifications
┌─────────┬─────────────────────┬─────────────────────┐
│Keyword  │ Description         │ Type                │
├─────────┼─────────────────────┼─────────────────────┤
│type     │ data type of map    │ string: ipv4_addr,  │
│         │ elements            │ ipv6_addr,          │
│         │                     │ ether_addr,         │
│         │                     │ inet_proto,         │
│         │                     │ inet_service, mark, │
│         │                     │ counter, quota.     │
│         │                     │ Counter and quota   │
│         │                     │ can’t be used as    │
│         │                     │ keys                │
├─────────┼─────────────────────┼─────────────────────┤
│typeof   │ data type of map    │ expression to       │
│         │ element             │ derive the data     │
│         │                     │ type from           │
├─────────┼─────────────────────┼─────────────────────┤
│flags    │ map flags           │ string: constant,   │
│         │                     │ interval            │
├─────────┼─────────────────────┼─────────────────────┤
│elements │ elements contained  │ map data type       │
│         │ by the map          │                     │
├─────────┼─────────────────────┼─────────────────────┤
│size     │ maximum number of   │ unsigned integer    │
│         │ elements in the map │ (64 bit)            │
├─────────┼─────────────────────┼─────────────────────┤
│policy   │ map policy          │ string: performance │
│         │                     │ [default], memory   │
└─────────┴─────────────────────┴─────────────────────┘

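A map pairs a key type with a data type. The sketch below maps destination ports to internal addresses and uses the map from a NAT rule; the table name nat, the map name port_to_host and the addresses are hypothetical, and the dnat statement shown is not described in this section.

```
table ip nat {
    # key: TCP/UDP port, data: IPv4 address
    map port_to_host {
        type inet_service : ipv4_addr
        elements = { 80 : 192.168.1.10, 443 : 192.168.1.11 }
    }

    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;
        # look up the destination port in the map and DNAT to the result
        dnat to tcp dport map @port_to_host
    }
}
```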
ELEMENTS
{add | create | delete | destroy | get } element [family] table set { ELEMENT[, ...] }

ELEMENT := key_expression OPTIONS [: value_expression]
OPTIONS := [timeout TIMESPEC] [expires TIMESPEC] [comment string]
TIMESPEC := [numd][numh][numm][num[s]]

Element-related commands allow one to change contents of named sets and
maps. key_expression is typically a value matching the set type.
value_expression is not allowed in sets but mandatory when adding to
maps, where it matches the data part in its type definition. When
deleting from maps, it may be specified but is optional as
key_expression uniquely identifies the element.

The create command is similar to add, with the exception that none of
the listed elements may already exist.

The get command is useful to check if an element is contained in a set,
which may be non-trivial in very large and/or interval sets. In the
latter case, the containing interval is returned instead of just the
element itself.

925 Table 10. Element options
926 ┌────────┬───────────────────────────┐
927 │Option │ Description │
928 ├────────┼───────────────────────────┤
929 │ │ │
930 │timeout │ timeout value for │
931 │ │ sets/maps with flag │
932 │ │ timeout │
933 ├────────┼───────────────────────────┤
934 │ │ │
935 │expires │ the time until given │
936 │ │ element expires, useful │
937 │ │ for ruleset replication │
938 │ │ only │
939 ├────────┼───────────────────────────┤
940 │ │ │
941 │comment │ per element comment field │
942 └────────┴───────────────────────────┘
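
The element commands and options above can be sketched as a file for nft -f. This is a minimal illustration, not taken from the original text: the table, set and map names (filter, blocklist, localnets, portmap) and all addresses are hypothetical.

```nft
# Hypothetical objects used only for this sketch.
add table ip filter
add set ip filter blocklist { type ipv4_addr; flags timeout; }
add set ip filter localnets { type ipv4_addr; flags interval; }
add map ip filter portmap { type inet_service : ipv4_addr; }

# Set element with a per-element timeout (TIMESPEC 1h30m) and a comment.
add element ip filter blocklist { 192.0.2.10 timeout 1h30m comment "scanner" }

# Map element: key_expression : value_expression.
add element ip filter portmap { 8080 : 10.0.0.5 }

# Deleting from a map: the key alone identifies the element.
delete element ip filter portmap { 8080 }

# get on an interval set reports the interval that contains the address.
add element ip filter localnets { 10.0.0.0/8 }
get element ip filter localnets { 10.1.2.3 }
```

Using create instead of add for any of the elements would fail if the element already exists.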
943
945 {add | create} flowtable [family] table flowtable { hook hook priority priority ; devices = { device[, ...] } ; }
946 list flowtables [family]
947 {delete | destroy | list} flowtable [family] table flowtable
948 delete flowtable [family] table handle handle
949
950 Flowtables allow you to accelerate packet forwarding in software.
951 Flowtable entries are represented through a tuple that is composed of
952 the input interface, source and destination address, source and
953 destination port, and layer 3/4 protocols. Each entry also caches the
954 destination interface and the gateway address (to update the
955 destination link-layer address) to forward packets. The ttl and
956 hoplimit fields are also decremented. Hence, flowtables provide an
957 alternative path that allows packets to bypass the classic forwarding
958 path. Flowtables reside in the ingress hook that is located before the
959 prerouting hook. You can select which flows you want to offload through
960 the flow expression from the forward chain. Flowtables are identified
961 by their address family and their name. The address family must be one
962 of ip, ip6, or inet. The inet address family is a dummy family which is
963 used to create hybrid IPv4/IPv6 tables. When no address family is
964 specified, ip is used by default.
965
966 The priority can be a signed integer or filter which stands for 0.
967 Addition and subtraction can be used to set relative priority, e.g.
968 filter + 5 equals 5.
969
970
971 add Add a new flowtable for
972 the given family with the
973 given name.
974
975 delete Delete the specified
976 flowtable.
977
978 destroy Delete the specified
979 flowtable, it does not
980 fail if it does not exist.
981
982 list List all flowtables.
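
A flowtable definition and the rule that offloads flows to it might look like the following sketch. The table name, flowtable name and the devices eth0 and eth1 are assumptions made for illustration; the flow add statement is what selects flows from the forward chain.

```nft
table inet fastpath {
    flowtable ft {
        # Attach to the ingress hook; "filter" stands for priority 0.
        hook ingress priority filter
        # Hypothetical interface names.
        devices = { eth0, eth1 }
    }

    chain forward {
        type filter hook forward priority filter; policy accept;
        # Offload TCP and UDP flows to the flowtable "ft".
        meta l4proto { tcp, udp } flow add @ft
    }
}
```

Packets of flows that are not offloaded continue through the classic forwarding path.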
983
984
986 list { secmarks | synproxys | flow tables | meters | hooks } [family]
987 list { secmarks | synproxys | flow tables | meters | hooks } table [family] table
988 list ct { timeout | expectation | helper | helpers } table [family] table
989
990 Inspect configured objects. list hooks shows the full hook pipeline,
991 including those registered by kernel modules, such as nf_conntrack.
992
994 {add | delete | destroy | list | reset} counter [family] table object
995 {add | delete | destroy | list | reset} quota [family] table object
996 {add | delete | destroy | list} limit [family] table object
997 delete counter [family] table handle handle
998 delete quota [family] table handle handle
999 delete limit [family] table handle handle
1000 destroy counter [family] table handle handle
1001 destroy quota [family] table handle handle
1002 destroy limit [family] table handle handle
1003 list counters [family]
1004 list quotas [family]
1005 list limits [family]
1006 reset counters [family]
1007 reset quotas [family]
1008 reset counters [family] table table
1009 reset quotas [family] table table
1010
1011 Stateful objects are attached to tables and are identified by a unique
1012 name. They group stateful information from rules. To reference them in
1013 rules, use the keywords "type name", e.g. "counter name".
1014
1015
1016 add Add a new stateful object
1017 in the specified table.
1018
1019 delete Delete the specified
1020 object.
1021
1022 destroy Delete the specified
1023 object, it does not fail
1024 if it does not exist.
1025
1026 list Display stateful
1027 information the object
1028 holds.
1029
1030 reset List-and-reset stateful
1031 object.
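
The life cycle of a stateful object can be sketched as follows, assuming a hypothetical table filter with an input chain and a counter named http. The commands are written as lines of a file passed to nft -f, so the quotes need no shell escaping.

```nft
add table ip filter
add chain ip filter input { type filter hook input priority filter; policy accept; }

# Create the named counter and reference it from a rule.
add counter ip filter http
add rule ip filter input tcp dport 80 counter name "http"

# Display the current packet and byte counts.
list counter ip filter http

# Display the counts and reset them to zero in one step.
reset counter ip filter http
```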
1032
1033
1034 CT HELPER
1035 add ct helper [family] table name { type type protocol protocol ; [l3proto family ;] }
1036 delete ct helper [family] table name
1037 list ct helpers
1038
1039 Ct helper is used to define connection tracking helpers that can then
1040 be used in combination with the ct helper set statement. type and
1041 protocol are mandatory, l3proto is derived from the table family by
1042 default, i.e. in the inet table the kernel will try to load both the
1043 ipv4 and ipv6 helper backends, if they are supported by the kernel.
1044
1045 Table 11. conntrack helper specifications
1046 ┌─────────┬─────────────────────┬─────────────────────┐
1047 │Keyword │ Description │ Type │
1048 ├─────────┼─────────────────────┼─────────────────────┤
1049 │ │ │ │
1050 │type │ name of helper type │ quoted string (e.g. │
1051 │ │ │ "ftp") │
1052 ├─────────┼─────────────────────┼─────────────────────┤
1053 │ │ │ │
1054 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
1055 │ │ the helper │ │
1056 ├─────────┼─────────────────────┼─────────────────────┤
1057 │ │ │ │
1058 │l3proto │ layer 3 protocol of │ address family │
1059 │ │ the helper │ (e.g. ip) │
1060 ├─────────┼─────────────────────┼─────────────────────┤
1061 │ │ │ │
1062 │comment │ per ct helper │ string │
1063 │ │ comment field │ │
1064 └─────────┴─────────────────────┴─────────────────────┘
1065
1066 defining and assigning ftp helper.
1067
1068 Unlike iptables, helper assignment needs to be performed after the conntrack
1069 lookup has completed, for example with the default 0 hook priority.
1070
1071 table inet myhelpers {
1072 ct helper ftp-standard {
1073 type "ftp" protocol tcp
1074 }
1075 chain prerouting {
1076 type filter hook prerouting priority filter;
1077 tcp dport 21 ct helper set "ftp-standard"
1078 }
1079 }
1080
1081
1082 CT TIMEOUT
1083 add ct timeout [family] table name { protocol protocol ; policy = { state: value [, ...] } ; [l3proto family ;] }
1084 delete ct timeout [family] table name
1085 list ct timeouts
1086
1087 Ct timeout is used to update connection tracking timeout values. Timeout
1088 policies are assigned with the ct timeout set statement. protocol and
1089 policy are mandatory, l3proto is derived from the table family by
1090 default.
1091
1092 Table 12. conntrack timeout specifications
1093 ┌─────────┬─────────────────────┬──────────────────┐
1094 │Keyword │ Description │ Type │
1095 ├─────────┼─────────────────────┼──────────────────┤
1096 │ │ │ │
1097 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
1098 │ │ the timeout object │ │
1099 ├─────────┼─────────────────────┼──────────────────┤
1100 │ │ │ │
1101 │state │ connection state │ string (e.g. │
1102 │ │ name │ "established") │
1103 ├─────────┼─────────────────────┼──────────────────┤
1104 │ │ │ │
1105 │value │ timeout value for │ unsigned integer │
1106 │ │ connection state │ │
1107 ├─────────┼─────────────────────┼──────────────────┤
1108 │ │ │ │
1109 │l3proto │ layer 3 protocol of │ address family │
1110 │ │ the timeout object │ (e.g. ip) │
1111 ├─────────┼─────────────────────┼──────────────────┤
1112 │ │ │ │
1113 │comment │ per ct timeout │ string │
1114 │ │ comment field │ │
1115 └─────────┴─────────────────────┴──────────────────┘
1116
1117 tcp connection state names that can have a specific timeout value are:
1118
1119 close, close_wait, established, fin_wait, last_ack, retrans, syn_recv,
1120 syn_sent, time_wait and unack.
1121
1122 You can use sysctl -a | grep net.netfilter.nf_conntrack_tcp_timeout_ to
1123 view the system-wide defaults. ct timeout allows for
1124 flow-specific settings, without changing the global timeouts.
1125
1126 For example, tcp port 53 could have much lower settings than other
1127 traffic.
1128
1129 udp state names that can have a specific timeout value are replied and
1130 unreplied.
1131
1132 defining and assigning ct timeout policy.
1133
1134 table ip filter {
1135 ct timeout customtimeout {
1136 protocol tcp;
1137 l3proto ip
1138 policy = { established: 120, close: 20 }
1139 }
1140
1141 chain output {
1142 type filter hook output priority filter; policy accept;
1143 ct timeout set "customtimeout"
1144 }
1145 }
1146
1147 testing the updated timeout policy.
1148
1149 % conntrack -E
1150
1151 It should display:
1152
1153 [UPDATE] tcp 6 120 ESTABLISHED src=172.16.19.128 dst=172.16.19.1
1154 sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128
1155 sport=41360 dport=22
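
A UDP variant of the same idea is sketched below. It uses the two udp state names listed above; the object name dns-udp, the port and the timeout values are illustrative assumptions, not recommended settings.

```nft
table ip filter {
    ct timeout dns-udp {
        protocol udp;
        l3proto ip;
        # Hypothetical values, in seconds.
        policy = { unreplied: 10, replied: 30 }
    }

    chain output {
        type filter hook output priority filter; policy accept;
        # Apply the shorter timeouts only to DNS queries sent by this host.
        udp dport 53 ct timeout set "dns-udp"
    }
}
```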
1156
1157
1158 CT EXPECTATION
1159 add ct expectation [family] table name { protocol protocol ; dport dport ; timeout timeout ; size size ; [l3proto family ;] }
1160 delete ct expectation [family] table name
1161 list ct expectations
1162
1163 Ct expectation is used to create connection expectations. Expectations
1164 are assigned with the ct expectation set statement. protocol, dport,
1165 timeout and size are mandatory, l3proto is derived from the table
1166 family by default.
1167
1168 Table 13. conntrack expectation specifications
1169 ┌─────────┬─────────────────────┬──────────────────┐
1170 │Keyword │ Description │ Type │
1171 ├─────────┼─────────────────────┼──────────────────┤
1172 │ │ │ │
1173 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
1174 │ │ the expectation │ │
1175 │ │ object │ │
1176 ├─────────┼─────────────────────┼──────────────────┤
1177 │ │ │ │
1178 │dport │ destination port of │ unsigned integer │
1179 │ │ expected connection │ │
1180 ├─────────┼─────────────────────┼──────────────────┤
1181 │ │ │ │
1182 │timeout │ timeout value for │ unsigned integer │
1183 │ │ expectation │ │
1184 ├─────────┼─────────────────────┼──────────────────┤
1185 │ │ │ │
1186 │size │ size value for │ unsigned integer │
1187 │ │ expectation │ │
1188 ├─────────┼─────────────────────┼──────────────────┤
1189 │ │ │ │
1190 │l3proto │ layer 3 protocol of │ address family │
1191 │ │ the expectation │ (e.g. ip) │
1192 │ │ object │ │
1193 ├─────────┼─────────────────────┼──────────────────┤
1194 │ │ │ │
1195 │comment │ per ct expectation │ string │
1196 │ │ comment field │ │
1197 └─────────┴─────────────────────┴──────────────────┘
1198
1199 defining and assigning ct expectation policy.
1200
1201 table ip filter {
1202 ct expectation expect {
1203 protocol udp
1204 dport 9876
1205 timeout 2m
1206 size 8
1207 l3proto ip
1208 }
1209
1210 chain input {
1211 type filter hook input priority filter; policy accept;
1212 ct expectation set "expect"
1213 }
1214 }
1215
1216
1217 COUNTER
1218 add counter [family] table name [{ [ packets packets bytes bytes ; ] [ comment comment ; ] }]
1219 delete counter [family] table name
1220 list counters
1221
1222 Table 14. Counter specifications
1223 ┌────────┬─────────────────────┬──────────────────┐
1224 │Keyword │ Description │ Type │
1225 ├────────┼─────────────────────┼──────────────────┤
1226 │ │ │ │
1227 │packets │ initial count of │ unsigned integer │
1228 │ │ packets │ (64 bit) │
1229 ├────────┼─────────────────────┼──────────────────┤
1230 │ │ │ │
1231 │bytes │ initial count of │ unsigned integer │
1232 │ │ bytes │ (64 bit) │
1233 ├────────┼─────────────────────┼──────────────────┤
1234 │ │ │ │
1235 │comment │ per counter comment │ string │
1236 │ │ field │ │
1237 └────────┴─────────────────────┴──────────────────┘
1238
1239 Using named counters.
1240
1241 nft add counter filter http
1242 nft add rule filter input tcp dport 80 counter name \"http\"
1243
1244 Using named counters with maps.
1245
1246 nft add counter filter http
1247 nft add counter filter https
1248 nft add rule filter input counter name tcp dport map { 80 : \"http\", 443 : \"https\" }
1249
1250
1251 QUOTA
1252 add quota [family] table name { [over|until] bytes BYTE_UNIT [ used bytes BYTE_UNIT ] ; [ comment comment ; ] }
1253 BYTE_UNIT := bytes | kbytes | mbytes
1254 delete quota [family] table name
1255 list quotas
1256
1257 Table 15. Quota specifications
1258 ┌────────┬───────────────────┬────────────────────┐
1259 │Keyword │ Description │ Type │
1260 ├────────┼───────────────────┼────────────────────┤
1261 │ │ │ │
1262 │quota │ quota limit, used │ Two arguments, │
1263 │ │ as the quota name │ unsigned integer │
1264 │ │ │ (64 bit) and │
1265 │ │ │ string: bytes, │
1266 │ │ │ kbytes, mbytes. │
1267 │ │ │ "over" and "until" │
1268 │ │ │ go before these │
1269 │ │ │ arguments │
1270 ├────────┼───────────────────┼────────────────────┤
1271 │ │ │ │
1272 │used │ initial value of │ Two arguments, │
1273 │ │ used quota │ unsigned integer │
1274 │ │ │ (64 bit) and │
1275 │ │ │ string: bytes, │
1276 │ │ │ kbytes, mbytes │
1277 ├────────┼───────────────────┼────────────────────┤
1278 │ │ │ │
1279 │comment │ per quota comment │ string │
1280 │ │ field │ │
1281 └────────┴───────────────────┴────────────────────┘
1282
1283 Using named quotas.
1284
1285 nft add quota filter user123 { over 20 mbytes }
1286 nft add rule filter input ip saddr 192.168.10.123 quota name \"user123\"
1287
1288 Using named quotas with maps.
1289
1290 nft add quota filter user123 { over 20 mbytes }
1291 nft add quota filter user124 { over 20 mbytes }
1292 nft add rule filter input quota name ip saddr map { 192.168.10.123 : \"user123\", 192.168.10.124 : \"user124\" }
1293
1294
1296 Expressions represent values, either constants like network addresses,
1297 port numbers, etc., or data gathered from the packet during ruleset
1298 evaluation. Expressions can be combined using binary, logical,
1299 relational and other types of expressions to form complex or relational
1300 (match) expressions. They are also used as arguments to certain types
1301 of operations, like NAT, packet marking etc.
1302
1303 Each expression has a data type, which determines the size, parsing and
1304 representation of symbolic values and type compatibility with other
1305 expressions.
1306
1307 DESCRIBE COMMAND
1308 describe expression | data type
1309
1310 The describe command shows information about the type of an expression
1311 and its data type. A data type may also be given, in which case nft will
1312 display more information about the type.
1313
1314 The describe command.
1315
1316 $ nft describe tcp flags
1317 payload expression, datatype tcp_flag (TCP flag) (basetype bitmask, integer), 8 bits
1318
1319 predefined symbolic constants:
1320 fin 0x01
1321 syn 0x02
1322 rst 0x04
1323 psh 0x08
1324 ack 0x10
1325 urg 0x20
1326 ecn 0x40
1327 cwr 0x80
1328
1329
1331 Data types determine the size, parsing and representation of symbolic
1332 values and type compatibility of expressions. A number of global data
1333 types exist, in addition some expression types define further data
1334 types specific to the expression type. Most data types have a fixed
1335 size; some, however, may have a dynamic size, e.g. the string type. Some
1336 types also have predefined symbolic constants. Those can be listed
1337 using the nft describe command:
1338
1339 $ nft describe ct_state
1340 datatype ct_state (conntrack state) (basetype bitmask, integer), 32 bits
1341
1342 pre-defined symbolic constants (in hexadecimal):
1343 invalid 0x00000001
1344 new ...
1345
1346 Types may be derived from lower order types, e.g. the IPv4 address type
1347 is derived from the integer type, meaning an IPv4 address can also be
1348 specified as an integer value.
1349
1350 In certain contexts (set and map definitions), it is necessary to
1351 explicitly specify a data type. Each type has a name which is used for
1352 this.
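
As a sketch of where type names are written out, the following set and map definitions use the ipv4_addr, inet_service and verdict type keywords. The table, set and map names and the elements are hypothetical.

```nft
table inet example {
    set allowed_hosts {
        # The type name states what kind of elements the set holds.
        type ipv4_addr
        elements = { 192.0.2.1, 192.0.2.2 }
    }

    map port_verdict {
        # For maps, the key type and the data type are separated by a colon.
        type inet_service : verdict
        elements = { 22 : accept, 23 : drop }
    }
}
```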
1353
1354 INTEGER TYPE
1355 ┌────────┬─────────┬──────────┬───────────┐
1356 │Name │ Keyword │ Size │ Base type │
1357 ├────────┼─────────┼──────────┼───────────┤
1358 │ │ │ │ │
1359 │Integer │ integer │ variable │ - │
1360 └────────┴─────────┴──────────┴───────────┘
1361
1362 The integer type is used for numeric values. It may be specified as a
1363 decimal, hexadecimal or octal number. The integer type does not have a
1364 fixed size, its size is determined by the expression for which it is
1365 used.
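
The notations can be compared in a short sketch, again written as lines for nft -f with hypothetical table and chain names. The two dport rules describe the same port, and the last rule relies on the IPv4 address type being derived from the integer type.

```nft
add table ip filter
add chain ip filter input { type filter hook input priority filter; policy accept; }
add chain ip filter output { type filter hook output priority filter; policy accept; }

# Decimal and hexadecimal notation for the same 16 bit port value.
add rule ip filter input tcp dport 22 counter
add rule ip filter input tcp dport 0x16 counter

# An IPv4 address given as an integer: 0x7f000001 is 127.0.0.1.
add rule ip filter output ip daddr 0x7f000001 counter
```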
1366
1367 BITMASK TYPE
1368 ┌────────┬─────────┬──────────┬───────────┐
1369 │Name │ Keyword │ Size │ Base type │
1370 ├────────┼─────────┼──────────┼───────────┤
1371 │ │ │ │ │
1372 │Bitmask │ bitmask │ variable │ integer │
1373 └────────┴─────────┴──────────┴───────────┘
1374
1375 The bitmask type (bitmask) is used for bitmasks.
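
Bitmask values are usually matched with binary operators. The sketch below, with hypothetical table and chain names, uses the TCP flag constants shown earlier by nft describe tcp flags.

```nft
table ip filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # Of the four flags fin, syn, rst and ack, only syn is set.
        tcp flags & (fin | syn | rst | ack) == syn counter

        # Both syn and ack are set; other flags are ignored.
        tcp flags & (syn | ack) == (syn | ack) counter
    }
}
```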
1376
1377 STRING TYPE
1378 ┌───────┬─────────┬──────────┬───────────┐
1379 │Name │ Keyword │ Size │ Base type │
1380 ├───────┼─────────┼──────────┼───────────┤
1381 │ │ │ │ │
1382 │String │ string │ variable │ - │
1383 └───────┴─────────┴──────────┴───────────┘
1384
1385 The string type is used for character strings. A string begins with an
1386 alphabetic character (a-zA-Z) followed by zero or more alphanumeric
1387 characters or the characters /, -, _ and .. In addition, anything
1388 enclosed in double quotes (") is recognized as a string.
1389
1390 String specification.
1391
1392 # Interface name
1393 filter input iifname eth0
1394
1395 # Weird interface name
1396 filter input iifname "(eth0)"
1397
1398
1399 LINK LAYER ADDRESS TYPE
1400 ┌───────────┬─────────┬──────────┬───────────┐
1401 │Name │ Keyword │ Size │ Base type │
1402 ├───────────┼─────────┼──────────┼───────────┤
1403 │ │ │ │ │
1404 │Link layer │ lladdr │ variable │ integer │
1405 │address │ │ │ │
1406 └───────────┴─────────┴──────────┴───────────┘
1407
1408 The link layer address type is used for link layer addresses. Link
1409 layer addresses are specified as a variable amount of groups of two
1410 hexadecimal digits separated using colons (:).
1411
1412 Link layer address specification.
1413
1414 # Ethernet destination MAC address
1415 filter input ether daddr 20:c9:d0:43:12:d9
1416
1417
1418 IPV4 ADDRESS TYPE
1419 ┌─────────────┬───────────┬────────┬───────────┐
1420 │Name │ Keyword │ Size │ Base type │
1421 ├─────────────┼───────────┼────────┼───────────┤
1422 │ │ │ │ │
1423 │IPv4 address │ ipv4_addr │ 32 bit │ integer │
1424 └─────────────┴───────────┴────────┴───────────┘
1425
1426 The IPv4 address type is used for IPv4 addresses. Addresses are
1427 specified in either dotted decimal, dotted hexadecimal, dotted octal,
1428 decimal, hexadecimal, octal notation or as a host name. A host name
1429 will be resolved using the standard system resolver.
1430
1431 IPv4 address specification.
1432
1433 # dotted decimal notation
1434 filter output ip daddr 127.0.0.1
1435
1436 # host name
1437 filter output ip daddr localhost
1438
1439
1440 IPV6 ADDRESS TYPE
1441 ┌─────────────┬───────────┬─────────┬───────────┐
1442 │Name │ Keyword │ Size │ Base type │
1443 ├─────────────┼───────────┼─────────┼───────────┤
1444 │ │ │ │ │
1445 │IPv6 address │ ipv6_addr │ 128 bit │ integer │
1446 └─────────────┴───────────┴─────────┴───────────┘
1447
1448 The IPv6 address type is used for IPv6 addresses. Addresses are
1449 specified as a host name or as hexadecimal halfwords separated by
1450 colons. Addresses might be enclosed in square brackets ("[]") to
1451 differentiate them from port numbers.
1452
1453 IPv6 address specification.
1454
1455 # abbreviated loopback address
1456 filter output ip6 daddr ::1
1457
1458 IPv6 address specification with bracket notation.
1459
1460 # without [] the port number (22) would be parsed as part of the
1461 # ipv6 address
1462 ip6 nat prerouting tcp dport 2222 dnat to [1ce::d0]:22
1463
1464
1465 BOOLEAN TYPE
1466 ┌────────┬─────────┬───────┬───────────┐
1467 │Name │ Keyword │ Size │ Base type │
1468 ├────────┼─────────┼───────┼───────────┤
1469 │ │ │ │ │
1470 │Boolean │ boolean │ 1 bit │ integer │
1471 └────────┴─────────┴───────┴───────────┘
1472
1473 The boolean type is a syntactical helper type in userspace. Its use is
1474 in the right-hand side of a (typically implicit) relational expression
1475 to change the expression on the left-hand side into a boolean check
1476 (usually for existence).
1477
1478 Table 16. The following keywords will automatically resolve into a
1479 boolean type with given value
1480 ┌────────┬───────┐
1481 │Keyword │ Value │
1482 ├────────┼───────┤
1483 │ │ │
1484 │exists │ 1 │
1485 ├────────┼───────┤
1486 │ │ │
1487 │missing │ 0 │
1488 └────────┴───────┘
1489
1490 Table 17. Expressions that support a boolean comparison
1491 ┌───────────┬─────────────────────────┐
1492 │Expression │ Behaviour │
1493 ├───────────┼─────────────────────────┤
1494 │ │ │
1495 │fib │ Check route existence. │
1496 ├───────────┼─────────────────────────┤
1497 │ │ │
1498 │exthdr │ Check IPv6 extension │
1499 │ │ header existence. │
1500 ├───────────┼─────────────────────────┤
1501 │ │ │
1502 │tcp option │ Check TCP option header │
1503 │ │ existence. │
1504 └───────────┴─────────────────────────┘
1505
1506 Boolean specification.
1507
1508 # match if route exists
1509 filter input fib daddr . iif oif exists
1510
1511 # match only non-fragmented packets in IPv6 traffic
1512 filter input exthdr frag missing
1513
1514 # match if TCP timestamp option is present
1515 filter input tcp option timestamp exists
1516
1517
1518 ICMP TYPE TYPE
1519 ┌──────────┬───────────┬───────┬───────────┐
1520 │Name │ Keyword │ Size │ Base type │
1521 ├──────────┼───────────┼───────┼───────────┤
1522 │ │ │ │ │
1523 │ICMP Type │ icmp_type │ 8 bit │ integer │
1524 └──────────┴───────────┴───────┴───────────┘
1525
1526 The ICMP Type type is used to conveniently specify the ICMP header’s
1527 type field.
1528
1529 Table 18. Keywords may be used when specifying the ICMP type
1530 ┌────────────────────────┬───────┐
1531 │Keyword │ Value │
1532 ├────────────────────────┼───────┤
1533 │ │ │
1534 │echo-reply │ 0 │
1535 ├────────────────────────┼───────┤
1536 │ │ │
1537 │destination-unreachable │ 3 │
1538 ├────────────────────────┼───────┤
1539 │ │ │
1540 │source-quench │ 4 │
1541 ├────────────────────────┼───────┤
1542 │ │ │
1543 │redirect │ 5 │
1544 ├────────────────────────┼───────┤
1545 │ │ │
1546 │echo-request │ 8 │
1547 ├────────────────────────┼───────┤
1548 │ │ │
1549 │router-advertisement │ 9 │
1550 ├────────────────────────┼───────┤
1551 │ │ │
1552 │router-solicitation │ 10 │
1553 ├────────────────────────┼───────┤
1554 │ │ │
1555 │time-exceeded │ 11 │
1556 ├────────────────────────┼───────┤
1557 │ │ │
1558 │parameter-problem │ 12 │
1559 ├────────────────────────┼───────┤
1560 │ │ │
1561 │timestamp-request │ 13 │
1562 ├────────────────────────┼───────┤
1563 │ │ │
1564 │timestamp-reply │ 14 │
1565 ├────────────────────────┼───────┤
1566 │ │ │
1567 │info-request │ 15 │
1568 ├────────────────────────┼───────┤
1569 │ │ │
1570 │info-reply │ 16 │
1571 ├────────────────────────┼───────┤
1572 │ │ │
1573 │address-mask-request │ 17 │
1574 ├────────────────────────┼───────┤
1575 │ │ │
1576 │address-mask-reply │ 18 │
1577 └────────────────────────┴───────┘
1578
1579 ICMP Type specification.
1580
1581 # match ping packets
1582 filter output icmp type { echo-request, echo-reply }
1583
1584
1585 ICMP CODE TYPE
1586 ┌──────────┬───────────┬───────┬───────────┐
1587 │Name │ Keyword │ Size │ Base type │
1588 ├──────────┼───────────┼───────┼───────────┤
1589 │ │ │ │ │
1590 │ICMP Code │ icmp_code │ 8 bit │ integer │
1591 └──────────┴───────────┴───────┴───────────┘
1592
1593 The ICMP Code type is used to conveniently specify the ICMP header’s
1594 code field.
1595
1596 Table 19. Keywords may be used when specifying the ICMP code
1597 ┌─────────────────┬───────┐
1598 │Keyword │ Value │
1599 ├─────────────────┼───────┤
1600 │ │ │
1601 │net-unreachable │ 0 │
1602 ├─────────────────┼───────┤
1603 │ │ │
1604 │host-unreachable │ 1 │
1605 ├─────────────────┼───────┤
1606 │ │ │
1607 │prot-unreachable │ 2 │
1608 ├─────────────────┼───────┤
1609 │ │ │
1610 │port-unreachable │ 3 │
1611 ├─────────────────┼───────┤
1612 │ │ │
1613 │frag-needed │ 4 │
1614 ├─────────────────┼───────┤
1615 │ │ │
1616 │net-prohibited │ 9 │
1617 ├─────────────────┼───────┤
1618 │ │ │
1619 │host-prohibited │ 10 │
1620 ├─────────────────┼───────┤
1621 │ │ │
1622 │admin-prohibited │ 13 │
1623 └─────────────────┴───────┘
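
A brief sketch of how ICMP codes might be used, with hypothetical table and chain names and an arbitrary port range. The first rule matches the code by its numeric value from the table above; the second uses a code keyword in a reject statement.

```nft
table ip filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # destination-unreachable messages with code 3 (port-unreachable).
        icmp type destination-unreachable icmp code 3 counter

        # Answer UDP probes with an ICMP port-unreachable error.
        udp dport 33434-33534 reject with icmp type port-unreachable
    }
}
```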
1624
1625 ICMPV6 TYPE TYPE
1626 ┌────────────┬────────────┬───────┬───────────┐
1627 │Name │ Keyword │ Size │ Base type │
1628 ├────────────┼────────────┼───────┼───────────┤
1629 │ │ │ │ │
1630 │ICMPv6 Type │ icmpv6_type │ 8 bit │ integer │
1631 └────────────┴────────────┴───────┴───────────┘
1632
1633 The ICMPv6 Type type is used to conveniently specify the ICMPv6
1634 header’s type field.
1635
1636 Table 20. keywords may be used when specifying the ICMPv6 type:
1637 ┌────────────────────────┬───────┐
1638 │Keyword │ Value │
1639 ├────────────────────────┼───────┤
1640 │ │ │
1641 │destination-unreachable │ 1 │
1642 ├────────────────────────┼───────┤
1643 │ │ │
1644 │packet-too-big │ 2 │
1645 ├────────────────────────┼───────┤
1646 │ │ │
1647 │time-exceeded │ 3 │
1648 ├────────────────────────┼───────┤
1649 │ │ │
1650 │parameter-problem │ 4 │
1651 ├────────────────────────┼───────┤
1652 │ │ │
1653 │echo-request │ 128 │
1654 ├────────────────────────┼───────┤
1655 │ │ │
1656 │echo-reply │ 129 │
1657 ├────────────────────────┼───────┤
1658 │ │ │
1659 │mld-listener-query │ 130 │
1660 ├────────────────────────┼───────┤
1661 │ │ │
1662 │mld-listener-report │ 131 │
1663 ├────────────────────────┼───────┤
1664 │ │ │
1665 │mld-listener-done │ 132 │
1666 ├────────────────────────┼───────┤
1667 │ │ │
1668 │mld-listener-reduction │ 132 │
1669 ├────────────────────────┼───────┤
1670 │ │ │
1671 │nd-router-solicit │ 133 │
1672 ├────────────────────────┼───────┤
1673 │ │ │
1674 │nd-router-advert │ 134 │
1675 ├────────────────────────┼───────┤
1676 │ │ │
1677 │nd-neighbor-solicit │ 135 │
1678 ├────────────────────────┼───────┤
1679 │ │ │
1680 │nd-neighbor-advert │ 136 │
1681 ├────────────────────────┼───────┤
1682 │ │ │
1683 │nd-redirect │ 137 │
1684 ├────────────────────────┼───────┤
1685 │ │ │
1686 │router-renumbering │ 138 │
1687 ├────────────────────────┼───────┤
1688 │ │ │
1689 │ind-neighbor-solicit │ 141 │
1690 ├────────────────────────┼───────┤
1691 │ │ │
1692 │ind-neighbor-advert │ 142 │
1693 ├────────────────────────┼───────┤
1694 │ │ │
1695 │mld2-listener-report │ 143 │
1696 └────────────────────────┴───────┘
1697
1698 ICMPv6 Type specification.
1699
1700 # match ICMPv6 ping packets
1701 filter output icmpv6 type { echo-request, echo-reply }
1702
1703
1704 ICMPV6 CODE TYPE
1705 ┌────────────┬─────────────┬───────┬───────────┐
1706 │Name │ Keyword │ Size │ Base type │
1707 ├────────────┼─────────────┼───────┼───────────┤
1708 │ │ │ │ │
1709 │ICMPv6 Code │ icmpv6_code │ 8 bit │ integer │
1710 └────────────┴─────────────┴───────┴───────────┘
1711
1712 The ICMPv6 Code type is used to conveniently specify the ICMPv6
1713 header’s code field.
1714
1715 Table 21. keywords may be used when specifying the ICMPv6 code
1716 ┌─────────────────┬───────┐
1717 │Keyword │ Value │
1718 ├─────────────────┼───────┤
1719 │ │ │
1720 │no-route │ 0 │
1721 ├─────────────────┼───────┤
1722 │ │ │
1723 │admin-prohibited │ 1 │
1724 ├─────────────────┼───────┤
1725 │ │ │
1726 │addr-unreachable │ 3 │
1727 ├─────────────────┼───────┤
1728 │ │ │
1729 │port-unreachable │ 4 │
1730 ├─────────────────┼───────┤
1731 │ │ │
1732 │policy-fail │ 5 │
1733 ├─────────────────┼───────┤
1734 │ │ │
1735 │reject-route │ 6 │
1736 └─────────────────┴───────┘
1737
1738 ICMPVX CODE TYPE
1739 ┌────────────┬─────────────┬───────┬───────────┐
1740 │Name │ Keyword │ Size │ Base type │
1741 ├────────────┼─────────────┼───────┼───────────┤
1742 │ │ │ │ │
1743 │ICMPvX Code │ icmpx_code │ 8 bit │ integer │
1744 └────────────┴─────────────┴───────┴───────────┘
1745
1746 The ICMPvX Code type abstraction is a set of values which overlap
1747 between ICMP and ICMPv6 Code types to be used from the inet family.
1748
1749 Table 22. keywords may be used when specifying the ICMPvX code
1750 ┌─────────────────┬───────┐
1751 │Keyword │ Value │
1752 ├─────────────────┼───────┤
1753 │ │ │
1754 │no-route │ 0 │
1755 ├─────────────────┼───────┤
1756 │ │ │
1757 │port-unreachable │ 1 │
1758 ├─────────────────┼───────┤
1759 │ │ │
1760 │host-unreachable │ 2 │
1761 ├─────────────────┼───────┤
1762 │ │ │
1763 │admin-prohibited │ 3 │
1764 └─────────────────┴───────┘
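
In an inet table a single rule can then cover both IPv4 and IPv6 traffic. A minimal sketch, assuming a hypothetical table and an arbitrary port:

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # Rejects with an ICMP or ICMPv6 "administratively prohibited"
        # error, depending on the address family of the packet.
        tcp dport 23 reject with icmpx type admin-prohibited
    }
}
```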
1765
1766 CONNTRACK TYPES
1767 Table 23. overview of types used in ct expression and statement
1768 ┌─────────────────┬───────────┬─────────┬───────────┐
1769 │Name │ Keyword │ Size │ Base type │
1770 ├─────────────────┼───────────┼─────────┼───────────┤
1771 │ │ │ │ │
1772 │conntrack state │ ct_state │ 4 byte │ bitmask │
1773 ├─────────────────┼───────────┼─────────┼───────────┤
1774 │ │ │ │ │
1775 │conntrack │ ct_dir │ 8 bit │ integer │
1776 │direction │ │ │ │
1777 ├─────────────────┼───────────┼─────────┼───────────┤
1778 │ │ │ │ │
1779 │conntrack status │ ct_status │ 4 byte │ bitmask │
1780 ├─────────────────┼───────────┼─────────┼───────────┤
1781 │ │ │ │ │
1782 │conntrack event │ ct_event │ 4 byte │ bitmask │
1783 │bits │ │ │ │
1784 ├─────────────────┼───────────┼─────────┼───────────┤
1785 │ │ │ │ │
1786 │conntrack label │ ct_label │ 128 bit │ bitmask │
1787 └─────────────────┴───────────┴─────────┴───────────┘
1788
1789 For each of the types above, keywords are available for convenience:
1790
1791 Table 24. conntrack state (ct_state)
1792 ┌────────────┬───────┐
1793 │Keyword │ Value │
1794 ├────────────┼───────┤
1795 │ │ │
1796 │invalid │ 1 │
1797 ├────────────┼───────┤
1798 │ │ │
1799 │established │ 2 │
1800 ├────────────┼───────┤
1801 │ │ │
1802 │related │ 4 │
1803 ├────────────┼───────┤
1804 │ │ │
1805 │new │ 8 │
1806 ├────────────┼───────┤
1807 │ │ │
1808 │untracked │ 64 │
1809 └────────────┴───────┘
1810
1811 Table 25. conntrack direction (ct_dir)
1812 ┌─────────┬───────┐
1813 │Keyword │ Value │
1814 ├─────────┼───────┤
1815 │ │ │
1816 │original │ 0 │
1817 ├─────────┼───────┤
1818 │ │ │
1819 │reply │ 1 │
1820 └─────────┴───────┘
1821
1822 Table 26. conntrack status (ct_status)
1823 ┌───────────┬───────┐
1824 │Keyword │ Value │
1825 ├───────────┼───────┤
1826 │ │ │
1827 │expected │ 1 │
1828 ├───────────┼───────┤
1829 │ │ │
1830 │seen-reply │ 2 │
1831 ├───────────┼───────┤
1832 │ │ │
1833 │assured │ 4 │
1834 ├───────────┼───────┤
1835 │ │ │
1836 │confirmed │ 8 │
1837 ├───────────┼───────┤
1838 │ │ │
1839 │snat │ 16 │
1840 ├───────────┼───────┤
1841 │ │ │
1842 │dnat │ 32 │
1843 ├───────────┼───────┤
1844 │ │ │
1845 │dying │ 512 │
1846 └───────────┴───────┘
1847
1848 Table 27. conntrack event bits (ct_event)
1849 ┌──────────┬───────┐
1850 │Keyword │ Value │
1851 ├──────────┼───────┤
1852 │ │ │
1853 │new │ 1 │
1854 ├──────────┼───────┤
1855 │ │ │
1856 │related │ 2 │
1857 ├──────────┼───────┤
1858 │ │ │
1859 │destroy │ 4 │
1860 ├──────────┼───────┤
1861 │ │ │
1862 │reply │ 8 │
1863 ├──────────┼───────┤
1864 │ │ │
1865 │assured │ 16 │
1866 ├──────────┼───────┤
1867 │ │ │
1868 │protoinfo │ 32 │
1869 ├──────────┼───────┤
1870 │ │ │
1871 │helper │ 64 │
1872 ├──────────┼───────┤
1873 │ │ │
1874 │mark │ 128 │
1875 ├──────────┼───────┤
1876 │ │ │
1877 │seqadj │ 256 │
1878 ├──────────┼───────┤
1879 │ │ │
1880 │secmark │ 512 │
1881 ├──────────┼───────┤
1882 │ │ │
1883 │label │ 1024 │
1884 └──────────┴───────┘
1885
1886 Possible keywords for conntrack label type (ct_label) are read at
1887 runtime from /etc/connlabel.conf.
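
The keywords from the tables above can be used directly in ct expressions. The rules below are a sketch with hypothetical table and chain names, not a recommended policy.

```nft
table ip filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # ct_state: match if any of the listed state bits is set.
        ct state established,related accept
        ct state invalid drop

        # ct_dir: packets flowing in the reply direction.
        ct direction reply counter

        # ct_status: connections subject to destination NAT.
        ct status dnat counter
    }
}
```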
1888
1889 DCCP PKTTYPE TYPE
1890 ┌─────────────────┬──────────────┬───────┬───────────┐
1891 │Name │ Keyword │ Size │ Base type │
1892 ├─────────────────┼──────────────┼───────┼───────────┤
1893 │ │ │ │ │
1894 │DCCP packet type │ dccp_pkttype │ 4 bit │ integer │
1895 └─────────────────┴──────────────┴───────┴───────────┘
1896
1897 The DCCP packet type abstracts the different legal values of the
1898 respective four bit field in the DCCP header, as stated by RFC4340.
1899 Note that possible values 10-15 are considered reserved and therefore
1900 not allowed to be used. In iptables' dccp match, these values are
1901 aliased INVALID. With nftables, one may simply match on the numeric
1902 value range, i.e. 10-15.
1903
1904 Table 28. keywords may be used when specifying the DCCP packet type
┌─────────┬───────┐
│Keyword  │ Value │
├─────────┼───────┤
│         │       │
│request  │ 0     │
├─────────┼───────┤
│         │       │
│response │ 1     │
├─────────┼───────┤
│         │       │
│data     │ 2     │
├─────────┼───────┤
│         │       │
│ack      │ 3     │
├─────────┼───────┤
│         │       │
│dataack  │ 4     │
├─────────┼───────┤
│         │       │
│closereq │ 5     │
├─────────┼───────┤
│         │       │
│close    │ 6     │
├─────────┼───────┤
│         │       │
│reset    │ 7     │
├─────────┼───────┤
│         │       │
│sync     │ 8     │
├─────────┼───────┤
│         │       │
│syncack  │ 9     │
└─────────┴───────┘
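
The keywords and the reserved numeric range can be used with the dccp
type payload expression. A short sketch; the table and chain names are
placeholders:

# accept DCCP connection requests
filter input dccp type request accept

# drop packets carrying a reserved packet type
filter input dccp type 10-15 drop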
1938
PRIMARY EXPRESSIONS
The lowest order expression is a primary expression, representing
either a constant or a single datum from a packet’s payload, meta data
or a stateful module.
1943
1944 META EXPRESSIONS
1945 meta {length | nfproto | l4proto | protocol | priority}
1946 [meta] {mark | iif | iifname | iiftype | oif | oifname | oiftype | skuid | skgid | nftrace | rtclassid | ibrname | obrname | pkttype | cpu | iifgroup | oifgroup | cgroup | random | ipsec | iifkind | oifkind | time | hour | day }
1947
1948 A meta expression refers to meta data associated with a packet.
1949
There are two types of meta expressions: unqualified and qualified meta
expressions. Qualified meta expressions require the meta keyword before
the meta key. Unqualified meta expressions can be specified either by
using the meta key directly or as qualified meta expressions. meta
l4proto is useful to match a particular transport protocol that is part
of either an IPv4 or IPv6 packet. It also skips any IPv6 extension
headers present in an IPv6 packet.
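
For instance, in an inet table a single rule can cover both address
families. A minimal sketch; the table and chain names are placeholders:

# count UDP over IPv4 and IPv6, even behind IPv6 extension headers
inet filter input meta l4proto udp counter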
1957
1958 meta iif, oif, iifname and oifname are used to match the interface a
1959 packet arrived on or is about to be sent out on.
1960
iif and oif are used to match on the interface index, whereas iifname
and oifname are used to match on the interface name. The two are not
equivalent. Consider the rule

filter input meta iif "foo"

This rule can only be added if the interface "foo" exists. Also, the
rule will continue to match even if the interface "foo" is renamed to
"bar".
1970
1971 This is because internally the interface index is used. In case of
1972 dynamically created interfaces, such as tun/tap or dialup interfaces
1973 (ppp for example), it might be better to use iifname or oifname
1974 instead.
1975
In these cases the name is used, so the interface does not have to
exist when the rule is added. The rule stops matching if the interface
is renamed, and it matches again if the interface is deleted and a new
interface with the same name is created later.
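
A minimal sketch of a name-based match; ppp0 is a hypothetical
interface that need not exist when the rule is loaded:

# matches whenever the packet arrives on an interface currently named ppp0
filter input iifname "ppp0" accept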
1980
Like with iptables, wildcard matching on interface name prefixes is
available for iifname and oifname matches by appending an asterisk (*)
character. Note however that unlike iptables, nftables does not accept
interface names consisting of the wildcard character only. Such an
expression would always match, so users are expected to omit it. To
match on a literal asterisk character, escape it with a backslash (\).
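
A short sketch of both forms as they would appear in a ruleset file;
the interface names are illustrative:

# match any input interface whose name starts with "eth"
filter input iifname "eth*" accept

# match only an interface literally named "eth*"
filter input iifname "eth\*" accept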
1988
1989 Table 29. Meta expression types
┌──────────┬─────────────────────┬─────────────────────┐
│Keyword   │ Description         │ Type                │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│length    │ Length of the       │ integer (32 bit)    │
│          │ packet in bytes     │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│nfproto   │ real hook protocol  │ integer (32 bit)    │
│          │ family, useful only │                     │
│          │ in inet table       │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│l4proto   │ layer 4 protocol,   │ integer (8 bit)     │
│          │ skips ipv6          │                     │
│          │ extension headers   │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│protocol  │ EtherType protocol  │ ether_type          │
│          │ value               │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│priority  │ TC packet priority  │ tc_handle           │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│mark      │ Packet mark         │ mark                │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│iif       │ Input interface     │ iface_index         │
│          │ index               │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│iifname   │ Input interface     │ ifname              │
│          │ name                │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│iiftype   │ Input interface     │ iface_type          │
│          │ type                │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│oif       │ Output interface    │ iface_index         │
│          │ index               │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│oifname   │ Output interface    │ ifname              │
│          │ name                │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│oiftype   │ Output interface    │ iface_type          │
│          │ hardware type       │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│sdif      │ Slave device input  │ iface_index         │
│          │ interface index     │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│sdifname  │ Slave device        │ ifname              │
│          │ interface name      │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│skuid     │ UID associated with │ uid                 │
│          │ originating socket  │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│skgid     │ GID associated with │ gid                 │
│          │ originating socket  │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│rtclassid │ Routing realm       │ realm               │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│ibrname   │ Input bridge        │ ifname              │
│          │ interface name      │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│obrname   │ Output bridge       │ ifname              │
│          │ interface name      │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│pkttype   │ packet type         │ pkt_type            │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│cpu       │ cpu number          │ integer (32 bit)    │
│          │ processing the      │                     │
│          │ packet              │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│iifgroup  │ incoming device     │ devgroup            │
│          │ group               │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│oifgroup  │ outgoing device     │ devgroup            │
│          │ group               │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│cgroup    │ control group id    │ integer (32 bit)    │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│random    │ pseudo-random       │ integer (32 bit)    │
│          │ number              │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│ipsec     │ true if packet was  │ boolean (1 bit)     │
│          │ ipsec encrypted     │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│iifkind   │ Input interface     │                     │
│          │ kind                │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│oifkind   │ Output interface    │                     │
│          │ kind                │                     │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│time      │ Absolute time of    │ Integer (32 bit) or │
│          │ packet reception    │ string              │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│day       │ Day of week         │ Integer (8 bit) or  │
│          │                     │ string              │
├──────────┼─────────────────────┼─────────────────────┤
│          │                     │                     │
│hour      │ Hour of day         │ String              │
└──────────┴─────────────────────┴─────────────────────┘
2114
2115 Table 30. Meta expression specific types
┌──────────────┬────────────────────────────┐
│Type          │ Description                │
├──────────────┼────────────────────────────┤
│              │                            │
│iface_index   │ Interface index (32 bit    │
│              │ number). Can be specified  │
│              │ numerically or as name of  │
│              │ an existing interface.     │
├──────────────┼────────────────────────────┤
│              │                            │
│ifname        │ Interface name (16 byte    │
│              │ string). Does not have to  │
│              │ exist.                     │
├──────────────┼────────────────────────────┤
│              │                            │
│iface_type    │ Interface type (16 bit     │
│              │ number).                   │
├──────────────┼────────────────────────────┤
│              │                            │
│uid           │ User ID (32 bit number).   │
│              │ Can be specified           │
│              │ numerically or as user     │
│              │ name.                      │
├──────────────┼────────────────────────────┤
│              │                            │
│gid           │ Group ID (32 bit number).  │
│              │ Can be specified           │
│              │ numerically or as group    │
│              │ name.                      │
├──────────────┼────────────────────────────┤
│              │                            │
│realm         │ Routing Realm (32 bit      │
│              │ number). Can be specified  │
│              │ numerically or as symbolic │
│              │ name defined in            │
│              │ /etc/iproute2/rt_realms.   │
├──────────────┼────────────────────────────┤
│              │                            │
│devgroup      │ Device group (32 bit       │
│              │ number). Can be specified  │
│              │ numerically or as symbolic │
│              │ name defined in            │
│              │ /etc/iproute2/group.       │
├──────────────┼────────────────────────────┤
│              │                            │
│pkt_type      │ Packet type: host          │
│              │ (addressed to local host), │
│              │ broadcast (to all),        │
│              │ multicast (to group),      │
│              │ other (addressed to        │
│              │ another host).             │
├──────────────┼────────────────────────────┤
│              │                            │
│ifkind        │ Interface kind (16 byte    │
│              │ string). See TYPES in      │
│              │ ip-link(8) for a list.     │
├──────────────┼────────────────────────────┤
│              │                            │
│time          │ Either an integer or a     │
│              │ date in ISO format. For    │
│              │ example: "2019-06-06       │
│              │ 17:00". Hour and seconds   │
│              │ are optional and can be    │
│              │ omitted if desired. If     │
│              │ omitted, midnight will be  │
│              │ assumed. The following     │
│              │ three would be equivalent: │
│              │ "2019-06-06", "2019-06-06  │
│              │ 00:00" and "2019-06-06     │
│              │ 00:00:00". When an integer │
│              │ is given, it is assumed to │
│              │ be a UNIX timestamp.       │
├──────────────┼────────────────────────────┤
│              │                            │
│day           │ Either a day of week       │
│              │ ("Monday", "Tuesday",      │
│              │ etc.), or an integer       │
│              │ between 0 and 6. Strings   │
│              │ are matched                │
│              │ case-insensitively, and a  │
│              │ full match is not expected │
│              │ (e.g. "Mon" would match    │
│              │ "Monday"). When an integer │
│              │ is given, 0 is Sunday and  │
│              │ 6 is Saturday.             │
├──────────────┼────────────────────────────┤
│              │                            │
│hour          │ A string representing an   │
│              │ hour in 24-hour format.    │
│              │ Seconds can optionally be  │
│              │ specified. For example,    │
│              │ 17:00 and 17:00:00 would   │
│              │ be equivalent.             │
└──────────────┴────────────────────────────┘
2210
2211 Using meta expressions.
2212
2213 # qualified meta expression
2214 filter output meta oif eth0
2215 filter forward meta iifkind { "tun", "veth" }
2216
2217 # unqualified meta expression
2218 filter output oif eth0
2219
2220 # incoming packet was subject to ipsec processing
2221 raw prerouting meta ipsec exists accept
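
The time related keys accept the string forms listed in the table of
specific types above. A short sketch, assuming a filter table with an
input chain; the dates and hours are arbitrary:

# drop packets received before a given point in time
filter input meta time < "2019-06-06 17:00" drop

# drop packets received between 17:00 and 19:00
filter input meta hour "17:00"-"19:00" drop

# drop packets received on Saturdays
filter input meta day "Saturday" drop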
2222
2223
2224 SOCKET EXPRESSION
2225 socket {transparent | mark | wildcard}
2226 socket cgroupv2 level NUM
2227
The socket expression can be used to search for an existing open
TCP/UDP socket associated with a packet and to match on its attributes.
It looks for an established or non-zero bound listening socket (possibly
with a non-local address). You can also use it to match on the socket
cgroupv2 at a given ancestor level. For example, if the socket belongs
to cgroupv2 a/b, ancestor level 1 checks for a match on cgroup a and
ancestor level 2 checks for a match on cgroup b.
2235
2236 Table 31. Available socket attributes
┌────────────┬─────────────────────┬─────────────────┐
│Name        │ Description         │ Type            │
├────────────┼─────────────────────┼─────────────────┤
│            │                     │                 │
│transparent │ Value of the        │ boolean (1 bit) │
│            │ IP_TRANSPARENT      │                 │
│            │ socket option in    │                 │
│            │ the found socket.   │                 │
│            │ It can be 0 or 1.   │                 │
├────────────┼─────────────────────┼─────────────────┤
│            │                     │                 │
│mark        │ Value of the socket │ mark            │
│            │ mark (SOL_SOCKET,   │                 │
│            │ SO_MARK).           │                 │
├────────────┼─────────────────────┼─────────────────┤
│            │                     │                 │
│wildcard    │ Indicates whether   │ boolean (1 bit) │
│            │ the socket is       │                 │
│            │ wildcard-bound      │                 │
│            │ (e.g. 0.0.0.0 or    │                 │
│            │ ::0).               │                 │
├────────────┼─────────────────────┼─────────────────┤
│            │                     │                 │
│cgroupv2    │ cgroup version 2    │ cgroupv2        │
│            │ for this socket     │                 │
│            │ (path from          │                 │
│            │ /sys/fs/cgroup)     │                 │
└────────────┴─────────────────────┴─────────────────┘
2265
2266 Using socket expression.
2267
# Mark packets that correspond to a transparent socket. "socket wildcard 0"
# means that zero-bound listener sockets are NOT matched (which is usually
# exactly what you want).
table inet x {
    chain y {
        type filter hook prerouting priority mangle; policy accept;
        socket transparent 1 socket wildcard 0 mark set 0x00000001 accept
    }
}

# Trace packets that correspond to a socket with a mark value of 15
table inet x {
    chain y {
        type filter hook prerouting priority mangle; policy accept;
        socket mark 0x0000000f nftrace set 1
    }
}

# Set packet mark to socket mark
table inet x {
    chain y {
        type filter hook prerouting priority mangle; policy accept;
        tcp dport 8080 mark set socket mark
    }
}

# Count packets for cgroupv2 "user.slice" at level 1
table inet x {
    chain y {
        type filter hook input priority filter; policy accept;
        socket cgroupv2 level 1 "user.slice" counter
    }
}
2301
2302
2303 OSF EXPRESSION
2304 osf [ttl {loose | skip}] {name | version}
2305
2306 The osf expression does passive operating system fingerprinting. This
2307 expression compares some data (Window Size, MSS, options and their
2308 order, DF, and others) from packets with the SYN bit set.
2309
2310 Table 32. Available osf attributes
┌────────┬─────────────────────┬────────┐
│Name    │ Description         │ Type   │
├────────┼─────────────────────┼────────┤
│        │                     │        │
│ttl     │ Do TTL checks on    │ string │
│        │ the packet to       │        │
│        │ determine the       │        │
│        │ operating system.   │        │
├────────┼─────────────────────┼────────┤
│        │                     │        │
│version │ Do OS version       │        │
│        │ checks on the       │        │
│        │ packet.             │        │
├────────┼─────────────────────┼────────┤
│        │                     │        │
│name    │ Name of the OS      │ string │
│        │ signature to match. │        │
│        │ All signatures can  │        │
│        │ be found at pf.os   │        │
│        │ file. Use "unknown" │        │
│        │ for OS signatures   │        │
│        │ that the expression │        │
│        │ could not detect.   │        │
└────────┴─────────────────────┴────────┘
2335
2336 Available ttl values.
2337
If no ttl attribute is passed, the TTL of the IP header is compared
directly against the fingerprint TTL. This generally works for LANs.
2339
2340 * loose: Check if the IP header's TTL is less than the fingerprint one. Works for globally-routable addresses.
2341 * skip: Do not compare the TTL at all.
2342
2343 Using osf expression.
2344
# Match packets that carry the "Linux" OS genre signature without comparing TTL.
table inet x {
    chain y {
        type filter hook input priority filter; policy accept;
        osf ttl skip name "Linux"
    }
}
2352
2353
2354 FIB EXPRESSIONS
2355 fib {saddr | daddr | mark | iif | oif} [. ...] {oif | oifname | type}
2356
2357 A fib expression queries the fib (forwarding information base) to
2358 obtain information such as the output interface index a particular
2359 address would use. The input is a tuple of elements that is used as
2360 input to the fib lookup functions.
2361
2362 Table 33. fib expression specific types
┌────────┬──────────────────┬──────────────────┐
│Keyword │ Description      │ Type             │
├────────┼──────────────────┼──────────────────┤
│        │                  │                  │
│oif     │ Output interface │ integer (32 bit) │
│        │ index            │                  │
├────────┼──────────────────┼──────────────────┤
│        │                  │                  │
│oifname │ Output interface │ string           │
│        │ name             │                  │
├────────┼──────────────────┼──────────────────┤
│        │                  │                  │
│type    │ Address type     │ fib_addrtype     │
└────────┴──────────────────┴──────────────────┘
2377
2378 Use nft describe fib_addrtype to get a list of all address types.
2379
2380 Using fib expressions.
2381
2382 # drop packets without a reverse path
2383 filter prerouting fib saddr . iif oif missing drop
2384
In this example, 'saddr . iif' looks up routing information based on
the source address and the input interface. oif picks the output
interface index from the routing information. If no route was found
for the source address/input interface combination, the output
interface index is zero. When the input interface is part of the input
key, the output interface index is always either the same as the input
interface index or zero. If only the source address is used as input
key ('fib saddr oif'), oif can be any interface index or zero.
2390
2391 # drop packets to address not configured on incoming interface
2392 filter prerouting fib daddr . iif type != { local, broadcast, multicast } drop
2393
# perform lookup in a specific 'blackhole' table (0xdead, needs an appropriate ip rule)
2395 filter prerouting meta mark set 0xdead fib daddr . mark type vmap { blackhole : drop, prohibit : jump prohibited, unreachable : drop }
2396
2397
2398 ROUTING EXPRESSIONS
2399 rt [ip | ip6] {classid | nexthop | mtu | ipsec}
2400
2401 A routing expression refers to routing data associated with a packet.
2402
2403 Table 34. Routing expression types
┌────────┬─────────────────────┬─────────────────────┐
│Keyword │ Description         │ Type                │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│classid │ Routing realm       │ realm               │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│nexthop │ Routing nexthop     │ ipv4_addr/ipv6_addr │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│mtu     │ TCP maximum segment │ integer (16 bit)    │
│        │ size of route       │                     │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│ipsec   │ route via ipsec     │ boolean             │
│        │ tunnel or transport │                     │
└────────┴─────────────────────┴─────────────────────┘
2421
2422 Table 35. Routing expression specific types
┌──────┬────────────────────────────┐
│Type  │ Description                │
├──────┼────────────────────────────┤
│      │                            │
│realm │ Routing Realm (32 bit      │
│      │ number). Can be specified  │
│      │ numerically or as symbolic │
│      │ name defined in            │
│      │ /etc/iproute2/rt_realms.   │
└──────┴────────────────────────────┘
2433
2434 Using routing expressions.
2435
2436 # IP family independent rt expression
2437 filter output rt classid 10
2438
2439 # IP family dependent rt expressions
2440 ip filter output rt nexthop 192.168.0.1
2441 ip6 filter output rt nexthop fd00::1
2442 inet filter output rt ip nexthop 192.168.0.1
2443 inet filter output rt ip6 nexthop fd00::1
2444
2445 # outgoing packet will be encapsulated/encrypted by ipsec
2446 filter output rt ipsec exists
2447
2448
2449 IPSEC EXPRESSIONS
2450 ipsec {in | out} [ spnum NUM ] {reqid | spi}
2451 ipsec {in | out} [ spnum NUM ] {ip | ip6} {saddr | daddr}
2452
2453 An ipsec expression refers to ipsec data associated with a packet.
2454
The in or out keyword is required and specifies whether the expression
examines inbound or outbound policies. The in keyword can be used in
the prerouting, input and forward hooks. The out keyword applies to the
forward, output and postrouting hooks. The optional keyword spnum can
be used to match a specific state in a chain; it defaults to 0.
2460
2461 Table 36. Ipsec expression types
┌────────┬─────────────────────┬─────────────────────┐
│Keyword │ Description         │ Type                │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│reqid   │ Request ID          │ integer (32 bit)    │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│spi     │ Security Parameter  │ integer (32 bit)    │
│        │ Index               │                     │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│saddr   │ Source address of   │ ipv4_addr/ipv6_addr │
│        │ the tunnel          │                     │
├────────┼─────────────────────┼─────────────────────┤
│        │                     │                     │
│daddr   │ Destination address │ ipv4_addr/ipv6_addr │
│        │ of the tunnel       │                     │
└────────┴─────────────────────┴─────────────────────┘
2480
Note: When using xfrm_interface, this expression is not usable in the
output hook, as the plain packet does not traverse it with IPsec info
attached. Use a chain in the postrouting hook instead.
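
A short sketch of ipsec matches built from the syntax above; the table
and chain names, request id, addresses and SPI range are arbitrary:

# accept inbound packets that were processed by a state with request id 1
filter input ipsec in reqid 1 accept

# count outbound packets whose first state uses a tunnel source in 192.168.1.0/24
filter postrouting ipsec out spnum 0 ip saddr 192.168.1.0/24 counter

# count inbound packets with an SPI in the given range
filter input ipsec in spi 1-65536 counter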
2484
2485 NUMGEN EXPRESSION
2486 numgen {inc | random} mod NUM [ offset NUM ]
2487
Create a number generator. The inc or random keywords control its
operation mode: in inc mode, the last returned value is simply
incremented. In random mode, a new random number is returned. The value
after the mod keyword specifies the modulus, an exclusive upper bound
that returned numbers never reach. The optional offset adds a fixed
value to the returned number.
2494
2495 A typical use-case for numgen is load-balancing:
2496
2497 Using numgen expression.
2498
2499 # round-robin between 192.168.10.100 and 192.168.20.200:
2500 add rule nat prerouting dnat to numgen inc mod 2 map \
2501 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2502
2503 # probability-based with odd bias using intervals:
2504 add rule nat prerouting dnat to numgen random mod 10 map \
2505 { 0-2 : 192.168.10.100, 3-9 : 192.168.20.200 }
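
The offset is added after the modulus has been applied. A minimal
sketch, assuming a mangle table with a prerouting chain; the mark
values are arbitrary:

# alternate the packet mark between 100 and 101
add rule mangle prerouting meta mark set numgen inc mod 2 offset 100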
2506
2507
2508 HASH EXPRESSIONS
2509 jhash {ip saddr | ip6 daddr | tcp dport | udp sport | ether saddr} [. ...] mod NUM [ seed NUM ] [ offset NUM ]
2510 symhash mod NUM [ offset NUM ]
2511
Use a hashing function to generate a number. The functions available
are jhash, known as Jenkins Hash, and symhash, for Symmetric Hash. jhash
requires an expression that selects the packet header fields to hash;
concatenations are possible as well. The value after the mod keyword
specifies the modulus, an exclusive upper bound that returned numbers
never reach. The optional seed specifies the initial value used to seed
the hashing function. The optional offset adds a fixed value to the
returned number.
2521
2522 A typical use-case for jhash and symhash is load-balancing:
2523
2524 Using hash expressions.
2525
2526 # load balance based on source ip between 2 ip addresses:
2527 add rule nat prerouting dnat to jhash ip saddr mod 2 map \
2528 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2529
2530 # symmetric load balancing between 2 ip addresses:
2531 add rule nat prerouting dnat to symhash mod 2 map \
2532 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
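
Concatenations and the seed can be combined. A minimal sketch that
hashes the source address together with the destination port; the seed
value and the addresses are arbitrary:

# load balance per source address and destination port with a fixed seed
add rule nat prerouting dnat to jhash ip saddr . tcp dport mod 2 seed 0xdeadbeef map \
{ 0 : 192.168.10.100, 1 : 192.168.20.200 }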
2533
2534
PAYLOAD EXPRESSIONS
Payload expressions refer to data from the packet’s payload.
2537
2538 ETHERNET HEADER EXPRESSION
2539 ether {daddr | saddr | type}
2540
2541 Table 37. Ethernet header expression types
2542 ┌────────┬────────────────────┬────────────┐
2543 │Keyword │ Description │ Type │
2544 ├────────┼────────────────────┼────────────┤
2545 │ │ │ │
2546 │daddr │ Destination MAC │ ether_addr │
2547 │ │ address │ │
2548 ├────────┼────────────────────┼────────────┤
2549 │ │ │ │
2550 │saddr │ Source MAC address │ ether_addr │
2551 ├────────┼────────────────────┼────────────┤
2552 │ │ │ │
2553 │type │ EtherType │ ether_type │
2554 └────────┴────────────────────┴────────────┘
2555
2556 VLAN HEADER EXPRESSION
2557 vlan {id | dei | pcp | type}
2558
The vlan expression is used to match on the vlan header fields. This
expression will not work in the ip, ip6 and inet families, unless the
vlan interface is configured with the reorder_hdr off setting. The
default is reorder_hdr on, which automatically removes the vlan tag
from the packet. See ip-link(8) for more information. For these
families it is easier to match the vlan interface name instead, using
the meta iif or meta iifname expression.
2566
2567 Table 38. VLAN header expression
2568 ┌────────┬─────────────────────┬──────────────────┐
2569 │Keyword │ Description │ Type │
2570 ├────────┼─────────────────────┼──────────────────┤
2571 │ │ │ │
2572 │id │ VLAN ID (VID) │ integer (12 bit) │
2573 ├────────┼─────────────────────┼──────────────────┤
2574 │ │ │ │
2575 │dei │ Drop Eligible │ integer (1 bit) │
2576 │ │ Indicator │ │
2577 ├────────┼─────────────────────┼──────────────────┤
2578 │ │ │ │
2579 │pcp │ Priority code point │ integer (3 bit) │
2580 ├────────┼─────────────────────┼──────────────────┤
2581 │ │ │ │
2582 │type │ EtherType │ ether_type │
2583 └────────┴─────────────────────┴──────────────────┘
2584
2585 ARP HEADER EXPRESSION
arp {htype | ptype | hlen | plen | operation | saddr { ip | ether } | daddr { ip | ether } }
2587
2588 Table 39. ARP header expression
2589 ┌────────────┬─────────────────────┬──────────────────┐
2590 │Keyword │ Description │ Type │
2591 ├────────────┼─────────────────────┼──────────────────┤
2592 │ │ │ │
2593 │htype │ ARP hardware type │ integer (16 bit) │
2594 ├────────────┼─────────────────────┼──────────────────┤
2595 │ │ │ │
2596 │ptype │ EtherType │ ether_type │
2597 ├────────────┼─────────────────────┼──────────────────┤
2598 │ │ │ │
2599 │hlen │ Hardware address │ integer (8 bit) │
2600 │ │ len │ │
2601 ├────────────┼─────────────────────┼──────────────────┤
2602 │ │ │ │
2603 │plen │ Protocol address │ integer (8 bit) │
2604 │ │ len │ │
2605 ├────────────┼─────────────────────┼──────────────────┤
2606 │ │ │ │
2607 │operation │ Operation │ arp_op │
2608 ├────────────┼─────────────────────┼──────────────────┤
2609 │ │ │ │
2610 │saddr ether │ Ethernet sender │ ether_addr │
2611 │ │ address │ │
2612 ├────────────┼─────────────────────┼──────────────────┤
2613 │ │ │ │
2614 │daddr ether │ Ethernet target │ ether_addr │
2615 │ │ address │ │
2616 ├────────────┼─────────────────────┼──────────────────┤
2617 │ │ │ │
2618 │saddr ip │ IPv4 sender address │ ipv4_addr │
2619 ├────────────┼─────────────────────┼──────────────────┤
2620 │ │ │ │
2621 │daddr ip │ IPv4 target address │ ipv4_addr │
2622 └────────────┴─────────────────────┴──────────────────┘
2623
2624 IPV4 HEADER EXPRESSION
2625 ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
2626
2627 Table 40. IPv4 header expression
2628 ┌──────────┬─────────────────────┬──────────────────┐
2629 │Keyword │ Description │ Type │
2630 ├──────────┼─────────────────────┼──────────────────┤
2631 │ │ │ │
2632 │version │ IP header version │ integer (4 bit) │
2633 │ │ (4) │ │
2634 ├──────────┼─────────────────────┼──────────────────┤
2635 │ │ │ │
2636 │hdrlength │ IP header length │ integer (4 bit) │
2637 │ │ including options │ FIXME scaling │
2638 ├──────────┼─────────────────────┼──────────────────┤
2639 │ │ │ │
2640 │dscp │ Differentiated │ dscp │
2641 │ │ Services Code Point │ │
2642 ├──────────┼─────────────────────┼──────────────────┤
2643 │ │ │ │
2644 │ecn │ Explicit Congestion │ ecn │
2645 │ │ Notification │ │
2646 ├──────────┼─────────────────────┼──────────────────┤
2647 │ │ │ │
2648 │length │ Total packet length │ integer (16 bit) │
2649 ├──────────┼─────────────────────┼──────────────────┤
2650 │ │ │ │
2651 │id │ IP ID │ integer (16 bit) │
2652 ├──────────┼─────────────────────┼──────────────────┤
2653 │ │ │ │
2654 │frag-off │ Fragment offset │ integer (16 bit) │
2655 ├──────────┼─────────────────────┼──────────────────┤
2656 │ │ │ │
2657 │ttl │ Time to live │ integer (8 bit) │
2658 ├──────────┼─────────────────────┼──────────────────┤
2659 │ │ │ │
2660 │protocol │ Upper layer │ inet_proto │
2661 │ │ protocol │ │
2662 ├──────────┼─────────────────────┼──────────────────┤
2663 │ │ │ │
2664 │checksum │ IP header checksum │ integer (16 bit) │
2665 ├──────────┼─────────────────────┼──────────────────┤
2666 │ │ │ │
2667 │saddr │ Source address │ ipv4_addr │
2668 ├──────────┼─────────────────────┼──────────────────┤
2669 │ │ │ │
2670 │daddr │ Destination address │ ipv4_addr │
2671 └──────────┴─────────────────────┴──────────────────┘
2672
2673 ICMP HEADER EXPRESSION
2674 icmp {type | code | checksum | id | sequence | gateway | mtu}
2675
2676 This expression refers to ICMP header fields. When using it in inet,
2677 bridge or netdev families, it will cause an implicit dependency on IPv4
2678 to be created. To match on unusual cases like ICMP over IPv6, one has
2679 to add an explicit meta protocol ip6 match to the rule.
2680
2681 Table 41. ICMP header expression
2682 ┌─────────┬─────────────────────┬──────────────────┐
2683 │Keyword │ Description │ Type │
2684 ├─────────┼─────────────────────┼──────────────────┤
2685 │ │ │ │
2686 │type │ ICMP type field │ icmp_type │
2687 ├─────────┼─────────────────────┼──────────────────┤
2688 │ │ │ │
2689 │code │ ICMP code field │ integer (8 bit) │
2690 ├─────────┼─────────────────────┼──────────────────┤
2691 │ │ │ │
2692 │checksum │ ICMP checksum field │ integer (16 bit) │
2693 ├─────────┼─────────────────────┼──────────────────┤
2694 │ │ │ │
2695 │id │ ID of echo │ integer (16 bit) │
2696 │ │ request/response │ │
2697 ├─────────┼─────────────────────┼──────────────────┤
2698 │ │ │ │
2699 │sequence │ sequence number of │ integer (16 bit) │
2700 │ │ echo │ │
2701 │ │ request/response │ │
2702 ├─────────┼─────────────────────┼──────────────────┤
2703 │ │ │ │
2704 │gateway │ gateway of │ integer (32 bit) │
2705 │ │ redirects │ │
2706 ├─────────┼─────────────────────┼──────────────────┤
2707 │ │ │ │
2708 │mtu │ MTU of path MTU │ integer (16 bit) │
2709 │ │ discovery │ │
2710 └─────────┴─────────────────────┴──────────────────┘
2711
2712 IGMP HEADER EXPRESSION
2713 igmp {type | mrt | checksum | group}
2714
2715 This expression refers to IGMP header fields. When using it in inet,
2716 bridge or netdev families, it will cause an implicit dependency on IPv4
2717 to be created. To match on unusual cases like IGMP over IPv6, one has
2718 to add an explicit meta protocol ip6 match to the rule.
2719
2720 Table 42. IGMP header expression
2721 ┌─────────┬─────────────────────┬──────────────────┐
2722 │Keyword │ Description │ Type │
2723 ├─────────┼─────────────────────┼──────────────────┤
2724 │ │ │ │
2725 │type │ IGMP type field │ igmp_type │
2726 ├─────────┼─────────────────────┼──────────────────┤
2727 │ │ │ │
2728 │mrt │ IGMP maximum │ integer (8 bit) │
2729 │ │ response time field │ │
2730 ├─────────┼─────────────────────┼──────────────────┤
2731 │ │ │ │
2732 │checksum │ IGMP checksum field │ integer (16 bit) │
2733 ├─────────┼─────────────────────┼──────────────────┤
2734 │ │ │ │
2735 │group │ Group address │ integer (32 bit) │
2736 └─────────┴─────────────────────┴──────────────────┘
2737
2738 IPV6 HEADER EXPRESSION
2739 ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
2740
This expression refers to the ipv6 header fields. Use caution with ip6
nexthdr: the value only refers to the next header, i.e. ip6 nexthdr tcp
will only match if the ipv6 packet does not contain any extension
headers. Packets that are fragmented or that contain, for example, a
routing extension header will not be matched. Use meta l4proto instead
if you wish to match the real transport header and ignore any
additional extension headers.
2748
2749 Table 43. IPv6 header expression
2750 ┌──────────┬─────────────────────┬──────────────────┐
2751 │Keyword │ Description │ Type │
2752 ├──────────┼─────────────────────┼──────────────────┤
2753 │ │ │ │
2754 │version │ IP header version │ integer (4 bit) │
2755 │ │ (6) │ │
2756 ├──────────┼─────────────────────┼──────────────────┤
2757 │ │ │ │
2758 │dscp │ Differentiated │ dscp │
2759 │ │ Services Code Point │ │
2760 ├──────────┼─────────────────────┼──────────────────┤
2761 │ │ │ │
2762 │ecn │ Explicit Congestion │ ecn │
2763 │ │ Notification │ │
2764 ├──────────┼─────────────────────┼──────────────────┤
2765 │ │ │ │
2766 │flowlabel │ Flow label │ integer (20 bit) │
2767 ├──────────┼─────────────────────┼──────────────────┤
2768 │ │ │ │
2769 │length │ Payload length │ integer (16 bit) │
2770 ├──────────┼─────────────────────┼──────────────────┤
2771 │ │ │ │
2772 │nexthdr │ Next header protocol │ inet_proto │
2773 ├──────────┼─────────────────────┼──────────────────┤
2774 │ │ │ │
2775 │hoplimit │ Hop limit │ integer (8 bit) │
2776 ├──────────┼─────────────────────┼──────────────────┤
2777 │ │ │ │
2778 │saddr │ Source address │ ipv6_addr │
2779 ├──────────┼─────────────────────┼──────────────────┤
2780 │ │ │ │
2781 │daddr │ Destination address │ ipv6_addr │
2782 └──────────┴─────────────────────┴──────────────────┘
2783
2784 Using ip6 header expressions.
2785
2786 # matching if first extension header indicates a fragment
2787 ip6 nexthdr ipv6-frag
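
The difference between ip6 nexthdr and meta l4proto can be sketched with two rules side by side. The table name example is a placeholder and the counters are only there to make the effect observable.

```nft
table ip6 example {
	chain input {
		type filter hook input priority 0; policy accept;

		# Counts TCP only when TCP directly follows the fixed IPv6 header
		ip6 nexthdr tcp counter

		# Counts TCP even when extension headers precede the TCP header
		meta l4proto tcp counter
	}
}
```

With traffic that carries extension headers, the second counter would be expected to grow while the first does not.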
2788
2789
2790 ICMPV6 HEADER EXPRESSION
2791 icmpv6 {type | code | checksum | parameter-problem | packet-too-big | id | sequence | max-delay}
2792
2793 This expression refers to ICMPv6 header fields. When using it in inet,
2794 bridge or netdev families, it will cause an implicit dependency on IPv6
2795 to be created. To match on unusual cases like ICMPv6 over IPv4, one has
2796 to add an explicit meta protocol ip match to the rule.
2797
2798 Table 44. ICMPv6 header expression
2799 ┌──────────────────┬────────────────────┬──────────────────┐
2800 │Keyword │ Description │ Type │
2801 ├──────────────────┼────────────────────┼──────────────────┤
2802 │ │ │ │
2803 │type │ ICMPv6 type field │ icmpv6_type │
2804 ├──────────────────┼────────────────────┼──────────────────┤
2805 │ │ │ │
2806 │code │ ICMPv6 code field │ integer (8 bit) │
2807 ├──────────────────┼────────────────────┼──────────────────┤
2808 │ │ │ │
2809 │checksum │ ICMPv6 checksum │ integer (16 bit) │
2810 │ │ field │ │
2811 ├──────────────────┼────────────────────┼──────────────────┤
2812 │ │ │ │
2813 │parameter-problem │ pointer to problem │ integer (32 bit) │
2814 ├──────────────────┼────────────────────┼──────────────────┤
2815 │ │ │ │
2816 │packet-too-big │ oversized MTU │ integer (32 bit) │
2817 ├──────────────────┼────────────────────┼──────────────────┤
2818 │ │ │ │
2819 │id │ ID of echo │ integer (16 bit) │
2820 │ │ request/response │ │
2821 ├──────────────────┼────────────────────┼──────────────────┤
2822 │ │ │ │
2823 │sequence │ sequence number of │ integer (16 bit) │
2824 │ │ echo │ │
2825 │ │ request/response │ │
2826 ├──────────────────┼────────────────────┼──────────────────┤
2827 │ │ │ │
2828 │max-delay │ maximum response │ integer (16 bit) │
2829 │ │ delay of MLD │ │
2830 │ │ queries │ │
2831 └──────────────────┴────────────────────┴──────────────────┘
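
A short sketch of icmpv6 matching in the inet family, where the IPv6 dependency is implicit. The table name example is a placeholder, and the type names used here are assumed to be among the symbolic icmpv6_type values.

```nft
table inet example {
	chain input {
		type filter hook input priority 0; policy accept;

		# Count Packet Too Big messages (used by path MTU discovery)
		icmpv6 type packet-too-big counter

		# Accept echo requests and neighbor discovery
		icmpv6 type { echo-request, nd-neighbor-solicit, nd-neighbor-advert } accept
	}
}
```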
2832
2833 TCP HEADER EXPRESSION
2834 tcp {sport | dport | sequence | ackseq | doff | reserved | flags | window | checksum | urgptr}
2835
2836 Table 45. TCP header expression
2837 ┌─────────┬──────────────────┬──────────────────┐
2838 │Keyword │ Description │ Type │
2839 ├─────────┼──────────────────┼──────────────────┤
2840 │ │ │ │
2841 │sport │ Source port │ inet_service │
2842 ├─────────┼──────────────────┼──────────────────┤
2843 │ │ │ │
2844 │dport │ Destination port │ inet_service │
2845 ├─────────┼──────────────────┼──────────────────┤
2846 │ │ │ │
2847 │sequence │ Sequence number │ integer (32 bit) │
2848 ├─────────┼──────────────────┼──────────────────┤
2849 │ │ │ │
2850 │ackseq │ Acknowledgement │ integer (32 bit) │
2851 │ │ number │ │
2852 ├─────────┼──────────────────┼──────────────────┤
2853 │ │ │ │
2854 │doff │ Data offset │ integer (4 bit) │
2855 │ │ │ FIXME scaling │
2856 ├─────────┼──────────────────┼──────────────────┤
2857 │ │ │ │
2858 │reserved │ Reserved area │ integer (4 bit) │
2859 ├─────────┼──────────────────┼──────────────────┤
2860 │ │ │ │
2861 │flags │ TCP flags │ tcp_flag │
2862 ├─────────┼──────────────────┼──────────────────┤
2863 │ │ │ │
2864 │window │ Window │ integer (16 bit) │
2865 ├─────────┼──────────────────┼──────────────────┤
2866 │ │ │ │
2867 │checksum │ Checksum │ integer (16 bit) │
2868 ├─────────┼──────────────────┼──────────────────┤
2869 │ │ │ │
2870 │urgptr │ Urgent pointer │ integer (16 bit) │
2871 └─────────┴──────────────────┴──────────────────┘
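
A minimal sketch of common tcp matches, assuming an inet table named example (a placeholder). The port numbers are arbitrary.

```nft
table inet example {
	chain input {
		type filter hook input priority 0; policy accept;

		# Initial SYN: of fin, syn, rst and ack, only syn is set
		tcp flags & (fin | syn | rst | ack) == syn counter

		# Destination port matched against an anonymous set
		tcp dport { 22, 80, 443 } accept
	}
}
```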
2872
2873 UDP HEADER EXPRESSION
2874 udp {sport | dport | length | checksum}
2875
2876 Table 46. UDP header expression
2877 ┌─────────┬─────────────────────┬──────────────────┐
2878 │Keyword │ Description │ Type │
2879 ├─────────┼─────────────────────┼──────────────────┤
2880 │ │ │ │
2881 │sport │ Source port │ inet_service │
2882 ├─────────┼─────────────────────┼──────────────────┤
2883 │ │ │ │
2884 │dport │ Destination port │ inet_service │
2885 ├─────────┼─────────────────────┼──────────────────┤
2886 │ │ │ │
2887 │length │ Total packet length │ integer (16 bit) │
2888 ├─────────┼─────────────────────┼──────────────────┤
2889 │ │ │ │
2890 │checksum │ Checksum │ integer (16 bit) │
2891 └─────────┴─────────────────────┴──────────────────┘
2892
2893 UDP-LITE HEADER EXPRESSION
2894 udplite {sport | dport | checksum}
2895
2896 Table 47. UDP-Lite header expression
2897 ┌─────────┬──────────────────┬──────────────────┐
2898 │Keyword │ Description │ Type │
2899 ├─────────┼──────────────────┼──────────────────┤
2900 │ │ │ │
2901 │sport │ Source port │ inet_service │
2902 ├─────────┼──────────────────┼──────────────────┤
2903 │ │ │ │
2904 │dport │ Destination port │ inet_service │
2905 ├─────────┼──────────────────┼──────────────────┤
2906 │ │ │ │
2907 │checksum │ Checksum │ integer (16 bit) │
2908 └─────────┴──────────────────┴──────────────────┘
2909
2910 SCTP HEADER EXPRESSION
2911 sctp {sport | dport | vtag | checksum}
2912 sctp chunk CHUNK [ FIELD ]
2913
2914 CHUNK := data | init | init-ack | sack | heartbeat |
2915 heartbeat-ack | abort | shutdown | shutdown-ack | error |
2916 cookie-echo | cookie-ack | ecne | cwr | shutdown-complete
2917 | asconf-ack | forward-tsn | asconf
2918
2919 FIELD := COMMON_FIELD | DATA_FIELD | INIT_FIELD | INIT_ACK_FIELD |
2920 SACK_FIELD | SHUTDOWN_FIELD | ECNE_FIELD | CWR_FIELD |
2921 ASCONF_ACK_FIELD | FORWARD_TSN_FIELD | ASCONF_FIELD
2922
2923 COMMON_FIELD := type | flags | length
2924 DATA_FIELD := tsn | stream | ssn | ppid
2925 INIT_FIELD := init-tag | a-rwnd | num-outbound-streams |
2926 num-inbound-streams | initial-tsn
2927 INIT_ACK_FIELD := INIT_FIELD
2928 SACK_FIELD := cum-tsn-ack | a-rwnd | num-gap-ack-blocks |
2929 num-dup-tsns
2930 SHUTDOWN_FIELD := cum-tsn-ack
2931 ECNE_FIELD := lowest-tsn
2932 CWR_FIELD := lowest-tsn
2933 ASCONF_ACK_FIELD := seqno
2934 FORWARD_TSN_FIELD := new-cum-tsn
2935 ASCONF_FIELD := seqno
2936
2937 Table 48. SCTP header expression
2938 ┌─────────┬──────────────────┬────────────────────┐
2939 │Keyword │ Description │ Type │
2940 ├─────────┼──────────────────┼────────────────────┤
2941 │ │ │ │
2942 │sport │ Source port │ inet_service │
2943 ├─────────┼──────────────────┼────────────────────┤
2944 │ │ │ │
2945 │dport │ Destination port │ inet_service │
2946 ├─────────┼──────────────────┼────────────────────┤
2947 │ │ │ │
2948 │vtag │ Verification Tag │ integer (32 bit) │
2949 ├─────────┼──────────────────┼────────────────────┤
2950 │ │ │ │
2951 │checksum │ Checksum │ integer (32 bit) │
2952 ├─────────┼──────────────────┼────────────────────┤
2953 │ │ │ │
2954 │chunk │ Search chunk in │ without FIELD, │
2955 │ │ packet │ boolean indicating │
2956 │ │ │ existence │
2957 └─────────┴──────────────────┴────────────────────┘
2958
2959 Table 49. SCTP chunk fields
2960 ┌─────────────────────┬───────────────┬─────────────────┬──────────────────┐
2961 │Name │ Width in bits │ Chunk │ Notes │
2962 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2963 │ │ │ │ │
2964 │type │ 8 │ all │ not useful, │
2965 │ │ │ │ defined by chunk │
2966 │ │ │ │ type │
2967 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2968 │ │ │ │ │
2969 │flags │ 8 │ all │ semantics │
2970 │ │ │ │ defined on │
2971 │ │ │ │ per-chunk basis │
2972 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2973 │ │ │ │ │
2974 │length │ 16 │ all │ length of this │
2975 │ │ │ │ chunk in bytes │
2976 │ │ │ │ excluding │
2977 │ │ │ │ padding │
2978 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2979 │ │ │ │ │
2980 │tsn │ 32 │ data │ transmission │
2981 │ │ │ │ sequence number │
2982 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2983 │ │ │ │ │
2984 │stream │ 16 │ data │ stream │
2985 │ │ │ │ identifier │
2986 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2987 │ │ │ │ │
2988 │ssn │ 16 │ data │ stream sequence │
2989 │ │ │ │ number │
2990 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2991 │ │ │ │ │
2992 │ppid │ 32 │ data │ payload protocol │
2993 │ │ │ │ identifier │
2994 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2995 │ │ │ │ │
2996 │init-tag │ 32 │ init, init-ack │ initiate tag │
2997 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2998 │ │ │ │ │
2999 │a-rwnd │ 32 │ init, init-ack, │ advertised │
3000 │ │ │ sack │ receiver window │
3001 │ │ │ │ credit │
3002 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3003 │ │ │ │ │
3004 │num-outbound-streams │ 16 │ init, init-ack │ number of │
3005 │ │ │ │ outbound streams │
3006 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3007 │ │ │ │ │
3008 │num-inbound-streams │ 16 │ init, init-ack │ number of │
3009 │ │ │ │ inbound streams │
3010 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3011 │ │ │ │ │
3012 │initial-tsn │ 32 │ init, init-ack │ initial transmit │
3013 │ │ │ │ sequence number │
3014 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3015 │ │ │ │ │
3016 │cum-tsn-ack │ 32 │ sack, shutdown │ cumulative │
3017 │ │ │ │ transmission │
3018 │ │ │ │ sequence number │
3019 │ │ │ │ acknowledged │
3020 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3021 │ │ │ │ │
3022 │num-gap-ack-blocks │ 16 │ sack │ number of Gap │
3023 │ │ │ │ Ack Blocks │
3024 │ │ │ │ included │
3025 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3026 │ │ │ │ │
3027 │num-dup-tsns │ 16 │ sack │ number of │
3028 │ │ │ │ duplicate │
3029 │ │ │ │ transmission │
3030 │ │ │ │ sequence numbers │
3031 │ │ │ │ received │
3032 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3033 │ │ │ │ │
3034 │lowest-tsn │ 32 │ ecne, cwr │ lowest │
3035 │ │ │ │ transmission │
3036 │ │ │ │ sequence number │
3037 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3038 │ │ │ │ │
3039 │seqno │ 32 │ asconf-ack, │ sequence number │
3040 │ │ │ asconf │ │
3041 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
3042 │ │ │ │ │
3043 │new-cum-tsn │ 32 │ forward-tsn │ new cumulative │
3044 │ │ │ │ transmission │
3045 │ │ │ │ sequence number │
3046 └─────────────────────┴───────────────┴─────────────────┴──────────────────┘
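
The two forms of sctp chunk, with and without FIELD, can be sketched as follows. The table name example is a placeholder, and the rules assume a kernel and nft build with SCTP chunk matching support.

```nft
table inet example {
	chain input {
		type filter hook input priority 0; policy accept;

		# Without FIELD: boolean check for an INIT chunk in the packet
		sctp chunk init exists counter

		# With FIELD: DATA chunk whose stream identifier is 0
		sctp chunk data stream 0 counter
	}
}
```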
3047
3048 DCCP HEADER EXPRESSION
3049 dccp {sport | dport | type}
3050
3051 Table 50. DCCP header expression
3052 ┌────────┬──────────────────┬──────────────┐
3053 │Keyword │ Description │ Type │
3054 ├────────┼──────────────────┼──────────────┤
3055 │ │ │ │
3056 │sport │ Source port │ inet_service │
3057 ├────────┼──────────────────┼──────────────┤
3058 │ │ │ │
3059 │dport │ Destination port │ inet_service │
3060 ├────────┼──────────────────┼──────────────┤
3061 │ │ │ │
3062 │type │ Packet type │ dccp_pkttype │
3063 └────────┴──────────────────┴──────────────┘
3064
3065 AUTHENTICATION HEADER EXPRESSION
3066 ah {nexthdr | hdrlength | reserved | spi | sequence}
3067
3068 Table 51. AH header expression
3069 ┌──────────┬────────────────────┬──────────────────┐
3070 │Keyword │ Description │ Type │
3071 ├──────────┼────────────────────┼──────────────────┤
3072 │ │ │ │
3073 │nexthdr │ Next header │ inet_proto │
3074 │ │ protocol │ │
3075 ├──────────┼────────────────────┼──────────────────┤
3076 │ │ │ │
3077 │hdrlength │ AH Header length │ integer (8 bit) │
3078 ├──────────┼────────────────────┼──────────────────┤
3079 │ │ │ │
3080 │reserved │ Reserved area │ integer (16 bit) │
3081 ├──────────┼────────────────────┼──────────────────┤
3082 │ │ │ │
3083 │spi │ Security Parameter │ integer (32 bit) │
3084 │ │ Index │ │
3085 ├──────────┼────────────────────┼──────────────────┤
3086 │ │ │ │
3087 │sequence │ Sequence number │ integer (32 bit) │
3088 └──────────┴────────────────────┴──────────────────┘
3089
3090 ENCAPSULATING SECURITY PAYLOAD HEADER EXPRESSION
3091 esp {spi | sequence}
3092
3093 Table 52. ESP header expression
3094 ┌─────────┬────────────────────┬──────────────────┐
3095 │Keyword │ Description │ Type │
3096 ├─────────┼────────────────────┼──────────────────┤
3097 │ │ │ │
3098 │spi │ Security Parameter │ integer (32 bit) │
3099 │ │ Index │ │
3100 ├─────────┼────────────────────┼──────────────────┤
3101 │ │ │ │
3102 │sequence │ Sequence number │ integer (32 bit) │
3103 └─────────┴────────────────────┴──────────────────┘
3104
3105 IPCOMP HEADER EXPRESSION
3106 comp {nexthdr | flags | cpi}
3107
3108 Table 53. IPComp header expression
3109 ┌────────┬─────────────────┬──────────────────┐
3110 │Keyword │ Description │ Type │
3111 ├────────┼─────────────────┼──────────────────┤
3112 │ │ │ │
3113 │nexthdr │ Next header │ inet_proto │
3114 │ │ protocol │ │
3115 ├────────┼─────────────────┼──────────────────┤
3116 │ │ │ │
3117 │flags │ Flags │ bitmask │
3118 ├────────┼─────────────────┼──────────────────┤
3119 │ │ │ │
3120 │cpi │ Compression │ integer (16 bit) │
3121 │ │ Parameter Index │ │
3122 └────────┴─────────────────┴──────────────────┘
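
The ah, esp and comp expressions follow the same pattern. A minimal sketch, assuming an inet table named example (a placeholder) and arbitrary SPI and CPI values:

```nft
table inet example {
	chain input {
		type filter hook input priority 0; policy accept;

		# Security Parameter Index of AH and ESP packets
		ah spi 0x1000 counter
		esp spi 0x1000 counter

		# Compression Parameter Index of IPComp packets
		comp cpi 100 counter
	}
}
```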
3123
3124 GRE HEADER EXPRESSION
3125 gre {flags | version | protocol}
3126 gre ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
3127 gre ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
3128
3129 The gre expression is used to match on the gre header fields. This
3130 expression also allows matching on the IPv4 or IPv6 packet
3131 encapsulated in gre.
3132
3133 Table 54. GRE header expression
3134 ┌─────────┬─────────────────────┬──────────────────┐
3135 │Keyword │ Description │ Type │
3136 ├─────────┼─────────────────────┼──────────────────┤
3137 │ │ │ │
3138 │flags │ checksum, routing, │ integer (5 bit) │
3139 │ │ key, sequence and │ │
3140 │ │ strict source route │ │
3141 │ │ flags │ │
3142 ├─────────┼─────────────────────┼──────────────────┤
3143 │ │ │ │
3144 │version │ gre version field, │ integer (3 bit) │
3145 │ │ 0 for GRE and 1 for │ │
3146 │ │ PPTP │ │
3147 ├─────────┼─────────────────────┼──────────────────┤
3148 │ │ │ │
3149 │protocol │ EtherType of │ integer (16 bit) │
3150 │ │ encapsulated packet │ │
3151 └─────────┴─────────────────────┴──────────────────┘
3152
3153 Matching inner IPv4 destination address encapsulated in gre.
3154
3155 netdev filter ingress gre ip daddr 9.9.9.9 counter
3156
3157
3158 GENEVE HEADER EXPRESSION
3159 geneve {vni | flags}
3160 geneve ether {daddr | saddr | type}
3161 geneve vlan {id | dei | pcp | type}
3162 geneve ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
3163 geneve ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
3164 geneve tcp {sport | dport | sequence | ackseq | doff | reserved | flags | window | checksum | urgptr}
3165 geneve udp {sport | dport | length | checksum}
3166
3167 The geneve expression is used to match on the geneve header fields. The
3168 geneve header encapsulates an Ethernet frame within a UDP packet. This
3169 expression requires that you restrict the matching to UDP packets
3170 (usually at port 6081, the IANA-assigned port).
3171
3172 Table 55. GENEVE header expression
3173 ┌─────────┬─────────────────────┬──────────────────┐
3174 │Keyword │ Description │ Type │
3175 ├─────────┼─────────────────────┼──────────────────┤
3176 │ │ │ │
3177 │protocol │ EtherType of │ integer (16 bit) │
3178 │ │ encapsulated packet │ │
3179 ├─────────┼─────────────────────┼──────────────────┤
3180 │ │ │ │
3181 │vni │ Virtual Network ID │ integer (24 bit) │
3182 │ │ (VNI) │ │
3183 └─────────┴─────────────────────┴──────────────────┘
3184
3185 Matching inner TCP destination port encapsulated in geneve.
3186
3187 netdev filter ingress udp dport 6081 geneve tcp dport 80 counter
3188
3189
3190 GRETAP HEADER EXPRESSION
3191 gretap {vni | flags}
3192 gretap ether {daddr | saddr | type}
3193 gretap vlan {id | dei | pcp | type}
3194 gretap ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
3195 gretap ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
3196 gretap tcp {sport | dport | sequence | ackseq | doff | reserved | flags | window | checksum | urgptr}
3197 gretap udp {sport | dport | length | checksum}
3198
3199 The gretap expression is used to match on the encapsulated ethernet
3200 frame within the gre header. Use the gre expression to match on the gre
3201 header fields.
3202
3203 Matching inner TCP destination port encapsulated in gretap.
3204
3205 netdev filter ingress gretap tcp dport 80 counter
3206
3207
3208 VXLAN HEADER EXPRESSION
3209 vxlan {vni | flags}
3210 vxlan ether {daddr | saddr | type}
3211 vxlan vlan {id | dei | pcp | type}
3212 vxlan ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
3213 vxlan ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
3214 vxlan tcp {sport | dport | sequence | ackseq | doff | reserved | flags | window | checksum | urgptr}
3215 vxlan udp {sport | dport | length | checksum}
3216
3217 The vxlan expression is used to match on the vxlan header fields. The
3218 vxlan header encapsulates an Ethernet frame within a UDP packet. This
3219 expression requires that you restrict the matching to UDP packets
3220 (usually at port 4789, the IANA-assigned port).
3221
3222 Table 56. VXLAN header expression
3223 ┌────────┬────────────────────┬──────────────────┐
3224 │Keyword │ Description │ Type │
3225 ├────────┼────────────────────┼──────────────────┤
3226 │ │ │ │
3227 │flags │ vxlan flags │ integer (8 bit) │
3228 ├────────┼────────────────────┼──────────────────┤
3229 │ │ │ │
3230 │vni │ Virtual Network ID │ integer (24 bit) │
3231 │ │ (VNI) │ │
3232 └────────┴────────────────────┴──────────────────┘
3233
3234 Matching inner TCP destination port encapsulated in vxlan.
3235
3236 netdev filter ingress udp dport 4789 vxlan tcp dport 80 counter
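
The same pattern applies to the tunnel header itself and to other inner headers. The following is a sketch only: the table name example, the chain name tunnels and the device name eth0 are placeholders, and the rules assume a kernel and nft build that support tunnel header matching.

```nft
table netdev example {
	chain tunnels {
		type filter hook ingress device "eth0" priority 0; policy accept;

		# Restrict to the VXLAN UDP port first, then match the VNI
		udp dport 4789 vxlan vni 100 counter

		# Inner IPv4 source address of the encapsulated frame
		udp dport 4789 vxlan ip saddr 10.141.11.0/24 counter
	}
}
```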
3237
3238
3239 ARP HEADER EXPRESSION
3240 arp {htype | ptype | hlen | plen | operation | saddr { ip | ether } | daddr { ip | ether }}
3241
3242 Table 57. ARP header expression
3243 ┌────────────┬─────────────────────┬──────────────────┐
3244 │Keyword │ Description │ Type │
3245 ├────────────┼─────────────────────┼──────────────────┤
3246 │ │ │ │
3247 │htype │ ARP hardware type │ integer (16 bit) │
3248 ├────────────┼─────────────────────┼──────────────────┤
3249 │ │ │ │
3250 │ptype │ EtherType │ ether_type │
3251 ├────────────┼─────────────────────┼──────────────────┤
3252 │ │ │ │
3253 │hlen │ Hardware address │ integer (8 bit) │
3254 │ │ len │ │
3255 ├────────────┼─────────────────────┼──────────────────┤
3256 │ │ │ │
3257 │plen │ Protocol address │ integer (8 bit) │
3258 │ │ len │ │
3259 ├────────────┼─────────────────────┼──────────────────┤
3260 │ │ │ │
3261 │operation │ Operation │ arp_op │
3262 ├────────────┼─────────────────────┼──────────────────┤
3263 │ │ │ │
3264 │saddr ether │ Ethernet sender │ ether_addr │
3265 │ │ address │ │
3266 ├────────────┼─────────────────────┼──────────────────┤
3267 │ │ │ │
3268 │daddr ether │ Ethernet target │ ether_addr │
3269 │ │ address │ │
3270 ├────────────┼─────────────────────┼──────────────────┤
3271 │ │ │ │
3272 │saddr ip │ IPv4 sender address │ ipv4_addr │
3273 ├────────────┼─────────────────────┼──────────────────┤
3274 │ │ │ │
3275 │daddr ip │ IPv4 target address │ ipv4_addr │
3276 └────────────┴─────────────────────┴──────────────────┘
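
A minimal sketch of arp matching in the arp family. The table name example is a placeholder and the addresses are arbitrary documentation values.

```nft
table arp example {
	chain input {
		type filter hook input priority 0; policy accept;

		# ARP requests asking for 192.0.2.1
		arp operation request arp daddr ip 192.0.2.1 counter

		# ARP replies sent from a given hardware address
		arp operation reply arp saddr ether 00:11:22:33:44:55 counter
	}
}
```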
3277
3278 RAW PAYLOAD EXPRESSION
3279 @base,offset,length
3280
3281 The raw payload expression loads length bits starting at offset bits
3282 into the header selected by base. Bit 0 refers to the very first bit; in
3283 the C programming language, this corresponds to the topmost bit, i.e.
3284 0x80 in case of an octet. Raw payload expressions are useful to match
3285 headers that do not have a human-readable template expression yet. nft
3286 does not add dependencies for raw payload expressions. To match, e.g.,
3287 protocol fields of a transport header with protocol number 5, you need
3288 to manually exclude packets that have a different transport header, for
3289 instance by using meta l4proto 5 before the raw expression.
3290
3291 Table 58. Supported payload protocol bases
3292 ┌─────┬─────────────────────────┐
3293 │Base │ Description │
3294 ├─────┼─────────────────────────┤
3295 │ │ │
3296 │ll │ Link layer, for example │
3297 │ │ the Ethernet header │
3298 ├─────┼─────────────────────────┤
3299 │ │ │
3300 │nh │ Network header, for │
3301 │ │ example IPv4 or IPv6 │
3302 ├─────┼─────────────────────────┤
3303 │ │ │
3304 │th │ Transport Header, for │
3305 │ │ example TCP │
3306 ├─────┼─────────────────────────┤
3307 │ │ │
3308 │ih │ Inner Header / Payload, │
3309 │ │ i.e. after the L4 │
3310 │ │ transport level header │
3311 └─────┴─────────────────────────┘
3312
3313 Matching destination port of both UDP and TCP.
3314
3315 inet filter input meta l4proto {tcp, udp} @th,16,16 { 53, 80 }
3316
3317 The above can also be written as
3318
3319 inet filter input meta l4proto {tcp, udp} th dport { 53, 80 }
3320
3321 This is more convenient, but like the raw expression notation no
3322 dependencies are created or checked. It is the user's responsibility to
3323 restrict matching to those header types that have a notion of ports.
3324 Otherwise, rules using raw expressions will erroneously match unrelated
3325 packets, e.g. misinterpreting the SPI field of ESP packets as a port.
3326
3327 Rewrite arp packet target hardware address if target protocol address
3328 matches a given address.
3329
3330 input meta iifname enp2s0 arp ptype 0x0800 arp htype 1 arp hlen 6 arp plen 4 @nh,192,32 0xc0a88f10 @nh,144,48 set 0x112233445566 accept
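
Offsets are counted in bits from the start of the chosen base. In the ARP rule above, for Ethernet and IPv4 the fixed fields take 64 bits, followed by the 48-bit sender hardware address and the 32-bit sender protocol address, which places the target hardware address at bit 144 and the target protocol address at bit 192. The same arithmetic for the IPv4 header is sketched below, pairing each raw expression with its named equivalent. The table name example is a placeholder; the ip family is used so that the network header is known to be IPv4.

```nft
table ip example {
	chain input {
		type filter hook input priority 0; policy accept;

		# TTL is byte 8 of the IPv4 header: offset 8 * 8 = 64, length 8
		@nh,64,8 64 counter
		ip ttl 64 counter

		# Source address is bytes 12-15: offset 96, length 32
		# 0xc0000201 is 192.0.2.1
		@nh,96,32 0xc0000201 counter
		ip saddr 192.0.2.1 counter
	}
}
```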
3331
3332
3333 EXTENSION HEADER EXPRESSIONS
3334 Extension header expressions refer to data from variable-sized protocol
3335 headers, such as IPv6 extension headers, TCP options and IPv4 options.
3336
3337 nftables currently supports matching (finding) a given IPv6 extension
3338 header, TCP option or IPv4 option.
3339
3340 hbh {nexthdr | hdrlength}
3341 frag {nexthdr | frag-off | more-fragments | id}
3342 rt {nexthdr | hdrlength | type | seg-left}
3343 dst {nexthdr | hdrlength}
3344 mh {nexthdr | hdrlength | checksum | type}
3345 srh {flags | tag | sid | seg-left}
3346 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp} tcp_option_field
3347 ip option { lsrr | ra | rr | ssrr } ip_option_field
3348
3349 The following syntaxes are valid only in a relational expression with
3350 a boolean type on the right-hand side, for checking header existence:
3351
3352 exthdr {hbh | frag | rt | dst | mh}
3353 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp}
3354 ip option { lsrr | ra | rr | ssrr }
3355
3356 Table 59. IPv6 extension headers
3357 ┌────────┬────────────────────────┐
3358 │Keyword │ Description │
3359 ├────────┼────────────────────────┤
3360 │ │ │
3361 │hbh │ Hop by Hop │
3362 ├────────┼────────────────────────┤
3363 │ │ │
3364 │rt │ Routing Header │
3365 ├────────┼────────────────────────┤
3366 │ │ │
3367 │frag │ Fragmentation header │
3368 ├────────┼────────────────────────┤
3369 │ │ │
3370 │dst │ dst options │
3371 ├────────┼────────────────────────┤
3372 │ │ │
3373 │mh │ Mobility Header │
3374 ├────────┼────────────────────────┤
3375 │ │ │
3376 │srh │ Segment Routing Header │
3377 └────────┴────────────────────────┘
3378
3379 Table 60. TCP Options
3380 ┌──────────┬─────────────────────┬─────────────────────┐
3381 │Keyword │ Description │ TCP option fields │
3382 ├──────────┼─────────────────────┼─────────────────────┤
3383 │ │ │ │
3384 │eol │ End of option list │ - │
3385 ├──────────┼─────────────────────┼─────────────────────┤
3386 │ │ │ │
3387 │nop │ 1 Byte TCP Nop │ - │
3388 │ │ padding option │ │
3389 ├──────────┼─────────────────────┼─────────────────────┤
3390 │ │ │ │
3391 │maxseg │ TCP Maximum Segment │ length, size │
3392 │ │ Size │ │
3393 ├──────────┼─────────────────────┼─────────────────────┤
3394 │ │ │ │
3395 │window │ TCP Window Scaling │ length, count │
3396 ├──────────┼─────────────────────┼─────────────────────┤
3397 │ │ │ │
3398 │sack-perm │ TCP SACK permitted │ length │
3399 ├──────────┼─────────────────────┼─────────────────────┤
3400 │ │ │ │
3401 │sack │ TCP Selective │ length, left, right │
3402 │ │ Acknowledgement │ │
3403 │ │ (alias of block 0) │ │
3404 ├──────────┼─────────────────────┼─────────────────────┤
3405 │ │ │ │
3406 │sack0 │ TCP Selective │ length, left, right │
3407 │ │ Acknowledgement │ │
3408 │ │ (block 0) │ │
3409 ├──────────┼─────────────────────┼─────────────────────┤
3410 │ │ │ │
3411 │sack1 │ TCP Selective │ length, left, right │
3412 │ │ Acknowledgement │ │
3413 │ │ (block 1) │ │
3414 ├──────────┼─────────────────────┼─────────────────────┤
3415 │ │ │ │
3416 │sack2 │ TCP Selective │ length, left, right │
3417 │ │ Acknowledgement │ │
3418 │ │ (block 2) │ │
3419 ├──────────┼─────────────────────┼─────────────────────┤
3420 │ │ │ │
3421 │sack3 │ TCP Selective │ length, left, right │
3422 │ │ Acknowledgement │ │
3423 │ │ (block 3) │ │
3424 ├──────────┼─────────────────────┼─────────────────────┤
3425 │ │ │ │
3426 │timestamp │ TCP Timestamps │ length, tsval, │
3427 │ │ │ tsecr │
3428 └──────────┴─────────────────────┴─────────────────────┘
3429
3430 TCP option matching also supports raw expression syntax to access
3431 arbitrary options:
3432
3433 tcp option
3434
3435 tcp option @number,offset,length
3436
3437 Table 61. IP Options
3438 ┌────────┬─────────────────────┬─────────────────────┐
3439 │Keyword │ Description │ IP option fields │
3440 ├────────┼─────────────────────┼─────────────────────┤
3441 │ │ │ │
3442 │lsrr │ Loose Source Route │ type, length, ptr, │
3443 │ │ │ addr │
3444 ├────────┼─────────────────────┼─────────────────────┤
3445 │ │ │ │
3446 │ra │ Router Alert │ type, length, value │
3447 ├────────┼─────────────────────┼─────────────────────┤
3448 │ │ │ │
3449 │rr │ Record Route │ type, length, ptr, │
3450 │ │ │ addr │
3451 ├────────┼─────────────────────┼─────────────────────┤
3452 │ │ │ │
3453 │ssrr │ Strict Source Route │ type, length, ptr, │
3454 │ │ │ addr │
3455 └────────┴─────────────────────┴─────────────────────┘
3456
3457 Finding TCP options.
3458
3459 filter input tcp option sack-perm exists counter
3460
3461 Matching TCP options.
3462
3463 filter input tcp option maxseg size lt 536
3464
3465 Matching IPv6 extension headers.
3466
3467 ip6 filter input frag more-fragments 1 counter
3468
3469 Finding IP options.
3470
3471 filter input ip option lsrr exists counter
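
The two syntaxes, existence check and field match, can be sketched together. The table name example is a placeholder and the values are arbitrary; the ip6 family is used because the exthdr and rt rules refer to IPv6 extension headers.

```nft
table ip6 example {
	chain input {
		type filter hook input priority 0; policy accept;

		# Existence only: boolean on the right-hand side
		exthdr frag exists counter
		tcp option timestamp exists counter

		# Field match: a field of the header or option is compared
		rt type 0 counter
		tcp option window count 14 counter
	}
}
```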
3472
3473
3474 CONNTRACK EXPRESSIONS
3475 Conntrack expressions refer to metadata of the connection tracking
3476 entry associated with a packet.
3477
3478 There are three types of conntrack expressions. Some conntrack
3479 expressions require the flow direction before the conntrack key, others
3480 must be used directly because they are direction agnostic. The packets,
3481 bytes and avgpkt keywords can be used with or without a direction. If
3482 the direction is omitted, the sum of the original and the reply
3483 direction is returned. The zone keyword also accepts an optional
3484 direction: if one is given, the zone is only matched if the zone id is
3485 tied to the given direction.
3486
3487 ct {state | direction | status | mark | expiration | helper | label | count | id}
3488 ct [original | reply] {l3proto | protocol | bytes | packets | avgpkt | zone}
3489 ct {original | reply} {proto-src | proto-dst}
3490 ct {original | reply} {ip | ip6} {saddr | daddr}
3491
3492 The conntrack-specific types in this table are described in the
3493 sub-section CONNTRACK TYPES above.
3494
3495 Table 62. Conntrack expressions
3496 ┌───────────┬─────────────────────┬─────────────────────┐
3497 │Keyword │ Description │ Type │
3498 ├───────────┼─────────────────────┼─────────────────────┤
3499 │ │ │ │
3500 │state │ State of the │ ct_state │
3501 │ │ connection │ │
3502 ├───────────┼─────────────────────┼─────────────────────┤
3503 │ │ │ │
3504 │direction │ Direction of the │ ct_dir │
3505 │ │ packet relative to │ │
3506 │ │ the connection │ │
3507 ├───────────┼─────────────────────┼─────────────────────┤
3508 │ │ │ │
3509 │status │ Status of the │ ct_status │
3510 │ │ connection │ │
3511 ├───────────┼─────────────────────┼─────────────────────┤
3512 │ │ │ │
3513 │mark │ Connection mark │ mark │
3514 ├───────────┼─────────────────────┼─────────────────────┤
3515 │ │ │ │
3516 │expiration │ Connection │ time │
3517 │ │ expiration time │ │
3518 ├───────────┼─────────────────────┼─────────────────────┤
3519 │ │ │ │
3520 │helper │ Helper associated │ string │
3521 │ │ with the connection │ │
3522 ├───────────┼─────────────────────┼─────────────────────┤
3523 │ │ │ │
3524 │label │ Connection tracking │ ct_label │
3525 │ │ label bit or │ │
3526 │ │ symbolic name │ │
3527 │ │ defined in │ │
3528 │ │ connlabel.conf in │ │
3529 │ │ the nftables │ │
3530 │ │ include path │ │
3531 ├───────────┼─────────────────────┼─────────────────────┤
3532 │ │ │ │
3533 │l3proto │ Layer 3 protocol of │ nf_proto │
3534 │ │ the connection │ │
3535 ├───────────┼─────────────────────┼─────────────────────┤
3536 │ │ │ │
3537 │saddr │ Source address of │ ipv4_addr/ipv6_addr │
3538 │ │ the connection for │ │
3539 │ │ the given direction │ │
3540 ├───────────┼─────────────────────┼─────────────────────┤
3541 │ │ │ │
3542 │daddr │ Destination address │ ipv4_addr/ipv6_addr │
3543 │ │ of the connection │ │
3544 │ │ for the given │ │
3545 │ │ direction │ │
3546 ├───────────┼─────────────────────┼─────────────────────┤
3547 │ │ │ │
3548 │protocol │ Layer 4 protocol of │ inet_proto │
3549 │ │ the connection for │ │
3550 │ │ the given direction │ │
3551 ├───────────┼─────────────────────┼─────────────────────┤
3552 │ │ │ │
3553 │proto-src │ Layer 4 protocol │ integer (16 bit) │
3554 │ │ source for the │ │
3555 │ │ given direction │ │
3556 ├───────────┼─────────────────────┼─────────────────────┤
3557 │ │ │ │
3558 │proto-dst │ Layer 4 protocol │ integer (16 bit) │
3559 │ │ destination for the │ │
3560 │ │ given direction │ │
3561 ├───────────┼─────────────────────┼─────────────────────┤
3562 │ │ │ │
3563 │packets │ packet count seen │ integer (64 bit) │
3564 │ │ in the given │ │
3565 │ │ direction or sum of │ │
3566 │ │ original and reply │ │
3567 ├───────────┼─────────────────────┼─────────────────────┤
3568 │ │ │ │
3569 │bytes │ byte count seen, │ integer (64 bit) │
3570 │ │ see description for │ │
3571 │ │ packets keyword │ │
3572 ├───────────┼─────────────────────┼─────────────────────┤
3573 │ │ │ │
3574 │avgpkt │ average bytes per │ integer (64 bit) │
3575 │ │ packet, see │ │
3576 │ │ description for │ │
3577 │ │ packets keyword │ │
3578 ├───────────┼─────────────────────┼─────────────────────┤
3579 │ │ │ │
3580 │zone │ conntrack zone │ integer (16 bit) │
3581 ├───────────┼─────────────────────┼─────────────────────┤
3582 │ │ │ │
3583 │count │ number of current │ integer (32 bit) │
3584 │ │ connections │ │
3585 ├───────────┼─────────────────────┼─────────────────────┤
3586 │ │ │ │
3587 │id │ Connection id │ ct_id │
3588 └───────────┴─────────────────────┴─────────────────────┘
3589
3590 restrict the number of parallel connections to a server.
3591
3592 nft add set filter ssh_flood '{ type ipv4_addr; flags dynamic; }'
3593 nft add rule filter input tcp dport 22 add @ssh_flood '{ ip saddr ct count over 2 }' reject
3594
3595
3597 Statements represent actions to be performed. They can alter control
3598 flow (return, jump to a different chain, accept or drop the packet) or
3599 can perform actions, such as logging, rejecting a packet, etc.
3600
3601 Statements come in two kinds. Terminal statements unconditionally
3602 terminate evaluation of the current rule; non-terminal statements
3603 either only conditionally or never terminate evaluation of the current
3604 rule, in other words, they are passive from the ruleset evaluation
3605 perspective. There can be an arbitrary number of non-terminal
3606 statements in a rule, but only a single terminal statement as the final
3607 statement.
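
As a minimal sketch of this ordering, the rule below places two
non-terminal statements (counter, log) ahead of a single terminal
statement (accept). The table name, chain name and log prefix are
illustrative assumptions.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy drop;
            # counter and log never end rule evaluation; accept does,
            # so it has to be the final statement
            tcp dport 22 counter log prefix "ssh: " accept
        }
    }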
3608
3609 VERDICT STATEMENT
3610 The verdict statement alters control flow in the ruleset and issues
3611 policy decisions for packets.
3612
3613 {accept | drop | queue | continue | return}
3614 {jump | goto} chain
3615
3616 accept and drop are absolute verdicts — they terminate ruleset
3617 evaluation immediately.
3618
3619
3620 accept Terminate ruleset
3621 evaluation and accept the
3622 packet. The packet can
3623 still be dropped later by
3624 another hook, for instance
3625 accept in the forward hook
3626 still allows one to drop
3627 the packet later in the
3628 postrouting hook, or
3629 another forward base chain
3630 that has a higher priority
3631 number and is evaluated
3632 afterwards in the
3633 processing pipeline.
3634
3635 drop Terminate ruleset
3636 evaluation and drop the
3637 packet. The drop occurs
3638 instantly, no further
3639 chains or hooks are
3640 evaluated. It is not
3641 possible to accept the
3642 packet in a later chain
3643 again, as those are not
3644 evaluated anymore for the
3645 packet.
3646
3647 queue Terminate ruleset
3648 evaluation and queue the
3649 packet to userspace.
3650 Userspace must provide a
3651 drop or accept verdict. In
3652 case of accept, processing
3653 resumes with the next base
3654 chain hook, not the rule
3655 following the queue
3656 verdict.
3657
3658 continue Continue ruleset
3659 evaluation with the next
3660 rule. This is the default
3661 behaviour in case a rule
3662 issues no verdict.
3663
3664 return Return from the current
3665 chain and continue
3666 evaluation at the next
3667 rule in the last chain. If
3668 issued in a base chain, it
3669 is equivalent to the base
3670 chain policy.
3671
3672 jump chain Continue evaluation at the
3673 first rule in chain. The
3674 current position in the
3675 ruleset is pushed to a
3676 call stack and evaluation
3677 will continue there when
3678 the new chain is entirely
3679 evaluated or a return
3680 verdict is issued. In case
3681 an absolute verdict is
3682 issued by a rule in the
3683 chain, ruleset evaluation
3684 terminates immediately and
3685 the specific action is
3686 taken.
3687
3688 goto chain Similar to jump, but the
3689 current position is not
3690 pushed to the call stack,
3691 meaning that after the new
3692 chain evaluation will
3693 continue at the last chain
3694 instead of the one
3695 containing the goto
3696 statement.
3697
3698
3699 Using verdict statements.
3700
3701 # process packets from eth0 and the internal network in from_lan
3702 # chain, drop all packets from eth0 with different source addresses.
3703
3704 filter input iif eth0 ip saddr 192.168.0.0/24 jump from_lan
3705 filter input iif eth0 drop
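
The difference between jump and goto can be sketched as follows. The
chain name tcp_checks and the log prefix are illustrative assumptions.

    table inet filter {
        chain tcp_checks {
            tcp dport 22 accept
            # reaching the end of this chain acts like return
        }
        chain input {
            type filter hook input priority filter; policy drop;
            # jump: TCP packets not accepted in tcp_checks come back
            # here and continue with the log rule
            meta l4proto tcp jump tcp_checks
            log prefix "not accepted: "
        }
    }

With goto in place of jump, TCP packets that reach the end of
tcp_checks would not come back to the log rule. Since input is a base
chain, its policy would be applied to them directly.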
3706
3707
3708 PAYLOAD STATEMENT
3709 payload_expression set value
3710
3711 The payload statement alters packet content. It can be used for example
3712 to set the IP DSCP (diffserv) header field or IPv6 flow labels.
3713
3714 route some packets instead of bridging.
3715
3716 # redirect tcp:http from 192.168.0.0/16 to local machine for routing instead of bridging
3717 # assumes 00:11:22:33:44:55 is local MAC address.
3718 bridge input meta iif eth0 ip saddr 192.168.0.0/16 tcp dport 80 meta pkttype set unicast ether daddr set 00:11:22:33:44:55
3719
3720 Set IPv4 DSCP header field.
3721
3722 ip forward ip dscp set 42
3723
3724
3725 EXTENSION HEADER STATEMENT
3726 extension_header_expression set value
3727
3728 The extension header statement alters packet content in variable-sized
3729 headers. This can currently be used to alter the TCP Maximum segment
3730 size of packets, similar to the TCPMSS target in iptables.
3731
3732 change tcp mss.
3733
3734 tcp flags syn tcp option maxseg size set 1360
3735 # set a size based on route information:
3736 tcp flags syn tcp option maxseg size set rt mtu
3737
3738 You can also remove tcp options via the reset keyword.
3739
3740 remove tcp option.
3741
3742 tcp flags syn reset tcp option sack-perm
3743
3744
3745 LOG STATEMENT
3746 log [prefix quoted_string] [level syslog-level] [flags log-flags]
3747 log group nflog_group [prefix quoted_string] [queue-threshold value] [snaplen size]
3748 log level audit
3749
3750 The log statement enables logging of matching packets. When this
3751 statement is used from a rule, the Linux kernel will print some
3752 information on all matching packets, such as header fields, via the
3753 kernel log (where it can be read with dmesg(1) or read in the syslog).
3754
3755 In the second form of invocation (if nflog_group is specified), the
3756 Linux kernel will pass the packet to nfnetlink_log which will send the
3757 log through a netlink socket to the specified group. One userspace
3758 process may subscribe to the group to receive the logs; see ulogd(8)
3759 for the Netfilter userspace log daemon and the libnetfilter_log
3760 documentation for details in case you would like to develop a custom
3761 program to digest your logs.
3762
3763 In the third form of invocation (if level audit is specified), the
3764 Linux kernel writes a message into the audit buffer suitably formatted
3765 for reading with auditd. Therefore no further formatting options (such
3766 as prefix or flags) are allowed in this mode.
3767
3768 This is a non-terminating statement, so the rule evaluation continues
3769 after the packet is logged.
3770
3771 Table 63. log statement options
3772 ┌────────────────┬─────────────────────┬───────────────────┐
3773 │Keyword │ Description │ Type │
3774 ├────────────────┼─────────────────────┼───────────────────┤
3775 │ │ │ │
3776 │prefix │ Log message prefix │ quoted string │
3777 ├────────────────┼─────────────────────┼───────────────────┤
3778 │ │ │ │
3779 │level │ Syslog level of │ string: emerg, │
3780 │ │ logging │ alert, crit, err, │
3781 │ │ │ warn [default], │
3782 │ │ │ notice, info, │
3783 │ │ │ debug, audit │
3784 ├────────────────┼─────────────────────┼───────────────────┤
3785 │ │ │ │
3786 │group │ NFLOG group to send │ unsigned integer │
3787 │ │ messages to │ (16 bit) │
3788 ├────────────────┼─────────────────────┼───────────────────┤
3789 │ │ │ │
3790 │snaplen │ Length of packet │ unsigned integer │
3791 │ │ payload to include │ (32 bit) │
3792 │ │ in netlink message │ │
3793 ├────────────────┼─────────────────────┼───────────────────┤
3794 │ │ │ │
3795 │queue-threshold │ Number of packets │ unsigned integer │
3796 │ │ to queue inside the │ (32 bit) │
3797 │ │ kernel before │ │
3798 │ │ sending them to │ │
3799 │ │ userspace │ │
3800 └────────────────┴─────────────────────┴───────────────────┘
3801
3802 Table 64. log-flags
3803 ┌─────────────┬───────────────────────────┐
3804 │Flag │ Description │
3805 ├─────────────┼───────────────────────────┤
3806 │ │ │
3807 │tcp sequence │ Log TCP sequence numbers. │
3808 ├─────────────┼───────────────────────────┤
3809 │ │ │
3810 │tcp options │ Log options from the TCP │
3811 │ │ packet header. │
3812 ├─────────────┼───────────────────────────┤
3813 │ │ │
3814 │ip options │ Log options from the │
3815 │ │ IP/IPv6 packet header. │
3816 ├─────────────┼───────────────────────────┤
3817 │ │ │
3818 │skuid │ Log the userid of the │
3819 │ │ process which generated │
3820 │ │ the packet. │
3821 ├─────────────┼───────────────────────────┤
3822 │ │ │
3823 │ether │ Decode MAC addresses and │
3824 │ │ protocol. │
3825 ├─────────────┼───────────────────────────┤
3826 │ │ │
3827 │all │ Enable all log flags │
3828 │ │ listed above. │
3829 └─────────────┴───────────────────────────┘
3830
3831 Using log statement.
3832
3833 # log the UID which generated the packet and ip options
3834 ip filter output log flags skuid flags ip options
3835
3836 # log the tcp sequence numbers and tcp options from the TCP packet
3837 ip filter output log flags tcp sequence,options
3838
3839 # enable all supported log flags
3840 ip6 filter output log flags all
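
The three forms of invocation could look as follows. This is a sketch
only: table name, ports, prefixes and the group number are chosen for
illustration.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # first form: kernel log with a prefix and a syslog level
            tcp dport 22 ct state new log prefix "new ssh: " level info
            # second form: pass the first 128 bytes to NFLOG group 2
            udp dport 53 log group 2 prefix "dns: " snaplen 128
            # third form: audit record, prefix and flags not allowed
            tcp dport 23 log level audit
        }
    }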
3841
3842
3843 REJECT STATEMENT
3844 reject [ with REJECT_WITH ]
3845
3846 REJECT_WITH := icmp icmp_code |
3847 icmpv6 icmpv6_code |
3848 icmpx icmpx_code |
3849 tcp reset
3850
3851 A reject statement is used to send back an error packet in response to
3852 the matched packet. Otherwise it is equivalent to drop, so it is a
3853 terminating statement, ending rule traversal. This statement is only
3854 valid in base chains using the input, forward or output hooks, and
3855 user-defined chains which are only called from those chains.
3856
3857 Table 65. different ICMP reject variants are meant for use in different
3858 table families
3859 ┌────────┬────────┬─────────────┐
3860 │Variant │ Family │ Type │
3861 ├────────┼────────┼─────────────┤
3862 │ │ │ │
3863 │icmp │ ip │ icmp_code │
3864 ├────────┼────────┼─────────────┤
3865 │ │ │ │
3866 │icmpv6 │ ip6 │ icmpv6_code │
3867 ├────────┼────────┼─────────────┤
3868 │ │ │ │
3869 │icmpx │ inet │ icmpx_code │
3870 └────────┴────────┴─────────────┘
3871
3872 For a description of the different types and a list of supported
3873 keywords refer to DATA TYPES section above. The common default reject
3874 value is port-unreachable.
3875
3876 Note that in bridge family, reject statement is only allowed in base
3877 chains which hook into input or prerouting.
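
A sketch of the reject variants in an inet table, following the
synopsis above. Ports and addresses are illustrative assumptions; some
older nft releases expect the keyword type before the ICMP code (e.g.
icmp type host-unreachable).

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # answer TCP with a reset instead of an ICMP error
            tcp dport 113 reject with tcp reset
            # family-specific ICMP codes
            ip saddr 192.0.2.0/24 reject with icmp host-unreachable
            ip6 saddr 2001:db8::/32 reject with icmpv6 admin-prohibited
            # icmpx picks the matching ICMP or ICMPv6 code
            udp dport 161 reject with icmpx admin-prohibited
            # no argument: the default reject value is used
            udp dport 69 reject
        }
    }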
3878
3879 COUNTER STATEMENT
3880 A counter statement sets the hit count of packets along with the number
3881 of bytes.
3882
3883 counter packets number bytes number
3884 counter { packets number | bytes number }
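
A short sketch of both uses, with illustrative ports and start values.
Without arguments the counter starts at zero; with arguments it starts
from the given values.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # count matching packets and bytes, then accept
            tcp dport 80 counter accept
            # start counting from explicit values
            tcp dport 443 counter packets 10 bytes 1200 accept
        }
    }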
3885
3886 CONNTRACK STATEMENT
3887 The conntrack statement can be used to set the conntrack mark and
3888 conntrack labels.
3889
3890 ct {mark | event | label | zone} set value
3891
3892 The ct statement sets meta data associated with a connection. The zone
3893 id has to be assigned before a conntrack lookup takes place, i.e. this
3894 has to be done in prerouting and possibly output (if locally generated
3895 packets need to be placed in a distinct zone), with a hook priority of
3896 raw (-300).
3897
3898 Unlike iptables, where the helper assignment happens in the raw table,
3899 the helper needs to be assigned after a conntrack entry has been found,
3900 i.e. it will not work when used with hook priorities equal to or before
3901 -200.
3902
3903 Table 66. Conntrack statement types
3904 ┌────────┬─────────────────────┬──────────────────┐
3905 │Keyword │ Description │ Value │
3906 ├────────┼─────────────────────┼──────────────────┤
3907 │ │ │ │
3908 │event │ conntrack event │ bitmask, integer │
3909 │ │ bits │ (32 bit) │
3910 ├────────┼─────────────────────┼──────────────────┤
3911 │ │ │ │
3912 │helper │ name of ct helper │ quoted string │
3913 │ │ object to assign to │ │
3914 │ │ the connection │ │
3915 ├────────┼─────────────────────┼──────────────────┤
3916 │ │ │ │
3917 │mark │ Connection tracking │ mark │
3918 │ │ mark │ │
3919 ├────────┼─────────────────────┼──────────────────┤
3920 │ │ │ │
3921 │label │ Connection tracking │ label │
3922 │ │ label │ │
3923 ├────────┼─────────────────────┼──────────────────┤
3924 │ │ │ │
3925 │zone │ conntrack zone │ integer (16 bit) │
3926 └────────┴─────────────────────┴──────────────────┘
3927
3928 save packet nfmark in conntrack.
3929
3930 ct mark set meta mark
3931
3932 set zone mapped via interface.
3933
3934 table inet raw {
3935 chain prerouting {
3936 type filter hook prerouting priority raw;
3937 ct zone set iif map { "eth1" : 1, "veth1" : 2 }
3938 }
3939 chain output {
3940 type filter hook output priority raw;
3941 ct zone set oif map { "eth1" : 1, "veth1" : 2 }
3942 }
3943 }
3944
3945 restrict events reported by ctnetlink.
3946
3947 ct event set new,related,destroy
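
assign a helper after the conntrack lookup. A sketch, assuming a ct
helper object named ftp-standard; the chain uses priority filter,
which lies after -200 as required above.

    table inet myhelpers {
        ct helper ftp-standard {
            type "ftp" protocol tcp
        }
        chain prerouting {
            type filter hook prerouting priority filter;
            tcp dport 21 ct helper set "ftp-standard"
        }
    }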
3948
3949
3950 NOTRACK STATEMENT
3951 The notrack statement allows one to disable connection tracking for
3952 certain packets.
3953
3954 notrack
3955
3956 Note that for this statement to be effective, it has to be applied to
3957 packets before a conntrack lookup happens. Therefore, it needs to sit
3958 in a chain with either prerouting or output hook and a hook priority of
3959 -300 (raw) or less.
3960
3961 See SYNPROXY STATEMENT for an example usage.
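
A standalone sketch, assuming a local DNS server whose UDP traffic
should bypass connection tracking. Both chains use the raw priority so
that the statement runs before the conntrack lookup.

    table inet raw {
        chain prerouting {
            type filter hook prerouting priority raw; policy accept;
            # incoming queries
            udp dport 53 notrack
        }
        chain output {
            type filter hook output priority raw; policy accept;
            # locally generated replies
            udp sport 53 notrack
        }
    }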
3962
3963 META STATEMENT
3964 A meta statement sets the value of a meta expression. The existing meta
3965 fields are: priority, mark, pkttype, nftrace.
3966
3967 meta {mark | priority | pkttype | nftrace} set value
3968
3969 A meta statement sets meta data associated with a packet.
3970
3971 Table 67. Meta statement types
3972 ┌─────────┬─────────────────────┬───────────┐
3973 │Keyword │ Description │ Value │
3974 ├─────────┼─────────────────────┼───────────┤
3975 │ │ │ │
3976 │priority │ TC packet priority │ tc_handle │
3977 ├─────────┼─────────────────────┼───────────┤
3978 │ │ │ │
3979 │mark │ Packet mark │ mark │
3980 ├─────────┼─────────────────────┼───────────┤
3981 │ │ │ │
3982 │pkttype │ packet type │ pkt_type │
3983 ├─────────┼─────────────────────┼───────────┤
3984 │ │ │ │
3985 │nftrace │ ruleset packet │ 0, 1 │
3986 │ │ tracing on/off. Use │ │
3987 │ │ monitor trace │ │
3988 │ │ command to watch │ │
3989 │ │ traces │ │
3990 └─────────┴─────────────────────┴───────────┘
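
A sketch of two common uses, with an illustrative table name, mark
value and source address.

    table inet mangle {
        chain prerouting {
            type filter hook prerouting priority mangle; policy accept;
            # tag HTTPS packets with a packet mark
            tcp dport 443 meta mark set 0x1
            # enable ruleset tracing for packets from one host
            ip saddr 192.0.2.1 meta nftrace set 1
        }
    }

The resulting traces can then be watched with nft monitor trace.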
3991
3992 LIMIT STATEMENT
3993 limit rate [over] packet_number / TIME_UNIT [burst packet_number packets]
3994 limit rate [over] byte_number BYTE_UNIT / TIME_UNIT [burst byte_number BYTE_UNIT]
3995
3996 TIME_UNIT := second | minute | hour | day
3997 BYTE_UNIT := bytes | kbytes | mbytes
3998
3999 A limit statement matches at a limited rate using a token bucket
4000 filter. A rule using this statement will match until this limit is
4001 reached. It can be used in combination with the log statement to give
4002 limited logging. The optional over keyword makes it match over the
4003 specified rate.
4004
4005 The burst value influences the bucket size, i.e. jitter tolerance. With
4006 packet-based limit, the bucket holds exactly burst packets, by default
4007 five. If you specify packet burst, it must be a non-zero value. With
4008 byte-based limit, the bucket’s minimum size is the given rate’s byte
4009 value and the burst value adds to that, by default zero bytes.
4010
4011 Table 68. limit statement values
4012 ┌──────────────┬───────────────────┬──────────────────┐
4013 │Value │ Description │ Type │
4014 ├──────────────┼───────────────────┼──────────────────┤
4015 │ │ │ │
4016 │packet_number │ Number of packets │ unsigned integer │
4017 │ │ │ (32 bit) │
4018 ├──────────────┼───────────────────┼──────────────────┤
4019 │ │ │ │
4020 │byte_number │ Number of bytes │ unsigned integer │
4021 │ │ │ (32 bit) │
4022 └──────────────┴───────────────────┴──────────────────┘
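
A sketch of packet-based and byte-based limits. Rates, ports and the
log prefix are illustrative assumptions.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # limited logging: matches until 10/minute is reached
            tcp dport 22 ct state new limit rate 10/minute burst 5 packets log prefix "ssh: "
            # over: matches only packets exceeding the rate
            icmp type echo-request limit rate over 10/second drop
            # byte-based limit with additional burst allowance
            udp dport 53 limit rate over 1 mbytes/second burst 512 kbytes drop
        }
    }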
4023
4024 NAT STATEMENTS
4025 snat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
4026 dnat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
4027 masquerade [to :PORT_SPEC] [FLAGS]
4028 redirect [to :PORT_SPEC] [FLAGS]
4029
4030 ADDR_SPEC := address | address - address
4031 PORT_SPEC := port | port - port
4032
4033 FLAGS := FLAG [, FLAGS]
4034 FLAG := persistent | random | fully-random
4035
4036 The nat statements are only valid from nat chain types.
4037
4038 The snat and masquerade statements specify that the source address of
4039 the packet should be modified. While snat is only valid in the
4040 postrouting and input chains, masquerade makes sense only in
4041 postrouting. The dnat and redirect statements are only valid in the
4042 prerouting and output chains, they specify that the destination address
4043 of the packet should be modified. You can use non-base chains which are
4044 called from base chains of nat chain type too. All future packets in
4045 this connection will also be mangled, and rules should cease being
4046 examined.
4047
4048 The masquerade statement is a special form of snat which always uses
4049 the outgoing interface’s IP address to translate to. It is particularly
4050 useful on gateways with dynamic (public) IP addresses.
4051
4052 The redirect statement is a special form of dnat which always
4053 translates the destination address to the local host’s one. It comes in
4054 handy if one only wants to alter the destination port of incoming
4055 traffic on different interfaces.
4056
4057 When used in the inet family (available with kernel 5.2), the dnat and
4058 snat statements require the use of the ip or ip6 keyword in case an
4059 address is provided, see the examples below.
4060
4061 Before kernel 4.18 nat statements require both prerouting and
4062 postrouting base chains to be present since otherwise packets on the
4063 return path won’t be seen by netfilter and therefore no reverse
4064 translation will take place.
4065
4066 Table 69. NAT statement values
4067 ┌───────────┬─────────────────────┬─────────────────────┐
4068 │Expression │ Description │ Type │
4069 ├───────────┼─────────────────────┼─────────────────────┤
4070 │ │ │ │
4071 │address │ Specifies that the │ ipv4_addr, │
4072 │ │ source/destination │ ipv6_addr, e.g. │
4073 │ │ address of the │ abcd::1234, or you │
4074 │ │ packet should be │ can use a mapping, │
4075 │ │ modified. You may │ e.g. meta mark map │
4076 │ │ specify a mapping │ { 10 : 192.168.1.2, │
4077 │ │ to relate a list of │ 20 : 192.168.1.3 } │
4078 │ │ tuples composed of │ │
4079 │ │ arbitrary │ │
4080 │ │ expression key with │ │
4081 │ │ address value. │ │
4082 ├───────────┼─────────────────────┼─────────────────────┤
4083 │ │ │ │
4084 │port │ Specifies that the │ port number (16 │
4085 │ │ source/destination │ bit) │
4086 │ │ port of the │ │
4087 │ │ packet should be │ │
4088 │ │ modified. │ │
4089 └───────────┴─────────────────────┴─────────────────────┘
4090
4091 Table 70. NAT statement flags
4092 ┌─────────────┬─────────────────────────────┐
4093 │Flag │ Description │
4094 ├─────────────┼─────────────────────────────┤
4095 │ │ │
4096 │persistent │ Gives a client the same │
4097 │ │ source-/destination-address │
4098 │ │ for each connection. │
4099 ├─────────────┼─────────────────────────────┤
4100 │ │ │
4101 │random │ In kernel 5.0 and newer │
4102 │ │ this is the same as │
4103 │ │ fully-random. In earlier │
4104 │ │ kernels the port mapping │
4105 │ │ will be randomized using a │
4106 │ │ seeded MD5 hash mix using │
4107 │ │ source and destination │
4108 │ │ address and destination │
4109 │ │ port. │
4110 ├─────────────┼─────────────────────────────┤
4111 │ │ │
4112 │fully-random │ If used then port mapping │
4113 │ │ is generated based on a │
4114 │ │ 32-bit pseudo-random │
4115 │ │ algorithm. │
4116 └─────────────┴─────────────────────────────┘
4117
4118 Using NAT statements.
4119
4120 # create a suitable table/chain setup for all further examples
4121 add table nat
4122 add chain nat prerouting { type nat hook prerouting priority dstnat; }
4123 add chain nat postrouting { type nat hook postrouting priority srcnat; }
4124
4125 # translate source addresses of all packets leaving via eth0 to address 1.2.3.4
4126 add rule nat postrouting oif eth0 snat to 1.2.3.4
4127
4128 # redirect all traffic entering via eth0 to destination address 192.168.1.120
4129 add rule nat prerouting iif eth0 dnat to 192.168.1.120
4130
4131 # translate source addresses of all packets leaving via eth0 to whatever
4132 # locally generated packets would use as source to reach the same destination
4133 add rule nat postrouting oif eth0 masquerade
4134
4135 # redirect incoming TCP traffic for port 22 to port 2222
4136 add rule nat prerouting tcp dport 22 redirect to :2222
4137
4138 # inet family:
4139 # handle ip dnat:
4140 add rule inet nat prerouting dnat ip to 10.0.2.99
4141 # handle ip6 dnat:
4142 add rule inet nat prerouting dnat ip6 to fe80::dead
4143 # this masquerades both ipv4 and ipv6:
4144 add rule inet nat postrouting meta oif ppp0 masquerade
4145
4146
4147 TPROXY STATEMENT
4148 Tproxy redirects the packet to a local socket without changing the
4149 packet header in any way. If any of the arguments is missing, the data
4150 of the incoming packet is used as parameter. Using tproxy requires
4151 an additional match that ensures the transport protocol header is
4152 present.
4153
4154 tproxy to address:port
4155 tproxy to {address | :port}
4156
4157 This syntax can be used in ip/ip6 tables where network layer protocol
4158 is obvious. Either IP address or port can be specified, but at least
4159 one of them is necessary.
4160
4161 tproxy {ip | ip6} to address[:port]
4162 tproxy to :port
4163
4164 This syntax can be used in inet tables. The ip/ip6 parameter defines
4165 the family the rule will match. The address parameter must be of this
4166 family. When only port is defined, the address family should not be
4167 specified. In this case the rule will match for both families.
4168
4169 Table 71. tproxy attributes
4170 ┌────────┬────────────────────────────┐
4171 │Name │ Description │
4172 ├────────┼────────────────────────────┤
4173 │ │ │
4174 │address │ IP address the listening │
4175 │ │ socket with IP_TRANSPARENT │
4176 │ │ option is bound to. │
4177 ├────────┼────────────────────────────┤
4178 │ │ │
4179 │port │ Port the listening socket │
4180 │ │ with IP_TRANSPARENT option │
4181 │ │ is bound to. │
4182 └────────┴────────────────────────────┘
4183
4184 Example ruleset for tproxy statement.
4185
4186 table ip x {
4187 chain y {
4188 type filter hook prerouting priority mangle; policy accept;
4189 tcp dport ntp tproxy to 1.1.1.1
4190 udp dport ssh tproxy to :2222
4191 }
4192 }
4193 table ip6 x {
4194 chain y {
4195 type filter hook prerouting priority mangle; policy accept;
4196 tcp dport ntp tproxy to [dead::beef]
4197 udp dport ssh tproxy to :2222
4198 }
4199 }
4200 table inet x {
4201 chain y {
4202 type filter hook prerouting priority mangle; policy accept;
4203 tcp dport 321 tproxy to :ssh
4204 tcp dport 99 tproxy ip to 1.1.1.1:999
4205 udp dport 155 tproxy ip6 to [dead::beef]:smux
4206 }
4207 }
4208
4209
4210 SYNPROXY STATEMENT
4211 This statement will process the TCP three-way handshake in parallel in
4212 netfilter context to protect either a local or a backend system. This
4213 statement requires connection tracking because sequence numbers need to
4214 be translated.
4215
4216 synproxy [mss mss_value] [wscale wscale_value] [SYNPROXY_FLAGS]
4217
4218 Table 72. synproxy statement attributes
4219 ┌───────┬────────────────────────────┐
4220 │Name │ Description │
4221 ├───────┼────────────────────────────┤
4222 │ │ │
4223 │mss │ Maximum segment size │
4224 │ │ announced to clients. This │
4225 │ │ must match the backend. │
4226 ├───────┼────────────────────────────┤
4227 │ │ │
4228 │wscale │ Window scale announced to │
4229 │ │ clients. This must match │
4230 │ │ the backend. │
4231 └───────┴────────────────────────────┘
4232
4233 Table 73. synproxy statement flags
4234 ┌──────────┬────────────────────────────┐
4235 │Flag │ Description │
4236 ├──────────┼────────────────────────────┤
4237 │ │ │
4238 │sack-perm │ Pass client selective │
4239 │ │ acknowledgement option to │
4240 │ │ backend (will be disabled │
4241 │ │ if not present). │
4242 ├──────────┼────────────────────────────┤
4243 │ │ │
4244 │timestamp │ Pass client timestamp │
4245 │ │ option to backend (will be │
4246 │ │ disabled if not present, │
4247 │ │ also needed for selective │
4248 │ │ acknowledgement and window │
4249 │ │ scaling). │
4250 └──────────┴────────────────────────────┘
4251
4252 Example ruleset for synproxy statement.
4253
4254 Determine tcp options used by backend, from an external system
4255
4256 tcpdump -pni eth0 -c 1 'tcp[tcpflags] == (tcp-syn|tcp-ack)' \
4257 port 80 &
4258 telnet 192.0.2.42 80
4259 18:57:24.693307 IP 192.0.2.42.80 > 192.0.2.43.48757:
4260 Flags [S.], seq 360414582, ack 788841994, win 14480,
4261 options [mss 1460,sackOK,
4262 TS val 1409056151 ecr 9690221,
4263 nop,wscale 9],
4264 length 0
4265
4266 Switch tcp_loose mode off, so conntrack will mark out-of-flow packets as state INVALID.
4267
4268 echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
4269
4270 Make SYN packets untracked.
4271
4272 table ip x {
4273 chain y {
4274 type filter hook prerouting priority raw; policy accept;
4275 tcp flags syn notrack
4276 }
4277 }
4278
4279 Catch UNTRACKED (SYN packets) and INVALID (3WHS ACK packets) states and send
4280 them to SYNPROXY. This rule will respond to SYN packets with SYN+ACK
4281 syncookies, create ESTABLISHED for valid client response (3WHS ACK packets) and
4282 drop incorrect cookies. Flag combinations not expected during 3WHS will not
4283 match and continue (e.g. SYN+FIN, SYN+ACK). Finally, drop invalid packets; these
4284 will be out-of-flow packets that were not matched by SYNPROXY.
4285
4286 table ip x {
4287 chain z {
4288 type filter hook input priority filter; policy accept;
4289 ct state invalid, untracked synproxy mss 1460 wscale 9 timestamp sack-perm
4290 ct state invalid drop
4291 }
4292 }
4293
4294
4295 FLOW STATEMENT
4296 A flow statement lets you select which flows to accelerate by
4297 bypassing the layer 3 network stack when forwarding. You have to specify
4298 the flowtable name where you want to offload this flow.
4299
4300 flow add @flowtable
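
A sketch of a complete setup, assuming a flowtable named f and two
interfaces eth0 and eth1; all names are illustrative.

    table inet x {
        flowtable f {
            hook ingress priority 0; devices = { eth0, eth1 };
        }
        chain forward {
            type filter hook forward priority filter; policy accept;
            # offload forwarded TCP and UDP flows to flowtable f
            meta l4proto { tcp, udp } flow add @f
        }
    }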
4301
4302 QUEUE STATEMENT
4303 This statement passes the packet to userspace using the nfnetlink_queue
4304 handler. The packet is put into the queue identified by its 16-bit
4305 queue number. Userspace can inspect and modify the packet if desired.
4306 Userspace must then drop or re-inject the packet into the kernel. See
4307 libnetfilter_queue documentation for details.
4308
4309 queue [flags QUEUE_FLAGS] [to queue_number]
4310 queue [flags QUEUE_FLAGS] [to queue_number_from - queue_number_to]
4311 queue [flags QUEUE_FLAGS] [to QUEUE_EXPRESSION ]
4312
4313 QUEUE_FLAGS := QUEUE_FLAG [, QUEUE_FLAGS]
4314 QUEUE_FLAG := bypass | fanout
4315 QUEUE_EXPRESSION := numgen | hash | symhash | MAP STATEMENT
4316
4317 QUEUE_EXPRESSION can be used to compute a queue number at run-time with
4318 the hash or numgen expressions. It also allows one to use the map
4319 statement to assign fixed queue numbers based on external inputs such
4320 as the source ip address or interface names.
4321
4322 Table 74. queue statement values
4323 ┌──────────────────┬────────────────────┬──────────────────┐
4324 │Value │ Description │ Type │
4325 ├──────────────────┼────────────────────┼──────────────────┤
4326 │ │ │ │
4327 │queue_number │ Sets queue number, │ unsigned integer │
4328 │ │ default is 0. │ (16 bit) │
4329 ├──────────────────┼────────────────────┼──────────────────┤
4330 │ │ │ │
4331 │queue_number_from │ Sets initial queue │ unsigned integer │
4332 │ │ in the range, if │ (16 bit) │
4333 │ │ fanout is used. │ │
4334 ├──────────────────┼────────────────────┼──────────────────┤
4335 │ │ │ │
4336 │queue_number_to │ Sets closing queue │ unsigned integer │
4337 │ │ in the range, if │ (16 bit) │
4338 │ │ fanout is used. │ │
4339 └──────────────────┴────────────────────┴──────────────────┘
4340
4341 Table 75. queue statement flags
4342 ┌───────┬────────────────────────────┐
4343 │Flag │ Description │
4344 ├───────┼────────────────────────────┤
4345 │ │ │
4346 │bypass │ Let packets go through if │
4347 │ │ userspace application │
4348 │ │ cannot back off. Before │
4349 │ │ using this flag, read │
4350 │ │ libnetfilter_queue │
4351 │ │ documentation for │
4352 │ │ performance tuning │
4353 │ │ recommendations. │
4354 ├───────┼────────────────────────────┤
4355 │ │ │
4356 │fanout │ Distribute packets between │
4357 │ │ several queues. │
4358 └───────┴────────────────────────────┘
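
A sketch of the forms listed in the synopsis above. Queue numbers and
ports are illustrative assumptions, and each queue needs a userspace
program listening on it.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # single queue
            tcp dport 80 queue to 1
            # single queue with the bypass flag
            tcp dport 443 queue flags bypass to 2
            # distribute packets between queues 4 to 7
            udp dport 53 queue flags bypass,fanout to 4-7
            # compute the queue number at run-time
            tcp dport 8080 queue to numgen inc mod 4
        }
    }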
4359
4360 DUP STATEMENT
4361 The dup statement is used to duplicate a packet and send the copy to a
4362 different destination.
4363
4364 dup to device
4365 dup to address device device
4366
4367 Table 76. Dup statement values
4368 ┌───────────┬─────────────────────┬─────────────────────┐
4369 │Expression │ Description │ Type │
4370 ├───────────┼─────────────────────┼─────────────────────┤
4371 │ │ │ │
4372 │address │ Specifies that the │ ipv4_addr, │
4373 │ │ copy of the packet │ ipv6_addr, e.g. │
4374 │ │ should be sent to a │ abcd::1234, or you │
4375 │ │ new gateway. │ can use a mapping, │
4376 │ │ │ e.g. ip saddr map { │
4377 │ │ │ 192.168.1.2 : │
4378 │ │ │ 10.1.1.1 } │
4379 ├───────────┼─────────────────────┼─────────────────────┤
4380 │ │ │ │
4381 │device │ Specifies that the │ string │
4382 │ │ copy should be │ │
4383 │ │ transmitted via │ │
4384 │ │ device. │ │
4385 └───────────┴─────────────────────┴─────────────────────┘
4386
4387 Using the dup statement.
4388
4389 # send to machine with ip address 10.2.3.4 on eth0
4390 ip filter forward dup to 10.2.3.4 device "eth0"
4391
4392 # copy raw frame to another interface
4393 netdev ingress dup to "eth0"
4394 dup to "eth0"
4395
4396 # combine with map dst addr to gateways
4397 dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" }
4398
4399
4400 FWD STATEMENT
4401 The fwd statement is used to redirect a raw packet to another
4402 interface. It is only available in the netdev family ingress and egress
4403 hooks. It is similar to the dup statement except that no copy is made.
4404
4405 You can also specify the address of the next hop and the device to
4406 forward the packet to. This updates the source and destination MAC
4407 address of the packet by transmitting it through the neighboring layer.
4408 This also decrements the ttl field of the IP packet. This provides a
4409 way to effectively bypass the classical forwarding path, thus skipping
4410 the fib (forwarding information base) lookup.
4411
4412 fwd to device
4413 fwd [ip | ip6] to address device device
4414
4415 Using the fwd statement.
4416
4417 # redirect raw packet to device
4418 netdev ingress fwd to "eth0"
4419
4420 # forward packet to next hop 192.168.200.1 via eth0 device
4421 netdev ingress fwd ip to 192.168.200.1 device "eth0"
4422
4423
4424 SET STATEMENT
4425 The set statement is used to dynamically add or update elements in a
4426 set from the packet path. The set setname must already exist in the
4427 given table and must have been created with one or both of the dynamic
4428 and the timeout flags. The dynamic flag is required if the set
4429 statement expression includes a stateful object. The timeout flag is
4430 implied if the set is created with a timeout, and is required if the
4431 set statement updates elements, rather than adding them. Furthermore,
4432 these sets should specify a maximum set size (to prevent memory
4433 exhaustion), and their elements should have a timeout (so their number
4434 will not grow indefinitely) either from the set definition or from the
4435 statement that adds or updates them. The set statement can be used to
4436 e.g. create dynamic blacklists.
4437
4438 Dynamic updates are also supported with maps. In this case, the add or
4439 update rule needs to provide both the key and the data element (value),
4440 separated via :.
4441
4442 {add | update} @setname { expression [timeout timeout] [comment string] }
4443
4444 Example of a simple blacklist.
4445
4446 # declare a set, bound to table "filter", in family "ip".
4447 # Timeout and size are mandatory because we will add elements from the packet path.
4448 # Entries will time out after one minute, after which they might be
4449 # re-added if the limit condition persists.
4450 nft add set ip filter blackhole \
4451 "{ type ipv4_addr; flags dynamic; timeout 1m; size 65536; }"
4452
4453 # declare a set to store the limit per saddr.
4454 # This must be separate from blackhole since the timeout is different
4455 nft add set ip filter flood \
4456 "{ type ipv4_addr; flags dynamic; timeout 10s; size 128000; }"
4457
4458 # whitelist internal interface.
4459 nft add rule ip filter input meta iifname "internal" accept
4460
4461 # drop packets coming from blacklisted ip addresses.
4462 nft add rule ip filter input ip saddr @blackhole counter drop
4463
4464 # add source ip addresses to the blacklist if more than 10 tcp connection
4465 # requests per second arrive from the same ip address.
4466 nft add rule ip filter input tcp flags syn tcp dport ssh \
4467 add @flood { ip saddr limit rate over 10/second } \
4468 add @blackhole { ip saddr } \
4469 drop
4470
4471 # inspect state of the sets.
4472 nft list set ip filter flood
4473 nft list set ip filter blackhole
4474
4475 # manually add two addresses to the blackhole.
4476 nft add element filter blackhole { 10.2.3.4, 10.23.1.42 }
4477
4478
4479 MAP STATEMENT
4480 The map statement is used to look up data based on some specific input
4481 key.
4482
4483 expression map { MAP_ELEMENTS }
4484
4485 MAP_ELEMENTS := MAP_ELEMENT [, MAP_ELEMENTS]
4486 MAP_ELEMENT := key : value
4487
4488 The key is a value returned by expression.
4489
4490 Using the map statement.
4491
4492 # select DNAT target based on TCP dport:
4493 # connections to port 80 are redirected to 192.168.1.100,
4494 # connections to port 8888 are redirected to 192.168.1.101
4495 nft add rule ip nat prerouting dnat tcp dport map { 80 : 192.168.1.100, 8888 : 192.168.1.101 }
4496
4497 # source address based SNAT:
4498 # packets from net 192.168.1.0/24 will appear as originating from 10.0.0.1,
4499 # packets from net 192.168.2.0/24 will appear as originating from 10.0.0.2
4500 nft add rule ip nat postrouting snat to ip saddr map { 192.168.1.0/24 : 10.0.0.1, 192.168.2.0/24 : 10.0.0.2 }
4501
4502
4503 VMAP STATEMENT
4504 The verdict map (vmap) statement works analogously to the map statement,
4505 but contains verdicts as values.
4506
4507 expression vmap { VMAP_ELEMENTS }
4508
4509 VMAP_ELEMENTS := VMAP_ELEMENT [, VMAP_ELEMENTS]
4510 VMAP_ELEMENT := key : verdict
4511
4512 Using the vmap statement.
4513
4514 # jump to different chains depending on layer 4 protocol type:
4515 nft add rule ip filter input ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain, icmp : jump icmp-chain }
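
A jump verdict needs its target chain to exist when the rule is added. The sketch below writes a ruleset file that declares the target chains before the vmap rule. The chain names (tcp_chain, udp_chain, icmp_chain) and the variable vmap_ruleset are hypothetical, and the ruleset has not been validated against a particular nft version.

```shell
# Write a sketch ruleset to a temporary file (nothing is loaded into the kernel).
vmap_ruleset=$(mktemp)
cat > "$vmap_ruleset" <<'EOF'
table ip filter {
	# hypothetical per-protocol chains used as jump targets
	chain tcp_chain {
		counter
	}

	chain udp_chain {
		counter
	}

	chain icmp_chain {
		counter
	}

	chain input {
		type filter hook input priority 0; policy accept;
		ip protocol vmap { tcp : jump tcp_chain, udp : jump udp_chain, icmp : jump icmp_chain }
	}
}
EOF
# To check the syntax without applying it (needs nft and sufficient privileges):
#   nft -c -f "$vmap_ruleset"
```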
4516
4517
4518 XT STATEMENT
4519 This represents an xt statement from the xtables compat interface. It is a
4520 fallback if translation is not available or not complete.
4521
4522 xt TYPE NAME
4523
4524 TYPE := match | target | watcher
4525
4526 Seeing this means the ruleset (or parts of it) was created by
4527 iptables-nft, and that tool should be used to manage it.
4528
4529 BEWARE: nftables won’t restore these statements.
4530
4532 These are some additional commands included in nft.
4533
4534 MONITOR
4535 The monitor command allows you to listen to Netlink events produced by
4536 the nf_tables subsystem. These are either related to creation and
4537 deletion of objects or to packets for which meta nftrace was enabled.
4538 When they occur, nft will print to stdout the monitored events in
4539 either JSON or native nft format.
4540
4541 monitor [new | destroy] MONITOR_OBJECT
4542 monitor trace
4543
4544 MONITOR_OBJECT := tables | chains | sets | rules | elements | ruleset
4545
4546 To filter events related to a specific object, use one of the keywords
4547 in MONITOR_OBJECT.
4548
4549 To filter events related to a specific action, use the keyword new or
4550 destroy.
4551
4552 The second form of invocation takes no further options and exclusively
4553 prints events generated for packets with nftrace enabled.
4554
4555 Hit ^C to finish the monitor operation.
4556
4557 Listen to all events, report in native nft format.
4558
4559 % nft monitor
4560
4561 Listen to deleted rules, report in JSON format.
4562
4563 % nft -j monitor destroy rules
4564
4565 Listen to both new and destroyed chains, in native nft format.
4566
4567 % nft monitor chains
4568
4569 Listen to ruleset events such as table, chain, rule, set, counters and
4570 quotas, in native nft format.
4571
4572 % nft monitor ruleset
4573
4574 Trace incoming packets from host 10.0.0.1.
4575
4576 % nft add rule filter input ip saddr 10.0.0.1 meta nftrace set 1
4577 % nft monitor trace
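
The first form of the monitor synopsis can be sketched as a small shell helper that accepts only the keywords listed above. The function name nft_monitor_cmd is hypothetical and not part of nft. It only prints the command line and does not run it.

```shell
# nft_monitor_cmd [ACTION] OBJECT
# Hypothetical helper: prints an "nft monitor" command line for an optional
# action (new or destroy, or "" for both) and a MONITOR_OBJECT keyword.
# Returns 1 for keywords that the synopsis does not list.
nft_monitor_cmd() {
    action="$1"
    object="$2"
    case "$action" in
        ""|new|destroy) ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    case "$object" in
        tables|chains|sets|rules|elements|ruleset) ;;
        *) echo "unknown object: $object" >&2; return 1 ;;
    esac
    # ${action:+ $action} expands to nothing when no action was given
    echo "nft monitor${action:+ $action} $object"
}
```

For example, nft_monitor_cmd destroy rules prints the command used above to listen to deleted rules, without the -j option.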
4578
4579
4581 When an error is detected, nft shows the line(s) containing the error,
4582 the position of the erroneous parts in the input stream and marks up
4583 the erroneous parts using carets (^). If the error results from the
4584 combination of two expressions or statements, the part imposing the
4585 constraints which are violated is marked using tildes (~).
4586
4587 For errors returned by the kernel, nft cannot detect which parts of the
4588 input caused the error and the entire command is marked.
4589
4590 Error caused by single incorrect expression.
4591
4592 <cmdline>:1:19-22: Error: Interface does not exist
4593 filter output oif eth0
4594 ^^^^
4595
4596 Error caused by invalid combination of two expressions.
4597
4598 <cmdline>:1:28-36: Error: Right hand side of relational expression (==) must be constant
4599 filter output tcp dport == tcp dport
4600 ~~ ^^^^^^^^^
4601
4602 Error returned by the kernel.
4603
4604 <cmdline>:0:0-23: Error: Could not process rule: Operation not permitted
4605 filter output oif wlan0
4606 ^^^^^^^^^^^^^^^^^^^^^^^
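
The first line of each report above has the form source:line:first-last: Error: message. A script that post-processes nft output could split that line roughly as follows. The helper name parse_nft_error_location is hypothetical, and the pattern assumes the header format seen in these examples rather than a documented, stable interface.

```shell
# parse_nft_error_location HEADER
# Hypothetical helper: prints "<source> <line> <first column> <last column>"
# for a header such as "<cmdline>:1:19-22: Error: ...".
# Returns non-zero if the header does not match the assumed format.
parse_nft_error_location() {
    printf '%s\n' "$1" |
        sed -n -E 's/^(.*):([0-9]+):([0-9]+)-([0-9]+): Error: .*$/\1 \2 \3 \4/p' |
        grep .
}
```

The trailing grep only serves to make the function fail when sed printed nothing.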
4607
4608
4610 On success, nft exits with a status of 0. Unspecified errors cause it
4611 to exit with a status of 1, memory allocation errors with a status of
4612 2, and failure to open the Netlink socket with a status of 3.
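
A calling script can translate these status codes into messages. The following is a minimal sketch. The function name describe_nft_status and the file name ruleset.nft are hypothetical.

```shell
# describe_nft_status STATUS
# Hypothetical helper: maps an nft exit status to the meaning documented above.
describe_nft_status() {
    case "$1" in
        0) echo "success" ;;
        1) echo "unspecified error" ;;
        2) echo "memory allocation error" ;;
        3) echo "unable to open Netlink socket" ;;
        *) echo "undocumented status $1" ;;
    esac
}

# Typical use (needs nft and sufficient privileges; ruleset.nft is a placeholder):
#   nft -c -f ruleset.nft; describe_nft_status "$?"
```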
4613
4615 libnftables(3), libnftables-json(5), iptables(8), ip6tables(8), arptables(8), ebtables(8), ip(8), tc(8)
4616
4617 There is an official wiki at: https://wiki.nftables.org
4618
4620 nftables was written by Patrick McHardy and Pablo Neira Ayuso, among
4621 many other contributors from the Netfilter community.
4622
4624 Copyright © 2008-2014 Patrick McHardy <kaber@trash.net>
4625 Copyright © 2013-2018 Pablo Neira Ayuso <pablo@netfilter.org>
4626
4627 nftables is free software; you can redistribute it and/or modify it
4628 under the terms of the GNU General Public License version 2 as
4629 published by the Free Software Foundation.
4630
4631 This documentation is licensed under the terms of the Creative Commons
4632 Attribution-ShareAlike 4.0 license, CC BY-SA 4.0
4633 http://creativecommons.org/licenses/by-sa/4.0/.
4634
4635
4636
4637 03/13/2023 NFT(8)