NFT(8)                                                                  NFT(8)

NAME
nft - Administration tool of the nftables framework for packet
filtering and classification

SYNOPSIS
nft [ -nNscaeSupyjt ] [ -I directory ] [ -f filename | -i | cmd ...]
nft -h
nft -v

DESCRIPTION
nft is the command line tool used to set up, maintain and inspect
packet filtering and classification rules in the Linux kernel, in the
nftables framework. The Linux kernel subsystem is known as nf_tables,
and ‘nf’ stands for Netfilter.

OPTIONS
The command accepts several different options which are documented here
in groups for better understanding of their meaning. You can get
information about options by running nft --help.
24
25 General options:
26
27 -h, --help
28 Show help message and all options.
29
30 -v, --version
31 Show version.
32
33 -V
34 Show long version information, including compile-time
35 configuration.
36
Ruleset input handling options that specify how to load rulesets:
38
39 -f, --file filename
40 Read input from filename. If filename is -, read from stdin.
41
42 -D, --define name=value
43 Define a variable. You can only combine this option with -f.
44
45 -i, --interactive
Read input from an interactive readline CLI. You can use quit to
exit, or send the EOF marker, which is normally CTRL-D.
48
49 -I, --includepath directory
50 Add the directory directory to the list of directories to be
51 searched for included files. This option may be specified multiple
52 times.
53
54 -c, --check
Check the validity of commands without actually applying the changes.
56
Ruleset list output formatting options that modify the output of the
list ruleset command:
59
60 -a, --handle
61 Show object handles in output.
62
63 -s, --stateless
64 Omit stateful information of rules and stateful objects.
65
66 -t, --terse
67 Omit contents of sets from output.
68
69 -S, --service
70 Translate ports to service names as defined by /etc/services.
71
72 -N, --reversedns
Translate IP addresses to names via reverse DNS lookup. This may slow
down your listing since it generates network traffic.
75
76 -u, --guid
77 Translate numeric UID/GID to names as defined by /etc/passwd and
78 /etc/group.
79
80 -n, --numeric
81 Print fully numerical output.
82
83 -y, --numeric-priority
84 Display base chain priority numerically.
85
86 -p, --numeric-protocol
87 Display layer 4 protocol numerically.
88
89 -T, --numeric-time
90 Show time, day and hour values in numeric format.
91
92 Command output formatting:
93
94 -e, --echo
95 When inserting items into the ruleset using add, insert or replace
96 commands, print notifications just like nft monitor.
97
98 -j, --json
99 Format output in JSON. See libnftables-json(5) for a schema
100 description.
101
102 -d, --debug level
Enable debugging output. The debug level can be any of scanner,
parser, eval, netlink, mnl, proto-ctx, segtree, all. You can
combine several levels by separating them with commas, for example
-d eval,mnl.
107
INPUT FILE FORMATS
LEXICAL CONVENTIONS
110 Input is parsed line-wise. When the last character of a line, just
111 before the newline character, is a non-quoted backslash (\), the next
112 line is treated as a continuation. Multiple commands on the same line
113 can be separated using a semicolon (;).
114
115 A hash sign (#) begins a comment. All following characters on the same
116 line are ignored.
117
118 Identifiers begin with an alphabetic character (a-z,A-Z), followed by
119 zero or more alphanumeric characters (a-z,A-Z,0-9) and the characters
120 slash (/), backslash (\), underscore (_) and dot (.). Identifiers using
121 different characters or clashing with a keyword need to be enclosed in
122 double quotes (").
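
The conventions above can be seen together in a short input file. The
following is a minimal sketch, assuming a hypothetical table named demo;
the table and chain names are illustrative and not taken from this manual.

```nft
# A comment runs to the end of the line.
add table inet demo; add chain inet demo input   # two commands on one line

# A trailing backslash continues the command on the next line.
add rule inet demo input \
    tcp dport 22 accept
```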
123
124 INCLUDE FILES
125 include filename
126
127 Other files can be included by using the include statement. The
128 directories to be searched for include files can be specified using the
129 -I/--includepath option. You can override this behaviour either by
130 prepending ‘./’ to your path to force inclusion of files located in the
131 current working directory (i.e. relative path) or / for file location
132 expressed as an absolute path.
133
134 If -I/--includepath is not specified, then nft relies on the default
135 directory that is specified at compile time. You can retrieve this
136 default directory via the -h/--help option.
137
Include statements support the usual shell wildcard symbols (*,?,[]).
Having no matches for an include statement is not an error, if wildcard
symbols are used in the include statement. This allows having
potentially empty include directories for statements like include
"/etc/firewall/rules/*". The wildcard matches are loaded in alphabetical
order. Files beginning with dot (.) are not matched by include
statements.
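
The three ways of resolving an include path might look as follows. This
is only a sketch: the file names defines.nft and local.nft are
hypothetical placeholders.

```nft
# Searched for in the directories given with -I, or in the compiled-in default.
include "defines.nft"

# Forced to resolve relative to the current working directory.
include "./local.nft"

# Absolute path with a wildcard; having no match is not an error here.
include "/etc/firewall/rules/*"
```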
145
146 SYMBOLIC VARIABLES
147 define variable = expr
148 $variable
149
150 Symbolic variables can be defined using the define statement. Variable
151 references are expressions and can be used to initialize other
152 variables. The scope of a definition is the current block and all
153 blocks contained within.
154
155 Using symbolic variables.
156
    define int_if1 = eth0
    define int_if2 = eth1
    define int_ifs = { $int_if1, $int_if2 }

    filter input iif $int_ifs accept
162
163
ADDRESS FAMILIES
Address families determine the type of packets which are processed. For
each address family, the kernel contains so-called hooks at specific
stages of the packet processing paths, which invoke nftables if rules
for these hooks exist.
169
170
ip        IPv4 address family.

ip6       IPv6 address family.

inet      Internet (IPv4/IPv6) address family.

arp       ARP address family, handling IPv4 ARP packets.

bridge    Bridge address family, handling packets which traverse a
          bridge device.

netdev    Netdev address family, handling packets from ingress.
188
189
190 All nftables objects exist in address family specific namespaces,
191 therefore all identifiers include an address family. If an identifier
192 is specified without an address family, the ip family is used by
193 default.
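
As a small illustration of the default family and of per-family
namespaces, consider the sketch below. The table name filter is only an
example.

```nft
# No family given, so the ip family is assumed.
add table filter

# The same table as above, with the family spelled out.
add table ip filter

# A different object: same name, but in the ip6 namespace.
add table ip6 filter
```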
194
195 IPV4/IPV6/INET ADDRESS FAMILIES
The IPv4/IPv6/Inet address families handle IPv4, IPv6 or both types of
packets. They contain five hooks at different packet processing stages
in the network stack. The Inet family additionally provides an ingress
hook.
199
200 Table 1. IPv4/IPv6/Inet address family hooks
201 ┌────────────┬────────────────────────────┐
202 │Hook │ Description │
203 ├────────────┼────────────────────────────┤
204 │ │ │
205 │prerouting │ All packets entering the │
206 │ │ system are processed by │
207 │ │ the prerouting hook. It is │
208 │ │ invoked before the routing │
209 │ │ process and is used for │
210 │ │ early filtering or │
211 │ │ changing packet attributes │
212 │ │ that affect routing. │
213 ├────────────┼────────────────────────────┤
214 │ │ │
215 │input │ Packets delivered to the │
216 │ │ local system are processed │
217 │ │ by the input hook. │
218 ├────────────┼────────────────────────────┤
219 │ │ │
220 │forward │ Packets forwarded to a │
221 │ │ different host are │
222 │ │ processed by the forward │
223 │ │ hook. │
224 ├────────────┼────────────────────────────┤
225 │ │ │
226 │output │ Packets sent by local │
227 │ │ processes are processed by │
228 │ │ the output hook. │
229 ├────────────┼────────────────────────────┤
230 │ │ │
231 │postrouting │ All packets leaving the │
232 │ │ system are processed by │
233 │ │ the postrouting hook. │
234 ├────────────┼────────────────────────────┤
235 │ │ │
236 │ingress │ All packets entering the │
237 │ │ system are processed by │
238 │ │ this hook. It is invoked │
239 │ │ before layer 3 protocol │
240 │ │ handlers, hence before the │
241 │ │ prerouting hook, and it │
242 │ │ can be used for filtering │
243 │ │ and policing. Ingress is │
244 │ │ only available for Inet │
245 │ │ family (since Linux kernel │
246 │ │ 5.10). │
247 └────────────┴────────────────────────────┘
248
249 ARP ADDRESS FAMILY
250 The ARP address family handles ARP packets received and sent by the
251 system. It is commonly used to mangle ARP packets for clustering.
252
Table 2. ARP address family hooks
┌───────┬────────────────────────────┐
│Hook   │ Description                │
├───────┼────────────────────────────┤
│input  │ Packets delivered to the   │
│       │ local system are processed │
│       │ by the input hook.         │
├───────┼────────────────────────────┤
│output │ Packets sent by the local  │
│       │ system are processed by    │
│       │ the output hook.           │
└───────┴────────────────────────────┘
267
268 BRIDGE ADDRESS FAMILY
269 The bridge address family handles Ethernet packets traversing bridge
270 devices.
271
272 The list of supported hooks is identical to IPv4/IPv6/Inet address
273 families above.
274
275 NETDEV ADDRESS FAMILY
276 The Netdev address family handles packets from the device ingress path.
277 This family allows you to filter packets of any ethertype such as ARP,
278 VLAN 802.1q, VLAN 802.1ad (Q-in-Q) as well as IPv4 and IPv6 packets.
279
Table 3. Netdev address family hooks
┌────────┬────────────────────────────┐
│Hook    │ Description                │
├────────┼────────────────────────────┤
│ingress │ All packets entering the   │
│        │ system are processed by    │
│        │ this hook. It is invoked   │
│        │ after the network taps     │
│        │ (e.g. tcpdump), right      │
│        │ after tc ingress and       │
│        │ before layer 3 protocol    │
│        │ handlers. It can be used   │
│        │ for early filtering and    │
│        │ policing.                  │
└────────┴────────────────────────────┘
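
A netdev base chain on the ingress hook could be declared roughly as
follows. This is a sketch under assumptions: the table name demo, the
chain name early and the device eth0 are illustrative, and the device
must exist on the system where the ruleset is loaded.

```nft
table netdev demo {
    chain early {
        # The device name is an assumption; use an interface that exists.
        type filter hook ingress device "eth0" priority filter; policy accept;

        # Count ARP frames seen on this device.
        ether type arp counter
    }
}
```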
295
RULESET
{list | flush} ruleset [family]
298
299 The ruleset keyword is used to identify the whole set of tables,
300 chains, etc. currently in place in kernel. The following ruleset
301 commands exist:
302
303
list      Print the ruleset in human-readable format.

flush     Clear the whole ruleset. Note that, unlike iptables, this
          will remove all tables and whatever they contain, effectively
          leading to an empty ruleset - no packet filtering will happen
          anymore, so the kernel accepts any valid packet it receives.
318
319
320 It is possible to limit list and flush to a specific address family
321 only. For a list of valid family names, see the section called “ADDRESS
322 FAMILIES” above.
323
324 By design, list ruleset command output may be used as input to nft -f.
325 Effectively, this is the nft-equivalent of iptables-save and
326 iptables-restore.
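
Since list ruleset output is valid input, a ruleset is often kept in a
file and loaded with nft -f. The file below is a minimal sketch of that
pattern. The table contents are illustrative, and starting with flush
ruleset is a common convention rather than a requirement; note that it
removes every table currently loaded.

```nft
# Start from an empty ruleset so the file fully describes the result.
flush ruleset

table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        ct state established,related accept
    }
}
```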
327
TABLES
{add | create} table [family] table [{ flags flags ; }]
{delete | list | flush} table [family] table
list tables [family]
delete table [family] handle handle
333
334 Tables are containers for chains, sets and stateful objects. They are
335 identified by their address family and their name. The address family
336 must be one of ip, ip6, inet, arp, bridge, netdev. The inet address
337 family is a dummy family which is used to create hybrid IPv4/IPv6
338 tables. The meta expression nfproto keyword can be used to test which
339 family (ipv4 or ipv6) context the packet is being processed in. When no
340 address family is specified, ip is used by default. The only difference
341 between add and create is that the former will not return an error if
342 the specified table already exists while create will return an error.
343
344 Table 4. Table flags
345 ┌────────┬────────────────────────────┐
346 │Flag │ Description │
347 ├────────┼────────────────────────────┤
348 │ │ │
349 │dormant │ table is not evaluated any │
350 │ │ more (base chains are │
351 │ │ unregistered). │
352 └────────┴────────────────────────────┘
353
Add, change, delete a table.

    # start nft in interactive mode
    nft --interactive

    # create a new table.
    create table inet mytable

    # add a new base chain: get input packets
    add chain inet mytable myin { type filter hook input priority filter; }

    # add a single counter to the chain
    add rule inet mytable myin counter

    # disable the table temporarily -- rules are not evaluated anymore
    add table inet mytable { flags dormant; }

    # make table active again:
    add table inet mytable
373
374
375
376 add Add a new table for the
377 given family with the
378 given name.
379
380 delete Delete the specified
381 table.
382
383 list List all chains and rules
384 of the specified table.
385
386 flush Flush all chains and rules
387 of the specified table.
388
389
CHAINS
{add | create} chain [family] table chain [{ type type hook hook [device device] priority priority ; [policy policy ;] }]
{delete | list | flush} chain [family] table chain
list chains [family]
delete chain [family] table handle handle
rename chain [family] table chain newname
396
397 Chains are containers for rules. They exist in two kinds, base chains
398 and regular chains. A base chain is an entry point for packets from the
399 networking stack, a regular chain may be used as jump target and is
400 used for better rule organization.
401
402
403
404
405
406
407 add Add a new chain in the
408 specified table. When a
409 hook and priority value
410 are specified, the chain
411 is created as a base chain
412 and hooked up to the
413 networking stack.
414
415 create Similar to the add
416 command, but returns an
417 error if the chain already
418 exists.
419
420 delete Delete the specified
421 chain. The chain must not
422 contain any rules or be
423 used as jump target.
424
425 rename Rename the specified
426 chain.
427
428 list List all rules of the
429 specified chain.
430
431 flush Flush all rules of the
432 specified chain.
433
434
435 For base chains, type, hook and priority parameters are mandatory.
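
The difference between the two kinds of chains can be sketched as
follows. The names demo, ssh_in and input are illustrative; the address
is the one used in the rule examples of this manual.

```nft
table inet demo {
    # Regular chain: no hook, only reached through a jump.
    chain ssh_in {
        ip saddr 10.1.1.1 accept
    }

    # Base chain: type, hook and priority make it an entry point.
    chain input {
        type filter hook input priority filter; policy accept;
        tcp dport 22 jump ssh_in
    }
}
```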
436
Table 5. Supported chain types
┌───────┬───────────────┬────────────────┬──────────────────┐
│Type   │ Families      │ Hooks          │ Description      │
├───────┼───────────────┼────────────────┼──────────────────┤
│filter │ all           │ all            │ Standard chain   │
│       │               │                │ type to use when │
│       │               │                │ in doubt.        │
├───────┼───────────────┼────────────────┼──────────────────┤
│nat    │ ip, ip6, inet │ prerouting,    │ Chains of this   │
│       │               │ input, output, │ type perform     │
│       │               │ postrouting    │ Network Address  │
│       │               │                │ Translation      │
│       │               │                │ based on         │
│       │               │                │ conntrack        │
│       │               │                │ entries. Only    │
│       │               │                │ the first packet │
│       │               │                │ of a connection  │
│       │               │                │ actually         │
│       │               │                │ traverses this   │
│       │               │                │ chain - its      │
│       │               │                │ rules usually    │
│       │               │                │ define details   │
│       │               │                │ of the created   │
│       │               │                │ conntrack entry  │
│       │               │                │ (NAT statements  │
│       │               │                │ for instance).   │
├───────┼───────────────┼────────────────┼──────────────────┤
│route  │ ip, ip6       │ output         │ If a packet has  │
│       │               │                │ traversed a      │
│       │               │                │ chain of this    │
│       │               │                │ type and is      │
│       │               │                │ about to be      │
│       │               │                │ accepted, a new  │
│       │               │                │ route lookup is  │
│       │               │                │ performed if     │
│       │               │                │ relevant parts   │
│       │               │                │ of the IP header │
│       │               │                │ have changed.    │
│       │               │                │ This allows e.g. │
│       │               │                │ implementing     │
│       │               │                │ policy routing   │
│       │               │                │ selectors in     │
│       │               │                │ nftables.        │
└───────┴───────────────┴────────────────┴──────────────────┘
484
485 Apart from the special cases illustrated above (e.g. nat type not
486 supporting forward hook or route type only supporting output hook),
487 there are three further quirks worth noticing:
488
489 • The netdev family supports merely a single combination, namely
490 filter type and ingress hook. Base chains in this family also
491 require the device parameter to be present since they exist per
492 incoming interface only.
493
494 • The arp family supports only the input and output hooks, both in
495 chains of type filter.
496
• The inet family also supports the ingress hook (since Linux kernel
5.10), to filter IPv4 and IPv6 packets at the same location as the
netdev ingress hook. This inet hook allows you to share sets and
maps between the usual prerouting, input, forward, output,
postrouting and this ingress hook.
502
503 The priority parameter accepts a signed integer value or a standard
504 priority name which specifies the order in which chains with the same
505 hook value are traversed. The ordering is ascending, i.e. lower
506 priority values have precedence over higher ones.
507
508 Standard priority values can be replaced with easily memorizable names.
509 Not all names make sense in every family with every hook (see the
510 compatibility matrices below) but their numerical value can still be
511 used for prioritizing chains.
512
513 These names and values are defined and made available based on what
514 priorities are used by xtables when registering their default chains.
515
516 Most of the families use the same values, but bridge uses different
517 ones from the others. See the following tables that describe the values
518 and compatibility.
519
520 Table 6. Standard priority names, family and hook compatibility matrix
521 ┌─────────┬───────┬────────────────┬─────────────┐
522 │Name │ Value │ Families │ Hooks │
523 ├─────────┼───────┼────────────────┼─────────────┤
524 │ │ │ │ │
525 │raw │ -300 │ ip, ip6, inet │ all │
526 ├─────────┼───────┼────────────────┼─────────────┤
527 │ │ │ │ │
528 │mangle │ -150 │ ip, ip6, inet │ all │
529 ├─────────┼───────┼────────────────┼─────────────┤
530 │ │ │ │ │
531 │dstnat │ -100 │ ip, ip6, inet │ prerouting │
532 ├─────────┼───────┼────────────────┼─────────────┤
533 │ │ │ │ │
534 │filter │ 0 │ ip, ip6, inet, │ all │
535 │ │ │ arp, netdev │ │
536 ├─────────┼───────┼────────────────┼─────────────┤
537 │ │ │ │ │
538 │security │ 50 │ ip, ip6, inet │ all │
539 ├─────────┼───────┼────────────────┼─────────────┤
540 │ │ │ │ │
541 │srcnat │ 100 │ ip, ip6, inet │ postrouting │
542 └─────────┴───────┴────────────────┴─────────────┘
543
544 Table 7. Standard priority names and hook compatibility for the bridge
545 family
546 ┌───────┬───────┬─────────────┐
547 │ │ │ │
548 │Name │ Value │ Hooks │
549 ├───────┼───────┼─────────────┤
550 │ │ │ │
551 │dstnat │ -300 │ prerouting │
552 ├───────┼───────┼─────────────┤
553 │ │ │ │
554 │filter │ -200 │ all │
555 ├───────┼───────┼─────────────┤
556 │ │ │ │
557 │out │ 100 │ output │
558 ├───────┼───────┼─────────────┤
559 │ │ │ │
560 │srcnat │ 300 │ postrouting │
561 └───────┴───────┴─────────────┘
562
Basic arithmetic expressions (addition and subtraction) can also be
used with these standard names to ease relative prioritizing, e.g.
mangle - 5 stands for -155. Values are also printed in this form as
long as they are no further than 10 from the standard value.
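
A relative priority might be written as in the following sketch. The
table and chain names are illustrative.

```nft
table ip demo {
    chain before_mangle {
        # mangle is -150, so this chain is registered at priority -155
        # and is traversed before chains registered at plain mangle.
        type filter hook prerouting priority mangle - 5; policy accept;
    }
}
```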
567
Base chains also allow setting the chain’s policy, i.e. what happens to
packets not explicitly accepted or refused in contained rules.
Supported policy values are accept (which is the default) or drop.
571
RULES
{add | insert} rule [family] table chain [handle handle | index index] statement ... [comment comment]
replace rule [family] table chain handle handle statement ... [comment comment]
delete rule [family] table chain handle handle
576
577 Rules are added to chains in the given table. If the family is not
578 specified, the ip family is used. Rules are constructed from two kinds
579 of components according to a set of grammatical rules: expressions and
580 statements.
581
582 The add and insert commands support an optional location specifier,
583 which is either a handle or the index (starting at zero) of an existing
584 rule. Internally, rule locations are always identified by handle and
585 the translation from index happens in userspace. This has two potential
586 implications in case a concurrent ruleset change happens after the
587 translation was done: The effective rule index might change if a rule
588 was inserted or deleted before the referred one. If the referred rule
589 was deleted, the command is rejected by the kernel just as if an
590 invalid handle was given.
591
592 A comment is a single word or a double-quoted (") multi-word string
593 which can be used to make notes regarding the actual rule. Note: If you
594 use bash for adding rules, you have to escape the quotation marks, e.g.
595 \"enable ssh for servers\".
596
597
598 add Add a new rule described
599 by the list of statements.
600 The rule is appended to
601 the given chain unless a
602 location is specified, in
603 which case the rule is
604 inserted after the
605 specified rule.
606
607 insert Same as add except the
608 rule is inserted at the
609 beginning of the chain or
610 before the specified rule.
611
612 replace Similar to add, but the
613 rule replaces the
614 specified rule.
615
616 delete Delete the specified rule.
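
The location specifiers can be sketched as below. The sketch assumes
that a table inet filter with a chain input already exists and that it
contains a rule with handle 5; the handle and the matched ports are
illustrative.

```nft
# Append a rule to the end of the chain, with a comment attached.
add rule inet filter input tcp dport 80 accept comment "web"

# Place a rule directly after the rule with handle 5.
add rule inet filter input handle 5 tcp dport 443 accept

# Place a rule before the first rule (index 0) of the chain.
insert rule inet filter input index 0 iifname "lo" accept

# Swap out the rule with handle 5 for a new one.
replace rule inet filter input handle 5 ip saddr 10.1.1.2 tcp dport ssh accept
```

Inside a file loaded with nft -f the comment string needs no shell
escaping; the escaping mentioned above only applies when the command is
typed into a shell.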
617
618
Add a rule to ip table output chain.

    nft add rule filter output ip daddr 192.168.0.0/24 accept # 'ip filter' is assumed
    # same command, slightly more verbose
    nft add rule ip filter output ip daddr 192.168.0.0/24 accept

Delete rule from inet table.

    # nft -a list ruleset
    table inet filter {
            chain input {
                    type filter hook input priority filter; policy accept;
                    ct state established,related accept # handle 4
                    ip saddr 10.1.1.1 tcp dport ssh accept # handle 5
              ...
    # delete the rule with handle 5
    nft delete rule inet filter input handle 5
636
637
SETS
nftables offers two kinds of set concepts. Anonymous sets are sets that
have no specific name. The set members are enclosed in curly braces,
with commas to separate elements when creating the rule the set is used
in. Once that rule is removed, the set is removed as well. They cannot
be updated, i.e. once an anonymous set is declared it cannot be changed
anymore except by removing/altering the rule that uses the anonymous
set.
646
Using anonymous sets to accept particular subnets and ports.

    nft add rule filter input ip saddr { 10.0.0.0/8, 192.168.0.0/16 } tcp dport { 22, 443 } accept

Named sets are sets that need to be defined first before they can be
referenced in rules. Unlike anonymous sets, elements can be added to or
removed from a named set at any time. Sets are referenced from rules
using an @ prefixed to the set’s name.

Using named sets to accept addresses and ports.

    nft add rule filter input ip saddr @allowed_hosts tcp dport @allowed_ports accept
659
660 The sets allowed_hosts and allowed_ports need to be created first. The
661 next section describes nft set syntax in more detail.
662
663 add set [family] table set { type type | typeof expression ; [flags flags ;] [timeout timeout ;] [gc-interval gc-interval ;] [elements = { element[, ...] } ;] [size size ;] [policy policy ;] [auto-merge ;] }
664 {delete | list | flush} set [family] table set
665 list sets [family]
666 delete set [family] table handle handle
667 {add | delete} element [family] table set { element[, ...] }
668
669 Sets are element containers of a user-defined data type, they are
670 uniquely identified by a user-defined name and attached to tables.
671 Their behaviour can be tuned with the flags that can be specified at
672 set creation time.
673
674
675 add Add a new set in the
676 specified table. See the
677 Set specification table
678 below for more information
679 about how to specify
680 properties of a set.
681
682 delete Delete the specified set.
683
684 list Display the elements in
685 the specified set.
686
687 flush Remove all elements from
688 the specified set.
689
690
691 Table 8. Set specifications
692 ┌────────────┬──────────────────────┬─────────────────────┐
693 │Keyword │ Description │ Type │
694 ├────────────┼──────────────────────┼─────────────────────┤
695 │ │ │ │
696 │type │ data type of set │ string: ipv4_addr, │
697 │ │ elements │ ipv6_addr, │
698 │ │ │ ether_addr, │
699 │ │ │ inet_proto, │
700 │ │ │ inet_service, mark │
701 ├────────────┼──────────────────────┼─────────────────────┤
702 │ │ │ │
703 │typeof │ data type of set │ expression to │
704 │ │ element │ derive the data │
705 │ │ │ type from │
706 ├────────────┼──────────────────────┼─────────────────────┤
707 │ │ │ │
708 │flags │ set flags │ string: constant, │
709 │ │ │ dynamic, interval, │
710 │ │ │ timeout │
711 ├────────────┼──────────────────────┼─────────────────────┤
712 │ │ │ │
713 │timeout │ time an element │ string, decimal │
714 │ │ stays in the set, │ followed by unit. │
715 │ │ mandatory if set is │ Units are: d, h, m, │
716 │ │ added to from the │ s │
717 │ │ packet path │ │
718 │ │ (ruleset) │ │
719 ├────────────┼──────────────────────┼─────────────────────┤
720 │ │ │ │
721 │gc-interval │ garbage collection │ string, decimal │
722 │ │ interval, only │ followed by unit. │
723 │ │ available when │ Units are: d, h, m, │
724 │ │ timeout or flag │ s │
725 │ │ timeout are active │ │
726 ├────────────┼──────────────────────┼─────────────────────┤
727 │ │ │ │
728 │elements │ elements contained │ set data type │
729 │ │ by the set │ │
730 ├────────────┼──────────────────────┼─────────────────────┤
731 │ │ │ │
732 │size │ maximum number of │ unsigned integer │
733 │ │ elements in the │ (64 bit) │
734 │ │ set, mandatory if │ │
735 │ │ set is added to │ │
736 │ │ from the packet │ │
737 │ │ path (ruleset) │ │
738 ├────────────┼──────────────────────┼─────────────────────┤
739 │ │ │ │
740 │policy │ set policy │ string: performance │
741 │ │ │ [default], memory │
742 ├────────────┼──────────────────────┼─────────────────────┤
743 │ │ │ │
744 │auto-merge │ automatic merge of │ │
745 │ │ adjacent/overlapping │ │
746 │ │ set elements (only │ │
747 │ │ for interval sets) │ │
748 └────────────┴──────────────────────┴─────────────────────┘
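
Several of the keywords from the table above can be combined as in this
sketch. The table name demo, the set names and the chosen values
(addresses, a one hour timeout, a size of 65536) are illustrative
assumptions.

```nft
table inet demo {
    # Interval set with static contents; adjacent or overlapping
    # ranges are merged automatically.
    set allowed_hosts {
        type ipv4_addr
        flags interval
        auto-merge
        elements = { 10.0.0.0/8, 192.168.0.0/16 }
    }

    # Set whose elements expire one hour after being added.
    set recent_clients {
        type ipv4_addr
        flags timeout
        timeout 1h
        size 65536
    }

    chain input {
        type filter hook input priority filter; policy accept;
        ip saddr @allowed_hosts accept
    }
}
```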
749
MAPS
add map [family] table map { type type | typeof expression ; [flags flags ;] [elements = { element[, ...] } ;] [size size ;] [policy policy ;] }
{delete | list | flush} map [family] table map
list maps [family]
754
755 Maps store data based on some specific key used as input. They are
756 uniquely identified by a user-defined name and attached to tables.
757
758
759 add Add a new map in the
760 specified table.
761
762 delete Delete the specified map.
763
764 list Display the elements in
765 the specified map.
766
767 flush Remove all elements from
768 the specified map.
769
770 add element Comma-separated list of
771 elements to add into the
772 specified map.
773
774 delete element Comma-separated list of
775 element keys to delete
776 from the specified map.
777
778
779 Table 9. Map specifications
780 ┌─────────┬─────────────────────┬─────────────────────┐
781 │Keyword │ Description │ Type │
782 ├─────────┼─────────────────────┼─────────────────────┤
783 │ │ │ │
784 │type │ data type of map │ string: ipv4_addr, │
785 │ │ elements │ ipv6_addr, │
786 │ │ │ ether_addr, │
787 │ │ │ inet_proto, │
788 │ │ │ inet_service, mark, │
789 │ │ │ counter, quota. │
790 │ │ │ Counter and quota │
791 │ │ │ can’t be used as │
792 │ │ │ keys │
793 ├─────────┼─────────────────────┼─────────────────────┤
794 │ │ │ │
795 │typeof │ data type of set │ expression to │
796 │ │ element │ derive the data │
797 │ │ │ type from │
798 ├─────────┼─────────────────────┼─────────────────────┤
799 │ │ │ │
800 │flags │ map flags │ string: constant, │
801 │ │ │ interval │
802 ├─────────┼─────────────────────┼─────────────────────┤
803 │ │ │ │
804 │elements │ elements contained │ map data type │
805 │ │ by the map │ │
806 ├─────────┼─────────────────────┼─────────────────────┤
807 │ │ │ │
808 │size │ maximum number of │ unsigned integer │
809 │ │ elements in the map │ (64 bit) │
810 ├─────────┼─────────────────────┼─────────────────────┤
811 │ │ │ │
812 │policy │ map policy │ string: performance │
813 │ │ │ [default], memory │
814 └─────────┴─────────────────────┴─────────────────────┘
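
A named map and a rule that looks up its key could be sketched as
follows. The map name port_to_host, the ports and the addresses are
hypothetical; the map type is written as key type, colon, data type.

```nft
table ip nat {
    # Maps a destination port (key) to an internal address (data).
    map port_to_host {
        type inet_service : ipv4_addr
        elements = { 80 : 192.168.1.10, 443 : 192.168.1.11 }
    }

    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;

        # Redirect to whichever address the map holds for this port.
        dnat to tcp dport map @port_to_host
    }
}
```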
815
ELEMENTS
{add | create | delete | get } element [family] table set { ELEMENT[, ...] }

ELEMENT := key_expression OPTIONS [: value_expression]
OPTIONS := [timeout TIMESPEC] [expires TIMESPEC] [comment string]
TIMESPEC := [numd][numh][numm][num[s]]

Element-related commands allow changing the contents of named sets and
maps. key_expression is typically a value matching the set type.
value_expression is not allowed in sets but mandatory when adding to
maps, where it matches the data part in its type definition. When
deleting from maps, it may be specified but is optional as
key_expression uniquely identifies the element.

The create command is similar to add, with the exception that none of
the listed elements may already exist.

The get command is useful to check if an element is contained in a set,
which may be non-trivial in very large and/or interval sets. In the
latter case, the containing interval is returned instead of just the
element itself.
837
838 Table 10. Element options
839 ┌────────┬───────────────────────────┐
840 │Option │ Description │
841 ├────────┼───────────────────────────┤
842 │ │ │
843 │timeout │ timeout value for │
844 │ │ sets/maps with flag │
845 │ │ timeout │
846 ├────────┼───────────────────────────┤
847 │ │ │
848 │expires │ the time until given │
849 │ │ element expires, useful │
850 │ │ for ruleset replication │
851 │ │ only │
852 ├────────┼───────────────────────────┤
853 │ │ │
854 │comment │ per element comment field │
855 └────────┴───────────────────────────┘
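
The element commands might be used as in the sketch below. It declares
its own hypothetical set and map so that it is self-contained; all
names, addresses and the 30 minute timeout are illustrative.

```nft
table inet demo {
    set blocked {
        type ipv4_addr
        flags timeout
    }
    map port_to_host {
        type inet_service : ipv4_addr
    }
}

# Set element with per-element options after the key.
add element inet demo blocked { 192.0.2.10 timeout 30m comment "temporary" }

# Map element: the value after the colon is mandatory when adding.
add element inet demo port_to_host { 8080 : 192.168.1.12 }

# Check whether an element is present in the set.
get element inet demo blocked { 192.0.2.10 }

# When deleting from a map, the key alone identifies the element.
delete element inet demo port_to_host { 8080 }
```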
856
FLOWTABLES
{add | create} flowtable [family] table flowtable { hook hook priority priority ; devices = { device[, ...] } ; }
list flowtables [family]
{delete | list} flowtable [family] table flowtable
delete flowtable [family] table handle handle

Flowtables allow you to accelerate packet forwarding in software.
Flowtable entries are represented through a tuple that is composed of
the input interface, source and destination address, source and
destination port, and layer 3/4 protocols. Each entry also caches the
destination interface and the gateway address, used to update the
destination link-layer address, to forward packets. The ttl and
hoplimit fields are also decremented. Hence, flowtables provide an
alternative path that allows packets to bypass the classic forwarding
path. Flowtables reside in the ingress hook that is located before the
prerouting hook. You can select which flows you want to offload through
the flow expression from the forward chain. Flowtables are identified
by their address family and their name. The address family must be one
of ip, ip6, or inet. The inet address family is a dummy family which is
used to create hybrid IPv4/IPv6 tables. When no address family is
specified, ip is used by default.

The priority can be a signed integer or filter which stands for 0.
Addition and subtraction can be used to set relative priority, e.g.
filter + 5 equals 5.
882
883
884 add Add a new flowtable for
885 the given family with the
886 given name.
887
888 delete Delete the specified
889 flowtable.
890
891 list List all flowtables.
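
A flowtable and a forward chain rule that offloads flows into it could
be sketched as follows. The names demo and ft and the devices eth0 and
eth1 are assumptions; the devices must exist on the target system.

```nft
table inet demo {
    flowtable ft {
        hook ingress priority filter
        devices = { eth0, eth1 }
    }

    chain forward {
        type filter hook forward priority filter; policy accept;

        # Offload TCP and UDP flows into the flowtable declared above.
        meta l4proto { tcp, udp } flow add @ft
    }
}
```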
892
893
LISTING
list { secmarks | synproxys | flow tables | meters | hooks } [family]
list { secmarks | synproxys | flow tables | meters | hooks } table [family] table
list ct { timeout | expectation | helper | helpers } table [family] table
898
899 Inspect configured objects. list hooks shows the full hook pipeline,
900 including those registered by kernel modules, such as nf_conntrack.
901
STATEFUL OBJECTS
{add | delete | list | reset} type [family] table object
delete type [family] table handle handle
list counters [family]
list quotas [family]
list limits [family]

Stateful objects are attached to tables and are identified by a unique
name. They group stateful information from rules. To reference them in
rules, the keywords "type name" are used, e.g. "counter name".
912
913
914 add Add a new stateful object
915 in the specified table.
916
917 delete Delete the specified
918 object.
919
920 list Display stateful
921 information the object
922 holds.
923
924 reset List-and-reset stateful
925 object.
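
A short sketch of a named stateful object and a rule that references it, assuming a table called filter in the inet family and a hypothetical counter named https_traffic:

```nft
table inet filter {
    # named counter object, starting at zero
    counter https_traffic {
        packets 0 bytes 0
    }

    chain input {
        type filter hook input priority filter; policy accept;
        # reference the object with the keywords "counter name"
        tcp dport 443 counter name "https_traffic"
    }
}
```

Following the synopsis above, nft list counter inet filter https_traffic would display the current values, and nft reset counter inet filter https_traffic would display and then reset them.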
926
927
928 CT HELPER
929 ct helper helper { type type protocol protocol ; [l3proto family ;] }
930
931 Ct helper is used to define connection tracking helpers that can then
932 be used in combination with the ct helper set statement. type and
933 protocol are mandatory, l3proto is derived from the table family by
934 default, i.e. in the inet table the kernel will try to load both the
935 ipv4 and ipv6 helper backends, if they are supported by the kernel.
936
937 Table 11. conntrack helper specifications
938 ┌─────────┬─────────────────────┬─────────────────────┐
939 │Keyword │ Description │ Type │
940 ├─────────┼─────────────────────┼─────────────────────┤
941 │ │ │ │
942 │type │ name of helper type │ quoted string (e.g. │
943 │ │ │ "ftp") │
944 ├─────────┼─────────────────────┼─────────────────────┤
945 │ │ │ │
946 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
947 │ │ the helper │ │
948 ├─────────┼─────────────────────┼─────────────────────┤
949 │ │ │ │
950 │l3proto │ layer 3 protocol of │ address family │
951 │ │ the helper │ (e.g. ip) │
952 └─────────┴─────────────────────┴─────────────────────┘
953
954 Defining and assigning an FTP helper.
955
956 Unlike iptables, helper assignment needs to be performed after the conntrack
957 lookup has completed, for example with the default 0 hook priority.
958
959 table inet myhelpers {
960 ct helper ftp-standard {
961 type "ftp" protocol tcp
962 }
963 chain prerouting {
964 type filter hook prerouting priority filter;
965 tcp dport 21 ct helper set "ftp-standard"
966 }
967 }
968
969
970 CT TIMEOUT
971 ct timeout name { protocol protocol ; policy = { state: value [, ...] } ; [l3proto family ;] }
972
973 Ct timeout is used to update connection tracking timeout values. Timeout
974 policies are assigned with the ct timeout set statement. protocol and
975 policy are mandatory, l3proto is derived from the table family by
976 default.
977
978 Table 12. conntrack timeout specifications
979 ┌─────────┬─────────────────────┬──────────────────┐
980 │Keyword │ Description │ Type │
981 ├─────────┼─────────────────────┼──────────────────┤
982 │ │ │ │
983 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
984 │ │ the timeout object │ │
985 ├─────────┼─────────────────────┼──────────────────┤
986 │ │ │ │
987 │state │ connection state │ string (e.g. │
988 │ │ name │ "established") │
989 ├─────────┼─────────────────────┼──────────────────┤
990 │ │ │ │
991 │value │ timeout value for │ unsigned integer │
992 │ │ connection state │ │
993 ├─────────┼─────────────────────┼──────────────────┤
994 │ │ │ │
995 │l3proto │ layer 3 protocol of │ address family │
996 │ │ the timeout object │ (e.g. ip) │
997 └─────────┴─────────────────────┴──────────────────┘
998
999 Defining and assigning a ct timeout policy.
1000
1001 table ip filter {
1002 ct timeout customtimeout {
1003 protocol tcp;
1004 l3proto ip
1005 policy = { established: 120, close: 20 }
1006 }
1007
1008 chain output {
1009 type filter hook output priority filter; policy accept;
1010 ct timeout set "customtimeout"
1011 }
1012 }
1013
1014 Testing the updated timeout policy.
1015
1016 % conntrack -E
1017
1018 It should display:
1019
1020 [UPDATE] tcp 6 120 ESTABLISHED src=172.16.19.128 dst=172.16.19.1
1021 sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128
1022 sport=41360 dport=22
1023
1024
1025 CT EXPECTATION
1026 ct expectation name { protocol protocol ; dport dport ; timeout timeout ; size size ; [l3proto family ;] }
1027
1028 Ct expectation is used to create connection expectations. Expectations
1029 are assigned with the ct expectation set statement. protocol, dport,
1030 timeout and size are mandatory, l3proto is derived from the table
1031 family by default.
1032
1033 Table 13. conntrack expectation specifications
1034 ┌─────────┬─────────────────────┬──────────────────┐
1035 │Keyword │ Description │ Type │
1036 ├─────────┼─────────────────────┼──────────────────┤
1037 │ │ │ │
1038 │protocol │ layer 4 protocol of │ string (e.g. udp) │
1039 │ │ the expectation │ │
1040 │ │ object │ │
1041 ├─────────┼─────────────────────┼──────────────────┤
1042 │ │ │ │
1043 │dport │ destination port of │ unsigned integer │
1044 │ │ expected connection │ │
1045 ├─────────┼─────────────────────┼──────────────────┤
1046 │ │ │ │
1047 │timeout │ timeout value for │ unsigned integer │
1048 │ │ expectation │ │
1049 ├─────────┼─────────────────────┼──────────────────┤
1050 │ │ │ │
1051 │size │ size value for │ unsigned integer │
1052 │ │ expectation │ │
1053 ├─────────┼─────────────────────┼──────────────────┤
1054 │ │ │ │
1055 │l3proto │ layer 3 protocol of │ address family │
1056 │ │ the expectation │ (e.g. ip) │
1057 │ │ object │ │
1058 └─────────┴─────────────────────┴──────────────────┘
1059
1060 Defining and assigning a ct expectation policy.
1061
1062 table ip filter {
1063 ct expectation expect {
1064 protocol udp
1065 dport 9876
1066 timeout 2m
1067 size 8
1068 l3proto ip
1069 }
1070
1071 chain input {
1072 type filter hook input priority filter; policy accept;
1073 ct expectation set "expect"
1074 }
1075 }
1076
1077
1078 COUNTER
1079 counter [packets bytes]
1080
1081 Table 14. Counter specifications
1082 ┌────────┬──────────────────┬──────────────────┐
1083 │Keyword │ Description │ Type │
1084 ├────────┼──────────────────┼──────────────────┤
1085 │ │ │ │
1086 │packets │ initial count of │ unsigned integer │
1087 │ │ packets │ (64 bit) │
1088 ├────────┼──────────────────┼──────────────────┤
1089 │ │ │ │
1090 │bytes │ initial count of │ unsigned integer │
1091 │ │ bytes │ (64 bit) │
1092 └────────┴──────────────────┴──────────────────┘
1093
1094 QUOTA
1095 quota [over | until] [used]
1096
1097 Table 15. Quota specifications
1098 ┌────────┬───────────────────┬────────────────────┐
1099 │Keyword │ Description │ Type │
1100 ├────────┼───────────────────┼────────────────────┤
1101 │ │ │ │
1102 │quota │ quota limit, used │ Two arguments, │
1103 │ │ as the quota name │ unsigned integer │
1104 │ │ │ (64 bit) and │
1105 │ │ │ string: bytes, │
1106 │ │ │ kbytes, mbytes. │
1107 │ │ │ "over" and "until" │
1108 │ │ │ go before these │
1109 │ │ │ arguments │
1110 ├────────┼───────────────────┼────────────────────┤
1111 │ │ │ │
1112 │used │ initial value of │ Two arguments, │
1113 │ │ used quota │ unsigned integer │
1114 │ │ │ (64 bit) and │
1115 │ │ │ string: bytes, │
1116 │ │ │ kbytes, mbytes │
1117 └────────┴───────────────────┴────────────────────┘
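
The following sketch shows how the quota keywords fit together in a named quota object. The object name upload_limit, the limit of 10 mbytes and the port are assumptions for illustration.

```nft
table inet filter {
    # quota object that matches once more than 10 mbytes have been counted
    quota upload_limit {
        over 10 mbytes
    }

    chain output {
        type filter hook output priority filter; policy accept;
        # drop traffic to port 443 after the quota has been exceeded
        tcp dport 443 quota name "upload_limit" drop
    }
}
```

With until in place of over, the quota would instead be expected to match for as long as the limit has not been reached.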
1118
1120 Expressions represent values, either constants like network addresses,
1121 port numbers, etc., or data gathered from the packet during ruleset
1122 evaluation. Expressions can be combined using binary, logical,
1123 relational and other types of expressions to form complex or relational
1124 (match) expressions. They are also used as arguments to certain types
1125 of operations, like NAT, packet marking etc.
1126
1127 Each expression has a data type, which determines the size, parsing and
1128 representation of symbolic values and type compatibility with other
1129 expressions.
1130
1131 DESCRIBE COMMAND
1132 describe expression | data type
1133
1134 The describe command shows information about the type of an expression
1135 and its data type. A data type may also be given, in which case nft will
1136 display more information about the type.
1137
1138 The describe command.
1139
1140 $ nft describe tcp flags
1141 payload expression, datatype tcp_flag (TCP flag) (basetype bitmask, integer), 8 bits
1142
1143 predefined symbolic constants:
1144 fin 0x01
1145 syn 0x02
1146 rst 0x04
1147 psh 0x08
1148 ack 0x10
1149 urg 0x20
1150 ecn 0x40
1151 cwr 0x80
1152
1153
1155 Data types determine the size, parsing and representation of symbolic
1156 values and type compatibility of expressions. A number of global data
1157 types exist; in addition, some expression types define further data
1158 types specific to the expression type. Most data types have a fixed
1159 size; some, however, may have a dynamic size, e.g. the string type. Some
1160 types also have predefined symbolic constants. Those can be listed
1161 using the nft describe command:
1162
1163 $ nft describe ct_state
1164 datatype ct_state (conntrack state) (basetype bitmask, integer), 32 bits
1165
1166 pre-defined symbolic constants (in hexadecimal):
1167 invalid 0x00000001
1168 new ...
1169
1170 Types may be derived from lower order types, e.g. the IPv4 address type
1171 is derived from the integer type, meaning an IPv4 address can also be
1172 specified as an integer value.
1173
1174 In certain contexts (set and map definitions), it is necessary to
1175 explicitly specify a data type. Each type has a name which is used for
1176 this.
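
As an illustration of naming a data type explicitly, the sketch below declares a set of IPv4 addresses using the type name ipv4_addr. The set name blocked_hosts and the addresses (taken from documentation ranges) are assumptions.

```nft
table ip filter {
    # the element type must be named explicitly in a set definition
    set blocked_hosts {
        type ipv4_addr
        elements = { 192.0.2.1, 198.51.100.7 }
    }

    chain input {
        type filter hook input priority filter; policy accept;
        # look up the source address in the set
        ip saddr @blocked_hosts drop
    }
}
```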
1177
1178 INTEGER TYPE
1179 ┌────────┬─────────┬──────────┬───────────┐
1180 │Name │ Keyword │ Size │ Base type │
1181 ├────────┼─────────┼──────────┼───────────┤
1182 │ │ │ │ │
1183 │Integer │ integer │ variable │ - │
1184 └────────┴─────────┴──────────┴───────────┘
1185
1186 The integer type is used for numeric values. It may be specified as a
1187 decimal, hexadecimal or octal number. The integer type does not have a
1188 fixed size, its size is determined by the expression for which it is
1189 used.
1190
1191 BITMASK TYPE
1192 ┌────────┬─────────┬──────────┬───────────┐
1193 │Name │ Keyword │ Size │ Base type │
1194 ├────────┼─────────┼──────────┼───────────┤
1195 │ │ │ │ │
1196 │Bitmask │ bitmask │ variable │ integer │
1197 └────────┴─────────┴──────────┴───────────┘
1198
1199 The bitmask type (bitmask) is used for bitmasks.
1200
1201 STRING TYPE
1202 ┌───────┬─────────┬──────────┬───────────┐
1203 │Name │ Keyword │ Size │ Base type │
1204 ├───────┼─────────┼──────────┼───────────┤
1205 │ │ │ │ │
1206 │String │ string │ variable │ - │
1207 └───────┴─────────┴──────────┴───────────┘
1208
1209 The string type is used for character strings. A string begins with an
1210 alphabetic character (a-zA-Z) followed by zero or more alphanumeric
1211 characters or the characters /, -, _ and .. In addition, anything
1212 enclosed in double quotes (") is recognized as a string.
1213
1214 String specification.
1215
1216 # Interface name
1217 filter input iifname eth0
1218
1219 # Weird interface name
1220 filter input iifname "(eth0)"
1221
1222
1223 LINK LAYER ADDRESS TYPE
1224 ┌───────────┬─────────┬──────────┬───────────┐
1225 │Name │ Keyword │ Size │ Base type │
1226 ├───────────┼─────────┼──────────┼───────────┤
1227 │ │ │ │ │
1228 │Link layer │ lladdr │ variable │ integer │
1229 │address │ │ │ │
1230 └───────────┴─────────┴──────────┴───────────┘
1231
1232 The link layer address type is used for link layer addresses. Link
1233 layer addresses are specified as a variable number of groups of two
1234 hexadecimal digits separated using colons (:).
1235
1236 Link layer address specification.
1237
1238 # Ethernet destination MAC address
1239 filter input ether daddr 20:c9:d0:43:12:d9
1240
1241
1242 IPV4 ADDRESS TYPE
1243 ┌─────────────┬───────────┬────────┬───────────┐
1244 │Name │ Keyword │ Size │ Base type │
1245 ├─────────────┼───────────┼────────┼───────────┤
1246 │ │ │ │ │
1247 │IPv4 address │ ipv4_addr │ 32 bit │ integer │
1248 └─────────────┴───────────┴────────┴───────────┘
1249
1250 The IPv4 address type is used for IPv4 addresses. Addresses are
1251 specified in either dotted decimal, dotted hexadecimal, dotted octal,
1252 decimal, hexadecimal, octal notation or as a host name. A host name
1253 will be resolved using the standard system resolver.
1254
1255 IPv4 address specification.
1256
1257 # dotted decimal notation
1258 filter output ip daddr 127.0.0.1
1259
1260 # host name
1261 filter output ip daddr localhost
1262
1263
1264 IPV6 ADDRESS TYPE
1265 ┌─────────────┬───────────┬─────────┬───────────┐
1266 │Name │ Keyword │ Size │ Base type │
1267 ├─────────────┼───────────┼─────────┼───────────┤
1268 │ │ │ │ │
1269 │IPv6 address │ ipv6_addr │ 128 bit │ integer │
1270 └─────────────┴───────────┴─────────┴───────────┘
1271
1272 The IPv6 address type is used for IPv6 addresses. Addresses are
1273 specified as a host name or as hexadecimal halfwords separated by
1274 colons. Addresses might be enclosed in square brackets ("[]") to
1275 differentiate them from port numbers.
1276
1277 IPv6 address specification.
1278
1279 # abbreviated loopback address
1280 filter output ip6 daddr ::1
1281
1282 IPv6 address specification with bracket notation.
1283
1284 # without [] the port number (22) would be parsed as part of the
1285 # ipv6 address
1286 ip6 nat prerouting tcp dport 2222 dnat to [1ce::d0]:22
1287
1288
1289 BOOLEAN TYPE
1290 ┌────────┬─────────┬───────┬───────────┐
1291 │Name │ Keyword │ Size │ Base type │
1292 ├────────┼─────────┼───────┼───────────┤
1293 │ │ │ │ │
1294 │Boolean │ boolean │ 1 bit │ integer │
1295 └────────┴─────────┴───────┴───────────┘
1296
1297 The boolean type is a syntactical helper type in userspace. Its use is
1298 in the right-hand side of a (typically implicit) relational expression
1299 to change the expression on the left-hand side into a boolean check
1300 (usually for existence).
1301
1302 Table 16. The following keywords will automatically resolve into a
1303 boolean type with given value
1304 ┌────────┬───────┐
1305 │Keyword │ Value │
1306 ├────────┼───────┤
1307 │ │ │
1308 │exists │ 1 │
1309 ├────────┼───────┤
1310 │ │ │
1311 │missing │ 0 │
1312 └────────┴───────┘
1313
1314 Table 17. Expressions that support a boolean comparison
1315 ┌───────────┬─────────────────────────┐
1316 │Expression │ Behaviour │
1317 ├───────────┼─────────────────────────┤
1318 │ │ │
1319 │fib │ Check route existence. │
1320 ├───────────┼─────────────────────────┤
1321 │ │ │
1322 │exthdr │ Check IPv6 extension │
1323 │ │ header existence. │
1324 ├───────────┼─────────────────────────┤
1325 │ │ │
1326 │tcp option │ Check TCP option header │
1327 │ │ existence. │
1328 └───────────┴─────────────────────────┘
1329
1330 Boolean specification.
1331
1332 # match if route exists
1333 filter input fib daddr . iif oif exists
1334
1335 # match only non-fragmented packets in IPv6 traffic
1336 filter input exthdr frag missing
1337
1338 # match if TCP timestamp option is present
1339 filter input tcp option timestamp exists
1340
1341
1342 ICMP TYPE TYPE
1343 ┌──────────┬───────────┬───────┬───────────┐
1344 │Name │ Keyword │ Size │ Base type │
1345 ├──────────┼───────────┼───────┼───────────┤
1346 │ │ │ │ │
1347 │ICMP Type │ icmp_type │ 8 bit │ integer │
1348 └──────────┴───────────┴───────┴───────────┘
1349
1350 The ICMP Type type is used to conveniently specify the ICMP header’s
1351 type field.
1352
1353 Table 18. Keywords may be used when specifying the ICMP type
1354 ┌────────────────────────┬───────┐
1355 │Keyword │ Value │
1356 ├────────────────────────┼───────┤
1357 │ │ │
1358 │echo-reply │ 0 │
1359 ├────────────────────────┼───────┤
1360 │ │ │
1361 │destination-unreachable │ 3 │
1362 ├────────────────────────┼───────┤
1363 │ │ │
1364 │source-quench │ 4 │
1365 ├────────────────────────┼───────┤
1366 │ │ │
1367 │redirect │ 5 │
1368 ├────────────────────────┼───────┤
1369 │ │ │
1370 │echo-request │ 8 │
1371 ├────────────────────────┼───────┤
1372 │ │ │
1373 │router-advertisement │ 9 │
1374 ├────────────────────────┼───────┤
1375 │ │ │
1376 │router-solicitation │ 10 │
1377 ├────────────────────────┼───────┤
1378 │ │ │
1379 │time-exceeded │ 11 │
1380 ├────────────────────────┼───────┤
1381 │ │ │
1382 │parameter-problem │ 12 │
1383 ├────────────────────────┼───────┤
1384 │ │ │
1385 │timestamp-request │ 13 │
1386 ├────────────────────────┼───────┤
1387 │ │ │
1388 │timestamp-reply │ 14 │
1389 ├────────────────────────┼───────┤
1390 │ │ │
1391 │info-request │ 15 │
1392 ├────────────────────────┼───────┤
1393 │ │ │
1394 │info-reply │ 16 │
1395 ├────────────────────────┼───────┤
1396 │ │ │
1397 │address-mask-request │ 17 │
1398 ├────────────────────────┼───────┤
1399 │ │ │
1400 │address-mask-reply │ 18 │
1401 └────────────────────────┴───────┘
1402
1403 ICMP Type specification.
1404
1405 # match ping packets
1406 filter output icmp type { echo-request, echo-reply }
1407
1408
1409 ICMP CODE TYPE
1410 ┌──────────┬───────────┬───────┬───────────┐
1411 │Name │ Keyword │ Size │ Base type │
1412 ├──────────┼───────────┼───────┼───────────┤
1413 │ │ │ │ │
1414 │ICMP Code │ icmp_code │ 8 bit │ integer │
1415 └──────────┴───────────┴───────┴───────────┘
1416
1417 The ICMP Code type is used to conveniently specify the ICMP header’s
1418 code field.
1419
1420 Table 19. Keywords may be used when specifying the ICMP code
1421 ┌─────────────────┬───────┐
1422 │Keyword │ Value │
1423 ├─────────────────┼───────┤
1424 │ │ │
1425 │net-unreachable │ 0 │
1426 ├─────────────────┼───────┤
1427 │ │ │
1428 │host-unreachable │ 1 │
1429 ├─────────────────┼───────┤
1430 │ │ │
1431 │prot-unreachable │ 2 │
1432 ├─────────────────┼───────┤
1433 │ │ │
1434 │port-unreachable │ 3 │
1435 ├─────────────────┼───────┤
1436 │ │ │
1437 │frag-needed │ 4 │
1438 ├─────────────────┼───────┤
1439 │ │ │
1440 │net-prohibited │ 9 │
1441 ├─────────────────┼───────┤
1442 │ │ │
1443 │host-prohibited │ 10 │
1444 ├─────────────────┼───────┤
1445 │ │ │
1446 │admin-prohibited │ 13 │
1447 └─────────────────┴───────┘
1448
1449 ICMPV6 TYPE TYPE
1450 ┌────────────┬────────────┬───────┬───────────┐
1451 │Name │ Keyword │ Size │ Base type │
1452 ├────────────┼────────────┼───────┼───────────┤
1453 │ │ │ │ │
1454 │ICMPv6 Type │ icmpv6_type │ 8 bit │ integer │
1455 └────────────┴────────────┴───────┴───────────┘
1456
1457 The ICMPv6 Type type is used to conveniently specify the ICMPv6
1458 header’s type field.
1459
1460 Table 20. Keywords may be used when specifying the ICMPv6 type
1461 ┌────────────────────────┬───────┐
1462 │Keyword │ Value │
1463 ├────────────────────────┼───────┤
1464 │ │ │
1465 │destination-unreachable │ 1 │
1466 ├────────────────────────┼───────┤
1467 │ │ │
1468 │packet-too-big │ 2 │
1469 ├────────────────────────┼───────┤
1470 │ │ │
1471 │time-exceeded │ 3 │
1472 ├────────────────────────┼───────┤
1473 │ │ │
1474 │parameter-problem │ 4 │
1475 ├────────────────────────┼───────┤
1476 │ │ │
1477 │echo-request │ 128 │
1478 ├────────────────────────┼───────┤
1479 │ │ │
1480 │echo-reply │ 129 │
1481 ├────────────────────────┼───────┤
1482 │ │ │
1483 │mld-listener-query │ 130 │
1484 ├────────────────────────┼───────┤
1485 │ │ │
1486 │mld-listener-report │ 131 │
1487 ├────────────────────────┼───────┤
1488 │ │ │
1489 │mld-listener-done │ 132 │
1490 ├────────────────────────┼───────┤
1491 │ │ │
1492 │mld-listener-reduction │ 132 │
1493 ├────────────────────────┼───────┤
1494 │ │ │
1495 │nd-router-solicit │ 133 │
1496 ├────────────────────────┼───────┤
1497 │ │ │
1498 │nd-router-advert │ 134 │
1499 ├────────────────────────┼───────┤
1500 │ │ │
1501 │nd-neighbor-solicit │ 135 │
1502 ├────────────────────────┼───────┤
1503 │ │ │
1504 │nd-neighbor-advert │ 136 │
1505 ├────────────────────────┼───────┤
1506 │ │ │
1507 │nd-redirect │ 137 │
1508 ├────────────────────────┼───────┤
1509 │ │ │
1510 │router-renumbering │ 138 │
1511 ├────────────────────────┼───────┤
1512 │ │ │
1513 │ind-neighbor-solicit │ 141 │
1514 ├────────────────────────┼───────┤
1515 │ │ │
1516 │ind-neighbor-advert │ 142 │
1517 ├────────────────────────┼───────┤
1518 │ │ │
1519 │mld2-listener-report │ 143 │
1520 └────────────────────────┴───────┘
1521
1522 ICMPv6 Type specification.
1523
1524 # match ICMPv6 ping packets
1525 filter output icmpv6 type { echo-request, echo-reply }
1526
1527
1528 ICMPV6 CODE TYPE
1529 ┌────────────┬─────────────┬───────┬───────────┐
1530 │Name │ Keyword │ Size │ Base type │
1531 ├────────────┼─────────────┼───────┼───────────┤
1532 │ │ │ │ │
1533 │ICMPv6 Code │ icmpv6_code │ 8 bit │ integer │
1534 └────────────┴─────────────┴───────┴───────────┘
1535
1536 The ICMPv6 Code type is used to conveniently specify the ICMPv6
1537 header’s code field.
1538
1539 Table 21. keywords may be used when specifying the ICMPv6 code
1540 ┌─────────────────┬───────┐
1541 │Keyword │ Value │
1542 ├─────────────────┼───────┤
1543 │ │ │
1544 │no-route │ 0 │
1545 ├─────────────────┼───────┤
1546 │ │ │
1547 │admin-prohibited │ 1 │
1548 ├─────────────────┼───────┤
1549 │ │ │
1550 │addr-unreachable │ 3 │
1551 ├─────────────────┼───────┤
1552 │ │ │
1553 │port-unreachable │ 4 │
1554 ├─────────────────┼───────┤
1555 │ │ │
1556 │policy-fail │ 5 │
1557 ├─────────────────┼───────┤
1558 │ │ │
1559 │reject-route │ 6 │
1560 └─────────────────┴───────┘
1561
1562 ICMPVX CODE TYPE
1563 ┌────────────┬─────────────┬───────┬───────────┐
1564 │Name │ Keyword │ Size │ Base type │
1565 ├────────────┼─────────────┼───────┼───────────┤
1566 │ │ │ │ │
1567 │ICMPvX Code │ icmpx_code │ 8 bit │ integer │
1568 └────────────┴─────────────┴───────┴───────────┘
1569
1570 The ICMPvX Code type abstraction is a set of values which overlap
1571 between ICMP and ICMPv6 Code types to be used from the inet family.
1572
1573 Table 22. keywords may be used when specifying the ICMPvX code
1574 ┌─────────────────┬───────┐
1575 │Keyword │ Value │
1576 ├─────────────────┼───────┤
1577 │ │ │
1578 │no-route │ 0 │
1579 ├─────────────────┼───────┤
1580 │ │ │
1581 │port-unreachable │ 1 │
1582 ├─────────────────┼───────┤
1583 │ │ │
1584 │host-unreachable │ 2 │
1585 ├─────────────────┼───────┤
1586 │ │ │
1587 │admin-prohibited │ 3 │
1588 └─────────────────┴───────┘
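
A hedged sketch of one place where the ICMPvX code keywords are commonly used: a reject statement in a table of the inet family. The table and chain names and the port are assumptions.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # one rule covers both families: the ICMP or ICMPv6 code that
        # corresponds to the keyword is used, depending on the packet
        tcp dport 23 reject with icmpx type admin-prohibited
    }
}
```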
1589
1590 CONNTRACK TYPES
1591 Table 23. overview of types used in ct expression and statement
1592 ┌─────────────────┬───────────┬─────────┬───────────┐
1593 │Name │ Keyword │ Size │ Base type │
1594 ├─────────────────┼───────────┼─────────┼───────────┤
1595 │ │ │ │ │
1596 │conntrack state │ ct_state │ 4 byte │ bitmask │
1597 ├─────────────────┼───────────┼─────────┼───────────┤
1598 │ │ │ │ │
1599 │conntrack │ ct_dir │ 8 bit │ integer │
1600 │direction │ │ │ │
1601 ├─────────────────┼───────────┼─────────┼───────────┤
1602 │ │ │ │ │
1603 │conntrack status │ ct_status │ 4 byte │ bitmask │
1604 ├─────────────────┼───────────┼─────────┼───────────┤
1605 │ │ │ │ │
1606 │conntrack event │ ct_event │ 4 byte │ bitmask │
1607 │bits │ │ │ │
1608 ├─────────────────┼───────────┼─────────┼───────────┤
1609 │ │ │ │ │
1610 │conntrack label │ ct_label │ 128 bit │ bitmask │
1611 └─────────────────┴───────────┴─────────┴───────────┘
1612
1613 For each of the types above, keywords are available for convenience:
1614
1615 Table 24. conntrack state (ct_state)
1616 ┌────────────┬───────┐
1617 │Keyword │ Value │
1618 ├────────────┼───────┤
1619 │ │ │
1620 │invalid │ 1 │
1621 ├────────────┼───────┤
1622 │ │ │
1623 │established │ 2 │
1624 ├────────────┼───────┤
1625 │ │ │
1626 │related │ 4 │
1627 ├────────────┼───────┤
1628 │ │ │
1629 │new │ 8 │
1630 ├────────────┼───────┤
1631 │ │ │
1632 │untracked │ 64 │
1633 └────────────┴───────┘
1634
1635 Table 25. conntrack direction (ct_dir)
1636 ┌─────────┬───────┐
1637 │Keyword │ Value │
1638 ├─────────┼───────┤
1639 │ │ │
1640 │original │ 0 │
1641 ├─────────┼───────┤
1642 │ │ │
1643 │reply │ 1 │
1644 └─────────┴───────┘
1645
1646 Table 26. conntrack status (ct_status)
1647 ┌───────────┬───────┐
1648 │Keyword │ Value │
1649 ├───────────┼───────┤
1650 │ │ │
1651 │expected │ 1 │
1652 ├───────────┼───────┤
1653 │ │ │
1654 │seen-reply │ 2 │
1655 ├───────────┼───────┤
1656 │ │ │
1657 │assured │ 4 │
1658 ├───────────┼───────┤
1659 │ │ │
1660 │confirmed │ 8 │
1661 ├───────────┼───────┤
1662 │ │ │
1663 │snat │ 16 │
1664 ├───────────┼───────┤
1665 │ │ │
1666 │dnat │ 32 │
1667 ├───────────┼───────┤
1668 │ │ │
1669 │dying │ 512 │
1670 └───────────┴───────┘
1671
1672 Table 27. conntrack event bits (ct_event)
1673 ┌──────────┬───────┐
1674 │Keyword │ Value │
1675 ├──────────┼───────┤
1676 │ │ │
1677 │new │ 1 │
1678 ├──────────┼───────┤
1679 │ │ │
1680 │related │ 2 │
1681 ├──────────┼───────┤
1682 │ │ │
1683 │destroy │ 4 │
1684 ├──────────┼───────┤
1685 │ │ │
1686 │reply │ 8 │
1687 ├──────────┼───────┤
1688 │ │ │
1689 │assured │ 16 │
1690 ├──────────┼───────┤
1691 │ │ │
1692 │protoinfo │ 32 │
1693 ├──────────┼───────┤
1694 │ │ │
1695 │helper │ 64 │
1696 ├──────────┼───────┤
1697 │ │ │
1698 │mark │ 128 │
1699 ├──────────┼───────┤
1700 │ │ │
1701 │seqadj │ 256 │
1702 ├──────────┼───────┤
1703 │ │ │
1704 │secmark │ 512 │
1705 ├──────────┼───────┤
1706 │ │ │
1707 │label │ 1024 │
1708 └──────────┴───────┘
1709
1710 Possible keywords for conntrack label type (ct_label) are read at
1711 runtime from /etc/connlabel.conf.
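
The keywords from the tables above can be used directly in ct expressions. The following is a minimal sketch, with table and chain names assumed for illustration:

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # drop packets that conntrack could not associate with a connection
        ct state invalid drop
        # bitmask keywords can be combined with a comma
        ct state established,related accept
        # match connections that were subject to destination NAT
        ct status dnat accept
    }
}
```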
1712
1713 DCCP PKTTYPE TYPE
1714 ┌─────────────────┬──────────────┬───────┬───────────┐
1715 │Name │ Keyword │ Size │ Base type │
1716 ├─────────────────┼──────────────┼───────┼───────────┤
1717 │ │ │ │ │
1718 │DCCP packet type │ dccp_pkttype │ 4 bit │ integer │
1719 └─────────────────┴──────────────┴───────┴───────────┘
1720
1721 The DCCP packet type abstracts the different legal values of the
1722 respective four bit field in the DCCP header, as stated by RFC4340.
1723 Note that possible values 10-15 are considered reserved and therefore
1724 not allowed to be used. In iptables' dccp match, these values are
1725 aliased INVALID. With nftables, one may simply match on the numeric
1726 value range, i.e. 10-15.
1727
1728 Table 28. keywords may be used when specifying the DCCP packet type
1729 ┌─────────┬───────┐
1730 │Keyword │ Value │
1731 ├─────────┼───────┤
1732 │ │ │
1733 │request │ 0 │
1734 ├─────────┼───────┤
1735 │ │ │
1736 │response │ 1 │
1737 ├─────────┼───────┤
1738 │ │ │
1739 │data │ 2 │
1740 ├─────────┼───────┤
1741 │ │ │
1742 │ack │ 3 │
1743 ├─────────┼───────┤
1744 │ │ │
1745 │dataack │ 4 │
1746 ├─────────┼───────┤
1747 │ │ │
1748 │closereq │ 5 │
1749 ├─────────┼───────┤
1750 │ │ │
1751 │close │ 6 │
1752 ├─────────┼───────┤
1753 │ │ │
1754 │reset │ 7 │
1755 ├─────────┼───────┤
1756 │ │ │
1757 │sync │ 8 │
1758 ├─────────┼───────┤
1759 │ │ │
1760 │syncack │ 9 │
1761 └─────────┴───────┘
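
A short sketch of matching on the DCCP packet type, both by keyword and by the reserved numeric range mentioned above. The table and chain names are assumptions.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # reserved packet types have no keyword, match the numeric range
        dccp type 10-15 drop
        # symbolic keywords can be listed in an anonymous set
        dccp type { request, response } accept
    }
}
```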
1762
1764 The lowest order expression is a primary expression, representing
1765 either a constant or a single datum from a packet’s payload, meta data
1766 or a stateful module.
1767
1768 META EXPRESSIONS
1769 meta {length | nfproto | l4proto | protocol | priority}
1770 [meta] {mark | iif | iifname | iiftype | oif | oifname | oiftype | skuid | skgid | nftrace | rtclassid | ibrname | obrname | pkttype | cpu | iifgroup | oifgroup | cgroup | random | ipsec | iifkind | oifkind | time | hour | day }
1771
1772 A meta expression refers to meta data associated with a packet.
1773
1774 There are two types of meta expressions: unqualified and qualified meta
1775 expressions. Qualified meta expressions require the meta keyword before
1776 the meta key; unqualified meta expressions can be specified by using
1777 the meta key directly or as qualified meta expressions. Meta l4proto is
1778 useful to match a particular transport protocol that is part of either
1779 an IPv4 or IPv6 packet. It will also skip any IPv6 extension headers
1780 present in an IPv6 packet.
1781
1782 meta iif, oif, iifname and oifname are used to match the interface a
1783 packet arrived on or is about to be sent out on.
1784
1785 iif and oif are used to match on the interface index, whereas iifname
1786 and oifname are used to match on the interface name. The two are not
1787 equivalent. Consider the rule
1788
1789 filter input meta iif "foo"
1790
1791 This rule can only be added if the interface "foo" exists. Also,
1792 the rule will continue to match even if the interface "foo" is renamed
1793 to "bar".
1794
1795 This is because internally the interface index is used. In case of
1796 dynamically created interfaces, such as tun/tap or dialup interfaces
1797 (ppp for example), it might be better to use iifname or oifname
1798 instead.
1799
1800 In these cases, the name is used, so the interface doesn’t have to exist
1801 to add such a rule. The rule will stop matching if the interface gets
1802 renamed, and it will match again if the interface gets deleted and a new
1803 interface with the same name is later created.
1804
1805 Like with iptables, wildcard matching on interface name prefixes is
1806 available for iifname and oifname matches by appending an asterisk (*)
1807 character. Note however that unlike iptables, nftables does not accept
1808 interface names consisting of the wildcard character only; users are
1809 expected to simply omit such always-matching expressions. In order to
1810 match on a literal asterisk character, one may escape it using a backslash
1811 (\).
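
The differences between index-based and name-based interface matching can be sketched as follows. The interface names are assumptions, and the escaped form is shown as it would appear in a ruleset file; a shell may require additional quoting.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # index-based match: lo must exist when the rule is added
        iif "lo" accept
        # name-based prefix match: ppp0, ppp1, ... need not exist yet
        iifname "ppp*" accept
        # escaped asterisk: matches only an interface literally named ppp*
        iifname "ppp\*" accept
    }
}
```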
1812
1813 Table 29. Meta expression types
1814 ┌──────────┬─────────────────────┬─────────────────────┐
1815 │Keyword │ Description │ Type │
1816 ├──────────┼─────────────────────┼─────────────────────┤
1817 │ │ │ │
1818 │length │ Length of the │ integer (32 bit) │
1819 │ │ packet in bytes │ │
1820 ├──────────┼─────────────────────┼─────────────────────┤
1821 │ │ │ │
1822 │nfproto │ real hook protocol │ integer (32 bit) │
1823 │ │ family, useful only │ │
1824 │ │ in inet table │ │
1825 ├──────────┼─────────────────────┼─────────────────────┤
1826 │ │ │ │
1827 │l4proto │ layer 4 protocol, │ integer (8 bit) │
1828 │ │ skips ipv6 │ │
1829 │ │ extension headers │ │
1830 ├──────────┼─────────────────────┼─────────────────────┤
1831 │ │ │ │
1832 │protocol │ EtherType protocol │ ether_type │
1833 │ │ value │ │
1834 ├──────────┼─────────────────────┼─────────────────────┤
1835 │ │ │ │
1836 │priority │ TC packet priority │ tc_handle │
1837 ├──────────┼─────────────────────┼─────────────────────┤
1838 │ │ │ │
1839 │mark │ Packet mark │ mark │
1840 ├──────────┼─────────────────────┼─────────────────────┤
1841 │ │ │ │
1842 │iif │ Input interface │ iface_index │
1843 │ │ index │ │
1844 ├──────────┼─────────────────────┼─────────────────────┤
1845 │ │ │ │
1846 │iifname │ Input interface │ ifname │
1847 │ │ name │ │
1848 ├──────────┼─────────────────────┼─────────────────────┤
1849 │ │ │ │
1850 │iiftype │ Input interface │ iface_type │
1851 │ │ type │ │
1852 ├──────────┼─────────────────────┼─────────────────────┤
1853 │ │ │ │
1854 │oif │ Output interface │ iface_index │
1855 │ │ index │ │
1856 ├──────────┼─────────────────────┼─────────────────────┤
1857 │ │ │ │
1858 │oifname │ Output interface │ ifname │
1859 │ │ name │ │
1860 ├──────────┼─────────────────────┼─────────────────────┤
1861 │ │ │ │
1862 │oiftype │ Output interface │ iface_type │
1863 │ │ hardware type │ │
1864 ├──────────┼─────────────────────┼─────────────────────┤
1865 │ │ │ │
1866 │sdif │ Slave device input │ iface_index │
1867 │ │ interface index │ │
1868 ├──────────┼─────────────────────┼─────────────────────┤
1869 │ │ │ │
1870 │sdifname │ Slave device │ ifname │
1871 │ │ interface name │ │
1872 ├──────────┼─────────────────────┼─────────────────────┤
1873 │ │ │ │
1874 │skuid │ UID associated with │ uid │
1875 │ │ originating socket │ │
1876 ├──────────┼─────────────────────┼─────────────────────┤
1877 │ │ │ │
1878 │skgid │ GID associated with │ gid │
1879 │ │ originating socket │ │
1880 ├──────────┼─────────────────────┼─────────────────────┤
1881 │ │ │ │
1882 │rtclassid │ Routing realm │ realm │
1883 ├──────────┼─────────────────────┼─────────────────────┤
1884 │ │ │ │
1885 │ibrname │ Input bridge │ ifname │
1886 │ │ interface name │ │
1887 ├──────────┼─────────────────────┼─────────────────────┤
1888 │ │ │ │
1889 │obrname │ Output bridge │ ifname │
1890 │ │ interface name │ │
1891 ├──────────┼─────────────────────┼─────────────────────┤
1892 │ │ │ │
1893 │pkttype │ packet type │ pkt_type │
1894 ├──────────┼─────────────────────┼─────────────────────┤
1895 │ │ │ │
1896 │cpu │ cpu number │ integer (32 bit) │
1897 │ │ processing the │ │
1898 │ │ packet │ │
1899 ├──────────┼─────────────────────┼─────────────────────┤
1900 │ │ │ │
1901 │iifgroup │ incoming device │ devgroup │
1902 │ │ group │ │
1903 ├──────────┼─────────────────────┼─────────────────────┤
1904 │ │ │ │
1905 │oifgroup │ outgoing device │ devgroup │
1906 │ │ group │ │
1907 ├──────────┼─────────────────────┼─────────────────────┤
1908 │ │ │ │
1909 │cgroup │ control group id │ integer (32 bit) │
1910 ├──────────┼─────────────────────┼─────────────────────┤
1911 │ │ │ │
1912 │random │ pseudo-random │ integer (32 bit) │
1913 │ │ number │ │
1914 ├──────────┼─────────────────────┼─────────────────────┤
1915 │ │ │ │
1916 │ipsec │ true if packet was │ boolean (1 bit) │
1917 │ │ ipsec encrypted │ │
1918 ├──────────┼─────────────────────┼─────────────────────┤
1919 │ │ │ │
1920 │iifkind │ Input interface │ ifkind │
1921 │ │ kind │ │
1922 ├──────────┼─────────────────────┼─────────────────────┤
1923 │ │ │ │
1924 │oifkind │ Output interface │ ifkind │
1925 │ │ kind │ │
1926 ├──────────┼─────────────────────┼─────────────────────┤
1927 │ │ │ │
1928 │time │ Absolute time of │ Integer (32 bit) or │
1929 │ │ packet reception │ string │
1930 ├──────────┼─────────────────────┼─────────────────────┤
1931 │ │ │ │
1932 │day │ Day of week │ Integer (8 bit) or │
1933 │ │ │ string │
1934 ├──────────┼─────────────────────┼─────────────────────┤
1935 │ │ │ │
1936 │hour │ Hour of day │ String │
1937 └──────────┴─────────────────────┴─────────────────────┘
1938
1939 Table 30. Meta expression specific types
1940 ┌──────────────┬────────────────────────────┐
1941 │Type │ Description │
1942 ├──────────────┼────────────────────────────┤
1943 │ │ │
1944 │iface_index │ Interface index (32 bit │
1945 │ │ number). Can be specified │
1946 │ │ numerically or as name of │
1947 │ │ an existing interface. │
1948 ├──────────────┼────────────────────────────┤
1949 │ │ │
1950 │ifname │ Interface name (16 byte │
1951 │ │ string). Does not have to │
1952 │ │ exist. │
1953 ├──────────────┼────────────────────────────┤
1954 │ │ │
1955 │iface_type │ Interface type (16 bit │
1956 │ │ number). │
1957 ├──────────────┼────────────────────────────┤
1958 │ │ │
1959 │uid │ User ID (32 bit number). │
1960 │ │ Can be specified │
1961 │ │ numerically or as user │
1962 │ │ name. │
1963 ├──────────────┼────────────────────────────┤
1964 │ │ │
1965 │gid │ Group ID (32 bit number). │
1966 │ │ Can be specified │
1967 │ │ numerically or as group │
1968 │ │ name. │
1969 ├──────────────┼────────────────────────────┤
1970 │ │ │
1971 │realm │ Routing Realm (32 bit │
1972 │ │ number). Can be specified │
1973 │ │ numerically or as symbolic │
1974 │ │ name defined in │
1975 │ │ /etc/iproute2/rt_realms. │
1976 ├──────────────┼────────────────────────────┤
1977 │ │ │
1978 │devgroup │ Device group (32 bit │
1979 │ │ number). Can be specified │
1980 │ │ numerically or as symbolic │
1981 │ │ name defined in │
1982 │ │ /etc/iproute2/group. │
1983 ├──────────────┼────────────────────────────┤
1984 │ │ │
1985 │pkt_type │ Packet type: host │
1986 │ │ (addressed to local host), │
1987 │ │ broadcast (to all), │
1988 │ │ multicast (to group), │
1989 │ │ other (addressed to │
1990 │ │ another host). │
1991 ├──────────────┼────────────────────────────┤
1992 │ │ │
1993 │ifkind │ Interface kind (16 byte │
1994 │ │ string). See TYPES in │
1995 │ │ ip-link(8) for a list. │
1996 ├──────────────┼────────────────────────────┤
1997 │ │ │
1998 │time │ Either an integer or a │
1999 │ │ date in ISO format. For │
2000 │ │ example: "2019-06-06 │
2001 │ │ 17:00". Hour and seconds │
2002 │ │ are optional and can be │
2003 │ │ omitted if desired. If │
2004 │ │ omitted, midnight will be │
2005 │ │ assumed. The following │
2006 │ │ three would be equivalent: │
2007 │ │ "2019-06-06", "2019-06-06 │
2008 │ │ 00:00" and "2019-06-06 │
2009 │ │ 00:00:00". When an integer │
2010 │ │ is given, it is assumed to │
2011 │ │ be a UNIX timestamp. │
2012 ├──────────────┼────────────────────────────┤
2013 │ │ │
2014 │day │ Either a day of week │
2015 │ │ ("Monday", "Tuesday", │
2016 │ │ etc.), or an integer │
2017 │ │ between 0 and 6. Strings │
2018 │ │ are matched │
2019 │ │ case-insensitively, and a │
2020 │ │ full match is not expected │
2021 │ │ (e.g. "Mon" would match │
2022 │ │ "Monday"). When an integer │
2023 │ │ is given, 0 is Sunday and │
2024 │ │ 6 is Saturday. │
2025 ├──────────────┼────────────────────────────┤
2026 │ │ │
2027 │hour │ A string representing an │
2028 │ │ hour in 24-hour format. │
2029 │ │ Seconds can optionally be │
2030 │ │ specified. For example, │
2031 │ │ 17:00 and 17:00:00 would │
2032 │ │ be equivalent. │
2033 └──────────────┴────────────────────────────┘
2034
2035 Using meta expressions.
2036
2037 # qualified meta expression
2038 filter output meta oif eth0
2039 filter forward meta iifkind { "tun", "veth" }
2040
2041 # unqualified meta expression
2042 filter output oif eth0
2043
2044 # incoming packet was subject to ipsec processing
2045 raw prerouting meta ipsec exists accept
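
Using the time, day and hour keys. This is an illustrative sketch based on the type descriptions in Table 30; the table name filter, the chain name input and the dates are placeholders, not taken from an existing ruleset.

# drop packets received within an absolute time window
filter input meta time "2019-12-24 16:00" - "2020-01-02 07:00" drop

# drop packets received on a Saturday; a prefix of the day name is enough
filter input meta day "Sat" drop

# drop packets received between 17:00 and 19:00
filter input meta hour "17:00" - "19:00" drop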
2046
2047
2048 SOCKET EXPRESSION
2049 socket {transparent | mark | wildcard}
2050 socket cgroupv2 level NUM
2051
2052 The socket expression can be used to search for an existing open TCP/UDP
2053 socket and its attributes that can be associated with a packet. It
2054 looks for an established or non-zero bound listening socket (possibly
2055 with a non-local address). You can also use it to match on the socket
2056 cgroupv2 at a given ancestor level, e.g. if the socket belongs to
2057 cgroupv2 a/b, ancestor level 1 checks for a match on cgroup a and
2058 ancestor level 2 checks for a match on cgroup b.
2059
2060 Table 31. Available socket attributes
2061 ┌────────────┬─────────────────────┬─────────────────┐
2062 │Name │ Description │ Type │
2063 ├────────────┼─────────────────────┼─────────────────┤
2064 │ │ │ │
2065 │transparent │ Value of the │ boolean (1 bit) │
2066 │ │ IP_TRANSPARENT │ │
2067 │ │ socket option in │ │
2068 │ │ the found socket. │ │
2069 │ │ It can be 0 or 1. │ │
2070 ├────────────┼─────────────────────┼─────────────────┤
2071 │ │ │ │
2072 │mark │ Value of the socket │ mark │
2073 │ │ mark (SOL_SOCKET, │ │
2074 │ │ SO_MARK). │ │
2075 ├────────────┼─────────────────────┼─────────────────┤
2076 │ │ │ │
2077 │wildcard │ Indicates whether │ boolean (1 bit) │
2078 │ │ the socket is │ │
2079 │ │ wildcard-bound │ │
2080 │ │ (e.g. 0.0.0.0 or │ │
2081 │ │ ::0). │ │
2082 ├────────────┼─────────────────────┼─────────────────┤
2083 │ │ │ │
2084 │cgroupv2 │ cgroup version 2 │ cgroupv2 │
2085 │ │ for this socket │ │
2086 │ │ (path from │ │
2087 │ │ /sys/fs/cgroup) │ │
2088 └────────────┴─────────────────────┴─────────────────┘
2089
2090 Using socket expression.
2091
2092 # Mark packets that correspond to a transparent socket. "socket wildcard 0"
2093 # means that zero-bound listener sockets are NOT matched (which is usually
2094 # exactly what you want).
2095 table inet x {
2096 chain y {
2097 type filter hook prerouting priority mangle; policy accept;
2098 socket transparent 1 socket wildcard 0 mark set 0x00000001 accept
2099 }
2100 }
2101
2102 # Trace packets that correspond to a socket with a mark value of 15
2103 table inet x {
2104 chain y {
2105 type filter hook prerouting priority mangle; policy accept;
2106 socket mark 0x0000000f nftrace set 1
2107 }
2108 }
2109
2110 # Set packet mark to socket mark
2111 table inet x {
2112 chain y {
2113 type filter hook prerouting priority mangle; policy accept;
2114 tcp dport 8080 mark set socket mark
2115 }
2116 }
2117
2118 # Count packets for cgroupv2 "user.slice" at level 1
2119 table inet x {
2120 chain y {
2121 type filter hook input priority filter; policy accept;
2122 socket cgroupv2 level 1 "user.slice" counter
2123 }
2124 }
2125
2126
2127 OSF EXPRESSION
2128 osf [ttl {loose | skip}] {name | version}
2129
2130 The osf expression does passive operating system fingerprinting. This
2131 expression compares some data (Window Size, MSS, options and their
2132 order, DF, and others) from packets with the SYN bit set.
2133
2134 Table 32. Available osf attributes
2135 ┌────────┬─────────────────────┬────────┐
2136 │Name │ Description │ Type │
2137 ├────────┼─────────────────────┼────────┤
2138 │ │ │ │
2139 │ttl │ Do TTL checks on │ string │
2140 │ │ the packet to │ │
2141 │ │ determine the │ │
2142 │ │ operating system. │ │
2143 ├────────┼─────────────────────┼────────┤
2144 │ │ │ │
2145 │version │ Do OS version │ string │
2146 │ │ checks on the │ │
2147 │ │ packet. │ │
2148 ├────────┼─────────────────────┼────────┤
2149 │ │ │ │
2150 │name │ Name of the OS │ string │
2151 │ │ signature to match. │ │
2152 │ │ All signatures can │ │
2153 │ │ be found at pf.os │ │
2154 │ │ file. Use "unknown" │ │
2155 │ │ for OS signatures │ │
2156 │ │ that the expression │ │
2157 │ │ could not detect. │ │
2158 └────────┴─────────────────────┴────────┘
2159
2160 Available ttl values.
2161
2162 If no ttl attribute is given, the TTL in the IP header is compared exactly with the TTL in the fingerprint. This generally works for LANs.
2163
2164 * loose: Check if the IP header's TTL is less than the fingerprint one. Works for globally-routable addresses.
2165 * skip: Do not compare the TTL at all.
2166
2167 Using osf expression.
2168
2169 # Accept packets that match the "Linux" OS genre signature without comparing TTL.
2170 table inet x {
2171 chain y {
2172 type filter hook input priority filter; policy accept;
2173 osf ttl skip name "Linux"
2174 }
2175 }
2176
2177
2178 FIB EXPRESSIONS
2179 fib {saddr | daddr | mark | iif | oif} [. ...] {oif | oifname | type}
2180
2181 A fib expression queries the fib (forwarding information base) to
2182 obtain information such as the output interface index a particular
2183 address would use. The input is a tuple of elements that is used as
2184 input to the fib lookup functions.
2185
2186 Table 33. fib expression specific types
2187 ┌────────┬──────────────────┬──────────────────┐
2188 │Keyword │ Description │ Type │
2189 ├────────┼──────────────────┼──────────────────┤
2190 │ │ │ │
2191 │oif │ Output interface │ integer (32 bit) │
2192 │ │ index │ │
2193 ├────────┼──────────────────┼──────────────────┤
2194 │ │ │ │
2195 │oifname │ Output interface │ string │
2196 │ │ name │ │
2197 ├────────┼──────────────────┼──────────────────┤
2198 │ │ │ │
2199 │type │ Address type │ fib_addrtype │
2200 └────────┴──────────────────┴──────────────────┘
2201
2202 Use nft describe fib_addrtype to get a list of all address types.
2203
2204 Using fib expressions.
2205
2206 # drop packets without a reverse path
2207 filter prerouting fib saddr . iif oif missing drop
2208
2209 In this example, 'saddr . iif' looks up routing information based on the source address and the input interface.
2210 oif picks the output interface index from the routing information.
2211 If no route was found for the source address/input interface combination, the output interface index is zero.
2212 In case the input interface is specified as part of the input key, the output interface index is always the same as the input interface index or zero.
2213 If only 'saddr oif' is given, then oif can be any interface index or zero.
2214
2215 # drop packets to address not configured on incoming interface
2216 filter prerouting fib daddr . iif type != { local, broadcast, multicast } drop
2217
2218 # perform lookup in a specific 'blackhole' table (0xdead, needs an appropriate ip rule)
2219 filter prerouting meta mark set 0xdead fib daddr . mark type vmap { blackhole : drop, prohibit : jump prohibited, unreachable : drop }
2220
2221
2222 ROUTING EXPRESSIONS
2223 rt [ip | ip6] {classid | nexthop | mtu | ipsec}
2224
2225 A routing expression refers to routing data associated with a packet.
2226
2227 Table 34. Routing expression types
2228 ┌────────┬─────────────────────┬─────────────────────┐
2229 │Keyword │ Description │ Type │
2230 ├────────┼─────────────────────┼─────────────────────┤
2231 │ │ │ │
2232 │classid │ Routing realm │ realm │
2233 ├────────┼─────────────────────┼─────────────────────┤
2234 │ │ │ │
2235 │nexthop │ Routing nexthop │ ipv4_addr/ipv6_addr │
2236 ├────────┼─────────────────────┼─────────────────────┤
2237 │ │ │ │
2238 │mtu │ TCP maximum segment │ integer (16 bit) │
2239 │ │ size of route │ │
2240 ├────────┼─────────────────────┼─────────────────────┤
2241 │ │ │ │
2242 │ipsec │ route via ipsec │ boolean │
2243 │ │ tunnel or transport │ │
2244 └────────┴─────────────────────┴─────────────────────┘
2245
2246 Table 35. Routing expression specific types
2247 ┌──────┬────────────────────────────┐
2248 │Type │ Description │
2249 ├──────┼────────────────────────────┤
2250 │ │ │
2251 │realm │ Routing Realm (32 bit │
2252 │ │ number). Can be specified │
2253 │ │ numerically or as symbolic │
2254 │ │ name defined in │
2255 │ │ /etc/iproute2/rt_realms. │
2256 └──────┴────────────────────────────┘
2257
2258 Using routing expressions.
2259
2260 # IP family independent rt expression
2261 filter output rt classid 10
2262
2263 # IP family dependent rt expressions
2264 ip filter output rt nexthop 192.168.0.1
2265 ip6 filter output rt nexthop fd00::1
2266 inet filter output rt ip nexthop 192.168.0.1
2267 inet filter output rt ip6 nexthop fd00::1
2268
2269 # outgoing packet will be encapsulated/encrypted by ipsec
2270 filter output rt ipsec exists
2271
2272
2273 IPSEC EXPRESSIONS
2274 ipsec {in | out} [ spnum NUM ] {reqid | spi}
2275 ipsec {in | out} [ spnum NUM ] {ip | ip6} {saddr | daddr}
2276
2277 An ipsec expression refers to ipsec data associated with a packet.
2278
2279 The in or out keyword needs to be used to specify if the expression
2280 should examine inbound or outbound policies. The in keyword can be used
2281 in the prerouting, input and forward hooks. The out keyword applies to
2282 forward, output and postrouting hooks. The optional keyword spnum can
2283 be used to match a specific state in a chain; it defaults to 0.
2284
2285 Table 36. Ipsec expression types
2286 ┌────────┬─────────────────────┬─────────────────────┐
2287 │Keyword │ Description │ Type │
2288 ├────────┼─────────────────────┼─────────────────────┤
2289 │ │ │ │
2290 │reqid │ Request ID │ integer (32 bit) │
2291 ├────────┼─────────────────────┼─────────────────────┤
2292 │ │ │ │
2293 │spi │ Security Parameter │ integer (32 bit) │
2294 │ │ Index │ │
2295 ├────────┼─────────────────────┼─────────────────────┤
2296 │ │ │ │
2297 │saddr │ Source address of │ ipv4_addr/ipv6_addr │
2298 │ │ the tunnel │ │
2299 ├────────┼─────────────────────┼─────────────────────┤
2300 │ │ │ │
2301 │daddr │ Destination address │ ipv4_addr/ipv6_addr │
2302 │ │ of the tunnel │ │
2303 └────────┴─────────────────────┴─────────────────────┘
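
Using ipsec expressions. This is an illustrative sketch derived from the synopsis above; the table and chain names, the request ID and the address are placeholders.

# match inbound packets whose ipsec state has request ID 1
filter input ipsec in reqid 1 accept

# match outbound packets whose ipsec tunnel has this destination address
ip filter output ipsec out ip daddr 192.168.1.2 accept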
2304
2305 NUMGEN EXPRESSION
2306 numgen {inc | random} mod NUM [ offset NUM ]
2307
2308 Create a number generator. The inc or random keywords control its
2309 operation mode: In inc mode, the last returned value is simply
2310 incremented. In random mode, a new random number is returned. The value
2311 after the mod keyword specifies an upper boundary (read: modulus) which
2312 is not reached by returned numbers. The optional offset is added to the
2313 returned value.
2314
2315 A typical use-case for numgen is load-balancing:
2316
2317 Using numgen expression.
2318
2319 # round-robin between 192.168.10.100 and 192.168.20.200:
2320 add rule nat prerouting dnat to numgen inc mod 2 map \
2321 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2322
2323 # probability-based with odd bias using intervals:
2324 add rule nat prerouting dnat to numgen random mod 10 map \
2325 { 0-2 : 192.168.10.100, 3-9 : 192.168.20.200 }
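
The effect of offset can be sketched as follows. The rule is illustrative only; the table name filter and the chain name prerouting are placeholders.

# numgen inc mod 2 returns 0 or 1; with offset 100 the connection
# mark therefore alternates between 100 and 101
add rule filter prerouting ct mark set numgen inc mod 2 offset 100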
2326
2327
2328 HASH EXPRESSIONS
2329 jhash {ip saddr | ip6 daddr | tcp dport | udp sport | ether saddr} [. ...] mod NUM [ seed NUM ] [ offset NUM ]
2330 symhash mod NUM [ offset NUM ]
2331
2332 Use a hashing function to generate a number. The functions available
2333 are jhash, known as Jenkins Hash, and symhash, for Symmetric Hash. The
2334 jhash function requires an expression that selects the packet header
2335 fields to hash; concatenations are possible as well. The value after
2336 the mod keyword specifies an upper boundary (read: modulus) which is
2337 not reached by returned numbers. The optional seed specifies the
2338 initial value used to seed the hashing function. The optional offset
2339 is added to the returned value.
2341
2342 A typical use-case for jhash and symhash is load-balancing:
2343
2344 Using hash expressions.
2345
2346 # load balance based on source ip between 2 ip addresses:
2347 add rule nat prerouting dnat to jhash ip saddr mod 2 map \
2348 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2349
2350 # symmetric load balancing between 2 ip addresses:
2351 add rule nat prerouting dnat to symhash mod 2 map \
2352 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
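
A concatenated jhash input combined with seed and offset might look like this. It is a sketch only; the table and chain names are placeholders and the seed value is arbitrary.

# hash source and destination address together; the result is 100 or 101
add rule filter prerouting ct mark set jhash ip saddr . ip daddr mod 2 seed 0xdeadbeef offset 100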
2353
2354
2356 Payload expressions refer to data from the packet’s payload.
2357
2358 ETHERNET HEADER EXPRESSION
2359 ether {daddr | saddr | type}
2360
2361 Table 37. Ethernet header expression types
2362 ┌────────┬────────────────────┬────────────┐
2363 │Keyword │ Description │ Type │
2364 ├────────┼────────────────────┼────────────┤
2365 │ │ │ │
2366 │daddr │ Destination MAC │ ether_addr │
2367 │ │ address │ │
2368 ├────────┼────────────────────┼────────────┤
2369 │ │ │ │
2370 │saddr │ Source MAC address │ ether_addr │
2371 ├────────┼────────────────────┼────────────┤
2372 │ │ │ │
2373 │type │ EtherType │ ether_type │
2374 └────────┴────────────────────┴────────────┘
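
Using ethernet header expressions. This is an illustrative sketch; the bridge table filter, its forward chain and the MAC address are placeholders.

# drop frames sent from a given MAC address
bridge filter forward ether saddr 00:0f:54:0c:11:04 drop

# accept ARP frames, selected by EtherType
bridge filter forward ether type arp accept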
2375
2376 VLAN HEADER EXPRESSION
2377 vlan {id | dei | pcp | type}
2378
2379 Table 38. VLAN header expression
2380 ┌────────┬─────────────────────┬──────────────────┐
2381 │Keyword │ Description │ Type │
2382 ├────────┼─────────────────────┼──────────────────┤
2383 │ │ │ │
2384 │id │ VLAN ID (VID) │ integer (12 bit) │
2385 ├────────┼─────────────────────┼──────────────────┤
2386 │ │ │ │
2387 │dei │ Drop Eligible │ integer (1 bit) │
2388 │ │ Indicator │ │
2389 ├────────┼─────────────────────┼──────────────────┤
2390 │ │ │ │
2391 │pcp │ Priority code point │ integer (3 bit) │
2392 ├────────┼─────────────────────┼──────────────────┤
2393 │ │ │ │
2394 │type │ EtherType │ ether_type │
2395 └────────┴─────────────────────┴──────────────────┘
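
Using VLAN header expressions. This is an illustrative sketch; the bridge table filter, its forward chain and the VLAN IDs are placeholders.

# accept frames tagged with VLAN ID 100
bridge filter forward vlan id 100 accept

# drop frames in VLAN 200 that carry the highest priority code point
bridge filter forward vlan id 200 vlan pcp 7 drop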
2396
2397 ARP HEADER EXPRESSION
2398 arp {htype | ptype | hlen | plen | operation | saddr { ip | ether } | daddr { ip | ether }}
2399
2400 Table 39. ARP header expression
2401 ┌────────────┬─────────────────────┬──────────────────┐
2402 │Keyword │ Description │ Type │
2403 ├────────────┼─────────────────────┼──────────────────┤
2404 │ │ │ │
2405 │htype │ ARP hardware type │ integer (16 bit) │
2406 ├────────────┼─────────────────────┼──────────────────┤
2407 │ │ │ │
2408 │ptype │ EtherType │ ether_type │
2409 ├────────────┼─────────────────────┼──────────────────┤
2410 │ │ │ │
2411 │hlen │ Hardware address │ integer (8 bit) │
2412 │ │ len │ │
2413 ├────────────┼─────────────────────┼──────────────────┤
2414 │ │ │ │
2415 │plen │ Protocol address │ integer (8 bit) │
2416 │ │ len │ │
2417 ├────────────┼─────────────────────┼──────────────────┤
2418 │ │ │ │
2419 │operation │ Operation │ arp_op │
2420 ├────────────┼─────────────────────┼──────────────────┤
2421 │ │ │ │
2422 │saddr ether │ Ethernet sender │ ether_addr │
2423 │ │ address │ │
2424 ├────────────┼─────────────────────┼──────────────────┤
2425 │ │ │ │
2426 │daddr ether │ Ethernet target │ ether_addr │
2427 │ │ address │ │
2428 ├────────────┼─────────────────────┼──────────────────┤
2429 │ │ │ │
2430 │saddr ip │ IPv4 sender address │ ipv4_addr │
2431 ├────────────┼─────────────────────┼──────────────────┤
2432 │ │ │ │
2433 │daddr ip │ IPv4 target address │ ipv4_addr │
2434 └────────────┴─────────────────────┴──────────────────┘
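
Using ARP header expressions. This is an illustrative sketch; the arp table filter, its input chain and the addresses are placeholders.

# accept ARP requests that ask for 192.168.0.1
arp filter input arp operation request arp daddr ip 192.168.0.1 accept

# drop ARP replies sent from a given hardware address
arp filter input arp operation reply arp saddr ether 00:0f:54:0c:11:04 drop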
2435
2436 IPV4 HEADER EXPRESSION
2437 ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
2438
2439 Table 40. IPv4 header expression
2440 ┌──────────┬─────────────────────┬──────────────────┐
2441 │Keyword │ Description │ Type │
2442 ├──────────┼─────────────────────┼──────────────────┤
2443 │ │ │ │
2444 │version │ IP header version │ integer (4 bit) │
2445 │ │ (4) │ │
2446 ├──────────┼─────────────────────┼──────────────────┤
2447 │ │ │ │
2448 │hdrlength │ IP header length │ integer (4 bit) │
2449 │ │ including options │ (32-bit words) │
2450 ├──────────┼─────────────────────┼──────────────────┤
2451 │ │ │ │
2452 │dscp │ Differentiated │ dscp │
2453 │ │ Services Code Point │ │
2454 ├──────────┼─────────────────────┼──────────────────┤
2455 │ │ │ │
2456 │ecn │ Explicit Congestion │ ecn │
2457 │ │ Notification │ │
2458 ├──────────┼─────────────────────┼──────────────────┤
2459 │ │ │ │
2460 │length │ Total packet length │ integer (16 bit) │
2461 ├──────────┼─────────────────────┼──────────────────┤
2462 │ │ │ │
2463 │id │ IP ID │ integer (16 bit) │
2464 ├──────────┼─────────────────────┼──────────────────┤
2465 │ │ │ │
2466 │frag-off │ Fragment offset │ integer (16 bit) │
2467 ├──────────┼─────────────────────┼──────────────────┤
2468 │ │ │ │
2469 │ttl │ Time to live │ integer (8 bit) │
2470 ├──────────┼─────────────────────┼──────────────────┤
2471 │ │ │ │
2472 │protocol │ Upper layer │ inet_proto │
2473 │ │ protocol │ │
2474 ├──────────┼─────────────────────┼──────────────────┤
2475 │ │ │ │
2476 │checksum │ IP header checksum │ integer (16 bit) │
2477 ├──────────┼─────────────────────┼──────────────────┤
2478 │ │ │ │
2479 │saddr │ Source address │ ipv4_addr │
2480 ├──────────┼─────────────────────┼──────────────────┤
2481 │ │ │ │
2482 │daddr │ Destination address │ ipv4_addr │
2483 └──────────┴─────────────────────┴──────────────────┘
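
Using IPv4 header expressions. This is an illustrative sketch; the table name filter, the chain names and the addresses are placeholders.

# accept packets from a subnet
filter input ip saddr 192.168.1.0/24 accept

# count forwarded packets marked with the EF code point
filter forward ip dscp ef counter

# drop forwarded packets whose TTL is 1
filter forward ip ttl 1 drop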
2484
2485 ICMP HEADER EXPRESSION
2486 icmp {type | code | checksum | id | sequence | gateway | mtu}
2487
2488 This expression refers to ICMP header fields. When using it in inet,
2489 bridge or netdev families, it will cause an implicit dependency on IPv4
2490 to be created. To match on unusual cases like ICMP over IPv6, one has
2491 to add an explicit meta protocol ip6 match to the rule.
2492
2493 Table 41. ICMP header expression
2494 ┌─────────┬─────────────────────┬──────────────────┐
2495 │Keyword │ Description │ Type │
2496 ├─────────┼─────────────────────┼──────────────────┤
2497 │ │ │ │
2498 │type │ ICMP type field │ icmp_type │
2499 ├─────────┼─────────────────────┼──────────────────┤
2500 │ │ │ │
2501 │code │ ICMP code field │ integer (8 bit) │
2502 ├─────────┼─────────────────────┼──────────────────┤
2503 │ │ │ │
2504 │checksum │ ICMP checksum field │ integer (16 bit) │
2505 ├─────────┼─────────────────────┼──────────────────┤
2506 │ │ │ │
2507 │id │ ID of echo │ integer (16 bit) │
2508 │ │ request/response │ │
2509 ├─────────┼─────────────────────┼──────────────────┤
2510 │ │ │ │
2511 │sequence │ sequence number of │ integer (16 bit) │
2512 │ │ echo │ │
2513 │ │ request/response │ │
2514 ├─────────┼─────────────────────┼──────────────────┤
2515 │ │ │ │
2516 │gateway │ gateway of │ integer (32 bit) │
2517 │ │ redirects │ │
2518 ├─────────┼─────────────────────┼──────────────────┤
2519 │ │ │ │
2520 │mtu │ MTU of path MTU │ integer (16 bit) │
2521 │ │ discovery │ │
2522 └─────────┴─────────────────────┴──────────────────┘
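
Using ICMP header expressions. This is an illustrative sketch; the table and chain names are placeholders.

# in an inet table this rule implicitly depends on IPv4, as described above
inet filter input icmp type echo-request accept

# accept common ICMP error messages
filter input icmp type { destination-unreachable, time-exceeded } accept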
2523
2524 IGMP HEADER EXPRESSION
2525 igmp {type | mrt | checksum | group}
2526
2527 This expression refers to IGMP header fields. When using it in inet,
2528 bridge or netdev families, it will cause an implicit dependency on IPv4
2529 to be created. To match on unusual cases like IGMP over IPv6, one has
2530 to add an explicit meta protocol ip6 match to the rule.
2531
2532 Table 42. IGMP header expression
2533 ┌─────────┬─────────────────────┬──────────────────┐
2534 │Keyword │ Description │ Type │
2535 ├─────────┼─────────────────────┼──────────────────┤
2536 │ │ │ │
2537 │type │ IGMP type field │ igmp_type │
2538 ├─────────┼─────────────────────┼──────────────────┤
2539 │ │ │ │
2540 │mrt │ IGMP maximum │ integer (8 bit) │
2541 │ │ response time field │ │
2542 ├─────────┼─────────────────────┼──────────────────┤
2543 │ │ │ │
2544 │checksum │ IGMP checksum field │ integer (16 bit) │
2545 ├─────────┼─────────────────────┼──────────────────┤
2546 │ │ │ │
2547 │group │ Group address │ integer (32 bit) │
2548 └─────────┴─────────────────────┴──────────────────┘
2549
2550 IPV6 HEADER EXPRESSION
2551 ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
2552
2553 This expression refers to the IPv6 header fields. Use caution with
2554 ip6 nexthdr: the value only refers to the next header, i.e. ip6 nexthdr
2555 tcp will only match if the IPv6 packet does not contain any extension
2556 headers. Packets that are fragmented or that contain, for example, a
2557 routing extension header will not be matched. Use meta l4proto instead
2558 if you wish to match the real transport header and ignore any
2559 additional extension headers.
2560
2561 Table 43. IPv6 header expression
2562 ┌──────────┬─────────────────────┬──────────────────┐
2563 │Keyword │ Description │ Type │
2564 ├──────────┼─────────────────────┼──────────────────┤
2565 │ │ │ │
2566 │version │ IP header version │ integer (4 bit) │
2567 │ │ (6) │ │
2568 ├──────────┼─────────────────────┼──────────────────┤
2569 │ │ │ │
2570 │dscp │ Differentiated │ dscp │
2571 │ │ Services Code Point │ │
2572 ├──────────┼─────────────────────┼──────────────────┤
2573 │ │ │ │
2574 │ecn │ Explicit Congestion │ ecn │
2575 │ │ Notification │ │
2576 ├──────────┼─────────────────────┼──────────────────┤
2577 │ │ │ │
2578 │flowlabel │ Flow label │ integer (20 bit) │
2579 ├──────────┼─────────────────────┼──────────────────┤
2580 │ │ │ │
2581 │length │ Payload length │ integer (16 bit) │
2582 ├──────────┼─────────────────────┼──────────────────┤
2583 │ │ │ │
2584 │nexthdr │ Nexthdr protocol │ inet_proto │
2585 ├──────────┼─────────────────────┼──────────────────┤
2586 │ │ │ │
2587 │hoplimit │ Hop limit │ integer (8 bit) │
2588 ├──────────┼─────────────────────┼──────────────────┤
2589 │ │ │ │
2590 │saddr │ Source address │ ipv6_addr │
2591 ├──────────┼─────────────────────┼──────────────────┤
2592 │ │ │ │
2593 │daddr │ Destination address │ ipv6_addr │
2594 └──────────┴─────────────────────┴──────────────────┘
2595
2596 Using ip6 header expressions.
2597
2598 # matching if first extension header indicates a fragment
2599 ip6 nexthdr ipv6-frag
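
The difference between ip6 nexthdr and meta l4proto described above can be sketched as follows. The rules are illustrative; the ip6 table filter and its input chain are placeholders.

# counts TCP only if the TCP header directly follows the IPv6 header
ip6 filter input ip6 nexthdr tcp counter

# counts TCP even if extension headers precede the transport header
ip6 filter input meta l4proto tcp counter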
2600
2601
2602 ICMPV6 HEADER EXPRESSION
2603 icmpv6 {type | code | checksum | parameter-problem | packet-too-big | id | sequence | max-delay}
2604
2605 This expression refers to ICMPv6 header fields. When using it in inet,
2606 bridge or netdev families, it will cause an implicit dependency on IPv6
2607 to be created. To match on unusual cases like ICMPv6 over IPv4, one has
2608 to add an explicit meta protocol ip match to the rule.
2609
2610 Table 44. ICMPv6 header expression
2611 ┌──────────────────┬────────────────────┬──────────────────┐
2612 │Keyword │ Description │ Type │
2613 ├──────────────────┼────────────────────┼──────────────────┤
2614 │ │ │ │
2615 │type │ ICMPv6 type field │ icmpv6_type │
2616 ├──────────────────┼────────────────────┼──────────────────┤
2617 │ │ │ │
2618 │code │ ICMPv6 code field │ integer (8 bit) │
2619 ├──────────────────┼────────────────────┼──────────────────┤
2620 │ │ │ │
2621 │checksum │ ICMPv6 checksum │ integer (16 bit) │
2622 │ │ field │ │
2623 ├──────────────────┼────────────────────┼──────────────────┤
2624 │ │ │ │
2625 │parameter-problem │ pointer to problem │ integer (32 bit) │
2626 ├──────────────────┼────────────────────┼──────────────────┤
2627 │ │ │ │
2628 │packet-too-big │ oversized MTU │ integer (32 bit) │
2629 ├──────────────────┼────────────────────┼──────────────────┤
2630 │ │ │ │
2631 │id │ ID of echo │ integer (16 bit) │
2632 │ │ request/response │ │
2633 ├──────────────────┼────────────────────┼──────────────────┤
2634 │ │ │ │
2635 │sequence │ sequence number of │ integer (16 bit) │
2636 │ │ echo │ │
2637 │ │ request/response │ │
2638 ├──────────────────┼────────────────────┼──────────────────┤
2639 │ │ │ │
2640 │max-delay │ maximum response │ integer (16 bit) │
2641 │ │ delay of MLD │ │
2642 │ │ queries │ │
2643 └──────────────────┴────────────────────┴──────────────────┘
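
Using ICMPv6 header expressions. This is an illustrative sketch; the table and chain names are placeholders.

# accept neighbour discovery messages
inet filter input icmpv6 type { nd-neighbor-solicit, nd-neighbor-advert, nd-router-advert } accept

# accept echo requests
ip6 filter input icmpv6 type echo-request accept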
2644
2645 TCP HEADER EXPRESSION
2646 tcp {sport | dport | sequence | ackseq | doff | reserved | flags | window | checksum | urgptr}
2647
2648 Table 45. TCP header expression
2649 ┌─────────┬──────────────────┬──────────────────┐
2650 │Keyword │ Description │ Type │
2651 ├─────────┼──────────────────┼──────────────────┤
2652 │ │ │ │
2653 │sport │ Source port │ inet_service │
2654 ├─────────┼──────────────────┼──────────────────┤
2655 │ │ │ │
2656 │dport │ Destination port │ inet_service │
2657 ├─────────┼──────────────────┼──────────────────┤
2658 │ │ │ │
2659 │sequence │ Sequence number │ integer (32 bit) │
2660 ├─────────┼──────────────────┼──────────────────┤
2661 │ │ │ │
2662 │ackseq │ Acknowledgement │ integer (32 bit) │
2663 │ │ number │ │
2664 ├─────────┼──────────────────┼──────────────────┤
2665 │ │ │ │
2666 │doff │ Data offset │ integer (4 bit) │
2667 │ │ │ (32-bit words) │
2668 ├─────────┼──────────────────┼──────────────────┤
2669 │ │ │ │
2670 │reserved │ Reserved area │ integer (4 bit) │
2671 ├─────────┼──────────────────┼──────────────────┤
2672 │ │ │ │
2673 │flags │ TCP flags │ tcp_flag │
2674 ├─────────┼──────────────────┼──────────────────┤
2675 │ │ │ │
2676 │window │ Window │ integer (16 bit) │
2677 ├─────────┼──────────────────┼──────────────────┤
2678 │ │ │ │
2679 │checksum │ Checksum │ integer (16 bit) │
2680 ├─────────┼──────────────────┼──────────────────┤
2681 │ │ │ │
2682 │urgptr │ Urgent pointer │ integer (16 bit) │
2683 └─────────┴──────────────────┴──────────────────┘
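
Using TCP header expressions. This is an illustrative sketch; the table name filter, the chain name input and the port numbers are placeholders.

# accept connections to a set of destination ports
filter input tcp dport { 22, 80, 443 } accept

# count packets that have SYN set and FIN, RST and ACK cleared
filter input tcp flags & (fin | syn | rst | ack) == syn counter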
2684
2685 UDP HEADER EXPRESSION
2686 udp {sport | dport | length | checksum}
2687
2688 Table 46. UDP header expression
2689 ┌─────────┬─────────────────────┬──────────────────┐
2690 │Keyword │ Description │ Type │
2691 ├─────────┼─────────────────────┼──────────────────┤
2692 │ │ │ │
2693 │sport │ Source port │ inet_service │
2694 ├─────────┼─────────────────────┼──────────────────┤
2695 │ │ │ │
2696 │dport │ Destination port │ inet_service │
2697 ├─────────┼─────────────────────┼──────────────────┤
2698 │ │ │ │
2699 │length │ Total packet length │ integer (16 bit) │
2700 ├─────────┼─────────────────────┼──────────────────┤
2701 │ │ │ │
2702 │checksum │ Checksum │ integer (16 bit) │
2703 └─────────┴─────────────────────┴──────────────────┘
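
Using UDP header expressions. This is an illustrative sketch; the table name filter, the chain names and the port numbers are placeholders.

# accept DNS queries
filter input udp dport 53 accept

# accept DHCP client requests leaving the host
filter output udp sport 68 udp dport 67 accept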
2704
2705 UDP-LITE HEADER EXPRESSION
2706 udplite {sport | dport | checksum}
2707
2708 Table 47. UDP-Lite header expression
2709 ┌─────────┬──────────────────┬──────────────────┐
2710 │Keyword │ Description │ Type │
2711 ├─────────┼──────────────────┼──────────────────┤
2712 │ │ │ │
2713 │sport │ Source port │ inet_service │
2714 ├─────────┼──────────────────┼──────────────────┤
2715 │ │ │ │
2716 │dport │ Destination port │ inet_service │
2717 ├─────────┼──────────────────┼──────────────────┤
2718 │ │ │ │
2719 │checksum │ Checksum │ integer (16 bit) │
2720 └─────────┴──────────────────┴──────────────────┘
2721
2722 SCTP HEADER EXPRESSION
2723 sctp {sport | dport | vtag | checksum}
2724 sctp chunk CHUNK [ FIELD ]
2725
2726 CHUNK := data | init | init-ack | sack | heartbeat |
2727 heartbeat-ack | abort | shutdown | shutdown-ack | error |
2728 cookie-echo | cookie-ack | ecne | cwr | shutdown-complete
2729 | asconf-ack | forward-tsn | asconf
2730
2731 FIELD := COMMON_FIELD | DATA_FIELD | INIT_FIELD | INIT_ACK_FIELD |
2732 SACK_FIELD | SHUTDOWN_FIELD | ECNE_FIELD | CWR_FIELD |
2733 ASCONF_ACK_FIELD | FORWARD_TSN_FIELD | ASCONF_FIELD
2734
2735 COMMON_FIELD := type | flags | length
2736 DATA_FIELD := tsn | stream | ssn | ppid
2737 INIT_FIELD := init-tag | a-rwnd | num-outbound-streams |
2738 num-inbound-streams | initial-tsn
2739 INIT_ACK_FIELD := INIT_FIELD
2740 SACK_FIELD := cum-tsn-ack | a-rwnd | num-gap-ack-blocks |
2741 num-dup-tsns
2742 SHUTDOWN_FIELD := cum-tsn-ack
2743 ECNE_FIELD := lowest-tsn
2744 CWR_FIELD := lowest-tsn
2745 ASCONF_ACK_FIELD := seqno
2746 FORWARD_TSN_FIELD := new-cum-tsn
2747 ASCONF_FIELD := seqno
2748
2749 Table 48. SCTP header expression
2750 ┌─────────┬──────────────────┬────────────────────┐
2751 │Keyword │ Description │ Type │
2752 ├─────────┼──────────────────┼────────────────────┤
2753 │ │ │ │
2754 │sport │ Source port │ inet_service │
2755 ├─────────┼──────────────────┼────────────────────┤
2756 │ │ │ │
2757 │dport │ Destination port │ inet_service │
2758 ├─────────┼──────────────────┼────────────────────┤
2759 │ │ │ │
2760 │vtag │ Verification Tag │ integer (32 bit) │
2761 ├─────────┼──────────────────┼────────────────────┤
2762 │ │ │ │
2763 │checksum │ Checksum │ integer (32 bit) │
2764 ├─────────┼──────────────────┼────────────────────┤
2765 │ │ │ │
2766 │chunk │ Search chunk in │ without FIELD, │
2767 │ │ packet │ boolean indicating │
2768 │ │ │ existence │
2769 └─────────┴──────────────────┴────────────────────┘
2770
2771 Table 49. SCTP chunk fields
2772 ┌─────────────────────┬───────────────┬─────────────────┬──────────────────┐
2773 │Name │ Width in bits │ Chunk │ Notes │
2774 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2775 │ │ │ │ │
2776 │type │ 8 │ all │ not useful, │
2777 │ │ │ │ defined by chunk │
2778 │ │ │ │ type │
2779 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2780 │ │ │ │ │
2781 │flags │ 8 │ all │ semantics │
2782 │ │ │ │ defined on │
2783 │ │ │ │ per-chunk basis │
2784 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2785 │ │ │ │ │
2786 │length │ 16 │ all │ length of this │
2787 │ │ │ │ chunk in bytes │
2788 │ │ │ │ excluding │
2789 │ │ │ │ padding │
2790 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2791 │ │ │ │ │
2792 │tsn │ 32 │ data │ transmission │
2793 │ │ │ │ sequence number │
2794 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2795 │ │ │ │ │
2796 │stream │ 16 │ data │ stream │
2797 │ │ │ │ identifier │
2798 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2799 │ │ │ │ │
2800 │ssn │ 16 │ data │ stream sequence │
2801 │ │ │ │ number │
2802 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2803 │ │ │ │ │
2804 │ppid │ 32 │ data │ payload protocol │
2805 │ │ │ │ identifier │
2806 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2807 │ │ │ │ │
2808 │init-tag │ 32 │ init, init-ack │ initiate tag │
2809 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2810 │ │ │ │ │
2811 │a-rwnd │ 32 │ init, init-ack, │ advertised │
2812 │ │ │ sack │ receiver window │
2813 │ │ │ │ credit │
2814 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2815 │ │ │ │ │
2816 │num-outbound-streams │ 16 │ init, init-ack │ number of │
2817 │ │ │ │ outbound streams │
2818 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2819 │ │ │ │ │
2820 │num-inbound-streams │ 16 │ init, init-ack │ number of │
2821 │ │ │ │ inbound streams │
2822 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2823 │ │ │ │ │
2824 │initial-tsn │ 32 │ init, init-ack │ initial transmit │
2825 │ │ │ │ sequence number │
2826 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2827 │ │ │ │ │
2828 │cum-tsn-ack │ 32 │ sack, shutdown │ cumulative │
2829 │ │ │ │ transmission │
2830 │ │ │ │ sequence number │
2831 │ │ │ │ acknowledged │
2832 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2833 │ │ │ │ │
2834 │num-gap-ack-blocks │ 16 │ sack │ number of Gap │
2835 │ │ │ │ Ack Blocks │
2836 │ │ │ │ included │
2837 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2838 │ │ │ │ │
2839 │num-dup-tsns │ 16 │ sack │ number of │
2840 │ │ │ │ duplicate │
2841 │ │ │ │ transmission │
2842 │ │ │ │ sequence numbers │
2843 │ │ │ │ received │
2844 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2845 │ │ │ │ │
2846 │lowest-tsn │ 32 │ ecne, cwr │ lowest │
2847 │ │ │ │ transmission │
2848 │ │ │ │ sequence number │
2849 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2850 │ │ │ │ │
2851 │seqno │ 32 │ asconf-ack, │ sequence number │
2852 │ │ │ asconf │ │
2853 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2854 │ │ │ │ │
2855 │new-cum-tsn │ 32 │ forward-tsn │ new cumulative │
2856 │ │ │ │ transmission │
2857 │ │ │ │ sequence number │
2858 └─────────────────────┴───────────────┴─────────────────┴──────────────────┘
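
The chunk syntax above might be used as follows. This is a minimal
sketch, assuming an inet table named filter with a chain named input
already exists, as in the other examples on this page; the port number
9899 is purely illustrative.

```
# count packets that carry an INIT chunk (existence check, no FIELD)
inet filter input sctp chunk init exists counter

# drop INIT chunks that request more than 64 inbound streams
inet filter input sctp chunk init num-inbound-streams > 64 drop

# count DATA chunks on stream 0 sent to the illustrative port 9899
inet filter input sctp dport 9899 sctp chunk data stream 0 counter
```

Without a FIELD the expression is boolean, so it is compared against
exists or missing rather than a numeric value.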
2859
2860 DCCP HEADER EXPRESSION
2861 dccp {sport | dport | type}
2862
2863 Table 50. DCCP header expression
2864 ┌────────┬──────────────────┬──────────────┐
2865 │Keyword │ Description │ Type │
2866 ├────────┼──────────────────┼──────────────┤
2867 │ │ │ │
2868 │sport │ Source port │ inet_service │
2869 ├────────┼──────────────────┼──────────────┤
2870 │ │ │ │
2871 │dport │ Destination port │ inet_service │
2872 ├────────┼──────────────────┼──────────────┤
2873 │ │ │ │
2874 │type │ Packet type │ dccp_pkttype │
2875 └────────┴──────────────────┴──────────────┘
2876
2877 AUTHENTICATION HEADER EXPRESSION
2878 ah {nexthdr | hdrlength | reserved | spi | sequence}
2879
2880 Table 51. AH header expression
2881 ┌──────────┬────────────────────┬──────────────────┐
2882 │Keyword │ Description │ Type │
2883 ├──────────┼────────────────────┼──────────────────┤
2884 │ │ │ │
2885 │nexthdr │ Next header │ inet_proto │
2886 │ │ protocol │ │
2887 ├──────────┼────────────────────┼──────────────────┤
2888 │ │ │ │
2889 │hdrlength │ AH Header length │ integer (8 bit) │
2890 ├──────────┼────────────────────┼──────────────────┤
2891 │ │ │ │
2892 │reserved │ Reserved area │ integer (16 bit) │
2893 ├──────────┼────────────────────┼──────────────────┤
2894 │ │ │ │
2895 │spi │ Security Parameter │ integer (32 bit) │
2896 │ │ Index │ │
2897 ├──────────┼────────────────────┼──────────────────┤
2898 │ │ │ │
2899 │sequence │ Sequence number │ integer (32 bit) │
2900 └──────────┴────────────────────┴──────────────────┘
2901
ENCAPSULATING SECURITY PAYLOAD HEADER EXPRESSION
2903 esp {spi | sequence}
2904
2905 Table 52. ESP header expression
2906 ┌─────────┬────────────────────┬──────────────────┐
2907 │Keyword │ Description │ Type │
2908 ├─────────┼────────────────────┼──────────────────┤
2909 │ │ │ │
2910 │spi │ Security Parameter │ integer (32 bit) │
2911 │ │ Index │ │
2912 ├─────────┼────────────────────┼──────────────────┤
2913 │ │ │ │
2914 │sequence │ Sequence number │ integer (32 bit) │
2915 └─────────┴────────────────────┴──────────────────┘
2916
2917 IPCOMP HEADER EXPRESSION
2918 comp {nexthdr | flags | cpi}
2919
2920 Table 53. IPComp header expression
2921 ┌────────┬─────────────────┬──────────────────┐
2922 │Keyword │ Description │ Type │
2923 ├────────┼─────────────────┼──────────────────┤
2924 │ │ │ │
2925 │nexthdr │ Next header │ inet_proto │
2926 │ │ protocol │ │
2927 ├────────┼─────────────────┼──────────────────┤
2928 │ │ │ │
2929 │flags │ Flags │ bitmask │
2930 ├────────┼─────────────────┼──────────────────┤
2931 │ │ │ │
2932 │cpi │ Compression │ integer (16 bit) │
2933 │ │ Parameter Index │ │
2934 └────────┴─────────────────┴──────────────────┘
2935
2936 RAW PAYLOAD EXPRESSION
2937 @base,offset,length
2938
The raw payload expression loads length bits starting at offset bits
from the start of the header selected by base. Bit 0 is the very first
bit of that header; in C terms this is the most significant bit of the
first octet, i.e. 0x80. Raw payload expressions are useful for matching
headers that do not have a human-readable template expression yet. Note
that nft does not add dependencies for raw payload expressions. If, for
example, you want to match protocol fields of a transport header with
protocol number 5, you need to manually exclude packets that have a
different transport header, for instance by using meta l4proto 5 before
the raw expression.
2948
2949 Table 54. Supported payload protocol bases
2950 ┌─────┬─────────────────────────┐
2951 │Base │ Description │
2952 ├─────┼─────────────────────────┤
2953 │ │ │
2954 │ll │ Link layer, for example │
2955 │ │ the Ethernet header │
2956 ├─────┼─────────────────────────┤
2957 │ │ │
2958 │nh │ Network header, for │
2959 │ │ example IPv4 or IPv6 │
2960 ├─────┼─────────────────────────┤
2961 │ │ │
2962 │th │ Transport Header, for │
2963 │ │ example TCP │
2964 └─────┴─────────────────────────┘
2965
2966 Matching destination port of both UDP and TCP.
2967
2968 inet filter input meta l4proto {tcp, udp} @th,16,16 { 53, 80 }
2969
2970 The above can also be written as
2971
2972 inet filter input meta l4proto {tcp, udp} th dport { 53, 80 }
2973
The second form is more convenient but, like the raw expression
notation, it creates and checks no dependencies. It is the user's
responsibility to restrict matching to those header types that have a
notion of ports. Otherwise, rules using raw expressions will erroneously
match unrelated packets, e.g. misinterpreting the SPI field of ESP
packets as a port.
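
As a further illustration of how bit offsets are counted, the sketch
below matches the IPv4 TTL field with a raw payload expression. It
assumes the standard IPv4 header layout, in which the TTL is the ninth
octet (bit offset 64, 8 bits wide), and an existing ip table named
filter with a chain named input.

```
# raw form: 8 bits at bit offset 64 of the network header
ip filter input @nh,64,8 255 counter

# equivalent template form
ip filter input ip ttl 255 counter
```

In the ip family the network header is always IPv4; in the inet family
the raw form would additionally need a check such as meta nfproto ipv4.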
2979
2980 Rewrite arp packet target hardware address if target protocol address
2981 matches a given address.
2982
2983 input meta iifname enp2s0 arp ptype 0x0800 arp htype 1 arp hlen 6 arp plen 4 @nh,192,32 0xc0a88f10 @nh,144,48 set 0x112233445566 accept
2984
2985
2986 EXTENSION HEADER EXPRESSIONS
2987 Extension header expressions refer to data from variable-sized protocol
2988 headers, such as IPv6 extension headers, TCP options and IPv4 options.
2989
nftables currently supports matching (finding) a given IPv6 extension
header, TCP option or IPv4 option.
2992
2993 hbh {nexthdr | hdrlength}
2994 frag {nexthdr | frag-off | more-fragments | id}
2995 rt {nexthdr | hdrlength | type | seg-left}
2996 dst {nexthdr | hdrlength}
2997 mh {nexthdr | hdrlength | checksum | type}
2998 srh {flags | tag | sid | seg-left}
2999 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp} tcp_option_field
3000 ip option { lsrr | ra | rr | ssrr } ip_option_field
3001
3002 The following syntaxes are valid only in a relational expression with
3003 boolean type on right-hand side for checking header existence only:
3004
3005 exthdr {hbh | frag | rt | dst | mh}
3006 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp}
3007 ip option { lsrr | ra | rr | ssrr }
3008
3009 Table 55. IPv6 extension headers
3010 ┌────────┬────────────────────────┐
3011 │Keyword │ Description │
3012 ├────────┼────────────────────────┤
3013 │ │ │
3014 │hbh │ Hop by Hop │
3015 ├────────┼────────────────────────┤
3016 │ │ │
3017 │rt │ Routing Header │
3018 ├────────┼────────────────────────┤
3019 │ │ │
3020 │frag │ Fragmentation header │
3021 ├────────┼────────────────────────┤
3022 │ │ │
3023 │dst │ dst options │
3024 ├────────┼────────────────────────┤
3025 │ │ │
3026 │mh │ Mobility Header │
3027 ├────────┼────────────────────────┤
3028 │ │ │
3029 │srh │ Segment Routing Header │
3030 └────────┴────────────────────────┘
3031
3032 Table 56. TCP Options
3033 ┌──────────┬─────────────────────┬─────────────────────┐
3034 │Keyword │ Description │ TCP option fields │
3035 ├──────────┼─────────────────────┼─────────────────────┤
3036 │ │ │ │
3037 │eol │ End of option list │ kind │
3038 ├──────────┼─────────────────────┼─────────────────────┤
3039 │ │ │ │
3040 │nop │ 1 Byte TCP Nop │ kind │
3041 │ │ padding option │ │
3042 ├──────────┼─────────────────────┼─────────────────────┤
3043 │ │ │ │
3044 │maxseg │ TCP Maximum Segment │ kind, length, size │
3045 │ │ Size │ │
3046 ├──────────┼─────────────────────┼─────────────────────┤
3047 │ │ │ │
3048 │window │ TCP Window Scaling │ kind, length, count │
3049 ├──────────┼─────────────────────┼─────────────────────┤
3050 │ │ │ │
3051 │sack-perm │ TCP SACK permitted │ kind, length │
3052 ├──────────┼─────────────────────┼─────────────────────┤
3053 │ │ │ │
3054 │sack │ TCP Selective │ kind, length, left, │
3055 │ │ Acknowledgement │ right │
3056 │ │ (alias of block 0) │ │
3057 ├──────────┼─────────────────────┼─────────────────────┤
3058 │ │ │ │
3059 │sack0 │ TCP Selective │ kind, length, left, │
3060 │ │ Acknowledgement │ right │
3061 │ │ (block 0) │ │
3062 ├──────────┼─────────────────────┼─────────────────────┤
3063 │ │ │ │
3064 │sack1 │ TCP Selective │ kind, length, left, │
3065 │ │ Acknowledgement │ right │
3066 │ │ (block 1) │ │
3067 ├──────────┼─────────────────────┼─────────────────────┤
3068 │ │ │ │
3069 │sack2 │ TCP Selective │ kind, length, left, │
3070 │ │ Acknowledgement │ right │
3071 │ │ (block 2) │ │
3072 ├──────────┼─────────────────────┼─────────────────────┤
3073 │ │ │ │
3074 │sack3 │ TCP Selective │ kind, length, left, │
3075 │ │ Acknowledgement │ right │
3076 │ │ (block 3) │ │
3077 ├──────────┼─────────────────────┼─────────────────────┤
3078 │ │ │ │
3079 │timestamp │ TCP Timestamps │ kind, length, │
3080 │ │ │ tsval, tsecr │
3081 └──────────┴─────────────────────┴─────────────────────┘
3082
3083 TCP option matching also supports raw expression syntax to access
3084 arbitrary options:
3085
3088 tcp option @number,offset,length
3089
3090 Table 57. IP Options
3091 ┌────────┬─────────────────────┬─────────────────────┐
3092 │Keyword │ Description │ IP option fields │
3093 ├────────┼─────────────────────┼─────────────────────┤
3094 │ │ │ │
3095 │lsrr │ Loose Source Route │ type, length, ptr, │
3096 │ │ │ addr │
3097 ├────────┼─────────────────────┼─────────────────────┤
3098 │ │ │ │
3099 │ra │ Router Alert │ type, length, value │
3100 ├────────┼─────────────────────┼─────────────────────┤
3101 │ │ │ │
3102 │rr │ Record Route │ type, length, ptr, │
3103 │ │ │ addr │
3104 ├────────┼─────────────────────┼─────────────────────┤
3105 │ │ │ │
3106 │ssrr │ Strict Source Route │ type, length, ptr, │
3107 │ │ │ addr │
3108 └────────┴─────────────────────┴─────────────────────┘
3109
Finding TCP options.

filter input tcp option sack-perm exists counter
3113
Matching an IPv6 extension header.
3115
3116 ip6 filter input frag more-fragments 1 counter
3117
Finding an IP option.
3119
3120 filter input ip option lsrr exists counter
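
A few further matches built from the syntax above, as a hedged sketch;
the table and chain names are assumed to exist and the MSS range is
arbitrary.

```
# drop IPv6 packets that carry a fragment header (existence check)
ip6 filter input exthdr frag exists drop

# count IPv6 packets with a type 0 routing header
ip6 filter input rt type 0 counter

# count SYN packets that advertise an unusually small MSS
filter input tcp flags syn tcp option maxseg size 1-535 counter
```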
3121
3122
3123 CONNTRACK EXPRESSIONS
Conntrack expressions refer to metadata of the connection tracking
entry associated with a packet.

There are three types of conntrack expressions. Some conntrack
expressions require the flow direction before the conntrack key, while
others must be used directly because they are direction agnostic. The
packets, bytes and avgpkt keywords can be used with or without a
direction. If the direction is omitted, the sum of the original and the
reply direction is returned. The same is true for the zone: if a
direction is given, the zone is only matched if the zone id is tied to
the given direction.
3135
3136 ct {state | direction | status | mark | expiration | helper | label | count | id}
3137 ct [original | reply] {l3proto | protocol | bytes | packets | avgpkt | zone}
3138 ct {original | reply} {proto-src | proto-dst}
3139 ct {original | reply} {ip | ip6} {saddr | daddr}
3140
3141 The conntrack-specific types in this table are described in the
3142 sub-section CONNTRACK TYPES above.
3143
3144 Table 58. Conntrack expressions
3145 ┌───────────┬─────────────────────┬─────────────────────┐
3146 │Keyword │ Description │ Type │
3147 ├───────────┼─────────────────────┼─────────────────────┤
3148 │ │ │ │
3149 │state │ State of the │ ct_state │
3150 │ │ connection │ │
3151 ├───────────┼─────────────────────┼─────────────────────┤
3152 │ │ │ │
3153 │direction │ Direction of the │ ct_dir │
3154 │ │ packet relative to │ │
3155 │ │ the connection │ │
3156 ├───────────┼─────────────────────┼─────────────────────┤
3157 │ │ │ │
3158 │status │ Status of the │ ct_status │
3159 │ │ connection │ │
3160 ├───────────┼─────────────────────┼─────────────────────┤
3161 │ │ │ │
3162 │mark │ Connection mark │ mark │
3163 ├───────────┼─────────────────────┼─────────────────────┤
3164 │ │ │ │
3165 │expiration │ Connection │ time │
3166 │ │ expiration time │ │
3167 ├───────────┼─────────────────────┼─────────────────────┤
3168 │ │ │ │
3169 │helper │ Helper associated │ string │
3170 │ │ with the connection │ │
3171 ├───────────┼─────────────────────┼─────────────────────┤
3172 │ │ │ │
3173 │label │ Connection tracking │ ct_label │
3174 │ │ label bit or │ │
3175 │ │ symbolic name │ │
3176 │ │ defined in │ │
3177 │ │ connlabel.conf in │ │
3178 │ │ the nftables │ │
3179 │ │ include path │ │
3180 ├───────────┼─────────────────────┼─────────────────────┤
3181 │ │ │ │
3182 │l3proto │ Layer 3 protocol of │ nf_proto │
3183 │ │ the connection │ │
3184 ├───────────┼─────────────────────┼─────────────────────┤
3185 │ │ │ │
3186 │saddr │ Source address of │ ipv4_addr/ipv6_addr │
3187 │ │ the connection for │ │
3188 │ │ the given direction │ │
3189 ├───────────┼─────────────────────┼─────────────────────┤
3190 │ │ │ │
3191 │daddr │ Destination address │ ipv4_addr/ipv6_addr │
3192 │ │ of the connection │ │
3193 │ │ for the given │ │
3194 │ │ direction │ │
3195 ├───────────┼─────────────────────┼─────────────────────┤
3196 │ │ │ │
3197 │protocol │ Layer 4 protocol of │ inet_proto │
3198 │ │ the connection for │ │
3199 │ │ the given direction │ │
3200 ├───────────┼─────────────────────┼─────────────────────┤
3201 │ │ │ │
3202 │proto-src │ Layer 4 protocol │ integer (16 bit) │
3203 │ │ source for the │ │
3204 │ │ given direction │ │
3205 ├───────────┼─────────────────────┼─────────────────────┤
3206 │ │ │ │
3207 │proto-dst │ Layer 4 protocol │ integer (16 bit) │
3208 │ │ destination for the │ │
3209 │ │ given direction │ │
3210 ├───────────┼─────────────────────┼─────────────────────┤
3211 │ │ │ │
3212 │packets │ packet count seen │ integer (64 bit) │
3213 │ │ in the given │ │
3214 │ │ direction or sum of │ │
3215 │ │ original and reply │ │
3216 ├───────────┼─────────────────────┼─────────────────────┤
3217 │ │ │ │
3218 │bytes │ byte count seen, │ integer (64 bit) │
3219 │ │ see description for │ │
3220 │ │ packets keyword │ │
3221 ├───────────┼─────────────────────┼─────────────────────┤
3222 │ │ │ │
3223 │avgpkt │ average bytes per │ integer (64 bit) │
3224 │ │ packet, see │ │
3225 │ │ description for │ │
3226 │ │ packets keyword │ │
3227 ├───────────┼─────────────────────┼─────────────────────┤
3228 │ │ │ │
3229 │zone │ conntrack zone │ integer (16 bit) │
3230 ├───────────┼─────────────────────┼─────────────────────┤
3231 │ │ │ │
3232 │count │ number of current │ integer (32 bit) │
3233 │ │ connections │ │
3234 ├───────────┼─────────────────────┼─────────────────────┤
3235 │ │ │ │
3236 │id │ Connection id │ ct_id │
3237 └───────────┴─────────────────────┴─────────────────────┘
3238
Restrict the number of parallel connections to a server.
3240
3241 nft add set filter ssh_flood '{ type ipv4_addr; flags dynamic; }'
3242 nft add rule filter input tcp dport 22 add @ssh_flood '{ ip saddr ct count over 2 }' reject
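
Some other common uses of conntrack expressions, sketched under the
assumption that a table named filter with a chain named input exists;
the addresses, port and byte threshold are illustrative only.

```
# accept packets that belong to or relate to a known connection
filter input ct state established,related accept

# drop packets that conntrack considers invalid
filter input ct state invalid drop

# match on the original direction's source address and destination port
filter input ct original ip saddr 192.168.0.0/24 ct original proto-dst 22 counter

# without a direction, bytes is the sum of both directions
filter input ct bytes > 1000000 counter

# with a direction, only that direction is counted
filter input ct reply bytes > 1000000 counter
```

The packets, bytes and avgpkt keys only carry useful values when
conntrack accounting is enabled on the system.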
3243
3244
STATEMENTS
3246 Statements represent actions to be performed. They can alter control
3247 flow (return, jump to a different chain, accept or drop the packet) or
3248 can perform actions, such as logging, rejecting a packet, etc.
3249
There are two kinds of statements. Terminal statements unconditionally
terminate evaluation of the current rule. Non-terminal statements
terminate evaluation of the current rule only conditionally or never;
in other words, they are passive from the ruleset evaluation
perspective. A rule can contain an arbitrary number of non-terminal
statements, but only a single terminal statement, which must be the
final statement.
3257
3258 VERDICT STATEMENT
3259 The verdict statement alters control flow in the ruleset and issues
3260 policy decisions for packets.
3261
3262 {accept | drop | queue | continue | return}
3263 {jump | goto} chain
3264
3265 accept and drop are absolute verdicts — they terminate ruleset
3266 evaluation immediately.
3267
3268
accept
Terminate ruleset evaluation and accept the packet. The packet can
still be dropped later by another hook; for instance, accept in the
forward hook still allows the packet to be dropped later in the
postrouting hook, or by another forward base chain that has a higher
priority number and is evaluated afterwards in the processing pipeline.

drop
Terminate ruleset evaluation and drop the packet. The drop occurs
instantly; no further chains or hooks are evaluated. It is not possible
to accept the packet again in a later chain, as those are no longer
evaluated for the packet.

queue
Terminate ruleset evaluation and queue the packet to userspace.
Userspace must provide a drop or accept verdict. In case of accept,
processing resumes with the next base chain hook, not the rule
following the queue verdict.

continue
Continue ruleset evaluation with the next rule. This is the default
behaviour in case a rule issues no verdict.

return
Return from the current chain and continue evaluation at the next rule
in the last chain (the chain that jumped to the current one). If issued
in a base chain, it is equivalent to the base chain policy.

jump chain
Continue evaluation at the first rule in chain. The current position in
the ruleset is pushed to a call stack and evaluation will continue
there when the new chain is entirely evaluated or a return verdict is
issued. In case an absolute verdict is issued by a rule in the chain,
ruleset evaluation terminates immediately and the specific action is
taken.

goto chain
Similar to jump, but the current position is not pushed to the call
stack, meaning that after the new chain has been evaluated, evaluation
continues at the last chain on the call stack instead of the chain
containing the goto statement.
3346
3347
3348 Using verdict statements.
3349
3350 # process packets from eth0 and the internal network in from_lan
3351 # chain, drop all packets from eth0 with different source addresses.
3352
3353 filter input iif eth0 ip saddr 192.168.0.0/24 jump from_lan
3354 filter input iif eth0 drop
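
The difference between jump, goto and return can be sketched with a
complete table. The table and chain names (example, ssh_checks,
other_tcp) and the address range are hypothetical and chosen only for
illustration.

```
table inet example {
	chain ssh_checks {
		ip saddr 192.0.2.0/24 accept    # absolute verdict, evaluation ends
		return                          # back to the rule after the jump
	}

	chain other_tcp {
		counter
		# end of chain: this chain was entered by goto from the base
		# chain, so the base chain policy applies next
	}

	chain input {
		type filter hook input priority filter; policy drop;

		tcp dport 22 jump ssh_checks    # position is saved on the call stack
		meta l4proto tcp goto other_tcp # position is not saved
		counter                         # reached only by non-TCP packets
	}
}
```

A file with this content could be loaded with nft -f. SSH packets from
outside 192.0.2.0/24 return from ssh_checks to the rule after the jump,
whereas packets sent to other_tcp never come back to the input chain.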
3355
3356
3357 PAYLOAD STATEMENT
3358 payload_expression set value
3359
The payload statement alters packet content. It can be used, for
example, to set the IP DSCP (diffserv) header field or IPv6 flow labels.
3362
Route some packets instead of bridging.

# redirect tcp:http from 192.168.0.0/16 to local machine for routing instead of bridging
# assumes 00:11:22:33:44:55 is local MAC address.
bridge input meta iif eth0 ip saddr 192.168.0.0/16 tcp dport 80 meta pkttype set unicast ether daddr set 00:11:22:33:44:55
3368
3369 Set IPv4 DSCP header field.
3370
3371 ip forward ip dscp set 42
3372
3373
3374 EXTENSION HEADER STATEMENT
3375 extension_header_expression set value
3376
The extension header statement alters packet content in variable-sized
headers. This can currently be used to alter the TCP maximum segment
size (MSS) of packets, similar to the iptables TCPMSS target.

Change the TCP MSS.
3382
3383 tcp flags syn tcp option maxseg size set 1360
3384 # set a size based on route information:
3385 tcp flags syn tcp option maxseg size set rt mtu
3386
3387
3388 LOG STATEMENT
3389 log [prefix quoted_string] [level syslog-level] [flags log-flags]
3390 log group nflog_group [prefix quoted_string] [queue-threshold value] [snaplen size]
3391 log level audit
3392
The log statement enables logging of matching packets. When this
statement is used in a rule, the Linux kernel prints some information
about all matching packets, such as header fields, to the kernel log,
where it can be read with dmesg(1) or in the syslog.

In the second form of invocation (if nflog_group is specified), the
Linux kernel passes the packet to nfnetlink_log, which multicasts the
packet through a netlink socket to the specified multicast group. One
or more userspace processes may subscribe to the group to receive the
packets; see the libnetfilter_log documentation for details.
3403
3404 In the third form of invocation (if level audit is specified), the
3405 Linux kernel writes a message into the audit buffer suitably formatted
3406 for reading with auditd. Therefore no further formatting options (such
3407 as prefix or flags) are allowed in this mode.
3408
3409 This is a non-terminating statement, so the rule evaluation continues
3410 after the packet is logged.
3411
3412 Table 59. log statement options
3413 ┌────────────────┬─────────────────────┬───────────────────┐
3414 │Keyword │ Description │ Type │
3415 ├────────────────┼─────────────────────┼───────────────────┤
3416 │ │ │ │
3417 │prefix │ Log message prefix │ quoted string │
3418 ├────────────────┼─────────────────────┼───────────────────┤
3419 │ │ │ │
3420 │level │ Syslog level of │ string: emerg, │
3421 │ │ logging │ alert, crit, err, │
3422 │ │ │ warn [default], │
3423 │ │ │ notice, info, │
3424 │ │ │ debug, audit │
3425 ├────────────────┼─────────────────────┼───────────────────┤
3426 │ │ │ │
3427 │group │ NFLOG group to send │ unsigned integer │
3428 │ │ messages to │ (16 bit) │
3429 ├────────────────┼─────────────────────┼───────────────────┤
3430 │ │ │ │
3431 │snaplen │ Length of packet │ unsigned integer │
3432 │ │ payload to include │ (32 bit) │
3433 │ │ in netlink message │ │
3434 ├────────────────┼─────────────────────┼───────────────────┤
3435 │ │ │ │
3436 │queue-threshold │ Number of packets │ unsigned integer │
3437 │ │ to queue inside the │ (32 bit) │
3438 │ │ kernel before │ │
3439 │ │ sending them to │ │
3440 │ │ userspace │ │
3441 └────────────────┴─────────────────────┴───────────────────┘
3442
3443 Table 60. log-flags
3444 ┌─────────────┬───────────────────────────┐
3445 │Flag │ Description │
3446 ├─────────────┼───────────────────────────┤
3447 │ │ │
3448 │tcp sequence │ Log TCP sequence numbers. │
3449 ├─────────────┼───────────────────────────┤
3450 │ │ │
3451 │tcp options │ Log options from the TCP │
3452 │ │ packet header. │
3453 ├─────────────┼───────────────────────────┤
3454 │ │ │
3455 │ip options │ Log options from the │
3456 │ │ IP/IPv6 packet header. │
3457 ├─────────────┼───────────────────────────┤
3458 │ │ │
3459 │skuid │ Log the userid of the │
3460 │ │ process which generated │
3461 │ │ the packet. │
3462 ├─────────────┼───────────────────────────┤
3463 │ │ │
3464 │ether │ Decode MAC addresses and │
3465 │ │ protocol. │
3466 ├─────────────┼───────────────────────────┤
3467 │ │ │
3468 │all │ Enable all log flags │
3469 │ │ listed above. │
3470 └─────────────┴───────────────────────────┘
3471
3472 Using log statement.
3473
3474 # log the UID which generated the packet and ip options
3475 ip filter output log flags skuid flags ip options
3476
3477 # log the tcp sequence numbers and tcp options from the TCP packet
3478 ip filter output log flags tcp sequence,options
3479
3480 # enable all supported log flags
3481 ip6 filter output log flags all
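
The three forms of invocation described above might look as follows.
This is a sketch only: the table and chain names are assumed to exist,
and the prefixes, ports, group number and sizes are illustrative.

```
# kernel log with a prefix and an explicit syslog level
filter input tcp dport 22 ct state new log prefix "new ssh: " level info

# send to NFLOG group 1, truncating each packet to 96 bytes
filter input log group 1 prefix "nflog: " queue-threshold 10 snaplen 96

# write to the audit subsystem; no prefix or flags are allowed here
filter input tcp dport 23 log level audit
```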
3482
3483
3484 REJECT STATEMENT
3485 reject [ with REJECT_WITH ]
3486
3487 REJECT_WITH := icmp icmp_code |
3488 icmpv6 icmpv6_code |
3489 icmpx icmpx_code |
3490 tcp reset
3491
A reject statement sends back an error packet in response to the
matched packet; otherwise it is equivalent to drop, so it is a
terminating statement that ends rule traversal. This statement is only
valid in base chains using the input, forward or output hooks, and in
user-defined chains which are only called from those chains.
3497
Table 61. ICMP reject variants and the table families they are meant
for
3500 ┌────────┬────────┬─────────────┐
3501 │Variant │ Family │ Type │
3502 ├────────┼────────┼─────────────┤
3503 │ │ │ │
3504 │icmp │ ip │ icmp_code │
3505 ├────────┼────────┼─────────────┤
3506 │ │ │ │
3507 │icmpv6 │ ip6 │ icmpv6_code │
3508 ├────────┼────────┼─────────────┤
3509 │ │ │ │
3510 │icmpx │ inet │ icmpx_code │
3511 └────────┴────────┴─────────────┘
3512
For a description of the different types and a list of supported
keywords, refer to the DATA TYPES section above. The common default
reject value is port-unreachable.

Note that in the bridge family, the reject statement is only allowed in
base chains which hook into input or prerouting.
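
The variants might be used as shown below. This is a sketch that
follows the synopsis above and assumes the named tables and chains
exist; the port numbers are illustrative. Some older releases may
expect the keyword type before the ICMP code.

```
# ip family: ICMP error
ip filter input udp dport 161 reject with icmp admin-prohibited

# ip6 family: ICMPv6 error
ip6 filter input udp dport 161 reject with icmpv6 admin-prohibited

# inet family: icmpx maps to the matching ICMP or ICMPv6 code
inet filter input udp dport 161 reject with icmpx admin-prohibited

# TCP reset for TCP, default reject value for everything else
inet filter input meta l4proto tcp reject with tcp reset
inet filter input reject
```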
3519
3520 COUNTER STATEMENT
3521 A counter statement sets the hit count of packets along with the number
3522 of bytes.
3523
3524 counter packets number bytes number
3525 counter { packets number | bytes number }
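
A short sketch of both forms, assuming a table named filter with a
chain named input; the ports are illustrative.

```
# count packets and bytes sent to port 80, starting from zero
filter input tcp dport 80 counter

# the same for port 443, with explicit initial values
filter input tcp dport 443 counter packets 0 bytes 0
```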
3526
3527 CONNTRACK STATEMENT
3528 The conntrack statement can be used to set the conntrack mark and
3529 conntrack labels.
3530
3531 ct {mark | event | label | zone} set value
3532
The ct statement sets metadata associated with a connection. The zone
id has to be assigned before a conntrack lookup takes place, i.e. this
has to be done in prerouting and possibly output (if locally generated
packets need to be placed in a distinct zone), with a hook priority of
raw (-300).

Unlike iptables, where the helper assignment happens in the raw table,
the helper needs to be assigned after a conntrack entry has been found,
i.e. it will not work when used with hook priorities equal to or lower
than -200.
3543
3544 Table 62. Conntrack statement types
3545 ┌────────┬─────────────────────┬──────────────────┐
3546 │Keyword │ Description │ Value │
3547 ├────────┼─────────────────────┼──────────────────┤
3548 │ │ │ │
3549 │event │ conntrack event │ bitmask, integer │
3550 │ │ bits │ (32 bit) │
3551 ├────────┼─────────────────────┼──────────────────┤
3552 │ │ │ │
3553 │helper │ name of ct helper │ quoted string │
3554 │ │ object to assign to │ │
3555 │ │ the connection │ │
3556 ├────────┼─────────────────────┼──────────────────┤
3557 │ │ │ │
3558 │mark │ Connection tracking │ mark │
3559 │ │ mark │ │
3560 ├────────┼─────────────────────┼──────────────────┤
3561 │ │ │ │
3562 │label │ Connection tracking │ label │
3563 │ │ label │ │
3564 ├────────┼─────────────────────┼──────────────────┤
3565 │ │ │ │
3566 │zone │ conntrack zone │ integer (16 bit) │
3567 └────────┴─────────────────────┴──────────────────┘
3568
Save packet nfmark in conntrack.
3570
3571 ct mark set meta mark
3572
Set zone mapped via interface.
3574
3575 table inet raw {
3576 chain prerouting {
3577 type filter hook prerouting priority raw;
3578 ct zone set iif map { "eth1" : 1, "veth1" : 2 }
3579 }
3580 chain output {
3581 type filter hook output priority raw;
3582 ct zone set oif map { "eth1" : 1, "veth1" : 2 }
3583 }
3584 }
3585
Restrict events reported by ctnetlink.
3587
3588 ct event set new,related,destroy
3589
3590
3591 NOTRACK STATEMENT
The notrack statement disables connection tracking for certain packets.
3594
3595 notrack
3596
3597 Note that for this statement to be effective, it has to be applied to
3598 packets before a conntrack lookup happens. Therefore, it needs to sit
3599 in a chain with either prerouting or output hook and a hook priority of
3600 -300 (raw) or less.
3601
3602 See SYNPROXY STATEMENT for an example usage.
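
As a standalone illustration of the placement requirement, the sketch
below exempts DNS traffic over UDP from connection tracking. The table
name raw_example and the choice of port 53 are assumptions made for
this example only.

```
table inet raw_example {
	chain prerouting {
		type filter hook prerouting priority raw; policy accept;
		udp dport 53 notrack
	}
	chain output {
		type filter hook output priority raw; policy accept;
		udp sport 53 notrack
	}
}
```

Both chains use priority raw so that the rules run before the conntrack
lookup for incoming and locally generated packets respectively.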
3603
3604 META STATEMENT
3605 A meta statement sets the value of a meta expression. The existing meta
3606 fields are: priority, mark, pkttype, nftrace.
3607
3608 meta {mark | priority | pkttype | nftrace} set value
3609
A meta statement sets metadata associated with a packet.
3611
3612 Table 63. Meta statement types
3613 ┌─────────┬─────────────────────┬───────────┐
3614 │Keyword │ Description │ Value │
3615 ├─────────┼─────────────────────┼───────────┤
3616 │ │ │ │
3617 │priority │ TC packet priority │ tc_handle │
3618 ├─────────┼─────────────────────┼───────────┤
3619 │ │ │ │
3620 │mark │ Packet mark │ mark │
3621 ├─────────┼─────────────────────┼───────────┤
3622 │ │ │ │
3623 │pkttype │ packet type │ pkt_type │
3624 ├─────────┼─────────────────────┼───────────┤
3625 │ │ │ │
3626 │nftrace │ ruleset packet │ 0, 1 │
3627 │ │ tracing on/off. Use │ │
3628 │ │ monitor trace │ │
3629 │ │ command to watch │ │
3630 │ │ traces │ │
3631 └─────────┴─────────────────────┴───────────┘
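
A short sketch of the meta statement inside a ruleset. The table and
chain names, the port, the mark value and the source address are
hypothetical and only serve as illustration.

    table inet mangle {
        chain prerouting {
            type filter hook prerouting priority mangle; policy accept;
            # tag SSH traffic with packet mark 0x1 for use by later rules
            tcp dport 22 meta mark set 0x1
            # enable ruleset tracing for packets from a single host
            ip saddr 192.0.2.1 meta nftrace set 1
        }
    }

The resulting trace events can be watched with nft monitor trace.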
3632
3633 LIMIT STATEMENT
3634 limit rate [over] packet_number / TIME_UNIT [burst packet_number packets]
3635 limit rate [over] byte_number BYTE_UNIT / TIME_UNIT [burst byte_number BYTE_UNIT]
3636
3637 TIME_UNIT := second | minute | hour | day
3638 BYTE_UNIT := bytes | kbytes | mbytes
3639
3640 A limit statement matches at a limited rate using a token bucket
3641 filter. A rule using this statement will match until this limit is
3642 reached. It can be used in combination with the log statement to give
3643 limited logging. The optional over keyword makes it match over the
specified rate. The default burst is 5. If you specify burst, it must be
a non-zero value.
3646
3647 Table 64. limit statement values
3648 ┌──────────────┬───────────────────┬──────────────────┐
3649 │Value │ Description │ Type │
3650 ├──────────────┼───────────────────┼──────────────────┤
3651 │ │ │ │
3652 │packet_number │ Number of packets │ unsigned integer │
3653 │ │ │ (32 bit) │
3654 ├──────────────┼───────────────────┼──────────────────┤
3655 │ │ │ │
3656 │byte_number │ Number of bytes │ unsigned integer │
3657 │ │ │ (32 bit) │
3658 └──────────────┴───────────────────┴──────────────────┘
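
The following sketch shows the packet-based form, the combination with
the log statement, and the byte-based form with the over keyword. The
table and chain names, ports and rates are assumptions chosen for
illustration.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # accept up to 10 echo requests per second (burst of 20), drop the rest
            icmp type echo-request limit rate 10/second burst 20 packets accept
            icmp type echo-request drop
            # log at most 5 new SSH connection attempts per minute
            tcp dport 22 ct state new limit rate 5/minute log prefix "ssh: "
            # "over" inverts the match: drop traffic exceeding 10 mbytes per second
            udp dport 5001 limit rate over 10 mbytes/second drop
        }
    }

Without over, a rule stops matching once the rate is exceeded, so the
first echo-request rule needs the explicit drop rule after it.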
3659
3660 NAT STATEMENTS
3661 snat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
3662 dnat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
3663 masquerade [to :PORT_SPEC] [FLAGS]
3664 redirect [to :PORT_SPEC] [FLAGS]
3665
3666 ADDR_SPEC := address | address - address
3667 PORT_SPEC := port | port - port
3668
3669 FLAGS := FLAG [, FLAGS]
3670 FLAG := persistent | random | fully-random
3671
The nat statements are only valid in chains of type nat.

The snat and masquerade statements specify that the source address of
the packet should be modified. While snat is only valid in the
postrouting and input chains, masquerade makes sense only in
postrouting. The dnat and redirect statements specify that the
destination address of the packet should be modified; they are only
valid in the prerouting and output chains. You can also use these
statements in non-base chains that are called from base chains of nat
chain type. All future packets in this connection will also be mangled,
and rules should cease being examined.
3683
3684 The masquerade statement is a special form of snat which always uses
3685 the outgoing interface’s IP address to translate to. It is particularly
3686 useful on gateways with dynamic (public) IP addresses.
3687
3688 The redirect statement is a special form of dnat which always
3689 translates the destination address to the local host’s one. It comes in
3690 handy if one only wants to alter the destination port of incoming
3691 traffic on different interfaces.
3692
When used in the inet family (available with kernel 5.2), the dnat and
snat statements require the ip or ip6 keyword whenever an address is
provided; see the examples below.
3696
3697 Before kernel 4.18 nat statements require both prerouting and
3698 postrouting base chains to be present since otherwise packets on the
3699 return path won’t be seen by netfilter and therefore no reverse
3700 translation will take place.
3701
Table 65. NAT statement values
┌───────────┬─────────────────────┬─────────────────────┐
│Expression │ Description         │ Type                │
├───────────┼─────────────────────┼─────────────────────┤
│           │                     │                     │
│address    │ Specifies that the  │ ipv4_addr,          │
│           │ source/destination  │ ipv6_addr, e.g.     │
│           │ address of the      │ abcd::1234, or you  │
│           │ packet should be    │ can use a mapping,  │
│           │ modified. You may   │ e.g. meta mark map  │
│           │ specify a mapping   │ { 10 : 192.168.1.2, │
│           │ to relate a list of │ 20 : 192.168.1.3 }  │
│           │ tuples composed of  │                     │
│           │ arbitrary           │                     │
│           │ expression key with │                     │
│           │ address value.      │                     │
├───────────┼─────────────────────┼─────────────────────┤
│           │                     │                     │
│port       │ Specifies that the  │ port number (16     │
│           │ source/destination  │ bit)                │
│           │ port of the         │                     │
│           │ packet should be    │                     │
│           │ modified.           │                     │
└───────────┴─────────────────────┴─────────────────────┘
3726
3727 Table 66. NAT statement flags
3728 ┌─────────────┬─────────────────────────────┐
3729 │Flag │ Description │
3730 ├─────────────┼─────────────────────────────┤
3731 │ │ │
3732 │persistent │ Gives a client the same │
3733 │ │ source-/destination-address │
3734 │ │ for each connection. │
3735 ├─────────────┼─────────────────────────────┤
3736 │ │ │
3737 │random │ In kernel 5.0 and newer │
3738 │ │ this is the same as │
3739 │ │ fully-random. In earlier │
3740 │ │ kernels the port mapping │
3741 │ │ will be randomized using a │
3742 │ │ seeded MD5 hash mix using │
3743 │ │ source and destination │
3744 │ │ address and destination │
3745 │ │ port. │
3746 ├─────────────┼─────────────────────────────┤
3747 │ │ │
3748 │fully-random │ If used then port mapping │
3749 │ │ is generated based on a │
3750 │ │ 32-bit pseudo-random │
3751 │ │ algorithm. │
3752 └─────────────┴─────────────────────────────┘
3753
3754 Using NAT statements.
3755
3756 # create a suitable table/chain setup for all further examples
3757 add table nat
3758 add chain nat prerouting { type nat hook prerouting priority dstnat; }
3759 add chain nat postrouting { type nat hook postrouting priority srcnat; }
3760
3761 # translate source addresses of all packets leaving via eth0 to address 1.2.3.4
3762 add rule nat postrouting oif eth0 snat to 1.2.3.4
3763
3764 # redirect all traffic entering via eth0 to destination address 192.168.1.120
3765 add rule nat prerouting iif eth0 dnat to 192.168.1.120
3766
3767 # translate source addresses of all packets leaving via eth0 to whatever
3768 # locally generated packets would use as source to reach the same destination
3769 add rule nat postrouting oif eth0 masquerade
3770
3771 # redirect incoming TCP traffic for port 22 to port 2222
3772 add rule nat prerouting tcp dport 22 redirect to :2222
3773
3774 # inet family:
3775 # handle ip dnat:
3776 add rule inet nat prerouting dnat ip to 10.0.2.99
3777 # handle ip6 dnat:
3778 add rule inet nat prerouting dnat ip6 to fe80::dead
3779 # this masquerades both ipv4 and ipv6:
3780 add rule inet nat postrouting meta oif ppp0 masquerade
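
The examples above do not use address ranges, port specifications or
flags. The following sketch reuses the same nat table and chains. The
addresses, ports and interface name are illustrative assumptions.

    # pick the source address from a range; keep the mapping stable per client
    add rule nat postrouting oif eth0 snat to 192.0.2.10-192.0.2.20 persistent

    # forward incoming TCP port 80 to port 8080 on an internal host
    add rule nat prerouting iif eth0 tcp dport 80 dnat to 192.168.1.120:8080

    # masquerade HTTPS traffic, restricting and randomizing the source port
    add rule nat postrouting oif eth0 tcp dport 443 masquerade to :1024-65535 random

A port specification only makes sense together with a match on the
transport protocol, which is why the last two rules match on tcp dport.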
3781
3782
3783 TPROXY STATEMENT
Tproxy redirects the packet to a local socket without changing the
packet header in any way. If any of the arguments is missing, the
corresponding data of the incoming packet is used instead. A rule using
tproxy must also contain a match that ensures the transport protocol
header is present, such as the tcp dport and udp dport matches in the
example ruleset below.
3789
3790 tproxy to address:port
3791 tproxy to {address | :port}
3792
3793 This syntax can be used in ip/ip6 tables where network layer protocol
3794 is obvious. Either IP address or port can be specified, but at least
3795 one of them is necessary.
3796
3797 tproxy {ip | ip6} to address[:port]
3798 tproxy to :port
3799
3800 This syntax can be used in inet tables. The ip/ip6 parameter defines
3801 the family the rule will match. The address parameter must be of this
3802 family. When only port is defined, the address family should not be
3803 specified. In this case the rule will match for both families.
3804
3805 Table 67. tproxy attributes
3806 ┌────────┬────────────────────────────┐
3807 │Name │ Description │
3808 ├────────┼────────────────────────────┤
3809 │ │ │
3810 │address │ IP address the listening │
3811 │ │ socket with IP_TRANSPARENT │
3812 │ │ option is bound to. │
3813 ├────────┼────────────────────────────┤
3814 │ │ │
3815 │port │ Port the listening socket │
3816 │ │ with IP_TRANSPARENT option │
3817 │ │ is bound to. │
3818 └────────┴────────────────────────────┘
3819
3820 Example ruleset for tproxy statement.
3821
    table ip x {
        chain y {
            type filter hook prerouting priority mangle; policy accept;
            tcp dport ntp tproxy to 1.1.1.1
            udp dport ssh tproxy to :2222
        }
    }
    table ip6 x {
        chain y {
            type filter hook prerouting priority mangle; policy accept;
            tcp dport ntp tproxy to [dead::beef]
            udp dport ssh tproxy to :2222
        }
    }
    table inet x {
        chain y {
            type filter hook prerouting priority mangle; policy accept;
            tcp dport 321 tproxy to :ssh
            tcp dport 99 tproxy ip to 1.1.1.1:999
            udp dport 155 tproxy ip6 to [dead::beef]:smux
        }
    }
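
In transparent proxy setups, the tproxy statement is commonly combined
with a packet mark that policy routing then uses to deliver the packet
locally. The sketch below assumes a proxy listening on port 50080 with
the IP_TRANSPARENT option set. The table name, chain name, port and
mark value are hypothetical, and the matching routing configuration is
done outside of nft and is not shown.

    table ip proxy {
        chain divert {
            type filter hook prerouting priority mangle; policy accept;
            # hand TCP port 80 to the local proxy and mark the packet
            # so that a policy routing rule can deliver it locally
            tcp dport 80 tproxy to :50080 meta mark set 1 accept
        }
    }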
3844
3845
3846 SYNPROXY STATEMENT
This statement processes the TCP three-way handshake in parallel in
netfilter context to protect either the local or a backend system. It
requires connection tracking because sequence numbers need to be
translated.
3851
3852 synproxy [mss mss_value] [wscale wscale_value] [SYNPROXY_FLAGS]
3853
3854 Table 68. synproxy statement attributes
3855 ┌───────┬────────────────────────────┐
3856 │Name │ Description │
3857 ├───────┼────────────────────────────┤
3858 │ │ │
3859 │mss │ Maximum segment size │
3860 │ │ announced to clients. This │
3861 │ │ must match the backend. │
3862 ├───────┼────────────────────────────┤
3863 │ │ │
3864 │wscale │ Window scale announced to │
3865 │ │ clients. This must match │
3866 │ │ the backend. │
3867 └───────┴────────────────────────────┘
3868
3869 Table 69. synproxy statement flags
3870 ┌──────────┬────────────────────────────┐
3871 │Flag │ Description │
3872 ├──────────┼────────────────────────────┤
3873 │ │ │
3874 │sack-perm │ Pass client selective │
3875 │ │ acknowledgement option to │
3876 │ │ backend (will be disabled │
3877 │ │ if not present). │
3878 ├──────────┼────────────────────────────┤
3879 │ │ │
3880 │timestamp │ Pass client timestamp │
3881 │ │ option to backend (will be │
3882 │ │ disabled if not present, │
3883 │ │ also needed for selective │
3884 │ │ acknowledgement and window │
3885 │ │ scaling). │
3886 └──────────┴────────────────────────────┘
3887
3888 Example ruleset for synproxy statement.
3889
3890 Determine tcp options used by backend, from an external system
3891
    tcpdump -pni eth0 -c 1 'tcp[tcpflags] == (tcp-syn|tcp-ack)' \
            port 80 &
    telnet 192.0.2.42 80
    18:57:24.693307 IP 192.0.2.42.80 > 192.0.2.43.48757:
        Flags [S.], seq 360414582, ack 788841994, win 14480,
        options [mss 1460,sackOK,
        TS val 1409056151 ecr 9690221,
        nop,wscale 9],
        length 0
3901
3902 Switch tcp_loose mode off, so conntrack will mark out-of-flow packets as state INVALID.
3903
3904 echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
3905
3906 Make SYN packets untracked.
3907
    table ip x {
        chain y {
            type filter hook prerouting priority raw; policy accept;
            tcp flags syn notrack
        }
    }
3914
Catch UNTRACKED (SYN packets) and INVALID (3WHS ACK packets) states and send
them to SYNPROXY. This rule will respond to SYN packets with SYN+ACK
syncookies, create ESTABLISHED for valid client responses (3WHS ACK packets)
and drop incorrect cookies. Flag combinations not expected during 3WHS will
not match and continue (e.g. SYN+FIN, SYN+ACK). Finally, drop invalid packets;
these are the out-of-flow packets that were not matched by SYNPROXY.
3921
    table ip x {
        chain z {
            type filter hook input priority filter; policy accept;
            ct state invalid, untracked synproxy mss 1460 wscale 9 timestamp sack-perm
            ct state invalid drop
        }
    }
3929
3930
3931 FLOW STATEMENT
A flow statement selects the flows whose forwarding you want to
accelerate by bypassing the layer 3 network stack. You have to specify
the name of the flowtable to which the flow is offloaded.
3935
3936 flow add @flowtable
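
A minimal sketch of a flowtable and a rule that offloads flows to it.
The table, flowtable and chain names as well as the devices eth0 and
eth1 are assumptions made for illustration.

    table inet x {
        flowtable f {
            hook ingress priority 0; devices = { eth0, eth1 };
        }
        chain forward {
            type filter hook forward priority filter; policy accept;
            # offload TCP and UDP flows to flowtable f
            meta l4proto { tcp, udp } flow add @f
        }
    }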
3937
3938 QUEUE STATEMENT
3939 This statement passes the packet to userspace using the nfnetlink_queue
3940 handler. The packet is put into the queue identified by its 16-bit
3941 queue number. Userspace can inspect and modify the packet if desired.
3942 Userspace must then drop or re-inject the packet into the kernel. See
3943 libnetfilter_queue documentation for details.
3944
3945 queue [flags QUEUE_FLAGS] [num queue_number]
3946 queue [flags QUEUE_FLAGS] [num queue_number_from - queue_number_to]
3947 queue [flags QUEUE_FLAGS] [to QUEUE_EXPRESSION ]
3948
3949 QUEUE_FLAGS := QUEUE_FLAG [, QUEUE_FLAGS]
3950 QUEUE_FLAG := bypass | fanout
3951 QUEUE_EXPRESSION := numgen | hash | symhash | MAP STATEMENT
3952
QUEUE_EXPRESSION can be used to compute a queue number at run-time with
the hash or numgen expressions. It also allows the use of the map
statement to assign fixed queue numbers based on external inputs such as
the source IP address or interface names.
3957
3958 Table 70. queue statement values
3959 ┌──────────────────┬────────────────────┬──────────────────┐
3960 │Value │ Description │ Type │
3961 ├──────────────────┼────────────────────┼──────────────────┤
3962 │ │ │ │
3963 │queue_number │ Sets queue number, │ unsigned integer │
3964 │ │ default is 0. │ (16 bit) │
3965 ├──────────────────┼────────────────────┼──────────────────┤
3966 │ │ │ │
3967 │queue_number_from │ Sets initial queue │ unsigned integer │
3968 │ │ in the range, if │ (16 bit) │
3969 │ │ fanout is used. │ │
3970 ├──────────────────┼────────────────────┼──────────────────┤
3971 │ │ │ │
3972 │queue_number_to │ Sets closing queue │ unsigned integer │
3973 │ │ in the range, if │ (16 bit) │
3974 │ │ fanout is used. │ │
3975 └──────────────────┴────────────────────┴──────────────────┘
3976
Table 71. queue statement flags
┌───────┬────────────────────────────┐
│Flag   │ Description                │
├───────┼────────────────────────────┤
│       │                            │
│bypass │ Let packets go through if  │
│       │ no userspace application   │
│       │ is bound to the queue.     │
│       │ Before using this flag,    │
│       │ read libnetfilter_queue    │
│       │ documentation for          │
│       │ performance tuning         │
│       │ recommendations.           │
├───────┼────────────────────────────┤
│       │                            │
│fanout │ Distribute packets between │
│       │ several queues.            │
└───────┴────────────────────────────┘
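
The three forms of the synopsis above can be sketched as follows. The
table and chain names, ports and queue numbers are hypothetical, and a
userspace program using libnetfilter_queue is assumed to be bound to
the queues.

    table inet filter {
        chain forward {
            type filter hook forward priority filter; policy accept;
            # send HTTP traffic to queue 0 (the default queue number)
            tcp dport 80 queue num 0
            # distribute DNS traffic over queues 0 to 3
            udp dport 53 queue flags bypass,fanout num 0-3
            # compute the queue number at run-time with numgen
            tcp dport 443 queue flags bypass to numgen inc mod 4
        }
    }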
3995
3996 DUP STATEMENT
3997 The dup statement is used to duplicate a packet and send the copy to a
3998 different destination.
3999
4000 dup to device
4001 dup to address device device
4002
4003 Table 72. Dup statement values
4004 ┌───────────┬─────────────────────┬─────────────────────┐
4005 │Expression │ Description │ Type │
4006 ├───────────┼─────────────────────┼─────────────────────┤
4007 │ │ │ │
4008 │address │ Specifies that the │ ipv4_addr, │
4009 │ │ copy of the packet │ ipv6_addr, e.g. │
4010 │ │ should be sent to a │ abcd::1234, or you │
4011 │ │ new gateway. │ can use a mapping, │
4012 │ │ │ e.g. ip saddr map { │
4013 │ │ │ 192.168.1.2 : │
4014 │ │ │ 10.1.1.1 } │
4015 ├───────────┼─────────────────────┼─────────────────────┤
4016 │ │ │ │
4017 │device │ Specifies that the │ string │
4018 │ │ copy should be │ │
4019 │ │ transmitted via │ │
4020 │ │ device. │ │
4021 └───────────┴─────────────────────┴─────────────────────┘
4022
4023 Using the dup statement.
4024
4025 # send to machine with ip address 10.2.3.4 on eth0
4026 ip filter forward dup to 10.2.3.4 device "eth0"
4027
4028 # copy raw frame to another interface
netdev ingress dup to "eth0"
4030 dup to "eth0"
4031
4032 # combine with map dst addr to gateways
4033 dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" }
4034
4035
4036 FWD STATEMENT
4037 The fwd statement is used to redirect a raw packet to another
4038 interface. It is only available in the netdev family ingress hook. It
4039 is similar to the dup statement except that no copy is made.
4040
4041 fwd to device
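
A minimal sketch of the fwd statement in a netdev ingress chain. The
table and chain names, the port, and the devices eth0 and eth1 are
assumptions made for illustration.

    table netdev x {
        chain ingress {
            type filter hook ingress device "eth0" priority 0; policy accept;
            # redirect DNS packets arriving on eth0 straight out of eth1
            udp dport 53 fwd to "eth1"
        }
    }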
4042
4043 SET STATEMENT
4044 The set statement is used to dynamically add or update elements in a
4045 set from the packet path. The set setname must already exist in the
4046 given table and must have been created with one or both of the dynamic
4047 and the timeout flags. The dynamic flag is required if the set
4048 statement expression includes a stateful object. The timeout flag is
4049 implied if the set is created with a timeout, and is required if the
set statement updates elements, rather than adding them. Furthermore,
these sets should specify a maximum set size (to prevent memory
exhaustion), and their elements should have a timeout (so their number
will not grow indefinitely), either from the set definition or from the
statement that adds or updates them. The set statement can be used, for
example, to create dynamic blacklists.
4056
4057 {add | update} @setname { expression [timeout timeout] [comment string] }
4058
4059 Example for simple blacklist.
4060
4061 # declare a set, bound to table "filter", in family "ip".
4062 # Timeout and size are mandatory because we will add elements from packet path.
4063 # Entries will timeout after one minute, after which they might be
4064 # re-added if limit condition persists.
4065 nft add set ip filter blackhole \
4066 "{ type ipv4_addr; flags dynamic; timeout 1m; size 65536; }"
4067
4068 # declare a set to store the limit per saddr.
4069 # This must be separate from blackhole since the timeout is different
4070 nft add set ip filter flood \
4071 "{ type ipv4_addr; flags dynamic; timeout 10s; size 128000; }"
4072
4073 # whitelist internal interface.
4074 nft add rule ip filter input meta iifname "internal" accept
4075
4076 # drop packets coming from blacklisted ip addresses.
4077 nft add rule ip filter input ip saddr @blackhole counter drop
4078
4079 # add source ip addresses to the blacklist if more than 10 tcp connection
4080 # requests occurred per second and ip address.
4081 nft add rule ip filter input tcp flags syn tcp dport ssh \
4082 add @flood { ip saddr limit rate over 10/second } \
4083 add @blackhole { ip saddr } \
4084 drop
4085
4086 # inspect state of the sets.
4087 nft list set ip filter flood
4088 nft list set ip filter blackhole
4089
4090 # manually add two addresses to the blackhole.
4091 nft add element filter blackhole { 10.2.3.4, 10.23.1.42 }
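
The example above only uses add. The following sketch shows the update
form with a per-element timeout given in the statement itself. The
table, set and chain names and the timeout values are hypothetical.

    table ip x {
        set recent_ssh {
            type ipv4_addr
            flags dynamic,timeout
            timeout 5m
            size 65536
        }
        chain input {
            type filter hook input priority filter; policy accept;
            # refresh the entry on every new SSH connection,
            # using a 10 minute timeout instead of the set default
            tcp dport 22 ct state new update @recent_ssh { ip saddr timeout 10m }
        }
    }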
4092
4093
4094 MAP STATEMENT
The map statement is used to look up data based on some specific input
key.
4097
4098 expression map { MAP_ELEMENTS }
4099
4100 MAP_ELEMENTS := MAP_ELEMENT [, MAP_ELEMENTS]
4101 MAP_ELEMENT := key : value
4102
4103 The key is a value returned by expression.
4104
4105 Using the map statement.
4106
4107 # select DNAT target based on TCP dport:
4108 # connections to port 80 are redirected to 192.168.1.100,
4109 # connections to port 8888 are redirected to 192.168.1.101
4110 nft add rule ip nat prerouting dnat tcp dport map { 80 : 192.168.1.100, 8888 : 192.168.1.101 }
4111
4112 # source address based SNAT:
4113 # packets from net 192.168.1.0/24 will appear as originating from 10.0.0.1,
4114 # packets from net 192.168.2.0/24 will appear as originating from 10.0.0.2
4115 nft add rule ip nat postrouting snat to ip saddr map { 192.168.1.0/24 : 10.0.0.1, 192.168.2.0/24 : 10.0.0.2 }
4116
4117
4118 VMAP STATEMENT
The verdict map (vmap) statement works analogously to the map statement,
but contains verdicts as values.
4121
4122 expression vmap { VMAP_ELEMENTS }
4123
4124 VMAP_ELEMENTS := VMAP_ELEMENT [, VMAP_ELEMENTS]
4125 VMAP_ELEMENT := key : verdict
4126
4127 Using the vmap statement.
4128
4129 # jump to different chains depending on layer 4 protocol type:
4130 nft add rule ip filter input ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain , icmp : jump icmp-chain }
4131
4132
4134 These are some additional commands included in nft.
4135
4136 MONITOR
4137 The monitor command allows you to listen to Netlink events produced by
4138 the nf_tables subsystem. These are either related to creation and
4139 deletion of objects or to packets for which meta nftrace was enabled.
4140 When they occur, nft will print to stdout the monitored events in
4141 either JSON or native nft format.
4142
4143 monitor [new | destroy] MONITOR_OBJECT
4144 monitor trace
4145
4146 MONITOR_OBJECT := tables | chains | sets | rules | elements | ruleset
4147
4148 To filter events related to a concrete object, use one of the keywords
4149 in MONITOR_OBJECT.
4150
4151 To filter events related to a concrete action, use keyword new or
4152 destroy.
4153
4154 The second form of invocation takes no further options and exclusively
4155 prints events generated for packets with nftrace enabled.
4156
4157 Hit ^C to finish the monitor operation.
4158
4159 Listen to all events, report in native nft format.
4160
4161 % nft monitor
4162
4163 Listen to deleted rules, report in JSON format.
4164
4165 % nft -j monitor destroy rules
4166
4167 Listen to both new and destroyed chains, in native nft format.
4168
4169 % nft monitor chains
4170
4171 Listen to ruleset events such as table, chain, rule, set, counters and
4172 quotas, in native nft format.
4173
4174 % nft monitor ruleset
4175
4176 Trace incoming packets from host 10.0.0.1.
4177
4178 % nft add rule filter input ip saddr 10.0.0.1 meta nftrace set 1
4179 % nft monitor trace
4180
4181
4183 When an error is detected, nft shows the line(s) containing the error,
4184 the position of the erroneous parts in the input stream and marks up
4185 the erroneous parts using carets (^). If the error results from the
4186 combination of two expressions or statements, the part imposing the
4187 constraints which are violated is marked using tildes (~).
4188
4189 For errors returned by the kernel, nft cannot detect which parts of the
4190 input caused the error and the entire command is marked.
4191
4192 Error caused by single incorrect expression.
4193
<cmdline>:1:19-22: Error: Interface does not exist
filter output oif eth0
                  ^^^^
4197
4198 Error caused by invalid combination of two expressions.
4199
<cmdline>:1:28-36: Error: Right hand side of relational expression (==) must be constant
filter output tcp dport == tcp dport
                        ~~ ^^^^^^^^^
4203
4204 Error returned by the kernel.
4205
4206 <cmdline>:0:0-23: Error: Could not process rule: Operation not permitted
4207 filter output oif wlan0
4208 ^^^^^^^^^^^^^^^^^^^^^^^
4209
4210
On success, nft exits with a status of 0. Unspecified errors cause it
to exit with a status of 1, memory allocation errors with a status of
2, and failure to open the Netlink socket with a status of 3.
4215
4217 libnftables(3), libnftables-json(5), iptables(8), ip6tables(8), arptables(8), ebtables(8), ip(8), tc(8)
4218
4219 There is an official wiki at: https://wiki.nftables.org
4220
4222 nftables was written by Patrick McHardy and Pablo Neira Ayuso, among
4223 many other contributors from the Netfilter community.
4224
Copyright © 2008-2014 Patrick McHardy <kaber@trash.net>
Copyright © 2013-2018 Pablo Neira Ayuso <pablo@netfilter.org>
4228
4229 nftables is free software; you can redistribute it and/or modify it
4230 under the terms of the GNU General Public License version 2 as
4231 published by the Free Software Foundation.
4232
4233 This documentation is licensed under the terms of the Creative Commons
4234 Attribution-ShareAlike 4.0 license, CC BY-SA 4.0
4235 http://creativecommons.org/licenses/by-sa/4.0/.
4236
4237
4238
4239 08/19/2021 NFT(8)