NFT(8)                                                                  NFT(8)

NAME
nft - Administration tool of the nftables framework for packet
filtering and classification

SYNOPSIS
nft [ -nNscaeSupyjt ] [ -I directory ] [ -f filename | -i | cmd ...]
nft -h
nft -v

DESCRIPTION
nft is the command line tool used to set up, maintain and inspect
packet filtering and classification rules in the Linux kernel, in the
nftables framework. The Linux kernel subsystem is known as nf_tables,
and ‘nf’ stands for Netfilter.

OPTIONS
The command accepts several different options which are documented here
in groups for better understanding of their meaning. You can get
information about options by running nft --help.
24
25 General options:
26
27 -h, --help
28 Show help message and all options.
29
30 -v, --version
31 Show version.
32
33 -V
34 Show long version information, including compile-time
35 configuration.
36
Ruleset input handling options that specify how to load rulesets:
38
39 -f, --file filename
40 Read input from filename. If filename is -, read from stdin.
41
42 -D, --define name=value
43 Define a variable. You can only combine this option with -f.
44
45 -i, --interactive
Read input from an interactive readline CLI. Use quit to exit, or
send the EOF marker, which is normally CTRL-D.
48
49 -I, --includepath directory
50 Add the directory directory to the list of directories to be
51 searched for included files. This option may be specified multiple
52 times.
53
54 -c, --check
Check the validity of commands without actually applying the changes.
56
57 -o, --optimize
58 Optimize your ruleset. You can combine this option with -c to
59 inspect the proposed optimizations.
60
Ruleset list output formatting options that modify the output of the
list ruleset command:
63
64 -a, --handle
65 Show object handles in output.
66
67 -s, --stateless
68 Omit stateful information of rules and stateful objects.
69
70 -t, --terse
71 Omit contents of sets from output.
72
73 -S, --service
74 Translate ports to service names as defined by /etc/services.
75
76 -N, --reversedns
Translate IP addresses to names via reverse DNS lookup. This may
slow down the listing, since it generates network traffic.
79
80 -u, --guid
81 Translate numeric UID/GID to names as defined by /etc/passwd and
82 /etc/group.
83
84 -n, --numeric
85 Print fully numerical output.
86
87 -y, --numeric-priority
88 Display base chain priority numerically.
89
90 -p, --numeric-protocol
91 Display layer 4 protocol numerically.
92
93 -T, --numeric-time
94 Show time, day and hour values in numeric format.
95
96 Command output formatting:
97
98 -e, --echo
99 When inserting items into the ruleset using add, insert or replace
100 commands, print notifications just like nft monitor.
101
102 -j, --json
103 Format output in JSON. See libnftables-json(5) for a schema
104 description.
105
106 -d, --debug level
Enable debugging output. The debug level can be any of scanner,
parser, eval, netlink, mnl, proto-ctx, segtree, all. Multiple levels
can be combined by separating them with commas, for example -d
eval,mnl.
111
INPUT FILE FORMATS
LEXICAL CONVENTIONS
114 Input is parsed line-wise. When the last character of a line, just
115 before the newline character, is a non-quoted backslash (\), the next
116 line is treated as a continuation. Multiple commands on the same line
117 can be separated using a semicolon (;).
118
119 A hash sign (#) begins a comment. All following characters on the same
120 line are ignored.
121
122 Identifiers begin with an alphabetic character (a-z,A-Z), followed by
123 zero or more alphanumeric characters (a-z,A-Z,0-9) and the characters
124 slash (/), backslash (\), underscore (_) and dot (.). Identifiers using
125 different characters or clashing with a keyword need to be enclosed in
126 double quotes (").
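
The conventions above can be illustrated with a short input fragment. This is only a sketch: the table name demo_lex and the chain name input are made up for illustration.

```nft
# A hash sign starts a comment that runs to the end of the line.

# Two commands on one line, separated by a semicolon.
add table inet demo_lex; add chain inet demo_lex input

# A trailing, unquoted backslash continues the command on the next line.
add rule inet demo_lex input \
    tcp dport 22 \
    accept
```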
127
128 INCLUDE FILES
129 include filename
130
131 Other files can be included by using the include statement. The
132 directories to be searched for include files can be specified using the
133 -I/--includepath option. You can override this behaviour either by
134 prepending ‘./’ to your path to force inclusion of files located in the
135 current working directory (i.e. relative path) or / for file location
136 expressed as an absolute path.
137
138 If -I/--includepath is not specified, then nft relies on the default
139 directory that is specified at compile time. You can retrieve this
140 default directory via the -h/--help option.
141
Include statements support the usual shell wildcard symbols (*,?,[]).
Having no matches for an include statement is not an error if wildcard
symbols are used in the include statement. This allows having
potentially empty include directories for statements like include
"/etc/firewall/rules/*". The wildcard matches are loaded in alphabetical
order. Files beginning with dot (.) are not matched by include
statements.
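
As a sketch of the lookup rules described above, the following fragment shows the different path forms. All file names are hypothetical, and the non-wildcard includes are expected to fail if the named files do not exist.

```nft
# Absolute path: used exactly as given.
include "/etc/firewall/base.nft"

# Leading ./ forces a lookup relative to the current working directory.
include "./local-overrides.nft"

# No prefix: searched in the include path (-I/--includepath, or the
# compile-time default directory).
include "common-sets.nft"

# Wildcard: zero matches is not an error; matches are loaded in
# alphabetical order and files beginning with a dot are skipped.
include "/etc/firewall/rules/*.nft"
```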
149
150 SYMBOLIC VARIABLES
151 define variable = expr
152 undefine variable
153 redefine variable = expr
154 $variable
155
156 Symbolic variables can be defined using the define statement. Variable
157 references are expressions and can be used to initialize other
158 variables. The scope of a definition is the current block and all
159 blocks contained within. Symbolic variables can be undefined using the
160 undefine statement, and modified using the redefine statement.
161
162 Using symbolic variables.
163
164 define int_if1 = eth0
165 define int_if2 = eth1
166 define int_ifs = { $int_if1, $int_if2 }
167 redefine int_if2 = wlan0
168 undefine int_if2
169
170 filter input iif $int_ifs accept
171
172
ADDRESS FAMILIES
Address families determine the type of packets which are processed. For
each address family, the kernel contains so-called hooks at specific
stages of the packet processing paths, which invoke nftables if rules
for these hooks exist.
178
179
ip        IPv4 address family.

ip6       IPv6 address family.

inet      Internet (IPv4/IPv6) address family.

arp       ARP address family, handling IPv4 ARP packets.

bridge    Bridge address family, handling packets which traverse a
          bridge device.

netdev    Netdev address family, handling packets on ingress and
          egress.
197
198
199 All nftables objects exist in address family specific namespaces,
200 therefore all identifiers include an address family. If an identifier
201 is specified without an address family, the ip family is used by
202 default.
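
A minimal sketch of the per-family namespaces, using a made-up table name (demo_ns):

```nft
# No family given: the table is created in the ip family.
add table demo_ns

# Same name, different families: these are two further, distinct tables.
add table ip6 demo_ns
add table inet demo_ns

# Refers to the table created by the first command.
list table ip demo_ns
```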
203
204 IPV4/IPV6/INET ADDRESS FAMILIES
The IPv4/IPv6/Inet address families handle IPv4, IPv6 or both types of
packets. They contain five hooks at different packet processing stages
in the network stack; the inet family additionally provides an ingress
hook.
208
209 Table 1. IPv4/IPv6/Inet address family hooks
210 ┌────────────┬────────────────────────────┐
211 │Hook │ Description │
212 ├────────────┼────────────────────────────┤
213 │ │ │
214 │prerouting │ All packets entering the │
215 │ │ system are processed by │
216 │ │ the prerouting hook. It is │
217 │ │ invoked before the routing │
218 │ │ process and is used for │
219 │ │ early filtering or │
220 │ │ changing packet attributes │
221 │ │ that affect routing. │
222 ├────────────┼────────────────────────────┤
223 │ │ │
224 │input │ Packets delivered to the │
225 │ │ local system are processed │
226 │ │ by the input hook. │
227 ├────────────┼────────────────────────────┤
228 │ │ │
229 │forward │ Packets forwarded to a │
230 │ │ different host are │
231 │ │ processed by the forward │
232 │ │ hook. │
233 ├────────────┼────────────────────────────┤
234 │ │ │
235 │output │ Packets sent by local │
236 │ │ processes are processed by │
237 │ │ the output hook. │
238 ├────────────┼────────────────────────────┤
239 │ │ │
240 │postrouting │ All packets leaving the │
241 │ │ system are processed by │
242 │ │ the postrouting hook. │
243 ├────────────┼────────────────────────────┤
244 │ │ │
245 │ingress │ All packets entering the │
246 │ │ system are processed by │
247 │ │ this hook. It is invoked │
248 │ │ before layer 3 protocol │
249 │ │ handlers, hence before the │
250 │ │ prerouting hook, and it │
251 │ │ can be used for filtering │
252 │ │ and policing. Ingress is │
253 │ │ only available for Inet │
254 │ │ family (since Linux kernel │
255 │ │ 5.10). │
256 └────────────┴────────────────────────────┘
257
258 ARP ADDRESS FAMILY
259 The ARP address family handles ARP packets received and sent by the
260 system. It is commonly used to mangle ARP packets for clustering.
261
Table 2. ARP address family hooks
┌───────┬────────────────────────────┐
│Hook   │ Description                │
├───────┼────────────────────────────┤
│       │                            │
│input  │ Packets delivered to the   │
│       │ local system are processed │
│       │ by the input hook.         │
├───────┼────────────────────────────┤
│       │                            │
│output │ Packets sent by the local  │
│       │ system are processed by    │
│       │ the output hook.           │
└───────┴────────────────────────────┘
276
277 BRIDGE ADDRESS FAMILY
278 The bridge address family handles Ethernet packets traversing bridge
279 devices.
280
281 The list of supported hooks is identical to IPv4/IPv6/Inet address
282 families above.
283
284 NETDEV ADDRESS FAMILY
285 The Netdev address family handles packets from the device ingress and
286 egress path. This family allows you to filter packets of any ethertype
287 such as ARP, VLAN 802.1q, VLAN 802.1ad (Q-in-Q) as well as IPv4 and
288 IPv6 packets.
289
Table 3. Netdev address family hooks
┌────────┬────────────────────────────┐
│Hook    │ Description                │
├────────┼────────────────────────────┤
│        │                            │
│ingress │ All packets entering the   │
│        │ system are processed by    │
│        │ this hook. It is invoked   │
│        │ after the network taps     │
│        │ (e.g. tcpdump), right after│
│        │ tc ingress and before      │
│        │ layer 3 protocol handlers. │
│        │ It can be used for early   │
│        │ filtering and policing.    │
├────────┼────────────────────────────┤
│        │                            │
│egress  │ All packets leaving the    │
│        │ system are processed by    │
│        │ this hook. It is invoked   │
│        │ after layer 3 protocol     │
│        │ handlers and before tc     │
│        │ egress. It can be used for │
│        │ late filtering and         │
│        │ policing.                  │
└────────┴────────────────────────────┘
315
316 Tunneled packets (such as vxlan) are processed by netdev family hooks
317 both in decapsulated and encapsulated (tunneled) form. So a packet can
318 be filtered on the overlay network as well as on the underlying
319 network.
320
321 Note that the order of netfilter and tc is mirrored on ingress versus
322 egress. This ensures symmetry for NAT and other packet mangling.
323
324 Ingress packets which are redirected out some other interface are only
325 processed by netfilter on egress if they have passed through netfilter
326 ingress processing before. Thus, ingress packets which are redirected
327 by tc are not subjected to netfilter. But they are if they are
328 redirected by netfilter on ingress. Conceptually, tc and netfilter can
329 be thought of as layers, with netfilter layered above tc: If the packet
330 hasn’t been passed up from the tc layer to the netfilter layer, it’s
331 not subjected to netfilter on egress.
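
As a sketch of the netdev ingress hook described above, the following fragment attaches a base chain to a single interface. The table, chain and device names are assumptions made for illustration; the device must exist on the system.

```nft
table netdev demo_netdev {
    chain ingress_eth0 {
        # Base chains in the netdev family name the device they are
        # attached to ("eth0" is an assumption).
        type filter hook ingress device "eth0" priority filter; policy accept;

        # Count ARP and 802.1Q VLAN frames seen on ingress.
        ether type arp counter
        ether type vlan counter
    }
}
```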
332
RULESET
{list | flush} ruleset [family]
335
336 The ruleset keyword is used to identify the whole set of tables,
337 chains, etc. currently in place in kernel. The following ruleset
338 commands exist:
339
340
list     Print the ruleset in human-readable format.

flush    Clear the whole ruleset. Note that, unlike iptables, this will
         remove all tables and whatever they contain, effectively
         leading to an empty ruleset - no packet filtering will happen
         anymore, so the kernel accepts any valid packet it receives.
367
368
369 It is possible to limit list and flush to a specific address family
370 only. For a list of valid family names, see the section called “ADDRESS
371 FAMILIES” above.
372
373 By design, list ruleset command output may be used as input to nft -f.
374 Effectively, this is the nft-equivalent of iptables-save and
375 iptables-restore.
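
A minimal sketch of a file meant to be loaded with nft -f, written in the same format that list ruleset prints. The table and chain shown are illustrative, not a recommended policy.

```nft
# Start from an empty ruleset ("flush ruleset ip" would limit this to
# the ip family), then define the desired state.
flush ruleset

table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        ct state established,related accept
    }
}
```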
376
TABLES
{add | create} table [family] table [{ [comment comment ;] [flags flags ;] }]
{delete | list | flush} table [family] table
list tables [family]
delete table [family] handle handle
382
383 Tables are containers for chains, sets and stateful objects. They are
384 identified by their address family and their name. The address family
385 must be one of ip, ip6, inet, arp, bridge, netdev. The inet address
386 family is a dummy family which is used to create hybrid IPv4/IPv6
387 tables. The meta expression nfproto keyword can be used to test which
388 family (ipv4 or ipv6) context the packet is being processed in. When no
389 address family is specified, ip is used by default. The only difference
390 between add and create is that the former will not return an error if
391 the specified table already exists while create will return an error.
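
The nfproto test mentioned above can be sketched as follows. The table and chain names are made up for illustration.

```nft
table inet demo_nfproto {
    chain input {
        type filter hook input priority filter; policy accept;

        # One hybrid chain, with separate counters per protocol family.
        meta nfproto ipv4 counter
        meta nfproto ipv6 counter
    }
}
```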
392
Table 4. Table flags
┌────────┬────────────────────────────┐
│Flag    │ Description                │
├────────┼────────────────────────────┤
│        │                            │
│dormant │ table is not evaluated any │
│        │ more (base chains are      │
│        │ unregistered).             │
└────────┴────────────────────────────┘
402
403 Add, change, delete a table.
404
405 # start nft in interactive mode
406 nft --interactive
407
408 # create a new table.
409 create table inet mytable
410
411 # add a new base chain: get input packets
412 add chain inet mytable myin { type filter hook input priority filter; }
413
414 # add a single counter to the chain
415 add rule inet mytable myin counter
416
417 # disable the table temporarily -- rules are not evaluated anymore
418 add table inet mytable { flags dormant; }
419
420 # make table active again:
421 add table inet mytable
422
423
424
add       Add a new table for the given family with the given name.

delete    Delete the specified table.

list      List all chains and rules of the specified table.

flush     Flush all chains and rules of the specified table.
439
440
CHAINS
{add | create} chain [family] table chain [{ type type hook hook [device device] priority priority ; [policy policy ;] [comment comment ;] }]
{delete | list | flush} chain [family] table chain
list chains [family]
delete chain [family] table handle handle
rename chain [family] table chain newname
447
Chains are containers for rules. They come in two kinds: base chains
and regular chains. A base chain is an entry point for packets from the
networking stack; a regular chain may be used as a jump target and is
used for better rule organization.
452
453
add       Add a new chain in the specified table. When a hook and
          priority value are specified, the chain is created as a base
          chain and hooked up to the networking stack.

create    Similar to the add command, but returns an error if the chain
          already exists.

delete    Delete the specified chain. The chain must not contain any
          rules or be used as jump target.

rename    Rename the specified chain.

list      List all rules of the specified chain.

flush     Flush all rules of the specified chain.
480
481
482 For base chains, type, hook and priority parameters are mandatory.
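
The difference between the two kinds of chain can be sketched as follows. All names and the 192.0.2.0/24 documentation prefix are illustrative assumptions.

```nft
table inet demo_chains {
    # Regular chain: no type, hook or priority, so packets only reach it
    # through a jump from another chain.
    chain ssh_checks {
        ip saddr 192.0.2.0/24 accept
    }

    # Base chain: type, hook and priority are mandatory.
    chain input {
        type filter hook input priority filter; policy drop;
        ct state established,related accept
        tcp dport 22 jump ssh_checks
    }
}
```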
483
Table 5. Supported chain types
┌───────┬───────────────┬────────────────┬──────────────────┐
│Type   │ Families      │ Hooks          │ Description      │
├───────┼───────────────┼────────────────┼──────────────────┤
│       │               │                │                  │
│filter │ all           │ all            │ Standard chain   │
│       │               │                │ type to use when │
│       │               │                │ in doubt.        │
├───────┼───────────────┼────────────────┼──────────────────┤
│       │               │                │                  │
│nat    │ ip, ip6, inet │ prerouting,    │ Chains of this   │
│       │               │ input, output, │ type perform     │
│       │               │ postrouting    │ Network Address  │
│       │               │                │ Translation      │
│       │               │                │ based on         │
│       │               │                │ conntrack        │
│       │               │                │ entries. Only    │
│       │               │                │ the first packet │
│       │               │                │ of a connection  │
│       │               │                │ actually         │
│       │               │                │ traverses this   │
│       │               │                │ chain - its      │
│       │               │                │ rules usually    │
│       │               │                │ define details   │
│       │               │                │ of the created   │
│       │               │                │ conntrack entry  │
│       │               │                │ (NAT statements  │
│       │               │                │ for instance).   │
├───────┼───────────────┼────────────────┼──────────────────┤
│       │               │                │                  │
│route  │ ip, ip6       │ output         │ If a packet has  │
│       │               │                │ traversed a      │
│       │               │                │ chain of this    │
│       │               │                │ type and is      │
│       │               │                │ about to be      │
│       │               │                │ accepted, a new  │
│       │               │                │ route lookup is  │
│       │               │                │ performed if     │
│       │               │                │ relevant parts   │
│       │               │                │ of the IP header │
│       │               │                │ have changed.    │
│       │               │                │ This can be used │
│       │               │                │ e.g. to implement│
│       │               │                │ policy routing   │
│       │               │                │ selectors in     │
│       │               │                │ nftables.        │
└───────┴───────────────┴────────────────┴──────────────────┘
531
532 Apart from the special cases illustrated above (e.g. nat type not
533 supporting forward hook or route type only supporting output hook),
534 there are three further quirks worth noticing:
535
536 • The netdev family supports merely two combinations, namely filter
537 type with ingress hook and filter type with egress hook. Base
538 chains in this family also require the device parameter to be
539 present since they exist per interface only.
540
541 • The arp family supports only the input and output hooks, both in
542 chains of type filter.
543
• The inet family also supports the ingress hook (since Linux kernel
5.10), to filter IPv4 and IPv6 packets at the same location as the
netdev ingress hook. This inet hook allows you to share sets and
maps between the usual prerouting, input, forward, output,
postrouting and this ingress hook.
549
550 The priority parameter accepts a signed integer value or a standard
551 priority name which specifies the order in which chains with the same
552 hook value are traversed. The ordering is ascending, i.e. lower
553 priority values have precedence over higher ones.
554
555 Standard priority values can be replaced with easily memorizable names.
556 Not all names make sense in every family with every hook (see the
557 compatibility matrices below) but their numerical value can still be
558 used for prioritizing chains.
559
560 These names and values are defined and made available based on what
561 priorities are used by xtables when registering their default chains.
562
563 Most of the families use the same values, but bridge uses different
564 ones from the others. See the following tables that describe the values
565 and compatibility.
566
Table 6. Standard priority names, family and hook compatibility matrix
┌─────────┬───────┬────────────────┬─────────────┐
│Name     │ Value │ Families       │ Hooks       │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│raw      │ -300  │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│mangle   │ -150  │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│dstnat   │ -100  │ ip, ip6, inet  │ prerouting  │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│filter   │ 0     │ ip, ip6, inet, │ all         │
│         │       │ arp, netdev    │             │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│security │ 50    │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│srcnat   │ 100   │ ip, ip6, inet  │ postrouting │
└─────────┴───────┴────────────────┴─────────────┘
590
Table 7. Standard priority names and hook compatibility for the bridge
family
┌───────┬───────┬─────────────┐
│Name   │ Value │ Hooks       │
├───────┼───────┼─────────────┤
│       │       │             │
│dstnat │ -300  │ prerouting  │
├───────┼───────┼─────────────┤
│       │       │             │
│filter │ -200  │ all         │
├───────┼───────┼─────────────┤
│       │       │             │
│out    │ 100   │ output      │
├───────┼───────┼─────────────┤
│       │       │             │
│srcnat │ 300   │ postrouting │
└───────┴───────┴─────────────┘
609
Basic arithmetic expressions (addition and subtraction) can also be
used with these standard names to ease relative prioritizing, e.g.
mangle - 5 stands for -155. Values are also printed in this form as
long as they are no further than 10 from the standard value.
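
A short sketch of priority names and relative priorities, using the ip family values from Table 6. The table and chain names are made up for illustration.

```nft
table ip demo_prio {
    # mangle is -150, so this chain is registered at -155.
    chain early { type filter hook prerouting priority mangle - 5; }

    # dstnat is -100 and is meant for the prerouting hook.
    chain nat_pre { type nat hook prerouting priority dstnat; }

    # filter is 0, so this chain is registered at 10 and is traversed
    # after the two chains above.
    chain late { type filter hook prerouting priority filter + 10; }
}
```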
614
Base chains also allow setting the chain’s policy, i.e. what happens to
packets not explicitly accepted or refused by the rules they contain.
Supported policy values are accept (which is the default) or drop.
618
RULES
{add | insert} rule [family] table chain [handle handle | index index] statement ... [comment comment]
replace rule [family] table chain handle handle statement ... [comment comment]
delete rule [family] table chain handle handle
623
624 Rules are added to chains in the given table. If the family is not
625 specified, the ip family is used. Rules are constructed from two kinds
626 of components according to a set of grammatical rules: expressions and
627 statements.
628
629 The add and insert commands support an optional location specifier,
630 which is either a handle or the index (starting at zero) of an existing
631 rule. Internally, rule locations are always identified by handle and
632 the translation from index happens in userspace. This has two potential
633 implications in case a concurrent ruleset change happens after the
634 translation was done: The effective rule index might change if a rule
635 was inserted or deleted before the referred one. If the referred rule
636 was deleted, the command is rejected by the kernel just as if an
637 invalid handle was given.
638
639 A comment is a single word or a double-quoted (") multi-word string
640 which can be used to make notes regarding the actual rule. Note: If you
641 use bash for adding rules, you have to escape the quotation marks, e.g.
642 \"enable ssh for servers\".
643
644
add        Add a new rule described by the list of statements. The rule
           is appended to the given chain unless a location is
           specified, in which case the rule is inserted after the
           specified rule.

insert     Same as add except the rule is inserted at the beginning of
           the chain or before the specified rule.

replace    Similar to add, but the rule replaces the specified rule.

delete     Delete the specified rule.
664
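The placement rules above can be sketched with the following commands, meant to be entered one at a time (for example in interactive mode). The table and chain names are assumptions. Inside nft input, the quotation marks of a comment need no shell escaping.

```nft
add table inet demo_rules
add chain inet demo_rules input { type filter hook input priority filter; policy accept; }

# No location given: appended to the end of the chain. The comment is
# the last item of the rule.
add rule inet demo_rules input tcp dport 22 accept comment "enable ssh for servers"

# No location given: inserted at the beginning of the chain.
insert rule inet demo_rules input ct state established,related accept

# With a location: added after the rule currently at index 0.
add rule inet demo_rules input index 0 tcp dport 443 accept
```
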
665
Add a rule to the output chain of the ip filter table.
667
668 nft add rule filter output ip daddr 192.168.0.0/24 accept # 'ip filter' is assumed
669 # same command, slightly more verbose
670 nft add rule ip filter output ip daddr 192.168.0.0/24 accept
671
Delete a rule from an inet table.
673
674 # nft -a list ruleset
675 table inet filter {
676 chain input {
677 type filter hook input priority filter; policy accept;
678 ct state established,related accept # handle 4
679 ip saddr 10.1.1.1 tcp dport ssh accept # handle 5
680 ...
681 # delete the rule with handle 5
682 nft delete rule inet filter input handle 5
683
684
SETS
nftables offers two kinds of set concepts. Anonymous sets are sets that
have no specific name. The set members are enclosed in curly braces,
with commas to separate elements when creating the rule the set is used
in. Once that rule is removed, the set is removed as well. They cannot
be updated, i.e. once an anonymous set is declared it cannot be changed
anymore except by removing/altering the rule that uses the anonymous
set.
694 Using anonymous sets to accept particular subnets and ports.
695
696 nft add rule filter input ip saddr { 10.0.0.0/8, 192.168.0.0/16 } tcp dport { 22, 443 } accept
697
Named sets are sets that need to be defined first before they can be
referenced in rules. Unlike anonymous sets, elements can be added to or
removed from a named set at any time. Sets are referenced from rules
using an @ prefixed to the set’s name.
702
703 Using named sets to accept addresses and ports.
704
705 nft add rule filter input ip saddr @allowed_hosts tcp dport @allowed_ports accept
706
707 The sets allowed_hosts and allowed_ports need to be created first. The
708 next section describes nft set syntax in more detail.
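
As a sketch, the two sets referenced in the rule above could be created and filled as follows before the rule is added. The element values are illustrative, and the table and base chain are created here only to keep the fragment self-contained.

```nft
add table ip filter
add chain ip filter input { type filter hook input priority filter; policy accept; }

# The interval flag is needed because the elements are prefixes.
add set ip filter allowed_hosts { type ipv4_addr; flags interval; }
add set ip filter allowed_ports { type inet_service; }

add element ip filter allowed_hosts { 10.0.0.0/8, 192.168.0.0/16 }
add element ip filter allowed_ports { 22, 443 }

add rule ip filter input ip saddr @allowed_hosts tcp dport @allowed_ports accept
```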
709
add set [family] table set { type type | typeof expression ; [flags flags ;] [timeout timeout ;] [gc-interval gc-interval ;] [elements = { element[, ...] } ;] [size size ;] [comment comment ;] [policy policy ;] [auto-merge ;] }
{delete | list | flush} set [family] table set
list sets [family]
delete set [family] table handle handle
{add | delete} element [family] table set { element[, ...] }
715
Sets are element containers of a user-defined data type. They are
uniquely identified by a user-defined name and attached to tables.
Their behaviour can be tuned with the flags that can be specified at
set creation time.
720
721
add       Add a new set in the specified table. See the Set
          specification table below for more information about how to
          specify properties of a set.

delete    Delete the specified set.

list      Display the elements in the specified set.

flush     Remove all elements from the specified set.
738
739
740 Table 8. Set specifications
741 ┌────────────┬──────────────────────┬─────────────────────┐
742 │Keyword │ Description │ Type │
743 ├────────────┼──────────────────────┼─────────────────────┤
744 │ │ │ │
745 │type │ data type of set │ string: ipv4_addr, │
746 │ │ elements │ ipv6_addr, │
747 │ │ │ ether_addr, │
748 │ │ │ inet_proto, │
749 │ │ │ inet_service, mark │
750 ├────────────┼──────────────────────┼─────────────────────┤
751 │ │ │ │
752 │typeof │ data type of set │ expression to │
753 │ │ element │ derive the data │
754 │ │ │ type from │
755 ├────────────┼──────────────────────┼─────────────────────┤
756 │ │ │ │
757 │flags │ set flags │ string: constant, │
758 │ │ │ dynamic, interval, │
759 │ │ │ timeout │
760 ├────────────┼──────────────────────┼─────────────────────┤
761 │ │ │ │
762 │timeout │ time an element │ string, decimal │
763 │ │ stays in the set, │ followed by unit. │
764 │ │ mandatory if set is │ Units are: d, h, m, │
765 │ │ added to from the │ s │
766 │ │ packet path │ │
767 │ │ (ruleset) │ │
768 ├────────────┼──────────────────────┼─────────────────────┤
769 │ │ │ │
770 │gc-interval │ garbage collection │ string, decimal │
771 │ │ interval, only │ followed by unit. │
772 │ │ available when │ Units are: d, h, m, │
773 │ │ timeout or flag │ s │
774 │ │ timeout are active │ │
775 ├────────────┼──────────────────────┼─────────────────────┤
776 │ │ │ │
777 │elements │ elements contained │ set data type │
778 │ │ by the set │ │
779 ├────────────┼──────────────────────┼─────────────────────┤
780 │ │ │ │
781 │size │ maximum number of │ unsigned integer │
782 │ │ elements in the │ (64 bit) │
783 │ │ set, mandatory if │ │
784 │ │ set is added to │ │
785 │ │ from the packet │ │
786 │ │ path (ruleset) │ │
787 ├────────────┼──────────────────────┼─────────────────────┤
788 │ │ │ │
789 │policy │ set policy │ string: performance │
790 │ │ │ [default], memory │
791 ├────────────┼──────────────────────┼─────────────────────┤
792 │ │ │ │
793 │auto-merge │ automatic merge of │ │
794 │ │ adjacent/overlapping │ │
795 │ │ set elements (only │ │
796 │ │ for interval sets) │ │
797 └────────────┴──────────────────────┴─────────────────────┘
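
A sketch combining several of the keywords from Table 8. The set names, sizes and addresses are made up for illustration.

```nft
table inet demo_sets {
    # Suitable for being added to from the packet path: both a timeout
    # and a size are given.
    set recent_ssh {
        type ipv4_addr
        flags dynamic, timeout
        timeout 1h
        size 65535
    }

    # The data type is derived from an expression; the two adjacent
    # prefixes are merged because auto-merge is set.
    set lan {
        typeof ip saddr
        flags interval
        auto-merge
        elements = { 10.0.0.0/24, 10.0.1.0/24 }
    }
}
```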
798
MAPS
add map [family] table map { type type | typeof expression ; [flags flags ;] [elements = { element[, ...] } ;] [size size ;] [comment comment ;] [policy policy ;] }
{delete | list | flush} map [family] table map
list maps [family]
803
804 Maps store data based on some specific key used as input. They are
805 uniquely identified by a user-defined name and attached to tables.
806
807
add               Add a new map in the specified table.

delete            Delete the specified map.

list              Display the elements in the specified map.

flush             Remove all elements from the specified map.

add element       Comma-separated list of elements to add into the
                  specified map.

delete element    Comma-separated list of element keys to delete from
                  the specified map.
826
827
828 Table 9. Map specifications
829 ┌─────────┬─────────────────────┬─────────────────────┐
830 │Keyword │ Description │ Type │
831 ├─────────┼─────────────────────┼─────────────────────┤
832 │ │ │ │
833 │type │ data type of map │ string: ipv4_addr, │
834 │ │ elements │ ipv6_addr, │
835 │ │ │ ether_addr, │
836 │ │ │ inet_proto, │
837 │ │ │ inet_service, mark, │
838 │ │ │ counter, quota. │
839 │ │ │ Counter and quota │
840 │ │ │ can’t be used as │
841 │ │ │ keys │
842 ├─────────┼─────────────────────┼─────────────────────┤
843 │ │ │ │
844 │typeof │ data type of set │ expression to │
845 │ │ element │ derive the data │
846 │ │ │ type from │
847 ├─────────┼─────────────────────┼─────────────────────┤
848 │ │ │ │
849 │flags │ map flags │ string: constant, │
850 │ │ │ interval │
851 ├─────────┼─────────────────────┼─────────────────────┤
852 │ │ │ │
853 │elements │ elements contained │ map data type │
854 │ │ by the map │ │
855 ├─────────┼─────────────────────┼─────────────────────┤
856 │ │ │ │
857 │size │ maximum number of │ unsigned integer │
858 │ │ elements in the map │ (64 bit) │
859 ├─────────┼─────────────────────┼─────────────────────┤
860 │ │ │ │
861 │policy │ map policy │ string: performance │
862 │ │ │ [default], memory │
863 └─────────┴─────────────────────┴─────────────────────┘
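
A minimal sketch of a map definition. In a map, the type names the key type and the data type separated by a colon. The map name and addresses are illustrative.

```nft
table ip demo_maps {
    # Key: a port (inet_service). Data: an IPv4 address.
    map port_to_host {
        type inet_service : ipv4_addr
        elements = { 80 : 192.168.1.10, 443 : 192.168.1.11 }
    }
}
```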
864
ELEMENTS
{add | create | delete | get } element [family] table set { ELEMENT[, ...] }
867
868 ELEMENT := key_expression OPTIONS [: value_expression]
869 OPTIONS := [timeout TIMESPEC] [expires TIMESPEC] [comment string]
870 TIMESPEC := [numd][numh][numm][num[s]]
871
Element-related commands allow changing the contents of named sets and
maps. key_expression is typically a value matching the set type.
value_expression is not allowed in sets but mandatory when adding to
maps, where it matches the data part in its type definition. When
deleting from maps, it may be specified but is optional as
key_expression uniquely identifies the element.

The create command is similar to add, with the exception that none of
the listed elements may already exist.

The get command is useful to check if an element is contained in a set,
which may be non-trivial in very large and/or interval sets. In the
latter case, the containing interval is returned instead of just the
element itself.
886
Table 10. Element options
┌────────┬───────────────────────────┐
│Option  │ Description               │
├────────┼───────────────────────────┤
│        │                           │
│timeout │ timeout value for         │
│        │ sets/maps with flag       │
│        │ timeout                   │
├────────┼───────────────────────────┤
│        │                           │
│expires │ the time until given      │
│        │ element expires, useful   │
│        │ for ruleset replication   │
│        │ only                      │
├────────┼───────────────────────────┤
│        │                           │
│comment │ per element comment field │
└────────┴───────────────────────────┘
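
The element commands and options can be sketched as follows. All table, set and map names and all addresses are assumptions made for illustration.

```nft
add table ip demo_elem
add set ip demo_elem guests { type ipv4_addr; flags timeout; }
add map ip demo_elem fwd { type inet_service : ipv4_addr; }

# Set elements: key plus optional per-element options.
add element ip demo_elem guests { 192.0.2.10 timeout 1h30m, 192.0.2.11 timeout 45s comment "printer" }

# create fails if any listed element already exists.
create element ip demo_elem guests { 192.0.2.12 }

# Map elements need a value; when deleting, the key alone is enough.
add element ip demo_elem fwd { 8080 : 192.0.2.20 }
delete element ip demo_elem fwd { 8080 }

# Check whether an element is contained in the set.
get element ip demo_elem guests { 192.0.2.10 }
```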
905
FLOWTABLES
{add | create} flowtable [family] table flowtable { hook hook priority priority ; devices = { device[, ...] } ; }
list flowtables [family]
{delete | list} flowtable [family] table flowtable
delete flowtable [family] table handle handle
911
Flowtables allow you to accelerate packet forwarding in software.
Flowtable entries are represented through a tuple that is composed of
the input interface, source and destination address, source and
destination port, and layer 3/4 protocols. Each entry also caches the
destination interface and the gateway address - to update the
destination link-layer address - to forward packets. The ttl and
hoplimit fields are also decremented. Hence, flowtables provide an
alternative path that allows packets to bypass the classic forwarding
path. Flowtables reside in the ingress hook that is located before the
prerouting hook. You can select which flows you want to offload through
the flow expression from the forward chain. Flowtables are identified
by their address family and their name. The address family must be one
of ip, ip6, or inet. The inet address family is a dummy family which is
used to create hybrid IPv4/IPv6 tables. When no address family is
specified, ip is used by default.
927
928 The priority can be a signed integer or filter which stands for 0.
929 Addition and subtraction can be used to set relative priority, e.g.
930 filter + 5 equals 5.
931
932
933 add Add a new flowtable for
934 the given family with the
935 given name.
936
937 delete Delete the specified
938 flowtable.
939
940 list List all flowtables.
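A minimal sketch of a flowtable and of a forward chain that offloads flows into it. The interface names eth0 and eth1 and the table, flowtable and chain names are assumptions made for this example.

```nft
table inet x {
    flowtable ft {
        hook ingress priority filter; devices = { eth0, eth1 };
    }

    chain forward {
        type filter hook forward priority filter; policy accept;
        # select TCP and UDP flows for offloading from the forward chain
        meta l4proto { tcp, udp } flow add @ft
    }
}
```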
941
942
944 list { secmarks | synproxys | flow tables | meters | hooks } [family]
945 list { secmarks | synproxys | flow tables | meters | hooks } table [family] table
946 list ct { timeout | expectation | helper | helpers } table [family] table
947
948 Inspect configured objects. list hooks shows the full hook pipeline,
949 including those registered by kernel modules, such as nf_conntrack.
950
952 {add | delete | list | reset} type [family] table object
953 delete type [family] table handle handle
954 list counters [family]
955 list quotas [family]
956 list limits [family]
957
958 Stateful objects are attached to tables and are identified by a unique
959 name. They group stateful information from rules. To reference them in
960 rules, use the keywords "type name", e.g. "counter name".
961
962
963 add Add a new stateful object
964 in the specified table.
965
966 delete Delete the specified
967 object.
968
969 list Display stateful
970 information the object
971 holds.
972
973 reset List-and-reset stateful
974 object.
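For example, assuming a counter named http already exists in the table filter of the ip family (both names are placeholders), list and reset could be used like this:

```nft
# display the packet and byte counts held by the object
list counter ip filter http

# display the counts, then reset them
reset counter ip filter http
```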
975
976
977 CT HELPER
978 add ct helper [family] table name { type type protocol protocol ; [l3proto family ;] }
979 delete ct helper [family] table name
980 list ct helpers
981
982 Ct helper is used to define connection tracking helpers that can then
983 be used in combination with the ct helper set statement. type and
984 protocol are mandatory. l3proto is derived from the table family by
985 default, i.e. in the inet table the kernel will try to load both the
986 IPv4 and IPv6 helper backends, if they are supported by the kernel.
987
988 Table 11. conntrack helper specifications
989 ┌─────────┬─────────────────────┬─────────────────────┐
990 │Keyword │ Description │ Type │
991 ├─────────┼─────────────────────┼─────────────────────┤
992 │ │ │ │
993 │type │ name of helper type │ quoted string (e.g. │
994 │ │ │ "ftp") │
995 ├─────────┼─────────────────────┼─────────────────────┤
996 │ │ │ │
997 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
998 │ │ the helper │ │
999 ├─────────┼─────────────────────┼─────────────────────┤
1000 │ │ │ │
1001 │l3proto │ layer 3 protocol of │ address family │
1002 │ │ the helper │ (e.g. ip) │
1003 ├─────────┼─────────────────────┼─────────────────────┤
1004 │ │ │ │
1005 │comment │ per ct helper │ string │
1006 │ │ comment field │ │
1007 └─────────┴─────────────────────┴─────────────────────┘
1008
1009 Defining and assigning an ftp helper.
1010
1011 Unlike iptables, helper assignment needs to be performed after the conntrack
1012 lookup has completed, for example with the default 0 hook priority.
1013
1014 table inet myhelpers {
1015 ct helper ftp-standard {
1016 type "ftp" protocol tcp
1017 }
1018 chain prerouting {
1019 type filter hook prerouting priority filter;
1020 tcp dport 21 ct helper set "ftp-standard"
1021 }
1022 }
1023
1024
1025 CT TIMEOUT
1026 add ct timeout [family] table name { protocol protocol ; policy = { state: value [, ...] } ; [l3proto family ;] }
1027 delete ct timeout [family] table name
1028 list ct timeouts
1029
1030 Ct timeout is used to update connection tracking timeout values. Timeout
1031 policies are assigned with the ct timeout set statement. protocol and
1032 policy are mandatory. l3proto is derived from the table family by
1033 default.
1034
1035 Table 12. conntrack timeout specifications
1036 ┌─────────┬─────────────────────┬──────────────────┐
1037 │Keyword │ Description │ Type │
1038 ├─────────┼─────────────────────┼──────────────────┤
1039 │ │ │ │
1040 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
1041 │ │ the timeout object │ │
1042 ├─────────┼─────────────────────┼──────────────────┤
1043 │ │ │ │
1044 │state │ connection state │ string (e.g. │
1045 │ │ name │ "established") │
1046 ├─────────┼─────────────────────┼──────────────────┤
1047 │ │ │ │
1048 │value │ timeout value for │ unsigned integer │
1049 │ │ connection state │ │
1050 ├─────────┼─────────────────────┼──────────────────┤
1051 │ │ │ │
1052 │l3proto │ layer 3 protocol of │ address family │
1053 │ │ the timeout object │ (e.g. ip) │
1054 ├─────────┼─────────────────────┼──────────────────┤
1055 │ │ │ │
1056 │comment │ per ct timeout │ string │
1057 │ │ comment field │ │
1058 └─────────┴─────────────────────┴──────────────────┘
1059
1060 TCP connection state names that can have a specific timeout value are:
1061
1062 close, close_wait, established, fin_wait, last_ack, retrans, syn_recv,
1063 syn_sent, time_wait and unack.
1064
1065 You can use sysctl -a | grep net.netfilter.nf_conntrack_tcp_timeout_ to
1066 view the system-wide defaults, and sysctl -w to change them. ct timeout
1067 allows for flow-specific settings, without changing the global timeouts.
1068
1069 For example, TCP port 53 could have much lower settings than other
1070 traffic.
1071
1072 UDP state names that can have a specific timeout value are replied and
1073 unreplied.
1074
1075 Defining and assigning a ct timeout policy.
1076
1077 table ip filter {
1078 ct timeout customtimeout {
1079 protocol tcp;
1080 l3proto ip
1081 policy = { established: 120, close: 20 }
1082 }
1083
1084 chain output {
1085 type filter hook output priority filter; policy accept;
1086 ct timeout set "customtimeout"
1087 }
1088 }
1089
1090 Testing the updated timeout policy.
1091
1092 % conntrack -E
1093
1094 It should display:
1095
1096 [UPDATE] tcp 6 120 ESTABLISHED src=172.16.19.128 dst=172.16.19.1
1097 sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128
1098 sport=41360 dport=22
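The same mechanism can be applied to UDP with the replied and unreplied state names. The sketch below mirrors the TCP example. The object name dns-udp and the timeout values are arbitrary choices for illustration.

```nft
table ip filter {
    ct timeout dns-udp {
        protocol udp;
        l3proto ip
        policy = { unreplied: 5, replied: 15 }
    }

    chain output {
        type filter hook output priority filter; policy accept;
        # apply the shorter timeouts to outgoing DNS lookups only
        udp dport 53 ct timeout set "dns-udp"
    }
}
```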
1099
1100
1101 CT EXPECTATION
1102 add ct expectation [family] table name { protocol protocol ; dport dport ; timeout timeout ; size size ; [l3proto family ;] }
1103 delete ct expectation [family] table name
1104 list ct expectations
1105
1106 Ct expectation is used to create connection expectations. Expectations
1107 are assigned with the ct expectation set statement. protocol, dport,
1108 timeout and size are mandatory. l3proto is derived from the table
1109 family by default.
1110
1111 Table 13. conntrack expectation specifications
1112 ┌─────────┬─────────────────────┬──────────────────┐
1113 │Keyword │ Description │ Type │
1114 ├─────────┼─────────────────────┼──────────────────┤
1115 │ │ │ │
1116 │protocol │ layer 4 protocol of │ string (e.g. tcp) │
1117 │ │ the expectation │ │
1118 │ │ object │ │
1119 ├─────────┼─────────────────────┼──────────────────┤
1120 │ │ │ │
1121 │dport │ destination port of │ unsigned integer │
1122 │ │ expected connection │ │
1123 ├─────────┼─────────────────────┼──────────────────┤
1124 │ │ │ │
1125 │timeout │ timeout value for │ unsigned integer │
1126 │ │ expectation │ │
1127 ├─────────┼─────────────────────┼──────────────────┤
1128 │ │ │ │
1129 │size │ size value for │ unsigned integer │
1130 │ │ expectation │ │
1131 ├─────────┼─────────────────────┼──────────────────┤
1132 │ │ │ │
1133 │l3proto │ layer 3 protocol of │ address family │
1134 │ │ the expectation │ (e.g. ip) │
1135 │ │ object │ │
1136 ├─────────┼─────────────────────┼──────────────────┤
1137 │ │ │ │
1138 │comment │ per ct expectation │ string │
1139 │ │ comment field │ │
1140 └─────────┴─────────────────────┴──────────────────┘
1141
1142 Defining and assigning a ct expectation policy.
1143
1144 table ip filter {
1145 ct expectation expect {
1146 protocol udp
1147 dport 9876
1148 timeout 2m
1149 size 8
1150 l3proto ip
1151 }
1152
1153 chain input {
1154 type filter hook input priority filter; policy accept;
1155 ct expectation set "expect"
1156 }
1157 }
1158
1159
1160 COUNTER
1161 add counter [family] table name [{ [ packets packets bytes bytes ; ] [ comment comment ; ] }]
1162 delete counter [family] table name
1163 list counters
1164
1165 Table 14. Counter specifications
1166 ┌────────┬─────────────────────┬──────────────────┐
1167 │Keyword │ Description │ Type │
1168 ├────────┼─────────────────────┼──────────────────┤
1169 │ │ │ │
1170 │packets │ initial count of │ unsigned integer │
1171 │ │ packets │ (64 bit) │
1172 ├────────┼─────────────────────┼──────────────────┤
1173 │ │ │ │
1174 │bytes │ initial count of │ unsigned integer │
1175 │ │ bytes │ (64 bit) │
1176 ├────────┼─────────────────────┼──────────────────┤
1177 │ │ │ │
1178 │comment │ per counter comment │ string │
1179 │ │ field │ │
1180 └────────┴─────────────────────┴──────────────────┘
1181
1182 Using named counters.
1183
1184 nft add counter filter http
1185 nft add rule filter input tcp dport 80 counter name \"http\"
1186
1187 Using named counters with maps.
1188
1189 nft add counter filter http
1190 nft add counter filter https
1191 nft add rule filter input counter name tcp dport map { 80 : \"http\", 443 : \"https\" }
1192
1193
1194 QUOTA
1195 add quota [family] table name { [over|until] bytes BYTE_UNIT [ used bytes BYTE_UNIT ] ; [ comment comment ; ] }
1196 BYTE_UNIT := bytes | kbytes | mbytes
1197 delete quota [family] table name
1198 list quotas
1199
1200 Table 15. Quota specifications
1201 ┌────────┬───────────────────┬────────────────────┐
1202 │Keyword │ Description │ Type │
1203 ├────────┼───────────────────┼────────────────────┤
1204 │ │ │ │
1205 │quota │ quota limit, used │ Two arguments, │
1206 │ │ as the quota name │ unsigned integer │
1207 │ │ │ (64 bit) and │
1208 │ │ │ string: bytes, │
1209 │ │ │ kbytes, mbytes. │
1210 │ │ │ "over" and "until" │
1211 │ │ │ go before these │
1212 │ │ │ arguments │
1213 ├────────┼───────────────────┼────────────────────┤
1214 │ │ │ │
1215 │used │ initial value of │ Two arguments, │
1216 │ │ used quota │ unsigned integer │
1217 │ │ │ (64 bit) and │
1218 │ │ │ string: bytes, │
1219 │ │ │ kbytes, mbytes │
1220 ├────────┼───────────────────┼────────────────────┤
1221 │ │ │ │
1222 │comment │ per quota comment │ string │
1223 │ │ field │ │
1224 └────────┴───────────────────┴────────────────────┘
1225
1226 Using named quotas.
1227
1228 nft add quota filter user123 { over 20 mbytes }
1229 nft add rule filter input ip saddr 192.168.10.123 quota name \"user123\"
1230
1231 Using named quotas with maps.
1232
1233 nft add quota filter user123 { over 20 mbytes }
1234 nft add quota filter user124 { over 20 mbytes }
1235 nft add rule filter input quota name ip saddr map { 192.168.10.123 : \"user123\", 192.168.10.124 : \"user124\" }
1236
1237
1239 Expressions represent values, either constants like network addresses,
1240 port numbers, etc., or data gathered from the packet during ruleset
1241 evaluation. Expressions can be combined using binary, logical,
1242 relational and other types of expressions to form complex or relational
1243 (match) expressions. They are also used as arguments to certain types
1244 of operations, like NAT, packet marking etc.
1245
1246 Each expression has a data type, which determines the size, parsing and
1247 representation of symbolic values and type compatibility with other
1248 expressions.
1249
1250 DESCRIBE COMMAND
1251 describe expression | data type
1252
1253 The describe command shows information about the type of an expression
1254 and its data type. A data type may also be given, in which case nft will
1255 display more information about the type.
1256
1257 The describe command.
1258
1259 $ nft describe tcp flags
1260 payload expression, datatype tcp_flag (TCP flag) (basetype bitmask, integer), 8 bits
1261
1262 predefined symbolic constants:
1263 fin 0x01
1264 syn 0x02
1265 rst 0x04
1266 psh 0x08
1267 ack 0x10
1268 urg 0x20
1269 ecn 0x40
1270 cwr 0x80
1271
1272
1274 Data types determine the size, parsing and representation of symbolic
1275 values and type compatibility of expressions. A number of global data
1276 types exist. In addition, some expression types define further data
1277 types specific to the expression type. Most data types have a fixed
1278 size; some, however, may have a dynamic size, e.g. the string type. Some
1279 types also have predefined symbolic constants. Those can be listed
1280 using the nft describe command:
1281
1282 $ nft describe ct_state
1283 datatype ct_state (conntrack state) (basetype bitmask, integer), 32 bits
1284
1285 pre-defined symbolic constants (in hexadecimal):
1286 invalid 0x00000001
1287 new ...
1288
1289 Types may be derived from lower order types, e.g. the IPv4 address type
1290 is derived from the integer type, meaning an IPv4 address can also be
1291 specified as an integer value.
1292
1293 In certain contexts (set and map definitions), it is necessary to
1294 explicitly specify a data type. Each type has a name which is used for
1295 this.
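As an illustration, set and map definitions name their data types explicitly. The table, set and map names below are placeholders, and inet_service and mark are type names not introduced in this part of the text.

```nft
table inet filter {
    # a set names the type of its elements
    set admin_hosts {
        type ipv4_addr
        elements = { 192.0.2.10, 192.0.2.11 }
    }

    # a map names the type of its key and of its data, separated by a colon
    map port_to_mark {
        type inet_service : mark
        elements = { 22 : 0x1, 443 : 0x2 }
    }
}
```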
1296
1297 INTEGER TYPE
1298 ┌────────┬─────────┬──────────┬───────────┐
1299 │Name │ Keyword │ Size │ Base type │
1300 ├────────┼─────────┼──────────┼───────────┤
1301 │ │ │ │ │
1302 │Integer │ integer │ variable │ - │
1303 └────────┴─────────┴──────────┴───────────┘
1304
1305 The integer type is used for numeric values. It may be specified as a
1306 decimal, hexadecimal or octal number. The integer type does not have a
1307 fixed size; its size is determined by the expression for which it is
1308 used.
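A short sketch of integer notations, using a placeholder table and chain. Only decimal and hexadecimal forms are shown here.

```nft
table ip filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # decimal notation
        tcp dport 80 accept
        # the same port in hexadecimal notation
        tcp dport 0x50 accept
        # a packet mark given in hexadecimal
        meta mark 0x10 accept
    }
}
```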
1309
1310 BITMASK TYPE
1311 ┌────────┬─────────┬──────────┬───────────┐
1312 │Name │ Keyword │ Size │ Base type │
1313 ├────────┼─────────┼──────────┼───────────┤
1314 │ │ │ │ │
1315 │Bitmask │ bitmask │ variable │ integer │
1316 └────────┴─────────┴──────────┴───────────┘
1317
1318 The bitmask type (bitmask) is used for bitmasks.
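Two common ways of matching on bitmask-typed expressions, sketched with a placeholder table and chain. The flag names are those listed by nft describe tcp flags.

```nft
table ip filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # match if at least one of the listed bits is set
        ct state established,related accept
        # mask out four flags and require that only syn is set among them
        tcp flags & (fin | syn | rst | ack) == syn counter
    }
}
```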
1319
1320 STRING TYPE
1321 ┌───────┬─────────┬──────────┬───────────┐
1322 │Name │ Keyword │ Size │ Base type │
1323 ├───────┼─────────┼──────────┼───────────┤
1324 │ │ │ │ │
1325 │String │ string │ variable │ - │
1326 └───────┴─────────┴──────────┴───────────┘
1327
1328 The string type is used for character strings. A string begins with an
1329 alphabetic character (a-zA-Z) followed by zero or more alphanumeric
1330 characters or the characters /, -, _ and . (period). In addition, anything
1331 enclosed in double quotes (") is recognized as a string.
1332
1333 String specification.
1334
1335 # Interface name
1336 filter input iifname eth0
1337
1338 # Weird interface name
1339 filter input iifname "(eth0)"
1340
1341
1342 LINK LAYER ADDRESS TYPE
1343 ┌───────────┬─────────┬──────────┬───────────┐
1344 │Name │ Keyword │ Size │ Base type │
1345 ├───────────┼─────────┼──────────┼───────────┤
1346 │ │ │ │ │
1347 │Link layer │ lladdr │ variable │ integer │
1348 │address │ │ │ │
1349 └───────────┴─────────┴──────────┴───────────┘
1350
1351 The link layer address type is used for link layer addresses. Link
1352 layer addresses are specified as a variable number of groups of two
1353 hexadecimal digits separated using colons (:).
1354
1355 Link layer address specification.
1356
1357 # Ethernet destination MAC address
1358 filter input ether daddr 20:c9:d0:43:12:d9
1359
1360
1361 IPV4 ADDRESS TYPE
1362 ┌─────────────┬───────────┬────────┬───────────┐
1363 │Name │ Keyword │ Size │ Base type │
1364 ├─────────────┼───────────┼────────┼───────────┤
1365 │ │ │ │ │
1366 │IPv4 address │ ipv4_addr │ 32 bit │ integer │
1367 └─────────────┴───────────┴────────┴───────────┘
1368
1369 The IPv4 address type is used for IPv4 addresses. Addresses are
1370 specified in either dotted decimal, dotted hexadecimal, dotted octal,
1371 decimal, hexadecimal, octal notation or as a host name. A host name
1372 will be resolved using the standard system resolver.
1373
1374 IPv4 address specification.
1375
1376 # dotted decimal notation
1377 filter output ip daddr 127.0.0.1
1378
1379 # host name
1380 filter output ip daddr localhost
1381
1382
1383 IPV6 ADDRESS TYPE
1384 ┌─────────────┬───────────┬─────────┬───────────┐
1385 │Name │ Keyword │ Size │ Base type │
1386 ├─────────────┼───────────┼─────────┼───────────┤
1387 │ │ │ │ │
1388 │IPv6 address │ ipv6_addr │ 128 bit │ integer │
1389 └─────────────┴───────────┴─────────┴───────────┘
1390
1391 The IPv6 address type is used for IPv6 addresses. Addresses are
1392 specified as a host name or as hexadecimal halfwords separated by
1393 colons. Addresses might be enclosed in square brackets ("[]") to
1394 differentiate them from port numbers.
1395
1396 IPv6 address specification.
1397
1398 # abbreviated loopback address
1399 filter output ip6 daddr ::1
1400
1401 IPv6 address specification with bracket notation.
1402
1403 # without [] the port number (22) would be parsed as part of the
1404 # ipv6 address
1405 ip6 nat prerouting tcp dport 2222 dnat to [1ce::d0]:22
1406
1407
1408 BOOLEAN TYPE
1409 ┌────────┬─────────┬───────┬───────────┐
1410 │Name │ Keyword │ Size │ Base type │
1411 ├────────┼─────────┼───────┼───────────┤
1412 │ │ │ │ │
1413 │Boolean │ boolean │ 1 bit │ integer │
1414 └────────┴─────────┴───────┴───────────┘
1415
1416 The boolean type is a syntactical helper type in userspace. Its use is
1417 in the right-hand side of a (typically implicit) relational expression
1418 to change the expression on the left-hand side into a boolean check
1419 (usually for existence).
1420
1421 Table 16. The following keywords will automatically resolve into a
1422 boolean type with given value
1423 ┌────────┬───────┐
1424 │Keyword │ Value │
1425 ├────────┼───────┤
1426 │ │ │
1427 │exists │ 1 │
1428 ├────────┼───────┤
1429 │ │ │
1430 │missing │ 0 │
1431 └────────┴───────┘
1432
1433 Table 17. Expressions that support a boolean comparison
1434 ┌───────────┬─────────────────────────┐
1435 │Expression │ Behaviour │
1436 ├───────────┼─────────────────────────┤
1437 │ │ │
1438 │fib │ Check route existence. │
1439 ├───────────┼─────────────────────────┤
1440 │ │ │
1441 │exthdr │ Check IPv6 extension │
1442 │ │ header existence. │
1443 ├───────────┼─────────────────────────┤
1444 │ │ │
1445 │tcp option │ Check TCP option header │
1446 │ │ existence. │
1447 └───────────┴─────────────────────────┘
1448
1449 Boolean specification.
1450
1451 # match if route exists
1452 filter input fib daddr . iif oif exists
1453
1454 # match only non-fragmented packets in IPv6 traffic
1455 filter input exthdr frag missing
1456
1457 # match if TCP timestamp option is present
1458 filter input tcp option timestamp exists
1459
1460
1461 ICMP TYPE TYPE
1462 ┌──────────┬───────────┬───────┬───────────┐
1463 │Name │ Keyword │ Size │ Base type │
1464 ├──────────┼───────────┼───────┼───────────┤
1465 │ │ │ │ │
1466 │ICMP Type │ icmp_type │ 8 bit │ integer │
1467 └──────────┴───────────┴───────┴───────────┘
1468
1469 The ICMP Type type is used to conveniently specify the ICMP header’s
1470 type field.
1471
1472 Table 18. Keywords that may be used when specifying the ICMP type
1473 ┌────────────────────────┬───────┐
1474 │Keyword │ Value │
1475 ├────────────────────────┼───────┤
1476 │ │ │
1477 │echo-reply │ 0 │
1478 ├────────────────────────┼───────┤
1479 │ │ │
1480 │destination-unreachable │ 3 │
1481 ├────────────────────────┼───────┤
1482 │ │ │
1483 │source-quench │ 4 │
1484 ├────────────────────────┼───────┤
1485 │ │ │
1486 │redirect │ 5 │
1487 ├────────────────────────┼───────┤
1488 │ │ │
1489 │echo-request │ 8 │
1490 ├────────────────────────┼───────┤
1491 │ │ │
1492 │router-advertisement │ 9 │
1493 ├────────────────────────┼───────┤
1494 │ │ │
1495 │router-solicitation │ 10 │
1496 ├────────────────────────┼───────┤
1497 │ │ │
1498 │time-exceeded │ 11 │
1499 ├────────────────────────┼───────┤
1500 │ │ │
1501 │parameter-problem │ 12 │
1502 ├────────────────────────┼───────┤
1503 │ │ │
1504 │timestamp-request │ 13 │
1505 ├────────────────────────┼───────┤
1506 │ │ │
1507 │timestamp-reply │ 14 │
1508 ├────────────────────────┼───────┤
1509 │ │ │
1510 │info-request │ 15 │
1511 ├────────────────────────┼───────┤
1512 │ │ │
1513 │info-reply │ 16 │
1514 ├────────────────────────┼───────┤
1515 │ │ │
1516 │address-mask-request │ 17 │
1517 ├────────────────────────┼───────┤
1518 │ │ │
1519 │address-mask-reply │ 18 │
1520 └────────────────────────┴───────┘
1521
1522 ICMP Type specification.
1523
1524 # match ping packets
1525 filter output icmp type { echo-request, echo-reply }
1526
1527
1528 ICMP CODE TYPE
1529 ┌──────────┬───────────┬───────┬───────────┐
1530 │Name │ Keyword │ Size │ Base type │
1531 ├──────────┼───────────┼───────┼───────────┤
1532 │ │ │ │ │
1533 │ICMP Code │ icmp_code │ 8 bit │ integer │
1534 └──────────┴───────────┴───────┴───────────┘
1535
1536 The ICMP Code type is used to conveniently specify the ICMP header’s
1537 code field.
1538
1539 Table 19. Keywords that may be used when specifying the ICMP code
1540 ┌─────────────────┬───────┐
1541 │Keyword │ Value │
1542 ├─────────────────┼───────┤
1543 │ │ │
1544 │net-unreachable │ 0 │
1545 ├─────────────────┼───────┤
1546 │ │ │
1547 │host-unreachable │ 1 │
1548 ├─────────────────┼───────┤
1549 │ │ │
1550 │prot-unreachable │ 2 │
1551 ├─────────────────┼───────┤
1552 │ │ │
1553 │port-unreachable │ 3 │
1554 ├─────────────────┼───────┤
1555 │ │ │
1556 │frag-needed │ 4 │
1557 ├─────────────────┼───────┤
1558 │ │ │
1559 │net-prohibited │ 9 │
1560 ├─────────────────┼───────┤
1561 │ │ │
1562 │host-prohibited │ 10 │
1563 ├─────────────────┼───────┤
1564 │ │ │
1565 │admin-prohibited │ 13 │
1566 └─────────────────┴───────┘
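ICMP code specification, as a minimal sketch with a placeholder table and chain. It combines an ICMP type keyword with code keywords from the table above.

```nft
table ip filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # count destination-unreachable messages that report an
        # unreachable host or port
        icmp type destination-unreachable icmp code { host-unreachable, port-unreachable } counter
    }
}
```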
1567
1568 ICMPV6 TYPE TYPE
1569 ┌────────────┬────────────┬───────┬───────────┐
1570 │Name │ Keyword │ Size │ Base type │
1571 ├────────────┼────────────┼───────┼───────────┤
1572 │ │ │ │ │
1573 │ICMPv6 Type │ icmpv6_type │ 8 bit │ integer │
1574 └────────────┴────────────┴───────┴───────────┘
1575
1576 The ICMPv6 Type type is used to conveniently specify the ICMPv6
1577 header’s type field.
1578
1579 Table 20. Keywords that may be used when specifying the ICMPv6 type
1580 ┌────────────────────────┬───────┐
1581 │Keyword │ Value │
1582 ├────────────────────────┼───────┤
1583 │ │ │
1584 │destination-unreachable │ 1 │
1585 ├────────────────────────┼───────┤
1586 │ │ │
1587 │packet-too-big │ 2 │
1588 ├────────────────────────┼───────┤
1589 │ │ │
1590 │time-exceeded │ 3 │
1591 ├────────────────────────┼───────┤
1592 │ │ │
1593 │parameter-problem │ 4 │
1594 ├────────────────────────┼───────┤
1595 │ │ │
1596 │echo-request │ 128 │
1597 ├────────────────────────┼───────┤
1598 │ │ │
1599 │echo-reply │ 129 │
1600 ├────────────────────────┼───────┤
1601 │ │ │
1602 │mld-listener-query │ 130 │
1603 ├────────────────────────┼───────┤
1604 │ │ │
1605 │mld-listener-report │ 131 │
1606 ├────────────────────────┼───────┤
1607 │ │ │
1608 │mld-listener-done │ 132 │
1609 ├────────────────────────┼───────┤
1610 │ │ │
1611 │mld-listener-reduction │ 132 │
1612 ├────────────────────────┼───────┤
1613 │ │ │
1614 │nd-router-solicit │ 133 │
1615 ├────────────────────────┼───────┤
1616 │ │ │
1617 │nd-router-advert │ 134 │
1618 ├────────────────────────┼───────┤
1619 │ │ │
1620 │nd-neighbor-solicit │ 135 │
1621 ├────────────────────────┼───────┤
1622 │ │ │
1623 │nd-neighbor-advert │ 136 │
1624 ├────────────────────────┼───────┤
1625 │ │ │
1626 │nd-redirect │ 137 │
1627 ├────────────────────────┼───────┤
1628 │ │ │
1629 │router-renumbering │ 138 │
1630 ├────────────────────────┼───────┤
1631 │ │ │
1632 │ind-neighbor-solicit │ 141 │
1633 ├────────────────────────┼───────┤
1634 │ │ │
1635 │ind-neighbor-advert │ 142 │
1636 ├────────────────────────┼───────┤
1637 │ │ │
1638 │mld2-listener-report │ 143 │
1639 └────────────────────────┴───────┘
1640
1641 ICMPv6 Type specification.
1642
1643 # match ICMPv6 ping packets
1644 filter output icmpv6 type { echo-request, echo-reply }
1645
1646
1647 ICMPV6 CODE TYPE
1648 ┌────────────┬─────────────┬───────┬───────────┐
1649 │Name │ Keyword │ Size │ Base type │
1650 ├────────────┼─────────────┼───────┼───────────┤
1651 │ │ │ │ │
1652 │ICMPv6 Code │ icmpv6_code │ 8 bit │ integer │
1653 └────────────┴─────────────┴───────┴───────────┘
1654
1655 The ICMPv6 Code type is used to conveniently specify the ICMPv6
1656 header’s code field.
1657
1658 Table 21. Keywords that may be used when specifying the ICMPv6 code
1659 ┌─────────────────┬───────┐
1660 │Keyword │ Value │
1661 ├─────────────────┼───────┤
1662 │ │ │
1663 │no-route │ 0 │
1664 ├─────────────────┼───────┤
1665 │ │ │
1666 │admin-prohibited │ 1 │
1667 ├─────────────────┼───────┤
1668 │ │ │
1669 │addr-unreachable │ 3 │
1670 ├─────────────────┼───────┤
1671 │ │ │
1672 │port-unreachable │ 4 │
1673 ├─────────────────┼───────┤
1674 │ │ │
1675 │policy-fail │ 5 │
1676 ├─────────────────┼───────┤
1677 │ │ │
1678 │reject-route │ 6 │
1679 └─────────────────┴───────┘
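ICMPv6 code specification, sketched with a placeholder table and chain:

```nft
table ip6 filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # count destination-unreachable messages for missing routes
        # or unreachable addresses
        icmpv6 type destination-unreachable icmpv6 code { no-route, addr-unreachable } counter
    }
}
```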
1680
1681 ICMPVX CODE TYPE
1682 ┌────────────┬─────────────┬───────┬───────────┐
1683 │Name │ Keyword │ Size │ Base type │
1684 ├────────────┼─────────────┼───────┼───────────┤
1685 │ │ │ │ │
1686 │ICMPvX Code │ icmpx_code │ 8 bit │ integer │
1687 └────────────┴─────────────┴───────┴───────────┘
1688
1689 The ICMPvX Code type abstraction is a set of values which overlap
1690 between ICMP and ICMPv6 Code types to be used from the inet family.
1691
1692 Table 22. Keywords that may be used when specifying the ICMPvX code
1693 ┌─────────────────┬───────┐
1694 │Keyword │ Value │
1695 ├─────────────────┼───────┤
1696 │ │ │
1697 │no-route │ 0 │
1698 ├─────────────────┼───────┤
1699 │ │ │
1700 │port-unreachable │ 1 │
1701 ├─────────────────┼───────┤
1702 │ │ │
1703 │host-unreachable │ 2 │
1704 ├─────────────────┼───────┤
1705 │ │ │
1706 │admin-prohibited │ 3 │
1707 └─────────────────┴───────┘
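One place where the ICMPvX code keywords are useful is a reject statement in an inet table, where a single rule has to cover both IPv4 and IPv6. The sketch below assumes the reject statement, which is not described in this part of the text, and uses placeholder table and chain names and an arbitrary port.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # answer with the matching ICMP or ICMPv6 error for either family
        tcp dport 23 reject with icmpx type admin-prohibited
    }
}
```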
1708
1709 CONNTRACK TYPES
1710 Table 23. overview of types used in ct expression and statement
1711 ┌─────────────────┬───────────┬─────────┬───────────┐
1712 │Name │ Keyword │ Size │ Base type │
1713 ├─────────────────┼───────────┼─────────┼───────────┤
1714 │ │ │ │ │
1715 │conntrack state │ ct_state │ 4 byte │ bitmask │
1716 ├─────────────────┼───────────┼─────────┼───────────┤
1717 │ │ │ │ │
1718 │conntrack │ ct_dir │ 8 bit │ integer │
1719 │direction │ │ │ │
1720 ├─────────────────┼───────────┼─────────┼───────────┤
1721 │ │ │ │ │
1722 │conntrack status │ ct_status │ 4 byte │ bitmask │
1723 ├─────────────────┼───────────┼─────────┼───────────┤
1724 │ │ │ │ │
1725 │conntrack event │ ct_event │ 4 byte │ bitmask │
1726 │bits │ │ │ │
1727 ├─────────────────┼───────────┼─────────┼───────────┤
1728 │ │ │ │ │
1729 │conntrack label │ ct_label │ 128 bit │ bitmask │
1730 └─────────────────┴───────────┴─────────┴───────────┘
1731
1732 For each of the types above, keywords are available for convenience:
1733
1734 Table 24. conntrack state (ct_state)
1735 ┌────────────┬───────┐
1736 │Keyword │ Value │
1737 ├────────────┼───────┤
1738 │ │ │
1739 │invalid │ 1 │
1740 ├────────────┼───────┤
1741 │ │ │
1742 │established │ 2 │
1743 ├────────────┼───────┤
1744 │ │ │
1745 │related │ 4 │
1746 ├────────────┼───────┤
1747 │ │ │
1748 │new │ 8 │
1749 ├────────────┼───────┤
1750 │ │ │
1751 │untracked │ 64 │
1752 └────────────┴───────┘
1753
1754 Table 25. conntrack direction (ct_dir)
1755 ┌─────────┬───────┐
1756 │Keyword │ Value │
1757 ├─────────┼───────┤
1758 │ │ │
1759 │original │ 0 │
1760 ├─────────┼───────┤
1761 │ │ │
1762 │reply │ 1 │
1763 └─────────┴───────┘
1764
1765 Table 26. conntrack status (ct_status)
1766 ┌───────────┬───────┐
1767 │Keyword │ Value │
1768 ├───────────┼───────┤
1769 │ │ │
1770 │expected │ 1 │
1771 ├───────────┼───────┤
1772 │ │ │
1773 │seen-reply │ 2 │
1774 ├───────────┼───────┤
1775 │ │ │
1776 │assured │ 4 │
1777 ├───────────┼───────┤
1778 │ │ │
1779 │confirmed │ 8 │
1780 ├───────────┼───────┤
1781 │ │ │
1782 │snat │ 16 │
1783 ├───────────┼───────┤
1784 │ │ │
1785 │dnat │ 32 │
1786 ├───────────┼───────┤
1787 │ │ │
1788 │dying │ 512 │
1789 └───────────┴───────┘
1790
1791 Table 27. conntrack event bits (ct_event)
1792 ┌──────────┬───────┐
1793 │Keyword │ Value │
1794 ├──────────┼───────┤
1795 │ │ │
1796 │new │ 1 │
1797 ├──────────┼───────┤
1798 │ │ │
1799 │related │ 2 │
1800 ├──────────┼───────┤
1801 │ │ │
1802 │destroy │ 4 │
1803 ├──────────┼───────┤
1804 │ │ │
1805 │reply │ 8 │
1806 ├──────────┼───────┤
1807 │ │ │
1808 │assured │ 16 │
1809 ├──────────┼───────┤
1810 │ │ │
1811 │protoinfo │ 32 │
1812 ├──────────┼───────┤
1813 │ │ │
1814 │helper │ 64 │
1815 ├──────────┼───────┤
1816 │ │ │
1817 │mark │ 128 │
1818 ├──────────┼───────┤
1819 │ │ │
1820 │seqadj │ 256 │
1821 ├──────────┼───────┤
1822 │ │ │
1823 │secmark │ 512 │
1824 ├──────────┼───────┤
1825 │ │ │
1826 │label │ 1024 │
1827 └──────────┴───────┘
1828
1829 Possible keywords for conntrack label type (ct_label) are read at
1830 runtime from /etc/connlabel.conf.
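The conntrack keywords above can be used directly in ct expressions. A minimal sketch with a placeholder table and chain:

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        ct state invalid drop
        ct state established,related accept
        # packets of connections that have been destination-NATed
        ct status dnat counter
        # packets flowing in the direction of the first packet
        ct direction original counter
    }
}
```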
1831
1832 DCCP PKTTYPE TYPE
1833 ┌─────────────────┬──────────────┬───────┬───────────┐
1834 │Name │ Keyword │ Size │ Base type │
1835 ├─────────────────┼──────────────┼───────┼───────────┤
1836 │ │ │ │ │
1837 │DCCP packet type │ dccp_pkttype │ 4 bit │ integer │
1838 └─────────────────┴──────────────┴───────┴───────────┘
1839
1840 The DCCP packet type abstracts the different legal values of the
1841 respective four bit field in the DCCP header, as stated by RFC4340.
1842 Note that possible values 10-15 are considered reserved and therefore
1843 not allowed to be used. In iptables' dccp match, these values are
1844 aliased INVALID. With nftables, one may simply match on the numeric
1845 value range, i.e. 10-15.
1846
1847 Table 28. Keywords that may be used when specifying the DCCP packet type
1848 ┌─────────┬───────┐
1849 │Keyword │ Value │
1850 ├─────────┼───────┤
1851 │ │ │
1852 │request │ 0 │
1853 ├─────────┼───────┤
1854 │ │ │
1855 │response │ 1 │
1856 ├─────────┼───────┤
1857 │ │ │
1858 │data │ 2 │
1859 ├─────────┼───────┤
1860 │ │ │
1861 │ack │ 3 │
1862 ├─────────┼───────┤
1863 │ │ │
1864 │dataack │ 4 │
1865 ├─────────┼───────┤
1866 │ │ │
1867 │closereq │ 5 │
1868 ├─────────┼───────┤
1869 │ │ │
1870 │close │ 6 │
1871 ├─────────┼───────┤
1872 │ │ │
1873 │reset │ 7 │
1874 ├─────────┼───────┤
1875 │ │ │
1876 │sync │ 8 │
1877 ├─────────┼───────┤
1878 │ │ │
1879 │syncack │ 9 │
1880 └─────────┴───────┘
1881
1883 The lowest order expression is a primary expression, representing
1884 either a constant or a single datum from a packet’s payload, meta data
1885 or a stateful module.
1886
1887 META EXPRESSIONS
1888 meta {length | nfproto | l4proto | protocol | priority}
1889 [meta] {mark | iif | iifname | iiftype | oif | oifname | oiftype | skuid | skgid | nftrace | rtclassid | ibrname | obrname | pkttype | cpu | iifgroup | oifgroup | cgroup | random | ipsec | iifkind | oifkind | time | hour | day }
1890
1891 A meta expression refers to meta data associated with a packet.
1892
1893 There are two types of meta expressions: unqualified and qualified meta
1894 expressions. Qualified meta expressions require the meta keyword before
1895 the meta key. Unqualified meta expressions can be specified by using
1896 the meta key directly or as qualified meta expressions. meta l4proto is
1897 useful to match a particular transport protocol that is part of either
1898 an IPv4 or IPv6 packet. It will also skip any IPv6 extension headers
1899 present in an IPv6 packet.
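The difference between qualified and unqualified meta expressions can be sketched as follows. The table and chain names and the mark value are placeholders.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # qualified form: l4proto requires the meta keyword
        meta l4proto udp counter
        # mark may be used unqualified, so these two rules are equivalent
        meta mark 0x1 counter
        mark 0x1 counter
    }
}
```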
1900
1901 meta iif, oif, iifname and oifname are used to match the interface a
1902 packet arrived on or is about to be sent out on.
1903
1904 iif and oif are used to match on the interface index, whereas iifname
1905 and oifname are used to match on the interface name. The two are not
1906 equivalent. Consider the rule
1907
1908 filter input meta iif "foo"
1909
1910 This rule can only be added if the interface "foo" exists. Also,
1911 the rule will continue to match even if the interface "foo" is renamed
1912 to "bar".
1913
1914 This is because internally the interface index is used. In case of
1915 dynamically created interfaces, such as tun/tap or dialup interfaces
1916 (ppp for example), it might be better to use iifname or oifname
1917 instead.
1918
1919 In these cases the name is used, so the interface does not have to
1920 exist when the rule is added. The rule stops matching if the interface
1921 is renamed, and it matches again if the interface is deleted and a new
1922 interface with the same name is later created.
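
The difference between index-based and name-based matching can be sketched with two rules. The interface names eth0 and ppp0 and the filter/input names are illustrative assumptions, not part of the original text.

```nft
# index-based: eth0 must exist when the rule is added,
# and the rule keeps matching if eth0 is later renamed
filter input iif "eth0" accept

# name-based: can be added before ppp0 exists
filter input iifname "ppp0" accept
```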
1923
1924 As with iptables, wildcard matching on interface name prefixes is
1925 available for iifname and oifname matches by appending an asterisk (*)
1926 character. Unlike iptables, however, nftables does not accept an
1927 interface name consisting of the wildcard character only; such an
1928 expression would always match and should simply be omitted. To match
1929 on a literal asterisk character, escape it with a backslash
1930 (\).
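
A short sketch of prefix wildcards on interface names. The prefix "ppp" and the filter/input names are placeholders chosen for illustration.

```nft
# match every input interface whose name starts with "ppp"
filter input iifname "ppp*" accept

# match an interface whose name is literally "ppp*"
filter input iifname "ppp\*" accept
```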
1931
1932 Table 29. Meta expression types
1933 ┌──────────┬─────────────────────┬─────────────────────┐
1934 │Keyword │ Description │ Type │
1935 ├──────────┼─────────────────────┼─────────────────────┤
1936 │ │ │ │
1937 │length │ Length of the │ integer (32 bit) │
1938 │ │ packet in bytes │ │
1939 ├──────────┼─────────────────────┼─────────────────────┤
1940 │ │ │ │
1941 │nfproto │ real hook protocol │ integer (32 bit) │
1942 │ │ family, useful only │ │
1943 │ │ in inet table │ │
1944 ├──────────┼─────────────────────┼─────────────────────┤
1945 │ │ │ │
1946 │l4proto │ layer 4 protocol, │ integer (8 bit) │
1947 │ │ skips ipv6 │ │
1948 │ │ extension headers │ │
1949 ├──────────┼─────────────────────┼─────────────────────┤
1950 │ │ │ │
1951 │protocol │ EtherType protocol │ ether_type │
1952 │ │ value │ │
1953 ├──────────┼─────────────────────┼─────────────────────┤
1954 │ │ │ │
1955 │priority │ TC packet priority │ tc_handle │
1956 ├──────────┼─────────────────────┼─────────────────────┤
1957 │ │ │ │
1958 │mark │ Packet mark │ mark │
1959 ├──────────┼─────────────────────┼─────────────────────┤
1960 │ │ │ │
1961 │iif │ Input interface │ iface_index │
1962 │ │ index │ │
1963 ├──────────┼─────────────────────┼─────────────────────┤
1964 │ │ │ │
1965 │iifname │ Input interface │ ifname │
1966 │ │ name │ │
1967 ├──────────┼─────────────────────┼─────────────────────┤
1968 │ │ │ │
1969 │iiftype │ Input interface │ iface_type │
1970 │ │ type │ │
1971 ├──────────┼─────────────────────┼─────────────────────┤
1972 │ │ │ │
1973 │oif │ Output interface │ iface_index │
1974 │ │ index │ │
1975 ├──────────┼─────────────────────┼─────────────────────┤
1976 │ │ │ │
1977 │oifname │ Output interface │ ifname │
1978 │ │ name │ │
1979 ├──────────┼─────────────────────┼─────────────────────┤
1980 │ │ │ │
1981 │oiftype │ Output interface │ iface_type │
1982 │ │ hardware type │ │
1983 ├──────────┼─────────────────────┼─────────────────────┤
1984 │ │ │ │
1985 │sdif │ Slave device input │ iface_index │
1986 │ │ interface index │ │
1987 ├──────────┼─────────────────────┼─────────────────────┤
1988 │ │ │ │
1989 │sdifname │ Slave device │ ifname │
1990 │ │ interface name │ │
1991 ├──────────┼─────────────────────┼─────────────────────┤
1992 │ │ │ │
1993 │skuid │ UID associated with │ uid │
1994 │ │ originating socket │ │
1995 ├──────────┼─────────────────────┼─────────────────────┤
1996 │ │ │ │
1997 │skgid │ GID associated with │ gid │
1998 │ │ originating socket │ │
1999 ├──────────┼─────────────────────┼─────────────────────┤
2000 │ │ │ │
2001 │rtclassid │ Routing realm │ realm │
2002 ├──────────┼─────────────────────┼─────────────────────┤
2003 │ │ │ │
2004 │ibrname │ Input bridge │ ifname │
2005 │ │ interface name │ │
2006 ├──────────┼─────────────────────┼─────────────────────┤
2007 │ │ │ │
2008 │obrname │ Output bridge │ ifname │
2009 │ │ interface name │ │
2010 ├──────────┼─────────────────────┼─────────────────────┤
2011 │ │ │ │
2012 │pkttype │ packet type │ pkt_type │
2013 ├──────────┼─────────────────────┼─────────────────────┤
2014 │ │ │ │
2015 │cpu │ cpu number │ integer (32 bit) │
2016 │ │ processing the │ │
2017 │ │ packet │ │
2018 ├──────────┼─────────────────────┼─────────────────────┤
2019 │ │ │ │
2020 │iifgroup │ incoming device │ devgroup │
2021 │ │ group │ │
2022 ├──────────┼─────────────────────┼─────────────────────┤
2023 │ │ │ │
2024 │oifgroup │ outgoing device │ devgroup │
2025 │ │ group │ │
2026 ├──────────┼─────────────────────┼─────────────────────┤
2027 │ │ │ │
2028 │cgroup │ control group id │ integer (32 bit) │
2029 ├──────────┼─────────────────────┼─────────────────────┤
2030 │ │ │ │
2031 │random │ pseudo-random │ integer (32 bit) │
2032 │ │ number │ │
2033 ├──────────┼─────────────────────┼─────────────────────┤
2034 │ │ │ │
2035 │ipsec │ true if packet was │ boolean (1 bit) │
2036 │ │ ipsec encrypted │ │
2037 ├──────────┼─────────────────────┼─────────────────────┤
2038 │ │ │ │
2039 │iifkind │ Input interface │ ifkind │
2040 │ │ kind │ │
2041 ├──────────┼─────────────────────┼─────────────────────┤
2042 │ │ │ │
2043 │oifkind │ Output interface │ ifkind │
2044 │ │ kind │ │
2045 ├──────────┼─────────────────────┼─────────────────────┤
2046 │ │ │ │
2047 │time │ Absolute time of │ Integer (32 bit) or │
2048 │ │ packet reception │ string │
2049 ├──────────┼─────────────────────┼─────────────────────┤
2050 │ │ │ │
2051 │day │ Day of week │ Integer (8 bit) or │
2052 │ │ │ string │
2053 ├──────────┼─────────────────────┼─────────────────────┤
2054 │ │ │ │
2055 │hour │ Hour of day │ String │
2056 └──────────┴─────────────────────┴─────────────────────┘
2057
2058 Table 30. Meta expression specific types
2059 ┌──────────────┬────────────────────────────┐
2060 │Type │ Description │
2061 ├──────────────┼────────────────────────────┤
2062 │ │ │
2063 │iface_index │ Interface index (32 bit │
2064 │ │ number). Can be specified │
2065 │ │ numerically or as name of │
2066 │ │ an existing interface. │
2067 ├──────────────┼────────────────────────────┤
2068 │ │ │
2069 │ifname │ Interface name (16 byte │
2070 │ │ string). Does not have to │
2071 │ │ exist. │
2072 ├──────────────┼────────────────────────────┤
2073 │ │ │
2074 │iface_type │ Interface type (16 bit │
2075 │ │ number). │
2076 ├──────────────┼────────────────────────────┤
2077 │ │ │
2078 │uid │ User ID (32 bit number). │
2079 │ │ Can be specified │
2080 │ │ numerically or as user │
2081 │ │ name. │
2082 ├──────────────┼────────────────────────────┤
2083 │ │ │
2084 │gid │ Group ID (32 bit number). │
2085 │ │ Can be specified │
2086 │ │ numerically or as group │
2087 │ │ name. │
2088 ├──────────────┼────────────────────────────┤
2089 │ │ │
2090 │realm │ Routing Realm (32 bit │
2091 │ │ number). Can be specified │
2092 │ │ numerically or as symbolic │
2093 │ │ name defined in │
2094 │ │ /etc/iproute2/rt_realms. │
2095 ├──────────────┼────────────────────────────┤
2096 │ │ │
2097 │devgroup │ Device group (32 bit │
2098 │ │ number). Can be specified │
2099 │ │ numerically or as symbolic │
2100 │ │ name defined in │
2101 │ │ /etc/iproute2/group. │
2102 ├──────────────┼────────────────────────────┤
2103 │ │ │
2104 │pkt_type │ Packet type: host │
2105 │ │ (addressed to local host), │
2106 │ │ broadcast (to all), │
2107 │ │ multicast (to group), │
2108 │ │ other (addressed to │
2109 │ │ another host). │
2110 ├──────────────┼────────────────────────────┤
2111 │ │ │
2112 │ifkind │ Interface kind (16 byte │
2113 │ │ string). See TYPES in │
2114 │ │ ip-link(8) for a list. │
2115 ├──────────────┼────────────────────────────┤
2116 │ │ │
2117 │time │ Either an integer or a │
2118 │ │ date in ISO format. For │
2119 │ │ example: "2019-06-06 │
2120 │ │ 17:00". Hour and seconds │
2121 │ │ are optional and can be │
2122 │ │ omitted if desired. If │
2123 │ │ omitted, midnight will be │
2124 │ │ assumed. The following │
2125 │ │ three would be equivalent: │
2126 │ │ "2019-06-06", "2019-06-06 │
2127 │ │ 00:00" and "2019-06-06 │
2128 │ │ 00:00:00". When an integer │
2129 │ │ is given, it is assumed to │
2130 │ │ be a UNIX timestamp. │
2131 ├──────────────┼────────────────────────────┤
2132 │ │ │
2133 │day │ Either a day of week │
2134 │ │ ("Monday", "Tuesday", │
2135 │ │ etc.), or an integer │
2136 │ │ between 0 and 6. Strings │
2137 │ │ are matched │
2138 │ │ case-insensitively, and a │
2139 │ │ full match is not expected │
2140 │ │ (e.g. "Mon" would match │
2141 │ │ "Monday"). When an integer │
2142 │ │ is given, 0 is Sunday and │
2143 │ │ 6 is Saturday. │
2144 ├──────────────┼────────────────────────────┤
2145 │ │ │
2146 │hour │ A string representing an │
2147 │ │ hour in 24-hour format. │
2148 │ │ Seconds can optionally be │
2149 │ │ specified. For example, │
2150 │ │ 17:00 and 17:00:00 would │
2151 │ │ be equivalent. │
2152 └──────────────┴────────────────────────────┘
2153
2154 Using meta expressions.
2155
2156 # qualified meta expression
2157 filter output meta oif eth0
2158 filter forward meta iifkind { "tun", "veth" }
2159
2160 # unqualified meta expression
2161 filter output oif eth0
2162
2163 # incoming packet was subject to ipsec processing
2164 raw prerouting meta ipsec exists accept
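
The time, day, hour, skuid and pkttype keys from the tables above can be used in the same way. This is a minimal sketch: the dates, the UID 1000 and the filter table with its input and output chains are assumptions made for illustration.

```nft
# drop packets received before an absolute point in time
filter input meta time < "2019-06-06 17:00" drop

# drop packets received on a Saturday
filter input meta day "Saturday" drop

# drop packets received at or after 17:00
filter input meta hour >= "17:00" drop

# count packets originating from sockets owned by UID 1000
filter output meta skuid 1000 counter

# drop link-layer broadcasts
filter input meta pkttype broadcast drop
```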
2165
2166
2167 SOCKET EXPRESSION
2168 socket {transparent | mark | wildcard}
2169 socket cgroupv2 level NUM
2170
2171 A socket expression can be used to search for an existing open TCP/UDP
2172 socket and its attributes that can be associated with a packet. It
2173 looks for an established or non-zero bound listening socket (possibly
2174 with a non-local address). You can also use it to match on the socket
2175 cgroupv2 at a given ancestor level, e.g. if the socket belongs to
2176 cgroupv2 a/b, ancestor level 1 checks for a match on cgroup a and
2177 ancestor level 2 checks for a match on cgroup b.
2178
2179 Table 31. Available socket attributes
2180 ┌────────────┬─────────────────────┬─────────────────┐
2181 │Name │ Description │ Type │
2182 ├────────────┼─────────────────────┼─────────────────┤
2183 │ │ │ │
2184 │transparent │ Value of the │ boolean (1 bit) │
2185 │ │ IP_TRANSPARENT │ │
2186 │ │ socket option in │ │
2187 │ │ the found socket. │ │
2188 │ │ It can be 0 or 1. │ │
2189 ├────────────┼─────────────────────┼─────────────────┤
2190 │ │ │ │
2191 │mark │ Value of the socket │ mark │
2192 │ │ mark (SOL_SOCKET, │ │
2193 │ │ SO_MARK). │ │
2194 ├────────────┼─────────────────────┼─────────────────┤
2195 │ │ │ │
2196 │wildcard │ Indicates whether │ boolean (1 bit) │
2197 │ │ the socket is │ │
2198 │ │ wildcard-bound │ │
2199 │ │ (e.g. 0.0.0.0 or │ │
2200 │ │ ::0). │ │
2201 ├────────────┼─────────────────────┼─────────────────┤
2202 │ │ │ │
2203 │cgroupv2 │ cgroup version 2 │ cgroupv2 │
2204 │ │ for this socket │ │
2205 │ │ (path from │ │
2206 │ │ /sys/fs/cgroup) │ │
2207 └────────────┴─────────────────────┴─────────────────┘
2208
2209 Using socket expression.
2210
2211 # Mark packets that correspond to a transparent socket. "socket wildcard 0"
2212 # means that zero-bound listener sockets are NOT matched (which is usually
2213 # exactly what you want).
2214 table inet x {
2215 chain y {
2216 type filter hook prerouting priority mangle; policy accept;
2217 socket transparent 1 socket wildcard 0 mark set 0x00000001 accept
2218 }
2219 }
2220
2221 # Trace packets that correspond to a socket with a mark value of 15
2222 table inet x {
2223 chain y {
2224 type filter hook prerouting priority mangle; policy accept;
2225 socket mark 0x0000000f nftrace set 1
2226 }
2227 }
2228
2229 # Set packet mark to socket mark
2230 table inet x {
2231 chain y {
2232 type filter hook prerouting priority mangle; policy accept;
2233 tcp dport 8080 mark set socket mark
2234 }
2235 }
2236
2237 # Count packets for cgroupv2 "user.slice" at level 1
2238 table inet x {
2239 chain y {
2240 type filter hook input priority filter; policy accept;
2241 socket cgroupv2 level 1 "user.slice" counter
2242 }
2243 }
2244
2245
2246 OSF EXPRESSION
2247 osf [ttl {loose | skip}] {name | version}
2248
2249 The osf expression does passive operating system fingerprinting. This
2250 expression compares some data (Window Size, MSS, options and their
2251 order, DF, and others) from packets with the SYN bit set.
2252
2253 Table 32. Available osf attributes
2254 ┌────────┬─────────────────────┬────────┐
2255 │Name │ Description │ Type │
2256 ├────────┼─────────────────────┼────────┤
2257 │ │ │ │
2258 │ttl │ Do TTL checks on │ string │
2259 │ │ the packet to │ │
2260 │ │ determine the │ │
2261 │ │ operating system. │ │
2262 ├────────┼─────────────────────┼────────┤
2263 │ │ │ │
2264 │version │ Do OS version │ │
2265 │ │ checks on the │ │
2266 │ │ packet. │ │
2267 ├────────┼─────────────────────┼────────┤
2268 │ │ │ │
2269 │name │ Name of the OS │ string │
2270 │ │ signature to match. │ │
2271 │ │ All signatures can │ │
2272 │ │ be found at pf.os │ │
2273 │ │ file. Use "unknown" │ │
2274 │ │ for OS signatures │ │
2275 │ │ that the expression │ │
2276 │ │ could not detect. │ │
2277 └────────┴─────────────────────┴────────┘
2278
2279 Available ttl values.
2280
2281 If no ttl attribute is passed, the TTL in the IP header is compared for an exact match with the TTL in the fingerprint. This generally works for LANs.
2282
2283 * loose: Check if the IP header's TTL is less than the fingerprint one. Works for globally-routable addresses.
2284 * skip: Do not compare the TTL at all.
2285
2286 Using osf expression.
2287
2288 # Accept packets that match the "Linux" OS genre signature without comparing TTL.
2289 table inet x {
2290 chain y {
2291 type filter hook input priority filter; policy accept;
2292 osf ttl skip name "Linux"
2293 }
2294 }
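
The other ttl mode and the "unknown" signature name can be sketched as follows. The filter/input names are placeholders, and the verdicts are chosen only for illustration.

```nft
# count SYN packets matching the "Linux" signature, allowing the
# header TTL to be lower than the fingerprint TTL
filter input osf ttl loose name "Linux" counter

# drop SYN packets whose operating system could not be detected
filter input osf name "unknown" drop
```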
2295
2296
2297 FIB EXPRESSIONS
2298 fib {saddr | daddr | mark | iif | oif} [. ...] {oif | oifname | type}
2299
2300 A fib expression queries the fib (forwarding information base) to
2301 obtain information such as the output interface index a particular
2302 address would use. The input is a tuple of elements that is used as
2303 input to the fib lookup functions.
2304
2305 Table 33. fib expression specific types
2306 ┌────────┬──────────────────┬──────────────────┐
2307 │Keyword │ Description │ Type │
2308 ├────────┼──────────────────┼──────────────────┤
2309 │ │ │ │
2310 │oif │ Output interface │ integer (32 bit) │
2311 │ │ index │ │
2312 ├────────┼──────────────────┼──────────────────┤
2313 │ │ │ │
2314 │oifname │ Output interface │ string │
2315 │ │ name │ │
2316 ├────────┼──────────────────┼──────────────────┤
2317 │ │ │ │
2318 │type │ Address type │ fib_addrtype │
2319 └────────┴──────────────────┴──────────────────┘
2320
2321 Use nft describe fib_addrtype to get a list of all address types.
2322
2323 Using fib expressions.
2324
2325 # drop packets without a reverse path
2326 filter prerouting fib saddr . iif oif missing drop
2327
2328 In this example, 'saddr . iif' looks up routing information based on the source address and the input interface.
2329 oif picks the output interface index from the routing information.
2330 If no route was found for the source address/input interface combination, the output interface index is zero.
2331 In case the input interface is specified as part of the input key, the output interface index is always the same as the input interface index or zero.
2332 If only 'saddr oif' is given, then oif can be any interface index or zero.
2333
2334 # drop packets to address not configured on incoming interface
2335 filter prerouting fib daddr . iif type != { local, broadcast, multicast } drop
2336
2337 # perform lookup in a specific 'blackhole' table (0xdead, needs an appropriate ip rule)
2338 filter prerouting meta mark set 0xdead fib daddr . mark type vmap { blackhole : drop, prohibit : jump prohibited, unreachable : drop }
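
A fib lookup can also use a single key. The following sketch, which assumes a filter table with a prerouting chain, counts packets whose destination address belongs to the local host or is a broadcast or multicast address.

```nft
# address type of the destination, without restricting the lookup to an interface
filter prerouting fib daddr type { local, broadcast, multicast } counter
```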
2339
2340
2341 ROUTING EXPRESSIONS
2342 rt [ip | ip6] {classid | nexthop | mtu | ipsec}
2343
2344 A routing expression refers to routing data associated with a packet.
2345
2346 Table 34. Routing expression types
2347 ┌────────┬─────────────────────┬─────────────────────┐
2348 │Keyword │ Description │ Type │
2349 ├────────┼─────────────────────┼─────────────────────┤
2350 │ │ │ │
2351 │classid │ Routing realm │ realm │
2352 ├────────┼─────────────────────┼─────────────────────┤
2353 │ │ │ │
2354 │nexthop │ Routing nexthop │ ipv4_addr/ipv6_addr │
2355 ├────────┼─────────────────────┼─────────────────────┤
2356 │ │ │ │
2357 │mtu │ TCP maximum segment │ integer (16 bit) │
2358 │ │ size of route │ │
2359 ├────────┼─────────────────────┼─────────────────────┤
2360 │ │ │ │
2361 │ipsec │ route via ipsec │ boolean │
2362 │ │ tunnel or transport │ │
2363 └────────┴─────────────────────┴─────────────────────┘
2364
2365 Table 35. Routing expression specific types
2366 ┌──────┬────────────────────────────┐
2367 │Type │ Description │
2368 ├──────┼────────────────────────────┤
2369 │ │ │
2370 │realm │ Routing Realm (32 bit │
2371 │ │ number). Can be specified │
2372 │ │ numerically or as symbolic │
2373 │ │ name defined in │
2374 │ │ /etc/iproute2/rt_realms. │
2375 └──────┴────────────────────────────┘
2376
2377 Using routing expressions.
2378
2379 # IP family independent rt expression
2380 filter output rt classid 10
2381
2382 # IP family dependent rt expressions
2383 ip filter output rt nexthop 192.168.0.1
2384 ip6 filter output rt nexthop fd00::1
2385 inet filter output rt ip nexthop 192.168.0.1
2386 inet filter output rt ip6 nexthop fd00::1
2387
2388 # outgoing packet will be encapsulated/encrypted by ipsec
2389 filter output rt ipsec exists
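
As a further sketch, rt mtu can be used as the source value of a statement. The filter/forward names are placeholders.

```nft
# set the TCP maximum segment size of forwarded SYN packets from the route
filter forward tcp flags syn tcp option maxseg size set rt mtu
```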
2390
2391
2392 IPSEC EXPRESSIONS
2393 ipsec {in | out} [ spnum NUM ] {reqid | spi}
2394 ipsec {in | out} [ spnum NUM ] {ip | ip6} {saddr | daddr}
2395
2396 An ipsec expression refers to ipsec data associated with a packet.
2397
2398 The in or out keyword must be used to specify whether the expression
2399 examines inbound or outbound policies. The in keyword can be used
2400 in the prerouting, input and forward hooks. The out keyword applies to
2401 forward, output and postrouting hooks. The optional keyword spnum can
2402 be used to match a specific state in a chain; it defaults to 0.
2403
2404 Table 36. Ipsec expression types
2405 ┌────────┬─────────────────────┬─────────────────────┐
2406 │Keyword │ Description │ Type │
2407 ├────────┼─────────────────────┼─────────────────────┤
2408 │ │ │ │
2409 │reqid │ Request ID │ integer (32 bit) │
2410 ├────────┼─────────────────────┼─────────────────────┤
2411 │ │ │ │
2412 │spi │ Security Parameter │ integer (32 bit) │
2413 │ │ Index │ │
2414 ├────────┼─────────────────────┼─────────────────────┤
2415 │ │ │ │
2416 │saddr │ Source address of │ ipv4_addr/ipv6_addr │
2417 │ │ the tunnel │ │
2418 ├────────┼─────────────────────┼─────────────────────┤
2419 │ │ │ │
2420 │daddr │ Destination address │ ipv4_addr/ipv6_addr │
2421 │ │ of the tunnel │ │
2422 └────────┴─────────────────────┴─────────────────────┘
2423
2424 Note: When using xfrm_interface, this expression is not usable in
2425 the output hook, as the plain packet does not traverse it with IPsec
2426 info attached. Use a chain in the postrouting hook instead.
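
A minimal sketch of ipsec expressions following the synopsis above. The request ID, SPI, prefix and the filter table with its chains are assumptions made for illustration.

```nft
# count inbound packets handled by a state with request ID 1
filter input ipsec in reqid 1 counter

# count inbound packets whose tunnel source address is in 192.168.1.0/24
filter input ipsec in ip saddr 192.168.1.0/24 counter

# count outbound packets by SPI, naming the first state (spnum 0) explicitly
filter postrouting ipsec out spnum 0 spi 0x100 counter
```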
2427
2428 NUMGEN EXPRESSION
2429 numgen {inc | random} mod NUM [ offset NUM ]
2430
2431 Create a number generator. The inc or random keywords control its
2432 operation mode: in inc mode, the last returned value is simply
2433 incremented. In random mode, a new random number is returned. The value
2434 after the mod keyword specifies an upper boundary (read: modulus) which
2435 is not reached by returned numbers. The optional offset is a fixed
2436 value that is added to the returned number.
2437
2438 A typical use-case for numgen is load-balancing:
2439
2440 Using numgen expression.
2441
2442 # round-robin between 192.168.10.100 and 192.168.20.200:
2443 add rule nat prerouting dnat to numgen inc mod 2 map \
2444 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2445
2446 # probability-based with odd bias using intervals:
2447 add rule nat prerouting dnat to numgen random mod 10 map \
2448 { 0-2 : 192.168.10.100, 3-9 : 192.168.20.200 }
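
The optional offset can be sketched as follows. The filter table and output chain are placeholders assumed to exist.

```nft
# alternate the packet mark between 100 and 101
add rule filter output meta mark set numgen inc mod 2 offset 100
```

With mod 2 the generator yields 0 and 1, so adding the offset 100 produces the marks 100 and 101.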
2449
2450
2451 HASH EXPRESSIONS
2452 jhash {ip saddr | ip6 daddr | tcp dport | udp sport | ether saddr} [. ...] mod NUM [ seed NUM ] [ offset NUM ]
2453 symhash mod NUM [ offset NUM ]
2454
2455 Use a hashing function to generate a number. The functions available
2456 are jhash, known as Jenkins Hash, and symhash, for Symmetric Hash.
2457 jhash requires an expression that selects the packet header fields to
2458 hash; concatenations are possible as well. The value after the mod
2459 keyword specifies an upper boundary (read: modulus) which is not
2460 reached by returned numbers. The optional seed specifies an initial
2461 value used as seed in the hashing function. The optional offset is
2462 a fixed value that is added to the number returned by the hash
2463 function.
2464
2465 A typical use-case for jhash and symhash is load-balancing:
2466
2467 Using hash expressions.
2468
2469 # load balance based on source ip between 2 ip addresses:
2470 add rule nat prerouting dnat to jhash ip saddr mod 2 map \
2471 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2472
2473 # symmetric load balancing between 2 ip addresses:
2474 add rule nat prerouting dnat to symhash mod 2 map \
2475 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
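
Concatenations, seed and offset can be sketched in the same style. The seed value, the addresses and the table and chain names are illustrative assumptions.

```nft
# hash on source address and source port with a fixed seed
add rule nat prerouting dnat to jhash ip saddr . tcp sport mod 2 seed 0xdeadbeef map \
{ 0 : 192.168.10.100, 1 : 192.168.20.200 }

# set the packet mark to 100 or 101 based on the symmetric hash
add rule filter prerouting meta mark set symhash mod 2 offset 100
```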
2476
2477
2479 Payload expressions refer to data from the packet’s payload.
2480
2481 ETHERNET HEADER EXPRESSION
2482 ether {daddr | saddr | type}
2483
2484 Table 37. Ethernet header expression types
2485 ┌────────┬────────────────────┬────────────┐
2486 │Keyword │ Description │ Type │
2487 ├────────┼────────────────────┼────────────┤
2488 │ │ │ │
2489 │daddr │ Destination MAC │ ether_addr │
2490 │ │ address │ │
2491 ├────────┼────────────────────┼────────────┤
2492 │ │ │ │
2493 │saddr │ Source MAC address │ ether_addr │
2494 ├────────┼────────────────────┼────────────┤
2495 │ │ │ │
2496 │type │ EtherType │ ether_type │
2497 └────────┴────────────────────┴────────────┘
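
A minimal sketch of ether matches. The MAC address and the table and chain names are placeholders.

```nft
# count frames from a specific source MAC address
filter input ether saddr 00:0f:54:0c:11:04 counter

# bridge family: accept ARP frames
bridge filter forward ether type arp accept
```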
2498
2499 VLAN HEADER EXPRESSION
2500 vlan {id | dei | pcp | type}
2501
2502 Table 38. VLAN header expression
2503 ┌────────┬─────────────────────┬──────────────────┐
2504 │Keyword │ Description │ Type │
2505 ├────────┼─────────────────────┼──────────────────┤
2506 │ │ │ │
2507 │id │ VLAN ID (VID) │ integer (12 bit) │
2508 ├────────┼─────────────────────┼──────────────────┤
2509 │ │ │ │
2510 │dei │ Drop Eligible │ integer (1 bit) │
2511 │ │ Indicator │ │
2512 ├────────┼─────────────────────┼──────────────────┤
2513 │ │ │ │
2514 │pcp │ Priority code point │ integer (3 bit) │
2515 ├────────┼─────────────────────┼──────────────────┤
2516 │ │ │ │
2517 │type │ EtherType │ ether_type │
2518 └────────┴─────────────────────┴──────────────────┘
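
A minimal sketch of vlan matches, assuming a netdev table named filter with a chain named ingress. The VLAN IDs are placeholders.

```nft
# count frames tagged with VLAN ID 100
netdev filter ingress vlan id 100 counter

# count frames in VLAN 4094 that carry priority code point 3
netdev filter ingress vlan id 4094 vlan pcp 3 counter
```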
2519
2520 ARP HEADER EXPRESSION
2521 arp {htype | ptype | hlen | plen | operation | saddr { ip | ether } | daddr { ip | ether } }
2522
2523 Table 39. ARP header expression
2524 ┌────────────┬─────────────────────┬──────────────────┐
2525 │Keyword │ Description │ Type │
2526 ├────────────┼─────────────────────┼──────────────────┤
2527 │ │ │ │
2528 │htype │ ARP hardware type │ integer (16 bit) │
2529 ├────────────┼─────────────────────┼──────────────────┤
2530 │ │ │ │
2531 │ptype │ EtherType │ ether_type │
2532 ├────────────┼─────────────────────┼──────────────────┤
2533 │ │ │ │
2534 │hlen │ Hardware address │ integer (8 bit) │
2535 │ │ len │ │
2536 ├────────────┼─────────────────────┼──────────────────┤
2537 │ │ │ │
2538 │plen │ Protocol address │ integer (8 bit) │
2539 │ │ len │ │
2540 ├────────────┼─────────────────────┼──────────────────┤
2541 │ │ │ │
2542 │operation │ Operation │ arp_op │
2543 ├────────────┼─────────────────────┼──────────────────┤
2544 │ │ │ │
2545 │saddr ether │ Ethernet sender │ ether_addr │
2546 │ │ address │ │
2547 ├────────────┼─────────────────────┼──────────────────┤
2548 │ │ │ │
2549 │daddr ether │ Ethernet target │ ether_addr │
2550 │ │ address │ │
2551 ├────────────┼─────────────────────┼──────────────────┤
2552 │ │ │ │
2553 │saddr ip │ IPv4 sender address │ ipv4_addr │
2554 ├────────────┼─────────────────────┼──────────────────┤
2555 │ │ │ │
2556 │daddr ip │ IPv4 target address │ ipv4_addr │
2557 └────────────┴─────────────────────┴──────────────────┘
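
A minimal sketch of arp matches, assuming an arp table named filter with an input chain. The addresses are placeholders.

```nft
# count ARP requests
arp filter input arp operation request counter

# count ARP packets sent from a given IPv4 address
arp filter input arp saddr ip 192.168.2.1 counter

# count ARP packets sent from a given MAC address
arp filter input arp saddr ether aa:bb:cc:aa:bb:cc counter
```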
2558
2559 IPV4 HEADER EXPRESSION
2560 ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
2561
2562 Table 40. IPv4 header expression
2563 ┌──────────┬─────────────────────┬──────────────────┐
2564 │Keyword │ Description │ Type │
2565 ├──────────┼─────────────────────┼──────────────────┤
2566 │ │ │ │
2567 │version │ IP header version │ integer (4 bit) │
2568 │ │ (4) │ │
2569 ├──────────┼─────────────────────┼──────────────────┤
2570 │ │ │ │
2571 │hdrlength │ IP header length │ integer (4 bit) │
2572 │ │ including options │ FIXME scaling │
2573 ├──────────┼─────────────────────┼──────────────────┤
2574 │ │ │ │
2575 │dscp │ Differentiated │ dscp │
2576 │ │ Services Code Point │ │
2577 ├──────────┼─────────────────────┼──────────────────┤
2578 │ │ │ │
2579 │ecn │ Explicit Congestion │ ecn │
2580 │ │ Notification │ │
2581 ├──────────┼─────────────────────┼──────────────────┤
2582 │ │ │ │
2583 │length │ Total packet length │ integer (16 bit) │
2584 ├──────────┼─────────────────────┼──────────────────┤
2585 │ │ │ │
2586 │id │ IP ID │ integer (16 bit) │
2587 ├──────────┼─────────────────────┼──────────────────┤
2588 │ │ │ │
2589 │frag-off │ Fragment offset │ integer (16 bit) │
2590 ├──────────┼─────────────────────┼──────────────────┤
2591 │ │ │ │
2592 │ttl │ Time to live │ integer (8 bit) │
2593 ├──────────┼─────────────────────┼──────────────────┤
2594 │ │ │ │
2595 │protocol │ Upper layer │ inet_proto │
2596 │ │ protocol │ │
2597 ├──────────┼─────────────────────┼──────────────────┤
2598 │ │ │ │
2599 │checksum │ IP header checksum │ integer (16 bit) │
2600 ├──────────┼─────────────────────┼──────────────────┤
2601 │ │ │ │
2602 │saddr │ Source address │ ipv4_addr │
2603 ├──────────┼─────────────────────┼──────────────────┤
2604 │ │ │ │
2605 │daddr │ Destination address │ ipv4_addr │
2606 └──────────┴─────────────────────┴──────────────────┘
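
A minimal sketch of ip header matches. The prefix, the DSCP class and the table and chain names are assumptions made for illustration.

```nft
# count ICMP packets from 192.168.1.0/24
ip filter input ip saddr 192.168.1.0/24 ip protocol icmp counter

# drop forwarded packets whose TTL is 1
ip filter forward ip ttl 1 drop

# count outgoing packets marked with DSCP class cs1
ip filter output ip dscp cs1 counter
```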
2607
2608 ICMP HEADER EXPRESSION
2609 icmp {type | code | checksum | id | sequence | gateway | mtu}
2610
2611 This expression refers to ICMP header fields. When using it in inet,
2612 bridge or netdev families, it will cause an implicit dependency on IPv4
2613 to be created. To match on unusual cases like ICMP over IPv6, one has
2614 to add an explicit meta protocol ip6 match to the rule.
2615
2616 Table 41. ICMP header expression
2617 ┌─────────┬─────────────────────┬──────────────────┐
2618 │Keyword │ Description │ Type │
2619 ├─────────┼─────────────────────┼──────────────────┤
2620 │ │ │ │
2621 │type │ ICMP type field │ icmp_type │
2622 ├─────────┼─────────────────────┼──────────────────┤
2623 │ │ │ │
2624 │code │ ICMP code field │ integer (8 bit) │
2625 ├─────────┼─────────────────────┼──────────────────┤
2626 │ │ │ │
2627 │checksum │ ICMP checksum field │ integer (16 bit) │
2628 ├─────────┼─────────────────────┼──────────────────┤
2629 │ │ │ │
2630 │id │ ID of echo │ integer (16 bit) │
2631 │ │ request/response │ │
2632 ├─────────┼─────────────────────┼──────────────────┤
2633 │ │ │ │
2634 │sequence │ sequence number of │ integer (16 bit) │
2635 │ │ echo │ │
2636 │ │ request/response │ │
2637 ├─────────┼─────────────────────┼──────────────────┤
2638 │ │ │ │
2639 │gateway │ gateway of │ integer (32 bit) │
2640 │ │ redirects │ │
2641 ├─────────┼─────────────────────┼──────────────────┤
2642 │ │ │ │
2643 │mtu │ MTU of path MTU │ integer (16 bit) │
2644 │ │ discovery │ │
2645 └─────────┴─────────────────────┴──────────────────┘
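
A minimal sketch of icmp matches in an inet table, where the implicit IPv4 dependency described above applies. The table and chain names are placeholders.

```nft
# accept echo requests (IPv4 only, because of the implicit dependency)
inet filter input icmp type echo-request accept

# count ICMP error messages
inet filter input icmp type { destination-unreachable, time-exceeded } counter
```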
2646
2647 IGMP HEADER EXPRESSION
2648 igmp {type | mrt | checksum | group}
2649
2650 This expression refers to IGMP header fields. When using it in inet,
2651 bridge or netdev families, it will cause an implicit dependency on IPv4
2652 to be created. To match on unusual cases like IGMP over IPv6, one has
2653 to add an explicit meta protocol ip6 match to the rule.
2654
2655 Table 42. IGMP header expression
2656 ┌─────────┬─────────────────────┬──────────────────┐
2657 │Keyword │ Description │ Type │
2658 ├─────────┼─────────────────────┼──────────────────┤
2659 │ │ │ │
2660 │type │ IGMP type field │ igmp_type │
2661 ├─────────┼─────────────────────┼──────────────────┤
2662 │ │ │ │
2663 │mrt │ IGMP maximum │ integer (8 bit) │
2664 │ │ response time field │ │
2665 ├─────────┼─────────────────────┼──────────────────┤
2666 │ │ │ │
2667 │checksum │ IGMP checksum field │ integer (16 bit) │
2668 ├─────────┼─────────────────────┼──────────────────┤
2669 │ │ │ │
2670 │group │ Group address │ integer (32 bit) │
2671 └─────────┴─────────────────────┴──────────────────┘
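
A minimal sketch of igmp matches, assuming an inet table named filter with an input chain.

```nft
# count IGMP membership queries
inet filter input igmp type membership-query counter

# count IGMP packets with a maximum response time of 10
inet filter input igmp mrt 10 counter
```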
2672
2673 IPV6 HEADER EXPRESSION
2674 ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
2675
2676 This expression refers to the IPv6 header fields. Use caution with
2677 ip6 nexthdr: the value only refers to the next header, i.e. ip6 nexthdr
2678 tcp will only match if the IPv6 packet does not contain any extension
2679 headers. Packets that are fragmented or that contain, for example, a
2680 routing extension header will not be matched. Use meta l4proto instead
2681 if you wish to match the real transport header and ignore any
2682 additional extension headers.
2683
2684 Table 43. IPv6 header expression
2685 ┌──────────┬─────────────────────┬──────────────────┐
2686 │Keyword │ Description │ Type │
2687 ├──────────┼─────────────────────┼──────────────────┤
2688 │ │ │ │
2689 │version │ IP header version │ integer (4 bit) │
2690 │ │ (6) │ │
2691 ├──────────┼─────────────────────┼──────────────────┤
2692 │ │ │ │
2693 │dscp │ Differentiated │ dscp │
2694 │ │ Services Code Point │ │
2695 ├──────────┼─────────────────────┼──────────────────┤
2696 │ │ │ │
2697 │ecn │ Explicit Congestion │ ecn │
2698 │ │ Notification │ │
2699 ├──────────┼─────────────────────┼──────────────────┤
2700 │ │ │ │
2701 │flowlabel │ Flow label │ integer (20 bit) │
2702 ├──────────┼─────────────────────┼──────────────────┤
2703 │ │ │ │
2704 │length │ Payload length │ integer (16 bit) │
2705 ├──────────┼─────────────────────┼──────────────────┤
2706 │ │ │ │
2707 │nexthdr │ Nexthdr protocol │ inet_proto │
2708 ├──────────┼─────────────────────┼──────────────────┤
2709 │ │ │ │
2710 │hoplimit │ Hop limit │ integer (8 bit) │
2711 ├──────────┼─────────────────────┼──────────────────┤
2712 │ │ │ │
2713 │saddr │ Source address │ ipv6_addr │
2714 ├──────────┼─────────────────────┼──────────────────┤
2715 │ │ │ │
2716 │daddr │ Destination address │ ipv6_addr │
2717 └──────────┴─────────────────────┴──────────────────┘
2718
2719 Using ip6 header expressions.
2720
2721 # matching if first extension header indicates a fragment
2722 ip6 nexthdr ipv6-frag
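
The difference between ip6 nexthdr and meta l4proto can be sketched with two rules. The ip6 table named filter and its input chain are placeholders.

```nft
# matches TCP only when the TCP header directly follows the IPv6 header
ip6 filter input ip6 nexthdr tcp counter

# matches TCP even when extension headers precede the TCP header
ip6 filter input meta l4proto tcp counter
```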
2723
2724
2725 ICMPV6 HEADER EXPRESSION
2726 icmpv6 {type | code | checksum | parameter-problem | packet-too-big | id | sequence | max-delay}
2727
2728 This expression refers to ICMPv6 header fields. When using it in inet,
2729 bridge or netdev families, it will cause an implicit dependency on IPv6
2730 to be created. To match on unusual cases like ICMPv6 over IPv4, one has
2731 to add an explicit meta protocol ip match to the rule.
2732
2733 Table 44. ICMPv6 header expression
2734 ┌──────────────────┬────────────────────┬──────────────────┐
2735 │Keyword │ Description │ Type │
2736 ├──────────────────┼────────────────────┼──────────────────┤
2737 │ │ │ │
2738 │type │ ICMPv6 type field │ icmpv6_type │
2739 ├──────────────────┼────────────────────┼──────────────────┤
2740 │ │ │ │
2741 │code │ ICMPv6 code field │ integer (8 bit) │
2742 ├──────────────────┼────────────────────┼──────────────────┤
2743 │ │ │ │
2744 │checksum │ ICMPv6 checksum │ integer (16 bit) │
2745 │ │ field │ │
2746 ├──────────────────┼────────────────────┼──────────────────┤
2747 │ │ │ │
2748 │parameter-problem │ pointer to problem │ integer (32 bit) │
2749 ├──────────────────┼────────────────────┼──────────────────┤
2750 │ │ │ │
2751 │packet-too-big │ oversized MTU │ integer (32 bit) │
2752 ├──────────────────┼────────────────────┼──────────────────┤
2753 │ │ │ │
2754 │id │ ID of echo │ integer (16 bit) │
2755 │ │ request/response │ │
2756 ├──────────────────┼────────────────────┼──────────────────┤
2757 │ │ │ │
2758 │sequence │ sequence number of │ integer (16 bit) │
2759 │ │ echo │ │
2760 │ │ request/response │ │
2761 ├──────────────────┼────────────────────┼──────────────────┤
2762 │ │ │ │
2763 │max-delay │ maximum response │ integer (16 bit) │
2764 │ │ delay of MLD │ │
2765 │ │ queries │ │
2766 └──────────────────┴────────────────────┴──────────────────┘
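
Using icmpv6 header expressions. A minimal sketch, assuming an ip6 table
named filter with an input base chain already exists (both names are
illustrative):

# accept echo requests and neighbour discovery messages
ip6 filter input icmpv6 type { echo-request, nd-neighbor-solicit, nd-neighbor-advert } accept

# count destination-unreachable messages that carry code 4
ip6 filter input icmpv6 type destination-unreachable icmpv6 code 4 counter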
2767
2768 TCP HEADER EXPRESSION
2769 tcp {sport | dport | sequence | ackseq | doff | reserved | flags | window | checksum | urgptr}
2770
2771 Table 45. TCP header expression
2772 ┌─────────┬──────────────────┬──────────────────┐
2773 │Keyword │ Description │ Type │
2774 ├─────────┼──────────────────┼──────────────────┤
2775 │ │ │ │
2776 │sport │ Source port │ inet_service │
2777 ├─────────┼──────────────────┼──────────────────┤
2778 │ │ │ │
2779 │dport │ Destination port │ inet_service │
2780 ├─────────┼──────────────────┼──────────────────┤
2781 │ │ │ │
2782 │sequence │ Sequence number │ integer (32 bit) │
2783 ├─────────┼──────────────────┼──────────────────┤
2784 │ │ │ │
2785 │ackseq │ Acknowledgement │ integer (32 bit) │
2786 │ │ number │ │
2787 ├─────────┼──────────────────┼──────────────────┤
2788 │ │ │ │
2789 │doff │ Data offset │ integer (4 bit) │
2790 │ │ │ FIXME scaling │
2791 ├─────────┼──────────────────┼──────────────────┤
2792 │ │ │ │
2793 │reserved │ Reserved area │ integer (4 bit) │
2794 ├─────────┼──────────────────┼──────────────────┤
2795 │ │ │ │
2796 │flags │ TCP flags │ tcp_flag │
2797 ├─────────┼──────────────────┼──────────────────┤
2798 │ │ │ │
2799 │window │ Window │ integer (16 bit) │
2800 ├─────────┼──────────────────┼──────────────────┤
2801 │ │ │ │
2802 │checksum │ Checksum │ integer (16 bit) │
2803 ├─────────┼──────────────────┼──────────────────┤
2804 │ │ │ │
2805 │urgptr │ Urgent pointer │ integer (16 bit) │
2806 └─────────┴──────────────────┴──────────────────┘
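
Using tcp header expressions. A minimal sketch, assuming a table named
filter with an input base chain (illustrative names):

# accept HTTP and HTTPS
filter input tcp dport { 80, 443 } accept

# count segments to the SSH port that have the SYN flag set
filter input tcp dport 22 tcp flags syn counter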
2807
2808 UDP HEADER EXPRESSION
2809 udp {sport | dport | length | checksum}
2810
2811 Table 46. UDP header expression
2812 ┌─────────┬─────────────────────┬──────────────────┐
2813 │Keyword │ Description │ Type │
2814 ├─────────┼─────────────────────┼──────────────────┤
2815 │ │ │ │
2816 │sport │ Source port │ inet_service │
2817 ├─────────┼─────────────────────┼──────────────────┤
2818 │ │ │ │
2819 │dport │ Destination port │ inet_service │
2820 ├─────────┼─────────────────────┼──────────────────┤
2821 │ │ │ │
2822 │length │ Total packet length │ integer (16 bit) │
2823 ├─────────┼─────────────────────┼──────────────────┤
2824 │ │ │ │
2825 │checksum │ Checksum │ integer (16 bit) │
2826 └─────────┴─────────────────────┴──────────────────┘
2827
2828 UDP-LITE HEADER EXPRESSION
2829 udplite {sport | dport | checksum}
2830
2831 Table 47. UDP-Lite header expression
2832 ┌─────────┬──────────────────┬──────────────────┐
2833 │Keyword │ Description │ Type │
2834 ├─────────┼──────────────────┼──────────────────┤
2835 │ │ │ │
2836 │sport │ Source port │ inet_service │
2837 ├─────────┼──────────────────┼──────────────────┤
2838 │ │ │ │
2839 │dport │ Destination port │ inet_service │
2840 ├─────────┼──────────────────┼──────────────────┤
2841 │ │ │ │
2842 │checksum │ Checksum │ integer (16 bit) │
2843 └─────────┴──────────────────┴──────────────────┘
2844
2845 SCTP HEADER EXPRESSION
2846 sctp {sport | dport | vtag | checksum}
2847 sctp chunk CHUNK [ FIELD ]
2848
2849 CHUNK := data | init | init-ack | sack | heartbeat |
2850 heartbeat-ack | abort | shutdown | shutdown-ack | error |
2851 cookie-echo | cookie-ack | ecne | cwr | shutdown-complete
2852 | asconf-ack | forward-tsn | asconf
2853
2854 FIELD := COMMON_FIELD | DATA_FIELD | INIT_FIELD | INIT_ACK_FIELD |
2855 SACK_FIELD | SHUTDOWN_FIELD | ECNE_FIELD | CWR_FIELD |
2856 ASCONF_ACK_FIELD | FORWARD_TSN_FIELD | ASCONF_FIELD
2857
2858 COMMON_FIELD := type | flags | length
2859 DATA_FIELD := tsn | stream | ssn | ppid
2860 INIT_FIELD := init-tag | a-rwnd | num-outbound-streams |
2861 num-inbound-streams | initial-tsn
2862 INIT_ACK_FIELD := INIT_FIELD
2863 SACK_FIELD := cum-tsn-ack | a-rwnd | num-gap-ack-blocks |
2864 num-dup-tsns
2865 SHUTDOWN_FIELD := cum-tsn-ack
2866 ECNE_FIELD := lowest-tsn
2867 CWR_FIELD := lowest-tsn
2868 ASCONF_ACK_FIELD := seqno
2869 FORWARD_TSN_FIELD := new-cum-tsn
2870 ASCONF_FIELD := seqno
2871
2872 Table 48. SCTP header expression
2873 ┌─────────┬──────────────────┬────────────────────┐
2874 │Keyword │ Description │ Type │
2875 ├─────────┼──────────────────┼────────────────────┤
2876 │ │ │ │
2877 │sport │ Source port │ inet_service │
2878 ├─────────┼──────────────────┼────────────────────┤
2879 │ │ │ │
2880 │dport │ Destination port │ inet_service │
2881 ├─────────┼──────────────────┼────────────────────┤
2882 │ │ │ │
2883 │vtag │ Verification Tag │ integer (32 bit) │
2884 ├─────────┼──────────────────┼────────────────────┤
2885 │ │ │ │
2886 │checksum │ Checksum │ integer (32 bit) │
2887 ├─────────┼──────────────────┼────────────────────┤
2888 │ │ │ │
2889 │chunk │ Search chunk in │ without FIELD, │
2890 │ │ packet │ boolean indicating │
2891 │ │ │ existence │
2892 └─────────┴──────────────────┴────────────────────┘
2893
2894 Table 49. SCTP chunk fields
2895 ┌─────────────────────┬───────────────┬─────────────────┬──────────────────┐
2896 │Name │ Width in bits │ Chunk │ Notes │
2897 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2898 │ │ │ │ │
2899 │type │ 8 │ all │ not useful, │
2900 │ │ │ │ defined by chunk │
2901 │ │ │ │ type │
2902 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2903 │ │ │ │ │
2904 │flags │ 8 │ all │ semantics │
2905 │ │ │ │ defined on │
2906 │ │ │ │ per-chunk basis │
2907 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2908 │ │ │ │ │
2909 │length │ 16 │ all │ length of this │
2910 │ │ │ │ chunk in bytes │
2911 │ │ │ │ excluding │
2912 │ │ │ │ padding │
2913 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2914 │ │ │ │ │
2915 │tsn │ 32 │ data │ transmission │
2916 │ │ │ │ sequence number │
2917 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2918 │ │ │ │ │
2919 │stream │ 16 │ data │ stream │
2920 │ │ │ │ identifier │
2921 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2922 │ │ │ │ │
2923 │ssn │ 16 │ data │ stream sequence │
2924 │ │ │ │ number │
2925 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2926 │ │ │ │ │
2927 │ppid │ 32 │ data │ payload protocol │
2928 │ │ │ │ identifier │
2929 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2930 │ │ │ │ │
2931 │init-tag │ 32 │ init, init-ack │ initiate tag │
2932 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2933 │ │ │ │ │
2934 │a-rwnd │ 32 │ init, init-ack, │ advertised │
2935 │ │ │ sack │ receiver window │
2936 │ │ │ │ credit │
2937 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2938 │ │ │ │ │
2939 │num-outbound-streams │ 16 │ init, init-ack │ number of │
2940 │ │ │ │ outbound streams │
2941 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2942 │ │ │ │ │
2943 │num-inbound-streams │ 16 │ init, init-ack │ number of │
2944 │ │ │ │ inbound streams │
2945 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2946 │ │ │ │ │
2947 │initial-tsn │ 32 │ init, init-ack │ initial transmit │
2948 │ │ │ │ sequence number │
2949 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2950 │ │ │ │ │
2951 │cum-tsn-ack │ 32 │ sack, shutdown │ cumulative │
2952 │ │ │ │ transmission │
2953 │ │ │ │ sequence number │
2954 │ │ │ │ acknowledged │
2955 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2956 │ │ │ │ │
2957 │num-gap-ack-blocks │ 16 │ sack │ number of Gap │
2958 │ │ │ │ Ack Blocks │
2959 │ │ │ │ included │
2960 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2961 │ │ │ │ │
2962 │num-dup-tsns │ 16 │ sack │ number of │
2963 │ │ │ │ duplicate │
2964 │ │ │ │ transmission │
2965 │ │ │ │ sequence numbers │
2966 │ │ │ │ received │
2967 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2968 │ │ │ │ │
2969 │lowest-tsn │ 32 │ ecne, cwr │ lowest │
2970 │ │ │ │ transmission │
2971 │ │ │ │ sequence number │
2972 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2973 │ │ │ │ │
2974 │seqno │ 32 │ asconf-ack, │ sequence number │
2975 │ │ │ asconf │ │
2976 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2977 │ │ │ │ │
2978 │new-cum-tsn │ 32 │ forward-tsn │ new cumulative │
2979 │ │ │ │ transmission │
2980 │ │ │ │ sequence number │
2981 └─────────────────────┴───────────────┴─────────────────┴──────────────────┘
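
Using sctp chunk expressions. A minimal sketch, assuming a table named
filter with an input base chain (illustrative names); the field value is
arbitrary:

# match packets that contain an INIT chunk (existence check, no FIELD)
filter input sctp chunk init exists counter

# match DATA chunks sent on stream 0
filter input sctp chunk data stream 0 counter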
2982
2983 DCCP HEADER EXPRESSION
2984 dccp {sport | dport | type}
2985
2986 Table 50. DCCP header expression
2987 ┌────────┬──────────────────┬──────────────┐
2988 │Keyword │ Description │ Type │
2989 ├────────┼──────────────────┼──────────────┤
2990 │ │ │ │
2991 │sport │ Source port │ inet_service │
2992 ├────────┼──────────────────┼──────────────┤
2993 │ │ │ │
2994 │dport │ Destination port │ inet_service │
2995 ├────────┼──────────────────┼──────────────┤
2996 │ │ │ │
2997 │type │ Packet type │ dccp_pkttype │
2998 └────────┴──────────────────┴──────────────┘
2999
3000 AUTHENTICATION HEADER EXPRESSION
3001 ah {nexthdr | hdrlength | reserved | spi | sequence}
3002
3003 Table 51. AH header expression
3004 ┌──────────┬────────────────────┬──────────────────┐
3005 │Keyword │ Description │ Type │
3006 ├──────────┼────────────────────┼──────────────────┤
3007 │ │ │ │
3008 │nexthdr │ Next header │ inet_proto │
3009 │ │ protocol │ │
3010 ├──────────┼────────────────────┼──────────────────┤
3011 │ │ │ │
3012 │hdrlength │ AH Header length │ integer (8 bit) │
3013 ├──────────┼────────────────────┼──────────────────┤
3014 │ │ │ │
3015 │reserved │ Reserved area │ integer (16 bit) │
3016 ├──────────┼────────────────────┼──────────────────┤
3017 │ │ │ │
3018 │spi │ Security Parameter │ integer (32 bit) │
3019 │ │ Index │ │
3020 ├──────────┼────────────────────┼──────────────────┤
3021 │ │ │ │
3022 │sequence │ Sequence number │ integer (32 bit) │
3023 └──────────┴────────────────────┴──────────────────┘
3024
3025 ENCAPSULATING SECURITY PAYLOAD HEADER EXPRESSION
3026 esp {spi | sequence}
3027
3028 Table 52. ESP header expression
3029 ┌─────────┬────────────────────┬──────────────────┐
3030 │Keyword │ Description │ Type │
3031 ├─────────┼────────────────────┼──────────────────┤
3032 │ │ │ │
3033 │spi │ Security Parameter │ integer (32 bit) │
3034 │ │ Index │ │
3035 ├─────────┼────────────────────┼──────────────────┤
3036 │ │ │ │
3037 │sequence │ Sequence number │ integer (32 bit) │
3038 └─────────┴────────────────────┴──────────────────┘
3039
3040 IPCOMP HEADER EXPRESSION
3041 comp {nexthdr | flags | cpi}
3042
3043 Table 53. IPComp header expression
3044 ┌────────┬─────────────────┬──────────────────┐
3045 │Keyword │ Description │ Type │
3046 ├────────┼─────────────────┼──────────────────┤
3047 │ │ │ │
3048 │nexthdr │ Next header │ inet_proto │
3049 │ │ protocol │ │
3050 ├────────┼─────────────────┼──────────────────┤
3051 │ │ │ │
3052 │flags │ Flags │ bitmask │
3053 ├────────┼─────────────────┼──────────────────┤
3054 │ │ │ │
3055 │cpi │ Compression │ integer (16 bit) │
3056 │ │ Parameter Index │ │
3057 └────────┴─────────────────┴──────────────────┘
3058
3059 RAW PAYLOAD EXPRESSION
3060 @base,offset,length
3061
3062 The raw payload expression loads length bits starting at offset bits
3063 from the start of the header selected by base. Bit 0 is the very first
3064 bit; in C terms this is the topmost bit, i.e. 0x80 in an octet. Raw
3065 payload expressions are useful for matching headers that do not have a
3066 human-readable template expression yet. Note that nft does not add
3067 dependencies for raw payload expressions. To match, for example, fields
3068 of a transport header with protocol number 5, you must manually exclude
3069 packets that have a different transport header, for instance by using
3070 meta l4proto 5 before the raw expression.
3071
3072 Table 54. Supported payload protocol bases
3073 ┌─────┬─────────────────────────┐
3074 │Base │ Description │
3075 ├─────┼─────────────────────────┤
3076 │ │ │
3077 │ll │ Link layer, for example │
3078 │ │ the Ethernet header │
3079 ├─────┼─────────────────────────┤
3080 │ │ │
3081 │nh │ Network header, for │
3082 │ │ example IPv4 or IPv6 │
3083 ├─────┼─────────────────────────┤
3084 │ │ │
3085 │th │ Transport Header, for │
3086 │ │ example TCP │
3087 └─────┴─────────────────────────┘
3088
3089 Matching destination port of both UDP and TCP.
3090
3091 inet filter input meta l4proto {tcp, udp} @th,16,16 { 53, 80 }
3092
3093 The above can also be written as
3094
3095 inet filter input meta l4proto {tcp, udp} th dport { 53, 80 }
3096
3097 This is more convenient, but as with the raw expression notation, no
3098 dependencies are created or checked. It is the user's responsibility to
3099 restrict matching to those header types that have a notion of ports.
3100 Otherwise, rules using raw expressions will erroneously match unrelated
3101 packets, e.g. misinterpreting the SPI field of ESP packets as a port.
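
Offsets and lengths are counted in bits, so a byte offset has to be
multiplied by 8. As a worked example, the TTL field is the ninth octet
of the IPv4 header (byte offset 8), which gives bit offset 8 * 8 = 64
and length 8. The rule below is a sketch that assumes an inet table
named filter with an input base chain (illustrative names):

# same match as ip ttl 1, written as a raw payload expression;
# meta protocol ip restricts the rule to IPv4 because nft adds no
# dependency for raw expressions
inet filter input meta protocol ip @nh,64,8 1 counter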
3102
3103 Rewrite arp packet target hardware address if target protocol address
3104 matches a given address.
3105
3106 input meta iifname enp2s0 arp ptype 0x0800 arp htype 1 arp hlen 6 arp plen 4 @nh,192,32 0xc0a88f10 @nh,144,48 set 0x112233445566 accept
3107
3108
3109 EXTENSION HEADER EXPRESSIONS
3110 Extension header expressions refer to data from variable-sized protocol
3111 headers, such as IPv6 extension headers, TCP options and IPv4 options.
3112
3113 nftables currently supports matching (finding) a given IPv6 extension
3114 header, TCP option or IPv4 option.
3115
3116 hbh {nexthdr | hdrlength}
3117 frag {nexthdr | frag-off | more-fragments | id}
3118 rt {nexthdr | hdrlength | type | seg-left}
3119 dst {nexthdr | hdrlength}
3120 mh {nexthdr | hdrlength | checksum | type}
3121 srh {flags | tag | sid | seg-left}
3122 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp} tcp_option_field
3123 ip option { lsrr | ra | rr | ssrr } ip_option_field
3124
3125 The following syntaxes are valid only in a relational expression with a
3126 boolean type on the right-hand side, for checking header existence only:
3127
3128 exthdr {hbh | frag | rt | dst | mh}
3129 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp}
3130 ip option { lsrr | ra | rr | ssrr }
3131
3132 Table 55. IPv6 extension headers
3133 ┌────────┬────────────────────────┐
3134 │Keyword │ Description │
3135 ├────────┼────────────────────────┤
3136 │ │ │
3137 │hbh │ Hop by Hop │
3138 ├────────┼────────────────────────┤
3139 │ │ │
3140 │rt │ Routing Header │
3141 ├────────┼────────────────────────┤
3142 │ │ │
3143 │frag │ Fragmentation header │
3144 ├────────┼────────────────────────┤
3145 │ │ │
3146 │dst │ dst options │
3147 ├────────┼────────────────────────┤
3148 │ │ │
3149 │mh │ Mobility Header │
3150 ├────────┼────────────────────────┤
3151 │ │ │
3152 │srh │ Segment Routing Header │
3153 └────────┴────────────────────────┘
3154
3155 Table 56. TCP Options
3156 ┌──────────┬─────────────────────┬─────────────────────┐
3157 │Keyword │ Description │ TCP option fields │
3158 ├──────────┼─────────────────────┼─────────────────────┤
3159 │ │ │ │
3160 │eol │ End of option list │ - │
3161 ├──────────┼─────────────────────┼─────────────────────┤
3162 │ │ │ │
3163 │nop │ 1 Byte TCP Nop │ - │
3164 │ │ padding option │ │
3165 ├──────────┼─────────────────────┼─────────────────────┤
3166 │ │ │ │
3167 │maxseg │ TCP Maximum Segment │ length, size │
3168 │ │ Size │ │
3169 ├──────────┼─────────────────────┼─────────────────────┤
3170 │ │ │ │
3171 │window │ TCP Window Scaling │ length, count │
3172 ├──────────┼─────────────────────┼─────────────────────┤
3173 │ │ │ │
3174 │sack-perm │ TCP SACK permitted │ length │
3175 ├──────────┼─────────────────────┼─────────────────────┤
3176 │ │ │ │
3177 │sack │ TCP Selective │ length, left, right │
3178 │ │ Acknowledgement │ │
3179 │ │ (alias of block 0) │ │
3180 ├──────────┼─────────────────────┼─────────────────────┤
3181 │ │ │ │
3182 │sack0 │ TCP Selective │ length, left, right │
3183 │ │ Acknowledgement │ │
3184 │ │ (block 0) │ │
3185 ├──────────┼─────────────────────┼─────────────────────┤
3186 │ │ │ │
3187 │sack1 │ TCP Selective │ length, left, right │
3188 │ │ Acknowledgement │ │
3189 │ │ (block 1) │ │
3190 ├──────────┼─────────────────────┼─────────────────────┤
3191 │ │ │ │
3192 │sack2 │ TCP Selective │ length, left, right │
3193 │ │ Acknowledgement │ │
3194 │ │ (block 2) │ │
3195 ├──────────┼─────────────────────┼─────────────────────┤
3196 │ │ │ │
3197 │sack3 │ TCP Selective │ length, left, right │
3198 │ │ Acknowledgement │ │
3199 │ │ (block 3) │ │
3200 ├──────────┼─────────────────────┼─────────────────────┤
3201 │ │ │ │
3202 │timestamp │ TCP Timestamps │ length, tsval, │
3203 │ │ │ tsecr │
3204 └──────────┴─────────────────────┴─────────────────────┘
3205
3206 TCP option matching also supports raw expression syntax to access
3207 arbitrary options:
3208
3211 tcp option @number,offset,length
3212
3213 Table 57. IP Options
3214 ┌────────┬─────────────────────┬─────────────────────┐
3215 │Keyword │ Description │ IP option fields │
3216 ├────────┼─────────────────────┼─────────────────────┤
3217 │ │ │ │
3218 │lsrr │ Loose Source Route │ type, length, ptr, │
3219 │ │ │ addr │
3220 ├────────┼─────────────────────┼─────────────────────┤
3221 │ │ │ │
3222 │ra │ Router Alert │ type, length, value │
3223 ├────────┼─────────────────────┼─────────────────────┤
3224 │ │ │ │
3225 │rr │ Record Route │ type, length, ptr, │
3226 │ │ │ addr │
3227 ├────────┼─────────────────────┼─────────────────────┤
3228 │ │ │ │
3229 │ssrr │ Strict Source Route │ type, length, ptr, │
3230 │ │ │ addr │
3231 └────────┴─────────────────────┴─────────────────────┘
3232
3233 Finding TCP options.
3234
3235 filter input tcp option sack-perm exists counter
3236
3237 Matching TCP options.
3238
3239 filter input tcp option maxseg size lt 536
3240
3241 Matching IPv6 exthdr.
3242
3243 ip6 filter input frag more-fragments 1 counter
3244
3245 Finding IP option.
3246
3247 filter input ip option lsrr exists counter
3248
3249
3250 CONNTRACK EXPRESSIONS
3251 Conntrack expressions refer to meta data of the connection tracking
3252 entry associated with a packet.
3253
3254 There are three types of conntrack expressions. Some conntrack
3255 expressions require the flow direction before the conntrack key, others
3256 must be used directly because they are direction agnostic. The packets,
3257 bytes and avgpkt keywords can be used with or without a direction. If
3258 the direction is omitted, the sum of the original and the reply
3259 direction is returned. The same is true for the zone: if a direction is
3260 given, the zone is only matched if the zone id is tied to the given
3261 direction.
3262
3263 ct {state | direction | status | mark | expiration | helper | label | count | id}
3264 ct [original | reply] {l3proto | protocol | bytes | packets | avgpkt | zone}
3265 ct {original | reply} {proto-src | proto-dst}
3266 ct {original | reply} {ip | ip6} {saddr | daddr}
3267
3268 The conntrack-specific types in this table are described in the
3269 sub-section CONNTRACK TYPES above.
3270
3271 Table 58. Conntrack expressions
3272 ┌───────────┬─────────────────────┬─────────────────────┐
3273 │Keyword │ Description │ Type │
3274 ├───────────┼─────────────────────┼─────────────────────┤
3275 │ │ │ │
3276 │state │ State of the │ ct_state │
3277 │ │ connection │ │
3278 ├───────────┼─────────────────────┼─────────────────────┤
3279 │ │ │ │
3280 │direction │ Direction of the │ ct_dir │
3281 │ │ packet relative to │ │
3282 │ │ the connection │ │
3283 ├───────────┼─────────────────────┼─────────────────────┤
3284 │ │ │ │
3285 │status │ Status of the │ ct_status │
3286 │ │ connection │ │
3287 ├───────────┼─────────────────────┼─────────────────────┤
3288 │ │ │ │
3289 │mark │ Connection mark │ mark │
3290 ├───────────┼─────────────────────┼─────────────────────┤
3291 │ │ │ │
3292 │expiration │ Connection │ time │
3293 │ │ expiration time │ │
3294 ├───────────┼─────────────────────┼─────────────────────┤
3295 │ │ │ │
3296 │helper │ Helper associated │ string │
3297 │ │ with the connection │ │
3298 ├───────────┼─────────────────────┼─────────────────────┤
3299 │ │ │ │
3300 │label │ Connection tracking │ ct_label │
3301 │ │ label bit or │ │
3302 │ │ symbolic name │ │
3303 │ │ defined in │ │
3304 │ │ connlabel.conf in │ │
3305 │ │ the nftables │ │
3306 │ │ include path │ │
3307 ├───────────┼─────────────────────┼─────────────────────┤
3308 │ │ │ │
3309 │l3proto │ Layer 3 protocol of │ nf_proto │
3310 │ │ the connection │ │
3311 ├───────────┼─────────────────────┼─────────────────────┤
3312 │ │ │ │
3313 │saddr │ Source address of │ ipv4_addr/ipv6_addr │
3314 │ │ the connection for │ │
3315 │ │ the given direction │ │
3316 ├───────────┼─────────────────────┼─────────────────────┤
3317 │ │ │ │
3318 │daddr │ Destination address │ ipv4_addr/ipv6_addr │
3319 │ │ of the connection │ │
3320 │ │ for the given │ │
3321 │ │ direction │ │
3322 ├───────────┼─────────────────────┼─────────────────────┤
3323 │ │ │ │
3324 │protocol │ Layer 4 protocol of │ inet_proto │
3325 │ │ the connection for │ │
3326 │ │ the given direction │ │
3327 ├───────────┼─────────────────────┼─────────────────────┤
3328 │ │ │ │
3329 │proto-src │ Layer 4 protocol │ integer (16 bit) │
3330 │ │ source for the │ │
3331 │ │ given direction │ │
3332 ├───────────┼─────────────────────┼─────────────────────┤
3333 │ │ │ │
3334 │proto-dst │ Layer 4 protocol │ integer (16 bit) │
3335 │ │ destination for the │ │
3336 │ │ given direction │ │
3337 ├───────────┼─────────────────────┼─────────────────────┤
3338 │ │ │ │
3339 │packets │ packet count seen │ integer (64 bit) │
3340 │ │ in the given │ │
3341 │ │ direction or sum of │ │
3342 │ │ original and reply │ │
3343 ├───────────┼─────────────────────┼─────────────────────┤
3344 │ │ │ │
3345 │bytes │ byte count seen, │ integer (64 bit) │
3346 │ │ see description for │ │
3347 │ │ packets keyword │ │
3348 ├───────────┼─────────────────────┼─────────────────────┤
3349 │ │ │ │
3350 │avgpkt │ average bytes per │ integer (64 bit) │
3351 │ │ packet, see │ │
3352 │ │ description for │ │
3353 │ │ packets keyword │ │
3354 ├───────────┼─────────────────────┼─────────────────────┤
3355 │ │ │ │
3356 │zone │ conntrack zone │ integer (16 bit) │
3357 ├───────────┼─────────────────────┼─────────────────────┤
3358 │ │ │ │
3359 │count │ number of current │ integer (32 bit) │
3360 │ │ connections │ │
3361 ├───────────┼─────────────────────┼─────────────────────┤
3362 │ │ │ │
3363 │id │ Connection id │ ct_id │
3364 └───────────┴─────────────────────┴─────────────────────┘
3365
3366 Restrict the number of parallel connections to a server.
3367
3368 nft add set filter ssh_flood '{ type ipv4_addr; flags dynamic; }'
3369 nft add rule filter input tcp dport 22 add @ssh_flood '{ ip saddr ct count over 2 }' reject
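
Matching on connection state and per-direction counters. A minimal
sketch, assuming a table named filter with input and forward base
chains (illustrative names); the byte threshold is arbitrary. The bytes
and packets keys only return useful values if conntrack accounting is
enabled on the system.

# accept packets of established connections and related traffic
filter input ct state established,related accept

# count packets of connections whose originator sent more than 1 MiB
filter forward ct original bytes gt 1048576 counter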
3370
3371
3373 Statements represent actions to be performed. They can alter control
3374 flow (return, jump to a different chain, accept or drop the packet) or
3375 can perform actions, such as logging, rejecting a packet, etc.
3376
3377 There are two kinds of statements. Terminal statements unconditionally
3378 terminate evaluation of the current rule. Non-terminal statements either
3379 terminate evaluation of the current rule only conditionally or never
3380 terminate it; in other words, they are passive from the ruleset
3381 evaluation perspective. A rule can contain an arbitrary number of
3382 non-terminal statements, but only a single terminal statement, which
3383 must be the final statement.
3384
3385 VERDICT STATEMENT
3386 The verdict statement alters control flow in the ruleset and issues
3387 policy decisions for packets.
3388
3389 {accept | drop | queue | continue | return}
3390 {jump | goto} chain
3391
3392 accept and drop are absolute verdicts — they terminate ruleset
3393 evaluation immediately.
3394
3395
3396 accept Terminate ruleset
3397 evaluation and accept the
3398 packet. The packet can
3399 still be dropped later by
3400 another hook. For
3401 instance, after accept in
3402 the forward hook, the
3403 packet can still be
3404 dropped in the postrouting
3405 hook or in another forward
3406 base chain that has a
3407 higher priority number and
3408 is evaluated afterwards in
3409 the processing pipeline.
3410
3411 drop Terminate ruleset
3412 evaluation and drop the
3413 packet. The drop occurs
3414 instantly, no further
3415 chains or hooks are
3416 evaluated. It is not
3417 possible to accept the
3418 packet in a later chain
3419 again, as those are not
3420 evaluated anymore for the
3421 packet.
3422
3423 queue Terminate ruleset
3424 evaluation and queue the
3425 packet to userspace.
3426 Userspace must provide a
3427 drop or accept verdict. In
3428 case of accept, processing
3429 resumes with the next base
3430 chain hook, not the rule
3431 following the queue
3432 verdict.
3433
3434 continue Continue ruleset
3435 evaluation with the next
3436 rule. This is the default
3437 behaviour in case a rule
3438 issues no verdict.
3439
3440 return Return from the current
3441 chain and continue
3442 evaluation at the next
3443 rule in the last chain. If
3444 issued in a base chain, it
3445 is equivalent to the base
3446 chain policy.
3447
3448 jump chain Continue evaluation at the
3449 first rule in chain. The
3450 current position in the
3451 ruleset is pushed to a
3452 call stack and evaluation
3453 will continue there when
3454 the new chain is entirely
3455 evaluated or a return
3456 verdict is issued. In case
3457 an absolute verdict is
3458 issued by a rule in the
3459 chain, ruleset evaluation
3460 terminates immediately and
3461 the specific action is
3462 taken.
3463
3464 goto chain Similar to jump, but the
3465 current position is not
3466 pushed to the call stack.
3467 After the new chain,
3468 evaluation continues at
3469 the last chain on the call
3470 stack instead of the one
3471 containing the goto
3472 statement.
3473
3474
3475 Using verdict statements.
3476
3477 # process packets from eth0 and the internal network in from_lan
3478 # chain, drop all packets from eth0 with different source addresses.
3479
3480 filter input iif eth0 ip saddr 192.168.0.0/24 jump from_lan
3481 filter input iif eth0 drop
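
The difference between jump and goto can be seen in a complete ruleset.
This is a sketch only: the table layout, the accept policy and the rule
inside from_lan are assumptions made for illustration, and eth0 is taken
from the example above.

table ip filter {
        chain from_lan {
                tcp dport 22 accept
                # no verdict for other packets: reaching the end of the
                # chain behaves like return
        }

        chain input {
                type filter hook input priority 0; policy accept;
                iif "eth0" ip saddr 192.168.0.0/24 jump from_lan
                iif "eth0" drop
        }
}

With jump, a packet from 192.168.0.0/24 that is not accepted in from_lan
returns to input and is dropped by the second rule. With goto from_lan
in place of jump, nothing is pushed to the call stack, so the same
packet would be handled by the policy of the input chain instead.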
3482
3483
3484 PAYLOAD STATEMENT
3485 payload_expression set value
3486
3487 The payload statement alters packet content. It can be used, for
3488 example, to set the IP DSCP (diffserv) header field or IPv6 flow labels.
3489
3490 Route some packets instead of bridging.
3491
3492 # redirect tcp:http from 192.168.0.0/16 to local machine for routing instead of bridging
3493 # assumes 00:11:22:33:44:55 is local MAC address.
3494 bridge input meta iif eth0 ip saddr 192.168.0.0/16 tcp dport 80 meta pkttype set unicast ether daddr set 00:11:22:33:44:55
3495
3496 Set IPv4 DSCP header field.
3497
3498 ip forward ip dscp set 42
3499
3500
3501 EXTENSION HEADER STATEMENT
3502 extension_header_expression set value
3503
3504 The extension header statement alters packet content in variable-sized
3505 headers. This can currently be used to alter the TCP Maximum segment
3506 size of packets, similar to the TCPMSS target in iptables.
3507
3508 Change TCP MSS.
3509
3510 tcp flags syn tcp option maxseg size set 1360
3511 # set a size based on route information:
3512 tcp flags syn tcp option maxseg size set rt mtu
3513
3514 You can also remove TCP options with the reset keyword.
3515
3516 Remove TCP option.
3517
3518 tcp flags syn reset tcp option sack-perm
3519
3520
3521 LOG STATEMENT
3522 log [prefix quoted_string] [level syslog-level] [flags log-flags]
3523 log group nflog_group [prefix quoted_string] [queue-threshold value] [snaplen size]
3524 log level audit
3525
3526 The log statement enables logging of matching packets. When this
3527 statement is used from a rule, the Linux kernel will print some
3528 information on all matching packets, such as header fields, via the
3529 kernel log (where it can be read with dmesg(1) or read in the syslog).
3530
3531 In the second form of invocation (if nflog_group is specified), the
3532 Linux kernel will pass the packet to nfnetlink_log which will send the
3533 log through a netlink socket to the specified group. One userspace
3534 process may subscribe to the group to receive the logs. See ulogd(8)
3535 for the Netfilter userspace log daemon, and the libnetfilter_log
3536 documentation for details in case you would like to develop a custom
3537 program to digest your logs.
3538
3539 In the third form of invocation (if level audit is specified), the
3540 Linux kernel writes a message into the audit buffer suitably formatted
3541 for reading with auditd. Therefore no further formatting options (such
3542 as prefix or flags) are allowed in this mode.
3543
3544 This is a non-terminating statement, so the rule evaluation continues
3545 after the packet is logged.
3546
3547 Table 59. log statement options
3548 ┌────────────────┬─────────────────────┬───────────────────┐
3549 │Keyword │ Description │ Type │
3550 ├────────────────┼─────────────────────┼───────────────────┤
3551 │ │ │ │
3552 │prefix │ Log message prefix │ quoted string │
3553 ├────────────────┼─────────────────────┼───────────────────┤
3554 │ │ │ │
3555 │level │ Syslog level of │ string: emerg, │
3556 │ │ logging │ alert, crit, err, │
3557 │ │ │ warn [default], │
3558 │ │ │ notice, info, │
3559 │ │ │ debug, audit │
3560 ├────────────────┼─────────────────────┼───────────────────┤
3561 │ │ │ │
3562 │group │ NFLOG group to send │ unsigned integer │
3563 │ │ messages to │ (16 bit) │
3564 ├────────────────┼─────────────────────┼───────────────────┤
3565 │ │ │ │
3566 │snaplen │ Length of packet │ unsigned integer │
3567 │ │ payload to include │ (32 bit) │
3568 │ │ in netlink message │ │
3569 ├────────────────┼─────────────────────┼───────────────────┤
3570 │ │ │ │
3571 │queue-threshold │ Number of packets │ unsigned integer │
3572 │ │ to queue inside the │ (32 bit) │
3573 │ │ kernel before │ │
3574 │ │ sending them to │ │
3575 │ │ userspace │ │
3576 └────────────────┴─────────────────────┴───────────────────┘
3577
3578 Table 60. log-flags
3579 ┌─────────────┬───────────────────────────┐
3580 │Flag │ Description │
3581 ├─────────────┼───────────────────────────┤
3582 │ │ │
3583 │tcp sequence │ Log TCP sequence numbers. │
3584 ├─────────────┼───────────────────────────┤
3585 │ │ │
3586 │tcp options │ Log options from the TCP │
3587 │ │ packet header. │
3588 ├─────────────┼───────────────────────────┤
3589 │ │ │
3590 │ip options │ Log options from the │
3591 │ │ IP/IPv6 packet header. │
3592 ├─────────────┼───────────────────────────┤
3593 │ │ │
3594 │skuid │ Log the userid of the │
3595 │ │ process which generated │
3596 │ │ the packet. │
3597 ├─────────────┼───────────────────────────┤
3598 │ │ │
3599 │ether │ Decode MAC addresses and │
3600 │ │ protocol. │
3601 ├─────────────┼───────────────────────────┤
3602 │ │ │
3603 │all │ Enable all log flags │
3604 │ │ listed above. │
3605 └─────────────┴───────────────────────────┘
3606
3607 Using log statement.
3608
3609 # log the UID which generated the packet and ip options
3610 ip filter output log flags skuid flags ip options
3611
3612 # log the tcp sequence numbers and tcp options from the TCP packet
3613 ip filter output log flags tcp sequence,options
3614
3615 # enable all supported log flags
3616 ip6 filter output log flags all
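
As a further sketch, the options from the log statement tables can be combined in a ruleset file loaded with nft -f. The table and chain names, the ports, the prefix string and the NFLOG group number below are illustrative assumptions, not values required by nft.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # syslog-style logging: prefix, level and all log flags
        tcp dport 22 ct state new log prefix "ssh-new: " level info flags all

        # NFLOG-style logging: send to group 2, include at most 128 bytes
        # of payload, and queue up to 10 packets inside the kernel before
        # sending them to userspace
        udp dport 53 log group 2 snaplen 128 queue-threshold 10
    }
}
```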
3617
3618
3619 REJECT STATEMENT
3620 reject [ with REJECT_WITH ]
3621
3622 REJECT_WITH := icmp icmp_code |
3623 icmpv6 icmpv6_code |
3624 icmpx icmpx_code |
3625 tcp reset
3626
3627 A reject statement sends back an error packet in response to the
3628 matched packet. Otherwise it is equivalent to drop: it is a
3629 terminating statement that ends rule traversal. This statement is only
3630 valid in base chains using the input, forward or output hooks, and in
3631 user-defined chains which are only called from those chains.
3632
3633 Table 61. Different ICMP reject variants are meant for use in different
3634 table families
3635 ┌────────┬────────┬─────────────┐
3636 │Variant │ Family │ Type │
3637 ├────────┼────────┼─────────────┤
3638 │ │ │ │
3639 │icmp │ ip │ icmp_code │
3640 ├────────┼────────┼─────────────┤
3641 │ │ │ │
3642 │icmpv6 │ ip6 │ icmpv6_code │
3643 ├────────┼────────┼─────────────┤
3644 │ │ │ │
3645 │icmpx │ inet │ icmpx_code │
3646 └────────┴────────┴─────────────┘
3647
3648 For a description of the different types and a list of supported
3649 keywords, refer to the DATA TYPES section above. The common default
3650 reject value is port-unreachable.
3651
3652 Note that in the bridge family, the reject statement is only allowed in
3653 base chains which hook into input or prerouting.
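
A minimal sketch of the reject variants in an inet table follows. The table name, chain name and ports are assumptions chosen for illustration. Some older nft releases expect the keyword type before the code (for example icmpx type admin-prohibited), so adjust the spelling to the installed version.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # answer TCP connection attempts to telnet with a TCP reset
        tcp dport 23 reject with tcp reset

        # icmpx is the variant meant for the inet family
        udp dport 161 reject with icmpx admin-prohibited

        # no "with" clause: the default reject value is used
        udp dport 69 reject
    }
}
```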
3654
3655 COUNTER STATEMENT
3656 A counter statement sets the hit count of packets along with the number
3657 of bytes.
3658
3659 counter packets number bytes number
3660 counter { packets number | bytes number }
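
The two forms might be used as in the sketch below. The table, chain and ports are hypothetical. A bare counter starts at zero, while the explicit form sets the initial values.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # count packets and bytes of HTTPS traffic, then accept it
        tcp dport 443 counter accept

        # same, with explicit initial packet and byte values
        tcp dport 22 counter packets 0 bytes 0 accept
    }
}
```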
3661
3662 CONNTRACK STATEMENT
3663 The conntrack statement can be used to set the conntrack mark and
3664 conntrack labels.
3665
3666 ct {mark | event | label | zone} set value
3667
3668 The ct statement sets meta data associated with a connection. The zone
3669 id has to be assigned before a conntrack lookup takes place, i.e. this
3670 has to be done in prerouting and possibly output (if locally generated
3671 packets need to be placed in a distinct zone), with a hook priority of
3672 raw (-300).
3673
3674 Unlike iptables, where the helper assignment happens in the raw table,
3675 the helper needs to be assigned after a conntrack entry has been found,
3676 i.e. it will not work when used with hook priorities equal to or lower
3677 than -200.
3678
3679 Table 62. Conntrack statement types
3680 ┌────────┬─────────────────────┬──────────────────┐
3681 │Keyword │ Description │ Value │
3682 ├────────┼─────────────────────┼──────────────────┤
3683 │ │ │ │
3684 │event │ conntrack event │ bitmask, integer │
3685 │ │ bits │ (32 bit) │
3686 ├────────┼─────────────────────┼──────────────────┤
3687 │ │ │ │
3688 │helper │ name of ct helper │ quoted string │
3689 │ │ object to assign to │ │
3690 │ │ the connection │ │
3691 ├────────┼─────────────────────┼──────────────────┤
3692 │ │ │ │
3693 │mark │ Connection tracking │ mark │
3694 │ │ mark │ │
3695 ├────────┼─────────────────────┼──────────────────┤
3696 │ │ │ │
3697 │label │ Connection tracking │ label │
3698 │ │ label │ │
3699 ├────────┼─────────────────────┼──────────────────┤
3700 │ │ │ │
3701 │zone │ conntrack zone │ integer (16 bit) │
3702 └────────┴─────────────────────┴──────────────────┘
3703
3704 Save packet nfmark in conntrack.
3705
3706 ct mark set meta mark
3707
3708 Set zone mapped via interface.
3709
3710 table inet raw {
3711 chain prerouting {
3712 type filter hook prerouting priority raw;
3713 ct zone set iif map { "eth1" : 1, "veth1" : 2 }
3714 }
3715 chain output {
3716 type filter hook output priority raw;
3717 ct zone set oif map { "eth1" : 1, "veth1" : 2 }
3718 }
3719 }
3720
3721 Restrict events reported by ctnetlink.
3722
3723 ct event set new,related,destroy
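
Assigning a helper, as listed in the conntrack statement types table, could look like the following sketch. The helper object name ftp-standard, the table name and the port are assumptions. The chain uses priority filter so that the assignment runs after the conntrack lookup.

```nft
table inet filter {
    # helper object; the name "ftp-standard" is an arbitrary choice
    ct helper ftp-standard {
        type "ftp" protocol tcp
    }

    chain prerouting {
        # priority filter (0) runs after the conntrack lookup at -200
        type filter hook prerouting priority filter; policy accept;

        # attach the helper object to FTP control connections
        tcp dport 21 ct helper set "ftp-standard"
    }
}
```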
3724
3725
3726 NOTRACK STATEMENT
3727 The notrack statement disables connection tracking for certain
3728 packets.
3729
3730 notrack
3731
3732 Note that for this statement to be effective, it has to be applied to
3733 packets before a conntrack lookup happens. Therefore, it needs to sit
3734 in a chain with either prerouting or output hook and a hook priority of
3735 -300 (raw) or less.
3736
3737 See SYNPROXY STATEMENT for an example usage.
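
A short standalone sketch, assuming a hypothetical DNS server whose traffic should bypass connection tracking. The table name and ports are illustrative only.

```nft
table inet raw {
    chain prerouting {
        # raw priority (-300) runs before the conntrack lookup
        type filter hook prerouting priority raw; policy accept;
        udp dport 53 notrack
    }

    chain output {
        type filter hook output priority raw; policy accept;
        # replies generated locally are left untracked as well
        udp sport 53 notrack
    }
}
```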
3738
3739 META STATEMENT
3740 A meta statement sets the value of a meta expression. The existing meta
3741 fields are: priority, mark, pkttype, nftrace.
3742
3743 meta {mark | priority | pkttype | nftrace} set value
3744
3745 A meta statement sets meta data associated with a packet.
3746
3747 Table 63. Meta statement types
3748 ┌─────────┬─────────────────────┬───────────┐
3749 │Keyword │ Description │ Value │
3750 ├─────────┼─────────────────────┼───────────┤
3751 │ │ │ │
3752 │priority │ TC packet priority │ tc_handle │
3753 ├─────────┼─────────────────────┼───────────┤
3754 │ │ │ │
3755 │mark │ Packet mark │ mark │
3756 ├─────────┼─────────────────────┼───────────┤
3757 │ │ │ │
3758 │pkttype │ packet type │ pkt_type │
3759 ├─────────┼─────────────────────┼───────────┤
3760 │ │ │ │
3761 │nftrace │ ruleset packet │ 0, 1 │
3762 │ │ tracing on/off. Use │ │
3763 │ │ monitor trace │ │
3764 │ │ command to watch │ │
3765 │ │ traces │ │
3766 └─────────┴─────────────────────┴───────────┘
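
The sketch below sets two of the meta fields from the table above. The table name, port, mark value and source address are assumptions chosen for illustration.

```nft
table inet mangle {
    chain prerouting {
        type filter hook prerouting priority mangle; policy accept;

        # set the packet mark so that later rules can match on it
        tcp dport 22 meta mark set 0x1

        # enable tracing for packets from one host;
        # watch the result with "nft monitor trace"
        ip saddr 192.0.2.10 meta nftrace set 1
    }
}
```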
3767
3768 LIMIT STATEMENT
3769 limit rate [over] packet_number / TIME_UNIT [burst packet_number packets]
3770 limit rate [over] byte_number BYTE_UNIT / TIME_UNIT [burst byte_number BYTE_UNIT]
3771
3772 TIME_UNIT := second | minute | hour | day
3773 BYTE_UNIT := bytes | kbytes | mbytes
3774
3775 A limit statement matches at a limited rate using a token bucket
3776 filter. A rule using this statement will match until this limit is
3777 reached. It can be used in combination with the log statement to give
3778 limited logging. The optional over keyword makes it match over the
3779 specified rate. The default burst is 5. If you specify burst, it must be
3780 a non-zero value.
3781
3782 Table 64. limit statement values
3783 ┌──────────────┬───────────────────┬──────────────────┐
3784 │Value │ Description │ Type │
3785 ├──────────────┼───────────────────┼──────────────────┤
3786 │ │ │ │
3787 │packet_number │ Number of packets │ unsigned integer │
3788 │ │ │ (32 bit) │
3789 ├──────────────┼───────────────────┼──────────────────┤
3790 │ │ │ │
3791 │byte_number │ Number of bytes │ unsigned integer │
3792 │ │ │ (32 bit) │
3793 └──────────────┴───────────────────┴──────────────────┘
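
A few ways the limit statement might be used are sketched below. The rates, ports and log prefix are illustrative assumptions.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # accept pings up to 10 per second, with a burst of 20 packets
        icmp type echo-request limit rate 10/second burst 20 packets accept

        # "over" matches only traffic exceeding the rate: log and drop it
        tcp dport 22 ct state new limit rate over 5/minute log prefix "ssh-flood: " drop

        # byte-based limit: drop whatever exceeds 10 mbytes per second
        limit rate over 10 mbytes/second drop
    }
}
```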
3794
3795 NAT STATEMENTS
3796 snat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
3797 dnat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
3798 masquerade [to :PORT_SPEC] [FLAGS]
3799 redirect [to :PORT_SPEC] [FLAGS]
3800
3801 ADDR_SPEC := address | address - address
3802 PORT_SPEC := port | port - port
3803
3804 FLAGS := FLAG [, FLAGS]
3805 FLAG := persistent | random | fully-random
3806
3807 The nat statements are only valid in chains of type nat.
3808
3809 The snat and masquerade statements specify that the source address of
3810 the packet should be modified. While snat is only valid in the
3811 postrouting and input chains, masquerade makes sense only in
3812 postrouting. The dnat and redirect statements are only valid in the
3813 prerouting and output chains; they specify that the destination address
3814 of the packet should be modified. You can use non-base chains which are
3815 called from base chains of nat chain type too. All future packets in
3816 this connection will also be mangled, and rules should cease being
3817 examined.
3818
3819 The masquerade statement is a special form of snat which always uses
3820 the outgoing interface’s IP address to translate to. It is particularly
3821 useful on gateways with dynamic (public) IP addresses.
3822
3823 The redirect statement is a special form of dnat which always
3824 translates the destination address to the local host’s one. It comes in
3825 handy if one only wants to alter the destination port of incoming
3826 traffic on different interfaces.
3827
3828 When used in the inet family (available with kernel 5.2), the dnat and
3829 snat statements require the use of the ip or ip6 keyword in case an
3830 address is provided, see the examples below.
3831
3832 Before kernel 4.18, nat statements required both prerouting and
3833 postrouting base chains to be present, since otherwise packets on the
3834 return path would not be seen by netfilter and therefore no reverse
3835 translation would take place.
3836
3837 Table 65. NAT statement values
3838 ┌───────────┬─────────────────────┬─────────────────────┐
3839 │Expression │ Description │ Type │
3840 ├───────────┼─────────────────────┼─────────────────────┤
3841 │ │ │ │
3842 │address │ Specifies that the │ ipv4_addr, │
3843 │ │ source/destination │ ipv6_addr, e.g. │
3844 │ │ address of the │ abcd::1234, or you │
3845 │ │ packet should be │ can use a mapping, │
3846 │ │ modified. You may │ e.g. meta mark map │
3847 │ │ specify a mapping │ { 10 : 192.168.1.2, │
3848 │ │ to relate a list of │ 20 : 192.168.1.3 } │
3849 │ │ tuples composed of │ │
3850 │ │ arbitrary │ │
3851 │ │ expression key with │ │
3852 │ │ address value. │ │
3853 ├───────────┼─────────────────────┼─────────────────────┤
3854 │ │ │ │
3855 │port │ Specifies that the │ port number (16 │
3856 │ │ source/destination │ bit) │
3857 │ │ port of the │ │
3858 │ │ packet should be │ │
3859 │ │ modified. │ │
3860 └───────────┴─────────────────────┴─────────────────────┘
3861
3862 Table 66. NAT statement flags
3863 ┌─────────────┬─────────────────────────────┐
3864 │Flag │ Description │
3865 ├─────────────┼─────────────────────────────┤
3866 │ │ │
3867 │persistent │ Gives a client the same │
3868 │ │ source-/destination-address │
3869 │ │ for each connection. │
3870 ├─────────────┼─────────────────────────────┤
3871 │ │ │
3872 │random │ In kernel 5.0 and newer │
3873 │ │ this is the same as │
3874 │ │ fully-random. In earlier │
3875 │ │ kernels the port mapping │
3876 │ │ will be randomized using a │
3877 │ │ seeded MD5 hash mix using │
3878 │ │ source and destination │
3879 │ │ address and destination │
3880 │ │ port. │
3881 ├─────────────┼─────────────────────────────┤
3882 │ │ │
3883 │fully-random │ If used then port mapping │
3884 │ │ is generated based on a │
3885 │ │ 32-bit pseudo-random │
3886 │ │ algorithm. │
3887 └─────────────┴─────────────────────────────┘
3888
3889 Using NAT statements.
3890
3891 # create a suitable table/chain setup for all further examples
3892 add table nat
3893 add chain nat prerouting { type nat hook prerouting priority dstnat; }
3894 add chain nat postrouting { type nat hook postrouting priority srcnat; }
3895
3896 # translate source addresses of all packets leaving via eth0 to address 1.2.3.4
3897 add rule nat postrouting oif eth0 snat to 1.2.3.4
3898
3899 # redirect all traffic entering via eth0 to destination address 192.168.1.120
3900 add rule nat prerouting iif eth0 dnat to 192.168.1.120
3901
3902 # translate source addresses of all packets leaving via eth0 to whatever
3903 # locally generated packets would use as source to reach the same destination
3904 add rule nat postrouting oif eth0 masquerade
3905
3906 # redirect incoming TCP traffic for port 22 to port 2222
3907 add rule nat prerouting tcp dport 22 redirect to :2222
3908
3909 # inet family:
3910 # handle ip dnat:
3911 add rule inet nat prerouting dnat ip to 10.0.2.99
3912 # handle ip6 dnat:
3913 add rule inet nat prerouting dnat ip6 to fe80::dead
3914 # this masquerades both ipv4 and ipv6:
3915 add rule inet nat postrouting meta oif ppp0 masquerade
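
Address ranges, port ranges and flags from the tables above can be combined, as in this sketch in ruleset-file form. The interface names and addresses are assumptions. The transport protocol match is included here because a port mapping is given.

```nft
table ip nat {
    chain postrouting {
        type nat hook postrouting priority srcnat; policy accept;

        # map TCP flows leaving eth0 to an address range and a port range,
        # giving each client the same translated address (persistent)
        oifname "eth0" meta l4proto tcp snat to 192.0.2.1-192.0.2.4:1024-65535 persistent

        # masquerade on ppp0 with fully randomized port mapping
        oifname "ppp0" masquerade fully-random
    }
}
```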
3916
3917
3918 TPROXY STATEMENT
3919 Tproxy redirects the packet to a local socket without changing the
3920 packet header in any way. If any of the arguments is missing, the data
3921 of the incoming packet is used as the parameter. A tproxy statement
3922 also requires an accompanying match that ensures a transport protocol
3923 header is present.
3924
3925 tproxy to address:port
3926 tproxy to {address | :port}
3927
3928 This syntax can be used in ip/ip6 tables where network layer protocol
3929 is obvious. Either IP address or port can be specified, but at least
3930 one of them is necessary.
3931
3932 tproxy {ip | ip6} to address[:port]
3933 tproxy to :port
3934
3935 This syntax can be used in inet tables. The ip/ip6 parameter defines
3936 the family the rule will match. The address parameter must be of this
3937 family. When only port is defined, the address family should not be
3938 specified. In this case the rule will match for both families.
3939
3940 Table 67. tproxy attributes
3941 ┌────────┬────────────────────────────┐
3942 │Name │ Description │
3943 ├────────┼────────────────────────────┤
3944 │ │ │
3945 │address │ IP address the listening │
3946 │ │ socket with IP_TRANSPARENT │
3947 │ │ option is bound to. │
3948 ├────────┼────────────────────────────┤
3949 │ │ │
3950 │port │ Port the listening socket │
3951 │ │ with IP_TRANSPARENT option │
3952 │ │ is bound to. │
3953 └────────┴────────────────────────────┘
3954
3955 Example ruleset for tproxy statement.
3956
3957 table ip x {
3958 chain y {
3959 type filter hook prerouting priority mangle; policy accept;
3960 tcp dport ntp tproxy to 1.1.1.1
3961 udp dport ssh tproxy to :2222
3962 }
3963 }
3964 table ip6 x {
3965 chain y {
3966 type filter hook prerouting priority mangle; policy accept;
3967 tcp dport ntp tproxy to [dead::beef]
3968 udp dport ssh tproxy to :2222
3969 }
3970 }
3971 table inet x {
3972 chain y {
3973 type filter hook prerouting priority mangle; policy accept;
3974 tcp dport 321 tproxy to :ssh
3975 tcp dport 99 tproxy ip to 1.1.1.1:999
3976 udp dport 155 tproxy ip6 to [dead::beef]:smux
3977 }
3978 }
3979
3980
3981 SYNPROXY STATEMENT
3982 This statement handles the TCP three-way handshake in parallel in
3983 netfilter context to protect either the local or a backend system. This
3984 statement requires connection tracking because sequence numbers need to
3985 be translated.
3986
3987 synproxy [mss mss_value] [wscale wscale_value] [SYNPROXY_FLAGS]
3988
3989 Table 68. synproxy statement attributes
3990 ┌───────┬────────────────────────────┐
3991 │Name │ Description │
3992 ├───────┼────────────────────────────┤
3993 │ │ │
3994 │mss │ Maximum segment size │
3995 │ │ announced to clients. This │
3996 │ │ must match the backend. │
3997 ├───────┼────────────────────────────┤
3998 │ │ │
3999 │wscale │ Window scale announced to │
4000 │ │ clients. This must match │
4001 │ │ the backend. │
4002 └───────┴────────────────────────────┘
4003
4004 Table 69. synproxy statement flags
4005 ┌──────────┬────────────────────────────┐
4006 │Flag │ Description │
4007 ├──────────┼────────────────────────────┤
4008 │ │ │
4009 │sack-perm │ Pass client selective │
4010 │ │ acknowledgement option to │
4011 │ │ backend (will be disabled │
4012 │ │ if not present). │
4013 ├──────────┼────────────────────────────┤
4014 │ │ │
4015 │timestamp │ Pass client timestamp │
4016 │ │ option to backend (will be │
4017 │ │ disabled if not present, │
4018 │ │ also needed for selective │
4019 │ │ acknowledgement and window │
4020 │ │ scaling). │
4021 └──────────┴────────────────────────────┘
4022
4023 Example ruleset for synproxy statement.
4024
4025 Determine the TCP options used by the backend from an external system.
4026
4027 tcpdump -pni eth0 -c 1 'tcp[tcpflags] == (tcp-syn|tcp-ack)' \
4028 port 80 &
4029 telnet 192.0.2.42 80
4030 18:57:24.693307 IP 192.0.2.42.80 > 192.0.2.43.48757:
4031 Flags [S.], seq 360414582, ack 788841994, win 14480,
4032 options [mss 1460,sackOK,
4033 TS val 1409056151 ecr 9690221,
4034 nop,wscale 9],
4035 length 0
4036
4037 Switch tcp_loose mode off, so conntrack will mark out-of-flow packets as state INVALID.
4038
4039 echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
4040
4041 Make SYN packets untracked.
4042
4043 table ip x {
4044 chain y {
4045 type filter hook prerouting priority raw; policy accept;
4046 tcp flags syn notrack
4047 }
4048 }
4049
4050 Catch UNTRACKED (SYN packets) and INVALID (3WHS ACK packets) states and send
4051 them to SYNPROXY. This rule will respond to SYN packets with SYN+ACK
4052 syncookies, create ESTABLISHED state for valid client responses (3WHS ACK
4053 packets) and drop incorrect cookies. Flag combinations not expected during 3WHS
4054 (e.g. SYN+FIN, SYN+ACK) will not match and continue. Finally, drop invalid
4055 packets; these will be out-of-flow packets that were not matched by SYNPROXY.
4056
4057 table ip x {
4058 chain z {
4059 type filter hook input priority filter; policy accept;
4060 ct state invalid, untracked synproxy mss 1460 wscale 9 timestamp sack-perm
4061 ct state invalid drop
4062 }
4063 }
4064
4065
4066 FLOW STATEMENT
4067 A flow statement lets you select which flows to accelerate by
4068 forwarding them through a layer 3 network stack bypass. You have to
4069 specify the name of the flowtable to which you want to offload the flow.
4070
4071 flow add @flowtable
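
The flowtable has to be declared in the same table before a rule can refer to it. A minimal sketch, assuming a hypothetical flowtable named f and two interfaces eth0 and eth1:

```nft
table inet x {
    # flowtable attached to the ingress hook of the listed devices
    flowtable f {
        hook ingress priority 0; devices = { eth0, eth1 };
    }

    chain forward {
        type filter hook forward priority filter; policy accept;

        # offload forwarded TCP flows to flowtable f
        ip protocol tcp flow add @f
    }
}
```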
4072
4073 QUEUE STATEMENT
4074 This statement passes the packet to userspace using the nfnetlink_queue
4075 handler. The packet is put into the queue identified by its 16-bit
4076 queue number. Userspace can inspect and modify the packet if desired.
4077 Userspace must then drop or re-inject the packet into the kernel. See
4078 libnetfilter_queue documentation for details.
4079
4080 queue [flags QUEUE_FLAGS] [to queue_number]
4081 queue [flags QUEUE_FLAGS] [to queue_number_from - queue_number_to]
4082 queue [flags QUEUE_FLAGS] [to QUEUE_EXPRESSION ]
4083
4084 QUEUE_FLAGS := QUEUE_FLAG [, QUEUE_FLAGS]
4085 QUEUE_FLAG := bypass | fanout
4086 QUEUE_EXPRESSION := numgen | hash | symhash | MAP STATEMENT
4087
4088 QUEUE_EXPRESSION can be used to compute a queue number at run-time with
4089 the hash or numgen expressions. It also allows using the map statement
4090 to assign fixed queue numbers based on external inputs such as the
4091 source IP address or interface names.
4092
4093 Table 70. queue statement values
4094 ┌──────────────────┬────────────────────┬──────────────────┐
4095 │Value │ Description │ Type │
4096 ├──────────────────┼────────────────────┼──────────────────┤
4097 │ │ │ │
4098 │queue_number │ Sets queue number, │ unsigned integer │
4099 │ │ default is 0. │ (16 bit) │
4100 ├──────────────────┼────────────────────┼──────────────────┤
4101 │ │ │ │
4102 │queue_number_from │ Sets initial queue │ unsigned integer │
4103 │ │ in the range, if │ (16 bit) │
4104 │ │ fanout is used. │ │
4105 ├──────────────────┼────────────────────┼──────────────────┤
4106 │ │ │ │
4107 │queue_number_to │ Sets closing queue │ unsigned integer │
4108 │ │ in the range, if │ (16 bit) │
4109 │ │ fanout is used. │ │
4110 └──────────────────┴────────────────────┴──────────────────┘
4111
4112 Table 71. queue statement flags
4113 ┌───────┬────────────────────────────┐
4114 │Flag │ Description │
4115 ├───────┼────────────────────────────┤
4116 │ │ │
4117 │bypass │ Let packets go through if │
4118 │ │ userspace application │
4119 │ │ cannot back off. Before │
4120 │ │ using this flag, read │
4121 │ │ libnetfilter_queue │
4122 │ │ documentation for │
4123 │ │ performance tuning │
4124 │ │ recommendations. │
4125 ├───────┼────────────────────────────┤
4126 │ │ │
4127 │fanout │ Distribute packets between │
4128 │ │ several queues. │
4129 └───────┴────────────────────────────┘
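
The three forms of the queue statement might look as follows. The ports and queue numbers are illustrative assumptions, and a userspace program bound to the queues is assumed to exist.

```nft
table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;

        # single queue
        tcp dport 80 queue to 1

        # distribute packets over queues 2 to 5
        udp dport 53 queue flags bypass,fanout to 2-5

        # compute the queue number at run-time: round-robin over queues 0 to 3
        tcp dport 443 queue flags bypass to numgen inc mod 4
    }
}
```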
4130
4131 DUP STATEMENT
4132 The dup statement is used to duplicate a packet and send the copy to a
4133 different destination.
4134
4135 dup to device
4136 dup to address device device
4137
4138 Table 72. Dup statement values
4139 ┌───────────┬─────────────────────┬─────────────────────┐
4140 │Expression │ Description │ Type │
4141 ├───────────┼─────────────────────┼─────────────────────┤
4142 │ │ │ │
4143 │address │ Specifies that the │ ipv4_addr, │
4144 │ │ copy of the packet │ ipv6_addr, e.g. │
4145 │ │ should be sent to a │ abcd::1234, or you │
4146 │ │ new gateway. │ can use a mapping, │
4147 │ │ │ e.g. ip saddr map { │
4148 │ │ │ 192.168.1.2 : │
4149 │ │ │ 10.1.1.1 } │
4150 ├───────────┼─────────────────────┼─────────────────────┤
4151 │ │ │ │
4152 │device │ Specifies that the │ string │
4153 │ │ copy should be │ │
4154 │ │ transmitted via │ │
4155 │ │ device. │ │
4156 └───────────┴─────────────────────┴─────────────────────┘
4157
4158 Using the dup statement.
4159
4160 # send to machine with ip address 10.2.3.4 on eth0
4161 ip filter forward dup to 10.2.3.4 device "eth0"
4162
4163 # copy raw frame to another interface
4164 netdev ingress dup to "eth0"
4165 dup to "eth0"
4166
4167 # combine with map dst addr to gateways
4168 dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" }
4169
4170
4171 FWD STATEMENT
4172 The fwd statement is used to redirect a raw packet to another
4173 interface. It is only available in the netdev family ingress and egress
4174 hooks. It is similar to the dup statement except that no copy is made.
4175
4176 fwd to device
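
A minimal sketch for the netdev family follows. The interface names eth0 and eth1 and the UDP port are assumptions.

```nft
table netdev filter {
    chain ingress {
        # ingress chains in the netdev family are bound to a device
        type filter hook ingress device "eth0" priority 0; policy accept;

        # hand matching packets to eth1 without making a copy
        udp dport 4789 fwd to "eth1"
    }
}
```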
4177
4178 SET STATEMENT
4179 The set statement is used to dynamically add or update elements in a
4180 set from the packet path. The set setname must already exist in the
4181 given table and must have been created with one or both of the dynamic
4182 and the timeout flags. The dynamic flag is required if the set
4183 statement expression includes a stateful object. The timeout flag is
4184 implied if the set is created with a timeout, and is required if the
4185 set statement updates elements, rather than adding them. Furthermore,
4186 these sets should specify a maximum set size (to prevent memory
4187 exhaustion), and their elements should have a timeout (so their number
4188 will not grow indefinitely), either from the set definition or from the
4189 statement that adds or updates them. The set statement can be used,
4190 for example, to create dynamic blacklists.
4191
4192 {add | update} @setname { expression [timeout timeout] [comment string] }
4193
4194 Example for simple blacklist.
4195
4196 # declare a set, bound to table "filter", in family "ip".
4197 # Timeout and size are mandatory because we will add elements from packet path.
4198 # Entries will time out after one minute, after which they might be
4199 # re-added if limit condition persists.
4200 nft add set ip filter blackhole \
4201 "{ type ipv4_addr; flags dynamic; timeout 1m; size 65536; }"
4202
4203 # declare a set to store the limit per saddr.
4204 # This must be separate from blackhole since the timeout is different
4205 nft add set ip filter flood \
4206 "{ type ipv4_addr; flags dynamic; timeout 10s; size 128000; }"
4207
4208 # whitelist internal interface.
4209 nft add rule ip filter input meta iifname "internal" accept
4210
4211 # drop packets coming from blacklisted ip addresses.
4212 nft add rule ip filter input ip saddr @blackhole counter drop
4213
4214 # add source ip addresses to the blacklist if more than 10 tcp connection
4215 # requests occurred per second and ip address.
4216 nft add rule ip filter input tcp flags syn tcp dport ssh \
4217 add @flood { ip saddr limit rate over 10/second } \
4218 add @blackhole { ip saddr } \
4219 drop
4220
4221 # inspect state of the sets.
4222 nft list set ip filter flood
4223 nft list set ip filter blackhole
4224
4225 # manually add two addresses to the blackhole.
4226 nft add element filter blackhole { 10.2.3.4, 10.23.1.42 }
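
The update form can be sketched in ruleset-file syntax as shown below. The set name recent_ssh, its size and the timeouts are assumptions chosen for illustration. The set carries the timeout flag because the statement updates elements.

```nft
table ip filter {
    set recent_ssh {
        type ipv4_addr
        flags dynamic,timeout
        timeout 5m
        size 65536
    }

    chain input {
        type filter hook input priority filter; policy accept;

        # record the source of each new SSH connection and refresh its
        # timeout to 10 minutes
        tcp dport 22 ct state new update @recent_ssh { ip saddr timeout 10m }
    }
}
```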
4227
4228
4229 MAP STATEMENT
4230 The map statement is used to look up data based on some specific input
4231 key.
4232
4233 expression map { MAP_ELEMENTS }
4234
4235 MAP_ELEMENTS := MAP_ELEMENT [, MAP_ELEMENTS]
4236 MAP_ELEMENT := key : value
4237
4238 The key is a value returned by expression.
4239
4240 Using the map statement.
4241
4242 # select DNAT target based on TCP dport:
4243 # connections to port 80 are redirected to 192.168.1.100,
4244 # connections to port 8888 are redirected to 192.168.1.101
4245 nft add rule ip nat prerouting dnat tcp dport map { 80 : 192.168.1.100, 8888 : 192.168.1.101 }
4246
4247 # source address based SNAT:
4248 # packets from net 192.168.1.0/24 will appear as originating from 10.0.0.1,
4249 # packets from net 192.168.2.0/24 will appear as originating from 10.0.0.2
4250 nft add rule ip nat postrouting snat to ip saddr map { 192.168.1.0/24 : 10.0.0.1, 192.168.2.0/24 : 10.0.0.2 }
4251
4252
4253 VMAP STATEMENT
4254 The verdict map (vmap) statement works analogously to the map statement,
4255 but contains verdicts as values.
4256
4257 expression vmap { VMAP_ELEMENTS }
4258
4259 VMAP_ELEMENTS := VMAP_ELEMENT [, VMAP_ELEMENTS]
4260 VMAP_ELEMENT := key : verdict
4261
4262 Using the vmap statement.
4263
4264 # jump to different chains depending on layer 4 protocol type:
4265 nft add rule ip filter input ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain , icmp : jump icmp-chain }
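
The jump targets named in a verdict map must exist as chains in the same table. The following sketch shows the same rule in ruleset-file form together with placeholder target chains. The chain bodies are assumptions and only count packets.

```nft
table ip filter {
    # regular (non-base) chains used as jump targets
    chain tcp-chain {
        counter
    }
    chain udp-chain {
        counter
    }
    chain icmp-chain {
        counter
    }

    chain input {
        type filter hook input priority filter; policy accept;

        # dispatch on the layer 4 protocol
        ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain, icmp : jump icmp-chain }
    }
}
```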
4266
4267
4269 These are some additional commands included in nft.
4270
4271 MONITOR
4272 The monitor command allows you to listen to Netlink events produced by
4273 the nf_tables subsystem. These are either related to creation and
4274 deletion of objects or to packets for which meta nftrace was enabled.
4275 When they occur, nft will print to stdout the monitored events in
4276 either JSON or native nft format.
4277
4278 monitor [new | destroy] MONITOR_OBJECT
4279 monitor trace
4280
4281 MONITOR_OBJECT := tables | chains | sets | rules | elements | ruleset
4282
4283 To filter events related to a concrete object, use one of the keywords
4284 in MONITOR_OBJECT.
4285
4286 To filter events related to a concrete action, use keyword new or
4287 destroy.
4288
4289 The second form of invocation takes no further options and exclusively
4290 prints events generated for packets with nftrace enabled.
4291
4292 Hit ^C to finish the monitor operation.
4293
4294 Listen to all events, report in native nft format.
4295
4296 % nft monitor
4297
4298 Listen to deleted rules, report in JSON format.
4299
4300 % nft -j monitor destroy rules
4301
4302 Listen to both new and destroyed chains, in native nft format.
4303
4304 % nft monitor chains
4305
4306 Listen to ruleset events such as table, chain, rule, set, counters and
4307 quotas, in native nft format.
4308
4309 % nft monitor ruleset
4310
4311 Trace incoming packets from host 10.0.0.1.
4312
4313 % nft add rule filter input ip saddr 10.0.0.1 meta nftrace set 1
4314 % nft monitor trace
4315
4316
4318 When an error is detected, nft shows the line(s) containing the error,
4319 the position of the erroneous parts in the input stream and marks up
4320 the erroneous parts using carets (^). If the error results from the
4321 combination of two expressions or statements, the part imposing the
4322 constraints which are violated is marked using tildes (~).
4323
4324 For errors returned by the kernel, nft cannot detect which parts of the
4325 input caused the error and the entire command is marked.
4326
4327 Error caused by single incorrect expression.
4328
4329 <cmdline>:1:19-22: Error: Interface does not exist
4330 filter output oif eth0
4331 ^^^^
4332
4333 Error caused by invalid combination of two expressions.
4334
4335 <cmdline>:1:28-36: Error: Right hand side of relational expression (==) must be constant
4336 filter output tcp dport == tcp dport
4337 ~~ ^^^^^^^^^
4338
4339 Error returned by the kernel.
4340
4341 <cmdline>:0:0-23: Error: Could not process rule: Operation not permitted
4342 filter output oif wlan0
4343 ^^^^^^^^^^^^^^^^^^^^^^^
4344
4345
4347 On success, nft exits with a status of 0. Unspecified errors cause it
4348 to exit with a status of 1, memory allocation errors with a status of
4349 2, and failure to open the Netlink socket with a status of 3.
4350
4352 libnftables(3), libnftables-json(5), iptables(8), ip6tables(8), arptables(8), ebtables(8), ip(8), tc(8)
4353
4354 There is an official wiki at: https://wiki.nftables.org
4355
4357 nftables was written by Patrick McHardy and Pablo Neira Ayuso, among
4358 many other contributors from the Netfilter community.
4359
4361 Copyright © 2008-2014 Patrick McHardy <kaber@trash.net>
4362 Copyright © 2013-2018 Pablo Neira Ayuso <pablo@netfilter.org>
4363
4364 nftables is free software; you can redistribute it and/or modify it
4365 under the terms of the GNU General Public License version 2 as
4366 published by the Free Software Foundation.
4367
4368 This documentation is licensed under the terms of the Creative Commons
4369 Attribution-ShareAlike 4.0 license, CC BY-SA 4.0
4370 http://creativecommons.org/licenses/by-sa/4.0/.
4371
4372
4373
4374 08/09/2022 NFT(8)