NFT(8)                                                                  NFT(8)

NAME
nft - Administration tool of the nftables framework for packet
filtering and classification

SYNOPSIS
nft [ -nNscaeSupyjt ] [ -I directory ] [ -f filename | -i | cmd ...]
nft -h
nft -v

DESCRIPTION
nft is the command line tool used to set up, maintain and inspect
packet filtering and classification rules in the Linux kernel, in the
nftables framework. The Linux kernel subsystem is known as nf_tables,
and ‘nf’ stands for Netfilter.

OPTIONS
The command accepts several different options, which are documented
here in groups to make their meaning easier to follow. You can get
information about options by running nft --help.

General options:

-h, --help
    Show help message and all options.

-v, --version
    Show version.

-V
    Show long version information, including compile-time
    configuration.

Ruleset input handling options that specify how to load rulesets:

-f, --file filename
    Read input from filename. If filename is -, read from stdin.

-D, --define name=value
    Define a variable. You can only combine this option with -f.

-i, --interactive
    Read input from an interactive readline CLI. You can use quit to
    exit, or use the EOF marker, which is normally CTRL-D.

-I, --includepath directory
    Add the directory directory to the list of directories to be
    searched for included files. This option may be specified multiple
    times.

-c, --check
    Check the validity of commands without actually applying the
    changes.

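As a hedged illustration of the input handling options above, the
following shell session validates a ruleset file before loading it. The
file name ruleset.nft and the variable name wan_if are placeholders
chosen for this sketch; they are not names used by nft itself.

    # check the file for errors without touching the kernel ruleset
    nft -c -f ruleset.nft

    # load it, defining the variable $wan_if referenced inside the file
    nft -D wan_if=eth0 -f ruleset.nft

    # read commands from standard input instead of a file
    echo "list ruleset" | nft -f -
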
Ruleset list output formatting options that modify the output of the
list ruleset command:

-a, --handle
    Show object handles in output.

-s, --stateless
    Omit stateful information of rules and stateful objects.

-t, --terse
    Omit contents of sets from output.

-S, --service
    Translate ports to service names as defined by /etc/services.

-N, --reversedns
    Translate IP addresses to names via reverse DNS lookup. This may
    slow down your listing since it generates network traffic.

-u, --guid
    Translate numeric UID/GID to names as defined by /etc/passwd and
    /etc/group.

-n, --numeric
    Print fully numerical output.

-y, --numeric-priority
    Display base chain priority numerically.

-p, --numeric-protocol
    Display layer 4 protocol numerically.

-T, --numeric-time
    Show time, day and hour values in numeric format.

Command output formatting:

-e, --echo
    When inserting items into the ruleset using add, insert or replace
    commands, print notifications just like nft monitor.

-j, --json
    Format output in JSON. See libnftables-json(5) for a schema
    description.

-d, --debug level
    Enable debugging output. The debug level can be any of scanner,
    parser, eval, netlink, mnl, proto-ctx, segtree, all. You can
    combine more than one by separating them with a comma, for example
    -d eval,mnl.

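A short sketch of how the formatting options can be combined. The
output depends on the ruleset currently loaded, so none is shown here.

    # show object handles, but omit counters and other stateful data
    nft -a -s list ruleset

    # fully numeric output, with set contents left out
    nft -n -t list ruleset

    # JSON output, e.g. for processing by other programs
    nft -j list ruleset
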
INPUT FILE FORMATS
LEXICAL CONVENTIONS
Input is parsed line-wise. When the last character of a line, just
before the newline character, is a non-quoted backslash (\), the next
line is treated as a continuation. Multiple commands on the same line
can be separated using a semicolon (;).

A hash sign (#) begins a comment. All following characters on the same
line are ignored.

Identifiers begin with an alphabetic character (a-z,A-Z), followed by
zero or more alphanumeric characters (a-z,A-Z,0-9) and the characters
slash (/), backslash (\), underscore (_) and dot (.). Identifiers using
different characters or clashing with a keyword need to be enclosed in
double quotes (").

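The conventions above might look as follows in a ruleset file. The
table and chain names are made up for this sketch.

    # two commands on one line, separated by a semicolon
    add table inet demo; add chain inet demo services

    # a trailing backslash continues the command on the next line
    add rule inet demo services \
        tcp dport 22 accept

    # "list" clashes with a keyword, so the chain name is quoted
    add chain inet demo "list"
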
INCLUDE FILES
include filename

Other files can be included by using the include statement. The
directories to be searched for include files can be specified using the
-I/--includepath option. You can override this behaviour either by
prepending ‘./’ to your path to force inclusion of files located in the
current working directory (i.e. a relative path), or by starting the
path with / to give the file location as an absolute path.

If -I/--includepath is not specified, then nft relies on the default
directory that is specified at compile time. You can retrieve this
default directory via the -h/--help option.

Include statements support the usual shell wildcard symbols (*,?,[]).
Having no matches for an include statement is not an error if wildcard
symbols are used in the include statement. This allows having
potentially empty include directories for statements like include
"/etc/firewall/rules/*". The wildcard matches are loaded in
alphabetical order. Files beginning with a dot (.) are not matched by
include statements.

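A minimal sketch of include statements in a ruleset file. All file and
directory names here are hypothetical.

    # searched for in the include path (-I, or the compiled-in default)
    include "defines.nft"

    # "./" forces a path relative to the current working directory
    include "./local-overrides.nft"

    # wildcard include: matches are loaded in alphabetical order, and
    # an empty directory is not an error
    include "/etc/firewall/rules/*.nft"
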
SYMBOLIC VARIABLES
define variable = expr
$variable

Symbolic variables can be defined using the define statement. Variable
references are expressions and can be used to initialize other
variables. The scope of a definition is the current block and all
blocks contained within.

Using symbolic variables.

    define int_if1 = eth0
    define int_if2 = eth1
    define int_ifs = { $int_if1, $int_if2 }

    filter input iif $int_ifs accept


ADDRESS FAMILIES
Address families determine the type of packets which are processed. For
each address family, the kernel contains so-called hooks at specific
stages of the packet processing paths, which invoke nftables if rules
for these hooks exist.

ip        IPv4 address family.

ip6       IPv6 address family.

inet      Internet (IPv4/IPv6) address family.

arp       ARP address family, handling IPv4 ARP packets.

bridge    Bridge address family, handling packets which traverse a
          bridge device.

netdev    Netdev address family, handling packets on ingress and
          egress.

All nftables objects exist in address family specific namespaces,
therefore all identifiers include an address family. If an identifier
is specified without an address family, the ip family is used by
default.

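For example, assuming a table named filter exists in the ip family, the
first two commands below refer to the same table, while the third
refers to a separate table of the same name in the inet family:

    nft list table filter
    nft list table ip filter
    nft list table inet filter
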
IPV4/IPV6/INET ADDRESS FAMILIES
The IPv4/IPv6/Inet address families handle IPv4, IPv6 or both types of
packets. They contain five hooks at different packet processing stages
in the network stack; the inet family additionally provides an ingress
hook.

Table 1. IPv4/IPv6/Inet address family hooks
┌────────────┬────────────────────────────┐
│Hook        │ Description                │
├────────────┼────────────────────────────┤
│            │                            │
│prerouting  │ All packets entering the   │
│            │ system are processed by    │
│            │ the prerouting hook. It is │
│            │ invoked before the routing │
│            │ process and is used for    │
│            │ early filtering or         │
│            │ changing packet attributes │
│            │ that affect routing.       │
├────────────┼────────────────────────────┤
│            │                            │
│input       │ Packets delivered to the   │
│            │ local system are processed │
│            │ by the input hook.         │
├────────────┼────────────────────────────┤
│            │                            │
│forward     │ Packets forwarded to a     │
│            │ different host are         │
│            │ processed by the forward   │
│            │ hook.                      │
├────────────┼────────────────────────────┤
│            │                            │
│output      │ Packets sent by local      │
│            │ processes are processed by │
│            │ the output hook.           │
├────────────┼────────────────────────────┤
│            │                            │
│postrouting │ All packets leaving the    │
│            │ system are processed by    │
│            │ the postrouting hook.      │
├────────────┼────────────────────────────┤
│            │                            │
│ingress     │ All packets entering the   │
│            │ system are processed by    │
│            │ this hook. It is invoked   │
│            │ before layer 3 protocol    │
│            │ handlers, hence before the │
│            │ prerouting hook, and it    │
│            │ can be used for filtering  │
│            │ and policing. Ingress is   │
│            │ only available for Inet    │
│            │ family (since Linux kernel │
│            │ 5.10).                     │
└────────────┴────────────────────────────┘

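As an illustration, a ruleset file might attach base chains to some of
these hooks as sketched below. The table name, chain names and the
device name eth0 are assumptions made for this example; the sketch also
assumes that an inet ingress chain, like a netdev one, is bound to a
device.

    table inet demo {
        chain in {
            type filter hook input priority filter; policy accept;
        }

        chain fwd {
            type filter hook forward priority filter; policy accept;
        }

        # inet ingress hook, seen before prerouting
        chain early {
            type filter hook ingress device "eth0" priority filter; policy accept;
        }
    }
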
ARP ADDRESS FAMILY
The ARP address family handles ARP packets received and sent by the
system. It is commonly used to mangle ARP packets for clustering.

Table 2. ARP address family hooks
┌───────┬────────────────────────────┐
│Hook   │ Description                │
├───────┼────────────────────────────┤
│       │                            │
│input  │ Packets delivered to the   │
│       │ local system are processed │
│       │ by the input hook.         │
├───────┼────────────────────────────┤
│       │                            │
│output │ Packets sent by the local  │
│       │ system are processed by    │
│       │ the output hook.           │
└───────┴────────────────────────────┘

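A minimal sketch of an arp table that counts incoming ARP requests.
The table and chain names are hypothetical.

    table arp demo {
        chain in {
            type filter hook input priority filter; policy accept;
            # count ARP requests delivered to the local system
            arp operation request counter
        }
    }
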
BRIDGE ADDRESS FAMILY
The bridge address family handles Ethernet packets traversing bridge
devices.

The list of supported hooks is identical to that of the IPv4/IPv6/Inet
address families above.

NETDEV ADDRESS FAMILY
The Netdev address family handles packets from the device ingress and
egress path. This family allows you to filter packets of any ethertype
such as ARP, VLAN 802.1q, VLAN 802.1ad (Q-in-Q) as well as IPv4 and
IPv6 packets.

Table 3. Netdev address family hooks
┌────────┬────────────────────────────┐
│Hook    │ Description                │
├────────┼────────────────────────────┤
│        │                            │
│ingress │ All packets entering the   │
│        │ system are processed by    │
│        │ this hook. It is invoked   │
│        │ after the network taps     │
│        │ (e.g. tcpdump), right after│
│        │ tc ingress and before      │
│        │ layer 3 protocol handlers. │
│        │ It can be used for early   │
│        │ filtering and policing.    │
├────────┼────────────────────────────┤
│        │                            │
│egress  │ All packets leaving the    │
│        │ system are processed by    │
│        │ this hook. It is invoked   │
│        │ after layer 3 protocol     │
│        │ handlers and before tc     │
│        │ egress. It can be used for │
│        │ late filtering and         │
│        │ policing.                  │
└────────┴────────────────────────────┘

Tunneled packets (such as vxlan) are processed by netdev family hooks
both in decapsulated and encapsulated (tunneled) form. So a packet can
be filtered on the overlay network as well as on the underlying
network.

Note that the order of netfilter and tc is mirrored on ingress versus
egress. This ensures symmetry for NAT and other packet mangling.

Ingress packets which are redirected out of some other interface are
only processed by netfilter on egress if they have passed through
netfilter ingress processing before. Thus, ingress packets which are
redirected by tc are not subjected to netfilter, but they are if they
are redirected by netfilter on ingress. Conceptually, tc and netfilter
can be thought of as layers, with netfilter layered above tc: if the
packet hasn’t been passed up from the tc layer to the netfilter layer,
it’s not subjected to netfilter on egress.

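A sketch of a netdev table using both hooks. The device name eth0, the
table and chain names and the addresses are assumptions for this
example; the egress hook needs a kernel that provides it.

    table netdev demo {
        chain ingress_eth0 {
            type filter hook ingress device "eth0" priority filter; policy accept;
            # early drop of an unwanted source network
            ip saddr 192.0.2.0/24 drop
            # non-IP ethertypes can be matched as well
            ether type arp counter
        }

        chain egress_eth0 {
            type filter hook egress device "eth0" priority filter; policy accept;
            ip daddr 198.51.100.0/24 counter
        }
    }
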
RULESET
{list | flush} ruleset [family]

The ruleset keyword is used to identify the whole set of tables,
chains, etc. currently in place in the kernel. The following ruleset
commands exist:

list      Print the ruleset in human-readable format.

flush     Clear the whole ruleset. Note that, unlike iptables, this
          will remove all tables and whatever they contain,
          effectively leading to an empty ruleset - no packet
          filtering will happen anymore, so the kernel accepts any
          valid packet it receives.

It is possible to limit list and flush to a specific address family
only. For a list of valid family names, see the section called “ADDRESS
FAMILIES” above.

By design, list ruleset command output may be used as input to nft -f.
Effectively, this is the nft-equivalent of iptables-save and
iptables-restore.

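The save and restore cycle described above might look like this in a
shell. The file name ruleset.nft is a placeholder.

    # save the current ruleset to a file
    nft list ruleset > ruleset.nft

    # later: drop everything, then restore the saved state
    nft flush ruleset
    nft -f ruleset.nft

    # limit an operation to one address family
    nft list ruleset inet
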
TABLES
{add | create} table [family] table [ {comment comment ;} {flags flags ;} ]
{delete | list | flush} table [family] table
list tables [family]
delete table [family] handle handle

Tables are containers for chains, sets and stateful objects. They are
identified by their address family and their name. The address family
must be one of ip, ip6, inet, arp, bridge, netdev. The inet address
family is a dummy family which is used to create hybrid IPv4/IPv6
tables. The meta expression nfproto keyword can be used to test which
family (ipv4 or ipv6) context the packet is being processed in. When no
address family is specified, ip is used by default. The only difference
between add and create is that the former will not return an error if
the specified table already exists while create will return an error.

Table 4. Table flags
┌────────┬────────────────────────────┐
│Flag    │ Description                │
├────────┼────────────────────────────┤
│        │                            │
│dormant │ table is not evaluated any │
│        │ more (base chains are      │
│        │ unregistered).             │
└────────┴────────────────────────────┘

Add, change, delete a table.

    # start nft in interactive mode
    nft --interactive

    # create a new table.
    create table inet mytable

    # add a new base chain: get input packets
    add chain inet mytable myin { type filter hook input priority filter; }

    # add a single counter to the chain
    add rule inet mytable myin counter

    # disable the table temporarily -- rules are not evaluated anymore
    add table inet mytable { flags dormant; }

    # make table active again:
    add table inet mytable


add       Add a new table for the given family with the given name.

delete    Delete the specified table.

list      List all chains and rules of the specified table.

flush     Flush all chains and rules of the specified table.

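As a small sketch of the nfproto test mentioned above, the following
inet table counts IPv4 and IPv6 packets separately within one chain.
The table and chain names are hypothetical.

    table inet demo {
        chain in {
            type filter hook input priority filter; policy accept;
            meta nfproto ipv4 counter
            meta nfproto ipv6 counter
        }
    }
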
CHAINS
{add | create} chain [family] table chain [{ type type hook hook [device device] priority priority ; [policy policy ;] [comment comment ;] }]
{delete | list | flush} chain [family] table chain
list chains [family]
delete chain [family] table handle handle
rename chain [family] table chain newname

Chains are containers for rules. They exist in two kinds, base chains
and regular chains. A base chain is an entry point for packets from the
networking stack; a regular chain may be used as a jump target and is
used for better rule organization.

add       Add a new chain in the specified table. When a hook and
          priority value are specified, the chain is created as a base
          chain and hooked up to the networking stack.

create    Similar to the add command, but returns an error if the
          chain already exists.

delete    Delete the specified chain. The chain must not contain any
          rules or be used as a jump target.

rename    Rename the specified chain.

list      List all rules of the specified chain.

flush     Flush all rules of the specified chain.

For base chains, type, hook and priority parameters are mandatory.

Table 5. Supported chain types
┌───────┬───────────────┬────────────────┬──────────────────┐
│Type   │ Families      │ Hooks          │ Description      │
├───────┼───────────────┼────────────────┼──────────────────┤
│       │               │                │                  │
│filter │ all           │ all            │ Standard chain   │
│       │               │                │ type to use when │
│       │               │                │ in doubt.        │
├───────┼───────────────┼────────────────┼──────────────────┤
│       │               │                │                  │
│nat    │ ip, ip6, inet │ prerouting,    │ Chains of this   │
│       │               │ input, output, │ type perform     │
│       │               │ postrouting    │ Network Address  │
│       │               │                │ Translation      │
│       │               │                │ based on         │
│       │               │                │ conntrack        │
│       │               │                │ entries. Only    │
│       │               │                │ the first packet │
│       │               │                │ of a connection  │
│       │               │                │ actually         │
│       │               │                │ traverses this   │
│       │               │                │ chain - its      │
│       │               │                │ rules usually    │
│       │               │                │ define details   │
│       │               │                │ of the created   │
│       │               │                │ conntrack entry  │
│       │               │                │ (NAT statements  │
│       │               │                │ for instance).   │
├───────┼───────────────┼────────────────┼──────────────────┤
│       │               │                │                  │
│route  │ ip, ip6       │ output         │ If a packet has  │
│       │               │                │ traversed a      │
│       │               │                │ chain of this    │
│       │               │                │ type and is      │
│       │               │                │ about to be      │
│       │               │                │ accepted, a new  │
│       │               │                │ route lookup is  │
│       │               │                │ performed if     │
│       │               │                │ relevant parts   │
│       │               │                │ of the IP header │
│       │               │                │ have changed.    │
│       │               │                │ This allows you  │
│       │               │                │ to implement e.g.│
│       │               │                │ policy routing   │
│       │               │                │ selectors in     │
│       │               │                │ nftables.        │
└───────┴───────────────┴────────────────┴──────────────────┘

Apart from the special cases illustrated above (e.g. nat type not
supporting forward hook or route type only supporting output hook),
there are three further quirks worth noting:

•   The netdev family supports merely two combinations, namely filter
    type with ingress hook and filter type with egress hook. Base
    chains in this family also require the device parameter to be
    present since they exist per interface only.

•   The arp family supports only the input and output hooks, both in
    chains of type filter.

•   The inet family also supports the ingress hook (since Linux kernel
    5.10), to filter IPv4 and IPv6 packets at the same location as the
    netdev ingress hook. This inet hook allows you to share sets and
    maps between the usual prerouting, input, forward, output,
    postrouting and this ingress hook.

The priority parameter accepts a signed integer value or a standard
priority name which specifies the order in which chains with the same
hook value are traversed. The ordering is ascending, i.e. lower
priority values have precedence over higher ones.

Standard priority values can be replaced with easily memorizable names.
Not all names make sense in every family with every hook (see the
compatibility matrices below) but their numerical value can still be
used for prioritizing chains.

These names and values are defined and made available based on what
priorities are used by xtables when registering their default chains.

Most of the families use the same values, but bridge uses different
ones from the others. See the following tables that describe the values
and compatibility.

Table 6. Standard priority names, family and hook compatibility matrix
┌─────────┬───────┬────────────────┬─────────────┐
│Name     │ Value │ Families       │ Hooks       │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│raw      │ -300  │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│mangle   │ -150  │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│dstnat   │ -100  │ ip, ip6, inet  │ prerouting  │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│filter   │ 0     │ ip, ip6, inet, │ all         │
│         │       │ arp, netdev    │             │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│security │ 50    │ ip, ip6, inet  │ all         │
├─────────┼───────┼────────────────┼─────────────┤
│         │       │                │             │
│srcnat   │ 100   │ ip, ip6, inet  │ postrouting │
└─────────┴───────┴────────────────┴─────────────┘

Table 7. Standard priority names and hook compatibility for the bridge
family
┌───────┬───────┬─────────────┐
│       │       │             │
│Name   │ Value │ Hooks       │
├───────┼───────┼─────────────┤
│       │       │             │
│dstnat │ -300  │ prerouting  │
├───────┼───────┼─────────────┤
│       │       │             │
│filter │ -200  │ all         │
├───────┼───────┼─────────────┤
│       │       │             │
│out    │ 100   │ output      │
├───────┼───────┼─────────────┤
│       │       │             │
│srcnat │ 300   │ postrouting │
└───────┴───────┴─────────────┘

Basic arithmetic expressions (addition and subtraction) can also be
used with these standard names to ease relative prioritizing, e.g.
mangle - 5 stands for -155. Values are also printed in this form as
long as they are no further than 10 from the standard value.

Base chains also allow setting the chain’s policy, i.e. what happens to
packets not explicitly accepted or refused in contained rules.
Supported policy values are accept (which is the default) or drop.

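The following ruleset file is a sketch of how base chains, regular
chains, priorities and policies fit together. All table and chain names
are hypothetical, and the rules are only placeholders.

    table inet demo {
        # regular chain: no hook, only reachable as a jump target
        chain tcp_services {
            tcp dport { 22, 443 } accept
        }

        # base chain that runs just before the standard mangle
        # priority: mangle - 5 is -155
        chain premangle {
            type filter hook prerouting priority mangle - 5; policy accept;
        }

        # base chain with a drop policy: packets not accepted by a rule
        # below are dropped
        chain in {
            type filter hook input priority filter; policy drop;
            ct state established,related accept
            iifname "lo" accept
            meta l4proto tcp jump tcp_services
        }
    }
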
RULES
{add | insert} rule [family] table chain [handle handle | index index] statement ... [comment comment]
replace rule [family] table chain handle handle statement ... [comment comment]
delete rule [family] table chain handle handle

Rules are added to chains in the given table. If the family is not
specified, the ip family is used. Rules are constructed from two kinds
of components according to a set of grammatical rules: expressions and
statements.

The add and insert commands support an optional location specifier,
which is either a handle or the index (starting at zero) of an existing
rule. Internally, rule locations are always identified by handle and
the translation from index happens in userspace. This has two potential
implications in case a concurrent ruleset change happens after the
translation was done: the effective rule index might change if a rule
was inserted or deleted before the referred one, and if the referred
rule was deleted, the command is rejected by the kernel just as if an
invalid handle was given.

A comment is a single word or a double-quoted (") multi-word string
which can be used to make notes regarding the actual rule. Note: If you
use bash for adding rules, you have to escape the quotation marks, e.g.
\"enable ssh for servers\".

add       Add a new rule described by the list of statements. The rule
          is appended to the given chain unless a location is
          specified, in which case the rule is inserted after the
          specified rule.

insert    Same as add except the rule is inserted at the beginning of
          the chain or before the specified rule.

replace   Similar to add, but the rule replaces the specified rule.

delete    Delete the specified rule.

Add a rule to the output chain of the ip filter table.

    nft add rule filter output ip daddr 192.168.0.0/24 accept # 'ip filter' is assumed
    # same command, slightly more verbose
    nft add rule ip filter output ip daddr 192.168.0.0/24 accept

Delete a rule from an inet table.

    # nft -a list ruleset
    table inet filter {
            chain input {
                    type filter hook input priority filter; policy accept;
                    ct state established,related accept # handle 4
                    ip saddr 10.1.1.1 tcp dport ssh accept # handle 5
      ...
    # delete the rule with handle 5
    nft delete rule inet filter input handle 5


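The location specifiers and the comment quoting described above could
be used as sketched here. The commands assume an inet filter table
whose input chain holds rules with handles 4 and 5, as in the listing
above before the delete; the matches themselves are placeholders.

    # no location: insert at the very beginning of the chain
    nft insert rule inet filter input iifname lo accept

    # add a rule right after the rule with handle 4
    nft add rule inet filter input handle 4 tcp dport 443 accept

    # replace the rule with handle 5
    nft replace rule inet filter input handle 5 ip saddr 10.1.1.2 tcp dport ssh accept

    # multi-word comment: the quotation marks are escaped for bash
    nft add rule inet filter input tcp dport 22 accept comment \"enable ssh for servers\"
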
SETS
nftables offers two kinds of set concepts. Anonymous sets are sets that
have no specific name. The set members are enclosed in curly braces,
with commas to separate elements, when creating the rule the set is
used in. Once that rule is removed, the set is removed as well. They
cannot be updated, i.e. once an anonymous set is declared it cannot be
changed anymore except by removing or altering the rule that uses the
anonymous set.

Using anonymous sets to accept particular subnets and ports.

    nft add rule filter input ip saddr { 10.0.0.0/8, 192.168.0.0/16 } tcp dport { 22, 443 } accept

Named sets are sets that need to be defined first before they can be
referenced in rules. Unlike anonymous sets, elements can be added to or
removed from a named set at any time. Sets are referenced from rules
using an @ prefixed to the set’s name.

Using named sets to accept addresses and ports.

    nft add rule filter input ip saddr @allowed_hosts tcp dport @allowed_ports accept

The sets allowed_hosts and allowed_ports need to be created first. The
next section describes nft set syntax in more detail.

add set [family] table set { type type | typeof expression ; [flags flags ;] [timeout timeout ;] [gc-interval gc-interval ;] [elements = { element[, ...] } ;] [size size ;] [comment comment ;] [policy policy ;] [auto-merge ;] }
{delete | list | flush} set [family] table set
list sets [family]
delete set [family] table handle handle
{add | delete} element [family] table set { element[, ...] }

Sets are element containers of a user-defined data type. They are
uniquely identified by a user-defined name and attached to tables.
Their behaviour can be tuned with the flags that can be specified at
set creation time.

add       Add a new set in the specified table. See the Set
          specification table below for more information about how to
          specify properties of a set.

delete    Delete the specified set.

list      Display the elements in the specified set.

flush     Remove all elements from the specified set.

Table 8. Set specifications
┌────────────┬──────────────────────┬─────────────────────┐
│Keyword     │ Description          │ Type                │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│type        │ data type of set     │ string: ipv4_addr,  │
│            │ elements             │ ipv6_addr,          │
│            │                      │ ether_addr,         │
│            │                      │ inet_proto,         │
│            │                      │ inet_service, mark  │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│typeof      │ data type of set     │ expression to       │
│            │ element              │ derive the data     │
│            │                      │ type from           │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│flags       │ set flags            │ string: constant,   │
│            │                      │ dynamic, interval,  │
│            │                      │ timeout             │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│timeout     │ time an element      │ string, decimal     │
│            │ stays in the set,    │ followed by unit.   │
│            │ mandatory if set is  │ Units are: d, h, m, │
│            │ added to from the    │ s                   │
│            │ packet path          │                     │
│            │ (ruleset)            │                     │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│gc-interval │ garbage collection   │ string, decimal     │
│            │ interval, only       │ followed by unit.   │
│            │ available when       │ Units are: d, h, m, │
│            │ timeout or flag      │ s                   │
│            │ timeout are active   │                     │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│elements    │ elements contained   │ set data type       │
│            │ by the set           │                     │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│size        │ maximum number of    │ unsigned integer    │
│            │ elements in the      │ (64 bit)            │
│            │ set, mandatory if    │                     │
│            │ set is added to      │                     │
│            │ from the packet      │                     │
│            │ path (ruleset)       │                     │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│policy      │ set policy           │ string: performance │
│            │                      │ [default], memory   │
├────────────┼──────────────────────┼─────────────────────┤
│            │                      │                     │
│auto-merge  │ automatic merge of   │                     │
│            │ adjacent/overlapping │                     │
│            │ set elements (only   │                     │
│            │ for interval sets)   │                     │
└────────────┴──────────────────────┴─────────────────────┘

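A ruleset file using the set keywords above might look like the
following sketch. The set names allowed_hosts and allowed_ports follow
the earlier examples; recent_clients and all element values are made up
for illustration.

    table ip filter {
        # interval set: elements may be prefixes or ranges
        set allowed_hosts {
            type ipv4_addr
            flags interval
            auto-merge
            elements = { 10.0.0.0/8, 192.168.0.0/16 }
        }

        set allowed_ports {
            type inet_service
            elements = { 22, 443 }
        }

        # set whose elements expire after one hour; the data type is
        # derived from the expression ip saddr
        set recent_clients {
            typeof ip saddr
            flags dynamic, timeout
            timeout 1h
            size 65535
        }

        chain input {
            type filter hook input priority filter; policy accept;
            ip saddr @allowed_hosts tcp dport @allowed_ports accept
        }
    }
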
MAPS
add map [family] table map { type type | typeof expression ; [flags flags ;] [elements = { element[, ...] } ;] [size size ;] [comment comment ;] [policy policy ;] }
{delete | list | flush} map [family] table map
list maps [family]

Maps store data based on some specific key used as input. They are
uniquely identified by a user-defined name and attached to tables.

add               Add a new map in the specified table.

delete            Delete the specified map.

list              Display the elements in the specified map.

flush             Remove all elements from the specified map.

add element       Comma-separated list of elements to add into the
                  specified map.

delete element    Comma-separated list of element keys to delete from
                  the specified map.

Table 9. Map specifications
┌─────────┬─────────────────────┬─────────────────────┐
│Keyword  │ Description         │ Type                │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│type     │ data type of map    │ string: ipv4_addr,  │
│         │ elements            │ ipv6_addr,          │
│         │                     │ ether_addr,         │
│         │                     │ inet_proto,         │
│         │                     │ inet_service, mark, │
│         │                     │ counter, quota.     │
│         │                     │ Counter and quota   │
│         │                     │ can’t be used as    │
│         │                     │ keys                │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│typeof   │ data type of map    │ expression to       │
│         │ element             │ derive the data     │
│         │                     │ type from           │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│flags    │ map flags           │ string: constant,   │
│         │                     │ interval            │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│elements │ elements contained  │ map data type       │
│         │ by the map          │                     │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│size     │ maximum number of   │ unsigned integer    │
│         │ elements in the map │ (64 bit)            │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│policy   │ map policy          │ string: performance │
│         │                     │ [default], memory   │
└─────────┴─────────────────────┴─────────────────────┘

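As a hedged example, the map below translates destination ports into
internal addresses for destination NAT. The table, map and chain names
as well as the addresses are assumptions made for this sketch.

    table ip nat {
        # key type : data type
        map port_to_host {
            type inet_service : ipv4_addr
            elements = { 80 : 192.168.1.10, 443 : 192.168.1.11 }
        }

        chain prerouting {
            type nat hook prerouting priority dstnat; policy accept;
            # look up the destination port, use the result as the new
            # destination address
            dnat to tcp dport map @port_to_host
        }
    }
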
ELEMENTS
{add | create | delete | get } element [family] table set { ELEMENT[, ...] }

ELEMENT := key_expression OPTIONS [: value_expression]
OPTIONS := [timeout TIMESPEC] [expires TIMESPEC] [comment string]
TIMESPEC := [numd][numh][numm][num[s]]

Element-related commands allow changing the contents of named sets and
maps. key_expression is typically a value matching the set type.
value_expression is not allowed in sets but mandatory when adding to
maps, where it matches the data part in its type definition. When
deleting from maps, it may be specified but is optional as
key_expression uniquely identifies the element.

The create command is similar to add with the exception that none of
the listed elements may already exist.

The get command is useful to check if an element is contained in a set,
which may be non-trivial in very large and/or interval sets. In the
latter case, the containing interval is returned instead of just the
element itself.

Table 10. Element options
┌────────┬───────────────────────────┐
│Option  │ Description               │
├────────┼───────────────────────────┤
│        │                           │
│timeout │ timeout value for         │
│        │ sets/maps with flag       │
│        │ timeout                   │
├────────┼───────────────────────────┤
│        │                           │
│expires │ the time until given      │
│        │ element expires, useful   │
│        │ for ruleset replication   │
│        │ only                      │
├────────┼───────────────────────────┤
│        │                           │
│comment │ per element comment field │
└────────┴───────────────────────────┘

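The element commands might be used as sketched below. The sketch
assumes that table ip filter already contains an interval set
allowed_hosts of type ipv4_addr and a set recent_clients with the
timeout flag, and that table ip nat contains a map port_to_host from
inet_service to ipv4_addr; these objects and all values are
hypothetical. Each command is passed in single quotes so that the shell
leaves braces and double quotes alone.

    # add an element to a set, then check whether an address is covered
    nft 'add element ip filter allowed_hosts { 172.16.0.0/12 }'
    nft 'get element ip filter allowed_hosts { 172.16.1.1 }'

    # element with its own timeout (TIMESPEC) and a comment
    nft 'add element ip filter recent_clients { 192.0.2.1 timeout 1h30m comment "scanner" }'

    # maps need key : value when adding, the key alone when deleting
    nft 'add element ip nat port_to_host { 8080 : 192.168.1.12 }'
    nft 'delete element ip nat port_to_host { 8080 }'
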
885 {add | create} flowtable [family] table flowtable { hook hook priority priority ; devices = { device[, ...] } ; }
886 list flowtables [family]
887 {delete | list} flowtable [family] table flowtable
888 delete flowtable [family] table handle handle
889
Flowtables allow you to accelerate packet forwarding in software.
Flowtable entries are represented by a tuple composed of the input
interface, source and destination address, source and destination port,
and layer 3/4 protocols. Each entry also caches the destination
interface and the gateway address (used to update the destination
link-layer address) to forward packets. The ttl and hoplimit fields are
also decremented. Hence, flowtables provide an alternative path that
allows packets to bypass the classic forwarding path. Flowtables reside
in the ingress hook, which is located before the prerouting hook. You
can select which flows you want to offload through the flow expression
from the forward chain. Flowtables are identified by their address
family and their name. The address family must be one of ip, ip6, or
inet. The inet address family is a dummy family which is used to create
hybrid IPv4/IPv6 tables. When no address family is specified, ip is used
by default.
905
The priority can be a signed integer or filter, which stands for 0.
Addition and subtraction can be used to set a relative priority, e.g.
filter + 5 equals 5.
909
910
add      Add a new flowtable for the given family with the given name.

delete   Delete the specified flowtable.

list     List all flowtables.
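
A minimal sketch of a flowtable declaration and a forward chain that
offloads flows into it, following the synopsis above. The table name x,
the flowtable name f and the devices eth0 and eth1 are assumptions for
illustration; substitute the interfaces present on your system.

    table inet x {
        flowtable f {
            hook ingress priority filter; devices = { eth0, eth1 };
        }

        chain forward {
            type filter hook forward priority filter; policy accept;
            # offload TCP and UDP flows seen in the forward chain
            meta l4proto { tcp, udp } flow add @f
        }
    }

Only flows that match the flow statement are offloaded. All other
traffic continues to take the classic forwarding path.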
919
920
LISTING
922 list { secmarks | synproxys | flow tables | meters | hooks } [family]
923 list { secmarks | synproxys | flow tables | meters | hooks } table [family] table
924 list ct { timeout | expectation | helper | helpers } table [family] table
925
926 Inspect configured objects. list hooks shows the full hook pipeline,
927 including those registered by kernel modules, such as nf_conntrack.
928
STATEFUL OBJECTS
930 {add | delete | list | reset} type [family] table object
931 delete type [family] table handle handle
932 list counters [family]
933 list quotas [family]
934 list limits [family]
935
Stateful objects are attached to tables and are identified by a unique
name. They group stateful information from rules. To reference them in
rules, the keywords "type name" are used, e.g. "counter name".
939
940
add      Add a new stateful object in the specified table.

delete   Delete the specified object.

list     Display stateful information the object holds.

reset    List-and-reset stateful object.
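
The four commands above can be sketched with a named counter as
follows. The table name filter and the object name http are assumptions
for this example.

    add table ip filter
    add counter ip filter http
    list counter ip filter http
    reset counter ip filter http
    delete counter ip filter http

The reset command prints the current values and then zeroes them, which
can be convenient for periodic accounting.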
953
954
955 CT HELPER
956 add ct helper [family] table name { type type protocol protocol ; [l3proto family ;] }
957 delete ct helper [family] table name
958 list ct helpers
959
960 Ct helper is used to define connection tracking helpers that can then
961 be used in combination with the ct helper set statement. type and
962 protocol are mandatory, l3proto is derived from the table family by
963 default, i.e. in the inet table the kernel will try to load both the
964 ipv4 and ipv6 helper backends, if they are supported by the kernel.
965
Table 11. Conntrack helper specifications
┌─────────┬─────────────────────┬─────────────────────┐
│Keyword  │ Description         │ Type                │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│type     │ name of helper type │ quoted string (e.g. │
│         │                     │ "ftp")              │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│protocol │ layer 4 protocol of │ string (e.g. tcp)   │
│         │ the helper          │                     │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│l3proto  │ layer 3 protocol of │ address family      │
│         │ the helper          │ (e.g. ip)           │
├─────────┼─────────────────────┼─────────────────────┤
│         │                     │                     │
│comment  │ per ct helper       │ string              │
│         │ comment field       │                     │
└─────────┴─────────────────────┴─────────────────────┘
986
Defining and assigning an ftp helper.

Unlike iptables, helper assignment needs to be performed after the conntrack
lookup has completed, for example with the default 0 hook priority.

table inet myhelpers {
    ct helper ftp-standard {
        type "ftp" protocol tcp
    }
    chain prerouting {
        type filter hook prerouting priority filter;
        tcp dport 21 ct helper set "ftp-standard"
    }
}
1001
1002
1003 CT TIMEOUT
1004 add ct timeout [family] table name { protocol protocol ; policy = { state: value [, ...] } ; [l3proto family ;] }
1005 delete ct timeout [family] table name
1006 list ct timeouts
1007
Ct timeout is used to update connection tracking timeout values.
Timeout policies are assigned with the ct timeout set statement.
protocol and policy are mandatory; l3proto is derived from the table
family by default.
1012
Table 12. Conntrack timeout specifications
┌─────────┬─────────────────────┬───────────────────┐
│Keyword  │ Description         │ Type              │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│protocol │ layer 4 protocol of │ string (e.g. tcp) │
│         │ the timeout object  │                   │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│state    │ connection state    │ string (e.g.      │
│         │ name                │ "established")    │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│value    │ timeout value for   │ unsigned integer  │
│         │ connection state    │                   │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│l3proto  │ layer 3 protocol of │ address family    │
│         │ the timeout object  │ (e.g. ip)         │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│comment  │ per ct timeout      │ string            │
│         │ comment field       │                   │
└─────────┴─────────────────────┴───────────────────┘
1037
1038 tcp connection state names that can have a specific timeout value are:
1039
1040 close, close_wait, established, fin_wait, last_ack, retrans, syn_recv,
1041 syn_sent, time_wait and unack.
1042
You can use sysctl -a | grep net.netfilter.nf_conntrack_tcp_timeout_ to
view the system-wide defaults, and sysctl -w to change them. ct timeout
allows for flow-specific settings, without changing the global timeouts.
1046
1047 For example, tcp port 53 could have much lower settings than other
1048 traffic.
1049
1050 udp state names that can have a specific timeout value are replied and
1051 unreplied.
1052
Defining and assigning a ct timeout policy.

table ip filter {
    ct timeout customtimeout {
        protocol tcp;
        l3proto ip
        policy = { established: 120, close: 20 }
    }

    chain output {
        type filter hook output priority filter; policy accept;
        ct timeout set "customtimeout"
    }
}

Testing the updated timeout policy.

% conntrack -E

It should display:

[UPDATE] tcp 6 120 ESTABLISHED src=172.16.19.128 dst=172.16.19.1
sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128
sport=41360 dport=22
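
The udp state names replied and unreplied mentioned above can be used in
the same way. The following is a sketch only: the object name dns-udp,
the port and the timeout values are assumptions chosen for illustration,
not recommended settings.

    table ip filter {
        ct timeout dns-udp {
            protocol udp;
            l3proto ip;
            policy = { unreplied: 5, replied: 30 }
        }

        chain output {
            type filter hook output priority filter; policy accept;
            # apply the shorter timeouts to outgoing DNS queries only
            udp dport 53 ct timeout set "dns-udp"
        }
    }

Restricting the statement with a match such as udp dport 53 keeps the
policy flow-specific, so other UDP traffic keeps the global timeouts.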
1077
1078
1079 CT EXPECTATION
add ct expectation [family] table name { protocol protocol ; dport dport ; timeout timeout ; size size ; [l3proto family ;] }
delete ct expectation [family] table name
list ct expectations
1083
1084 Ct expectation is used to create connection expectations. Expectations
1085 are assigned with the ct expectation set statement. protocol, dport,
1086 timeout and size are mandatory, l3proto is derived from the table
1087 family by default.
1088
Table 13. Conntrack expectation specifications
┌─────────┬─────────────────────┬───────────────────┐
│Keyword  │ Description         │ Type              │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│protocol │ layer 4 protocol of │ string (e.g. udp) │
│         │ the expectation     │                   │
│         │ object              │                   │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│dport    │ destination port of │ unsigned integer  │
│         │ expected connection │                   │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│timeout  │ timeout value for   │ unsigned integer  │
│         │ expectation         │                   │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│size     │ size value for      │ unsigned integer  │
│         │ expectation         │                   │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│l3proto  │ layer 3 protocol of │ address family    │
│         │ the expectation     │ (e.g. ip)         │
│         │ object              │                   │
├─────────┼─────────────────────┼───────────────────┤
│         │                     │                   │
│comment  │ per ct expectation  │ string            │
│         │ comment field       │                   │
└─────────┴─────────────────────┴───────────────────┘
1119
Defining and assigning a ct expectation policy.

table ip filter {
    ct expectation expect {
        protocol udp
        dport 9876
        timeout 2m
        size 8
        l3proto ip
    }

    chain input {
        type filter hook input priority filter; policy accept;
        ct expectation set "expect"
    }
}
1136
1137
1138 COUNTER
add counter [family] table name [{ [ packets packets bytes bytes ; ] [ comment comment ; ] }]
delete counter [family] table name
list counters
1142
1143 Table 14. Counter specifications
1144 ┌────────┬─────────────────────┬──────────────────┐
1145 │Keyword │ Description │ Type │
1146 ├────────┼─────────────────────┼──────────────────┤
1147 │ │ │ │
1148 │packets │ initial count of │ unsigned integer │
1149 │ │ packets │ (64 bit) │
1150 ├────────┼─────────────────────┼──────────────────┤
1151 │ │ │ │
1152 │bytes │ initial count of │ unsigned integer │
1153 │ │ bytes │ (64 bit) │
1154 ├────────┼─────────────────────┼──────────────────┤
1155 │ │ │ │
1156 │comment │ per counter comment │ string │
1157 │ │ field │ │
1158 └────────┴─────────────────────┴──────────────────┘
1159
1160 Using named counters.
1161
1162 nft add counter filter http
1163 nft add rule filter input tcp dport 80 counter name \"http\"
1164
1165 Using named counters with maps.
1166
1167 nft add counter filter http
1168 nft add counter filter https
1169 nft add rule filter input counter name tcp dport map { 80 : \"http\", 443 : \"https\" }
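
The same named counter can also be declared inside a ruleset file, using
the keywords from the counter specifications table. This is a sketch
under the assumption that the installed nft release supports per-object
comments; the comment text is illustrative.

    table ip filter {
        counter http {
            comment "port 80 traffic"
            packets 0 bytes 0
        }

        chain input {
            type filter hook input priority filter; policy accept;
            tcp dport 80 counter name "http"
        }
    }

Inside a file loaded with nft -f the quotes need no backslash, since no
shell is involved in parsing them.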
1170
1171
1172 QUOTA
1173 add quota [family] table name { [over|until] bytes BYTE_UNIT [ used bytes BYTE_UNIT ] ; [ comment comment ; ] }
1174 BYTE_UNIT := bytes | kbytes | mbytes
1175 delete quota [family] table name
1176 list quotas
1177
1178 Table 15. Quota specifications
1179 ┌────────┬───────────────────┬────────────────────┐
1180 │Keyword │ Description │ Type │
1181 ├────────┼───────────────────┼────────────────────┤
1182 │ │ │ │
1183 │quota │ quota limit, used │ Two arguments, │
1184 │ │ as the quota name │ unsigned integer │
1185 │ │ │ (64 bit) and │
1186 │ │ │ string: bytes, │
1187 │ │ │ kbytes, mbytes. │
1188 │ │ │ "over" and "until" │
1189 │ │ │ go before these │
1190 │ │ │ arguments │
1191 ├────────┼───────────────────┼────────────────────┤
1192 │ │ │ │
1193 │used │ initial value of │ Two arguments, │
1194 │ │ used quota │ unsigned integer │
1195 │ │ │ (64 bit) and │
1196 │ │ │ string: bytes, │
1197 │ │ │ kbytes, mbytes │
1198 ├────────┼───────────────────┼────────────────────┤
1199 │ │ │ │
1200 │comment │ per quota comment │ string │
1201 │ │ field │ │
1202 └────────┴───────────────────┴────────────────────┘
1203
1204 Using named quotas.
1205
1206 nft add quota filter user123 { over 20 mbytes }
1207 nft add rule filter input ip saddr 192.168.10.123 quota name \"user123\"
1208
1209 Using named quotas with maps.
1210
1211 nft add quota filter user123 { over 20 mbytes }
1212 nft add quota filter user124 { over 20 mbytes }
1213 nft add rule filter input quota name ip saddr map { 192.168.10.123 : \"user123\", 192.168.10.124 : \"user124\" }
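
A declarative sketch of a named quota that also sets the used keyword
from the quota specifications table. The starting value of 5 mbytes and
the drop verdict are assumptions added for illustration.

    table ip filter {
        quota user123 {
            over 20 mbytes used 5 mbytes
        }

        chain input {
            type filter hook input priority filter; policy accept;
            # with "over", the rule matches once the quota is exceeded
            ip saddr 192.168.10.123 quota name "user123" drop
        }
    }

Here the host starts with 5 mbytes already accounted for, so the rule
should begin dropping after roughly 15 further mbytes.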
1214
1215
EXPRESSIONS
1217 Expressions represent values, either constants like network addresses,
1218 port numbers, etc., or data gathered from the packet during ruleset
1219 evaluation. Expressions can be combined using binary, logical,
1220 relational and other types of expressions to form complex or relational
1221 (match) expressions. They are also used as arguments to certain types
1222 of operations, like NAT, packet marking etc.
1223
1224 Each expression has a data type, which determines the size, parsing and
1225 representation of symbolic values and type compatibility with other
1226 expressions.
1227
1228 DESCRIBE COMMAND
1229 describe expression | data type
1230
The describe command shows information about the type of an expression
and its data type. A data type may also be given, in which case nft will
display more information about the type.
1234
1235 The describe command.
1236
1237 $ nft describe tcp flags
1238 payload expression, datatype tcp_flag (TCP flag) (basetype bitmask, integer), 8 bits
1239
1240 predefined symbolic constants:
1241 fin 0x01
1242 syn 0x02
1243 rst 0x04
1244 psh 0x08
1245 ack 0x10
1246 urg 0x20
1247 ecn 0x40
1248 cwr 0x80
1249
1250
DATA TYPES
Data types determine the size, parsing and representation of symbolic
values and type compatibility of expressions. A number of global data
types exist. In addition, some expression types define further data
types specific to the expression type. Most data types have a fixed
size, but some may have a dynamic size, for instance the string type.
Some types also have predefined symbolic constants. Those can be listed
using the nft describe command:
1259
1260 $ nft describe ct_state
1261 datatype ct_state (conntrack state) (basetype bitmask, integer), 32 bits
1262
1263 pre-defined symbolic constants (in hexadecimal):
1264 invalid 0x00000001
1265 new ...
1266
Types may be derived from lower order types, for instance the IPv4
address type is derived from the integer type, meaning an IPv4 address
can also be specified as an integer value.
1270
1271 In certain contexts (set and map definitions), it is necessary to
1272 explicitly specify a data type. Each type has a name which is used for
1273 this.
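
As a sketch of where a type name must be given explicitly, the following
declares a set and a map. The table, set and map names and all addresses
are assumptions for illustration; inet_service is the type name for
transport-layer port numbers.

    table ip nat {
        set allowed_hosts {
            type ipv4_addr
            elements = { 192.0.2.1, 192.0.2.2 }
        }

        map port_to_host {
            # key type : value type
            type inet_service : ipv4_addr
            elements = { 80 : 192.0.2.10, 443 : 192.0.2.11 }
        }
    }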
1274
1275 INTEGER TYPE
1276 ┌────────┬─────────┬──────────┬───────────┐
1277 │Name │ Keyword │ Size │ Base type │
1278 ├────────┼─────────┼──────────┼───────────┤
1279 │ │ │ │ │
1280 │Integer │ integer │ variable │ - │
1281 └────────┴─────────┴──────────┴───────────┘
1282
The integer type is used for numeric values. It may be specified as a
decimal, hexadecimal or octal number. The integer type does not have a
fixed size; its size is determined by the expression for which it is
used.
1287
1288 BITMASK TYPE
1289 ┌────────┬─────────┬──────────┬───────────┐
1290 │Name │ Keyword │ Size │ Base type │
1291 ├────────┼─────────┼──────────┼───────────┤
1292 │ │ │ │ │
1293 │Bitmask │ bitmask │ variable │ integer │
1294 └────────┴─────────┴──────────┴───────────┘
1295
1296 The bitmask type (bitmask) is used for bitmasks.
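
A brief sketch of how bitmask values are typically matched, using the
TCP flag constants shown by nft describe tcp flags. The table and chain
names are assumptions for this example.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # SYN set and ACK clear
            tcp flags & (syn | ack) == syn counter
            # at least one of FIN or RST set
            tcp flags fin,rst counter
        }
    }

A comma-separated list of flags matches if any of the listed bits is
set, whereas the explicit mask-and-compare form tests an exact pattern.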
1297
1298 STRING TYPE
1299 ┌───────┬─────────┬──────────┬───────────┐
1300 │Name │ Keyword │ Size │ Base type │
1301 ├───────┼─────────┼──────────┼───────────┤
1302 │ │ │ │ │
1303 │String │ string │ variable │ - │
1304 └───────┴─────────┴──────────┴───────────┘
1305
1306 The string type is used for character strings. A string begins with an
1307 alphabetic character (a-zA-Z) followed by zero or more alphanumeric
1308 characters or the characters /, -, _ and .. In addition, anything
1309 enclosed in double quotes (") is recognized as a string.
1310
1311 String specification.
1312
1313 # Interface name
1314 filter input iifname eth0
1315
1316 # Weird interface name
1317 filter input iifname "(eth0)"
1318
1319
1320 LINK LAYER ADDRESS TYPE
1321 ┌───────────┬─────────┬──────────┬───────────┐
1322 │Name │ Keyword │ Size │ Base type │
1323 ├───────────┼─────────┼──────────┼───────────┤
1324 │ │ │ │ │
1325 │Link layer │ lladdr │ variable │ integer │
1326 │address │ │ │ │
1327 └───────────┴─────────┴──────────┴───────────┘
1328
The link layer address type is used for link layer addresses. Link
layer addresses are specified as a variable number of groups of two
hexadecimal digits separated using colons (:).
1332
1333 Link layer address specification.
1334
1335 # Ethernet destination MAC address
1336 filter input ether daddr 20:c9:d0:43:12:d9
1337
1338
1339 IPV4 ADDRESS TYPE
┌─────────────┬───────────┬────────┬───────────┐
│Name         │ Keyword   │ Size   │ Base type │
├─────────────┼───────────┼────────┼───────────┤
│             │           │        │           │
│IPv4 address │ ipv4_addr │ 32 bit │ integer   │
└─────────────┴───────────┴────────┴───────────┘
1346
1347 The IPv4 address type is used for IPv4 addresses. Addresses are
1348 specified in either dotted decimal, dotted hexadecimal, dotted octal,
1349 decimal, hexadecimal, octal notation or as a host name. A host name
1350 will be resolved using the standard system resolver.
1351
1352 IPv4 address specification.
1353
1354 # dotted decimal notation
1355 filter output ip daddr 127.0.0.1
1356
1357 # host name
1358 filter output ip daddr localhost
1359
1360
1361 IPV6 ADDRESS TYPE
1362 ┌─────────────┬───────────┬─────────┬───────────┐
1363 │Name │ Keyword │ Size │ Base type │
1364 ├─────────────┼───────────┼─────────┼───────────┤
1365 │ │ │ │ │
1366 │IPv6 address │ ipv6_addr │ 128 bit │ integer │
1367 └─────────────┴───────────┴─────────┴───────────┘
1368
1369 The IPv6 address type is used for IPv6 addresses. Addresses are
1370 specified as a host name or as hexadecimal halfwords separated by
1371 colons. Addresses might be enclosed in square brackets ("[]") to
1372 differentiate them from port numbers.
1373
1374 IPv6 address specification.
1375
1376 # abbreviated loopback address
1377 filter output ip6 daddr ::1
1378
1379 IPv6 address specification with bracket notation.
1380
1381 # without [] the port number (22) would be parsed as part of the
1382 # ipv6 address
1383 ip6 nat prerouting tcp dport 2222 dnat to [1ce::d0]:22
1384
1385
1386 BOOLEAN TYPE
1387 ┌────────┬─────────┬───────┬───────────┐
1388 │Name │ Keyword │ Size │ Base type │
1389 ├────────┼─────────┼───────┼───────────┤
1390 │ │ │ │ │
1391 │Boolean │ boolean │ 1 bit │ integer │
1392 └────────┴─────────┴───────┴───────────┘
1393
1394 The boolean type is a syntactical helper type in userspace. Its use is
1395 in the right-hand side of a (typically implicit) relational expression
1396 to change the expression on the left-hand side into a boolean check
1397 (usually for existence).
1398
1399 Table 16. The following keywords will automatically resolve into a
1400 boolean type with given value
1401 ┌────────┬───────┐
1402 │Keyword │ Value │
1403 ├────────┼───────┤
1404 │ │ │
1405 │exists │ 1 │
1406 ├────────┼───────┤
1407 │ │ │
1408 │missing │ 0 │
1409 └────────┴───────┘
1410
Table 17. Expressions that support a boolean comparison
1412 ┌───────────┬─────────────────────────┐
1413 │Expression │ Behaviour │
1414 ├───────────┼─────────────────────────┤
1415 │ │ │
1416 │fib │ Check route existence. │
1417 ├───────────┼─────────────────────────┤
1418 │ │ │
1419 │exthdr │ Check IPv6 extension │
1420 │ │ header existence. │
1421 ├───────────┼─────────────────────────┤
1422 │ │ │
1423 │tcp option │ Check TCP option header │
1424 │ │ existence. │
1425 └───────────┴─────────────────────────┘
1426
1427 Boolean specification.
1428
1429 # match if route exists
1430 filter input fib daddr . iif oif exists
1431
1432 # match only non-fragmented packets in IPv6 traffic
1433 filter input exthdr frag missing
1434
1435 # match if TCP timestamp option is present
1436 filter input tcp option timestamp exists
1437
1438
1439 ICMP TYPE TYPE
1440 ┌──────────┬───────────┬───────┬───────────┐
1441 │Name │ Keyword │ Size │ Base type │
1442 ├──────────┼───────────┼───────┼───────────┤
1443 │ │ │ │ │
1444 │ICMP Type │ icmp_type │ 8 bit │ integer │
1445 └──────────┴───────────┴───────┴───────────┘
1446
1447 The ICMP Type type is used to conveniently specify the ICMP header’s
1448 type field.
1449
1450 Table 18. Keywords may be used when specifying the ICMP type
1451 ┌────────────────────────┬───────┐
1452 │Keyword │ Value │
1453 ├────────────────────────┼───────┤
1454 │ │ │
1455 │echo-reply │ 0 │
1456 ├────────────────────────┼───────┤
1457 │ │ │
1458 │destination-unreachable │ 3 │
1459 ├────────────────────────┼───────┤
1460 │ │ │
1461 │source-quench │ 4 │
1462 ├────────────────────────┼───────┤
1463 │ │ │
1464 │redirect │ 5 │
1465 ├────────────────────────┼───────┤
1466 │ │ │
1467 │echo-request │ 8 │
1468 ├────────────────────────┼───────┤
1469 │ │ │
1470 │router-advertisement │ 9 │
1471 ├────────────────────────┼───────┤
1472 │ │ │
1473 │router-solicitation │ 10 │
1474 ├────────────────────────┼───────┤
1475 │ │ │
1476 │time-exceeded │ 11 │
1477 ├────────────────────────┼───────┤
1478 │ │ │
1479 │parameter-problem │ 12 │
1480 ├────────────────────────┼───────┤
1481 │ │ │
1482 │timestamp-request │ 13 │
1483 ├────────────────────────┼───────┤
1484 │ │ │
1485 │timestamp-reply │ 14 │
1486 ├────────────────────────┼───────┤
1487 │ │ │
1488 │info-request │ 15 │
1489 ├────────────────────────┼───────┤
1490 │ │ │
1491 │info-reply │ 16 │
1492 ├────────────────────────┼───────┤
1493 │ │ │
1494 │address-mask-request │ 17 │
1495 ├────────────────────────┼───────┤
1496 │ │ │
1497 │address-mask-reply │ 18 │
1498 └────────────────────────┴───────┘
1499
1500 ICMP Type specification.
1501
1502 # match ping packets
1503 filter output icmp type { echo-request, echo-reply }
1504
1505
1506 ICMP CODE TYPE
1507 ┌──────────┬───────────┬───────┬───────────┐
1508 │Name │ Keyword │ Size │ Base type │
1509 ├──────────┼───────────┼───────┼───────────┤
1510 │ │ │ │ │
1511 │ICMP Code │ icmp_code │ 8 bit │ integer │
1512 └──────────┴───────────┴───────┴───────────┘
1513
1514 The ICMP Code type is used to conveniently specify the ICMP header’s
1515 code field.
1516
1517 Table 19. Keywords may be used when specifying the ICMP code
1518 ┌─────────────────┬───────┐
1519 │Keyword │ Value │
1520 ├─────────────────┼───────┤
1521 │ │ │
1522 │net-unreachable │ 0 │
1523 ├─────────────────┼───────┤
1524 │ │ │
1525 │host-unreachable │ 1 │
1526 ├─────────────────┼───────┤
1527 │ │ │
1528 │prot-unreachable │ 2 │
1529 ├─────────────────┼───────┤
1530 │ │ │
1531 │port-unreachable │ 3 │
1532 ├─────────────────┼───────┤
1533 │ │ │
1534 │frag-needed │ 4 │
1535 ├─────────────────┼───────┤
1536 │ │ │
1537 │net-prohibited │ 9 │
1538 ├─────────────────┼───────┤
1539 │ │ │
1540 │host-prohibited │ 10 │
1541 ├─────────────────┼───────┤
1542 │ │ │
1543 │admin-prohibited │ 13 │
1544 └─────────────────┴───────┘
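
ICMP Code specification, as a sketch combining a type keyword with a
code keyword from the table above. The table and chain names are
assumptions for this example.

    table ip filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # count incoming "port unreachable" errors
            icmp type destination-unreachable icmp code port-unreachable counter
        }
    }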
1545
1546 ICMPV6 TYPE TYPE
┌────────────┬─────────────┬───────┬───────────┐
│Name        │ Keyword     │ Size  │ Base type │
├────────────┼─────────────┼───────┼───────────┤
│            │             │       │           │
│ICMPv6 Type │ icmpv6_type │ 8 bit │ integer   │
└────────────┴─────────────┴───────┴───────────┘
1553
1554 The ICMPv6 Type type is used to conveniently specify the ICMPv6
1555 header’s type field.
1556
1557 Table 20. keywords may be used when specifying the ICMPv6 type:
1558 ┌────────────────────────┬───────┐
1559 │Keyword │ Value │
1560 ├────────────────────────┼───────┤
1561 │ │ │
1562 │destination-unreachable │ 1 │
1563 ├────────────────────────┼───────┤
1564 │ │ │
1565 │packet-too-big │ 2 │
1566 ├────────────────────────┼───────┤
1567 │ │ │
1568 │time-exceeded │ 3 │
1569 ├────────────────────────┼───────┤
1570 │ │ │
1571 │parameter-problem │ 4 │
1572 ├────────────────────────┼───────┤
1573 │ │ │
1574 │echo-request │ 128 │
1575 ├────────────────────────┼───────┤
1576 │ │ │
1577 │echo-reply │ 129 │
1578 ├────────────────────────┼───────┤
1579 │ │ │
1580 │mld-listener-query │ 130 │
1581 ├────────────────────────┼───────┤
1582 │ │ │
1583 │mld-listener-report │ 131 │
1584 ├────────────────────────┼───────┤
1585 │ │ │
1586 │mld-listener-done │ 132 │
1587 ├────────────────────────┼───────┤
1588 │ │ │
1589 │mld-listener-reduction │ 132 │
1590 ├────────────────────────┼───────┤
1591 │ │ │
1592 │nd-router-solicit │ 133 │
1593 ├────────────────────────┼───────┤
1594 │ │ │
1595 │nd-router-advert │ 134 │
1596 ├────────────────────────┼───────┤
1597 │ │ │
1598 │nd-neighbor-solicit │ 135 │
1599 ├────────────────────────┼───────┤
1600 │ │ │
1601 │nd-neighbor-advert │ 136 │
1602 ├────────────────────────┼───────┤
1603 │ │ │
1604 │nd-redirect │ 137 │
1605 ├────────────────────────┼───────┤
1606 │ │ │
1607 │router-renumbering │ 138 │
1608 ├────────────────────────┼───────┤
1609 │ │ │
1610 │ind-neighbor-solicit │ 141 │
1611 ├────────────────────────┼───────┤
1612 │ │ │
1613 │ind-neighbor-advert │ 142 │
1614 ├────────────────────────┼───────┤
1615 │ │ │
1616 │mld2-listener-report │ 143 │
1617 └────────────────────────┴───────┘
1618
1619 ICMPv6 Type specification.
1620
1621 # match ICMPv6 ping packets
1622 filter output icmpv6 type { echo-request, echo-reply }
1623
1624
1625 ICMPV6 CODE TYPE
1626 ┌────────────┬─────────────┬───────┬───────────┐
1627 │Name │ Keyword │ Size │ Base type │
1628 ├────────────┼─────────────┼───────┼───────────┤
1629 │ │ │ │ │
1630 │ICMPv6 Code │ icmpv6_code │ 8 bit │ integer │
1631 └────────────┴─────────────┴───────┴───────────┘
1632
1633 The ICMPv6 Code type is used to conveniently specify the ICMPv6
1634 header’s code field.
1635
1636 Table 21. keywords may be used when specifying the ICMPv6 code
1637 ┌─────────────────┬───────┐
1638 │Keyword │ Value │
1639 ├─────────────────┼───────┤
1640 │ │ │
1641 │no-route │ 0 │
1642 ├─────────────────┼───────┤
1643 │ │ │
1644 │admin-prohibited │ 1 │
1645 ├─────────────────┼───────┤
1646 │ │ │
1647 │addr-unreachable │ 3 │
1648 ├─────────────────┼───────┤
1649 │ │ │
1650 │port-unreachable │ 4 │
1651 ├─────────────────┼───────┤
1652 │ │ │
1653 │policy-fail │ 5 │
1654 ├─────────────────┼───────┤
1655 │ │ │
1656 │reject-route │ 6 │
1657 └─────────────────┴───────┘
1658
1659 ICMPVX CODE TYPE
┌────────────┬─────────────┬───────┬───────────┐
│Name        │ Keyword     │ Size  │ Base type │
├────────────┼─────────────┼───────┼───────────┤
│            │             │       │           │
│ICMPvX Code │ icmpx_code  │ 8 bit │ integer   │
└────────────┴─────────────┴───────┴───────────┘
1666
1667 The ICMPvX Code type abstraction is a set of values which overlap
1668 between ICMP and ICMPv6 Code types to be used from the inet family.
1669
1670 Table 22. keywords may be used when specifying the ICMPvX code
1671 ┌─────────────────┬───────┐
1672 │Keyword │ Value │
1673 ├─────────────────┼───────┤
1674 │ │ │
1675 │no-route │ 0 │
1676 ├─────────────────┼───────┤
1677 │ │ │
1678 │port-unreachable │ 1 │
1679 ├─────────────────┼───────┤
1680 │ │ │
1681 │host-unreachable │ 2 │
1682 ├─────────────────┼───────┤
1683 │ │ │
1684 │admin-prohibited │ 3 │
1685 └─────────────────┴───────┘
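
ICMPvX Code specification, sketched with the reject statement in an inet
table so that one rule covers both IPv4 and IPv6. The table name, chain
name and port are assumptions; newer nft releases may list this rule
without the type keyword.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            # answer with the matching ICMP or ICMPv6 error
            tcp dport 23 reject with icmpx type admin-prohibited
        }
    }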
1686
1687 CONNTRACK TYPES
1688 Table 23. overview of types used in ct expression and statement
1689 ┌─────────────────┬───────────┬─────────┬───────────┐
1690 │Name │ Keyword │ Size │ Base type │
1691 ├─────────────────┼───────────┼─────────┼───────────┤
1692 │ │ │ │ │
1693 │conntrack state │ ct_state │ 4 byte │ bitmask │
1694 ├─────────────────┼───────────┼─────────┼───────────┤
1695 │ │ │ │ │
1696 │conntrack │ ct_dir │ 8 bit │ integer │
1697 │direction │ │ │ │
1698 ├─────────────────┼───────────┼─────────┼───────────┤
1699 │ │ │ │ │
1700 │conntrack status │ ct_status │ 4 byte │ bitmask │
1701 ├─────────────────┼───────────┼─────────┼───────────┤
1702 │ │ │ │ │
1703 │conntrack event │ ct_event │ 4 byte │ bitmask │
1704 │bits │ │ │ │
1705 ├─────────────────┼───────────┼─────────┼───────────┤
1706 │ │ │ │ │
1707 │conntrack label │ ct_label │ 128 bit │ bitmask │
1708 └─────────────────┴───────────┴─────────┴───────────┘
1709
1710 For each of the types above, keywords are available for convenience:
1711
1712 Table 24. conntrack state (ct_state)
1713 ┌────────────┬───────┐
1714 │Keyword │ Value │
1715 ├────────────┼───────┤
1716 │ │ │
1717 │invalid │ 1 │
1718 ├────────────┼───────┤
1719 │ │ │
1720 │established │ 2 │
1721 ├────────────┼───────┤
1722 │ │ │
1723 │related │ 4 │
1724 ├────────────┼───────┤
1725 │ │ │
1726 │new │ 8 │
1727 ├────────────┼───────┤
1728 │ │ │
1729 │untracked │ 64 │
1730 └────────────┴───────┘
1731
1732 Table 25. conntrack direction (ct_dir)
1733 ┌─────────┬───────┐
1734 │Keyword │ Value │
1735 ├─────────┼───────┤
1736 │ │ │
1737 │original │ 0 │
1738 ├─────────┼───────┤
1739 │ │ │
1740 │reply │ 1 │
1741 └─────────┴───────┘
1742
1743 Table 26. conntrack status (ct_status)
1744 ┌───────────┬───────┐
1745 │Keyword │ Value │
1746 ├───────────┼───────┤
1747 │ │ │
1748 │expected │ 1 │
1749 ├───────────┼───────┤
1750 │ │ │
1751 │seen-reply │ 2 │
1752 ├───────────┼───────┤
1753 │ │ │
1754 │assured │ 4 │
1755 ├───────────┼───────┤
1756 │ │ │
1757 │confirmed │ 8 │
1758 ├───────────┼───────┤
1759 │ │ │
1760 │snat │ 16 │
1761 ├───────────┼───────┤
1762 │ │ │
1763 │dnat │ 32 │
1764 ├───────────┼───────┤
1765 │ │ │
1766 │dying │ 512 │
1767 └───────────┴───────┘
1768
1769 Table 27. conntrack event bits (ct_event)
1770 ┌──────────┬───────┐
1771 │Keyword │ Value │
1772 ├──────────┼───────┤
1773 │ │ │
1774 │new │ 1 │
1775 ├──────────┼───────┤
1776 │ │ │
1777 │related │ 2 │
1778 ├──────────┼───────┤
1779 │ │ │
1780 │destroy │ 4 │
1781 ├──────────┼───────┤
1782 │ │ │
1783 │reply │ 8 │
1784 ├──────────┼───────┤
1785 │ │ │
1786 │assured │ 16 │
1787 ├──────────┼───────┤
1788 │ │ │
1789 │protoinfo │ 32 │
1790 ├──────────┼───────┤
1791 │ │ │
1792 │helper │ 64 │
1793 ├──────────┼───────┤
1794 │ │ │
1795 │mark │ 128 │
1796 ├──────────┼───────┤
1797 │ │ │
1798 │seqadj │ 256 │
1799 ├──────────┼───────┤
1800 │ │ │
1801 │secmark │ 512 │
1802 ├──────────┼───────┤
1803 │ │ │
1804 │label │ 1024 │
1805 └──────────┴───────┘
1806
1807 Possible keywords for conntrack label type (ct_label) are read at
1808 runtime from /etc/connlabel.conf.
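
The conntrack keywords above are typically used with the ct expression,
as in this sketch. The table and chain names and the chosen verdicts are
assumptions for illustration, not a recommended policy.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            ct state invalid drop
            # either of the two states matches
            ct state established,related accept
            # connections that were destination-NATed
            ct status dnat counter
        }
    }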
1809
1810 DCCP PKTTYPE TYPE
1811 ┌─────────────────┬──────────────┬───────┬───────────┐
1812 │Name │ Keyword │ Size │ Base type │
1813 ├─────────────────┼──────────────┼───────┼───────────┤
1814 │ │ │ │ │
1815 │DCCP packet type │ dccp_pkttype │ 4 bit │ integer │
1816 └─────────────────┴──────────────┴───────┴───────────┘
1817
1818 The DCCP packet type abstracts the different legal values of the
1819 respective four bit field in the DCCP header, as stated by RFC4340.
1820 Note that possible values 10-15 are considered reserved and therefore
1821 not allowed to be used. In iptables' dccp match, these values are
1822 aliased INVALID. With nftables, one may simply match on the numeric
1823 value range, i.e. 10-15.
1824
1825 Table 28. keywords may be used when specifying the DCCP packet type
1826 ┌─────────┬───────┐
1827 │Keyword │ Value │
1828 ├─────────┼───────┤
1829 │ │ │
1830 │request │ 0 │
1831 ├─────────┼───────┤
1832 │ │ │
1833 │response │ 1 │
1834 ├─────────┼───────┤
1835 │ │ │
1836 │data │ 2 │
1837 ├─────────┼───────┤
1838 │ │ │
1839 │ack │ 3 │
1840 ├─────────┼───────┤
1841 │ │ │
1842 │dataack │ 4 │
1843 ├─────────┼───────┤
1844 │ │ │
1845 │closereq │ 5 │
1846 ├─────────┼───────┤
1847 │ │ │
1848 │close │ 6 │
1849 ├─────────┼───────┤
1850 │ │ │
1851 │reset │ 7 │
1852 ├─────────┼───────┤
1853 │ │ │
1854 │sync │ 8 │
1855 ├─────────┼───────┤
1856 │ │ │
1857 │syncack │ 9 │
1858 └─────────┴───────┘
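
DCCP packet type specification, as a sketch covering both the keywords
and the reserved numeric range mentioned above. The table and chain
names are assumptions for this example.

    table inet filter {
        chain input {
            type filter hook input priority filter; policy accept;
            dccp type { request, response } counter
            # reserved values, aliased INVALID in iptables
            dccp type 10-15 drop
        }
    }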
1859
PRIMARY EXPRESSIONS
1861 The lowest order expression is a primary expression, representing
1862 either a constant or a single datum from a packet’s payload, meta data
1863 or a stateful module.
1864
1865 META EXPRESSIONS
1866 meta {length | nfproto | l4proto | protocol | priority}
1867 [meta] {mark | iif | iifname | iiftype | oif | oifname | oiftype | skuid | skgid | nftrace | rtclassid | ibrname | obrname | pkttype | cpu | iifgroup | oifgroup | cgroup | random | ipsec | iifkind | oifkind | time | hour | day }
1868
1869 A meta expression refers to meta data associated with a packet.
1870
There are two types of meta expressions: unqualified and qualified meta
expressions. Qualified meta expressions require the meta keyword before
the meta key. Unqualified meta expressions can be specified by using
the meta key directly or as qualified meta expressions. Meta l4proto is
useful to match a particular transport protocol that is part of either
an IPv4 or IPv6 packet. It will also skip any IPv6 extension headers
present in an IPv6 packet.
1878
1879 meta iif, oif, iifname and oifname are used to match the interface a
1880 packet arrived on or is about to be sent out on.
1881
iif and oif are used to match on the interface index, whereas iifname
and oifname are used to match on the interface name. These are not
equivalent. Consider the rule

filter input meta iif "foo"

This rule can only be added if the interface "foo" exists. Also, the
rule will continue to match even if the interface "foo" is renamed to
"bar".
1891
This is because the interface index is used internally. For dynamically
created interfaces, such as tun/tap or dialup interfaces (ppp for
example), it might be better to use iifname or oifname instead.

In these cases the name is used, so the interface does not have to
exist when the rule is added. The rule stops matching if the interface
is renamed, and it matches again if the interface is deleted and a new
interface with the same name is created later.
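
As an illustration of the difference, the following sketch assumes a
table named filter with an input chain and a dialup interface named
ppp0. These names are examples only.

# resolved to an interface index when the rule is added, so ppp0 must exist
filter input iif "ppp0" accept

# compared by name for each packet, so ppp0 may be created later
filter input iifname "ppp0" accept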
1901
As with iptables, wildcard matching on interface name prefixes is
available for iifname and oifname matches by appending an asterisk (*)
character. Unlike iptables, however, nftables does not accept an
interface name consisting of the wildcard character only. Such an
expression would always match, so it should simply be omitted. To match
a literal asterisk character, escape it with a backslash (\).
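
For example, assuming a filter table with an input chain (the names are
illustrative), a prefix match and a literal match could be written as
follows.

# match every input interface whose name starts with "eth"
filter input iifname "eth*" accept

# match only an interface literally named "eth*"
filter input iifname "eth\*" accept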
1909
1910 Table 29. Meta expression types
1911 ┌──────────┬─────────────────────┬─────────────────────┐
1912 │Keyword │ Description │ Type │
1913 ├──────────┼─────────────────────┼─────────────────────┤
1914 │ │ │ │
│length │ Length of the │ integer (32 bit) │
1916 │ │ packet in bytes │ │
1917 ├──────────┼─────────────────────┼─────────────────────┤
1918 │ │ │ │
1919 │nfproto │ real hook protocol │ integer (32 bit) │
1920 │ │ family, useful only │ │
1921 │ │ in inet table │ │
1922 ├──────────┼─────────────────────┼─────────────────────┤
1923 │ │ │ │
1924 │l4proto │ layer 4 protocol, │ integer (8 bit) │
1925 │ │ skips ipv6 │ │
1926 │ │ extension headers │ │
1927 ├──────────┼─────────────────────┼─────────────────────┤
1928 │ │ │ │
1929 │protocol │ EtherType protocol │ ether_type │
1930 │ │ value │ │
1931 ├──────────┼─────────────────────┼─────────────────────┤
1932 │ │ │ │
1933 │priority │ TC packet priority │ tc_handle │
1934 ├──────────┼─────────────────────┼─────────────────────┤
1935 │ │ │ │
1936 │mark │ Packet mark │ mark │
1937 ├──────────┼─────────────────────┼─────────────────────┤
1938 │ │ │ │
1939 │iif │ Input interface │ iface_index │
1940 │ │ index │ │
1941 ├──────────┼─────────────────────┼─────────────────────┤
1942 │ │ │ │
1943 │iifname │ Input interface │ ifname │
1944 │ │ name │ │
1945 ├──────────┼─────────────────────┼─────────────────────┤
1946 │ │ │ │
1947 │iiftype │ Input interface │ iface_type │
1948 │ │ type │ │
1949 ├──────────┼─────────────────────┼─────────────────────┤
1950 │ │ │ │
1951 │oif │ Output interface │ iface_index │
1952 │ │ index │ │
1953 ├──────────┼─────────────────────┼─────────────────────┤
1954 │ │ │ │
1955 │oifname │ Output interface │ ifname │
1956 │ │ name │ │
1957 ├──────────┼─────────────────────┼─────────────────────┤
1958 │ │ │ │
1959 │oiftype │ Output interface │ iface_type │
1960 │ │ hardware type │ │
1961 ├──────────┼─────────────────────┼─────────────────────┤
1962 │ │ │ │
1963 │sdif │ Slave device input │ iface_index │
1964 │ │ interface index │ │
1965 ├──────────┼─────────────────────┼─────────────────────┤
1966 │ │ │ │
1967 │sdifname │ Slave device │ ifname │
1968 │ │ interface name │ │
1969 ├──────────┼─────────────────────┼─────────────────────┤
1970 │ │ │ │
1971 │skuid │ UID associated with │ uid │
1972 │ │ originating socket │ │
1973 ├──────────┼─────────────────────┼─────────────────────┤
1974 │ │ │ │
1975 │skgid │ GID associated with │ gid │
1976 │ │ originating socket │ │
1977 ├──────────┼─────────────────────┼─────────────────────┤
1978 │ │ │ │
1979 │rtclassid │ Routing realm │ realm │
1980 ├──────────┼─────────────────────┼─────────────────────┤
1981 │ │ │ │
1982 │ibrname │ Input bridge │ ifname │
1983 │ │ interface name │ │
1984 ├──────────┼─────────────────────┼─────────────────────┤
1985 │ │ │ │
1986 │obrname │ Output bridge │ ifname │
1987 │ │ interface name │ │
1988 ├──────────┼─────────────────────┼─────────────────────┤
1989 │ │ │ │
1990 │pkttype │ packet type │ pkt_type │
1991 ├──────────┼─────────────────────┼─────────────────────┤
1992 │ │ │ │
1993 │cpu │ cpu number │ integer (32 bit) │
1994 │ │ processing the │ │
1995 │ │ packet │ │
1996 ├──────────┼─────────────────────┼─────────────────────┤
1997 │ │ │ │
1998 │iifgroup │ incoming device │ devgroup │
1999 │ │ group │ │
2000 ├──────────┼─────────────────────┼─────────────────────┤
2001 │ │ │ │
2002 │oifgroup │ outgoing device │ devgroup │
2003 │ │ group │ │
2004 ├──────────┼─────────────────────┼─────────────────────┤
2005 │ │ │ │
2006 │cgroup │ control group id │ integer (32 bit) │
2007 ├──────────┼─────────────────────┼─────────────────────┤
2008 │ │ │ │
2009 │random │ pseudo-random │ integer (32 bit) │
2010 │ │ number │ │
2011 ├──────────┼─────────────────────┼─────────────────────┤
2012 │ │ │ │
2013 │ipsec │ true if packet was │ boolean (1 bit) │
2014 │ │ ipsec encrypted │ │
2015 ├──────────┼─────────────────────┼─────────────────────┤
2016 │ │ │ │
│iifkind │ Input interface │ ifkind │
2018 │ │ kind │ │
2019 ├──────────┼─────────────────────┼─────────────────────┤
2020 │ │ │ │
│oifkind │ Output interface │ ifkind │
2022 │ │ kind │ │
2023 ├──────────┼─────────────────────┼─────────────────────┤
2024 │ │ │ │
2025 │time │ Absolute time of │ Integer (32 bit) or │
2026 │ │ packet reception │ string │
2027 ├──────────┼─────────────────────┼─────────────────────┤
2028 │ │ │ │
2029 │day │ Day of week │ Integer (8 bit) or │
2030 │ │ │ string │
2031 ├──────────┼─────────────────────┼─────────────────────┤
2032 │ │ │ │
2033 │hour │ Hour of day │ String │
2034 └──────────┴─────────────────────┴─────────────────────┘
2035
2036 Table 30. Meta expression specific types
2037 ┌──────────────┬────────────────────────────┐
2038 │Type │ Description │
2039 ├──────────────┼────────────────────────────┤
2040 │ │ │
2041 │iface_index │ Interface index (32 bit │
2042 │ │ number). Can be specified │
2043 │ │ numerically or as name of │
2044 │ │ an existing interface. │
2045 ├──────────────┼────────────────────────────┤
2046 │ │ │
2047 │ifname │ Interface name (16 byte │
2048 │ │ string). Does not have to │
2049 │ │ exist. │
2050 ├──────────────┼────────────────────────────┤
2051 │ │ │
2052 │iface_type │ Interface type (16 bit │
2053 │ │ number). │
2054 ├──────────────┼────────────────────────────┤
2055 │ │ │
2056 │uid │ User ID (32 bit number). │
2057 │ │ Can be specified │
2058 │ │ numerically or as user │
2059 │ │ name. │
2060 ├──────────────┼────────────────────────────┤
2061 │ │ │
2062 │gid │ Group ID (32 bit number). │
2063 │ │ Can be specified │
2064 │ │ numerically or as group │
2065 │ │ name. │
2066 ├──────────────┼────────────────────────────┤
2067 │ │ │
2068 │realm │ Routing Realm (32 bit │
2069 │ │ number). Can be specified │
2070 │ │ numerically or as symbolic │
2071 │ │ name defined in │
2072 │ │ /etc/iproute2/rt_realms. │
2073 ├──────────────┼────────────────────────────┤
2074 │ │ │
│devgroup │ Device group (32 bit │
2076 │ │ number). Can be specified │
2077 │ │ numerically or as symbolic │
2078 │ │ name defined in │
2079 │ │ /etc/iproute2/group. │
2080 ├──────────────┼────────────────────────────┤
2081 │ │ │
2082 │pkt_type │ Packet type: host │
2083 │ │ (addressed to local host), │
2084 │ │ broadcast (to all), │
2085 │ │ multicast (to group), │
2086 │ │ other (addressed to │
2087 │ │ another host). │
2088 ├──────────────┼────────────────────────────┤
2089 │ │ │
2090 │ifkind │ Interface kind (16 byte │
2091 │ │ string). See TYPES in │
2092 │ │ ip-link(8) for a list. │
2093 ├──────────────┼────────────────────────────┤
2094 │ │ │
2095 │time │ Either an integer or a │
2096 │ │ date in ISO format. For │
2097 │ │ example: "2019-06-06 │
2098 │ │ 17:00". Hour and seconds │
2099 │ │ are optional and can be │
2100 │ │ omitted if desired. If │
2101 │ │ omitted, midnight will be │
2102 │ │ assumed. The following │
2103 │ │ three would be equivalent: │
2104 │ │ "2019-06-06", "2019-06-06 │
2105 │ │ 00:00" and "2019-06-06 │
2106 │ │ 00:00:00". When an integer │
2107 │ │ is given, it is assumed to │
2108 │ │ be a UNIX timestamp. │
2109 ├──────────────┼────────────────────────────┤
2110 │ │ │
2111 │day │ Either a day of week │
2112 │ │ ("Monday", "Tuesday", │
2113 │ │ etc.), or an integer │
2114 │ │ between 0 and 6. Strings │
2115 │ │ are matched │
2116 │ │ case-insensitively, and a │
2117 │ │ full match is not expected │
2118 │ │ (e.g. "Mon" would match │
2119 │ │ "Monday"). When an integer │
2120 │ │ is given, 0 is Sunday and │
2121 │ │ 6 is Saturday. │
2122 ├──────────────┼────────────────────────────┤
2123 │ │ │
2124 │hour │ A string representing an │
2125 │ │ hour in 24-hour format. │
2126 │ │ Seconds can optionally be │
2127 │ │ specified. For example, │
2128 │ │ 17:00 and 17:00:00 would │
2129 │ │ be equivalent. │
2130 └──────────────┴────────────────────────────┘
2131
2132 Using meta expressions.
2133
2134 # qualified meta expression
2135 filter output meta oif eth0
2136 filter forward meta iifkind { "tun", "veth" }
2137
2138 # unqualified meta expression
2139 filter output oif eth0
2140
2141 # incoming packet was subject to ipsec processing
2142 raw prerouting meta ipsec exists accept
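
The time-related keys described in the tables above might be used as
sketched below. The filter table, the input chain and the chosen dates
are assumptions for illustration.

# drop packets received on a Saturday
filter input meta day "Saturday" drop

# accept packets received between 09:00 and 17:00
filter input meta hour "09:00"-"17:00" accept

# drop packets received within a fixed date range
filter input meta time "2019-06-06 17:00"-"2019-06-07 17:00" drop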
2143
2144
2145 SOCKET EXPRESSION
2146 socket {transparent | mark | wildcard}
2147 socket cgroupv2 level NUM
2148
The socket expression can be used to search for an existing open
TCP/UDP socket that can be associated with a packet, and to match on
its attributes. It looks for an established or non-zero bound listening
socket (possibly with a non-local address). You can also use it to
match on the socket cgroupv2 at a given ancestor level, e.g. if the
socket belongs to cgroupv2 a/b, ancestor level 1 checks for a match on
cgroup a and ancestor level 2 checks for a match on cgroup b.
2156
2157 Table 31. Available socket attributes
2158 ┌────────────┬─────────────────────┬─────────────────┐
2159 │Name │ Description │ Type │
2160 ├────────────┼─────────────────────┼─────────────────┤
2161 │ │ │ │
2162 │transparent │ Value of the │ boolean (1 bit) │
2163 │ │ IP_TRANSPARENT │ │
2164 │ │ socket option in │ │
2165 │ │ the found socket. │ │
2166 │ │ It can be 0 or 1. │ │
2167 ├────────────┼─────────────────────┼─────────────────┤
2168 │ │ │ │
2169 │mark │ Value of the socket │ mark │
2170 │ │ mark (SOL_SOCKET, │ │
2171 │ │ SO_MARK). │ │
2172 ├────────────┼─────────────────────┼─────────────────┤
2173 │ │ │ │
2174 │wildcard │ Indicates whether │ boolean (1 bit) │
2175 │ │ the socket is │ │
2176 │ │ wildcard-bound │ │
2177 │ │ (e.g. 0.0.0.0 or │ │
2178 │ │ ::0). │ │
2179 ├────────────┼─────────────────────┼─────────────────┤
2180 │ │ │ │
2181 │cgroupv2 │ cgroup version 2 │ cgroupv2 │
2182 │ │ for this socket │ │
2183 │ │ (path from │ │
2184 │ │ /sys/fs/cgroup) │ │
2185 └────────────┴─────────────────────┴─────────────────┘
2186
2187 Using socket expression.
2188
2189 # Mark packets that correspond to a transparent socket. "socket wildcard 0"
2190 # means that zero-bound listener sockets are NOT matched (which is usually
2191 # exactly what you want).
2192 table inet x {
2193 chain y {
2194 type filter hook prerouting priority mangle; policy accept;
2195 socket transparent 1 socket wildcard 0 mark set 0x00000001 accept
2196 }
2197 }
2198
# Trace packets that correspond to a socket with a mark value of 15
2200 table inet x {
2201 chain y {
2202 type filter hook prerouting priority mangle; policy accept;
2203 socket mark 0x0000000f nftrace set 1
2204 }
2205 }
2206
2207 # Set packet mark to socket mark
2208 table inet x {
2209 chain y {
2210 type filter hook prerouting priority mangle; policy accept;
2211 tcp dport 8080 mark set socket mark
2212 }
2213 }
2214
2215 # Count packets for cgroupv2 "user.slice" at level 1
2216 table inet x {
2217 chain y {
2218 type filter hook input priority filter; policy accept;
2219 socket cgroupv2 level 1 "user.slice" counter
2220 }
2221 }
2222
2223
2224 OSF EXPRESSION
2225 osf [ttl {loose | skip}] {name | version}
2226
2227 The osf expression does passive operating system fingerprinting. This
2228 expression compares some data (Window Size, MSS, options and their
2229 order, DF, and others) from packets with the SYN bit set.
2230
2231 Table 32. Available osf attributes
2232 ┌────────┬─────────────────────┬────────┐
2233 │Name │ Description │ Type │
2234 ├────────┼─────────────────────┼────────┤
2235 │ │ │ │
2236 │ttl │ Do TTL checks on │ string │
2237 │ │ the packet to │ │
2238 │ │ determine the │ │
2239 │ │ operating system. │ │
2240 ├────────┼─────────────────────┼────────┤
2241 │ │ │ │
2242 │version │ Do OS version │ │
2243 │ │ checks on the │ │
2244 │ │ packet. │ │
2245 ├────────┼─────────────────────┼────────┤
2246 │ │ │ │
2247 │name │ Name of the OS │ string │
2248 │ │ signature to match. │ │
2249 │ │ All signatures can │ │
2250 │ │ be found at pf.os │ │
2251 │ │ file. Use "unknown" │ │
2252 │ │ for OS signatures │ │
2253 │ │ that the expression │ │
2254 │ │ could not detect. │ │
2255 └────────┴─────────────────────┴────────┘
2256
2257 Available ttl values.
2258
If no ttl attribute is passed, an exact comparison between the TTL in the IP header and the TTL in the fingerprint is performed. This generally works for LANs.
2260
2261 * loose: Check if the IP header's TTL is less than the fingerprint one. Works for globally-routable addresses.
2262 * skip: Do not compare the TTL at all.
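
The ttl modes could be used as in the following sketch. It assumes the
rules are placed in a filter chain on the input hook and that the
signatures from pf.os are available. The verdicts are illustrative.

# compare the TTL loosely, for hosts reached across several routers
osf ttl loose name "Linux" accept

# drop packets whose signature could not be identified
osf name "unknown" drop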
2263
2264 Using osf expression.
2265
# Match packets that carry the "Linux" OS genre signature, without comparing TTL.
2267 table inet x {
2268 chain y {
2269 type filter hook input priority filter; policy accept;
2270 osf ttl skip name "Linux"
2271 }
2272 }
2273
2274
2275 FIB EXPRESSIONS
2276 fib {saddr | daddr | mark | iif | oif} [. ...] {oif | oifname | type}
2277
2278 A fib expression queries the fib (forwarding information base) to
2279 obtain information such as the output interface index a particular
2280 address would use. The input is a tuple of elements that is used as
2281 input to the fib lookup functions.
2282
2283 Table 33. fib expression specific types
2284 ┌────────┬──────────────────┬──────────────────┐
2285 │Keyword │ Description │ Type │
2286 ├────────┼──────────────────┼──────────────────┤
2287 │ │ │ │
2288 │oif │ Output interface │ integer (32 bit) │
2289 │ │ index │ │
2290 ├────────┼──────────────────┼──────────────────┤
2291 │ │ │ │
2292 │oifname │ Output interface │ string │
2293 │ │ name │ │
2294 ├────────┼──────────────────┼──────────────────┤
2295 │ │ │ │
2296 │type │ Address type │ fib_addrtype │
2297 └────────┴──────────────────┴──────────────────┘
2298
2299 Use nft describe fib_addrtype to get a list of all address types.
2300
2301 Using fib expressions.
2302
2303 # drop packets without a reverse path
2304 filter prerouting fib saddr . iif oif missing drop
2305
In this example, 'saddr . iif' looks up routing information based on
the source address and the input interface. oif picks the output
interface index from the routing information. If no route is found for
the source address/input interface combination, the output interface
index is zero. When the input interface is specified as part of the
input key, the output interface index is always either the same as the
input interface index or zero. If only 'saddr oif' is given, then oif
can be any interface index or zero.
2311
2312 # drop packets to address not configured on incoming interface
2313 filter prerouting fib daddr . iif type != { local, broadcast, multicast } drop
2314
# perform lookup in a specific 'blackhole' table (0xdead, needs an appropriate ip rule)
2316 filter prerouting meta mark set 0xdead fib daddr . mark type vmap { blackhole : drop, prohibit : jump prohibited, unreachable : drop }
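
Two further sketches, assuming a filter table with prerouting and input
chains:

# loose reverse path check: drop only if no route to the source exists on any interface
filter prerouting fib saddr oif missing drop

# accept packets addressed to any address configured on this host
filter input fib daddr type local accept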
2317
2318
2319 ROUTING EXPRESSIONS
2320 rt [ip | ip6] {classid | nexthop | mtu | ipsec}
2321
2322 A routing expression refers to routing data associated with a packet.
2323
2324 Table 34. Routing expression types
2325 ┌────────┬─────────────────────┬─────────────────────┐
2326 │Keyword │ Description │ Type │
2327 ├────────┼─────────────────────┼─────────────────────┤
2328 │ │ │ │
2329 │classid │ Routing realm │ realm │
2330 ├────────┼─────────────────────┼─────────────────────┤
2331 │ │ │ │
2332 │nexthop │ Routing nexthop │ ipv4_addr/ipv6_addr │
2333 ├────────┼─────────────────────┼─────────────────────┤
2334 │ │ │ │
2335 │mtu │ TCP maximum segment │ integer (16 bit) │
2336 │ │ size of route │ │
2337 ├────────┼─────────────────────┼─────────────────────┤
2338 │ │ │ │
2339 │ipsec │ route via ipsec │ boolean │
2340 │ │ tunnel or transport │ │
2341 └────────┴─────────────────────┴─────────────────────┘
2342
2343 Table 35. Routing expression specific types
2344 ┌──────┬────────────────────────────┐
2345 │Type │ Description │
2346 ├──────┼────────────────────────────┤
2347 │ │ │
2348 │realm │ Routing Realm (32 bit │
2349 │ │ number). Can be specified │
2350 │ │ numerically or as symbolic │
2351 │ │ name defined in │
2352 │ │ /etc/iproute2/rt_realms. │
2353 └──────┴────────────────────────────┘
2354
2355 Using routing expressions.
2356
2357 # IP family independent rt expression
2358 filter output rt classid 10
2359
2360 # IP family dependent rt expressions
2361 ip filter output rt nexthop 192.168.0.1
2362 ip6 filter output rt nexthop fd00::1
2363 inet filter output rt ip nexthop 192.168.0.1
2364 inet filter output rt ip6 nexthop fd00::1
2365
2366 # outgoing packet will be encapsulated/encrypted by ipsec
2367 filter output rt ipsec exists
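
A common use of rt mtu is to clamp the TCP maximum segment size to the
value derived from the route. The sketch below assumes a filter table
with a forward chain.

# set the MSS option of forwarded SYN packets from the route
filter forward tcp flags syn tcp option maxseg size set rt mtu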
2368
2369
2370 IPSEC EXPRESSIONS
2371 ipsec {in | out} [ spnum NUM ] {reqid | spi}
2372 ipsec {in | out} [ spnum NUM ] {ip | ip6} {saddr | daddr}
2373
2374 An ipsec expression refers to ipsec data associated with a packet.
2375
The in or out keyword specifies whether the expression examines inbound
or outbound policies. The in keyword can be used in the prerouting,
input and forward hooks. The out keyword applies to forward, output and
postrouting hooks. The optional keyword spnum can be used to match a
specific state in a chain. It defaults to 0.
2381
2382 Table 36. Ipsec expression types
2383 ┌────────┬─────────────────────┬─────────────────────┐
2384 │Keyword │ Description │ Type │
2385 ├────────┼─────────────────────┼─────────────────────┤
2386 │ │ │ │
2387 │reqid │ Request ID │ integer (32 bit) │
2388 ├────────┼─────────────────────┼─────────────────────┤
2389 │ │ │ │
2390 │spi │ Security Parameter │ integer (32 bit) │
2391 │ │ Index │ │
2392 ├────────┼─────────────────────┼─────────────────────┤
2393 │ │ │ │
2394 │saddr │ Source address of │ ipv4_addr/ipv6_addr │
2395 │ │ the tunnel │ │
2396 ├────────┼─────────────────────┼─────────────────────┤
2397 │ │ │ │
2398 │daddr │ Destination address │ ipv4_addr/ipv6_addr │
2399 │ │ of the tunnel │ │
2400 └────────┴─────────────────────┴─────────────────────┘
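
The following sketch shows how the synopsis above might be applied. The
filter table, its chains, the request ID and the address are
assumptions for illustration.

# match inbound packets whose first state has request ID 1
filter input ipsec in reqid 1 accept

# the same match with the state index given explicitly
filter input ipsec in spnum 0 reqid 1 accept

# match outbound packets whose tunnel destination is 192.168.1.2
filter output ipsec out ip daddr 192.168.1.2 accept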
2401
2402 NUMGEN EXPRESSION
2403 numgen {inc | random} mod NUM [ offset NUM ]
2404
Create a number generator. The inc or random keywords control its
operation mode: In inc mode, the last returned value is simply
incremented. In random mode, a new random number is returned. The value
after the mod keyword specifies an upper boundary (read: modulus) which
is not reached by returned numbers. The optional offset adds a fixed
value to the returned number.
2411
2412 A typical use-case for numgen is load-balancing:
2413
2414 Using numgen expression.
2415
2416 # round-robin between 192.168.10.100 and 192.168.20.200:
2417 add rule nat prerouting dnat to numgen inc mod 2 map \
2418 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2419
2420 # probability-based with odd bias using intervals:
2421 add rule nat prerouting dnat to numgen random mod 10 map \
2422 { 0-2 : 192.168.10.100, 3-9 : 192.168.20.200 }
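
The offset parameter shifts the generated range. In the sketch below,
which assumes a filter table with a prerouting chain, the packet mark
alternates between 100 and 101.

# mod 2 yields 0 or 1, and offset 100 turns that into 100 or 101
filter prerouting meta mark set numgen inc mod 2 offset 100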
2423
2424
2425 HASH EXPRESSIONS
2426 jhash {ip saddr | ip6 daddr | tcp dport | udp sport | ether saddr} [. ...] mod NUM [ seed NUM ] [ offset NUM ]
2427 symhash mod NUM [ offset NUM ]
2428
Use a hashing function to generate a number. The functions available
are jhash, known as Jenkins Hash, and symhash, for Symmetric Hash.
jhash requires an expression that selects the packet header fields to
hash. Concatenations are possible as well. The value after the mod
keyword specifies an upper boundary (read: modulus) which is not
reached by returned numbers. The optional seed specifies an initial
value used as seed in the hashing function. The optional offset adds a
fixed value to the returned number.
2438
2439 A typical use-case for jhash and symhash is load-balancing:
2440
2441 Using hash expressions.
2442
2443 # load balance based on source ip between 2 ip addresses:
2444 add rule nat prerouting dnat to jhash ip saddr mod 2 map \
2445 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
2446
2447 # symmetric load balancing between 2 ip addresses:
2448 add rule nat prerouting dnat to symhash mod 2 map \
2449 { 0 : 192.168.10.100, 1 : 192.168.20.200 }
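
Concatenations, seed and offset might be combined as in the following
sketch. The filter table, the prerouting chain and the seed value are
assumptions.

# hash source and destination address into the marks 100 and 101
filter prerouting meta mark set jhash ip saddr . ip daddr mod 2 seed 0xdeadbeef offset 100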
2450
2451
2453 Payload expressions refer to data from the packet’s payload.
2454
2455 ETHERNET HEADER EXPRESSION
2456 ether {daddr | saddr | type}
2457
2458 Table 37. Ethernet header expression types
2459 ┌────────┬────────────────────┬────────────┐
2460 │Keyword │ Description │ Type │
2461 ├────────┼────────────────────┼────────────┤
2462 │ │ │ │
2463 │daddr │ Destination MAC │ ether_addr │
2464 │ │ address │ │
2465 ├────────┼────────────────────┼────────────┤
2466 │ │ │ │
2467 │saddr │ Source MAC address │ ether_addr │
2468 ├────────┼────────────────────┼────────────┤
2469 │ │ │ │
2470 │type │ EtherType │ ether_type │
2471 └────────┴────────────────────┴────────────┘
2472
2473 VLAN HEADER EXPRESSION
2474 vlan {id | dei | pcp | type}
2475
2476 Table 38. VLAN header expression
2477 ┌────────┬─────────────────────┬──────────────────┐
2478 │Keyword │ Description │ Type │
2479 ├────────┼─────────────────────┼──────────────────┤
2480 │ │ │ │
2481 │id │ VLAN ID (VID) │ integer (12 bit) │
2482 ├────────┼─────────────────────┼──────────────────┤
2483 │ │ │ │
2484 │dei │ Drop Eligible │ integer (1 bit) │
2485 │ │ Indicator │ │
2486 ├────────┼─────────────────────┼──────────────────┤
2487 │ │ │ │
2488 │pcp │ Priority code point │ integer (3 bit) │
2489 ├────────┼─────────────────────┼──────────────────┤
2490 │ │ │ │
2491 │type │ EtherType │ ether_type │
2492 └────────┴─────────────────────┴──────────────────┘
2493
2494 ARP HEADER EXPRESSION
arp {htype | ptype | hlen | plen | operation | saddr { ip | ether } | daddr { ip | ether }}
2496
2497 Table 39. ARP header expression
2498 ┌────────────┬─────────────────────┬──────────────────┐
2499 │Keyword │ Description │ Type │
2500 ├────────────┼─────────────────────┼──────────────────┤
2501 │ │ │ │
2502 │htype │ ARP hardware type │ integer (16 bit) │
2503 ├────────────┼─────────────────────┼──────────────────┤
2504 │ │ │ │
2505 │ptype │ EtherType │ ether_type │
2506 ├────────────┼─────────────────────┼──────────────────┤
2507 │ │ │ │
2508 │hlen │ Hardware address │ integer (8 bit) │
2509 │ │ len │ │
2510 ├────────────┼─────────────────────┼──────────────────┤
2511 │ │ │ │
2512 │plen │ Protocol address │ integer (8 bit) │
2513 │ │ len │ │
2514 ├────────────┼─────────────────────┼──────────────────┤
2515 │ │ │ │
2516 │operation │ Operation │ arp_op │
2517 ├────────────┼─────────────────────┼──────────────────┤
2518 │ │ │ │
2519 │saddr ether │ Ethernet sender │ ether_addr │
2520 │ │ address │ │
2521 ├────────────┼─────────────────────┼──────────────────┤
2522 │ │ │ │
2523 │daddr ether │ Ethernet target │ ether_addr │
2524 │ │ address │ │
2525 ├────────────┼─────────────────────┼──────────────────┤
2526 │ │ │ │
2527 │saddr ip │ IPv4 sender address │ ipv4_addr │
2528 ├────────────┼─────────────────────┼──────────────────┤
2529 │ │ │ │
2530 │daddr ip │ IPv4 target address │ ipv4_addr │
2531 └────────────┴─────────────────────┴──────────────────┘
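
As a brief sketch, assuming a table named filter in the arp family with
an input chain, a match on ARP header fields could look like this. The
address is an example.

# accept ARP requests that ask for 192.168.0.1
arp filter input arp operation request arp daddr ip 192.168.0.1 accept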
2532
2533 IPV4 HEADER EXPRESSION
2534 ip {version | hdrlength | dscp | ecn | length | id | frag-off | ttl | protocol | checksum | saddr | daddr }
2535
2536 Table 40. IPv4 header expression
2537 ┌──────────┬─────────────────────┬──────────────────┐
2538 │Keyword │ Description │ Type │
2539 ├──────────┼─────────────────────┼──────────────────┤
2540 │ │ │ │
2541 │version │ IP header version │ integer (4 bit) │
2542 │ │ (4) │ │
2543 ├──────────┼─────────────────────┼──────────────────┤
2544 │ │ │ │
2545 │hdrlength │ IP header length │ integer (4 bit) │
2546 │ │ including options │ FIXME scaling │
2547 ├──────────┼─────────────────────┼──────────────────┤
2548 │ │ │ │
2549 │dscp │ Differentiated │ dscp │
2550 │ │ Services Code Point │ │
2551 ├──────────┼─────────────────────┼──────────────────┤
2552 │ │ │ │
2553 │ecn │ Explicit Congestion │ ecn │
2554 │ │ Notification │ │
2555 ├──────────┼─────────────────────┼──────────────────┤
2556 │ │ │ │
2557 │length │ Total packet length │ integer (16 bit) │
2558 ├──────────┼─────────────────────┼──────────────────┤
2559 │ │ │ │
2560 │id │ IP ID │ integer (16 bit) │
2561 ├──────────┼─────────────────────┼──────────────────┤
2562 │ │ │ │
2563 │frag-off │ Fragment offset │ integer (16 bit) │
2564 ├──────────┼─────────────────────┼──────────────────┤
2565 │ │ │ │
2566 │ttl │ Time to live │ integer (8 bit) │
2567 ├──────────┼─────────────────────┼──────────────────┤
2568 │ │ │ │
2569 │protocol │ Upper layer │ inet_proto │
2570 │ │ protocol │ │
2571 ├──────────┼─────────────────────┼──────────────────┤
2572 │ │ │ │
2573 │checksum │ IP header checksum │ integer (16 bit) │
2574 ├──────────┼─────────────────────┼──────────────────┤
2575 │ │ │ │
2576 │saddr │ Source address │ ipv4_addr │
2577 ├──────────┼─────────────────────┼──────────────────┤
2578 │ │ │ │
2579 │daddr │ Destination address │ ipv4_addr │
2580 └──────────┴─────────────────────┴──────────────────┘
2581
2582 ICMP HEADER EXPRESSION
2583 icmp {type | code | checksum | id | sequence | gateway | mtu}
2584
2585 This expression refers to ICMP header fields. When using it in inet,
2586 bridge or netdev families, it will cause an implicit dependency on IPv4
2587 to be created. To match on unusual cases like ICMP over IPv6, one has
2588 to add an explicit meta protocol ip6 match to the rule.
2589
2590 Table 41. ICMP header expression
2591 ┌─────────┬─────────────────────┬──────────────────┐
2592 │Keyword │ Description │ Type │
2593 ├─────────┼─────────────────────┼──────────────────┤
2594 │ │ │ │
2595 │type │ ICMP type field │ icmp_type │
2596 ├─────────┼─────────────────────┼──────────────────┤
2597 │ │ │ │
2598 │code │ ICMP code field │ integer (8 bit) │
2599 ├─────────┼─────────────────────┼──────────────────┤
2600 │ │ │ │
2601 │checksum │ ICMP checksum field │ integer (16 bit) │
2602 ├─────────┼─────────────────────┼──────────────────┤
2603 │ │ │ │
2604 │id │ ID of echo │ integer (16 bit) │
2605 │ │ request/response │ │
2606 ├─────────┼─────────────────────┼──────────────────┤
2607 │ │ │ │
2608 │sequence │ sequence number of │ integer (16 bit) │
2609 │ │ echo │ │
2610 │ │ request/response │ │
2611 ├─────────┼─────────────────────┼──────────────────┤
2612 │ │ │ │
2613 │gateway │ gateway of │ integer (32 bit) │
2614 │ │ redirects │ │
2615 ├─────────┼─────────────────────┼──────────────────┤
2616 │ │ │ │
2617 │mtu │ MTU of path MTU │ integer (16 bit) │
2618 │ │ discovery │ │
2619 └─────────┴─────────────────────┴──────────────────┘
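
The following sketch assumes a table named filter in the inet family
with an input chain. As described above, the IPv4 dependency is added
implicitly there.

# accept echo requests (ping)
inet filter input icmp type echo-request accept

# accept several error types using an anonymous set
inet filter input icmp type { destination-unreachable, time-exceeded } accept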
2620
2621 IGMP HEADER EXPRESSION
2622 igmp {type | mrt | checksum | group}
2623
2624 This expression refers to IGMP header fields. When using it in inet,
2625 bridge or netdev families, it will cause an implicit dependency on IPv4
2626 to be created. To match on unusual cases like IGMP over IPv6, one has
2627 to add an explicit meta protocol ip6 match to the rule.
2628
2629 Table 42. IGMP header expression
2630 ┌─────────┬─────────────────────┬──────────────────┐
2631 │Keyword │ Description │ Type │
2632 ├─────────┼─────────────────────┼──────────────────┤
2633 │ │ │ │
2634 │type │ IGMP type field │ igmp_type │
2635 ├─────────┼─────────────────────┼──────────────────┤
2636 │ │ │ │
2637 │mrt │ IGMP maximum │ integer (8 bit) │
2638 │ │ response time field │ │
2639 ├─────────┼─────────────────────┼──────────────────┤
2640 │ │ │ │
2641 │checksum │ IGMP checksum field │ integer (16 bit) │
2642 ├─────────┼─────────────────────┼──────────────────┤
2643 │ │ │ │
2644 │group │ Group address │ integer (32 bit) │
2645 └─────────┴─────────────────────┴──────────────────┘
2646
2647 IPV6 HEADER EXPRESSION
2648 ip6 {version | dscp | ecn | flowlabel | length | nexthdr | hoplimit | saddr | daddr}
2649
This expression refers to the IPv6 header fields. Take care when using
ip6 nexthdr: the value only refers to the next header, i.e. ip6 nexthdr
tcp will only match if the IPv6 packet does not contain any extension
headers. Packets that are fragmented or that contain, for example, a
routing extension header will not be matched. Use meta l4proto instead
if you wish to match the real transport header and ignore any
additional extension headers.
2657
2658 Table 43. IPv6 header expression
2659 ┌──────────┬─────────────────────┬──────────────────┐
2660 │Keyword │ Description │ Type │
2661 ├──────────┼─────────────────────┼──────────────────┤
2662 │ │ │ │
2663 │version │ IP header version │ integer (4 bit) │
2664 │ │ (6) │ │
2665 ├──────────┼─────────────────────┼──────────────────┤
2666 │ │ │ │
2667 │dscp │ Differentiated │ dscp │
2668 │ │ Services Code Point │ │
2669 ├──────────┼─────────────────────┼──────────────────┤
2670 │ │ │ │
2671 │ecn │ Explicit Congestion │ ecn │
2672 │ │ Notification │ │
2673 ├──────────┼─────────────────────┼──────────────────┤
2674 │ │ │ │
2675 │flowlabel │ Flow label │ integer (20 bit) │
2676 ├──────────┼─────────────────────┼──────────────────┤
2677 │ │ │ │
2678 │length │ Payload length │ integer (16 bit) │
2679 ├──────────┼─────────────────────┼──────────────────┤
2680 │ │ │ │
2681 │nexthdr │ Nexthdr protocol │ inet_proto │
2682 ├──────────┼─────────────────────┼──────────────────┤
2683 │ │ │ │
2684 │hoplimit │ Hop limit │ integer (8 bit) │
2685 ├──────────┼─────────────────────┼──────────────────┤
2686 │ │ │ │
2687 │saddr │ Source address │ ipv6_addr │
2688 ├──────────┼─────────────────────┼──────────────────┤
2689 │ │ │ │
2690 │daddr │ Destination address │ ipv6_addr │
2691 └──────────┴─────────────────────┴──────────────────┘
2692
2693 Using ip6 header expressions.
2694
2695 # matching if first extension header indicates a fragment
2696 ip6 nexthdr ipv6-frag
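
To illustrate the caution above, the following sketch contrasts the two
matches. It assumes a table named filter in the ip6 family with an
input chain.

# matches only when TCP directly follows the IPv6 header
ip6 filter input ip6 nexthdr tcp accept

# matches TCP even if extension headers precede it
ip6 filter input meta l4proto tcp accept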
2697
2698
2699 ICMPV6 HEADER EXPRESSION
2700 icmpv6 {type | code | checksum | parameter-problem | packet-too-big | id | sequence | max-delay}
2701
2702 This expression refers to ICMPv6 header fields. When using it in inet,
2703 bridge or netdev families, it will cause an implicit dependency on IPv6
2704 to be created. To match on unusual cases like ICMPv6 over IPv4, one has
2705 to add an explicit meta protocol ip match to the rule.
2706
2707 Table 44. ICMPv6 header expression
2708 ┌──────────────────┬────────────────────┬──────────────────┐
2709 │Keyword │ Description │ Type │
2710 ├──────────────────┼────────────────────┼──────────────────┤
2711 │ │ │ │
2712 │type │ ICMPv6 type field │ icmpv6_type │
2713 ├──────────────────┼────────────────────┼──────────────────┤
2714 │ │ │ │
2715 │code │ ICMPv6 code field │ integer (8 bit) │
2716 ├──────────────────┼────────────────────┼──────────────────┤
2717 │ │ │ │
2718 │checksum │ ICMPv6 checksum │ integer (16 bit) │
2719 │ │ field │ │
2720 ├──────────────────┼────────────────────┼──────────────────┤
2721 │ │ │ │
2722 │parameter-problem │ pointer to problem │ integer (32 bit) │
2723 ├──────────────────┼────────────────────┼──────────────────┤
2724 │ │ │ │
2725 │packet-too-big │ oversized MTU │ integer (32 bit) │
2726 ├──────────────────┼────────────────────┼──────────────────┤
2727 │ │ │ │
2728 │id │ ID of echo │ integer (16 bit) │
2729 │ │ request/response │ │
2730 ├──────────────────┼────────────────────┼──────────────────┤
2731 │ │ │ │
2732 │sequence │ sequence number of │ integer (16 bit) │
2733 │ │ echo │ │
2734 │ │ request/response │ │
2735 ├──────────────────┼────────────────────┼──────────────────┤
2736 │ │ │ │
2737 │max-delay │ maximum response │ integer (16 bit) │
2738 │ │ delay of MLD │ │
2739 │ │ queries │ │
2740 └──────────────────┴────────────────────┴──────────────────┘
2741
2742 TCP HEADER EXPRESSION
2743 tcp {sport | dport | sequence | ackseq | doff | reserved | flags | window | checksum | urgptr}
2744
2745 Table 45. TCP header expression
2746 ┌─────────┬──────────────────┬──────────────────┐
2747 │Keyword │ Description │ Type │
2748 ├─────────┼──────────────────┼──────────────────┤
2749 │ │ │ │
2750 │sport │ Source port │ inet_service │
2751 ├─────────┼──────────────────┼──────────────────┤
2752 │ │ │ │
2753 │dport │ Destination port │ inet_service │
2754 ├─────────┼──────────────────┼──────────────────┤
2755 │ │ │ │
2756 │sequence │ Sequence number │ integer (32 bit) │
2757 ├─────────┼──────────────────┼──────────────────┤
2758 │ │ │ │
2759 │ackseq │ Acknowledgement │ integer (32 bit) │
2760 │ │ number │ │
2761 ├─────────┼──────────────────┼──────────────────┤
2762 │ │ │ │
2763 │doff │ Data offset │ integer (4 bit) │
2764 │ │ │ FIXME scaling │
2765 ├─────────┼──────────────────┼──────────────────┤
2766 │ │ │ │
2767 │reserved │ Reserved area │ integer (4 bit) │
2768 ├─────────┼──────────────────┼──────────────────┤
2769 │ │ │ │
2770 │flags │ TCP flags │ tcp_flag │
2771 ├─────────┼──────────────────┼──────────────────┤
2772 │ │ │ │
2773 │window │ Window │ integer (16 bit) │
2774 ├─────────┼──────────────────┼──────────────────┤
2775 │ │ │ │
2776 │checksum │ Checksum │ integer (16 bit) │
2777 ├─────────┼──────────────────┼──────────────────┤
2778 │ │ │ │
2779 │urgptr │ Urgent pointer │ integer (16 bit) │
2780 └─────────┴──────────────────┴──────────────────┘
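
A short sketch of how these keywords might be combined, assuming a table filter with a chain input already exists (both names are placeholders):

```nft
# inet_service values can be given as port numbers or service names
inet filter input tcp dport { 22, 80, 443 } accept

# flags is a bitmask: of FIN, SYN, RST and ACK, only SYN may be set
inet filter input tcp flags & (fin | syn | rst | ack) == syn counter
```

When such a rule is passed on a shell command line, the parentheses and the ampersand need quoting.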
2781
2782 UDP HEADER EXPRESSION
2783 udp {sport | dport | length | checksum}
2784
2785 Table 46. UDP header expression
2786 ┌─────────┬─────────────────────┬──────────────────┐
2787 │Keyword │ Description │ Type │
2788 ├─────────┼─────────────────────┼──────────────────┤
2789 │ │ │ │
2790 │sport │ Source port │ inet_service │
2791 ├─────────┼─────────────────────┼──────────────────┤
2792 │ │ │ │
2793 │dport │ Destination port │ inet_service │
2794 ├─────────┼─────────────────────┼──────────────────┤
2795 │ │ │ │
2796 │length │ Total packet length │ integer (16 bit) │
2797 ├─────────┼─────────────────────┼──────────────────┤
2798 │ │ │ │
2799 │checksum │ Checksum │ integer (16 bit) │
2800 └─────────┴─────────────────────┴──────────────────┘
2801
2802 UDP-LITE HEADER EXPRESSION
2803 udplite {sport | dport | checksum}
2804
2805 Table 47. UDP-Lite header expression
2806 ┌─────────┬──────────────────┬──────────────────┐
2807 │Keyword │ Description │ Type │
2808 ├─────────┼──────────────────┼──────────────────┤
2809 │ │ │ │
2810 │sport │ Source port │ inet_service │
2811 ├─────────┼──────────────────┼──────────────────┤
2812 │ │ │ │
2813 │dport │ Destination port │ inet_service │
2814 ├─────────┼──────────────────┼──────────────────┤
2815 │ │ │ │
2816 │checksum │ Checksum │ integer (16 bit) │
2817 └─────────┴──────────────────┴──────────────────┘
2818
2819 SCTP HEADER EXPRESSION
2820 sctp {sport | dport | vtag | checksum}
2821 sctp chunk CHUNK [ FIELD ]
2822
2823 CHUNK := data | init | init-ack | sack | heartbeat |
2824 heartbeat-ack | abort | shutdown | shutdown-ack | error |
2825 cookie-echo | cookie-ack | ecne | cwr | shutdown-complete
2826 | asconf-ack | forward-tsn | asconf
2827
2828 FIELD := COMMON_FIELD | DATA_FIELD | INIT_FIELD | INIT_ACK_FIELD |
2829 SACK_FIELD | SHUTDOWN_FIELD | ECNE_FIELD | CWR_FIELD |
2830 ASCONF_ACK_FIELD | FORWARD_TSN_FIELD | ASCONF_FIELD
2831
2832 COMMON_FIELD := type | flags | length
2833 DATA_FIELD := tsn | stream | ssn | ppid
2834 INIT_FIELD := init-tag | a-rwnd | num-outbound-streams |
2835 num-inbound-streams | initial-tsn
2836 INIT_ACK_FIELD := INIT_FIELD
2837 SACK_FIELD := cum-tsn-ack | a-rwnd | num-gap-ack-blocks |
2838 num-dup-tsns
2839 SHUTDOWN_FIELD := cum-tsn-ack
2840 ECNE_FIELD := lowest-tsn
2841 CWR_FIELD := lowest-tsn
2842 ASCONF_ACK_FIELD := seqno
2843 FORWARD_TSN_FIELD := new-cum-tsn
2844 ASCONF_FIELD := seqno
2845
2846 Table 48. SCTP header expression
2847 ┌─────────┬──────────────────┬────────────────────┐
2848 │Keyword │ Description │ Type │
2849 ├─────────┼──────────────────┼────────────────────┤
2850 │ │ │ │
2851 │sport │ Source port │ inet_service │
2852 ├─────────┼──────────────────┼────────────────────┤
2853 │ │ │ │
2854 │dport │ Destination port │ inet_service │
2855 ├─────────┼──────────────────┼────────────────────┤
2856 │ │ │ │
2857 │vtag │ Verification Tag │ integer (32 bit) │
2858 ├─────────┼──────────────────┼────────────────────┤
2859 │ │ │ │
2860 │checksum │ Checksum │ integer (32 bit) │
2861 ├─────────┼──────────────────┼────────────────────┤
2862 │ │ │ │
2863 │chunk │ Search chunk in │ without FIELD, │
2864 │ │ packet │ boolean indicating │
2865 │ │ │ existence │
2866 └─────────┴──────────────────┴────────────────────┘
2867
2868 Table 49. SCTP chunk fields
2869 ┌─────────────────────┬───────────────┬─────────────────┬──────────────────┐
2870 │Name │ Width in bits │ Chunk │ Notes │
2871 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2872 │ │ │ │ │
2873 │type │ 8 │ all │ not useful, │
2874 │ │ │ │ defined by chunk │
2875 │ │ │ │ type │
2876 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2877 │ │ │ │ │
2878 │flags │ 8 │ all │ semantics │
2879 │ │ │ │ defined on │
2880 │ │ │ │ per-chunk basis │
2881 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2882 │ │ │ │ │
2883 │length │ 16 │ all │ length of this │
2884 │ │ │ │ chunk in bytes │
2885 │ │ │ │ excluding │
2886 │ │ │ │ padding │
2887 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2888 │ │ │ │ │
2889 │tsn │ 32 │ data │ transmission │
2890 │ │ │ │ sequence number │
2891 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2892 │ │ │ │ │
2893 │stream │ 16 │ data │ stream │
2894 │ │ │ │ identifier │
2895 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2896 │ │ │ │ │
2897 │ssn │ 16 │ data │ stream sequence │
2898 │ │ │ │ number │
2899 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2900 │ │ │ │ │
2901 │ppid │ 32 │ data │ payload protocol │
2902 │ │ │ │ identifier │
2903 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2904 │ │ │ │ │
2905 │init-tag │ 32 │ init, init-ack │ initiate tag │
2906 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2907 │ │ │ │ │
2908 │a-rwnd │ 32 │ init, init-ack, │ advertised │
2909 │ │ │ sack │ receiver window │
2910 │ │ │ │ credit │
2911 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2912 │ │ │ │ │
2913 │num-outbound-streams │ 16 │ init, init-ack │ number of │
2914 │ │ │ │ outbound streams │
2915 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2916 │ │ │ │ │
2917 │num-inbound-streams │ 16 │ init, init-ack │ number of │
2918 │ │ │ │ inbound streams │
2919 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2920 │ │ │ │ │
2921 │initial-tsn │ 32 │ init, init-ack │ initial transmit │
2922 │ │ │ │ sequence number │
2923 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2924 │ │ │ │ │
2925 │cum-tsn-ack │ 32 │ sack, shutdown │ cumulative │
2926 │ │ │ │ transmission │
2927 │ │ │ │ sequence number │
2928 │ │ │ │ acknowledged │
2929 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2930 │ │ │ │ │
2931 │num-gap-ack-blocks │ 16 │ sack │ number of Gap │
2932 │ │ │ │ Ack Blocks │
2933 │ │ │ │ included │
2934 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2935 │ │ │ │ │
2936 │num-dup-tsns │ 16 │ sack │ number of │
2937 │ │ │ │ duplicate │
2938 │ │ │ │ transmission │
2939 │ │ │ │ sequence numbers │
2940 │ │ │ │ received │
2941 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2942 │ │ │ │ │
2943 │lowest-tsn │ 32 │ ecne, cwr │ lowest │
2944 │ │ │ │ transmission │
2945 │ │ │ │ sequence number │
2946 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2947 │ │ │ │ │
2948 │seqno │ 32 │ asconf-ack, │ sequence number │
2949 │ │ │ asconf │ │
2950 ├─────────────────────┼───────────────┼─────────────────┼──────────────────┤
2951 │ │ │ │ │
2952 │new-cum-tsn │ 32 │ forward-tsn │ new cumulative │
2953 │ │ │ │ transmission │
2954 │ │ │ │ sequence number │
2955 └─────────────────────┴───────────────┴─────────────────┴──────────────────┘
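
The two forms of the chunk syntax above could be used as in the following sketch. It assumes an nft release that implements sctp chunk matching; the table and chain names and the stream number 5 are placeholders.

```nft
# without FIELD: true if the packet contains an INIT chunk
inet filter input sctp chunk init exists counter

# with FIELD: compare the 16-bit stream identifier of a DATA chunk
inet filter input sctp chunk data stream 5 counter

# fields of the common SCTP header
inet filter input sctp dport 3868 sctp vtag 0 counter
```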
2956
2957 DCCP HEADER EXPRESSION
2958 dccp {sport | dport | type}
2959
2960 Table 50. DCCP header expression
2961 ┌────────┬──────────────────┬──────────────┐
2962 │Keyword │ Description │ Type │
2963 ├────────┼──────────────────┼──────────────┤
2964 │ │ │ │
2965 │sport │ Source port │ inet_service │
2966 ├────────┼──────────────────┼──────────────┤
2967 │ │ │ │
2968 │dport │ Destination port │ inet_service │
2969 ├────────┼──────────────────┼──────────────┤
2970 │ │ │ │
2971 │type │ Packet type │ dccp_pkttype │
2972 └────────┴──────────────────┴──────────────┘
2973
2974 AUTHENTICATION HEADER EXPRESSION
2975 ah {nexthdr | hdrlength | reserved | spi | sequence}
2976
2977 Table 51. AH header expression
2978 ┌──────────┬────────────────────┬──────────────────┐
2979 │Keyword │ Description │ Type │
2980 ├──────────┼────────────────────┼──────────────────┤
2981 │ │ │ │
2982 │nexthdr │ Next header │ inet_proto │
2983 │ │ protocol │ │
2984 ├──────────┼────────────────────┼──────────────────┤
2985 │ │ │ │
2986 │hdrlength │ AH Header length │ integer (8 bit) │
2987 ├──────────┼────────────────────┼──────────────────┤
2988 │ │ │ │
2989 │reserved │ Reserved area │ integer (16 bit) │
2990 ├──────────┼────────────────────┼──────────────────┤
2991 │ │ │ │
2992 │spi │ Security Parameter │ integer (32 bit) │
2993 │ │ Index │ │
2994 ├──────────┼────────────────────┼──────────────────┤
2995 │ │ │ │
2996 │sequence │ Sequence number │ integer (32 bit) │
2997 └──────────┴────────────────────┴──────────────────┘
2998
2999 ENCAPSULATING SECURITY PAYLOAD HEADER EXPRESSION
3000 esp {spi | sequence}
3001
3002 Table 52. ESP header expression
3003 ┌─────────┬────────────────────┬──────────────────┐
3004 │Keyword │ Description │ Type │
3005 ├─────────┼────────────────────┼──────────────────┤
3006 │ │ │ │
3007 │spi │ Security Parameter │ integer (32 bit) │
3008 │ │ Index │ │
3009 ├─────────┼────────────────────┼──────────────────┤
3010 │ │ │ │
3011 │sequence │ Sequence number │ integer (32 bit) │
3012 └─────────┴────────────────────┴──────────────────┘
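
A minimal sketch of matching IPsec traffic by Security Parameter Index, using the ah and esp expressions from the two tables above. The SPI values and the table and chain names are hypothetical.

```nft
# match a single security association by its SPI
ip filter input esp spi 100 counter accept
ip filter input ah spi 100 counter accept

# spi is a 32-bit integer, so a range can be used as well
ip filter input esp spi 111-222 counter
```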
3013
3014 IPCOMP HEADER EXPRESSION
3015 comp {nexthdr | flags | cpi}
3016
3017 Table 53. IPComp header expression
3018 ┌────────┬─────────────────┬──────────────────┐
3019 │Keyword │ Description │ Type │
3020 ├────────┼─────────────────┼──────────────────┤
3021 │ │ │ │
3022 │nexthdr │ Next header │ inet_proto │
3023 │ │ protocol │ │
3024 ├────────┼─────────────────┼──────────────────┤
3025 │ │ │ │
3026 │flags │ Flags │ bitmask │
3027 ├────────┼─────────────────┼──────────────────┤
3028 │ │ │ │
3029 │cpi │ Compression │ integer (16 bit) │
3030 │ │ Parameter Index │ │
3031 └────────┴─────────────────┴──────────────────┘
3032
3033 RAW PAYLOAD EXPRESSION
3034 @base,offset,length
3035
3036 The raw payload expression loads length bits starting offset bits into
3037 the header selected by base. Bit 0 is the very first bit; in C terms,
3038 this is the topmost bit of an octet, i.e. 0x80. Raw payload expressions
3039 are useful to match headers that do not have a human-readable template
3040 expression yet. Note that nft does not add dependencies for raw payload
3041 expressions. For example, to match fields of a transport header with
3042 protocol number 5, you must manually exclude packets that carry a
3043 different transport header, for instance by placing meta l4proto 5
3044 before the raw expression.
3045
3046 Table 54. Supported payload protocol bases
3047 ┌─────┬─────────────────────────┐
3048 │Base │ Description │
3049 ├─────┼─────────────────────────┤
3050 │ │ │
3051 │ll │ Link layer, for example │
3052 │ │ the Ethernet header │
3053 ├─────┼─────────────────────────┤
3054 │ │ │
3055 │nh │ Network header, for │
3056 │ │ example IPv4 or IPv6 │
3057 ├─────┼─────────────────────────┤
3058 │ │ │
3059 │th │ Transport Header, for │
3060 │ │ example TCP │
3061 └─────┴─────────────────────────┘
3062
3063 Matching destination port of both UDP and TCP.
3064
3065 inet filter input meta l4proto {tcp, udp} @th,16,16 { 53, 80 }
3066
3067 The above can also be written as
3068
3069 inet filter input meta l4proto {tcp, udp} th dport { 53, 80 }
3070
3071 This form is more convenient but, like the raw expression notation, it
3072 creates and checks no dependencies. It is the user's responsibility to
3073 restrict matching to those header types that have a notion of ports.
3074 Otherwise, rules using raw expressions will erroneously match unrelated
3075 packets, e.g. misinterpreting the SPI field of an ESP packet as a port.
3076
3077 Rewrite arp packet target hardware address if target protocol address
3078 matches a given address.
3079
3080 input meta iifname enp2s0 arp ptype 0x0800 arp htype 1 arp hlen 6 arp plen 4 @nh,192,32 0xc0a88f10 @nh,144,48 set 0x112233445566 accept
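
The offsets in the ARP rule above follow from the ARP packet layout for Ethernet and IPv4 (hlen 6, plen 4). The breakdown below is a worked derivation. The rule after it is only a sketch that matches the same target protocol address, assuming a table named filter in the arp family with a chain named input.

```nft
# ARP fields in bits, counted from the start of the network header (@nh):
#   htype   16 bits   offset   0
#   ptype   16 bits   offset  16
#   hlen     8 bits   offset  32
#   plen     8 bits   offset  40
#   oper    16 bits   offset  48
#   sha     48 bits   offset  64   sender hardware address
#   spa     32 bits   offset 112   sender protocol address
#   tha     48 bits   offset 144   target hardware address  -> @nh,144,48
#   tpa     32 bits   offset 192   target protocol address  -> @nh,192,32
#
# 0xc0a88f10 is the address 192.168.143.16 written as a 32-bit integer
arp filter input @nh,192,32 0xc0a88f10 counter
```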
3081
3082
3083 EXTENSION HEADER EXPRESSIONS
3084 Extension header expressions refer to data from variable-sized protocol
3085 headers, such as IPv6 extension headers, TCP options and IPv4 options.
3086
3087 nftables currently supports matching (finding) a given IPv6 extension
3088 header, TCP option or IPv4 option.
3089
3090 hbh {nexthdr | hdrlength}
3091 frag {nexthdr | frag-off | more-fragments | id}
3092 rt {nexthdr | hdrlength | type | seg-left}
3093 dst {nexthdr | hdrlength}
3094 mh {nexthdr | hdrlength | checksum | type}
3095 srh {flags | tag | sid | seg-left}
3096 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp} tcp_option_field
3097 ip option { lsrr | ra | rr | ssrr } ip_option_field
3098
3099 The following syntaxes are valid only in a relational expression with
3100 boolean type on right-hand side for checking header existence only:
3101
3102 exthdr {hbh | frag | rt | dst | mh}
3103 tcp option {eol | nop | maxseg | window | sack-perm | sack | sack0 | sack1 | sack2 | sack3 | timestamp}
3104 ip option { lsrr | ra | rr | ssrr }
3105
3106 Table 55. IPv6 extension headers
3107 ┌────────┬────────────────────────┐
3108 │Keyword │ Description │
3109 ├────────┼────────────────────────┤
3110 │ │ │
3111 │hbh │ Hop by Hop │
3112 ├────────┼────────────────────────┤
3113 │ │ │
3114 │rt │ Routing Header │
3115 ├────────┼────────────────────────┤
3116 │ │ │
3117 │frag │ Fragmentation header │
3118 ├────────┼────────────────────────┤
3119 │ │ │
3120 │dst │ Destination Options │
3121 ├────────┼────────────────────────┤
3122 │ │ │
3123 │mh │ Mobility Header │
3124 ├────────┼────────────────────────┤
3125 │ │ │
3126 │srh │ Segment Routing Header │
3127 └────────┴────────────────────────┘
3128
3129 Table 56. TCP Options
3130 ┌──────────┬─────────────────────┬─────────────────────┐
3131 │Keyword │ Description │ TCP option fields │
3132 ├──────────┼─────────────────────┼─────────────────────┤
3133 │ │ │ │
3134 │eol │ End of option list │ kind │
3135 ├──────────┼─────────────────────┼─────────────────────┤
3136 │ │ │ │
3137 │nop │ 1 Byte TCP Nop │ kind │
3138 │ │ padding option │ │
3139 ├──────────┼─────────────────────┼─────────────────────┤
3140 │ │ │ │
3141 │maxseg │ TCP Maximum Segment │ kind, length, size │
3142 │ │ Size │ │
3143 ├──────────┼─────────────────────┼─────────────────────┤
3144 │ │ │ │
3145 │window │ TCP Window Scaling │ kind, length, count │
3146 ├──────────┼─────────────────────┼─────────────────────┤
3147 │ │ │ │
3148 │sack-perm │ TCP SACK permitted │ kind, length │
3149 ├──────────┼─────────────────────┼─────────────────────┤
3150 │ │ │ │
3151 │sack │ TCP Selective │ kind, length, left, │
3152 │ │ Acknowledgement │ right │
3153 │ │ (alias of block 0) │ │
3154 ├──────────┼─────────────────────┼─────────────────────┤
3155 │ │ │ │
3156 │sack0 │ TCP Selective │ kind, length, left, │
3157 │ │ Acknowledgement │ right │
3158 │ │ (block 0) │ │
3159 ├──────────┼─────────────────────┼─────────────────────┤
3160 │ │ │ │
3161 │sack1 │ TCP Selective │ kind, length, left, │
3162 │ │ Acknowledgement │ right │
3163 │ │ (block 1) │ │
3164 ├──────────┼─────────────────────┼─────────────────────┤
3165 │ │ │ │
3166 │sack2 │ TCP Selective │ kind, length, left, │
3167 │ │ Acknowledgement │ right │
3168 │ │ (block 2) │ │
3169 ├──────────┼─────────────────────┼─────────────────────┤
3170 │ │ │ │
3171 │sack3 │ TCP Selective │ kind, length, left, │
3172 │ │ Acknowledgement │ right │
3173 │ │ (block 3) │ │
3174 ├──────────┼─────────────────────┼─────────────────────┤
3175 │ │ │ │
3176 │timestamp │ TCP Timestamps │ kind, length, │
3177 │ │ │ tsval, tsecr │
3178 └──────────┴─────────────────────┴─────────────────────┘
3179
3180 TCP option matching also supports raw expression syntax to access
3181 arbitrary options:
3182
3183 tcp option
3184
3185 tcp option @number,offset,length
3186
3187 Table 57. IP Options
3188 ┌────────┬─────────────────────┬─────────────────────┐
3189 │Keyword │ Description │ IP option fields │
3190 ├────────┼─────────────────────┼─────────────────────┤
3191 │ │ │ │
3192 │lsrr │ Loose Source Route │ type, length, ptr, │
3193 │ │ │ addr │
3194 ├────────┼─────────────────────┼─────────────────────┤
3195 │ │ │ │
3196 │ra │ Router Alert │ type, length, value │
3197 ├────────┼─────────────────────┼─────────────────────┤
3198 │ │ │ │
3199 │rr │ Record Route │ type, length, ptr, │
3200 │ │ │ addr │
3201 ├────────┼─────────────────────┼─────────────────────┤
3202 │ │ │ │
3203 │ssrr │ Strict Source Route │ type, length, ptr, │
3204 │ │ │ addr │
3205 └────────┴─────────────────────┴─────────────────────┘
3206
3207 Finding TCP options.
3208
3209 filter input tcp option sack-perm kind 1 counter
3210
3211 Matching IPv6 exthdr.
3212
3213 ip6 filter input frag more-fragments 1 counter
3214
3215 Finding IP option.
3216
3217 filter input ip option lsrr exists counter
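
The existence-only syntax described earlier in this section takes a boolean (exists or missing) on the right-hand side. The following sketch shows it next to a field match; the table and chain names and the size range are assumptions made for illustration.

```nft
# existence checks: boolean on the right-hand side
ip6 filter input exthdr frag exists counter
filter input tcp option sack-perm missing counter

# field match inside a TCP option: SYN packets announcing a small MSS
filter input tcp flags syn tcp option maxseg size 1-536 counter
```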
3218
3219
3220 CONNTRACK EXPRESSIONS
3221 Conntrack expressions refer to metadata of the connection tracking
3222 entry associated with a packet.
3223
3224 There are three types of conntrack expressions. Some require the flow
3225 direction before the conntrack key; others must be used without one
3226 because they are direction agnostic. The packets, bytes and avgpkt
3227 keywords can be used with or without a direction. If the direction is
3228 omitted, the sum of the original and the reply direction is returned.
3229 The zone keyword also accepts an optional direction: if one is given,
3230 the zone only matches if the zone id is tied to that direction.
3232
3233 ct {state | direction | status | mark | expiration | helper | label | count | id}
3234 ct [original | reply] {l3proto | protocol | bytes | packets | avgpkt | zone}
3235 ct {original | reply} {proto-src | proto-dst}
3236 ct {original | reply} {ip | ip6} {saddr | daddr}
3237
3238 The conntrack-specific types in this table are described in the
3239 sub-section CONNTRACK TYPES above.
3240
3241 Table 58. Conntrack expressions
3242 ┌───────────┬─────────────────────┬─────────────────────┐
3243 │Keyword │ Description │ Type │
3244 ├───────────┼─────────────────────┼─────────────────────┤
3245 │ │ │ │
3246 │state │ State of the │ ct_state │
3247 │ │ connection │ │
3248 ├───────────┼─────────────────────┼─────────────────────┤
3249 │ │ │ │
3250 │direction │ Direction of the │ ct_dir │
3251 │ │ packet relative to │ │
3252 │ │ the connection │ │
3253 ├───────────┼─────────────────────┼─────────────────────┤
3254 │ │ │ │
3255 │status │ Status of the │ ct_status │
3256 │ │ connection │ │
3257 ├───────────┼─────────────────────┼─────────────────────┤
3258 │ │ │ │
3259 │mark │ Connection mark │ mark │
3260 ├───────────┼─────────────────────┼─────────────────────┤
3261 │ │ │ │
3262 │expiration │ Connection │ time │
3263 │ │ expiration time │ │
3264 ├───────────┼─────────────────────┼─────────────────────┤
3265 │ │ │ │
3266 │helper │ Helper associated │ string │
3267 │ │ with the connection │ │
3268 ├───────────┼─────────────────────┼─────────────────────┤
3269 │ │ │ │
3270 │label │ Connection tracking │ ct_label │
3271 │ │ label bit or │ │
3272 │ │ symbolic name │ │
3273 │ │ defined in │ │
3274 │ │ connlabel.conf in │ │
3275 │ │ the nftables │ │
3276 │ │ include path │ │
3277 ├───────────┼─────────────────────┼─────────────────────┤
3278 │ │ │ │
3279 │l3proto │ Layer 3 protocol of │ nf_proto │
3280 │ │ the connection │ │
3281 ├───────────┼─────────────────────┼─────────────────────┤
3282 │ │ │ │
3283 │saddr │ Source address of │ ipv4_addr/ipv6_addr │
3284 │ │ the connection for │ │
3285 │ │ the given direction │ │
3286 ├───────────┼─────────────────────┼─────────────────────┤
3287 │ │ │ │
3288 │daddr │ Destination address │ ipv4_addr/ipv6_addr │
3289 │ │ of the connection │ │
3290 │ │ for the given │ │
3291 │ │ direction │ │
3292 ├───────────┼─────────────────────┼─────────────────────┤
3293 │ │ │ │
3294 │protocol │ Layer 4 protocol of │ inet_proto │
3295 │ │ the connection for │ │
3296 │ │ the given direction │ │
3297 ├───────────┼─────────────────────┼─────────────────────┤
3298 │ │ │ │
3299 │proto-src │ Layer 4 protocol │ integer (16 bit) │
3300 │ │ source for the │ │
3301 │ │ given direction │ │
3302 ├───────────┼─────────────────────┼─────────────────────┤
3303 │ │ │ │
3304 │proto-dst │ Layer 4 protocol │ integer (16 bit) │
3305 │ │ destination for the │ │
3306 │ │ given direction │ │
3307 ├───────────┼─────────────────────┼─────────────────────┤
3308 │ │ │ │
3309 │packets │ packet count seen │ integer (64 bit) │
3310 │ │ in the given │ │
3311 │ │ direction or sum of │ │
3312 │ │ original and reply │ │
3313 ├───────────┼─────────────────────┼─────────────────────┤
3314 │ │ │ │
3315 │bytes │ byte count seen, │ integer (64 bit) │
3316 │ │ see description for │ │
3317 │ │ packets keyword │ │
3318 ├───────────┼─────────────────────┼─────────────────────┤
3319 │ │ │ │
3320 │avgpkt │ average bytes per │ integer (64 bit) │
3321 │ │ packet, see │ │
3322 │ │ description for │ │
3323 │ │ packets keyword │ │
3324 ├───────────┼─────────────────────┼─────────────────────┤
3325 │ │ │ │
3326 │zone │ conntrack zone │ integer (16 bit) │
3327 ├───────────┼─────────────────────┼─────────────────────┤
3328 │ │ │ │
3329 │count │ number of current │ integer (32 bit) │
3330 │ │ connections │ │
3331 ├───────────┼─────────────────────┼─────────────────────┤
3332 │ │ │ │
3333 │id │ Connection id │ ct_id │
3334 └───────────┴─────────────────────┴─────────────────────┘
3335
3336 Restrict the number of parallel connections to a server.
3337
3338 nft add set filter ssh_flood '{ type ipv4_addr; flags dynamic; }'
3339 nft add rule filter input tcp dport 22 add @ssh_flood '{ ip saddr ct count over 2 }' reject
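
The three kinds of conntrack expressions described at the start of this section can be sketched as follows. The table and chain names, the byte threshold and the address are placeholders.

```nft
# direction-agnostic key: used without a direction
filter input ct state established,related accept

# optional direction: without one, original and reply are summed
filter forward ct bytes > 1000000 counter
filter forward ct original bytes > 1000000 counter

# mandatory direction
filter forward ct original ip saddr 192.0.2.1 counter
```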
3340
3341
3343 Statements represent actions to be performed. They can alter control
3344 flow (return, jump to a different chain, accept or drop the packet) or
3345 can perform actions, such as logging, rejecting a packet, etc.
3346
3347 There are two kinds of statements. Terminal statements unconditionally
3348 terminate evaluation of the current rule. Non-terminal statements
3349 terminate evaluation of the current rule only conditionally or not at
3350 all; in other words, they are passive from the ruleset evaluation
3351 perspective. A rule can contain any number of non-terminal statements,
3352 but only a single terminal statement, which must be the final
3353 statement.
3355 VERDICT STATEMENT
3356 The verdict statement alters control flow in the ruleset and issues
3357 policy decisions for packets.
3358
3359 {accept | drop | queue | continue | return}
3360 {jump | goto} chain
3361
3362 accept and drop are absolute verdicts — they terminate ruleset
3363 evaluation immediately.
3364
3365 accept
3366 Terminate ruleset evaluation and accept the packet. The packet can
3367 still be dropped later by another hook, for instance accept in the
3368 forward hook still allows to drop the packet later in the postrouting
3369 hook, or another forward base chain that has a higher priority number
3370 and is evaluated afterwards in the processing pipeline.
3371
3372 drop
3373 Terminate ruleset evaluation and drop the packet. The drop occurs
3374 instantly, no further chains or hooks are evaluated. It is not
3375 possible to accept the packet in a later chain again, as those are not
3376 evaluated anymore for the packet.
3377
3378 queue
3379 Terminate ruleset evaluation and queue the packet to userspace.
3380 Userspace must provide a drop or accept verdict. In case of accept,
3381 processing resumes with the next base chain hook, not the rule
3382 following the queue verdict.
3383
3384 continue
3385 Continue ruleset evaluation with the next rule. This is the default
3386 behaviour in case a rule issues no verdict.
3387
3388 return
3389 Return from the current chain and continue evaluation at the next
3390 rule in the last chain. If issued in a base chain, it is equivalent to
3391 the base chain policy.
3392
3393 jump chain
3394 Continue evaluation at the first rule in chain. The current position
3395 in the ruleset is pushed to a call stack and evaluation will continue
3396 there when the new chain is entirely evaluated or a return verdict is
3397 issued. In case an absolute verdict is issued by a rule in the chain,
3398 ruleset evaluation terminates immediately and the specific action is
3399 taken.
3400
3401 goto chain
3402 Similar to jump, but the current position is not pushed to the call
3403 stack, meaning that after the new chain evaluation will continue at
3404 the last chain instead of the one containing the goto statement.
3405
3453 Using verdict statements.
3454
3455 # process packets from eth0 and the internal network in from_lan
3456 # chain, drop all packets from eth0 with different source addresses.
3457
3458 filter input iif eth0 ip saddr 192.168.0.0/24 jump from_lan
3459 filter input iif eth0 drop
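
The difference between jump, goto and return can be sketched with a small ruleset file. The table name demo, the interface names and the addresses are hypothetical, and the comments restate the behaviour described in the verdict list above rather than adding to it.

```nft
table inet demo {
    chain from_lan {
        # return: go back to the rule after the jump in the calling chain
        ip saddr 192.168.0.1 return
        # accept is an absolute verdict: evaluation of this hook ends here
        tcp dport 22 accept
    }

    chain input {
        type filter hook input priority 0; policy drop;

        # jump: the position is saved, so evaluation resumes with the next
        # rule here when from_lan ends or issues return
        iifname "eth0" jump from_lan

        # goto: the position is not saved, so evaluation does not come back
        # here; with no chain left to return to, the base chain policy applies
        iifname "eth1" goto from_lan

        counter
    }
}
```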
3460
3461
3462 PAYLOAD STATEMENT
3463 payload_expression set value
3464
3465 The payload statement alters packet content. It can be used, for
3466 example, to set the IP DSCP (diffserv) header field or IPv6 flow labels.
3467
3468 Route some packets instead of bridging.
3469
3470 # redirect tcp:http from 192.168.0.0/16 to local machine for routing instead of bridging
3471 # assumes 00:11:22:33:44:55 is local MAC address.
3472 bridge input meta iif eth0 ip saddr 192.168.0.0/16 tcp dport 80 meta pkttype set unicast ether daddr set 00:11:22:33:44:55
3473
3474 Set IPv4 DSCP header field.
3475
3476 ip forward ip dscp set 42
3477
3478
3479 EXTENSION HEADER STATEMENT
3480 extension_header_expression set value
3481
3482 The extension header statement alters packet content in variable-sized
3483 headers. This can currently be used to alter the TCP Maximum Segment
3484 Size of packets, similar to the iptables TCPMSS target.
3485
3486 Change TCP MSS.
3487
3488 tcp flags syn tcp option maxseg size set 1360
3489 # set a size based on route information:
3490 tcp flags syn tcp option maxseg size set rt mtu
3491
3492
3493 LOG STATEMENT
3494 log [prefix quoted_string] [level syslog-level] [flags log-flags]
3495 log group nflog_group [prefix quoted_string] [queue-threshold value] [snaplen size]
3496 log level audit
3497
3498 The log statement enables logging of matching packets. When this
3499 statement is used from a rule, the Linux kernel will print some
3500 information on all matching packets, such as header fields, via the
3501 kernel log (where it can be read with dmesg(1) or read in the syslog).
3502
3503 In the second form of invocation (if nflog_group is specified), the
3504 Linux kernel will pass the packet to nfnetlink_log which will send the
3505 log through a netlink socket to the specified group. One userspace
3506 process may subscribe to the group to receive the logs; see ulogd(8)
3507 for the Netfilter userspace log daemon, and the libnetfilter_log
3508 documentation for details in case you would like to develop a custom
3509 program to digest your logs.
3510
3511 In the third form of invocation (if level audit is specified), the
3512 Linux kernel writes a message into the audit buffer suitably formatted
3513 for reading with auditd. Therefore no further formatting options (such
3514 as prefix or flags) are allowed in this mode.
3515
3516 This is a non-terminating statement, so the rule evaluation continues
3517 after the packet is logged.
3518
3519 Table 59. log statement options
3520 ┌────────────────┬─────────────────────┬───────────────────┐
3521 │Keyword │ Description │ Type │
3522 ├────────────────┼─────────────────────┼───────────────────┤
3523 │ │ │ │
3524 │prefix │ Log message prefix │ quoted string │
3525 ├────────────────┼─────────────────────┼───────────────────┤
3526 │ │ │ │
3527 │level │ Syslog level of │ string: emerg, │
3528 │ │ logging │ alert, crit, err, │
3529 │ │ │ warn [default], │
3530 │ │ │ notice, info, │
3531 │ │ │ debug, audit │
3532 ├────────────────┼─────────────────────┼───────────────────┤
3533 │ │ │ │
3534 │group │ NFLOG group to send │ unsigned integer │
3535 │ │ messages to │ (16 bit) │
3536 ├────────────────┼─────────────────────┼───────────────────┤
3537 │ │ │ │
3538 │snaplen │ Length of packet │ unsigned integer │
3539 │ │ payload to include │ (32 bit) │
3540 │ │ in netlink message │ │
3541 ├────────────────┼─────────────────────┼───────────────────┤
3542 │ │ │ │
3543 │queue-threshold │ Number of packets │ unsigned integer │
3544 │ │ to queue inside the │ (32 bit) │
3545 │ │ kernel before │ │
3546 │ │ sending them to │ │
3547 │ │ userspace │ │
3548 └────────────────┴─────────────────────┴───────────────────┘
3549
3550 Table 60. log-flags
3551 ┌─────────────┬───────────────────────────┐
3552 │Flag │ Description │
3553 ├─────────────┼───────────────────────────┤
3554 │ │ │
3555 │tcp sequence │ Log TCP sequence numbers. │
3556 ├─────────────┼───────────────────────────┤
3557 │ │ │
3558 │tcp options │ Log options from the TCP │
3559 │ │ packet header. │
3560 ├─────────────┼───────────────────────────┤
3561 │ │ │
3562 │ip options │ Log options from the │
3563 │ │ IP/IPv6 packet header. │
3564 ├─────────────┼───────────────────────────┤
3565 │ │ │
3566 │skuid │ Log the userid of the │
3567 │ │ process which generated │
3568 │ │ the packet. │
3569 ├─────────────┼───────────────────────────┤
3570 │ │ │
3571 │ether │ Decode MAC addresses and │
3572 │ │ protocol. │
3573 ├─────────────┼───────────────────────────┤
3574 │ │ │
3575 │all │ Enable all log flags │
3576 │ │ listed above. │
3577 └─────────────┴───────────────────────────┘
3578
3579 Using log statement.
3580
3581 # log the UID which generated the packet and ip options
3582 ip filter output log flags skuid flags ip options
3583
3584 # log the tcp sequence numbers and tcp options from the TCP packet
3585 ip filter output log flags tcp sequence,options
3586
3587 # enable all supported log flags
3588 ip6 filter output log flags all
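
The examples above all use the first form of the log statement. The following sketch covers the other two forms as well; the group number, prefixes, snaplen and port are assumptions made for illustration.

```nft
# first form with an explicit prefix and syslog level
ip filter input ct state invalid log prefix "invalid: " level info

# second form: pass packets to nfnetlink_log group 2, 128 bytes of payload each
ip filter input tcp dport 22 log group 2 prefix "ssh: " snaplen 128

# third form: audit log, no prefix or flags allowed
ip filter input tcp dport 22 log level audit
```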
3589
3590
3591 REJECT STATEMENT
3592 reject [ with REJECT_WITH ]
3593
3594 REJECT_WITH := icmp icmp_code |
3595 icmpv6 icmpv6_code |
3596 icmpx icmpx_code |
3597 tcp reset
3598
3599 A reject statement sends back an error packet in response to the
3600 matched packet. Otherwise it is equivalent to drop, so it is a
3601 terminating statement that ends rule traversal. This statement is only
3602 valid in base chains using the input, forward or output hooks, and in
3603 user-defined chains which are only called from those chains.
3604
3605 Table 61. Different ICMP reject variants are meant for use in different
3606 table families
3607 ┌────────┬────────┬─────────────┐
3608 │Variant │ Family │ Type │
3609 ├────────┼────────┼─────────────┤
3610 │ │ │ │
3611 │icmp │ ip │ icmp_code │
3612 ├────────┼────────┼─────────────┤
3613 │ │ │ │
3614 │icmpv6 │ ip6 │ icmpv6_code │
3615 ├────────┼────────┼─────────────┤
3616 │ │ │ │
3617 │icmpx │ inet │ icmpx_code │
3618 └────────┴────────┴─────────────┘
3619
3620 For a description of the different types and a list of supported
3621 keywords, refer to the DATA TYPES section above. The common default
3622 reject value is port-unreachable.
3623
3624 Note that in the bridge family, the reject statement is only allowed in
3625 base chains which hook into input or prerouting.
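
Using the reject statement. This is a minimal sketch: the table and
chain names and the port numbers are illustrative assumptions. The
keyword-only form follows the synopsis above; some nft releases may
expect the type keyword before the code instead (e.g. reject with icmpx
type admin-prohibited).

table inet filter {
    chain input {
        type filter hook input priority filter; policy accept;
        # answer telnet connection attempts with a TCP RST
        tcp dport 23 reject with tcp reset
        # family-independent ICMP error for use in inet tables
        udp dport 161 reject with icmpx admin-prohibited
        # no argument given: the default reject value is used
        tcp dport 113 reject
    }
}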
3626
3627 COUNTER STATEMENT
3628 A counter statement sets the hit count of packets along with the number
3629 of bytes.
3630
3631 counter packets number bytes number
3632 counter { packets number | bytes number }
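
Using the counter statement. The rules below are a sketch in the
abbreviated rule notation used elsewhere in this section; the filter
table, the input chain and the port number are assumptions.

# count packets and bytes of accepted ssh traffic
ip filter input tcp dport 22 counter accept

# same rule, with explicit initial counter values
ip filter input tcp dport 22 counter packets 0 bytes 0 accept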
3633
3634 CONNTRACK STATEMENT
3635 The conntrack statement can be used to set the conntrack mark and
3636 conntrack labels.
3637
3638 ct {mark | event | label | zone} set value
3639
3640 The ct statement sets meta data associated with a connection. The zone
3641 id has to be assigned before a conntrack lookup takes place, i.e. this
3642 has to be done in prerouting and possibly output (if locally generated
3643 packets need to be placed in a distinct zone), with a hook priority of
3644 raw (-300).
3645
3646 Unlike iptables, where the helper assignment happens in the raw table,
3647 the helper needs to be assigned after a conntrack entry has been found,
3648 i.e. it will not work when used with hook priorities equal to or before
3649 -200.
3650
3651 Table 62. Conntrack statement types
3652 ┌────────┬─────────────────────┬──────────────────┐
3653 │Keyword │ Description │ Value │
3654 ├────────┼─────────────────────┼──────────────────┤
3655 │ │ │ │
3656 │event │ conntrack event │ bitmask, integer │
3657 │ │ bits │ (32 bit) │
3658 ├────────┼─────────────────────┼──────────────────┤
3659 │ │ │ │
3660 │helper │ name of ct helper │ quoted string │
3661 │ │ object to assign to │ │
3662 │ │ the connection │ │
3663 ├────────┼─────────────────────┼──────────────────┤
3664 │ │ │ │
3665 │mark │ Connection tracking │ mark │
3666 │ │ mark │ │
3667 ├────────┼─────────────────────┼──────────────────┤
3668 │ │ │ │
3669 │label │ Connection tracking │ label │
3670 │ │ label │ │
3671 ├────────┼─────────────────────┼──────────────────┤
3672 │ │ │ │
3673 │zone │ conntrack zone │ integer (16 bit) │
3674 └────────┴─────────────────────┴──────────────────┘
3675
3676 Save packet nfmark in conntrack.
3677
3678 ct mark set meta mark
3679
3680 Set zone mapped via interface.
3681
3682 table inet raw {
3683 chain prerouting {
3684 type filter hook prerouting priority raw;
3685 ct zone set iif map { "eth1" : 1, "veth1" : 2 }
3686 }
3687 chain output {
3688 type filter hook output priority raw;
3689 ct zone set oif map { "eth1" : 1, "veth1" : 2 }
3690 }
3691 }
3692
3693 Restrict events reported by ctnetlink.
3694
3695 ct event set new,related,destroy
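
Assign a helper to a connection. This is a sketch only: the helper
object name ftp-standard, the table and chain names and the port are
illustrative assumptions. The chain uses priority filter, which comes
after -200 as required for helper assignment.

table inet filter {
    ct helper ftp-standard {
        type "ftp" protocol tcp
    }
    chain prerouting {
        type filter hook prerouting priority filter; policy accept;
        # attach the helper object to connections to the ftp control port
        tcp dport 21 ct helper set "ftp-standard"
    }
}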
3696
3697
3698 NOTRACK STATEMENT
3699 The notrack statement disables connection tracking for certain
3700 packets.
3701
3702 notrack
3703
3704 Note that for this statement to be effective, it has to be applied to
3705 packets before a conntrack lookup happens. Therefore, it needs to sit
3706 in a chain with either prerouting or output hook and a hook priority of
3707 -300 (raw) or less.
3708
3709 See SYNPROXY STATEMENT for an example usage.
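
A smaller sketch, assuming a local DNS server whose UDP traffic should
bypass connection tracking. Table and chain names and the port are
illustrative assumptions.

table ip raw {
    chain prerouting {
        type filter hook prerouting priority raw; policy accept;
        # incoming queries are not tracked
        udp dport 53 notrack
    }
    chain output {
        type filter hook output priority raw; policy accept;
        # locally generated replies are not tracked either
        udp sport 53 notrack
    }
}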
3710
3711 META STATEMENT
3712 A meta statement sets the value of a meta expression. The existing meta
3713 fields are: priority, mark, pkttype, nftrace.
3714
3715 meta {mark | priority | pkttype | nftrace} set value
3716
3717 A meta statement sets meta data associated with a packet.
3718
3719 Table 63. Meta statement types
3720 ┌─────────┬─────────────────────┬───────────┐
3721 │Keyword │ Description │ Value │
3722 ├─────────┼─────────────────────┼───────────┤
3723 │ │ │ │
3724 │priority │ TC packet priority │ tc_handle │
3725 ├─────────┼─────────────────────┼───────────┤
3726 │ │ │ │
3727 │mark │ Packet mark │ mark │
3728 ├─────────┼─────────────────────┼───────────┤
3729 │ │ │ │
3730 │pkttype │ packet type │ pkt_type │
3731 ├─────────┼─────────────────────┼───────────┤
3732 │ │ │ │
3733 │nftrace │ ruleset packet │ 0, 1 │
3734 │ │ tracing on/off. Use │ │
3735 │ │ monitor trace │ │
3736 │ │ command to watch │ │
3737 │ │ traces │ │
3738 └─────────┴─────────────────────┴───────────┘
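
Using the meta statement. The rules below are a sketch in the
abbreviated rule notation used elsewhere in this section; the mark
value, the port and the address are assumptions.

# set the packet mark on outgoing ssh traffic
ip filter output tcp dport 22 meta mark set 0x1

# enable ruleset tracing for packets from host 10.0.0.1
ip filter input ip saddr 10.0.0.1 meta nftrace set 1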
3739
3740 LIMIT STATEMENT
3741 limit rate [over] packet_number / TIME_UNIT [burst packet_number packets]
3742 limit rate [over] byte_number BYTE_UNIT / TIME_UNIT [burst byte_number BYTE_UNIT]
3743
3744 TIME_UNIT := second | minute | hour | day
3745 BYTE_UNIT := bytes | kbytes | mbytes
3746
3747 A limit statement matches at a limited rate using a token bucket
3748 filter. A rule using this statement will match until this limit is
3749 reached. It can be used in combination with the log statement to give
3750 limited logging. The optional over keyword makes it match over the
3751 specified rate. The default burst is 5. If you specify burst, it must be
3752 a non-zero value.
3753
3754 Table 64. limit statement values
3755 ┌──────────────┬───────────────────┬──────────────────┐
3756 │Value │ Description │ Type │
3757 ├──────────────┼───────────────────┼──────────────────┤
3758 │ │ │ │
3759 │packet_number │ Number of packets │ unsigned integer │
3760 │ │ │ (32 bit) │
3761 ├──────────────┼───────────────────┼──────────────────┤
3762 │ │ │ │
3763 │byte_number │ Number of bytes │ unsigned integer │
3764 │ │ │ (32 bit) │
3765 └──────────────┴───────────────────┴──────────────────┘
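
Using the limit statement. The rules below are a sketch; the rates,
ports and log prefix are illustrative assumptions.

# accept at most 10 echo requests per second, drop the rest
ip filter input icmp type echo-request limit rate 10/second accept
ip filter input icmp type echo-request drop

# log new ssh connections, at most 5 per minute with a burst of 10 packets
ip filter input tcp dport 22 ct state new limit rate 5/minute burst 10 packets log prefix "ssh: "

# drop traffic that exceeds 10 mbytes per second
ip filter input limit rate over 10 mbytes/second drop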
3766
3767 NAT STATEMENTS
3768 snat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
3769 dnat [[ip | ip6] to] ADDR_SPEC [:PORT_SPEC] [FLAGS]
3770 masquerade [to :PORT_SPEC] [FLAGS]
3771 redirect [to :PORT_SPEC] [FLAGS]
3772
3773 ADDR_SPEC := address | address - address
3774 PORT_SPEC := port | port - port
3775
3776 FLAGS := FLAG [, FLAGS]
3777 FLAG := persistent | random | fully-random
3778
3779 The nat statements are only valid in chains of type nat.
3780
3781 The snat and masquerade statements specify that the source address of
3782 the packet should be modified. While snat is only valid in the
3783 postrouting and input chains, masquerade makes sense only in
3784 postrouting. The dnat and redirect statements are only valid in the
3785 prerouting and output chains; they specify that the destination address
3786 of the packet should be modified. You can use non-base chains which are
3787 called from base chains of nat chain type too. All future packets in
3788 this connection will also be mangled, and rules should cease being
3789 examined.
3790
3791 The masquerade statement is a special form of snat which always uses
3792 the outgoing interface’s IP address to translate to. It is particularly
3793 useful on gateways with dynamic (public) IP addresses.
3794
3795 The redirect statement is a special form of dnat which always
3796 translates the destination address to the local host’s one. It comes in
3797 handy if one only wants to alter the destination port of incoming
3798 traffic on different interfaces.
3799
3800 When used in the inet family (available with kernel 5.2), the dnat and
3801 snat statements require the use of the ip or ip6 keyword if an
3802 address is provided; see the examples below.
3803
3804 Before kernel 4.18 nat statements require both prerouting and
3805 postrouting base chains to be present since otherwise packets on the
3806 return path won’t be seen by netfilter and therefore no reverse
3807 translation will take place.
3808
3809 Table 65. NAT statement values
3810 ┌───────────┬─────────────────────┬─────────────────────┐
3811 │Expression │ Description │ Type │
3812 ├───────────┼─────────────────────┼─────────────────────┤
3813 │ │ │ │
3814 │address │ Specifies that the │ ipv4_addr, │
3815 │ │ source/destination │ ipv6_addr, e.g. │
3816 │ │ address of the │ abcd::1234, or you │
3817 │ │ packet should be │ can use a mapping, │
3818 │ │ modified. You may │ e.g. meta mark map │
3819 │ │ specify a mapping │ { 10 : 192.168.1.2, │
3820 │ │ to relate a list of │ 20 : 192.168.1.3 } │
3821 │ │ tuples composed of │ │
3822 │ │ arbitrary │ │
3823 │ │ expression key with │ │
3824 │ │ address value. │ │
3825 ├───────────┼─────────────────────┼─────────────────────┤
3826 │ │ │ │
3827 │port │ Specifies that the │ port number (16 │
3828 │ │ source/destination │ bit) │
3829 │ │ port of the │ │
3830 │ │ packet should be │ │
3831 │ │ modified. │ │
3832 └───────────┴─────────────────────┴─────────────────────┘
3833
3834 Table 66. NAT statement flags
3835 ┌─────────────┬─────────────────────────────┐
3836 │Flag │ Description │
3837 ├─────────────┼─────────────────────────────┤
3838 │ │ │
3839 │persistent │ Gives a client the same │
3840 │ │ source-/destination-address │
3841 │ │ for each connection. │
3842 ├─────────────┼─────────────────────────────┤
3843 │ │ │
3844 │random │ In kernel 5.0 and newer │
3845 │ │ this is the same as │
3846 │ │ fully-random. In earlier │
3847 │ │ kernels the port mapping │
3848 │ │ will be randomized using a │
3849 │ │ seeded MD5 hash mix using │
3850 │ │ source and destination │
3851 │ │ address and destination │
3852 │ │ port. │
3853 ├─────────────┼─────────────────────────────┤
3854 │ │ │
3855 │fully-random │ If used then port mapping │
3856 │ │ is generated based on a │
3857 │ │ 32-bit pseudo-random │
3858 │ │ algorithm. │
3859 └─────────────┴─────────────────────────────┘
3860
3861 Using NAT statements.
3862
3863 # create a suitable table/chain setup for all further examples
3864 add table nat
3865 add chain nat prerouting { type nat hook prerouting priority dstnat; }
3866 add chain nat postrouting { type nat hook postrouting priority srcnat; }
3867
3868 # translate source addresses of all packets leaving via eth0 to address 1.2.3.4
3869 add rule nat postrouting oif eth0 snat to 1.2.3.4
3870
3871 # redirect all traffic entering via eth0 to destination address 192.168.1.120
3872 add rule nat prerouting iif eth0 dnat to 192.168.1.120
3873
3874 # translate source addresses of all packets leaving via eth0 to whatever
3875 # locally generated packets would use as source to reach the same destination
3876 add rule nat postrouting oif eth0 masquerade
3877
3878 # redirect incoming TCP traffic for port 22 to port 2222
3879 add rule nat prerouting tcp dport 22 redirect to :2222
3880
3881 # inet family:
3882 # handle ip dnat:
3883 add rule inet nat prerouting dnat ip to 10.0.2.99
3884 # handle ip6 dnat:
3885 add rule inet nat prerouting dnat ip6 to fe80::dead
3886 # this masquerades both ipv4 and ipv6:
3887 add rule inet nat postrouting meta oif ppp0 masquerade
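
Further sketches that combine a port specification, a mapping and a
flag. They assume the nat table and chains created above; the
addresses, ports and mark values are illustrative.

# dnat with a port: send TCP port 80 arriving via eth0 to 192.168.1.120 port 8080
add rule nat prerouting iif eth0 tcp dport 80 dnat to 192.168.1.120:8080

# select the dnat target address from the packet mark (mapping as in Table 65)
add rule nat prerouting dnat to meta mark map { 10 : 192.168.1.2, 20 : 192.168.1.3 }

# masquerade with the fully-random flag
add rule nat postrouting oif eth0 masquerade fully-random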
3888
3889
3890 TPROXY STATEMENT
3891 Tproxy redirects the packet to a local socket without changing the
3892 packet header in any way. If any of the arguments is missing, the data
3893 of the incoming packet is used as parameter. The tproxy statement
3894 requires a match that ensures the transport protocol header is
3895 present.
3896
3897 tproxy to address:port
3898 tproxy to {address | :port}
3899
3900 This syntax can be used in ip/ip6 tables where network layer protocol
3901 is obvious. Either IP address or port can be specified, but at least
3902 one of them is necessary.
3903
3904 tproxy {ip | ip6} to address[:port]
3905 tproxy to :port
3906
3907 This syntax can be used in inet tables. The ip/ip6 parameter defines
3908 the family the rule will match. The address parameter must be of this
3909 family. When only port is defined, the address family should not be
3910 specified. In this case the rule will match for both families.
3911
3912 Table 67. tproxy attributes
3913 ┌────────┬────────────────────────────┐
3914 │Name │ Description │
3915 ├────────┼────────────────────────────┤
3916 │ │ │
3917 │address │ IP address the listening │
3918 │ │ socket with IP_TRANSPARENT │
3919 │ │ option is bound to. │
3920 ├────────┼────────────────────────────┤
3921 │ │ │
3922 │port │ Port the listening socket │
3923 │ │ with IP_TRANSPARENT option │
3924 │ │ is bound to. │
3925 └────────┴────────────────────────────┘
3926
3927 Example ruleset for tproxy statement.
3928
3929 table ip x {
3930 chain y {
3931 type filter hook prerouting priority mangle; policy accept;
3932 tcp dport ntp tproxy to 1.1.1.1
3933 udp dport ssh tproxy to :2222
3934 }
3935 }
3936 table ip6 x {
3937 chain y {
3938 type filter hook prerouting priority mangle; policy accept;
3939 tcp dport ntp tproxy to [dead::beef]
3940 udp dport ssh tproxy to :2222
3941 }
3942 }
3943 table inet x {
3944 chain y {
3945 type filter hook prerouting priority mangle; policy accept;
3946 tcp dport 321 tproxy to :ssh
3947 tcp dport 99 tproxy ip to 1.1.1.1:999
3948 udp dport 155 tproxy ip6 to [dead::beef]:smux
3949 }
3950 }
3951
3952
3953 SYNPROXY STATEMENT
3954 This statement processes the TCP three-way handshake in parallel in
3955 netfilter context to protect either a local or a backend system. This
3956 statement requires connection tracking because sequence numbers need to
3957 be translated.
3958
3959 synproxy [mss mss_value] [wscale wscale_value] [SYNPROXY_FLAGS]
3960
3961 Table 68. synproxy statement attributes
3962 ┌───────┬────────────────────────────┐
3963 │Name │ Description │
3964 ├───────┼────────────────────────────┤
3965 │ │ │
3966 │mss │ Maximum segment size │
3967 │ │ announced to clients. This │
3968 │ │ must match the backend. │
3969 ├───────┼────────────────────────────┤
3970 │ │ │
3971 │wscale │ Window scale announced to │
3972 │ │ clients. This must match │
3973 │ │ the backend. │
3974 └───────┴────────────────────────────┘
3975
3976 Table 69. synproxy statement flags
3977 ┌──────────┬────────────────────────────┐
3978 │Flag │ Description │
3979 ├──────────┼────────────────────────────┤
3980 │ │ │
3981 │sack-perm │ Pass client selective │
3982 │ │ acknowledgement option to │
3983 │ │ backend (will be disabled │
3984 │ │ if not present). │
3985 ├──────────┼────────────────────────────┤
3986 │ │ │
3987 │timestamp │ Pass client timestamp │
3988 │ │ option to backend (will be │
3989 │ │ disabled if not present, │
3990 │ │ also needed for selective │
3991 │ │ acknowledgement and window │
3992 │ │ scaling). │
3993 └──────────┴────────────────────────────┘
3994
3995 Example ruleset for synproxy statement.
3996
3997 Determine tcp options used by backend, from an external system
3998
3999 tcpdump -pni eth0 -c 1 'tcp[tcpflags] == (tcp-syn|tcp-ack)' \
4000 port 80 &
4001 telnet 192.0.2.42 80
4002 18:57:24.693307 IP 192.0.2.42.80 > 192.0.2.43.48757:
4003 Flags [S.], seq 360414582, ack 788841994, win 14480,
4004 options [mss 1460,sackOK,
4005 TS val 1409056151 ecr 9690221,
4006 nop,wscale 9],
4007 length 0
4008
4009 Switch tcp_loose mode off, so conntrack will mark out-of-flow packets as state INVALID.
4010
4011 echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose
4012
4013 Make SYN packets untracked.
4014
4015 table ip x {
4016 chain y {
4017 type filter hook prerouting priority raw; policy accept;
4018 tcp flags syn notrack
4019 }
4020 }
4021
4022 Catch UNTRACKED (SYN packets) and INVALID (3WHS ACK packets) states and send
4023 them to SYNPROXY. This rule will respond to SYN packets with SYN+ACK
4024 syncookies, create ESTABLISHED for valid client response (3WHS ACK packets) and
4025 drop incorrect cookies. Flags combinations not expected during 3WHS will not
4026 match and continue (e.g. SYN+FIN, SYN+ACK). Finally, drop invalid packets;
4027 these will be out-of-flow packets that were not matched by SYNPROXY.
4028
4029 table ip x {
4030 chain z {
4031 type filter hook input priority filter; policy accept;
4032 ct state invalid, untracked synproxy mss 1460 wscale 9 timestamp sack-perm
4033 ct state invalid drop
4034 }
4035 }
4036
4037
4038 FLOW STATEMENT
4039 A flow statement lets you select which flows to accelerate by
4040 forwarding them through a layer 3 network stack bypass. You have to
4041 specify the name of the flowtable to which this flow is offloaded.
4042
4043 flow add @flowtable
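
Example ruleset for the flow statement. This is a sketch: the table,
chain and flowtable names and the devices eth0 and eth1 are
illustrative assumptions.

table inet x {
    flowtable f {
        hook ingress priority 0; devices = { eth0, eth1 };
    }
    chain y {
        type filter hook forward priority filter; policy accept;
        # offload forwarded TCP flows to flowtable f
        meta l4proto tcp flow add @f
    }
}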
4044
4045 QUEUE STATEMENT
4046 This statement passes the packet to userspace using the nfnetlink_queue
4047 handler. The packet is put into the queue identified by its 16-bit
4048 queue number. Userspace can inspect and modify the packet if desired.
4049 Userspace must then drop or re-inject the packet into the kernel. See
4050 libnetfilter_queue documentation for details.
4051
4052 queue [flags QUEUE_FLAGS] [to queue_number]
4053 queue [flags QUEUE_FLAGS] [to queue_number_from - queue_number_to]
4054 queue [flags QUEUE_FLAGS] [to QUEUE_EXPRESSION ]
4055
4056 QUEUE_FLAGS := QUEUE_FLAG [, QUEUE_FLAGS]
4057 QUEUE_FLAG := bypass | fanout
4058 QUEUE_EXPRESSION := numgen | hash | symhash | MAP STATEMENT
4059
4060 QUEUE_EXPRESSION can be used to compute a queue number at run-time with
4061 the hash or numgen expressions. It also allows the map statement to be
4062 used to assign fixed queue numbers based on external inputs such as the
4063 source ip address or interface names.
4064
4065 Table 70. queue statement values
4066 ┌──────────────────┬────────────────────┬──────────────────┐
4067 │Value │ Description │ Type │
4068 ├──────────────────┼────────────────────┼──────────────────┤
4069 │ │ │ │
4070 │queue_number │ Sets queue number, │ unsigned integer │
4071 │ │ default is 0. │ (16 bit) │
4072 ├──────────────────┼────────────────────┼──────────────────┤
4073 │ │ │ │
4074 │queue_number_from │ Sets initial queue │ unsigned integer │
4075 │ │ in the range, if │ (16 bit) │
4076 │ │ fanout is used. │ │
4077 ├──────────────────┼────────────────────┼──────────────────┤
4078 │ │ │ │
4079 │queue_number_to │ Sets closing queue │ unsigned integer │
4080 │ │ in the range, if │ (16 bit) │
4081 │ │ fanout is used. │ │
4082 └──────────────────┴────────────────────┴──────────────────┘
4083
4084 Table 71. queue statement flags
4085 ┌───────┬────────────────────────────┐
4086 │Flag │ Description │
4087 ├───────┼────────────────────────────┤
4088 │ │ │
4089 │bypass │ Let packets go through if │
4090 │ │ userspace application │
4091 │ │ cannot back off. Before │
4092 │ │ using this flag, read │
4093 │ │ libnetfilter_queue │
4094 │ │ documentation for │
4095 │ │ performance tuning │
4096 │ │ recommendations. │
4097 ├───────┼────────────────────────────┤
4098 │ │ │
4099 │fanout │ Distribute packets between │
4100 │ │ several queues. │
4101 └───────┴────────────────────────────┘
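
Using the queue statement. The rules below are alternative sketches
that follow the synopsis above; the ports, queue numbers and interface
names are illustrative assumptions.

# pass http traffic to queue 0, the default
ip filter input tcp dport 80 queue

# use queue 3 with the bypass flag
ip filter input tcp dport 80 queue flags bypass to 3

# distribute packets between queues 0 to 3
ip filter input tcp dport 80 queue flags fanout to 0-3

# compute the queue number at run-time with numgen
ip filter input tcp dport 80 queue to numgen inc mod 4

# assign a fixed queue number per input interface
ip filter input queue to iifname map { "eth0" : 0, "eth1" : 1 }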
4102
4103 DUP STATEMENT
4104 The dup statement is used to duplicate a packet and send the copy to a
4105 different destination.
4106
4107 dup to device
4108 dup to address device device
4109
4110 Table 72. Dup statement values
4111 ┌───────────┬─────────────────────┬─────────────────────┐
4112 │Expression │ Description │ Type │
4113 ├───────────┼─────────────────────┼─────────────────────┤
4114 │ │ │ │
4115 │address │ Specifies that the │ ipv4_addr, │
4116 │ │ copy of the packet │ ipv6_addr, e.g. │
4117 │ │ should be sent to a │ abcd::1234, or you │
4118 │ │ new gateway. │ can use a mapping, │
4119 │ │ │ e.g. ip saddr map { │
4120 │ │ │ 192.168.1.2 : │
4121 │ │ │ 10.1.1.1 } │
4122 ├───────────┼─────────────────────┼─────────────────────┤
4123 │ │ │ │
4124 │device │ Specifies that the │ string │
4125 │ │ copy should be │ │
4126 │ │ transmitted via │ │
4127 │ │ device. │ │
4128 └───────────┴─────────────────────┴─────────────────────┘
4129
4130 Using the dup statement.
4131
4132 # send to machine with ip address 10.2.3.4 on eth0
4133 ip filter forward dup to 10.2.3.4 device "eth0"
4134
4135 # copy raw frame to another interface
4136 netdev ingress dup to "eth0"
4137 dup to "eth0"
4138
4139 # combine with map dst addr to gateways
4140 dup to ip daddr map { 192.168.7.1 : "eth0", 192.168.7.2 : "eth1" }
4141
4142
4143 FWD STATEMENT
4144 The fwd statement is used to redirect a raw packet to another
4145 interface. It is only available in the netdev family ingress and egress
4146 hooks. It is similar to the dup statement except that no copy is made.
4147
4148 fwd to device
4149
4150 SET STATEMENT
4151 The set statement is used to dynamically add or update elements in a
4152 set from the packet path. The set setname must already exist in the
4153 given table and must have been created with one or both of the dynamic
4154 and the timeout flags. The dynamic flag is required if the set
4155 statement expression includes a stateful object. The timeout flag is
4156 implied if the set is created with a timeout, and is required if the
4157 set statement updates elements, rather than adding them. Furthermore,
4158 these sets should specify a maximum set size (to prevent memory
4159 exhaustion), and their elements should have a timeout (so their number
4160 will not grow indefinitely) either from the set definition or from the
4161 statement that adds or updates them. The set statement can be used,
4162 for example, to create dynamic blacklists.
4163
4164 {add | update} @setname { expression [timeout timeout] [comment string] }
4165
4166 Example for simple blacklist.
4167
4168 # declare a set, bound to table "filter", in family "ip".
4169 # Timeout and size are mandatory because we will add elements from packet path.
4170 # Entries will timeout after one minute, after which they might be
4171 # re-added if limit condition persists.
4172 nft add set ip filter blackhole \
4173 "{ type ipv4_addr; flags dynamic; timeout 1m; size 65536; }"
4174
4175 # declare a set to store the limit per saddr.
4176 # This must be separate from blackhole since the timeout is different
4177 nft add set ip filter flood \
4178 "{ type ipv4_addr; flags dynamic; timeout 10s; size 128000; }"
4179
4180 # whitelist internal interface.
4181 nft add rule ip filter input meta iifname "internal" accept
4182
4183 # drop packets coming from blacklisted ip addresses.
4184 nft add rule ip filter input ip saddr @blackhole counter drop
4185
4186 # add source ip addresses to the blacklist if more than 10 tcp connection
4187 # requests occurred per second and ip address.
4188 nft add rule ip filter input tcp flags syn tcp dport ssh \
4189 add @flood { ip saddr limit rate over 10/second } \
4190 add @blackhole { ip saddr } \
4191 drop
4192
4193 # inspect state of the sets.
4194 nft list set ip filter flood
4195 nft list set ip filter blackhole
4196
4197 # manually add two addresses to the blackhole.
4198 nft add element filter blackhole { 10.2.3.4, 10.23.1.42 }
4199
4200
4201 MAP STATEMENT
4202 The map statement is used to look up data based on some specific input
4203 key.
4204
4205 expression map { MAP_ELEMENTS }
4206
4207 MAP_ELEMENTS := MAP_ELEMENT [, MAP_ELEMENTS]
4208 MAP_ELEMENT := key : value
4209
4210 The key is a value returned by expression.
4211
4212 Using the map statement.
4213
4214 # select DNAT target based on TCP dport:
4215 # connections to port 80 are redirected to 192.168.1.100,
4216 # connections to port 8888 are redirected to 192.168.1.101
4217 nft add rule ip nat prerouting dnat tcp dport map { 80 : 192.168.1.100, 8888 : 192.168.1.101 }
4218
4219 # source address based SNAT:
4220 # packets from net 192.168.1.0/24 will appear as originating from 10.0.0.1,
4221 # packets from net 192.168.2.0/24 will appear as originating from 10.0.0.2
4222 nft add rule ip nat postrouting snat to ip saddr map { 192.168.1.0/24 : 10.0.0.1, 192.168.2.0/24 : 10.0.0.2 }
4223
4224
4225 VMAP STATEMENT
4226 The verdict map (vmap) statement works analogously to the map statement,
4227 but contains verdicts as values.
4228
4229 expression vmap { VMAP_ELEMENTS }
4230
4231 VMAP_ELEMENTS := VMAP_ELEMENT [, VMAP_ELEMENTS]
4232 VMAP_ELEMENT := key : verdict
4233
4234 Using the vmap statement.
4235
4236 # jump to different chains depending on layer 4 protocol type:
4237 nft add rule ip filter input ip protocol vmap { tcp : jump tcp-chain, udp : jump udp-chain , icmp : jump icmp-chain }
4238
4239
4241 These are some additional commands included in nft.
4242
4243 MONITOR
4244 The monitor command allows you to listen to Netlink events produced by
4245 the nf_tables subsystem. These are either related to creation and
4246 deletion of objects or to packets for which meta nftrace was enabled.
4247 When they occur, nft will print to stdout the monitored events in
4248 either JSON or native nft format.
4249
4250 monitor [new | destroy] MONITOR_OBJECT
4251 monitor trace
4252
4253 MONITOR_OBJECT := tables | chains | sets | rules | elements | ruleset
4254
4255 To filter events related to a concrete object, use one of the keywords
4256 in MONITOR_OBJECT.
4257
4258 To filter events related to a concrete action, use keyword new or
4259 destroy.
4260
4261 The second form of invocation takes no further options and exclusively
4262 prints events generated for packets with nftrace enabled.
4263
4264 Hit ^C to finish the monitor operation.
4265
4266 Listen to all events, report in native nft format.
4267
4268 % nft monitor
4269
4270 Listen to deleted rules, report in JSON format.
4271
4272 % nft -j monitor destroy rules
4273
4274 Listen to both new and destroyed chains, in native nft format.
4275
4276 % nft monitor chains
4277
4278 Listen to ruleset events such as table, chain, rule, set, counters and
4279 quotas, in native nft format.
4280
4281 % nft monitor ruleset
4282
4283 Trace incoming packets from host 10.0.0.1.
4284
4285 % nft add rule filter input ip saddr 10.0.0.1 meta nftrace set 1
4286 % nft monitor trace
4287
4288
4290 When an error is detected, nft shows the line(s) containing the error,
4291 the position of the erroneous parts in the input stream and marks up
4292 the erroneous parts using carets (^). If the error results from the
4293 combination of two expressions or statements, the part imposing the
4294 constraints which are violated is marked using tildes (~).
4295
4296 For errors returned by the kernel, nft cannot detect which parts of the
4297 input caused the error and the entire command is marked.
4298
4299 Error caused by single incorrect expression.
4300
4301 <cmdline>:1:19-22: Error: Interface does not exist
4302 filter output oif eth0
4303 ^^^^
4304
4305 Error caused by invalid combination of two expressions.
4306
4307 <cmdline>:1:28-36: Error: Right hand side of relational expression (==) must be constant
4308 filter output tcp dport == tcp dport
4309 ~~ ^^^^^^^^^
4310
4311 Error returned by the kernel.
4312
4313 <cmdline>:0:0-23: Error: Could not process rule: Operation not permitted
4314 filter output oif wlan0
4315 ^^^^^^^^^^^^^^^^^^^^^^^
4316
4317
4319 On success, nft exits with a status of 0. Unspecified errors cause it
4320 to exit with a status of 1, memory allocation errors with a status of
4321 2, and failure to open a Netlink socket with a status of 3.
4322
4324 libnftables(3), libnftables-json(5), iptables(8), ip6tables(8), arptables(8), ebtables(8), ip(8), tc(8)
4325
4326 There is an official wiki at: https://wiki.nftables.org
4327
4329 nftables was written by Patrick McHardy and Pablo Neira Ayuso, among
4330 many other contributors from the Netfilter community.
4331
4333 Copyright © 2008-2014 Patrick McHardy <kaber@trash.net>
4334 Copyright © 2013-2018 Pablo Neira Ayuso <pablo@netfilter.org>
4335
4336 nftables is free software; you can redistribute it and/or modify it
4337 under the terms of the GNU General Public License version 2 as
4338 published by the Free Software Foundation.
4339
4340 This documentation is licensed under the terms of the Creative Commons
4341 Attribution-ShareAlike 4.0 license, CC BY-SA 4.0
4342 http://creativecommons.org/licenses/by-sa/4.0/.
4343
4344
4345
4346 11/18/2021 NFT(8)