1 ovn-northd(8)                 Open vSwitch Manual                 ovn-northd(8)
2
3
4
5 NAME
6 ovn-northd - Open Virtual Network central control daemon
7
8 SYNOPSIS
9 ovn-northd [options]
10
11 DESCRIPTION
12 ovn-northd is a centralized daemon responsible for translating the
13 high-level OVN configuration into logical configuration consumable by
14 daemons such as ovn-controller. It translates the logical network con‐
15 figuration in terms of conventional network concepts, taken from the
16 OVN Northbound Database (see ovn-nb(5)), into logical datapath flows in
17 the OVN Southbound Database (see ovn-sb(5)) below it.
18
19 OPTIONS
20 --ovnnb-db=database
21 The OVSDB database containing the OVN Northbound Database. If
22 the OVN_NB_DB environment variable is set, its value is used as
23 the default. Otherwise, the default is unix:/ovnnb_db.sock.
24
25 --ovnsb-db=database
26 The OVSDB database containing the OVN Southbound Database. If
27 the OVN_SB_DB environment variable is set, its value is used as
28 the default. Otherwise, the default is unix:/ovnsb_db.sock.
29
30 database in the above options must be an OVSDB active or passive con‐
31 nection method, as described in ovsdb(7).
32
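For illustration, a hedged invocation sketch; the TCP endpoints and socket paths below are examples, not built-in defaults:

    $ # Connect to northbound/southbound ovsdb-servers over TCP.
    $ ovn-northd --ovnnb-db=tcp:192.168.0.10:6641 \
                 --ovnsb-db=tcp:192.168.0.10:6642
    $ # Or use local unix sockets (paths hypothetical).
    $ ovn-northd --ovnnb-db=unix:/var/run/ovn/ovnnb_db.sock \
                 --ovnsb-db=unix:/var/run/ovn/ovnsb_db.sock
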
33 Daemon Options
34 --pidfile[=pidfile]
35 Causes a file (by default, program.pid) to be created indicating
36 the PID of the running process. If the pidfile argument is not
37 specified, or if it does not begin with /, then it is created in
38 .
39
40 If --pidfile is not specified, no pidfile is created.
41
42 --overwrite-pidfile
43 By default, when --pidfile is specified and the specified pid‐
44 file already exists and is locked by a running process, the dae‐
45 mon refuses to start. Specify --overwrite-pidfile to cause it to
46 instead overwrite the pidfile.
47
48 When --pidfile is not specified, this option has no effect.
49
50 --detach
51 Runs this program as a background process. The process forks,
52 and in the child it starts a new session, closes the standard
53 file descriptors (which has the side effect of disabling logging
54 to the console), and changes its current directory to the root
55 (unless --no-chdir is specified). After the child completes its
56 initialization, the parent exits.
57
58 --monitor
59 Creates an additional process to monitor this program. If it
60 dies due to a signal that indicates a programming error (SIGA‐
61 BRT, SIGALRM, SIGBUS, SIGFPE, SIGILL, SIGPIPE, SIGSEGV, SIGXCPU,
62 or SIGXFSZ) then the monitor process starts a new copy of it. If
63 the daemon dies or exits for another reason, the monitor process
64 exits.
65
66 This option is normally used with --detach, but it also func‐
67 tions without it.
68
69 --no-chdir
70 By default, when --detach is specified, the daemon changes its
71 current working directory to the root directory after it
72 detaches. Otherwise, invoking the daemon from a carelessly cho‐
73 sen directory would prevent the administrator from unmounting
74 the file system that holds that directory.
75
76 Specifying --no-chdir suppresses this behavior, preventing the
77 daemon from changing its current working directory. This may be
78 useful for collecting core files, since it is common behavior to
79 write core dumps into the current working directory and the root
80 directory is not a good directory to use.
81
82 This option has no effect when --detach is not specified.
83
84 --no-self-confinement
85 By default this daemon will try to self-confine itself to work
86 with files under well-known directories whitelisted at build
87 time. It is better to stick with this default behavior and not
88 to use this flag unless some other access control mechanism is
89 used to confine the daemon. Note that in contrast to other
90 access control implementations that are typically enforced from
91 kernel-space (e.g. DAC or MAC), self-confinement is imposed from
92 the user-space daemon itself and hence should not be considered
93 a full confinement strategy, but instead should be viewed as an
94 additional layer of security.
95
96 --user=user:group
97 Causes this program to run as a different user specified in
98 user:group, thus dropping most of the root privileges. Short
99 forms user and :group are also allowed, with current user or
100 group assumed, respectively. Only daemons started by the root
101 user accept this argument.
102
103 On Linux, daemons will be granted CAP_IPC_LOCK and
104 CAP_NET_BIND_SERVICE before dropping root privileges. Daemons
105 that interact with a datapath, such as ovs-vswitchd, will be
106 granted three additional capabilities, namely CAP_NET_ADMIN,
107 CAP_NET_BROADCAST and CAP_NET_RAW. The capability change will
108 apply even if the new user is root.
109
110 On Windows, this option is not currently supported. For security
111 reasons, specifying this option will cause the daemon process
112 not to start.
113
114 Logging Options
115 -v[spec]
116 --verbose=[spec]
117 Sets logging levels. Without any spec, sets the log level for
118 every module and destination to dbg. Otherwise, spec is a list of
119 words separated by spaces or commas or colons, up to one from each
120 category below:
121
122 · A valid module name, as displayed by the vlog/list command
123 on ovs-appctl(8), limits the log level change to the speci‐
124 fied module.
125
126 · syslog, console, or file, to limit the log level change
127 only to the system log, to the console, or to a file,
128 respectively. (If --detach is specified, the daemon closes
129 its standard file descriptors, so logging to the console
130 will have no effect.)
131
132 On the Windows platform, syslog is accepted as a word and is
133 only useful along with the --syslog-target option (the word
134 has no effect otherwise).
135
136 · off, emer, err, warn, info, or dbg, to control the log
137 level. Messages of the given severity or higher will be
138 logged, and messages of lower severity will be filtered
139 out. off filters out all messages. See ovs-appctl(8) for a
140 definition of each log level.
141
142 Case is not significant within spec.
143
144 Regardless of the log levels set for file, logging to a file will
145 not take place unless --log-file is also specified (see below).
146
147 For compatibility with older versions of OVS, any is accepted as a
148 word but has no effect.
149
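As an illustrative sketch (the module choice is arbitrary), logging levels may be set at startup or adjusted on a running daemon via ovs-appctl(8):

    $ # Start with debug logging on the console.
    $ ovn-northd --verbose=console:dbg
    $ # Adjust levels at runtime and list the available modules.
    $ ovs-appctl -t ovn-northd vlog/set console:dbg
    $ ovs-appctl -t ovn-northd vlog/list
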
150 -v
151 --verbose
152 Sets the maximum logging verbosity level, equivalent to --ver‐
153 bose=dbg.
154
155 -vPATTERN:destination:pattern
156 --verbose=PATTERN:destination:pattern
157 Sets the log pattern for destination to pattern. Refer to
158 ovs-appctl(8) for a description of the valid syntax for pattern.
159
160 -vFACILITY:facility
161 --verbose=FACILITY:facility
162 Sets the RFC5424 facility of the log message. facility can be one
163 of kern, user, mail, daemon, auth, syslog, lpr, news, uucp, clock,
164 ftp, ntp, audit, alert, clock2, local0, local1, local2, local3,
165 local4, local5, local6 or local7. If this option is not specified,
166 daemon is used as the default for the local system syslog and
167 local0 is used while sending a message to the target provided via
168 the --syslog-target option.
169
170 --log-file[=file]
171 Enables logging to a file. If file is specified, then it is used
172 as the exact name for the log file. The default log file name used
173 if file is omitted is /var/log/ovn/program.log.
174
175 --syslog-target=host:port
176 Send syslog messages to UDP port on host, in addition to the sys‐
177 tem syslog. The host must be a numerical IP address, not a host‐
178 name.
179
180 --syslog-method=method
181 Specify method as the method for sending syslog messages to the
182 syslog daemon. The following forms are supported:
183
184 · libc, to use the libc syslog() function. The downside of
185 this option is that libc adds a fixed prefix to every mes‐
186 sage before it is actually sent to the syslog daemon over
187 the /dev/log UNIX domain socket.
188
189 · unix:file, to use a UNIX domain socket directly. It is
190 possible to specify an arbitrary message format with this
191 option. However, rsyslogd 8.9 and older versions use a
192 hard-coded parser function that limits UNIX domain socket
193 use. If you want to use an arbitrary message format with
194 older rsyslogd versions, use a UDP socket to a localhost
195 IP address instead.
196
197 · udp:ip:port, to use a UDP socket. With this method it is
198 possible to use an arbitrary message format even with
199 older rsyslogd. When sending syslog messages over a UDP
200 socket, extra precautions need to be taken: for example,
201 the syslog daemon needs to be configured to listen on the
202 specified UDP port, accidental iptables rules could
203 interfere with local syslog traffic, and some security
204 considerations apply to UDP sockets that do not apply to
205 UNIX domain sockets.
206
207 · null, to discard all messages logged to syslog.
208
209 The default is taken from the OVS_SYSLOG_METHOD environment vari‐
210 able; if it is unset, the default is libc.
211
212 PKI Options
213 PKI configuration is required in order to use SSL for the connections
214 to the Northbound and Southbound databases.
215
216 -p privkey.pem
217 --private-key=privkey.pem
218 Specifies a PEM file containing the private key used as
219 identity for outgoing SSL connections.
220
221 -c cert.pem
222 --certificate=cert.pem
223 Specifies a PEM file containing a certificate that certi‐
224 fies the private key specified on -p or --private-key to be
225 trustworthy. The certificate must be signed by the certifi‐
226 cate authority (CA) that the peer in SSL connections will
227 use to verify it.
228
229 -C cacert.pem
230 --ca-cert=cacert.pem
231 Specifies a PEM file containing the CA certificate for ver‐
232 ifying certificates presented to this program by SSL peers.
233 (This may be the same certificate that SSL peers use to
234 verify the certificate specified on -c or --certificate, or
235 it may be a different one, depending on the PKI design in
236 use.)
237
238 -C none
239 --ca-cert=none
240 Disables verification of certificates presented by SSL
241 peers. This introduces a security risk, because it means
242 that certificates cannot be verified to be those of known
243 trusted hosts.
244
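A hedged sketch of an SSL-enabled invocation; the file paths and endpoints are examples only:

    $ ovn-northd --ovnnb-db=ssl:192.168.0.10:6641 \
                 --ovnsb-db=ssl:192.168.0.10:6642 \
                 --private-key=/etc/ovn/privkey.pem \
                 --certificate=/etc/ovn/cert.pem \
                 --ca-cert=/etc/ovn/cacert.pem
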
245 Other Options
246 --unixctl=socket
247 Sets the name of the control socket on which program listens for
248 runtime management commands (see RUNTIME MANAGEMENT COMMANDS,
249 below). If socket does not begin with /, it is interpreted as
250 relative to . If --unixctl is not used at all, the default
251 socket is /program.pid.ctl, where pid is program’s process ID.
252
253 On Windows a local named pipe is used to listen for runtime man‐
254 agement commands. A file is created at the absolute path given
255 by socket or, if --unixctl is not used at all, a file named
256 program is created in the configured OVS_RUNDIR directory. The
257 file exists just to mimic the behavior of a Unix domain socket.
258
259 Specifying none for socket disables the control socket feature.
260
261
262
263 -h
264 --help
265 Prints a brief help message to the console.
266
267 -V
268 --version
269 Prints version information to the console.
270
271 RUNTIME MANAGEMENT COMMANDS
272 ovs-appctl can send commands to a running ovn-northd process. The cur‐
273 rently supported commands are described below.
274
275 exit Causes ovn-northd to gracefully terminate.
276
277 pause Pauses ovn-northd so that it stops processing any
278 Northbound and Southbound database changes.
279
280 resume Resumes ovn-northd so that it processes Northbound
281 and Southbound database contents and generates
282 logical flows.
283
284 is-paused
285 Returns "true" if ovn-northd is currently paused, "false"
286 otherwise.
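
For example, assuming the default control socket location:

    $ ovs-appctl -t ovn-northd is-paused
    false
    $ ovs-appctl -t ovn-northd pause
    $ ovs-appctl -t ovn-northd resume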
287
288 ACTIVE-STANDBY FOR HIGH AVAILABILITY
289 You may run ovn-northd more than once in an OVN deployment. OVN will
290 automatically ensure that only one of them is active at a time. If mul‐
291 tiple instances of ovn-northd are running and the active ovn-northd
292 fails, one of the hot standby instances of ovn-northd will automati‐
293 cally take over.
294
295 Active-Standby with multiple OVN DB servers
296 You may run multiple OVN DB servers in an OVN deployment with:
297
298 · OVN DB servers deployed in active/passive mode with one
299 active and multiple passive ovsdb-servers.
300
301 · ovn-northd also deployed on all these nodes, using
302 unix sockets to connect to the local OVN DB servers.
303
304 In such deployments, the ovn-northd instances on the passive nodes
305 process the DB changes and compute logical flows that are then dis‐
306 carded, because the passive ovsdb-servers do not allow write trans‐
307 actions. This results in unnecessary CPU usage.
308
309 With the help of the runtime management command pause, you can pause
310 ovn-northd on these nodes. When a passive node becomes active, you
311 can use the runtime management command resume so that ovn-northd
312 resumes processing the DB changes.
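
A minimal sketch of this workflow (socket locations assumed to be the defaults):

    $ # On every passive node, right after starting ovn-northd:
    $ ovs-appctl -t ovn-northd pause
    $ # Later, on the node whose ovsdb-server becomes active:
    $ ovs-appctl -t ovn-northd resume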
313
314 LOGICAL FLOW TABLE STRUCTURE
315 One of the main purposes of ovn-northd is to populate the Logical_Flow
316 table in the OVN_Southbound database. This section describes how
317 ovn-northd does this for switch and router logical datapaths.
318
319 Logical Switch Datapaths
320 Ingress Table 0: Admission Control and Ingress Port Security - L2
321
322 Ingress table 0 contains these logical flows:
323
324 · Priority 100 flows to drop packets with VLAN tags or mul‐
325 ticast Ethernet source addresses.
326
327 · Priority 50 flows that implement ingress port security
328 for each enabled logical port. For logical ports on which
329 port security is enabled, these match the inport and the
330 valid eth.src address(es) and advance only those packets
331 to the next flow table. For logical ports on which port
332 security is not enabled, these advance all packets that
333 match the inport.
334
335 There are no flows for disabled logical ports because the default-drop
336 behavior of logical flow tables causes packets that ingress from them
337 to be dropped.
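
For illustration (names hypothetical), the priority-50 flows above derive from the Logical_Switch_Port port_security column, e.g.:

    $ ovn-nbctl lsp-set-port-security lp1 "00:00:00:00:00:01 192.168.0.2"
    $ # Clearing the column disables port security for the port.
    $ ovn-nbctl lsp-set-port-security lp1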
338
339 Ingress Table 1: Ingress Port Security - IP
340
341 Ingress table 1 contains these logical flows:
342
343 · For each element in the port security set having one or
344 more IPv4 or IPv6 addresses (or both),
345
346 · Priority 90 flow to allow IPv4 traffic if it has
347 IPv4 addresses which match the inport, valid
348 eth.src and valid ip4.src address(es).
349
350 · Priority 90 flow to allow IPv4 DHCP discovery
351 traffic if it has a valid eth.src. This is neces‐
352 sary since DHCP discovery messages are sent from
353 the unspecified IPv4 address (0.0.0.0) before an
354 IPv4 address has been assigned.
355
356 · Priority 90 flow to allow IPv6 traffic if it has
357 IPv6 addresses which match the inport, valid
358 eth.src and valid ip6.src address(es).
359
360 · Priority 90 flow to allow IPv6 DAD (Duplicate
361 Address Detection) traffic if it has a valid
362 eth.src. This is necessary since DAD requires
363 joining a multicast group and sending neighbor
364 solicitations for the newly assigned address.
365 Since no address is yet assigned, these are sent
366 from the unspecified IPv6 address (::).
367
368 · Priority 80 flow to drop IP (both IPv4 and IPv6)
369 traffic which matches the inport and valid eth.src.
370
371 · One priority-0 fallback flow that matches all packets and
372 advances to the next table.
373
374 Ingress Table 2: Ingress Port Security - Neighbor discovery
375
376 Ingress table 2 contains these logical flows:
377
378 · For each element in the port security set,
379
380 · Priority 90 flow to allow ARP traffic which
381 matches the inport and valid eth.src and arp.sha.
382 If the element has one or more IPv4 addresses, it
383 also matches the valid arp.spa.
384
385 · Priority 90 flow to allow IPv6 Neighbor Solicita‐
386 tion and Advertisement traffic which matches the
387 inport, valid eth.src and nd.sll/nd.tll. If the
388 element has one or more IPv6 addresses, it also
389 matches the valid nd.target address(es) for
390 Neighbor Advertisement traffic.
391
392 · Priority 80 flow to drop ARP and IPv6 Neighbor
393 Solicitation and Advertisement traffic which
394 matches the inport and valid eth.src.
395
396 · One priority-0 fallback flow that matches all packets and
397 advances to the next table.
398
399 Ingress Table 3: from-lport Pre-ACLs
400
401 This table prepares flows for possible stateful ACL processing in
402 ingress table ACLs. It contains a priority-0 flow that simply moves
403 traffic to the next table. If stateful ACLs are used in the logical
404 datapath, a priority-100 flow is added that sets a hint (with reg0[0] =
405 1; next;) for table Pre-stateful to send IP packets to the connection
406 tracker before eventually advancing to ingress table ACLs. If special
407 ports such as router ports or localnet ports can’t use ct(), a prior‐
408 ity-110 flow is added to skip over stateful ACLs.
409
410 Ingress Table 4: Pre-LB
411
412 This table prepares flows for possible stateful load balancing process‐
413 ing in ingress table LB and Stateful. It contains a priority-0 flow
414 that simply moves traffic to the next table. Moreover it contains a
415 priority-110 flow to move IPv6 Neighbor Discovery traffic to the next
416 table. If load balancing rules with virtual IP addresses (and ports)
417 are configured in OVN_Northbound database for a logical switch data‐
418 path, a priority-100 flow is added for each configured virtual IP
419 address VIP. For IPv4 VIPs, the match is ip && ip4.dst == VIP. For IPv6
420 VIPs, the match is ip && ip6.dst == VIP. The flow sets an action
421 reg0[0] = 1; next; to act as a hint for table Pre-stateful to send IP
422 packets to the connection tracker for packet de-fragmentation before
423 eventually advancing to ingress table LB. If controller_event has been
424 enabled and load balancing rules with empty backends have been added in
425 OVN_Northbound, a priority-130 flow is added to trigger ovn-controller
426 events whenever the chassis receives a packet for that particular VIP.
427 If the event-elb meter has been previously created, it will be associ‐
428 ated with the empty_lb logical flow.
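
A hedged configuration sketch (all names and addresses hypothetical) of a load balancer whose VIP produces the priority-100 flow above, and the ct_lb flows in ingress table Stateful:

    $ ovn-nbctl lb-add lb0 30.0.0.1:80 "192.168.0.2:80,192.168.0.3:80"
    $ ovn-nbctl ls-lb-add sw0 lb0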
429
430 Ingress Table 5: Pre-stateful
431
432 This table prepares flows for all possible stateful processing in next
433 tables. It contains a priority-0 flow that simply moves traffic to the
434 next table. A priority-100 flow sends the packets to connection tracker
435 based on a hint provided by the previous tables (with a match for
436 reg0[0] == 1) by using the ct_next; action.
437
438 Ingress table 6: from-lport ACLs
439
440 Logical flows in this table closely reproduce those in the ACL table in
441 the OVN_Northbound database for the from-lport direction. The priority
442 values from the ACL table have a limited range and have 1000 added to
443 them to leave room for OVN default flows at both higher and lower pri‐
444 orities.
445
446 · allow ACLs translate into logical flows with the next;
447 action. If there are any stateful ACLs on this datapath,
448 then allow ACLs translate to ct_commit; next; (which acts
449 as a hint for the next tables to commit the connection to
450 conntrack),
451
452 · allow-related ACLs translate into logical flows with the
453 ct_commit(ct_label=0/1); next; actions for new connec‐
454 tions and reg0[1] = 1; next; for existing connections.
455
456 · Other ACLs translate to drop; for new or untracked con‐
457 nections and ct_commit(ct_label=1/1); for known connec‐
458 tions. Setting ct_label marks a connection as one that
459 was previously allowed, but should no longer be allowed
460 due to a policy change.
461
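For instance, a hedged sketch (switch and match are hypothetical) of a stateful ACL that yields the allow-related translation above:

    $ ovn-nbctl acl-add sw0 from-lport 1002 'ip4 && tcp && tcp.dst == 80' allow-related
    $ # Inspect the generated logical flows in the southbound DB.
    $ ovn-sbctl lflow-list sw0
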
462 This table also contains a priority 0 flow with action next;, so that
463 ACLs allow packets by default. If the logical datapath has a stateful
464 ACL, the following flows will also be added:
465
466 · A priority-1 flow that sets the hint to commit IP traffic
467 to the connection tracker (with action reg0[1] = 1;
468 next;). This is needed for the default allow policy
469 because, while the initiator’s direction may not have any
470 stateful rules, the server’s may and then its return
471 traffic would not be known and marked as invalid.
472
473 · A priority-65535 flow that allows any traffic in the
474 reply direction for a connection that has been committed
475 to the connection tracker (i.e., established flows), as
476 long as the committed flow does not have ct_label.blocked
477 set. We only handle traffic in the reply direction here
478 because we want all packets going in the request direc‐
479 tion to still go through the flows that implement the
480 currently defined policy based on ACLs. If a connection
481 is no longer allowed by policy, ct_label.blocked will get
482 set and packets in the reply direction will no longer be
483 allowed, either.
484
485 · A priority-65535 flow that allows any traffic that is
486 considered related to a committed flow in the connection
487 tracker (e.g., an ICMP Port Unreachable from a non-lis‐
488 tening UDP port), as long as the committed flow does not
489 have ct_label.blocked set.
490
491 · A priority-65535 flow that drops all traffic marked by
492 the connection tracker as invalid.
493
494 · A priority-65535 flow that drops all traffic in the reply
495 direction with ct_label.blocked set meaning that the con‐
496 nection should no longer be allowed due to a policy
497 change. Packets in the request direction are skipped here
498 to let a newly created ACL re-allow this connection.
499
500 Ingress Table 7: from-lport QoS Marking
501
502 Logical flows in this table closely reproduce those in the QoS table
503 with the action column set in the OVN_Northbound database for the
504 from-lport direction.
505
506 · For every qos_rules entry in a logical switch with DSCP
507 marking enabled, a flow will be added at the priority
508 mentioned in the QoS table.
509
510 · One priority-0 fallback flow that matches all packets and
511 advances to the next table.
512
513 Ingress Table 8: from-lport QoS Meter
514
515 Logical flows in this table closely reproduce those in the QoS table
516 with the bandwidth column set in the OVN_Northbound database for the
517 from-lport direction.
518
519 · For every qos_rules entry in a logical switch with meter‐
520 ing enabled, a flow will be added at the priority men‐
521 tioned in the QoS table.
522
523 · One priority-0 fallback flow that matches all packets and
524 advances to the next table.
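
As a hedged sketch for both QoS tables above (switch, matches, and values hypothetical; see ovn-nbctl(8) for the exact qos-add syntax):

    $ # DSCP marking (from-lport QoS Marking table).
    $ ovn-nbctl qos-add sw0 from-lport 100 'ip4.src == 10.0.0.0/24' dscp=46
    $ # Bandwidth metering (from-lport QoS Meter table), rate in kbps.
    $ ovn-nbctl qos-add sw0 from-lport 100 'ip4.src == 10.0.1.0/24' rate=10000 burst=1000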
525
526 Ingress Table 9: LB
527
528 It contains a priority-0 flow that simply moves traffic to the next ta‐
529 ble. For established connections a priority 100 flow matches on ct.est
530 && !ct.rel && !ct.new && !ct.inv and sets an action reg0[2] = 1; next;
531 to act as a hint for table Stateful to send packets through connection
532 tracker to NAT the packets. (The packet will automatically get DNATed
533 to the same IP address as the first packet in that connection.)
534
535 Ingress Table 10: Stateful
536
537 · For all the configured load balancing rules for a switch
538 in OVN_Northbound database that include a L4 port PORT
539 of protocol P and IP address VIP, a priority-120 flow is
540 added. For IPv4 VIPs, the flow matches ct.new && ip &&
541 ip4.dst == VIP && P && P.dst == PORT. For IPv6 VIPs, the
542 flow matches ct.new && ip && ip6.dst == VIP && P && P.dst
543 == PORT. The flow’s action is ct_lb(args), where args
544 contains comma separated IP addresses (and optional port
545 numbers) to load balance to. The address family of the IP
546 addresses of args is the same as the address family of
547 VIP.
548
549 · For all the configured load balancing rules for a switch
550 in OVN_Northbound database that include just an IP
551 address VIP to match on, OVN adds a priority-110 flow.
552 For IPv4 VIPs, the flow matches ct.new && ip && ip4.dst
553 == VIP. For IPv6 VIPs, the flow matches ct.new && ip &&
554 ip6.dst == VIP. The action on this flow is ct_lb(args),
555 where args contains comma separated IP addresses of the
556 same address family as VIP.
557
558 · A priority-100 flow commits packets to connection tracker
559 using ct_commit; next; action based on a hint provided by
560 the previous tables (with a match for reg0[1] == 1).
561
562 · A priority-100 flow sends the packets to connection
563 tracker using ct_lb; as the action based on a hint pro‐
564 vided by the previous tables (with a match for reg0[2] ==
565 1).
566
567 · A priority-0 flow that simply moves traffic to the next
568 table.
569
570 Ingress Table 11: ARP/ND responder
571
572 This table implements ARP/ND responder in a logical switch for known
573 IPs. The advantage of the ARP responder flow is that it limits ARP
574 broadcasts by locally responding to ARP requests without sending them
575 to other hypervisors. One common case is when the inport is a logical
576 port associated with a VIF and the broadcast is responded to on the
577 local hypervisor rather than broadcast across the whole network and
578 responded to by the destination VM. This behavior is known as proxy ARP.
579
580 ARP requests arrive from VMs from a logical switch inport of type
581 default. For this case, the logical switch proxy ARP rules can be for
582 other VMs or logical router ports. Logical switch proxy ARP rules may
583 be programmed both for mac binding of IP addresses on other logical
584 switch VIF ports (which are of the default logical switch port type,
585 representing connectivity to VMs or containers), and for mac binding of
586 IP addresses on logical switch router type ports, representing their
587 logical router port peers. In order to support proxy ARP for logical
588 router ports, an IP address must be configured on the logical switch
589 router type port, with the same value as the peer logical router port.
590 The configured MAC addresses must match as well. When a VM sends an ARP
591 request for a distributed logical router port and if the peer router
592 type port of the attached logical switch does not have an IP address
593 configured, the ARP request will be broadcast on the logical switch.
594 One of the copies of the ARP request will go through the logical switch
595 router type port to the logical router datapath, where the logical
596 router ARP responder will generate a reply. The MAC binding of a dis‐
597 tributed logical router, once learned by an associated VM, is used for
598 all that VM’s communication needing routing. Hence, the action of a VM
599 re-arping for the mac binding of the logical router port should be
600 rare.
601
602 Logical switch ARP responder proxy ARP rules can also be hit when
603 receiving ARP requests externally on a L2 gateway port. In this case,
604 the hypervisor acting as an L2 gateway, responds to the ARP request on
605 behalf of a destination VM.
606
607 Note that ARP requests received from localnet or vtep logical inports
608 can either go directly to VMs, in which case the VM responds, or can hit
609 an ARP responder for a logical router port if the packet is used to
610 resolve a logical router port next hop address. In either case, logical
611 switch ARP responder rules will not be hit. It contains these logical
612 flows:
613
614 · Priority-100 flows to skip the ARP responder if inport
615 is of type localnet or vtep, advancing directly to the
616 next table. ARP requests sent to localnet or vtep ports
617 can be received by multiple hypervisors. Because the
618 same mac binding rules are downloaded to all hypervisors,
619 each of them would respond, which would confuse L2
620 learning on the source of the ARP requests.
621 ARP requests received on an inport of type router are not
622 expected to hit any logical switch ARP responder flows.
623 However, no skip flows are installed for these packets,
624 as there would be some additional flow cost for this and
625 the value appears limited.
626
627 · If the inport V is of type virtual, this table adds a
628 priority-100 logical flow for each P configured in the
629 options:virtual-parents column with the match
630
631 inport == P && ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP))
632
633
634 and applies the action
635
636 bind_vport(V, inport);
637
638
639 and advances the packet to the next table.
640
641 where VIP is the virtual IP address configured in the
642 column options:virtual-ip.
643
644 · Priority-50 flows that match ARP requests to each known
645 IP address A of every logical switch port, and respond
646 with ARP replies directly with corresponding Ethernet
647 address E:
648
649 eth.dst = eth.src;
650 eth.src = E;
651 arp.op = 2; /* ARP reply. */
652 arp.tha = arp.sha;
653 arp.sha = E;
654 arp.tpa = arp.spa;
655 arp.spa = A;
656 outport = inport;
657 flags.loopback = 1;
658 output;
659
660
661 These flows are omitted for logical ports (other than
662 router ports or localport ports) that are down and for
663 logical ports of type virtual.
664
665 · Priority-50 flows that match IPv6 ND neighbor solicita‐
666 tions to each known IP address A (and A’s solicited node
667 address) of every logical switch port except of type
668 router, and respond with neighbor advertisements directly
669 with corresponding Ethernet address E:
670
671 nd_na {
672 eth.src = E;
673 ip6.src = A;
674 nd.target = A;
675 nd.tll = E;
676 outport = inport;
677 flags.loopback = 1;
678 output;
679 };
680
681
682 Priority-50 flows that match IPv6 ND neighbor solicita‐
683 tions to each known IP address A (and A’s solicited node
684 address) of logical switch port of type router, and
685 respond with neighbor advertisements directly with corre‐
686 sponding Ethernet address E:
687
688 nd_na_router {
689 eth.src = E;
690 ip6.src = A;
691 nd.target = A;
692 nd.tll = E;
693 outport = inport;
694 flags.loopback = 1;
695 output;
696 };
697
698
699 These flows are omitted for logical ports (other than
700 router ports or localport ports) that are down and for
701 logical ports of type virtual.
702
703 · Priority-100 flows with match criteria like the ARP and
704 ND flows above, except that they only match packets from
705 the inport that owns the IP addresses in question, with
706 action next;. These flows prevent OVN from replying to,
707 for example, an ARP request emitted by a VM for its own
708 IP address. A VM only makes this kind of request to
709 attempt to detect a duplicate IP address assignment, so
710 sending a reply will prevent the VM from accepting the IP
711 address that it owns.
712
713 In place of next;, it would be reasonable to use drop;
714 for the flows’ actions. If everything is working as it is
715 configured, then this would produce equivalent results,
716 since no host should reply to the request. But ARPing for
717 one’s own IP address is intended to detect situations
718 where the network is not working as configured, so drop‐
719 ping the request would frustrate that intent.
720
721 · One priority-0 fallback flow that matches all packets and
722 advances to the next table.
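
For illustration (names hypothetical), the priority-50 responder flows above come from each port’s known addresses, e.g.:

    $ ovn-nbctl lsp-add sw0 lp1
    $ ovn-nbctl lsp-set-addresses lp1 "00:00:00:00:00:01 192.168.0.2"
    $ # The ARP/ND responder stage can then be inspected with:
    $ ovn-sbctl lflow-list sw0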
723
724 Ingress Table 12: DHCP option processing
725
726 This table adds the DHCPv4 options to a DHCPv4 packet from the logical
727 ports configured with IPv4 address(es) and DHCPv4 options, and simi‐
728 larly for DHCPv6 options. This table also adds flows for the logical
729 ports of type external.
730
731 · A priority-100 logical flow is added for these logical
732 ports which matches the IPv4 packet with udp.src == 68
733 and udp.dst == 67 and applies the action put_dhcp_opts
734 and advances the packet to the next table.
735
736 reg0[3] = put_dhcp_opts(offer_ip = ip, options...);
737 next;
738
739
740 For DHCPDISCOVER and DHCPREQUEST, this transforms the
741 packet into a DHCP reply, adds the DHCP offer IP ip and
742 options to the packet, and stores 1 into reg0[3]. For
743 other kinds of packets, it just stores 0 into reg0[3].
744 Either way, it continues to the next table.
745
746 · A priority-100 logical flow is added for these logical
747 ports which matches the IPv6 packet with udp.src == 546
748 and udp.dst == 547 and applies the action put_dhcpv6_opts
749 and advances the packet to the next table.
750
751 reg0[3] = put_dhcpv6_opts(ia_addr = ip, options...);
752 next;
753
754
755 For DHCPv6 Solicit/Request/Confirm packets, this trans‐
756 forms the packet into a DHCPv6 Advertise/Reply, adds the
757 DHCPv6 offer IP ip and options to the packet, and stores
758 1 into reg0[3]. For other kinds of packets, it just
759 stores 0 into reg0[3]. Either way, it continues to the
760 next table.
761
762 · A priority-0 flow that matches all packets and advances
763 to the next table.
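
A hedged configuration sketch (UUID handling and values are illustrative) of attaching DHCPv4 options to a port, which causes the flows above to be installed:

    $ uuid=$(ovn-nbctl dhcp-options-create 192.168.0.0/24)
    $ ovn-nbctl dhcp-options-set-options $uuid lease_time=3600 \
          router=192.168.0.1 server_id=192.168.0.1 \
          server_mac=00:00:00:00:00:ff
    $ ovn-nbctl lsp-set-dhcpv4-options lp1 $uuid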
764
765 Ingress Table 13: DHCP responses
766
767 This table implements DHCP responder for the DHCP replies generated by
768 the previous table.
769
770 · A priority 100 logical flow is added for the logical
771 ports configured with DHCPv4 options which matches IPv4
772 packets with udp.src == 68 && udp.dst == 67 && reg0[3] ==
773 1 and responds back to the inport after applying these
774 actions. If reg0[3] is set to 1, it means that the action
775 put_dhcp_opts was successful.
776
777 eth.dst = eth.src;
778 eth.src = E;
779 ip4.dst = A;
780 ip4.src = S;
781 udp.src = 67;
782 udp.dst = 68;
783 outport = P;
784 flags.loopback = 1;
785 output;
786
787
788 where E is the server MAC address and S is the server
789 IPv4 address defined in the DHCPv4 options and A is the
790 IPv4 address defined in the logical port’s addresses col‐
791 umn.
792
793 (This terminates ingress packet processing; the packet
794 does not go to the next ingress table.)
795
796 · A priority 100 logical flow is added for the logical
797 ports configured with DHCPv6 options which matches IPv6
798 packets with udp.src == 546 && udp.dst == 547 && reg0[3]
799 == 1 and responds back to the inport after applying these
800 actions. If reg0[3] is set to 1, it means that the action
801 put_dhcpv6_opts was successful.
802
803 eth.dst = eth.src;
804 eth.src = E;
805 ip6.dst = A;
806 ip6.src = S;
807 udp.src = 547;
808 udp.dst = 546;
809 outport = P;
810 flags.loopback = 1;
811 output;
812
813
814 where E is the server MAC address and S is the server
815 IPv6 LLA address generated from the server_id defined in
816 the DHCPv6 options and A is the IPv6 address defined in
817 the logical port’s addresses column.
818
819 (This terminates ingress packet processing; the packet
820 does not go to the next ingress table.)
821
822 · A priority-0 flow that matches all packets and advances
823 to the next table.
824
825 Ingress Table 14: DNS Lookup
826
827 This table looks up and resolves the DNS names to the corresponding
828 configured IP address(es).
829
830 · A priority-100 logical flow for each logical switch data‐
831 path if it is configured with DNS records, which matches
832 the IPv4 and IPv6 packets with udp.dst == 53 and applies
833 the action dns_lookup and advances the packet to the next
834 table.
835
836 reg0[4] = dns_lookup(); next;
837
838
839 For valid DNS packets, this transforms the packet into a
840 DNS reply if the DNS name can be resolved, and stores 1
841 into reg0[4]. For failed DNS resolution or other kinds of
842 packets, it just stores 0 into reg0[4]. Either way, it
843 continues to the next table.
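
A hedged sketch (names and quoting illustrative) of attaching DNS records to a logical switch, which enables the dns_lookup flow above:

    $ uuid=$(ovn-nbctl create dns records='{"vm1.ovn.org"="192.168.0.2"}')
    $ ovn-nbctl add logical_switch sw0 dns_records $uuid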
844
845 Ingress Table 15: DNS Responses
846
847 This table implements DNS responder for the DNS replies generated by
848 the previous table.
849
850 · A priority-100 logical flow for each logical switch data‐
851 path if it is configured with DNS records, which matches
852 the IPv4 and IPv6 packets with udp.dst == 53 && reg0[4] ==
853 1 and responds back to the inport after applying these
854 actions. If reg0[4] is set to 1, it means that the action
855 dns_lookup was successful.
856
857 eth.dst <-> eth.src;
858 ip4.src <-> ip4.dst;
859 udp.dst = udp.src;
860 udp.src = 53;
861 outport = P;
862 flags.loopback = 1;
863 output;
864
865
866 (This terminates ingress packet processing; the packet
867 does not go to the next ingress table.)
868
869 Ingress Table 16: External ports
870
871 Traffic from the external logical ports enters the ingress datapath
872 pipeline via the localnet port. This table adds the logical flows
873 below to handle the traffic from these ports.
874
875 · A priority-100 flow is added for each external logical
876 port which doesn’t reside on a chassis to drop the
877 ARP/IPv6 NS request to the router IP(s) (of the logical
878 switch) which matches on the inport of the external logi‐
879 cal port and the valid eth.src address(es) of the exter‐
880 nal logical port.
881
882 This flow guarantees that the ARP/NS request to the
883 router IP address from the external ports is responded to
884 only by the chassis which has claimed these external
885 ports. All the other chassis drop these packets.
886
887 · A priority-0 flow that matches all packets and advances
888 to table 17.
889
890 Ingress Table 17: Destination Lookup
891
892 This table implements switching behavior. It contains these logical
893 flows:
894
895 · A priority-100 flow that punts all IGMP packets to
896 ovn-controller if IGMP snooping is enabled on the logical
897 switch. The flow also forwards the IGMP packets to the
898 MC_MROUTER_STATIC multicast group, which ovn-northd popu‐
899 lates with all the logical ports that have options
900 :mcast_flood_reports=’true’.
901
902 · Priority-90 flows that forward registered IP multicast
903 traffic to their corresponding multicast group, which
904 ovn-northd creates based on learnt IGMP_Group entries.
905 The flows also forward packets to the MC_MROUTER_FLOOD
906 multicast group, which ovn-northd populates with all the
907 logical ports that are connected to logical routers with
908 options:mcast_relay=’true’.
909
910 · A priority-85 flow that forwards all IP multicast traffic
911 destined to 224.0.0.X to the MC_FLOOD multicast group,
912 which ovn-northd populates with all enabled logical
913 ports.
914
915 · A priority-80 flow that forwards all unregistered IP mul‐
916 ticast traffic to the MC_STATIC multicast group, which
917 ovn-northd populates with all the logical ports that have
918 options :mcast_flood=’true’. The flow also forwards
919 unregistered IP multicast traffic to the MC_MROUTER_FLOOD
920 multicast group, which ovn-northd populates with all the
921 logical ports connected to logical routers that have
922 options :mcast_relay=’true’.
923
924 · A priority-80 flow that drops all unregistered IP multi‐
925 cast traffic if other_config :mcast_snoop=’true’ and
926 other_config :mcast_flood_unregistered=’false’ and the
927 switch is not connected to a logical router that has
928 options :mcast_relay=’true’ and the switch doesn’t have
929 any logical port with options :mcast_flood=’true’.
930
931 · A priority-70 flow that outputs all packets with an Eth‐
932 ernet broadcast or multicast eth.dst to the MC_FLOOD mul‐
933 ticast group.
934
935 · One priority-50 flow that matches each known Ethernet
936 address against eth.dst and outputs the packet to the
937 single associated output port.
938
939 For the Ethernet address on a logical switch port of type
940 router, when that logical switch port’s addresses column
941 is set to router and the connected logical router port
942 specifies a redirect-chassis:
943
944 · The flow for the connected logical router port’s
945 Ethernet address is only programmed on the redi‐
946 rect-chassis.
947
948 · If the logical router has rules specified in nat
949 with external_mac, then those addresses are also
950 used to populate the switch’s destination lookup
951 on the chassis where logical_port is resident.
952
953 For the Ethernet address on a logical switch port of type
954 router, when that logical switch port’s addresses column
955 is set to router and the connected logical router port
956 specifies a reside-on-redirect-chassis and the logical
957 router to which the connected logical router port belongs
958 has a redirect-chassis distributed gateway logical
959 router port:
960
961 · The flow for the connected logical router port’s
962 Ethernet address is only programmed on the redi‐
963 rect-chassis.
964
965 · One priority-0 fallback flow that matches all packets and
966 outputs them to the MC_UNKNOWN multicast group, which
967 ovn-northd populates with all enabled logical ports that
968 accept unknown destination packets. As a small optimiza‐
969 tion, if no logical ports accept unknown destination
970 packets, ovn-northd omits this multicast group and logi‐
971 cal flow.
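
For illustration (switch name hypothetical; the option names are those referenced above), IGMP snooping with unregistered multicast restricted:

    $ ovn-nbctl set logical_switch sw0 \
          other_config:mcast_snoop=true \
          other_config:mcast_flood_unregistered=false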
972
973 Egress Table 0: Pre-LB
974
975 This table is similar to ingress table Pre-LB. It contains a priority-0
976 flow that simply moves traffic to the next table. Moreover it contains
977 a priority-110 flow to move IPv6 Neighbor Discovery traffic to the next
978 table. If any load balancing rules exist for the datapath, a prior‐
979 ity-100 flow is added with a match of ip and action of reg0[0] = 1;
980 next; to act as a hint for table Pre-stateful to send IP packets to the
981 connection tracker for packet de-fragmentation.
982
983 Egress Table 1: to-lport Pre-ACLs
984
985 This is similar to ingress table Pre-ACLs except for to-lport traffic.
986
987 Egress Table 2: Pre-stateful
988
989 This is similar to ingress table Pre-stateful.
990
991 Egress Table 3: LB
992
993 This is similar to ingress table LB.
994
995 Egress Table 4: to-lport ACLs
996
997 This is similar to ingress table ACLs except for to-lport ACLs.
998
999 In addition, the following flows are added.
1000
1001 · A priority 34000 logical flow is added for each logical
1002 port which has DHCPv4 options defined, to allow the
1003 DHCPv4 reply packet, and for each logical port which has
1004 DHCPv6 options defined, to allow the DHCPv6 reply packet,
1005 from Ingress Table 13: DHCP responses.
1006
1007 · A priority 34000 logical flow is added for each logical
1008 switch datapath configured with DNS records with the
1009 match udp.dst == 53 to allow the DNS reply packet from
1010 Ingress Table 15: DNS responses.
1011
1012 Egress Table 5: to-lport QoS Marking
1013
1014 This is similar to ingress table QoS marking except that it applies
1015 to to-lport QoS rules.
1016
1017 Egress Table 6: to-lport QoS Meter
1018
1019 This is similar to ingress table QoS meter except that it applies
1020 to to-lport QoS rules.
1021
1022 Egress Table 7: Stateful
1023
1024 This is similar to ingress table Stateful except that there are no
1025 rules added for load balancing new connections.
1026
1027 Egress Table 8: Egress Port Security - IP
1028
1029 This is similar to the port security logic in table Ingress Port Secu‐
1030 rity - IP except that outport, eth.dst, ip4.dst and ip6.dst are checked
1031 instead of inport, eth.src, ip4.src and ip6.src.
1032
1033 Egress Table 9: Egress Port Security - L2
1034
1035 This is similar to the ingress port security logic in ingress table
1036 Admission Control and Ingress Port Security - L2, but with important
1037 differences. Most obviously, outport and eth.dst are checked instead of
1038 inport and eth.src. Second, packets directed to broadcast or multicast
1039 eth.dst are always accepted instead of being subject to the port secu‐
1040 rity rules; this is implemented through a priority-100 flow that
1041 matches on eth.mcast with action output;. Moreover, to ensure that even
1042 broadcast and multicast packets are not delivered to disabled logical
1043 ports, a priority-150 flow for each disabled logical outport overrides
1044 the priority-100 flow with a drop; action. Finally, if egress qos has
1045 been enabled on a localnet port, the outgoing queue id is set through
1046 the set_queue action. Please remember to mark the corresponding physi‐
1047 cal interface with ovn-egress-iface set to true in external_ids.
1048
1049 Logical Router Datapaths
1050 Logical router datapaths will only exist for Logical_Router rows in the
1051 OVN_Northbound database that do not have enabled set to false.
1052
1053 Ingress Table 0: L2 Admission Control
1054
1055 This table drops packets that the router shouldn’t see at all based on
1056 their Ethernet headers. It contains the following flows:
1057
1058 · Priority-100 flows to drop packets with VLAN tags or mul‐
1059 ticast Ethernet source addresses.
1060
1061 · For each enabled router port P with Ethernet address E, a
1062 priority-50 flow that matches inport == P && (eth.mcast
1063 || eth.dst == E), with action next;.
1064
1065 For the gateway port on a distributed logical router
1066 (where one of the logical router ports specifies a redi‐
1067 rect-chassis), the above flow matching eth.dst == E is
1068 only programmed on the gateway port instance on the redi‐
1069 rect-chassis.
1070
1071 · For each dnat_and_snat NAT rule on a distributed router
1072 that specifies an external Ethernet address E, a prior‐
1073 ity-50 flow that matches inport == GW && eth.dst == E,
1074 where GW is the logical router gateway port, with action
1075 next;.
1076
1077 This flow is only programmed on the gateway port instance
1078 on the chassis where the logical_port specified in the
1079 NAT rule resides.
1080
1081 Other packets are implicitly dropped.
1082
1083 Ingress Table 1: Neighbor lookup
1084
1085 For ARP and IPv6 Neighbor Discovery packets, this table looks into the
1086 MAC_Binding records to determine if OVN needs to learn the mac bind‐
1087 ings. The following flows are added:
1088
1089 · For each router port P that owns IP address A, which
1090 belongs to subnet S with prefix length L, a priority-100
1091 flow is added which matches inport == P && arp.spa == S/L
1092 && arp.op == 1 (ARP request) with the following actions:
1093
1094 reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
1095 next;
1096
1097
1098 If the logical router port P is a distributed gateway
1099 router port, an additional match is_chassis_resident(cr-P)
1100 is added so that the resident gateway chassis handles the
1101 neighbor lookup.
1102
1103 · A priority-100 flow which matches on ARP reply packets
1104 and applies the actions:
1105
1106 reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
1107 next;
1108
1109
1110 · A priority-100 flow which matches on IPv6 Neighbor Dis‐
1111 covery advertisement packets and applies the actions:
1112
1113 reg9[4] = lookup_nd(inport, nd.target, nd.tll);
1114 next;
1115
1116
1117 · A priority-100 flow which matches on IPv6 Neighbor Dis‐
1118 covery solicitation packets and applies the actions:
1119
1120 reg9[4] = lookup_nd(inport, ip6.src, nd.sll);
1121 next;
1122
1123
1124 · A priority-0 fallback flow that matches all packets and
1125 applies the action reg9[5] = 1; next; advancing the
1126 packet to the next table.
1127
1128 Ingress Table 2: Neighbor learning
1129
1130 This table adds flows to learn the mac bindings from the ARP and IPv6
1131 Neighbor Solicitation/Advertisement packets if ARP/ND lookup failed in
1132 the previous table.
1133
1134 reg9[4] will be 1 if the lookup_arp/lookup_nd in the previous table was
1135 successful.
1136
1137 reg9[5] will be 1 if there was no need to do the lookup.
1138
1139 · A priority-100 flow with the match reg9[4] == 1 ||
1140 reg9[5] == 1 advances the packet to the next table, as
1141 there is no need to learn the neighbor.
1142
1143 · A priority-90 flow that matches arp and applies the
1144 action put_arp(inport, arp.spa, arp.sha); next;
1145
1146 · A priority-90 flow that matches nd_na and applies the
1147 action put_nd(inport, nd.target, nd.tll); next;
1148
1149 · A priority-90 flow that matches nd_ns and applies the
1150 action put_nd(inport, ip6.src, nd.sll); next;
1151
1152 Ingress Table 3: IP Input
1153
1154 This table is the core of the logical router datapath functionality. It
1155 contains the following flows to implement very basic IP host function‐
1156 ality.
1157
1158 · L3 admission control: A priority-100 flow drops packets
1159 that match any of the following:
1160
1161 · ip4.src[28..31] == 0xe (multicast source)
1162
1163 · ip4.src == 255.255.255.255 (broadcast source)
1164
1165 · ip4.src == 127.0.0.0/8 || ip4.dst == 127.0.0.0/8
1166 (localhost source or destination)
1167
1168 · ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8 (zero
1169 network source or destination)
1170
1171 · ip4.src or ip6.src is any IP address owned by the
1172 router, unless the packet was recirculated due to
1173 egress loopback as indicated by REG‐
1174 BIT_EGRESS_LOOPBACK.
1175
1176 · ip4.src is the broadcast address of any IP network
1177 known to the router.
1178
1179 · A priority-95 flow allows IP multicast traffic if
1180 options:mcast_relay=’true’, otherwise drops it.
1181
1182 · ICMP echo reply. These flows reply to ICMP echo requests
1183 received for the router’s IP address. Let A be an IP
1184 address owned by a router port. Then, for each A that is
1185 an IPv4 address, a priority-90 flow matches on ip4.dst ==
1186 A and icmp4.type == 8 && icmp4.code == 0 (ICMP echo
1187 request). For each A that is an IPv6 address, a prior‐
1188 ity-90 flow matches on ip6.dst == A and icmp6.type == 128
1189 && icmp6.code == 0 (ICMPv6 echo request). The port of the
1190 router that receives the echo request does not matter.
1191 Also, the ip.ttl of the echo request packet is not
1192 checked, so it complies with RFC 1812, section 4.2.2.9.
1193 Flows for ICMPv4 echo requests use the following actions:
1194
1195 ip4.dst <-> ip4.src;
1196 ip.ttl = 255;
1197 icmp4.type = 0;
1198 flags.loopback = 1;
1199 next;
1200
1201
1202 Flows for ICMPv6 echo requests use the following actions:
1203
1204 ip6.dst <-> ip6.src;
1205 ip.ttl = 255;
1206 icmp6.type = 129;
1207 flags.loopback = 1;
1208 next;
1209
1210
1211 · Reply to ARP requests.
1212
1213 These flows reply to ARP requests for the router’s own IP
1214 address. The ARP requests are handled only if the
1215 requestor’s IP belongs to the same subnet as the logical
1216 router port. For each router port P that owns IP address
1217 A, which belongs to subnet S with prefix length L, and
1218 Ethernet address E, a priority-90 flow matches inport ==
1219 P && arp.spa == S/L && arp.op == 1 && arp.tpa == A (ARP
1220 request) with the following actions:
1221
1222 eth.dst = eth.src;
1223 eth.src = E;
1224 arp.op = 2; /* ARP reply. */
1225 arp.tha = arp.sha;
1226 arp.sha = E;
1227 arp.tpa = arp.spa;
1228 arp.spa = A;
1229 outport = P;
1230 flags.loopback = 1;
1231 output;
1232
1233
1234 For the gateway port on a distributed logical router
1235 (where one of the logical router ports specifies a redi‐
1236 rect-chassis), the above flows are only programmed on the
1237 gateway port instance on the redirect-chassis. This
1238 behavior avoids generation of multiple ARP responses from
1239 different chassis, and allows upstream MAC learning to
1240 point to the redirect-chassis.
1241
1242 For the logical router port with the option reside-on-re‐
1243 direct-chassis set (which is centralized), the above
1244 flows are only programmed on the gateway port instance on
1245 the redirect-chassis (if the logical router has a dis‐
1246 tributed gateway port). This behavior avoids generation
1247 of multiple ARP responses from different chassis, and
1248 allows upstream MAC learning to point to the redi‐
1249 rect-chassis.
1250
1251 · These flows reply to ARP requests for the virtual IP
1252 addresses configured in the router for DNAT or load bal‐
1253 ancing. For a configured DNAT IP address or a load bal‐
1254 ancer IPv4 VIP A, for each router port P with Ethernet
1255 address E, a priority-90 flow matches inport == P &&
1256 arp.op == 1 && arp.tpa == A (ARP request) with the fol‐
1257 lowing actions:
1258
1259 eth.dst = eth.src;
1260 eth.src = E;
1261 arp.op = 2; /* ARP reply. */
1262 arp.tha = arp.sha;
1263 arp.sha = E;
1264 arp.tpa = arp.spa;
1265 arp.spa = A;
1266 outport = P;
1267 flags.loopback = 1;
1268 output;
1269
1270
1271 For the gateway port on a distributed logical router with
1272 NAT (where one of the logical router ports specifies a
1273 redirect-chassis):
1274
1275 · If the corresponding NAT rule cannot be handled in
1276 a distributed manner, then this flow is only pro‐
1277 grammed on the gateway port instance on the redi‐
1278 rect-chassis. This behavior avoids generation of
1279 multiple ARP responses from different chassis, and
1280 allows upstream MAC learning to point to the redi‐
1281 rect-chassis.
1282
1283 · If the corresponding NAT rule can be handled in a
1284 distributed manner, then this flow is only pro‐
1285 grammed on the gateway port instance where the
1286 logical_port specified in the NAT rule resides.
1287
1288 Some of the actions are different for this case,
1289 using the external_mac specified in the NAT rule
1290 rather than the gateway port’s Ethernet address E:
1291
1292 eth.src = external_mac;
1293 arp.sha = external_mac;
1294
1295
1296 This behavior avoids generation of multiple ARP
1297 responses from different chassis, and allows
1298 upstream MAC learning to point to the correct
1299 chassis.
1300
1301 · Reply to IPv6 Neighbor Solicitations. These flows reply
1302 to Neighbor Solicitation requests for the router’s own
1303 IPv6 address and load balancing IPv6 VIPs and populate
1304 the logical router’s mac binding table.
1305
1306 For each router port P that owns IPv6 address A,
1307 solicited node address S, and Ethernet address E, a pri‐
1308 ority-90 flow matches inport == P && nd_ns && ip6.dst ==
1309 {A, S} && nd.target == A with the following actions:
1310
1311 nd_na_router {
1312 eth.src = E;
1313 ip6.src = A;
1314 nd.target = A;
1315 nd.tll = E;
1316 outport = inport;
1317 flags.loopback = 1;
1318 output;
1319 };
1320
1321
1322 For each router port P that has load balancing VIP A,
1323 solicited node address S, and Ethernet address E, a pri‐
1324 ority-90 flow matches inport == P && nd_ns && ip6.dst ==
1325 {A, S} && nd.target == A with the following actions:
1326
1327 nd_na {
1328 eth.src = E;
1329 ip6.src = A;
1330 nd.target = A;
1331 nd.tll = E;
1332 outport = inport;
1333 flags.loopback = 1;
1334 output;
1335 };
1336
1337
1338 For the gateway port on a distributed logical router
1339 (where one of the logical router ports specifies a redi‐
1340 rect-chassis), the above flows replying to IPv6 Neighbor
1341 Solicitations are only programmed on the gateway port
1342 instance on the redirect-chassis. This behavior avoids
1343 generation of multiple replies from different chassis,
1344 and allows upstream MAC learning to point to the redi‐
1345 rect-chassis.
1346
1347 · Priority-85 flows which drop the ARP and IPv6 Neighbor
1348 Discovery packets.
1349
1350 · UDP port unreachable. Priority-80 flows generate ICMP
1351 port unreachable messages in reply to UDP datagrams
1352 directed to the router’s IP address, except in the spe‐
1353 cial case of gateways, which accept traffic directed to a
1354 router IP for load balancing and NAT purposes.
1355
1356 These flows should not match IP fragments with nonzero
1357 offset.
1358
1359 · TCP reset. Priority-80 flows generate TCP reset messages
1360 in reply to TCP segments directed to the router’s IP
1361 address, except in the special case of gateways, which
1362 accept traffic directed to a router IP for load balancing
1363 and NAT purposes.
1364
1365 These flows should not match IP fragments with nonzero
1366 offset.
1367
1368 · Protocol or address unreachable. Priority-70 flows gener‐
1369 ate ICMP protocol or address unreachable messages for
1370 IPv4 and IPv6 respectively in reply to packets directed
1371 to the router’s IP address on IP protocols other than
1372 UDP, TCP, and ICMP, except in the special case of gate‐
1373 ways, which accept traffic directed to a router IP for
1374 load balancing purposes.
1375
1376 These flows should not match IP fragments with nonzero
1377 offset.
1378
1379 · Drop other IP traffic to this router. These flows drop
1380 any other traffic destined to an IP address of this
1381 router that is not already handled by one of the flows
1382 above, which amounts to ICMP (other than echo requests)
1383 and fragments with nonzero offsets. For each IP address A
1384 owned by the router, a priority-60 flow matches ip4.dst
1385 == A and drops the traffic. An exception is made and the
1386 above flow is not added if the router port’s own IP
1387 address is used to SNAT packets passing through that
1388 router.
1389
1390 The flows above handle all of the traffic that might be directed to the
1391 router itself. The following flows (with lower priorities) handle the
1392 remaining traffic, potentially for forwarding:
1393
1394 · Drop Ethernet local broadcast. A priority-50 flow with
1395 match eth.bcast drops traffic destined to the local Eth‐
1396 ernet broadcast address. By definition this traffic
1397 should not be forwarded.
1398
1399 · ICMP time exceeded. For each router port P, whose IP
1400 address is A, a priority-40 flow with match inport == P
1401 && ip.ttl == {0, 1} && !ip.later_frag matches packets
1402 whose TTL has expired, with the following actions to send
1403 an ICMP time exceeded reply for IPv4 and IPv6 respec‐
1404 tively:
1405
1406 icmp4 {
1407 icmp4.type = 11; /* Time exceeded. */
1408 icmp4.code = 0; /* TTL exceeded in transit. */
1409 ip4.dst = ip4.src;
1410 ip4.src = A;
1411 ip.ttl = 255;
1412 next;
1413 };
1414 icmp6 {
1415 icmp6.type = 3; /* Time exceeded. */
1416 icmp6.code = 0; /* TTL exceeded in transit. */
1417 ip6.dst = ip6.src;
1418 ip6.src = A;
1419 ip.ttl = 255;
1420 next;
1421 };
1422
1423
           · TTL discard. A priority-30 flow with match ip.ttl == {0,
             1} and actions drop; drops other packets whose TTL has
             expired and that should not receive an ICMP error reply
             (i.e., fragments with nonzero offset).
1428
           · Next table. A priority-0 flow matches all packets that
             aren’t already handled and uses action next; to feed them
             to the next table.
1432
1433 Ingress Table 4: DEFRAG
1434
       This table sends packets to the connection tracker for tracking
       and defragmentation. It contains a priority-0 flow that simply
       moves traffic to the next table. If load balancing rules with
       virtual IP addresses (and ports) are configured in the
       OVN_Northbound database for a Gateway router, a priority-100
       flow is added for each configured virtual IP address VIP. For
       IPv4 VIPs the flow matches ip && ip4.dst == VIP. For IPv6 VIPs,
       the flow matches ip && ip6.dst == VIP. The flow uses the action
       ct_next; to send IP packets to the connection tracker for
       packet defragmentation and tracking before sending them to the
       next table.
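
       For example, assuming a hypothetical load balancer lb0 with
       IPv4 VIP 30.0.0.10 attached to a Gateway router gw0 (all names
       and addresses here are illustrative):

              ovn-nbctl lb-add lb0 30.0.0.10:80 192.168.1.2:80,192.168.1.3:80 tcp
              ovn-nbctl lr-lb-add gw0 lb0

       this table would contain a priority-100 flow matching ip &&
       ip4.dst == 30.0.0.10 with the action ct_next;.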
1444
   Ingress Table 5: UNSNAT
1446
       This table handles reverse traffic for already established
       connections; i.e., SNAT has already been done in the egress
       pipeline, and the packet has now entered the ingress pipeline
       as part of a reply. It is unSNATted here.
1450
   Ingress Table 5: UNSNAT on Gateway Routers
1452
           · If the Gateway router has been configured to force SNAT
             any previously DNATted packets to B, a priority-110 flow
             matches ip && ip4.dst == B with an action ct_snat;.

             If the Gateway router has been configured to force SNAT
             any previously load-balanced packets to B, a priority-100
             flow matches ip && ip4.dst == B with an action ct_snat;.

             For each NAT configuration in the OVN Northbound database
             that asks to change the source IP address of a packet
             from A to B, a priority-90 flow matches ip && ip4.dst ==
             B with an action ct_snat;. (See the example after this
             list.)
1465
1466 A priority-0 logical flow with match 1 has actions next;.
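
       For example, a hypothetical SNAT rule that translates source
       addresses in 10.0.0.0/24 to 172.16.1.10:

              ovn-nbctl lr-nat-add gw0 snat 172.16.1.10 10.0.0.0/24

       results in a priority-90 flow matching ip && ip4.dst ==
       172.16.1.10 with the action ct_snat;, so that reply traffic
       addressed to the SNAT address is unSNATted.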
1467
1468 Ingress Table 5: UNSNAT on Distributed Routers
1469
           · For each configuration in the OVN Northbound database
             that asks to change the source IP address of a packet
             from A to B, a priority-100 flow matches ip && ip4.dst ==
             B && inport == GW, where GW is the logical router gateway
             port, with an action ct_snat;.
1475
1476 If the NAT rule cannot be handled in a distributed man‐
1477 ner, then the priority-100 flow above is only programmed
1478 on the redirect-chassis.
1479
             For each configuration in the OVN Northbound database
             that asks to change the source IP address of a packet
             from A to B, a priority-50 flow matches ip && ip4.dst ==
             B with an action REGBIT_NAT_REDIRECT = 1; next;. This
             flow is for east/west traffic to a NAT destination IPv4
             address. Setting the REGBIT_NAT_REDIRECT flag triggers a
             redirect, in the ingress table Gateway Redirect, to the
             instance of the gateway port on the redirect-chassis.
1489
1490 A priority-0 logical flow with match 1 has actions next;.
1491
1492 Ingress Table 6: DNAT
1493
       Packets enter the pipeline with a destination IP address that
       needs to be DNATted from a virtual IP address to a real IP
       address. Packets in the reverse direction need to be
       unDNATted.
1497
1498 Ingress Table 6: Load balancing DNAT rules
1499
       The following load balancing DNAT flows are added for a Gateway
       router or a router with a gateway port. These flows are
       programmed only on the redirect-chassis and are not programmed
       for load balancers with IPv6 VIPs.
1504
           · If controller_event has been enabled for a configured
             load balancing rule of a Gateway router or a router with
             a gateway port in the OVN_Northbound database, and the
             rule has no configured backends, a priority-130 flow is
             added to trigger ovn-controller events whenever the
             chassis receives a packet for that particular VIP. If the
             event-elb meter has been previously created, it will be
             associated with the empty_lb logical flow.
1513
           · For all the configured load balancing rules for a Gateway
             router or a router with a gateway port in the
             OVN_Northbound database that include an L4 port PORT of
             protocol P and IPv4 address VIP, a priority-120 flow that
             matches on ct.new && ip && ip4.dst == VIP && P && P.dst
             == PORT with an action of ct_lb(args), where args
             contains comma separated IPv4 addresses (and optional
             port numbers) to load balance to (see the example after
             this list). If the router is configured to force SNAT any
             load-balanced packets, the above action will be replaced
             by flags.force_snat_for_lb = 1; ct_lb(args);.
1524
           · For all the configured load balancing rules for a router
             in the OVN_Northbound database that include an L4 port
             PORT of protocol P and IPv4 address VIP, a priority-120
             flow that matches on ct.est && ip && ip4.dst == VIP && P
             && P.dst == PORT with an action of ct_dnat;. If the
             router is configured to force SNAT any load-balanced
             packets, the above action will be replaced by
             flags.force_snat_for_lb = 1; ct_dnat;.
1534
           · For all the configured load balancing rules for a router
             in the OVN_Northbound database that include just an IP
             address VIP to match on, a priority-110 flow that matches
             on ct.new && ip && ip4.dst == VIP with an action of
             ct_lb(args), where args contains comma separated IPv4
             addresses. If the router is configured to force SNAT any
             load-balanced packets, the above action will be replaced
             by flags.force_snat_for_lb = 1; ct_lb(args);.

           · For all the configured load balancing rules for a router
             in the OVN_Northbound database that include just an IP
             address VIP to match on, a priority-110 flow that matches
             on ct.est && ip && ip4.dst == VIP with an action of
             ct_dnat;. If the router is configured to force SNAT any
             load-balanced packets, the above action will be replaced
             by flags.force_snat_for_lb = 1; ct_dnat;.
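
       Continuing the illustrative lb0 example from the DEFRAG table
       above, the per-VIP flows added here would look approximately
       like:

              ct.new && ip && ip4.dst == 30.0.0.10 && tcp && tcp.dst == 80
                  actions: ct_lb(192.168.1.2:80,192.168.1.3:80);
              ct.est && ip && ip4.dst == 30.0.0.10 && tcp && tcp.dst == 80
                  actions: ct_dnat;

       both at priority 120.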
1551
1552 Ingress Table 6: DNAT on Gateway Routers
1553
           · For each configuration in the OVN Northbound database
             that asks to change the destination IP address of a
             packet from A to B, a priority-100 flow matches ip &&
             ip4.dst == A with an action flags.loopback = 1;
             ct_dnat(B); (see the example after this list). If the
             Gateway router is configured to force SNAT any DNATed
             packet, the above action will be replaced by
             flags.force_snat_for_dnat = 1; flags.loopback = 1;
             ct_dnat(B);.
1562
1563 · For all IP packets of a Gateway router, a priority-50
1564 flow with an action flags.loopback = 1; ct_dnat;.
1565
1566 · A priority-0 logical flow with match 1 has actions next;.
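
       For example, a hypothetical one-to-one DNAT rule:

              ovn-nbctl lr-nat-add gw0 dnat 172.16.1.20 10.0.0.5

       results in a priority-100 flow matching ip && ip4.dst ==
       172.16.1.20 with the actions flags.loopback = 1;
       ct_dnat(10.0.0.5);.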
1567
1568 Ingress Table 6: DNAT on Distributed Routers
1569
       On distributed routers, the DNAT table only handles packets
       whose destination IP address needs to be DNATted from a virtual
       IP address to a real IP address. The unDNAT processing in the
       reverse direction is handled in a separate table in the egress
       pipeline.
1574
           · For each configuration in the OVN Northbound database
             that asks to change the destination IP address of a
             packet from A to B, a priority-100 flow matches ip &&
             ip4.dst == B && inport == GW, where GW is the logical
             router gateway port, with an action ct_dnat(B);.
1580
1581 If the NAT rule cannot be handled in a distributed man‐
1582 ner, then the priority-100 flow above is only programmed
1583 on the redirect-chassis.
1584
             For each configuration in the OVN Northbound database
             that asks to change the destination IP address of a
             packet from A to B, a priority-50 flow matches ip &&
             ip4.dst == B with an action REGBIT_NAT_REDIRECT = 1;
             next;. This flow is for east/west traffic to a NAT
             destination IPv4 address. Setting the REGBIT_NAT_REDIRECT
             flag triggers a redirect, in the ingress table Gateway
             Redirect, to the instance of the gateway port on the
             redirect-chassis.
1594
1595 A priority-0 logical flow with match 1 has actions next;.
1596
1597 Ingress Table 7: IPv6 ND RA option processing
1598
           · A priority-50 logical flow is added for each logical
             router port configured with IPv6 ND RA options. It
             matches IPv6 ND Router Solicitation packets, applies the
             action put_nd_ra_opts, and advances the packet to the
             next table.

                 reg0[5] = put_nd_ra_opts(options); next;
1606
1607
             For a valid IPv6 ND RS packet, this transforms the packet
             into an IPv6 ND RA reply, sets the RA options in the
             packet, and stores 1 in reg0[5]. For other kinds of
             packets, it stores 0 in reg0[5]. Either way, it continues
             to the next table. (A configuration example follows this
             list.)
1613
1614 · A priority-0 logical flow with match 1 has actions next;.
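
       The RA options come from the ipv6_ra_configs column of the
       Logical_Router_Port table. For example, on a hypothetical
       router port lrp0:

              ovn-nbctl set Logical_Router_Port lrp0 \
                  ipv6_ra_configs:address_mode=slaac ipv6_ra_configs:mtu=1442

       causes the priority-50 flow above to be added for lrp0.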
1615
1616 Ingress Table 8: IPv6 ND RA responder
1617
       This table implements the IPv6 ND RA responder for the IPv6 ND
       RA replies generated by the previous table.
1620
           · A priority-50 logical flow is added for each logical
             router port configured with IPv6 ND RA options. It
             matches IPv6 ND RA packets with reg0[5] == 1 and responds
             on the inport after applying the following actions. If
             reg0[5] is set to 1, it means that the action
             put_nd_ra_opts was successful.
1627
1628 eth.dst = eth.src;
1629 eth.src = E;
1630 ip6.dst = ip6.src;
1631 ip6.src = I;
1632 outport = P;
1633 flags.loopback = 1;
1634 output;
1635
1636
1637 where E is the MAC address and I is the IPv6 link local
1638 address of the logical router port.
1639
1640 (This terminates packet processing in ingress pipeline;
1641 the packet does not go to the next ingress table.)
1642
1643 · A priority-0 logical flow with match 1 has actions next;.
1644
1645 Ingress Table 9: IP Routing
1646
1647 A packet that arrives at this table is an IP packet that should be
1648 routed to the address in ip4.dst or ip6.dst. This table implements IP
1649 routing, setting reg0 (or xxreg0 for IPv6) to the next-hop IP address
1650 (leaving ip4.dst or ip6.dst, the packet’s final destination, unchanged)
1651 and advances to the next table for ARP resolution. It also sets reg1
1652 (or xxreg1) to the IP address owned by the selected router port
1653 (ingress table ARP Request will generate an ARP request, if needed,
1654 with reg0 as the target protocol address and reg1 as the source proto‐
1655 col address).
1656
1657 This table contains the following logical flows:
1658
           · Priority-500 flows that match IP multicast traffic
             destined to groups registered on any of the attached
             switches and set outport to the associated multicast
             group that will eventually flood the traffic to all
             interested attached logical switches. The flows also
             decrement TTL.
1665
           · A priority-450 flow that matches unregistered IP
             multicast traffic and sets outport to the MC_STATIC
             multicast group, which ovn-northd populates with the
             logical ports that have options:mcast_flood=’true’.
1670
           · For distributed logical routers where one of the logical
             router ports specifies a redirect-chassis, a priority-400
             logical flow for each IP source/destination couple
             matching the configured dnat_and_snat NAT rules. These
             flows allow traffic to be forwarded directly to the
             external connections, when available, instead of being
             sent through the tunnel. Assuming the two following NAT
             rules have been configured:
1679
1680 external_ip{0,1} = EIP{0,1};
1681 external_mac{0,1} = MAC{0,1};
1682 logical_ip{0,1} = LIP{0,1};
1683
1684
1685 the following action will be applied:
1686
1687 eth.dst = MAC0;
1688 eth.src = MAC1;
1689 reg0 = ip4.dst;
1690 reg1 = EIP1;
1691 outport = redirect-chassis-port;
1692 REGBIT_DISTRIBUTED_NAT = 1; next;.
1693
1694
             Moreover, a priority-400 logical flow is configured for
             each dnat_and_snat NAT rule, so that traffic for a local
             floating IP (FIP) is not sent through the overlay tunnels
             but is instead handled on the local hypervisor.
1699
1700 · For distributed logical routers where one of the logical
1701 router ports specifies a redirect-chassis, a priority-300
1702 logical flow with match REGBIT_NAT_REDIRECT == 1 has
1703 actions ip.ttl--; next;. The outport will be set later in
1704 the Gateway Redirect table.
1705
1706 · IPv4 routing table. For each route to IPv4 network N with
1707 netmask M, on router port P with IP address A and Ether‐
1708 net address E, a logical flow with match ip4.dst == N/M,
1709 whose priority is the number of 1-bits in M, has the fol‐
1710 lowing actions:
1711
1712 ip.ttl--;
1713 reg0 = G;
1714 reg1 = A;
1715 eth.src = E;
1716 outport = P;
1717 flags.loopback = 1;
1718 next;
1719
1720
1721 (Ingress table 1 already verified that ip.ttl--; will not
1722 yield a TTL exceeded error.)
1723
             If the route has a gateway, G is the gateway IP address;
             if the route is from a configured static route, G is the
             next hop IP address; otherwise it is ip4.dst. (A worked
             example follows this list.)
1727
1728 · IPv6 routing table. For each route to IPv6 network N with
1729 netmask M, on router port P with IP address A and Ether‐
1730 net address E, a logical flow with match in CIDR notation
1731 ip6.dst == N/M, whose priority is the integer value of M,
1732 has the following actions:
1733
1734 ip.ttl--;
1735 xxreg0 = G;
1736 xxreg1 = A;
1737 eth.src = E;
1738 outport = P;
1739 flags.loopback = 1;
1740 next;
1741
1742
1743 (Ingress table 1 already verified that ip.ttl--; will not
1744 yield a TTL exceeded error.)
1745
             If the route has a gateway, G is the gateway IP address;
             if the route is from a configured static route, G is the
             next hop IP address; otherwise it is ip6.dst.
1749
1750 If the address A is in the link-local scope, the route
1751 will be limited to sending on the ingress port.
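
       As a worked example for the IPv4 routing table, a hypothetical
       static route:

              ovn-nbctl lr-route-add lr0 192.168.100.0/24 10.0.0.254

       produces a flow with match ip4.dst == 192.168.100.0/24 at
       priority 24 (the number of 1-bits in the /24 mask), with G =
       10.0.0.254 stored in reg0 as the next-hop address.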
1752
1753 Ingress Table 10: ARP/ND Resolution
1754
1755 Any packet that reaches this table is an IP packet whose next-hop IPv4
1756 address is in reg0 or IPv6 address is in xxreg0. (ip4.dst or ip6.dst
1757 contains the final destination.) This table resolves the IP address in
1758 reg0 (or xxreg0) into an output port in outport and an Ethernet address
1759 in eth.dst, using the following flows:
1760
1761 · A priority-500 flow that matches IP multicast traffic
1762 that was allowed in the routing pipeline. For this kind
1763 of traffic the outport was already set so the flow just
1764 advances to the next table.
1765
           · For distributed logical routers where one of the logical
             router ports specifies a redirect-chassis, a priority-400
             logical flow with match REGBIT_DISTRIBUTED_NAT == 1 has
             action next;.
1770
1771 For distributed logical routers where one of the logical
1772 router ports specifies a redirect-chassis, a priority-200
1773 logical flow with match REGBIT_NAT_REDIRECT == 1 has
1774 actions eth.dst = E; next;, where E is the ethernet
1775 address of the router’s distributed gateway port.
1776
1777 · Static MAC bindings. MAC bindings can be known statically
1778 based on data in the OVN_Northbound database. For router
1779 ports connected to logical switches, MAC bindings can be
1780 known statically from the addresses column in the Logi‐
1781 cal_Switch_Port table. For router ports connected to
             other logical routers, MAC bindings can be known
             statically from the mac and networks columns in the
             Logical_Router_Port table.
1785
             For each IPv4 address A whose host is known to have
             Ethernet address E on router port P, a priority-100 flow
             with match outport == P && reg0 == A has actions eth.dst
             = E; next;. (See the example after this list.)
1790
             For each virtual IP A configured on a logical port of
             type virtual whose virtual parent is set in its
             corresponding Port_Binding record, where the virtual
             parent has Ethernet address E and the virtual IP is
             reachable via the router port P, a priority-100 flow with
             match outport == P && reg0 == A has actions eth.dst = E;
             next;.

             For each virtual IP A configured on a logical port of
             type virtual whose virtual parent is not set in its
             corresponding Port_Binding record, if the virtual IP A is
             reachable via the router port P, a priority-100 flow with
             match outport == P && reg0 == A has actions eth.dst =
             00:00:00:00:00:00; next;. This flow is added so that ARP
             is always resolved for the virtual IP A by generating an
             ARP request, rather than by consulting the MAC_Binding
             table, which can hold an incorrect value for the virtual
             IP A.
1808
             For each IPv6 address A whose host is known to have
             Ethernet address E on router port P, a priority-100 flow
             with match outport == P && xxreg0 == A has actions
             eth.dst = E; next;.
1813
             For each logical router port with an IPv4 address A and a
             MAC address of E that is reachable via a different
             logical router port P, a priority-100 flow with match
             outport == P && reg0 == A has actions eth.dst = E; next;.

             For each logical router port with an IPv6 address A and a
             MAC address of E that is reachable via a different
             logical router port P, a priority-100 flow with match
             outport == P && xxreg0 == A has actions eth.dst = E;
             next;.
1823
1824 · Dynamic MAC bindings. These flows resolve MAC-to-IP bind‐
1825 ings that have become known dynamically through ARP or
1826 neighbor discovery. (The ingress table ARP Request will
1827 issue an ARP or neighbor solicitation request for cases
1828 where the binding is not yet known.)
1829
1830 A priority-0 logical flow with match ip4 has actions
1831 get_arp(outport, reg0); next;.
1832
1833 A priority-0 logical flow with match ip6 has actions
1834 get_nd(outport, xxreg0); next;.
1835
           · For a logical router port with redirect-chassis set and
             redirect-type set to bridged, a priority-50 flow with
             match outport == "ROUTER_PORT" && !is_chassis_resident
             ("cr-ROUTER_PORT") has actions eth.dst = E; next;, where
             E is the Ethernet address of the logical router port.
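
       For example, if a logical switch port whose addresses column
       contains "00:00:00:00:00:01 10.0.0.5" is reachable via a router
       port lrp0 (hypothetical names and addresses), the static MAC
       binding flows above include a priority-100 flow approximately
       like:

              outport == "lrp0" && reg0 == 10.0.0.5
                  actions: eth.dst = 00:00:00:00:00:01; next;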
1841
1842 Ingress Table 11: Check packet length
1843
       For distributed logical routers whose distributed gateway port
       has options:gateway_mtu set to a valid integer value, this
       table adds a priority-50 logical flow with match ip4 &&
       outport == GW_PORT, where GW_PORT is the distributed gateway
       router port. The flow applies the action check_pkt_larger and
       advances the packet to the next table.
1849
1850 REGBIT_PKT_LARGER = check_pkt_larger(L); next;
1851
1852
       where L is the packet length to check for. If the packet is
       larger than L, it stores 1 in the register bit
       REGBIT_PKT_LARGER. The value of L is taken from the
       options:gateway_mtu column of the Logical_Router_Port row.
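
       For example, setting the option on a hypothetical distributed
       gateway port gw-port:

              ovn-nbctl set Logical_Router_Port gw-port options:gateway_mtu=1500

       causes the priority-50 flow above to be added, with L derived
       from the configured value.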
1856
1857 This table adds one priority-0 fallback flow that matches all packets
1858 and advances to the next table.
1859
1860 Ingress Table 12: Handle larger packets
1861
       For distributed logical routers whose distributed gateway port
       has options:gateway_mtu set to a valid integer value, this
       table adds the following priority-50 logical flow for each
       logical router port, with match ip4 && inport == LRP &&
       outport == GW_PORT && REGBIT_PKT_LARGER, where LRP is the
       logical router port and GW_PORT is the distributed gateway
       router port, and applies the following action:
1868
1869 icmp4 {
1870 icmp4.type = 3; /* Destination Unreachable. */
1871 icmp4.code = 4; /* Frag Needed and DF was Set. */
1872 icmp4.frag_mtu = M;
1873 eth.dst = E;
1874 ip4.dst = ip4.src;
1875 ip4.src = I;
1876 ip.ttl = 255;
1877 REGBIT_EGRESS_LOOPBACK = 1;
1878 next(pipeline=ingress, table=0);
1879 };
1880
1881
           · Where M is the fragment MTU minus 58; the fragment MTU is
             taken from the options:gateway_mtu column of the
             Logical_Router_Port row.
1885
1886 · E is the Ethernet address of the logical router port.
1887
1888 · I is the IPv4 address of the logical router port.
1889
1890 This table adds one priority-0 fallback flow that matches all packets
1891 and advances to the next table.
1892
1893 Ingress Table 13: Gateway Redirect
1894
1895 For distributed logical routers where one of the logical router ports
1896 specifies a redirect-chassis, this table redirects certain packets to
1897 the distributed gateway port instance on the redirect-chassis. This ta‐
1898 ble has the following flows:
1899
           · A priority-300 logical flow with match
             REGBIT_DISTRIBUTED_NAT == 1 has action next;.
1902
1903 · A priority-200 logical flow with match REGBIT_NAT_REDI‐
1904 RECT == 1 has actions outport = CR; next;, where CR is
1905 the chassisredirect port representing the instance of the
1906 logical router distributed gateway port on the redi‐
1907 rect-chassis.
1908
1909 · A priority-150 logical flow with match outport == GW &&
1910 eth.dst == 00:00:00:00:00:00 has actions outport = CR;
1911 next;, where GW is the logical router distributed gateway
1912 port and CR is the chassisredirect port representing the
1913 instance of the logical router distributed gateway port
1914 on the redirect-chassis.
1915
           · For each NAT rule in the OVN Northbound database that can
             be handled in a distributed manner, a priority-100
             logical flow with match ip4.src == B && outport == GW,
             where GW is the logical router distributed gateway port,
             has actions next;.
1921
1922 · A priority-50 logical flow with match outport == GW has
1923 actions outport = CR; next;, where GW is the logical
1924 router distributed gateway port and CR is the chas‐
1925 sisredirect port representing the instance of the logical
1926 router distributed gateway port on the redirect-chassis.
1927
1928 · A priority-0 logical flow with match 1 has actions next;.
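
       For example, pinning a hypothetical distributed gateway port
       lrp-ext to a chassis hv1:

              ovn-nbctl set Logical_Router_Port lrp-ext options:redirect-chassis=hv1

       causes ovn-northd to create a chassisredirect port named
       cr-lrp-ext, which is the CR port referenced by the flows above.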
1929
1930 Ingress Table 14: ARP Request
1931
1932 In the common case where the Ethernet destination has been resolved,
1933 this table outputs the packet. Otherwise, it composes and sends an ARP
1934 or IPv6 Neighbor Solicitation request. It holds the following flows:
1935
1936 · Unknown MAC address. A priority-100 flow for IPv4 packets
1937 with match eth.dst == 00:00:00:00:00:00 has the following
1938 actions:
1939
1940 arp {
1941 eth.dst = ff:ff:ff:ff:ff:ff;
1942 arp.spa = reg1;
1943 arp.tpa = reg0;
1944 arp.op = 1; /* ARP request. */
1945 output;
1946 };
1947
1948
             Unknown MAC address. For each IPv6 static route
             associated with the router with nexthop IP G, a
             priority-200 flow for IPv6 packets with match eth.dst ==
             00:00:00:00:00:00 && xxreg0 == G is added, with the
             following actions:

                 nd_ns {
                     eth.dst = E;
                     ip6.dst = I;
                     nd.target = G;
                     output;
                 };
1961
1962
             where E is the multicast MAC address derived from the
             solicited-node address, and I is the solicited-node
             multicast address corresponding to the target address G.
             (A worked example follows this list.)
1966
1967 Unknown MAC address. A priority-100 flow for IPv6 packets
1968 with match eth.dst == 00:00:00:00:00:00 has the following
1969 actions:
1970
1971 nd_ns {
1972 nd.target = xxreg0;
1973 output;
1974 };
1975
1976
             (Ingress table IP Routing initialized reg1 with the IP
             address owned by outport and (xx)reg0 with the next-hop
             IP address.)
1980
1981 The IP packet that triggers the ARP/IPv6 NS request is
1982 dropped.
1983
1984 · Known MAC address. A priority-0 flow with match 1 has
1985 actions output;.
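
       As a worked example of the priority-200 IPv6 flow above,
       assuming a hypothetical nexthop G = fd00::5, the solicited-node
       multicast address is I = ff02::1:ff00:5 (ff02::1:ff00:0/104
       plus the low 24 bits of G) and the multicast Ethernet address
       is E = 33:33:ff:00:00:05 (33:33 plus the low 32 bits of I), so
       the generated request is:

              nd_ns {
                  eth.dst = 33:33:ff:00:00:05;
                  ip6.dst = ff02::1:ff00:5;
                  nd.target = fd00::5;
                  output;
              };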
1986
1987 Egress Table 0: UNDNAT
1988
       This table handles reverse traffic for already established
       connections; i.e., DNAT has already been done in the ingress
       pipeline, and the packet has now entered the egress pipeline as
       part of a reply. For NAT on a distributed router, it is
       unDNATted here. For Gateway routers, the unDNAT processing is
       carried out in the ingress DNAT table.
1994
           · For all the configured load balancing rules for a router
             with a gateway port in the OVN_Northbound database that
             include an IPv4 address VIP, for every backend IPv4
             address B defined for the VIP, a priority-120 flow is
             programmed on the redirect-chassis that matches ip &&
             ip4.src == B && outport == GW, where GW is the logical
             router gateway port, with an action ct_dnat;. If the
             backend IPv4 address B is also configured with L4 port
             PORT of protocol P, then the match also includes P.src ==
             PORT. These flows are not added for load balancers with
             IPv6 VIPs. (See the example after this list.)

             If the router is configured to force SNAT any
             load-balanced packets, the above action will be replaced
             by flags.force_snat_for_lb = 1; ct_dnat;.
2009
2010 · For each configuration in the OVN Northbound database
2011 that asks to change the destination IP address of a
2012 packet from an IP address of A to B, a priority-100 flow
2013 matches ip && ip4.src == B && outport == GW, where GW is
2014 the logical router gateway port, with an action ct_dnat;.
2015
2016 If the NAT rule cannot be handled in a distributed man‐
2017 ner, then the priority-100 flow above is only programmed
2018 on the redirect-chassis.
2019
2020 If the NAT rule can be handled in a distributed manner,
2021 then there is an additional action eth.src = EA;, where
2022 EA is the ethernet address associated with the IP address
2023 A in the NAT rule. This allows upstream MAC learning to
2024 point to the correct chassis.
2025
2026 · A priority-0 logical flow with match 1 has actions next;.
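
       For a load balancer like the illustrative lb0 above, attached
       instead to a router with a gateway port, reply traffic from
       backend 192.168.1.2 would match a priority-120 flow
       approximately like:

              ip && ip4.src == 192.168.1.2 && tcp.src == 80 && outport == GW
                  actions: ct_dnat;

       so that the source address is rewritten back to the VIP
       30.0.0.10.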
2027
2028 Egress Table 1: SNAT
2029
2030 Packets that are configured to be SNATed get their source IP address
2031 changed based on the configuration in the OVN Northbound database.
2032
2033 Egress Table 1: SNAT on Gateway Routers
2034
2035 · If the Gateway router in the OVN Northbound database has
2036 been configured to force SNAT a packet (that has been
2037 previously DNATted) to B, a priority-100 flow matches
2038 flags.force_snat_for_dnat == 1 && ip with an action
2039 ct_snat(B);.
2040
2041 If the Gateway router in the OVN Northbound database has
2042 been configured to force SNAT a packet (that has been
2043 previously load-balanced) to B, a priority-100 flow
2044 matches flags.force_snat_for_lb == 1 && ip with an action
2045 ct_snat(B);.
2046
             For each configuration in the OVN Northbound database
             that asks to change the source IP address of a packet
             from IP address A, or from any IP address belonging to
             network A, to B, a flow matches ip && ip4.src == A with
             an action ct_snat(B);. The priority of the flow is
             calculated based on the mask of A, with matches having
             larger masks getting higher priorities. (See the example
             after this list.)
2055
2056 A priority-0 logical flow with match 1 has actions next;.
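
       Continuing the illustrative SNAT rule from the UNSNAT table
       above (10.0.0.0/24 translated to 172.16.1.10), this table would
       contain a flow matching ip && ip4.src == 10.0.0.0/24 with the
       action ct_snat(172.16.1.10);, at a priority derived from the
       /24 mask so that more specific networks take precedence.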
2057
2058 Egress Table 1: SNAT on Distributed Routers
2059
           · For each configuration in the OVN Northbound database
             that asks to change the source IP address of a packet
             from IP address A, or from any IP address belonging to
             network A, to B, a flow matches ip && ip4.src == A &&
             outport == GW, where GW is the logical router gateway
             port, with an action ct_snat(B);. The priority of the
             flow is calculated based on the mask of A, with matches
             having larger masks getting higher priorities.

             If the NAT rule cannot be handled in a distributed
             manner, then the flow above is only programmed on the
             redirect-chassis, with the flow priority increased by 128
             so that it is evaluated first.
2074
2075 If the NAT rule can be handled in a distributed manner,
2076 then there is an additional action eth.src = EA;, where
2077 EA is the ethernet address associated with the IP address
2078 A in the NAT rule. This allows upstream MAC learning to
2079 point to the correct chassis.
2080
2081 · A priority-0 logical flow with match 1 has actions next;.
2082
2083 Egress Table 2: Egress Loopback
2084
       This table applies to distributed logical routers where one of
       the logical router ports specifies a redirect-chassis.
2087
2088 Earlier in the ingress pipeline, some east-west traffic was redirected
2089 to the chassisredirect port, based on flows in the UNSNAT and DNAT
2090 ingress tables setting the REGBIT_NAT_REDIRECT flag, which then trig‐
2091 gered a match to a flow in the Gateway Redirect ingress table. The
2092 intention was not to actually send traffic out the distributed gateway
2093 port instance on the redirect-chassis. This traffic was sent to the
2094 distributed gateway port instance in order for DNAT and/or SNAT pro‐
2095 cessing to be applied.
2096
2097 While UNDNAT and SNAT processing have already occurred by this point,
2098 this traffic needs to be forced through egress loopback on this dis‐
2099 tributed gateway port instance, in order for UNSNAT and DNAT processing
2100 to be applied, and also for IP routing and ARP resolution after all of
2101 the NAT processing, so that the packet can be forwarded to the destina‐
2102 tion.
2103
2104 This table has the following flows:
2105
           · For each dnat_and_snat NAT rule couple in the OVN
             Northbound database on a distributed router, a
             priority-200 logical flow with match ip4.dst ==
             external_ip0 && ip4.src == external_ip1 has action next;.
2110
2111 For each NAT rule in the OVN Northbound database on a
2112 distributed router, a priority-100 logical flow with
2113 match ip4.dst == E && outport == GW, where E is the
2114 external IP address specified in the NAT rule, and GW is
2115 the logical router distributed gateway port, with the
2116 following actions:
2117
2118 clone {
2119 ct_clear;
2120 inport = outport;
2121 outport = "";
2122 flags = 0;
2123 flags.loopback = 1;
2124 reg0 = 0;
2125 reg1 = 0;
2126 ...
2127 reg9 = 0;
2128 REGBIT_EGRESS_LOOPBACK = 1;
2129 next(pipeline=ingress, table=0);
2130 };
2131
2132
2133 flags.loopback is set since in_port is unchanged and the
2134 packet may return back to that port after NAT processing.
2135 REGBIT_EGRESS_LOOPBACK is set to indicate that egress
2136 loopback has occurred, in order to skip the source IP
2137 address check against the router address.
2138
2139 · A priority-0 logical flow with match 1 has actions next;.
2140
2141 Egress Table 3: Delivery
2142
2143 Packets that reach this table are ready for delivery. It contains:
2144
2145 · Priority-110 logical flows that match IP multicast pack‐
2146 ets on each enabled logical router port and modify the
2147 Ethernet source address of the packets to the Ethernet
2148 address of the port and then execute action output;.
2149
2150 · Priority-100 logical flows that match packets on each
2151 enabled logical router port, with action output;.
2152
2153
2154
Open vSwitch 2.12.0                ovn-northd                  ovn-northd(8)