ovn-northd(8)                     OVN Manual                     ovn-northd(8)

NAME
       ovn-northd and ovn-northd-ddlog - Open Virtual Network central control
       daemon

SYNOPSIS
       ovn-northd [options]

DESCRIPTION
       ovn-northd is a centralized daemon responsible for translating the
       high-level OVN configuration into logical configuration consumable by
       daemons such as ovn-controller. It translates the logical network
       configuration in terms of conventional network concepts, taken from
       the OVN Northbound Database (see ovn-nb(5)), into logical datapath
       flows in the OVN Southbound Database (see ovn-sb(5)) below it.

       ovn-northd is implemented in C. ovn-northd-ddlog is a compatible
       implementation written in DDlog, a language for incremental database
       processing. This documentation applies to both implementations, with
       differences indicated where relevant.

OPTIONS
       --ovnnb-db=database
              The OVSDB database containing the OVN Northbound Database. If
              the OVN_NB_DB environment variable is set, its value is used
              as the default. Otherwise, the default is unix:/ovnnb_db.sock.

       --ovnsb-db=database
              The OVSDB database containing the OVN Southbound Database. If
              the OVN_SB_DB environment variable is set, its value is used
              as the default. Otherwise, the default is unix:/ovnsb_db.sock.
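
       A deployment that keeps its databases on a separate node might point
       ovn-northd at them over TCP; a minimal sketch (the addresses, ports,
       and paths here are hypothetical, and any connection method from
       ovsdb(7) works):

       ```shell
       # Sketch: run ovn-northd against remote Northbound/Southbound
       # databases instead of the default local Unix sockets.
       ovn-northd --ovnnb-db=tcp:192.0.2.10:6641 \
                  --ovnsb-db=tcp:192.0.2.10:6642 \
                  --pidfile --detach --log-file
       ```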

       --ddlog-record=file
              This option is for ovn-northd-ddlog only. It causes the
              daemon to record the initial database state and later changes
              to file in the text-based DDlog command format. The
              ovn_northd_cli program can later replay these changes for
              debugging purposes. This option has a performance impact. See
              debugging-ddlog.rst in the OVN documentation for more
              details.

       --dry-run
              Causes ovn-northd to start paused. In the paused state,
              ovn-northd does not apply any changes to the databases,
              although it continues to monitor them. For more information,
              see the pause command, under Runtime Management Commands
              below.

              For ovn-northd-ddlog, this option can be used together with
              --ddlog-record to generate a replay log without restarting a
              process or disturbing a running system.

       --n-threads N
              In certain situations, it may be desirable to enable
              parallelization on a system to decrease latency (at the
              potential cost of increased CPU usage).

              This option causes ovn-northd to use N threads when building
              logical flows, when N is in the range [2-256]. If N is 1,
              parallelization is disabled (the default behavior). If N is
              less than 1, then N is set to 1, parallelization is disabled,
              and a warning is logged. If N is greater than 256, then N is
              set to 256, parallelization is enabled (with 256 threads),
              and a warning is logged.

              ovn-northd-ddlog does not support this option.
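
       As the text above notes, --dry-run pairs naturally with
       --ddlog-record for capturing a replay log; a sketch (the record file
       path is hypothetical):

       ```shell
       # Sketch: start ovn-northd-ddlog paused while recording database
       # state and changes for later replay with ovn_northd_cli.
       ovn-northd-ddlog --dry-run \
                        --ddlog-record=/var/log/ovn/replay.dat \
                        --pidfile --detach --log-file
       ```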

       database in the above options must be an OVSDB active or passive
       connection method, as described in ovsdb(7).

   Daemon Options
       --pidfile[=pidfile]
              Causes a file (by default, program.pid) to be created
              indicating the PID of the running process. If the pidfile
              argument is not specified, or if it does not begin with /,
              then it is created in .

              If --pidfile is not specified, no pidfile is created.

       --overwrite-pidfile
              By default, when --pidfile is specified and the specified
              pidfile already exists and is locked by a running process,
              the daemon refuses to start. Specify --overwrite-pidfile to
              cause it to instead overwrite the pidfile.

              When --pidfile is not specified, this option has no effect.

       --detach
              Runs this program as a background process. The process forks,
              and in the child it starts a new session, closes the standard
              file descriptors (which has the side effect of disabling
              logging to the console), and changes its current directory to
              the root (unless --no-chdir is specified). After the child
              completes its initialization, the parent exits.

       --monitor
              Creates an additional process to monitor this program. If it
              dies due to a signal that indicates a programming error
              (SIGABRT, SIGALRM, SIGBUS, SIGFPE, SIGILL, SIGPIPE, SIGSEGV,
              SIGXCPU, or SIGXFSZ), then the monitor process starts a new
              copy of it. If the daemon dies or exits for another reason,
              the monitor process exits.

              This option is normally used with --detach, but it also
              functions without it.

       --no-chdir
              By default, when --detach is specified, the daemon changes
              its current working directory to the root directory after it
              detaches. Otherwise, invoking the daemon from a carelessly
              chosen directory would prevent the administrator from
              unmounting the file system that holds that directory.

              Specifying --no-chdir suppresses this behavior, preventing
              the daemon from changing its current working directory. This
              may be useful for collecting core files, since it is common
              behavior to write core dumps into the current working
              directory and the root directory is not a good directory to
              use.

              This option has no effect when --detach is not specified.

       --no-self-confinement
              By default this daemon will try to self-confine itself to
              working with files under well-known directories determined at
              build time. It is better to stick with this default behavior
              and not to use this flag unless some other access control
              mechanism is used to confine the daemon. Note that in
              contrast to other access control implementations that are
              typically enforced from kernel space (e.g. DAC or MAC),
              self-confinement is imposed by the user-space daemon itself
              and hence should not be considered a full confinement
              strategy, but instead should be viewed as an additional layer
              of security.

       --user=user:group
              Causes this program to run as a different user specified in
              user:group, thus dropping most of the root privileges. Short
              forms user and :group are also allowed, with the current user
              or group assumed, respectively. Only daemons started by the
              root user accept this argument.

              On Linux, daemons will be granted CAP_IPC_LOCK and
              CAP_NET_BIND_SERVICE before dropping root privileges. Daemons
              that interact with a datapath, such as ovs-vswitchd, will be
              granted three additional capabilities, namely CAP_NET_ADMIN,
              CAP_NET_BROADCAST and CAP_NET_RAW. The capability change will
              apply even if the new user is root.

              On Windows, this option is not currently supported. For
              security reasons, specifying this option will cause the
              daemon process not to start.

   Logging Options
       -v[spec]
       --verbose=[spec]
            Sets logging levels. Without any spec, sets the log level for
            every module and destination to dbg. Otherwise, spec is a list
            of words separated by spaces or commas or colons, up to one
            from each category below:

            •  A valid module name, as displayed by the vlog/list command
               on ovs-appctl(8), limits the log level change to the
               specified module.

            •  syslog, console, or file, to limit the log level change to
               only the system log, the console, or a file, respectively.
               (If --detach is specified, the daemon closes its standard
               file descriptors, so logging to the console will have no
               effect.)

               On the Windows platform, syslog is accepted as a word and is
               only useful along with the --syslog-target option (the word
               has no effect otherwise).

            •  off, emer, err, warn, info, or dbg, to control the log
               level. Messages of the given severity or higher will be
               logged, and messages of lower severity will be filtered
               out. off filters out all messages. See ovs-appctl(8) for a
               definition of each log level.

            Case is not significant within spec.

            Regardless of the log levels set for file, logging to a file
            will not take place unless --log-file is also specified (see
            below).

            For compatibility with older versions of OVS, any is accepted
            as a word but has no effect.

       -v
       --verbose
            Sets the maximum logging verbosity level, equivalent to
            --verbose=dbg.

       -vPATTERN:destination:pattern
       --verbose=PATTERN:destination:pattern
            Sets the log pattern for destination to pattern. Refer to
            ovs-appctl(8) for a description of the valid syntax for
            pattern.

       -vFACILITY:facility
       --verbose=FACILITY:facility
            Sets the RFC5424 facility of the log message. facility can be
            one of kern, user, mail, daemon, auth, syslog, lpr, news, uucp,
            clock, ftp, ntp, audit, alert, clock2, local0, local1, local2,
            local3, local4, local5, local6 or local7. If this option is not
            specified, daemon is used as the default for the local system
            syslog and local0 is used while sending a message to the target
            provided via the --syslog-target option.

       --log-file[=file]
            Enables logging to a file. If file is specified, then it is
            used as the exact name for the log file. The default log file
            name used if file is omitted is /var/log/ovn/program.log.

       --syslog-target=host:port
            Send syslog messages to UDP port on host, in addition to the
            system syslog. The host must be a numerical IP address, not a
            hostname.

       --syslog-method=method
            Specify method as how syslog messages should be sent to the
            syslog daemon. The following forms are supported:

            •  libc, to use the libc syslog() function. The downside of
               using this option is that libc adds a fixed prefix to every
               message before it is actually sent to the syslog daemon
               over the /dev/log UNIX domain socket.

            •  unix:file, to use a UNIX domain socket directly. It is
               possible to specify an arbitrary message format with this
               option. However, rsyslogd 8.9 and older versions use a
               hard-coded parser function anyway that limits UNIX domain
               socket use. If you want to use an arbitrary message format
               with older rsyslogd versions, then use a UDP socket to a
               localhost IP address instead.

            •  udp:ip:port, to use a UDP socket. With this method it is
               possible to use an arbitrary message format also with older
               rsyslogd. When sending syslog messages over a UDP socket,
               extra precautions need to be taken: for example, the syslog
               daemon needs to be configured to listen on the specified
               UDP port, accidental iptables rules could interfere with
               local syslog traffic, and there are some security
               considerations that apply to UDP sockets but do not apply
               to UNIX domain sockets.

            •  null, to discard all messages logged to syslog.

            The default is taken from the OVS_SYSLOG_METHOD environment
            variable; if it is unset, the default is libc.

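       The logging options above compose; a sketch of a combined
       invocation (the collector address is hypothetical):

       ```shell
       # Sketch: write dbg-level messages from all modules to a log file,
       # limit console output to warnings, and forward syslog messages to
       # a remote UDP collector in addition to the local system syslog.
       ovn-northd --log-file=/var/log/ovn/ovn-northd.log \
                  -vfile:dbg -vconsole:warn \
                  --syslog-target=192.0.2.20:514
       ```
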
   PKI Options
       PKI configuration is required in order to use SSL for the
       connections to the Northbound and Southbound databases.

       -p privkey.pem
       --private-key=privkey.pem
            Specifies a PEM file containing the private key used as
            identity for outgoing SSL connections.

       -c cert.pem
       --certificate=cert.pem
            Specifies a PEM file containing a certificate that certifies
            the private key specified on -p or --private-key to be
            trustworthy. The certificate must be signed by the certificate
            authority (CA) that the peer in SSL connections will use to
            verify it.

       -C cacert.pem
       --ca-cert=cacert.pem
            Specifies a PEM file containing the CA certificate for
            verifying certificates presented to this program by SSL peers.
            (This may be the same certificate that SSL peers use to verify
            the certificate specified on -c or --certificate, or it may be
            a different one, depending on the PKI design in use.)

       -C none
       --ca-cert=none
            Disables verification of certificates presented by SSL peers.
            This introduces a security risk, because it means that
            certificates cannot be verified to be those of known trusted
            hosts.

   Other Options
       --unixctl=socket
              Sets the name of the control socket on which program listens
              for runtime management commands (see RUNTIME MANAGEMENT
              COMMANDS, below). If socket does not begin with /, it is
              interpreted as relative to . If --unixctl is not used at
              all, the default socket is /program.pid.ctl, where pid is
              program's process ID.

              On Windows, a local named pipe is used to listen for runtime
              management commands. A file is created at the absolute path
              given by socket or, if --unixctl is not used at all, a file
              named program is created in the configured OVS_RUNDIR
              directory. The file exists only to mimic the behavior of a
              Unix domain socket.

              Specifying none for socket disables the control socket
              feature.

       -h
       --help
              Prints a brief help message to the console.

       -V
       --version
              Prints version information to the console.

RUNTIME MANAGEMENT COMMANDS
       ovs-appctl can send commands to a running ovn-northd process. The
       currently supported commands are described below.

       exit   Causes ovn-northd to gracefully terminate.

       pause  Pauses ovn-northd. When it is paused, ovn-northd receives
              changes from the Northbound and Southbound databases as
              usual, but it does not send any updates. A paused ovn-northd
              also drops database locks, which allows any other non-paused
              instance of ovn-northd to take over.

       resume Resumes ovn-northd operation, processing Northbound and
              Southbound database contents and generating logical flows.
              This also instructs ovn-northd to attempt to acquire the
              lock on the SB DB.

       is-paused
              Returns "true" if ovn-northd is currently paused, "false"
              otherwise.

       status Prints this server's status. Status will be "active" if
              ovn-northd has acquired the OVSDB lock on the SB DB,
              "standby" if it has not, or "paused" if this instance is
              paused.
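
       For example, querying and toggling the state of a running instance
       through its control socket might look like this (using the default
       target name; adjust if --unixctl was used):

       ```shell
       # Sketch: drive a running ovn-northd via ovs-appctl.
       ovs-appctl -t ovn-northd status      # "active", "standby", or "paused"
       ovs-appctl -t ovn-northd pause
       ovs-appctl -t ovn-northd is-paused   # "true" while paused
       ovs-appctl -t ovn-northd resume
       ```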

       sb-cluster-state-reset
              Reset southbound database cluster status when databases are
              destroyed and rebuilt.

              If all databases in a clustered southbound database are
              removed from disk, then the stored index of all databases
              will be reset to zero. This will cause ovn-northd to be
              unable to read or write to the southbound database, because
              it will always detect the data as stale. In such a case, run
              this command so that ovn-northd will reset its local index
              and can interact with the southbound database again.

       nb-cluster-state-reset
              Reset northbound database cluster status when databases are
              destroyed and rebuilt.

              This performs the same task as sb-cluster-state-reset except
              for the northbound database client.

       set-n-threads N
              Set the number of threads used for building logical flows.
              When N is in the range [2-256], parallelization is enabled.
              When N is 1, parallelization is disabled. When N is less
              than 1 or greater than 256, an error is returned. If
              ovn-northd fails to start parallelization (e.g. fails to set
              up semaphores), parallelization is disabled and an error is
              returned.

       get-n-threads
              Return the number of threads used for building logical
              flows.
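
       The thread count can thus be changed at runtime without restarting
       the daemon; a sketch:

       ```shell
       # Sketch: enable 4-way parallel logical flow building on a running
       # ovn-northd, then confirm the setting took effect.
       ovs-appctl -t ovn-northd set-n-threads 4
       ovs-appctl -t ovn-northd get-n-threads
       ```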

       inc-engine/show-stats
              Display ovn-northd engine counters. For each engine node the
              following counters are available:

              •  recompute

              •  compute

              •  abort

       inc-engine/show-stats engine_node_name counter_name
              Display the ovn-northd engine counter(s) for the specified
              engine_node_name. counter_name is optional and can be one of
              recompute, compute or abort.

       inc-engine/clear-stats
              Reset ovn-northd engine counters.
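
       A sketch of inspecting the incremental engine counters (the node
       name "lflow" is hypothetical; use the names printed by the first
       command):

       ```shell
       # Sketch: dump all engine counters, then narrow to one counter of
       # one engine node, then reset everything.
       ovs-appctl -t ovn-northd inc-engine/show-stats
       ovs-appctl -t ovn-northd inc-engine/show-stats lflow recompute
       ovs-appctl -t ovn-northd inc-engine/clear-stats
       ```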

       Only ovn-northd-ddlog supports the following commands:

       enable-cpu-profiling
       disable-cpu-profiling
              Enables or disables profiling of CPU time used by the DDlog
              engine. When CPU profiling is enabled, the profile command
              (see below) will include DDlog CPU usage statistics in its
              output. Enabling CPU profiling will slow ovn-northd-ddlog.
              Disabling CPU profiling does not clear any previously
              recorded statistics.

       profile
              Outputs a profile of the current and peak sizes of
              arrangements inside DDlog. This profiling data can be useful
              for optimizing DDlog code. If CPU profiling was previously
              enabled (even if it was later disabled), the output also
              includes a CPU time profile. See "Profiling" inside the
              tutorial in the DDlog repository for an introduction to
              profiling DDlog.

       You may run ovn-northd more than once in an OVN deployment. When
       connected to a standalone or clustered DB setup, OVN will
       automatically ensure that only one of them is active at a time. If
       multiple instances of ovn-northd are running and the active
       ovn-northd fails, one of the hot standby instances of ovn-northd
       will automatically take over.

   Active-Standby with multiple OVN DB servers
       You may run multiple OVN DB servers in an OVN deployment with:

       •  OVN DB servers deployed in active/passive mode with one active
          and multiple passive ovsdb-servers.

       •  ovn-northd also deployed on all these nodes, using Unix domain
          control sockets to connect to the local OVN DB servers.

       In such deployments, the ovn-northd instances on the passive nodes
       will process the DB changes and compute logical flows that are
       later thrown away, because write transactions are not allowed by
       the passive ovsdb-servers. This results in unnecessary CPU usage.

       With the help of the runtime management command pause, you can
       pause ovn-northd on these nodes. When a passive node becomes
       master, you can use the runtime management command resume to resume
       ovn-northd and process the DB changes.
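
       The pause/resume cycle described above can be scripted on each
       node; a sketch (how promotion is detected is left to the cluster
       manager and is hypothetical here):

       ```shell
       # Sketch: keep the co-located ovn-northd paused on a passive DB
       # node, and resume it once this node's ovsdb-server becomes active.
       ovs-appctl -t ovn-northd pause    # run on every passive node
       # ... later, after this node is promoted to master ...
       ovs-appctl -t ovn-northd resume
       ```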

LOGICAL FLOW TABLE STRUCTURE
       One of the main purposes of ovn-northd is to populate the
       Logical_Flow table in the OVN_Southbound database. This section
       describes how ovn-northd does this for switch and router logical
       datapaths.

   Logical Switch Datapaths
       Ingress Table 0: Admission Control and Ingress Port Security check

       Ingress table 0 contains these logical flows:

       •  Priority 100 flows to drop packets with VLAN tags or multicast
          Ethernet source addresses.

       •  For each disabled logical port, a priority 100 flow is added
          which matches on all packets and applies the action
          REGBIT_PORT_SEC_DROP = 1; next; so that the packets are dropped
          in the next stage.

       •  For each (enabled) vtep logical port, a priority 70 flow is
          added which matches on all packets and applies the action
          next(pipeline=ingress, table=S_SWITCH_IN_L2_LKUP) = 1; to skip
          most stages of the ingress pipeline and go directly to the
          ingress L2 lookup table to determine the output port. Packets
          from a VTEP (RAMP) switch should not be subjected to any ACL
          checks; the egress pipeline will do the ACL checks.

       •  For each enabled logical port configured with a qdisc queue id
          in the options:qdisc_queue_id column of Logical_Switch_Port, a
          priority 70 flow is added which matches on all packets and
          applies the action set_queue(id); REGBIT_PORT_SEC_DROP =
          check_in_port_sec(); next;.

       •  A priority 1 flow is added which matches on all packets for all
          the logical ports and applies the action REGBIT_PORT_SEC_DROP =
          check_in_port_sec(); next; to evaluate the port security. The
          check_in_port_sec action applies the port security rules
          defined in the port_security column of the Logical_Switch_Port
          table.

       Ingress Table 1: Ingress Port Security - Apply

       This table drops the packets if the port security check failed in
       the previous stage, i.e. the register bit REGBIT_PORT_SEC_DROP is
       set to 1.

       Ingress table 1 contains these logical flows:

       •  A priority-50 fallback flow that drops the packet if the
          register bit REGBIT_PORT_SEC_DROP is set to 1.

       •  One priority-0 fallback flow that matches all packets and
          advances to the next table.
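
       The flows generated for these stages can be inspected in the
       Southbound database; a sketch (the switch name "sw0" is
       hypothetical, and stage names can vary between OVN versions):

       ```shell
       # Sketch: list the port-security stages of a logical switch's
       # ingress pipeline as installed in the SB Logical_Flow table.
       ovn-sbctl lflow-list sw0 | grep -i port_sec
       ```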

       Ingress Table 2: Lookup MAC address learning table

       This table looks up the MAC learning table of the logical switch
       datapath to check whether the port-MAC pair is present or not. A
       MAC is learnt only for logical switch VIF ports whose port security
       is disabled and that have the 'unknown' address set.

       •  For each such logical port p whose port security is disabled
          and that has the 'unknown' address set, the following flow is
          added:

          •  A priority 100 flow with the match inport == p and action
             reg0[11] = lookup_fdb(inport, eth.src); next;

       •  One priority-0 fallback flow that matches all packets and
          advances to the next table.

       Ingress Table 3: Learn MAC of 'unknown' ports

       This table learns the MAC addresses seen on the logical ports
       whose port security is disabled and that have the 'unknown'
       address set, if the lookup_fdb action returned false in the
       previous table.

       •  For each such logical port p whose port security is disabled
          and that has the 'unknown' address set, the following flow is
          added:

          •  A priority 100 flow with the match inport == p && reg0[11]
             == 0 and action put_fdb(inport, eth.src); next; which stores
             the port-MAC pair in the MAC learning table of the logical
             switch datapath and advances the packet to the next table.

       •  One priority-0 fallback flow that matches all packets and
          advances to the next table.

       Ingress Table 4: from-lport Pre-ACLs

       This table prepares flows for possible stateful ACL processing in
       ingress table ACLs. It contains a priority-0 flow that simply
       moves traffic to the next table. If stateful ACLs are used in the
       logical datapath, a priority-100 flow is added that sets a hint
       (with reg0[0] = 1; next;) for table Pre-stateful to send IP
       packets to the connection tracker before eventually advancing to
       ingress table ACLs. If special ports such as router ports or
       localnet ports can't use ct(), a priority-110 flow is added to
       skip over stateful ACLs. Multicast, IPv6 Neighbor Discovery and
       MLD traffic also skips stateful ACLs. For "allow-stateless" ACLs,
       a flow is added to bypass setting the hint for connection tracker
       processing.

       This table also has a priority-110 flow with the match eth.dst ==
       E for all logical switch datapaths to move traffic to the next
       table, where E is the service monitor MAC defined in the
       options:svc_monitor_mac column of the NB_Global table.

       Ingress Table 5: Pre-LB

       This table prepares flows for possible stateful load balancing
       processing in ingress tables LB and Stateful. It contains a
       priority-0 flow that simply moves traffic to the next table.
       Moreover it contains two priority-110 flows to move multicast,
       IPv6 Neighbor Discovery and MLD traffic to the next table. If load
       balancing rules with virtual IP addresses (and ports) are
       configured in the OVN_Northbound database for a logical switch
       datapath, a priority-100 flow is added with the match ip to match
       on IP packets, with the action reg0[2] = 1; next; to act as a hint
       for table Pre-stateful to send IP packets to the connection
       tracker for packet de-fragmentation (and to possibly do DNAT for
       already established load balanced traffic) before eventually
       advancing to ingress table Stateful. If controller_event has been
       enabled and load balancing rules with empty backends have been
       added in OVN_Northbound, a priority-130 flow is added to trigger
       ovn-controller events whenever the chassis receives a packet for
       that particular VIP. If the event-elb meter has been previously
       created, it will be associated with the empty_lb logical flow.

       Prior to OVN 20.09 we were setting reg0[0] = 1 only if the IP
       destination matched the load balancer VIP. However, this had
       issues in cases where a logical switch doesn't have any ACLs with
       the allow-related action. To understand the issue, let's take a
       TCP load balancer 10.0.0.10:80=10.0.0.3:80. If a logical port p1
       with IP 10.0.0.5 opens a TCP connection with the VIP 10.0.0.10,
       then the packet in the ingress pipeline of p1 is sent to p1's
       conntrack zone id and the packet is load balanced to the backend
       10.0.0.3. The reply packet from the backend lport is not sent to
       the conntrack of the backend lport's zone id. This is fine as long
       as the packet is valid. But suppose the backend lport sends an
       invalid TCP packet (such as an incorrect sequence number): the
       packet gets delivered to the lport p1 without being unDNATted to
       the VIP 10.0.0.10, and this causes the connection to be reset by
       lport p1's VIF.

       We can't fix this issue by adding a logical flow to drop ct.inv
       packets in the egress pipeline, since it would drop all other
       connections not destined to the load balancers. To fix this issue,
       we send all the packets to the conntrack in the ingress pipeline
       if a load balancer is configured. We can then add a logical flow
       to drop ct.inv packets.

       This table also has priority-120 flows that punt all IGMP/MLD
       packets to ovn-controller if the switch is an interconnect switch
       with multicast snooping enabled.

       This table also has a priority-110 flow with the match eth.dst ==
       E for all logical switch datapaths to move traffic to the next
       table, where E is the service monitor MAC defined in the
       options:svc_monitor_mac column of the NB_Global table.

       This table also has a priority-110 flow with the match inport == I
       for all logical switch datapaths to move traffic to the next
       table, where I is the peer of a logical router port. This flow is
       added to skip the connection tracking of packets which enter from
       a logical router datapath to a logical switch datapath.

       Ingress Table 6: Pre-stateful

       This table prepares flows for all possible stateful processing in
       the next tables. It contains a priority-0 flow that simply moves
       traffic to the next table.

       •  Priority-120 flows that send the packets to the connection
          tracker using ct_lb_mark; as the action, so that already
          established traffic destined to the load balancer VIP gets
          DNATted. These flows match each VIP's IP and port. For IPv4
          traffic the flows also load the original destination IP and
          transport port into registers reg1 and reg2. For IPv6 traffic
          the flows also load the original destination IP and transport
          port into registers xxreg1 and reg2.

       •  A priority-110 flow sends the packets that don't match the
          above flows to the connection tracker based on a hint provided
          by the previous tables (with a match for reg0[2] == 1) by using
          the ct_lb_mark; action.

       •  A priority-100 flow sends the packets to the connection tracker
          based on a hint provided by the previous tables (with a match
          for reg0[0] == 1) by using the ct_next; action.

       Ingress Table 7: from-lport ACL hints

       This table consists of logical flows that set hints (reg0 bits) to
       be used in the next stage, in the ACL processing table, if
       stateful ACLs or load balancers are configured. Multiple hints can
       be set for the same packet. The possible hints are:

       •  reg0[7]: the packet might match an allow-related ACL and might
          have to commit the connection to conntrack.

       •  reg0[8]: the packet might match an allow-related ACL but there
          will be no need to commit the connection to conntrack because
          it already exists.

       •  reg0[9]: the packet might match a drop/reject ACL.

       •  reg0[10]: the packet might match a drop/reject ACL but the
          connection was previously allowed, so it might have to be
          committed again with ct_label=1/1.

       The table contains the following flows:

       •  A priority-65535 flow to advance to the next table if the
          logical switch has no ACLs configured, otherwise a priority-0
          flow to advance to the next table.

       •  A priority-7 flow that matches on packets that initiate a new
          session. This flow sets reg0[7] and reg0[9] and then advances
          to the next table.

       •  A priority-6 flow that matches on packets that are in the
          request direction of an already existing session that has been
          marked as blocked. This flow sets reg0[7] and reg0[9] and then
          advances to the next table.

       •  A priority-5 flow that matches untracked packets. This flow
          sets reg0[8] and reg0[9] and then advances to the next table.

       •  A priority-4 flow that matches on packets that are in the
          request direction of an already existing session that has not
          been marked as blocked. This flow sets reg0[8] and reg0[10] and
          then advances to the next table.

       •  A priority-3 flow that matches on packets that are not part of
          established sessions. This flow sets reg0[9] and then advances
          to the next table.

       •  A priority-2 flow that matches on packets that are part of an
          established session that has been marked as blocked. This flow
          sets reg0[9] and then advances to the next table.

       •  A priority-1 flow that matches on packets that are part of an
          established session that has not been marked as blocked. This
          flow sets reg0[10] and then advances to the next table.

       Ingress table 8: from-lport ACLs before LB

       Logical flows in this table closely reproduce those in the ACL
       table in the OVN_Northbound database for the from-lport direction
       without the option apply-after-lb set, or with it set to false.
       The priority values from the ACL table have a limited range and
       have 1000 added to them to leave room for OVN default flows at
       both higher and lower priorities.

       •  allow ACLs translate into logical flows with the next; action.
          If there are any stateful ACLs on this datapath, then allow
          ACLs translate to ct_commit; next; (which acts as a hint for
          the next tables to commit the connection to conntrack). If the
          ACL has a label, then reg3 is loaded with the label value and
          the reg0[13] bit is set to 1 (which acts as a hint for the next
          tables to commit the label to conntrack).

       •  allow-related ACLs translate into logical flows with the
          ct_commit(ct_label=0/1); next; actions for new connections and
          reg0[1] = 1; next; for existing connections. If the ACL has a
          label, then reg3 is loaded with the label value and the
          reg0[13] bit is set to 1 (which acts as a hint for the next
          tables to commit the label to conntrack).

       •  allow-stateless ACLs translate into logical flows with the
          next; action.

       •  reject ACLs translate into logical flows with the tcp_reset {
          output <-> inport; next(pipeline=egress,table=5); } action for
          TCP connections, the icmp4/icmp6 action for UDP connections,
          and the sctp_abort { output <-> inport;
          next(pipeline=egress,table=5); } action for SCTP associations.

       •  Other ACLs translate to drop; for new or untracked connections
          and ct_commit(ct_label=1/1); for known connections. Setting
          ct_label marks a connection as one that was previously allowed,
          but should no longer be allowed due to a policy change.

       This table contains a priority-65535 flow to advance to the next
       table if the logical switch has no ACLs configured; otherwise a
       priority-0 flow to advance to the next table, so that ACLs allow
       packets by default if the options:default_acl_drop column of
       NB_Global is false or not set. Otherwise the flow action is set to
       drop; to implement a default drop behavior.

       If the logical datapath has a stateful ACL or a load balancer with
       a VIP configured, the following flows will also be added:

       •  If the options:default_acl_drop column of NB_Global is false or
          not set, a priority-1 flow that sets the hint to commit IP
          traffic that is not part of established sessions to the
          connection tracker (with action reg0[1] = 1; next;). This is
          needed for the default allow policy because, while the
          initiator's direction may not have any stateful rules, the
          server's may, and then its return traffic would not be known
          and would be marked as invalid.

       •  If the options:default_acl_drop column of NB_Global is true, a
          priority-1 flow that drops IP traffic that is not part of
          established sessions.
743
751 • A priority-65532 flow that allows any traffic in the re‐
752 ply direction for a connection that has been committed to
753 the connection tracker (i.e., established flows), as long
754 as the committed flow does not have ct_mark.blocked set.
755 We only handle traffic in the reply direction here be‐
756 cause we want all packets going in the request direction
757 to still go through the flows that implement the cur‐
758 rently defined policy based on ACLs. If a connection is
759 no longer allowed by policy, ct_mark.blocked will get set
760 and packets in the reply direction will no longer be al‐
761 lowed, either. This flow also clears the register bits
762 reg0[9] and reg0[10]. If ACL logging and logging of re‐
763 lated packets is enabled, then a companion priority-65533
764 flow will be installed that accomplishes the same thing
765 but also logs the traffic.
766
767 • A priority-65532 flow that allows any traffic that is
768 considered related to a committed flow in the connection
769 tracker (e.g., an ICMP Port Unreachable from a non-lis‐
770 tening UDP port), as long as the committed flow does not
771 have ct_mark.blocked set. This flow also applies NAT to
772 the related traffic so that ICMP headers and the inner
773 packet have correct addresses. If ACL logging and logging
774 of related packets is enabled, then a companion prior‐
775 ity-65533 flow will be installed that accomplishes the
776 same thing but also logs the traffic.
777
778 • A priority-65532 flow that drops all traffic marked by
779 the connection tracker as invalid.
780
781 • A priority-65532 flow that drops all traffic in the reply
782 direction with ct_mark.blocked set meaning that the con‐
783 nection should no longer be allowed due to a policy
784 change. Packets in the request direction are skipped here
785 to let a newly created ACL re-allow this connection.
786
           •      A priority-65532 flow that allows IPv6 Neighbor
                  solicitation, Neighbor advertisement, Router
                  solicitation, Router advertisement and MLD packets.
790
791 If the logical datapath has any ACL or a load balancer with VIP config‐
792 ured, the following flow will also be added:
793
           •      A priority-34000 logical flow is added for each logical
                  switch datapath with the match eth.dst == E to allow the
                  service monitor reply packet destined to ovn-controller
                  with the action next;, where E is the service monitor
                  mac defined in the options:svc_monitor_mac column of the
                  NB_Global table.
800
801 Ingress Table 9: from-lport QoS Marking
802
803 Logical flows in this table closely reproduce those in the QoS table
804 with the action column set in the OVN_Northbound database for the
805 from-lport direction.
806
807 • For every qos_rules entry in a logical switch with DSCP
808 marking enabled, a flow will be added at the priority
809 mentioned in the QoS table.
810
811 • One priority-0 fallback flow that matches all packets and
812 advances to the next table.
813
814 Ingress Table 10: from-lport QoS Meter
815
816 Logical flows in this table closely reproduce those in the QoS table
817 with the bandwidth column set in the OVN_Northbound database for the
818 from-lport direction.
819
820 • For every qos_rules entry in a logical switch with meter‐
821 ing enabled, a flow will be added at the priority men‐
822 tioned in the QoS table.
823
824 • One priority-0 fallback flow that matches all packets and
825 advances to the next table.
826
827 Ingress Table 11: Load balancing affinity check
828
829 Load balancing affinity check table contains the following logical
830 flows:
831
           •      For all the configured load balancing rules for a
                  switch in the OVN_Northbound database where a positive
                  affinity timeout is specified in the options column,
                  that include an L4 port PORT of protocol P and IP
                  address VIP, a priority-100 flow is added. For IPv4
                  VIPs, the flow matches ct.new && ip && ip4.dst == VIP
                  && P.dst == PORT. For IPv6 VIPs, the flow matches
                  ct.new && ip && ip6.dst == VIP && P && P.dst == PORT.
                  The flow’s action is reg9[6] = chk_lb_aff(); next;.
841
842 • A priority 0 flow is added which matches on all packets
843 and applies the action next;.
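
       Construction of the affinity-check match above can be sketched as
       follows (hypothetical illustration, not ovn-northd code;
       lb_affinity_check_match is an invented helper name):

```python
def lb_affinity_check_match(vip, port, proto):
    """Build the priority-100 match for the affinity check table."""
    if ":" in vip:  # IPv6 VIP
        return (f"ct.new && ip && ip6.dst == {vip} && "
                f"{proto} && {proto}.dst == {port}")
    return f"ct.new && ip && ip4.dst == {vip} && {proto}.dst == {port}"
```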
844
845 Ingress Table 12: LB
846
847 • For all the configured load balancing rules for a switch
848 in OVN_Northbound database where a positive affinity
849 timeout is specified in options column, that includes a
850 L4 port PORT of protocol P and IP address VIP, a prior‐
851 ity-150 flow is added. For IPv4 VIPs, the flow matches
                  reg9[6] == 1 && ct.new && ip && ip4.dst == VIP && P.dst
                  == PORT. For IPv6 VIPs, the flow matches reg9[6] == 1 &&
                  ct.new && ip && ip6.dst == VIP && P && P.dst == PORT.
855 The flow’s action is ct_lb_mark(args), where args con‐
856 tains comma separated IP addresses (and optional port
857 numbers) to load balance to. The address family of the IP
858 addresses of args is the same as the address family of
859 VIP.
860
861 • For all the configured load balancing rules for a switch
862 in OVN_Northbound database that includes a L4 port PORT
863 of protocol P and IP address VIP, a priority-120 flow is
                  added. For IPv4 VIPs, the flow matches ct.new && ip &&
                  ip4.dst == VIP && P.dst == PORT. For IPv6 VIPs, the flow
                  matches ct.new && ip && ip6.dst == VIP && P && P.dst ==
                  PORT. The flow’s action is ct_lb_mark(args), where args
868 contains comma separated IP addresses (and optional port
869 numbers) to load balance to. The address family of the IP
870 addresses of args is the same as the address family of
871 VIP. If health check is enabled, then args will only con‐
872 tain those endpoints whose service monitor status entry
873 in OVN_Southbound db is either online or empty. For IPv4
874 traffic the flow also loads the original destination IP
875 and transport port in registers reg1 and reg2. For IPv6
876 traffic the flow also loads the original destination IP
877 and transport port in registers xxreg1 and reg2. The
878 above flow is created even if the load balancer is at‐
879 tached to a logical router connected to the current logi‐
880 cal switch and the install_ls_lb_from_router variable in
881 options is set to true.
882
883 • For all the configured load balancing rules for a switch
884 in OVN_Northbound database that includes just an IP ad‐
885 dress VIP to match on, OVN adds a priority-110 flow. For
886 IPv4 VIPs, the flow matches ct.new && ip && ip4.dst ==
887 VIP. For IPv6 VIPs, the flow matches ct.new && ip &&
888 ip6.dst == VIP. The action on this flow is
889 ct_lb_mark(args), where args contains comma separated IP
890 addresses of the same address family as VIP. For IPv4
891 traffic the flow also loads the original destination IP
892 and transport port in registers reg1 and reg2. For IPv6
893 traffic the flow also loads the original destination IP
894 and transport port in registers xxreg1 and reg2. The
895 above flow is created even if the load balancer is at‐
896 tached to a logical router connected to the current logi‐
897 cal switch and the install_ls_lb_from_router variable in
898 options is set to true.
899
           •      If the load balancer is created with the --reject
                  option and it has no active backends, a TCP reset
                  segment (for TCP) or an ICMP port unreachable packet
                  (for all other kinds of traffic) will be sent whenever
                  an incoming packet is received for this load balancer.
                  Note that using the --reject option disables the
                  empty_lb SB controller event for this load balancer.
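
       The health-check filtering of the ct_lb_mark() arguments described
       above can be sketched as follows (hypothetical, not ovn-northd
       code; lb_action is an invented name, and a load balancer with no
       usable backends is represented by returning None):

```python
def lb_action(backends, health_check=False):
    """backends: list of (endpoint, status); status is None (empty),
    "online", or "offline" as reported by the service monitor."""
    if health_check:
        # Keep only endpoints whose status entry is online or empty.
        backends = [(ep, st) for ep, st in backends
                    if st in (None, "online")]
    args = ",".join(ep for ep, _ in backends)
    return f"ct_lb_mark({args});" if args else None
```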
907
908 Ingress Table 13: Load balancing affinity learn
909
910 Load balancing affinity learn table contains the following logical
911 flows:
912
           •      For all the configured load balancing rules for a
                  switch in the OVN_Northbound database where a positive
                  affinity timeout T is specified in the options column,
                  that include an L4 port PORT of protocol P and IP
                  address VIP, a priority-100 flow is added. For IPv4
                  VIPs, the flow matches reg9[6] == 0 && ct.new && ip &&
                  ip4.dst == VIP && P.dst == PORT. For IPv6 VIPs, the
                  flow matches reg9[6] == 0 && ct.new && ip && ip6.dst ==
                  VIP && P && P.dst == PORT. The flow’s action is
                  commit_lb_aff(vip = VIP:PORT, backend = backend ip:
                  backend port, proto = P, timeout = T);.
923
924 • A priority 0 flow is added which matches on all packets
925 and applies the action next;.
926
927 Ingress table 14: from-lport ACLs after LB
928
929 Logical flows in this table closely reproduce those in the ACL table in
930 the OVN_Northbound database for the from-lport direction with the op‐
931 tion apply-after-lb set to true. The priority values from the ACL table
932 have a limited range and have 1000 added to them to leave room for OVN
933 default flows at both higher and lower priorities.
934
935 • allow apply-after-lb ACLs translate into logical flows
936 with the next; action. If there are any stateful ACLs
937 (including both before-lb and after-lb ACLs) on this
938 datapath, then allow ACLs translate to ct_commit; next;
939 (which acts as a hint for the next tables to commit the
940 connection to conntrack). In case the ACL has a label
941 then reg3 is loaded with the label value and reg0[13] bit
942 is set to 1 (which acts as a hint for the next tables to
943 commit the label to conntrack).
944
945 • allow-related apply-after-lb ACLs translate into logical
946 flows with the ct_commit(ct_label=0/1); next; actions for
947 new connections and reg0[1] = 1; next; for existing con‐
948 nections. In case the ACL has a label then reg3 is loaded
949 with the label value and reg0[13] bit is set to 1 (which
950 acts as a hint for the next tables to commit the label to
951 conntrack).
952
953 • allow-stateless apply-after-lb ACLs translate into logi‐
954 cal flows with the next; action.
955
           •      reject apply-after-lb ACLs translate into logical flows
                  with the tcp_reset { output <-> inport;
                  next(pipeline=egress,table=5); } action for TCP
                  connections, icmp4/icmp6 action for UDP connections,
                  and sctp_abort { output <-> inport;
                  next(pipeline=egress,table=5); } action for SCTP
                  associations.
962
963 • Other apply-after-lb ACLs translate to drop; for new or
964 untracked connections and ct_commit(ct_label=1/1); for
965 known connections. Setting ct_label marks a connection as
966 one that was previously allowed, but should no longer be
967 allowed due to a policy change.
968
969 • One priority-0 fallback flow that matches all packets and
970 advances to the next table.
971
972 Ingress Table 15: Stateful
973
974 • A priority 100 flow is added which commits the packet to
975 the conntrack and sets the most significant 32-bits of
976 ct_label with the reg3 value based on the hint provided
977 by previous tables (with a match for reg0[1] == 1 &&
978 reg0[13] == 1). This is used by the ACLs with label to
979 commit the label value to conntrack.
980
981 • For ACLs without label, a second priority-100 flow com‐
982 mits packets to connection tracker using ct_commit; next;
983 action based on a hint provided by the previous tables
984 (with a match for reg0[1] == 1 && reg0[13] == 0).
985
986 • A priority-0 flow that simply moves traffic to the next
987 table.
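
       The way the Stateful table acts on the hint bits set by the earlier
       ACL tables can be sketched as follows (hypothetical, not
       ovn-northd code; the exact ct_commit syntax for storing reg3 in the
       most significant 32 bits of ct_label is illustrative):

```python
def stateful_action(reg0_1, reg0_13):
    """Pick the Stateful table action from the reg0 hint bits."""
    if reg0_1 == 1 and reg0_13 == 1:
        # ACL with a label: commit the connection and store reg3 in
        # ct_label's most significant 32 bits.
        return "ct_commit { ct_label[96..127] = reg3; }; next;"
    if reg0_1 == 1:
        # ACL without a label: plain commit.
        return "ct_commit; next;"
    return "next;"
```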
988
989 Ingress Table 16: Pre-Hairpin
990
991 • If the logical switch has load balancer(s) configured,
992 then a priority-100 flow is added with the match ip &&
993 ct.trk to check if the packet needs to be hairpinned (if
994 after load balancing the destination IP matches the
995 source IP) or not by executing the actions reg0[6] =
996 chk_lb_hairpin(); and reg0[12] = chk_lb_hairpin_reply();
997 and advances the packet to the next table.
998
999 • A priority-0 flow that simply moves traffic to the next
1000 table.
1001
1002 Ingress Table 17: Nat-Hairpin
1003
1004 • If the logical switch has load balancer(s) configured,
1005 then a priority-100 flow is added with the match ip &&
1006 ct.new && ct.trk && reg0[6] == 1 which hairpins the traf‐
1007 fic by NATting source IP to the load balancer VIP by exe‐
1008 cuting the action ct_snat_to_vip and advances the packet
1009 to the next table.
1010
1011 • If the logical switch has load balancer(s) configured,
1012 then a priority-100 flow is added with the match ip &&
1013 ct.est && ct.trk && reg0[6] == 1 which hairpins the traf‐
1014 fic by NATting source IP to the load balancer VIP by exe‐
1015 cuting the action ct_snat and advances the packet to the
1016 next table.
1017
1018 • If the logical switch has load balancer(s) configured,
1019 then a priority-90 flow is added with the match ip &&
1020 reg0[12] == 1 which matches on the replies of hairpinned
1021 traffic (i.e., destination IP is VIP, source IP is the
1022 backend IP and source L4 port is backend port for L4 load
1023 balancers) and executes ct_snat and advances the packet
1024 to the next table.
1025
1026 • A priority-0 flow that simply moves traffic to the next
1027 table.
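
       The choice among the Nat-Hairpin flows above can be summarized as
       follows (hypothetical sketch, not ovn-northd code):

```python
def nat_hairpin_action(ct_new, ct_est, ct_trk, reg0_6, reg0_12):
    """Choose the NAT action applied to hairpinned LB traffic."""
    if ct_trk and reg0_6:
        if ct_new:
            return "ct_snat_to_vip"  # new hairpinned connections
        if ct_est:
            return "ct_snat"         # established hairpinned connections
    if reg0_12:
        return "ct_snat"             # replies of hairpinned traffic
    return "next"
```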
1028
1029 Ingress Table 18: Hairpin
1030
1031 • For each distributed gateway router port RP attached to
1032 the logical switch, a priority-2000 flow is added with
1033 the match reg0[14] == 1 && is_chassis_resident(RP)
1034 and action next; to pass the traffic to the next table
1035 to respond to the ARP requests for the router port IPs.
1036
                  The reg0[14] register bit is set in the ingress L2 port
                  security check table for traffic received from HW VTEP
                  (ramp) ports.
1040
1041 • A priority-1000 flow that matches on reg0[14] register
1042 bit for the traffic received from HW VTEP (ramp) ports.
1043 This traffic is passed to ingress table ls_in_l2_lkup.
1044
1045 • A priority-1 flow that hairpins traffic matched by non-
1046 default flows in the Pre-Hairpin table. Hairpinning is
1047 done at L2, Ethernet addresses are swapped and the pack‐
1048 ets are looped back on the input port.
1049
1050 • A priority-0 flow that simply moves traffic to the next
1051 table.
1052
1053 Ingress Table 19: ARP/ND responder
1054
1055 This table implements ARP/ND responder in a logical switch for known
1056 IPs. The advantage of the ARP responder flow is to limit ARP broadcasts
1057 by locally responding to ARP requests without the need to send to other
1058 hypervisors. One common case is when the inport is a logical port asso‐
1059 ciated with a VIF and the broadcast is responded to on the local hyper‐
1060 visor rather than broadcast across the whole network and responded to
1061 by the destination VM. This behavior is proxy ARP.
1062
1063 ARP requests arrive from VMs from a logical switch inport of type de‐
1064 fault. For this case, the logical switch proxy ARP rules can be for
1065 other VMs or logical router ports. Logical switch proxy ARP rules may
1066 be programmed both for mac binding of IP addresses on other logical
1067 switch VIF ports (which are of the default logical switch port type,
1068 representing connectivity to VMs or containers), and for mac binding of
1069 IP addresses on logical switch router type ports, representing their
1070 logical router port peers. In order to support proxy ARP for logical
1071 router ports, an IP address must be configured on the logical switch
1072 router type port, with the same value as the peer logical router port.
1073 The configured MAC addresses must match as well. When a VM sends an ARP
1074 request for a distributed logical router port and if the peer router
1075 type port of the attached logical switch does not have an IP address
1076 configured, the ARP request will be broadcast on the logical switch.
1077 One of the copies of the ARP request will go through the logical switch
1078 router type port to the logical router datapath, where the logical
1079 router ARP responder will generate a reply. The MAC binding of a dis‐
1080 tributed logical router, once learned by an associated VM, is used for
1081 all that VM’s communication needing routing. Hence, the action of a VM
1082 re-arping for the mac binding of the logical router port should be
1083 rare.
1084
1085 Logical switch ARP responder proxy ARP rules can also be hit when re‐
1086 ceiving ARP requests externally on a L2 gateway port. In this case, the
1087 hypervisor acting as an L2 gateway, responds to the ARP request on be‐
1088 half of a destination VM.
1089
       Note that ARP requests received from localnet logical inports can
       either go directly to VMs, in which case the VM responds, or can
       hit an ARP responder for a logical router port if the packet is
       used to resolve a logical router port next hop address. In either
       case, logical switch ARP responder rules will not be hit. This
       table contains these logical flows:
1096
           •      Priority-100 flows that skip the ARP responder if the
                  inport is of type localnet and advance directly to the
                  next table. ARP requests sent to localnet ports can be
                  received by multiple hypervisors; because the same mac
                  binding rules are downloaded to all hypervisors, each of
                  them would respond, which would confuse L2 learning on
                  the source of the ARP requests. ARP requests received on
                  an inport of type router are not expected to hit any
                  logical switch ARP responder flows. However, no skip
                  flows are installed for these packets, as there would be
                  some additional flow cost for this and the value appears
                  limited.
1109
           •      If the inport V is of type virtual, a priority-100
                  logical flow is added for each P configured in the
                  options:virtual-parents column with the match

       inport == P && ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP))
       inport == P && ((nd_ns && ip6.dst == {VIP, NS_MULTICAST_ADDR} && nd.target == VIP) || (nd_na && nd.target == VIP))
1116
1117
1118 and applies the action
1119
1120 bind_vport(V, inport);
1121
1122
1123 and advances the packet to the next table.
1124
                  Where VIP is the virtual ip configured in the column
                  options:virtual-ip and NS_MULTICAST_ADDR is the
                  solicited-node multicast address corresponding to the
                  VIP.
1128
1129 • Priority-50 flows that match ARP requests to each known
1130 IP address A of every logical switch port, and respond
1131 with ARP replies directly with corresponding Ethernet ad‐
1132 dress E:
1133
1134 eth.dst = eth.src;
1135 eth.src = E;
1136 arp.op = 2; /* ARP reply. */
1137 arp.tha = arp.sha;
1138 arp.sha = E;
1139 arp.tpa = arp.spa;
1140 arp.spa = A;
1141 outport = inport;
1142 flags.loopback = 1;
1143 output;
1144
1145
1146 These flows are omitted for logical ports (other than
1147 router ports or localport ports) that are down (unless
1148 ignore_lsp_down is configured as true in options column
1149 of NB_Global table of the Northbound database), for logi‐
1150 cal ports of type virtual, for logical ports with ’un‐
1151 known’ address set and for logical ports of a logical
1152 switch configured with other_config:vlan-passthru=true.
1153
1154 The above ARP responder flows are added for the list of
1155 IPv4 addresses if defined in options:arp_proxy column of
1156 Logical_Switch_Port table for logical switch ports of
1157 type router.
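
                  Generation of one such per-address priority-50 ARP
                  responder flow can be sketched as follows (hypothetical,
                  not ovn-northd code; the match shown, arp.op == 1 &&
                  arp.tpa == A, is the standard ARP-request match assumed
                  for this sketch):

```python
def arp_responder_flow(a, e):
    """Return (priority, match, action) for IP address A with MAC E."""
    match = f"arp.op == 1 && arp.tpa == {a}"
    action = ("eth.dst = eth.src; "
              f"eth.src = {e}; "
              "arp.op = 2; "          # ARP reply
              "arp.tha = arp.sha; "
              f"arp.sha = {e}; "
              "arp.tpa = arp.spa; "
              f"arp.spa = {a}; "
              "outport = inport; flags.loopback = 1; output;")
    return (50, match, action)
```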
1158
1159 • Priority-50 flows that match IPv6 ND neighbor solicita‐
1160 tions to each known IP address A (and A’s solicited node
1161 address) of every logical switch port except of type
1162 router, and respond with neighbor advertisements directly
1163 with corresponding Ethernet address E:
1164
1165 nd_na {
1166 eth.src = E;
1167 ip6.src = A;
1168 nd.target = A;
1169 nd.tll = E;
1170 outport = inport;
1171 flags.loopback = 1;
1172 output;
1173 };
1174
1175
1176 Priority-50 flows that match IPv6 ND neighbor solicita‐
1177 tions to each known IP address A (and A’s solicited node
1178 address) of logical switch port of type router, and re‐
1179 spond with neighbor advertisements directly with corre‐
1180 sponding Ethernet address E:
1181
1182 nd_na_router {
1183 eth.src = E;
1184 ip6.src = A;
1185 nd.target = A;
1186 nd.tll = E;
1187 outport = inport;
1188 flags.loopback = 1;
1189 output;
1190 };
1191
1192
1193 These flows are omitted for logical ports (other than
1194 router ports or localport ports) that are down (unless
1195 ignore_lsp_down is configured as true in options column
1196 of NB_Global table of the Northbound database), for logi‐
1197 cal ports of type virtual and for logical ports with ’un‐
1198 known’ address set.
1199
1200 • Priority-100 flows with match criteria like the ARP and
1201 ND flows above, except that they only match packets from
1202 the inport that owns the IP addresses in question, with
1203 action next;. These flows prevent OVN from replying to,
1204 for example, an ARP request emitted by a VM for its own
1205 IP address. A VM only makes this kind of request to at‐
1206 tempt to detect a duplicate IP address assignment, so
1207 sending a reply will prevent the VM from accepting the IP
1208 address that it owns.
1209
1210 In place of next;, it would be reasonable to use drop;
1211 for the flows’ actions. If everything is working as it is
1212 configured, then this would produce equivalent results,
1213 since no host should reply to the request. But ARPing for
1214 one’s own IP address is intended to detect situations
1215 where the network is not working as configured, so drop‐
1216 ping the request would frustrate that intent.
1217
           •      For each SVC_MON_SRC_IP defined in the value of the
                  ip_port_mappings:ENDPOINT_IP column of the
                  Load_Balancer table, a priority-110 logical flow is
                  added with the match arp.tpa == SVC_MON_SRC_IP &&
                  arp.op == 1 and applies the action
1223
1224 eth.dst = eth.src;
1225 eth.src = E;
1226 arp.op = 2; /* ARP reply. */
1227 arp.tha = arp.sha;
1228 arp.sha = E;
1229 arp.tpa = arp.spa;
1230 arp.spa = A;
1231 outport = inport;
1232 flags.loopback = 1;
1233 output;
1234
1235
1236 where E is the service monitor source mac defined in the
1237 options:svc_monitor_mac column in the NB_Global table.
1238 This mac is used as the source mac in the service monitor
1239 packets for the load balancer endpoint IP health checks.
1240
1241 SVC_MON_SRC_IP is used as the source ip in the service
1242 monitor IPv4 packets for the load balancer endpoint IP
1243 health checks.
1244
1245 These flows are required if an ARP request is sent for
1246 the IP SVC_MON_SRC_IP.
1247
           •      For each VIP configured in the Forwarding_Group table,
                  a priority-50 logical flow is added with the match
                  arp.tpa == vip && arp.op == 1 and applies the action
1252
1253 eth.dst = eth.src;
1254 eth.src = E;
1255 arp.op = 2; /* ARP reply. */
1256 arp.tha = arp.sha;
1257 arp.sha = E;
1258 arp.tpa = arp.spa;
1259 arp.spa = A;
1260 outport = inport;
1261 flags.loopback = 1;
1262 output;
1263
1264
1265 where E is the forwarding group’s mac defined in the
1266 vmac.
1267
1268 A is used as either the destination ip for load balancing
1269 traffic to child ports or as nexthop to hosts behind the
1270 child ports.
1271
                  These flows are required to respond to ARP requests
                  sent for the IP vip.
1274
1275 • One priority-0 fallback flow that matches all packets and
1276 advances to the next table.
1277
1278 Ingress Table 20: DHCP option processing
1279
1280 This table adds the DHCPv4 options to a DHCPv4 packet from the logical
1281 ports configured with IPv4 address(es) and DHCPv4 options, and simi‐
1282 larly for DHCPv6 options. This table also adds flows for the logical
1283 ports of type external.
1284
1285 • A priority-100 logical flow is added for these logical
1286 ports which matches the IPv4 packet with udp.src = 68 and
1287 udp.dst = 67 and applies the action put_dhcp_opts and ad‐
1288 vances the packet to the next table.
1289
1290 reg0[3] = put_dhcp_opts(offer_ip = ip, options...);
1291 next;
1292
1293
1294 For DHCPDISCOVER and DHCPREQUEST, this transforms the
1295 packet into a DHCP reply, adds the DHCP offer IP ip and
1296 options to the packet, and stores 1 into reg0[3]. For
1297 other kinds of packets, it just stores 0 into reg0[3].
1298 Either way, it continues to the next table.
1299
1300 • A priority-100 logical flow is added for these logical
1301 ports which matches the IPv6 packet with udp.src = 546
1302 and udp.dst = 547 and applies the action put_dhcpv6_opts
1303 and advances the packet to the next table.
1304
1305 reg0[3] = put_dhcpv6_opts(ia_addr = ip, options...);
1306 next;
1307
1308
1309 For DHCPv6 Solicit/Request/Confirm packets, this trans‐
1310 forms the packet into a DHCPv6 Advertise/Reply, adds the
1311 DHCPv6 offer IP ip and options to the packet, and stores
1312 1 into reg0[3]. For other kinds of packets, it just
1313 stores 0 into reg0[3]. Either way, it continues to the
1314 next table.
1315
           •      A priority-0 flow that matches all packets and
                  advances to the next table.
1318
1319 Ingress Table 21: DHCP responses
1320
1321 This table implements DHCP responder for the DHCP replies generated by
1322 the previous table.
1323
1324 • A priority 100 logical flow is added for the logical
1325 ports configured with DHCPv4 options which matches IPv4
1326 packets with udp.src == 68 && udp.dst == 67 && reg0[3] ==
1327 1 and responds back to the inport after applying these
1328 actions. If reg0[3] is set to 1, it means that the action
1329 put_dhcp_opts was successful.
1330
1331 eth.dst = eth.src;
1332 eth.src = E;
1333 ip4.src = S;
1334 udp.src = 67;
1335 udp.dst = 68;
1336 outport = P;
1337 flags.loopback = 1;
1338 output;
1339
1340
1341 where E is the server MAC address and S is the server
1342 IPv4 address defined in the DHCPv4 options. Note that
1343 ip4.dst field is handled by put_dhcp_opts.
1344
1345 (This terminates ingress packet processing; the packet
1346 does not go to the next ingress table.)
1347
1348 • A priority 100 logical flow is added for the logical
1349 ports configured with DHCPv6 options which matches IPv6
1350 packets with udp.src == 546 && udp.dst == 547 && reg0[3]
1351 == 1 and responds back to the inport after applying these
1352 actions. If reg0[3] is set to 1, it means that the action
1353 put_dhcpv6_opts was successful.
1354
1355 eth.dst = eth.src;
1356 eth.src = E;
1357 ip6.dst = A;
1358 ip6.src = S;
1359 udp.src = 547;
1360 udp.dst = 546;
1361 outport = P;
1362 flags.loopback = 1;
1363 output;
1364
1365
1366 where E is the server MAC address and S is the server
1367 IPv6 LLA address generated from the server_id defined in
1368 the DHCPv6 options and A is the IPv6 address defined in
1369 the logical port’s addresses column.
1370
                  (This terminates ingress packet processing; the packet
                  does not go to the next ingress table.)
1373
           •      A priority-0 flow that matches all packets and
                  advances to the next table.
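
       The DHCPv4 response header rewrites above can be sketched as
       follows (hypothetical, not ovn-northd code; dhcpv4_response_headers
       is an invented name; eth.dst is first set to the client's MAC, the
       request's eth.src, and ip4.dst is handled by put_dhcp_opts):

```python
def dhcpv4_response_headers(server_mac, server_ip, inport):
    """Header rewrites applied when reg0[3] == 1, i.e. when
    put_dhcp_opts succeeded in the previous table."""
    return {
        "eth.src": server_mac,   # E, the DHCP server MAC
        "ip4.src": server_ip,    # S, the DHCP server IPv4 address
        "udp.src": 67,
        "udp.dst": 68,
        "outport": inport,       # the reply goes back out the inport
        "flags.loopback": 1,
    }
```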
1376
1377 Ingress Table 22 DNS Lookup
1378
1379 This table looks up and resolves the DNS names to the corresponding
1380 configured IP address(es).
1381
1382 • A priority-100 logical flow for each logical switch data‐
1383 path if it is configured with DNS records, which matches
1384 the IPv4 and IPv6 packets with udp.dst = 53 and applies
1385 the action dns_lookup and advances the packet to the next
1386 table.
1387
1388 reg0[4] = dns_lookup(); next;
1389
1390
1391 For valid DNS packets, this transforms the packet into a
1392 DNS reply if the DNS name can be resolved, and stores 1
1393 into reg0[4]. For failed DNS resolution or other kinds of
1394 packets, it just stores 0 into reg0[4]. Either way, it
1395 continues to the next table.
1396
1397 Ingress Table 23 DNS Responses
1398
1399 This table implements DNS responder for the DNS replies generated by
1400 the previous table.
1401
1402 • A priority-100 logical flow for each logical switch data‐
1403 path if it is configured with DNS records, which matches
1404 the IPv4 and IPv6 packets with udp.dst = 53 && reg0[4] ==
1405 1 and responds back to the inport after applying these
1406 actions. If reg0[4] is set to 1, it means that the action
1407 dns_lookup was successful.
1408
1409 eth.dst <-> eth.src;
1410 ip4.src <-> ip4.dst;
1411 udp.dst = udp.src;
1412 udp.src = 53;
1413 outport = P;
1414 flags.loopback = 1;
1415 output;
1416
1417
1418 (This terminates ingress packet processing; the packet
1419 does not go to the next ingress table.)
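
       In the action above, <-> swaps the two fields. The rewrite can be
       sketched as follows (hypothetical, not ovn-northd code):

```python
def dns_response_rewrite(pkt):
    """Return a copy of pkt (a field-name to value map) with the
    DNS-response header swaps and rewrites applied."""
    out = dict(pkt)
    out["eth.dst"], out["eth.src"] = pkt["eth.src"], pkt["eth.dst"]
    out["ip4.src"], out["ip4.dst"] = pkt["ip4.dst"], pkt["ip4.src"]
    out["udp.dst"] = pkt["udp.src"]  # reply to the client's source port
    out["udp.src"] = 53
    return out
```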
1420
1421 Ingress table 24 External ports
1422
       Traffic from the external logical ports enters the ingress
       datapath pipeline via the localnet port. This table adds the
       following logical flows to handle the traffic from these ports.
1426
1427 • A priority-100 flow is added for each external logical
1428 port which doesn’t reside on a chassis to drop the
1429 ARP/IPv6 NS request to the router IP(s) (of the logical
1430 switch) which matches on the inport of the external logi‐
1431 cal port and the valid eth.src address(es) of the exter‐
1432 nal logical port.
1433
                  This flow guarantees that the ARP/NS request to the
                  router IP address from the external ports is responded
                  to only by the chassis which has claimed these external
                  ports. All other chassis drop these packets.
1438
1439 A priority-100 flow is added for each external logical
1440 port which doesn’t reside on a chassis to drop any packet
1441 destined to the router mac - with the match inport == ex‐
1442 ternal && eth.src == E && eth.dst == R && !is_chas‐
1443 sis_resident("external") where E is the external port mac
1444 and R is the router port mac.
1445
           •      A priority-0 flow that matches all packets and
                  advances to the next table.
1448
1449 Ingress Table 25 Destination Lookup
1450
1451 This table implements switching behavior. It contains these logical
1452 flows:
1453
1454 • A priority-110 flow with the match eth.src == E for all
1455 logical switch datapaths and applies the action han‐
1456 dle_svc_check(inport). Where E is the service monitor mac
1457 defined in the options:svc_monitor_mac column of
1458 NB_Global table.
1459
1460 • A priority-100 flow that punts all IGMP/MLD packets to
1461 ovn-controller if multicast snooping is enabled on the
1462 logical switch.
1463
           •      Priority-90 flows that forward registered IP multicast
                  traffic to their corresponding multicast group, which
                  ovn-northd creates based on learnt IGMP_Group entries.
                  The flows also forward packets to the MC_MROUTER_FLOOD
                  multicast group, which ovn-northd populates with all
                  the logical ports that are connected to logical routers
                  with options:mcast_relay=’true’.
1471
1472 • A priority-85 flow that forwards all IP multicast traffic
1473 destined to 224.0.0.X to the MC_FLOOD_L2 multicast group,
1474 which ovn-northd populates with all non-router logical
1475 ports.
1476
1477 • A priority-85 flow that forwards all IP multicast traffic
1478 destined to reserved multicast IPv6 addresses (RFC 4291,
1479 2.7.1, e.g., Solicited-Node multicast) to the MC_FLOOD
1480 multicast group, which ovn-northd populates with all en‐
1481 abled logical ports.
1482
           •      A priority-80 flow that forwards all unregistered IP
                  multicast traffic to the MC_STATIC multicast group,
                  which ovn-northd populates with all the logical ports
                  that have options:mcast_flood=’true’. The flow also
                  forwards unregistered IP multicast traffic to the
                  MC_MROUTER_FLOOD multicast group, which ovn-northd
                  populates with all the logical ports connected to
                  logical routers that have options:mcast_relay=’true’.
1491
1492 • A priority-80 flow that drops all unregistered IP multi‐
1493 cast traffic if other_config :mcast_snoop=’true’ and
1494 other_config :mcast_flood_unregistered=’false’ and the
1495 switch is not connected to a logical router that has op‐
1496 tions :mcast_relay=’true’ and the switch doesn’t have any
1497 logical port with options :mcast_flood=’true’.
1498
1499 • Priority-80 flows for each IP address/VIP/NAT address
1500 owned by a router port connected to the switch. These
1501 flows match ARP requests and ND packets for the specific
1502 IP addresses. Matched packets are forwarded only to the
1503 router that owns the IP address and to the MC_FLOOD_L2
1504 multicast group which contains all non-router logical
1505 ports.
1506
1507 • Priority-75 flows for each port connected to a logical
1508 router matching self originated ARP request/RARP re‐
1509 quest/ND packets. These packets are flooded to the
1510 MC_FLOOD_L2 which contains all non-router logical ports.
1511
1512 • A priority-70 flow that outputs all packets with an Eth‐
1513 ernet broadcast or multicast eth.dst to the MC_FLOOD mul‐
1514 ticast group.
1515
1516 • One priority-50 flow that matches each known Ethernet ad‐
1517 dress against eth.dst. Action of this flow outputs the
1518 packet to the single associated output port if it is en‐
1519 abled. drop; action is applied if LSP is disabled.
1520
             For the Ethernet address on a logical switch port of type
             router, when that logical switch port’s addresses column
             is set to router and the connected logical router port
             has a gateway chassis:

                • The flow for the connected logical router port’s
                  Ethernet address is only programmed on the gateway
                  chassis.

                • If the logical router has rules specified in nat
                  with external_mac, then those addresses are also
                  used to populate the switch’s destination lookup on
                  the chassis where logical_port is resident.

             For the Ethernet address on a logical switch port of type
             router, when that logical switch port’s addresses column
             is set to router and the connected logical router port
             specifies a reside-on-redirect-chassis and the logical
             router to which the connected logical router port belongs
             has a distributed gateway LRP:

                • The flow for the connected logical router port’s
                  Ethernet address is only programmed on the gateway
                  chassis.

           • For each forwarding group configured on the logical
             switch datapath, a priority-50 flow that matches on
             eth.dst == VIP with an action of
             fwd_group(childports=args), where args contains comma
             separated logical switch child ports to load balance to.
             If liveness is enabled, then the action also includes
             liveness=true.

           • One priority-0 fallback flow that matches all packets
             with the action outport = get_fdb(eth.dst); next;. The
             action get_fdb gets the port for the eth.dst in the MAC
             learning table of the logical switch datapath. If there
             is no entry for eth.dst in the MAC learning table, then
             it stores none in the outport.

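       The lookup order above can be sketched as a small model. The
       helper below is illustrative only; the names destination_lookup,
       known_macs and fdb are invented for this sketch and are not part
       of OVN:

```python
def destination_lookup(eth_dst, known_macs, fdb):
    """Model of the Destination Lookup table for unicast eth.dst.

    known_macs: MAC -> logical port (the priority-50 flows).
    fdb:        MAC -> logical port learned at runtime (get_fdb).
    """
    # Priority-50: statically known Ethernet addresses win first.
    if eth_dst in known_macs:
        return known_macs[eth_dst]
    # Priority-0 fallback: outport = get_fdb(eth.dst); a missing FDB
    # entry yields "none", handled by the Destination Unknown table.
    return fdb.get(eth_dst, "none")
```

       The point of the sketch is the precedence: a statically known
       address always beats a runtime-learned FDB entry, and only a
       miss in both falls through to the unknown-destination handling.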
   Ingress Table 26: Destination Unknown

       This table handles the packets whose destination was not found
       or was looked up in the MAC learning table of the logical
       switch datapath. It contains the following flows.

           • A priority-50 flow with the match outport == P is added
             for each disabled Logical Switch Port P. This flow has
             action drop;.

           • If the logical switch has logical ports with ’unknown’
             addresses set, then the below logical flow is added:

                • A priority-50 flow with the match outport == "none"
                  that outputs the packets to the MC_UNKNOWN multicast
                  group, which ovn-northd populates with all enabled
                  logical ports that accept unknown destination
                  packets. As a small optimization, if no logical
                  ports accept unknown destination packets, ovn-northd
                  omits this multicast group and logical flow.

             If the logical switch has no logical ports with ’unknown’
             address set, then the below logical flow is added:

                • A priority-50 flow with the match outport == "none"
                  that drops the packets.

           • One priority-0 fallback flow that outputs the packet to
             the egress stage with the outport learnt from the get_fdb
             action.

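       The choice ovn-northd makes between the flood flow and the drop
       flow can be sketched as below. This is an illustrative model,
       not OVN code; unknown_dest_flow and the (enabled,
       accepts_unknown) tuples are invented for the sketch:

```python
def unknown_dest_flow(ports):
    """Pick the outport == "none" flow for a switch.

    ports: list of (enabled, accepts_unknown) tuples, one per
    logical switch port.
    """
    # Flood to MC_UNKNOWN only if some enabled port accepts unknown
    # destination packets; otherwise the multicast group is omitted
    # and the flow simply drops.
    if any(enabled and accepts_unknown for enabled, accepts_unknown in ports):
        return 'outport = "MC_UNKNOWN"; output;'
    return "drop;"
```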
   Egress Table 0: Pre-LB

       This table is similar to ingress table Pre-LB. It contains a
       priority-0 flow that simply moves traffic to the next table.
       Moreover, it contains two priority-110 flows to move multicast,
       IPv6 Neighbor Discovery and MLD traffic to the next table. If
       any load balancing rules exist for the datapath, a priority-100
       flow is added with a match of ip and action of reg0[2] = 1;
       next; to act as a hint for table Pre-stateful to send IP
       packets to the connection tracker for packet de-fragmentation
       and to possibly DNAT the destination VIP to one of the selected
       backends for already committed load balanced traffic.

       This table also has a priority-110 flow with the match eth.src
       == E for all logical switch datapaths to move traffic to the
       next table, where E is the service monitor mac defined in the
       options:svc_monitor_mac column of the NB_Global table.

   Egress Table 1: to-lport Pre-ACLs

       This is similar to ingress table Pre-ACLs except for to-lport
       traffic.

       This table also has a priority-110 flow with the match eth.src
       == E for all logical switch datapaths to move traffic to the
       next table, where E is the service monitor mac defined in the
       options:svc_monitor_mac column of the NB_Global table.

       This table also has a priority-110 flow with the match outport
       == I for all logical switch datapaths to move traffic to the
       next table, where I is the peer of a logical router port. This
       flow is added to skip the connection tracking of packets which
       will be entering the logical router datapath from the logical
       switch datapath for routing.

   Egress Table 2: Pre-stateful

       This is similar to ingress table Pre-stateful. This table adds
       the below three logical flows.

           • A priority-120 flow that sends the packets to the
             connection tracker using ct_lb_mark; as the action so
             that the already established traffic gets unDNATted from
             the backend IP to the load balancer VIP based on a hint
             provided by the previous tables with a match for reg0[2]
             == 1. If the packet was not DNATted earlier, then
             ct_lb_mark functions like ct_next.

           • A priority-100 flow that sends the packets to the
             connection tracker based on a hint provided by the
             previous tables (with a match for reg0[0] == 1) by using
             the ct_next; action.

           • A priority-0 flow that matches all packets to advance to
             the next table.

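       The way the hint bits select the connection-tracker action can
       be sketched as follows. This is an illustrative model; the name
       pre_stateful_action is invented for the sketch:

```python
def pre_stateful_action(reg0_2, reg0_0):
    """Model of the egress Pre-stateful table.

    reg0_2: the reg0[2] load-balancer hint set by Pre-LB.
    reg0_0: the reg0[0] conntrack hint set by earlier tables.
    """
    if reg0_2:          # priority-120: load-balanced traffic
        return "ct_lb_mark;"
    if reg0_0:          # priority-100: plain conntrack hint
        return "ct_next;"
    return "next;"      # priority-0 fallback
```

       Note that when both hints are set, the priority-120 flow wins,
       so ct_lb_mark; is chosen.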
   Egress Table 3: from-lport ACL hints

       This is similar to ingress table ACL hints.

   Egress Table 4: to-lport ACLs

       This is similar to ingress table ACLs except for to-lport ACLs.

       In addition, the following flows are added.

           • A priority-34000 logical flow is added for each logical
             port which has DHCPv4 options defined, to allow the
             DHCPv4 reply packet, and which has DHCPv6 options
             defined, to allow the DHCPv6 reply packet, from Ingress
             Table 18: DHCP responses.

           • A priority-34000 logical flow is added for each logical
             switch datapath configured with DNS records, with the
             match udp.dst == 53, to allow the DNS reply packet from
             Ingress Table 20: DNS responses.

           • A priority-34000 logical flow is added for each logical
             switch datapath with the match eth.src == E to allow the
             service monitor request packet generated by
             ovn-controller with the action next, where E is the
             service monitor mac defined in the
             options:svc_monitor_mac column of the NB_Global table.

   Egress Table 5: to-lport QoS Marking

       This is similar to ingress table QoS marking except that it
       applies to to-lport QoS rules.

   Egress Table 6: to-lport QoS Meter

       This is similar to ingress table QoS meter except that it
       applies to to-lport QoS rules.

   Egress Table 7: Stateful

       This is similar to ingress table Stateful except that there are
       no rules added for load balancing new connections.

   Egress Table 8: Egress Port Security - check

       This is similar to the port security logic in table Ingress
       Port Security check except that the action check_out_port_sec
       is used to check the port security rules. This table adds the
       below logical flows.

           • A priority-100 flow which matches on the multicast
             traffic and applies the action REGBIT_PORT_SEC_DROP = 0;
             next; to skip the out port security checks.

           • A priority-0 logical flow is added which matches on all
             the packets and applies the action REGBIT_PORT_SEC_DROP =
             check_out_port_sec(); next;. The action
             check_out_port_sec applies the port security rules based
             on the addresses defined in the port_security column of
             the Logical_Switch_Port table before delivering the
             packet to the outport.

   Egress Table 9: Egress Port Security - Apply

       This is similar to the ingress port security logic in ingress
       table Ingress Port Security - Apply. This table drops the
       packets if the port security check failed in the previous
       stage, i.e. the register bit REGBIT_PORT_SEC_DROP is set to 1.

       The following flows are added.

           • For each localnet port configured with egress qos in the
             options:qdisc_queue_id column of Logical_Switch_Port, a
             priority-100 flow is added which matches on the localnet
             outport and applies the action set_queue(id); output;.

             Please remember to mark the corresponding physical
             interface with ovn-egress-iface set to true in
             external_ids.

           • A priority-50 flow that drops the packet if the register
             bit REGBIT_PORT_SEC_DROP is set to 1.

           • A priority-0 flow that outputs the packet to the outport.

   Logical Router Datapaths
       Logical router datapaths will only exist for Logical_Router
       rows in the OVN_Northbound database that do not have enabled
       set to false.

   Ingress Table 0: L2 Admission Control

       This table drops packets that the router shouldn’t see at all
       based on their Ethernet headers. It contains the following
       flows:

           • Priority-100 flows to drop packets with VLAN tags or
             multicast Ethernet source addresses.

           • For each enabled router port P with Ethernet address E, a
             priority-50 flow that matches inport == P && (eth.mcast
             || eth.dst == E), stores the router port Ethernet address
             and advances to the next table, with action
             xreg0[0..47]=E; next;.

             For the gateway port on a distributed logical router
             (where one of the logical router ports specifies a
             gateway chassis), the above flow matching eth.dst == E is
             only programmed on the gateway port instance on the
             gateway chassis. If the LRP’s logical switch has an
             attached LSP of vtep type, the is_chassis_resident() part
             is not added to the lflow, to allow traffic originating
             from the logical switch to reach LR services (LBs, NAT).

             For a distributed logical router, or for a gateway router
             where the port is configured with options:gateway_mtu,
             the action of the above flow is modified by adding
             check_pkt_larger in order to mark the packet by setting
             REGBIT_PKT_LARGER if the size is greater than the MTU. If
             the port is also configured with
             options:gateway_mtu_bypass, then another flow is added,
             with priority-55, to bypass the check_pkt_larger flow.
             This is useful for traffic that normally doesn’t need to
             be fragmented and for which check_pkt_larger, which might
             not be offloadable, is not really needed. One such
             example is TCP traffic.

           • For each dnat_and_snat NAT rule on a distributed router
             that specifies an external Ethernet address E, a
             priority-50 flow that matches inport == GW && eth.dst ==
             E, where GW is the logical router distributed gateway
             port corresponding to the NAT rule (specified or
             inferred), with action xreg0[0..47]=E; next;.

             This flow is only programmed on the gateway port instance
             on the chassis where the logical_port specified in the
             NAT rule resides.

           • A priority-0 logical flow that matches all packets not
             already handled (match 1) and drops them (action drop;).

       Other packets are implicitly dropped.

   Ingress Table 1: Neighbor lookup

       For ARP and IPv6 Neighbor Discovery packets, this table looks
       into the MAC_Binding records to determine if OVN needs to learn
       the mac bindings. The following flows are added:

           • For each router port P that owns IP address A, which
             belongs to subnet S with prefix length L, if the option
             always_learn_from_arp_request is true for this router, a
             priority-100 flow is added which matches inport == P &&
             arp.spa == S/L && arp.op == 1 (ARP request) with the
             following actions:

             reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
             next;

             If the option always_learn_from_arp_request is false, the
             following two flows are added.

             A priority-110 flow is added which matches inport == P &&
             arp.spa == S/L && arp.tpa == A && arp.op == 1 (ARP
             request) with the following actions:

             reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
             reg9[3] = 1;
             next;

             A priority-100 flow is added which matches inport == P &&
             arp.spa == S/L && arp.op == 1 (ARP request) with the
             following actions:

             reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
             reg9[3] = lookup_arp_ip(inport, arp.spa);
             next;

             If the logical router port P is a distributed gateway
             router port, the additional match
             is_chassis_resident(cr-P) is added for all these flows.

           • A priority-100 flow which matches on ARP reply packets
             and applies the following actions if the option
             always_learn_from_arp_request is true:

             reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
             next;

             If the option always_learn_from_arp_request is false, the
             above actions will be:

             reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
             reg9[3] = 1;
             next;

           • A priority-100 flow which matches on IPv6 Neighbor
             Discovery advertisement packets and applies the following
             actions if the option always_learn_from_arp_request is
             true:

             reg9[2] = lookup_nd(inport, nd.target, nd.tll);
             next;

             If the option always_learn_from_arp_request is false, the
             above actions will be:

             reg9[2] = lookup_nd(inport, nd.target, nd.tll);
             reg9[3] = 1;
             next;

           • A priority-100 flow which matches on IPv6 Neighbor
             Discovery solicitation packets and applies the following
             actions if the option always_learn_from_arp_request is
             true:

             reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
             next;

             If the option always_learn_from_arp_request is false, the
             above actions will be:

             reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
             reg9[3] = lookup_nd_ip(inport, ip6.src);
             next;

           • A priority-0 fallback flow that matches all packets and
             applies the action reg9[2] = 1; next; advancing the
             packet to the next table.

   Ingress Table 2: Neighbor learning

       This table adds flows to learn the mac bindings from the ARP
       and IPv6 Neighbor Solicitation/Advertisement packets if it is
       needed according to the lookup results from the previous stage.

       reg9[2] will be 1 if the lookup_arp/lookup_nd in the previous
       table was successful or skipped, meaning there is no need to
       learn the mac binding from the packet.

       reg9[3] will be 1 if the lookup_arp_ip/lookup_nd_ip in the
       previous table was successful or skipped, meaning it is ok to
       learn the mac binding from the packet (if reg9[2] is 0).

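       The combined meaning of the two register bits can be sketched
       as a single predicate. This is an illustrative model; the name
       should_learn is invented for the sketch:

```python
def should_learn(reg9_2, reg9_3):
    """Model of the learn/skip decision in the Neighbor learning
    table.

    The priority-100 flow below (match reg9[2] == 1 || reg9[3] == 0)
    bypasses learning; only packets that miss it fall through to the
    put_arp/put_nd flows. So a binding is learned exactly when the
    lookup failed (reg9[2] == 0) and learning is allowed
    (reg9[3] == 1).
    """
    return not (reg9_2 == 1 or reg9_3 == 0)
```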
           • A priority-100 flow with the match reg9[2] == 1 ||
             reg9[3] == 0 that advances the packet to the next table,
             as there is no need to learn the neighbor.

           • A priority-95 flow with the match nd_ns && (ip6.src == 0
             || nd.sll == 0) that applies the action next;.

           • A priority-90 flow with the match arp that applies the
             action put_arp(inport, arp.spa, arp.sha); next;.

           • A priority-95 flow with the match nd_na && nd.tll == 0
             that applies the action put_nd(inport, nd.target,
             eth.src); next;.

           • A priority-90 flow with the match nd_na that applies the
             action put_nd(inport, nd.target, nd.tll); next;.

           • A priority-90 flow with the match nd_ns that applies the
             action put_nd(inport, ip6.src, nd.sll); next;.

           • A priority-0 logical flow that matches all packets not
             already handled (match 1) and drops them (action drop;).

   Ingress Table 3: IP Input

       This table is the core of the logical router datapath
       functionality. It contains the following flows to implement
       very basic IP host functionality.

           • For each dnat_and_snat NAT rule on a distributed logical
             router or gateway router with a gateway port configured
             with options:gateway_mtu set to a valid integer value M,
             a priority-160 flow with the match inport == LRP &&
             REGBIT_PKT_LARGER && REGBIT_EGRESS_LOOPBACK == 0, where
             LRP is the logical router port, that applies the
             following action for IPv4 and IPv6 respectively:

             icmp4_error {
                 icmp4.type = 3; /* Destination Unreachable. */
                 icmp4.code = 4; /* Frag Needed and DF was Set. */
                 icmp4.frag_mtu = M;
                 eth.dst = eth.src;
                 eth.src = E;
                 ip4.dst = ip4.src;
                 ip4.src = I;
                 ip.ttl = 255;
                 REGBIT_EGRESS_LOOPBACK = 1;
                 REGBIT_PKT_LARGER = 0;
                 outport = LRP;
                 flags.loopback = 1;
                 output;
             };
             icmp6_error {
                 icmp6.type = 2;
                 icmp6.code = 0;
                 icmp6.frag_mtu = M;
                 eth.dst = eth.src;
                 eth.src = E;
                 ip6.dst = ip6.src;
                 ip6.src = I;
                 ip.ttl = 255;
                 REGBIT_EGRESS_LOOPBACK = 1;
                 REGBIT_PKT_LARGER = 0;
                 outport = LRP;
                 flags.loopback = 1;
                 output;
             };

             where E and I are the NAT rule external mac and IP
             respectively.

           • For distributed logical routers or gateway routers with a
             gateway port configured with options:gateway_mtu set to a
             valid integer value M, a priority-150 flow with the match
             inport == LRP && REGBIT_PKT_LARGER &&
             REGBIT_EGRESS_LOOPBACK == 0, where LRP is the logical
             router port, that applies the following action for IPv4
             and IPv6 respectively:

             icmp4_error {
                 icmp4.type = 3; /* Destination Unreachable. */
                 icmp4.code = 4; /* Frag Needed and DF was Set. */
                 icmp4.frag_mtu = M;
                 eth.dst = E;
                 ip4.dst = ip4.src;
                 ip4.src = I;
                 ip.ttl = 255;
                 REGBIT_EGRESS_LOOPBACK = 1;
                 REGBIT_PKT_LARGER = 0;
                 next(pipeline=ingress, table=0);
             };
             icmp6_error {
                 icmp6.type = 2;
                 icmp6.code = 0;
                 icmp6.frag_mtu = M;
                 eth.dst = E;
                 ip6.dst = ip6.src;
                 ip6.src = I;
                 ip.ttl = 255;
                 REGBIT_EGRESS_LOOPBACK = 1;
                 REGBIT_PKT_LARGER = 0;
                 next(pipeline=ingress, table=0);
             };

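       The condition under which these flows generate an ICMP error
       can be sketched as below. This is an illustrative model; the
       name mtu_error and the tuple return format are invented for the
       sketch:

```python
def mtu_error(pkt_larger, egress_loopback, mtu, ipv6):
    """Return the ICMP error these flows would generate, or None.

    pkt_larger:      the REGBIT_PKT_LARGER bit set by
                     check_pkt_larger in L2 Admission Control.
    egress_loopback: the REGBIT_EGRESS_LOOPBACK bit, set for
                     recirculated packets that must not trigger a
                     second error.
    """
    if not pkt_larger or egress_loopback:
        return None
    if ipv6:
        # icmp6_error: type 2 ("Packet Too Big"), code 0.
        return ("icmp6_error", 2, 0, mtu)
    # icmp4_error: type 3 ("Destination Unreachable"),
    # code 4 ("Frag Needed and DF was Set").
    return ("icmp4_error", 3, 4, mtu)
```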
           • For each NAT entry of a distributed logical router (with
             distributed gateway router port(s)) of type snat, a
             priority-120 flow with the match inport == P && ip4.src
             == A advances the packet to the next pipeline, where P is
             the distributed logical router port corresponding to the
             NAT entry (specified or inferred) and A is the
             external_ip set in the NAT entry. If A is an IPv6
             address, then ip6.src is used for the match.

             The above flow is required to handle the routing of
             east/west NAT traffic.

           • For each BFD port the two following priority-110 flows
             are added to manage BFD traffic:

                • if ip4.src or ip6.src is any IP address owned by the
                  router port and udp.dst == 3784, the packet is
                  advanced to the next pipeline stage.

                • if ip4.dst or ip6.dst is any IP address owned by the
                  router port and udp.dst == 3784, the handle_bfd_msg
                  action is executed.

           • L3 admission control: Priority-120 flows allow IGMP and
             MLD packets if the router has logical ports that have
             options:mcast_flood=’true’.

           • L3 admission control: A priority-100 flow drops packets
             that match any of the following:

                • ip4.src[28..31] == 0xe (multicast source)

                • ip4.src == 255.255.255.255 (broadcast source)

                • ip4.src == 127.0.0.0/8 || ip4.dst == 127.0.0.0/8
                  (localhost source or destination)

                • ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8 (zero
                  network source or destination)

                • ip4.src or ip6.src is any IP address owned by the
                  router, unless the packet was recirculated due to
                  egress loopback as indicated by
                  REGBIT_EGRESS_LOOPBACK.

                • ip4.src is the broadcast address of any IP network
                  known to the router.

           • A priority-100 flow parses DHCPv6 replies from IPv6
             prefix delegation routers (udp.src == 547 && udp.dst ==
             546). The handle_dhcpv6_reply action is used to send IPv6
             prefix delegation messages to the delegation router.

           • ICMP echo reply. These flows reply to ICMP echo requests
             received for the router’s IP address. Let A be an IP
             address owned by a router port. Then, for each A that is
             an IPv4 address, a priority-90 flow matches on ip4.dst ==
             A and icmp4.type == 8 && icmp4.code == 0 (ICMP echo
             request). For each A that is an IPv6 address, a
             priority-90 flow matches on ip6.dst == A and icmp6.type
             == 128 && icmp6.code == 0 (ICMPv6 echo request). The port
             of the router that receives the echo request does not
             matter. Also, the ip.ttl of the echo request packet is
             not checked, so it complies with RFC 1812, section
             4.2.2.9. Flows for ICMPv4 echo requests use the following
             actions:

             ip4.dst <-> ip4.src;
             ip.ttl = 255;
             icmp4.type = 0;
             flags.loopback = 1;
             next;

             Flows for ICMPv6 echo requests use the following actions:

             ip6.dst <-> ip6.src;
             ip.ttl = 255;
             icmp6.type = 129;
             flags.loopback = 1;
             next;

           • Reply to ARP requests.

             These flows reply to ARP requests for the router’s own IP
             address. The ARP requests are handled only if the
             requestor’s IP belongs to the same subnet as the logical
             router port. For each router port P that owns IP address
             A, which belongs to subnet S with prefix length L, and
             Ethernet address E, a priority-90 flow matches inport ==
             P && arp.spa == S/L && arp.op == 1 && arp.tpa == A (ARP
             request) with the following actions:

             eth.dst = eth.src;
             eth.src = xreg0[0..47];
             arp.op = 2; /* ARP reply. */
             arp.tha = arp.sha;
             arp.sha = xreg0[0..47];
             arp.tpa = arp.spa;
             arp.spa = A;
             outport = inport;
             flags.loopback = 1;
             output;

             For the gateway port on a distributed logical router
             (where one of the logical router ports specifies a
             gateway chassis), the above flows are only programmed on
             the gateway port instance on the gateway chassis. This
             behavior avoids generation of multiple ARP responses from
             different chassis, and allows upstream MAC learning to
             point to the gateway chassis.

             For the logical router port with the option
             reside-on-redirect-chassis set (which is centralized),
             the above flows are only programmed on the gateway port
             instance on the gateway chassis (if the logical router
             has a distributed gateway port). This behavior avoids
             generation of multiple ARP responses from different
             chassis, and allows upstream MAC learning to point to the
             gateway chassis.

           • Reply to IPv6 Neighbor Solicitations. These flows reply
             to Neighbor Solicitation requests for the router’s own
             IPv6 address and populate the logical router’s mac
             binding table.

             For each router port P that owns IPv6 address A,
             solicited node address S, and Ethernet address E, a
             priority-90 flow matches inport == P && nd_ns && ip6.dst
             == {A, E} && nd.target == A with the following actions:

             nd_na_router {
                 eth.src = xreg0[0..47];
                 ip6.src = A;
                 nd.target = A;
                 nd.tll = xreg0[0..47];
                 outport = inport;
                 flags.loopback = 1;
                 output;
             };

             For the gateway port on a distributed logical router
             (where one of the logical router ports specifies a
             gateway chassis), the above flows replying to IPv6
             Neighbor Solicitations are only programmed on the gateway
             port instance on the gateway chassis. This behavior
             avoids generation of multiple replies from different
             chassis, and allows upstream MAC learning to point to the
             gateway chassis.

           • These flows reply to ARP requests or IPv6 neighbor
             solicitations for the virtual IP addresses configured in
             the router for NAT (both DNAT and SNAT) or load
             balancing.

             IPv4: For a configured NAT (both DNAT and SNAT) IP
             address or a load balancer IPv4 VIP A, for each router
             port P with Ethernet address E, a priority-90 flow
             matches arp.op == 1 && arp.tpa == A (ARP request) with
             the following actions:

             eth.dst = eth.src;
             eth.src = xreg0[0..47];
             arp.op = 2; /* ARP reply. */
             arp.tha = arp.sha;
             arp.sha = xreg0[0..47];
             arp.tpa <-> arp.spa;
             outport = inport;
             flags.loopback = 1;
             output;

             IPv4: For a configured load balancer IPv4 VIP, a similar
             flow is added with the additional match inport == P if
             the VIP is reachable from any logical router port of the
             logical router.

             If the router port P is a distributed gateway router
             port, then is_chassis_resident(P) is also added to the
             match condition for the load balancer IPv4 VIP A.

             IPv6: For a configured NAT (both DNAT and SNAT) IP
             address or a load balancer IPv6 VIP A (if the VIP is
             reachable from any logical router port of the logical
             router), solicited node address S, for each router port P
             with Ethernet address E, a priority-90 flow matches
             inport == P && nd_ns && ip6.dst == {A, S} && nd.target ==
             A with the following actions:

             eth.dst = eth.src;
             nd_na {
                 eth.src = xreg0[0..47];
                 nd.tll = xreg0[0..47];
                 ip6.src = A;
                 nd.target = A;
                 outport = inport;
                 flags.loopback = 1;
                 output;
             }

             If the router port P is a distributed gateway router
             port, then is_chassis_resident(P) is also added to the
             match condition for the load balancer IPv6 VIP A.

             For the gateway port on a distributed logical router with
             NAT (where one of the logical router ports specifies a
             gateway chassis):

                • If the corresponding NAT rule cannot be handled in a
                  distributed manner, then a priority-92 flow is
                  programmed on the gateway port instance on the
                  gateway chassis. A priority-91 drop flow is
                  programmed on the other chassis when ARP requests/NS
                  packets are received on the gateway port. This
                  behavior avoids generation of multiple ARP responses
                  from different chassis, and allows upstream MAC
                  learning to point to the gateway chassis.

                • If the corresponding NAT rule can be handled in a
                  distributed manner, then this flow is only
                  programmed on the gateway port instance where the
                  logical_port specified in the NAT rule resides.

                  Some of the actions are different for this case,
                  using the external_mac specified in the NAT rule
                  rather than the gateway port’s Ethernet address E:

                  eth.src = external_mac;
                  arp.sha = external_mac;

                  or in the case of IPv6 neighbor solicitation:

                  eth.src = external_mac;
                  nd.tll = external_mac;

                  This behavior avoids generation of multiple ARP
                  responses from different chassis, and allows
                  upstream MAC learning to point to the correct
                  chassis.

           • Priority-85 flows which drop the ARP and IPv6 Neighbor
             Discovery packets.

           • A priority-84 flow explicitly allows IPv6 multicast
             traffic that is supposed to reach the router pipeline
             (i.e., router solicitation and router advertisement
             packets).

           • A priority-83 flow explicitly drops IPv6 multicast
             traffic that is destined to reserved multicast groups.

           • A priority-82 flow allows IP multicast traffic if
             options:mcast_relay=’true’, otherwise drops it.

           • UDP port unreachable. Priority-80 flows generate ICMP
             port unreachable messages in reply to UDP datagrams
             directed to the router’s IP address, except in the
             special case of gateways, which accept traffic directed
             to a router IP for load balancing and NAT purposes.

             These flows should not match IP fragments with nonzero
             offset.

           • TCP reset. Priority-80 flows generate TCP reset messages
             in reply to TCP datagrams directed to the router’s IP
             address, except in the special case of gateways, which
             accept traffic directed to a router IP for load balancing
             and NAT purposes.

             These flows should not match IP fragments with nonzero
             offset.

           • Protocol or address unreachable. Priority-70 flows
             generate ICMP protocol or address unreachable messages
             for IPv4 and IPv6 respectively in reply to packets
             directed to the router’s IP address on IP protocols other
             than UDP, TCP, and ICMP, except in the special case of
             gateways, which accept traffic directed to a router IP
             for load balancing purposes.

             These flows should not match IP fragments with nonzero
             offset.

           • Drop other IP traffic to this router. These flows drop
             any other traffic destined to an IP address of this
             router that is not already handled by one of the flows
             above, which amounts to ICMP (other than echo requests)
             and fragments with nonzero offsets. For each IP address A
             owned by the router, a priority-60 flow matches ip4.dst
             == A or ip6.dst == A and drops the traffic. An exception
             is made, and the above flow is not added, if the router
             port’s own IP address is used to SNAT packets passing
             through that router or if it is used as a load balancer
             VIP.

2295 The flows above handle all of the traffic that might be directed to the
2296 router itself. The following flows (with lower priorities) handle the
2297 remaining traffic, potentially for forwarding:
2298
2299 • Drop Ethernet local broadcast. A priority-50 flow with
2300 match eth.bcast drops traffic destined to the local Eth‐
2301 ernet broadcast address. By definition this traffic
2302 should not be forwarded.
2303
2304 • ICMP time exceeded. For each router port P, whose IP ad‐
2305 dress is A, a priority-100 flow with match inport == P &&
2306 ip.ttl == {0, 1} && !ip.later_frag matches packets whose
2307 TTL has expired, with the following actions to send an
2308 ICMP time exceeded reply for IPv4 and IPv6 respectively:
2309
2310 icmp4 {
2311 icmp4.type = 11; /* Time exceeded. */
2312 icmp4.code = 0; /* TTL exceeded in transit. */
2313 ip4.dst = ip4.src;
2314 ip4.src = A;
2315 ip.ttl = 254;
2316 next;
2317 };
2318 icmp6 {
2319 icmp6.type = 3; /* Time exceeded. */
2320 icmp6.code = 0; /* TTL exceeded in transit. */
2321 ip6.dst = ip6.src;
2322 ip6.src = A;
2323 ip.ttl = 254;
2324 next;
2325 };
2326
2327
2328                • TTL discard. A priority-30 flow with match ip.ttl == {0,
2329                  1} and action drop; drops other packets whose TTL has
2330                  expired and that should not receive an ICMP error reply
2331                  (i.e. fragments with nonzero offset).
2332
2333                • Next table. A priority-0 flow matches all packets that
2334                  aren’t already handled and uses the action next; to feed
2335                  them to the next table.
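Taken together, the TTL flows above act as a small priority-ordered classifier. The following Python sketch is purely illustrative (ovn-northd is written in C and these names are invented) and models how the priority-100, priority-30, and priority-0 flows combine:

```python
# Toy model of the TTL-handling flows above: the highest-priority
# matching flow wins.  "later_frag" stands in for the ip.later_frag
# match; expired-TTL later fragments are dropped rather than answered.
def classify_ttl(ttl, later_frag):
    flows = [
        # (priority, predicate, action)
        (100, lambda: ttl in (0, 1) and not later_frag, "icmp_time_exceeded"),
        (30,  lambda: ttl in (0, 1),                    "drop"),
        (0,   lambda: True,                             "next"),
    ]
    for _prio, match, action in sorted(flows, key=lambda f: -f[0]):
        if match():
            return action

print(classify_ttl(1, False))   # expired TTL, not a fragment
print(classify_ttl(0, True))    # expired TTL on a later fragment
print(classify_ttl(64, False))  # normal packet
```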
2336
2337 Ingress Table 4: UNSNAT
2338
2339       This table handles reverse traffic for already established connec‐
2340       tions: SNAT was already done in the egress pipeline, and the packet
2341       has now entered the ingress pipeline as a reply. It is unSNATted here.
2342
2343 Ingress Table 4: UNSNAT on Gateway and Distributed Routers
2344
2345                • If the router (Gateway or Distributed) is configured with
2346                  load balancers, then the following lflows are added:
2347
2348                  For each IPv4 address A defined as a load balancer VIP
2349                  with protocol P (and protocol port T if defined) that is
2350                  also present as an external_ip in the NAT table, a prior‐
2351                  ity-120 logical flow is added with the match ip4 &&
2352                  ip4.dst == A && P and the action next; to advance the
2353                  packet to the next table. If the load balancer has proto‐
2354                  col port T defined, then the match also has P.dst == T.
2355
2356 The above flows are also added for IPv6 load balancers.
2357
2358 Ingress Table 4: UNSNAT on Gateway Routers
2359
2360 • If the Gateway router has been configured to force SNAT
2361 any previously DNATted packets to B, a priority-110 flow
2362 matches ip && ip4.dst == B or ip && ip6.dst == B with an
2363                  action ct_snat;.
2364
2365 If the Gateway router is configured with
2366 lb_force_snat_ip=router_ip then for every logical router
2367 port P attached to the Gateway router with the router ip
2368 B, a priority-110 flow is added with the match inport ==
2369 P && ip4.dst == B or inport == P && ip6.dst == B with an
2370                  action ct_snat;.
2371
2372 If the Gateway router has been configured to force SNAT
2373 any previously load-balanced packets to B, a priority-100
2374 flow matches ip && ip4.dst == B or ip && ip6.dst == B
2375                  with an action ct_snat;.
2376
2377 For each NAT configuration in the OVN Northbound data‐
2378                  base that asks to change the source IP address of a
2379 packet from A to B, a priority-90 flow matches ip &&
2380 ip4.dst == B or ip && ip6.dst == B with an action
2381 ct_snat; . If the NAT rule is of type dnat_and_snat and
2382 has stateless=true in the options, then the action would
2383 be next;.
2384
2385 A priority-0 logical flow with match 1 has actions next;.
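As a sketch of how the unSNAT flows above could be derived from NAT configuration, consider the following Python model. It is purely illustrative (ovn-northd is written in C); the entry fields "type", "external_ip", and "stateless" are simplified stand-ins for the actual OVN_Northbound schema.

```python
# Generate the priority-90 unSNAT flows described above from simplified
# NAT entries.  dnat_and_snat entries with stateless=true get "next;"
# instead of "ct_snat;", mirroring the text.
def unsnat_flows(nat_entries):
    flows = []
    for nat in nat_entries:
        # Pick the address family from the external IP's syntax.
        ver = "ip6" if ":" in nat["external_ip"] else "ip4"
        match = "ip && %s.dst == %s" % (ver, nat["external_ip"])
        stateless = (nat["type"] == "dnat_and_snat"
                     and nat.get("stateless", False))
        action = "next;" if stateless else "ct_snat;"
        flows.append((90, match, action))
    return flows

flows = unsnat_flows([
    {"type": "snat", "external_ip": "172.16.0.10"},
    {"type": "dnat_and_snat", "external_ip": "fd00::10", "stateless": True},
])
for f in flows:
    print(f)
```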
2386
2387 Ingress Table 4: UNSNAT on Distributed Routers
2388
2389                • For each configuration in the OVN Northbound database
2390 that asks to change the source IP address of a packet
2391 from A to B, two priority-100 flows are added.
2392
2393 If the NAT rule cannot be handled in a distributed man‐
2394 ner, then the below priority-100 flows are only pro‐
2395 grammed on the gateway chassis.
2396
2397 • The first flow matches ip && ip4.dst == B && in‐
2398 port == GW && flags.loopback == 0 or ip && ip6.dst
2399 == B && inport == GW && flags.loopback == 0 where
2400 GW is the distributed gateway port corresponding
2401 to the NAT rule (specified or inferred), with an
2402 action ct_snat_in_czone; to unSNAT in the common
2403 zone. If the NAT rule is of type dnat_and_snat and
2404 has stateless=true in the options, then the action
2405 would be next;.
2406
2407 If the NAT entry is of type snat, then there is an
2408 additional match is_chassis_resident(cr-GW)
2409 where cr-GW is the chassis resident port of GW.
2410
2411 • The second flow matches ip && ip4.dst == B && in‐
2412 port == GW && flags.loopback == 1 &&
2413 flags.use_snat_zone == 1 or ip && ip6.dst == B &&
2414                        inport == GW && flags.loopback == 1 &&
2415 flags.use_snat_zone == 1 where GW is the distrib‐
2416 uted gateway port corresponding to the NAT rule
2417 (specified or inferred), with an action ct_snat;
2418 to unSNAT in the snat zone. If the NAT rule is of
2419 type dnat_and_snat and has stateless=true in the
2420 options, then the action would be ip4/6.dst=(B).
2421
2422 If the NAT entry is of type snat, then there is an
2423 additional match is_chassis_resident(cr-GW)
2424 where cr-GW is the chassis resident port of GW.
2425
2426 A priority-0 logical flow with match 1 has actions next;.
2427
2428 Ingress Table 5: DEFRAG
2429
2430       This table sends packets to the connection tracker for tracking and
2431       defragmentation. It contains a priority-0 flow that simply moves
2432       traffic to the next table.
2433
2434 If load balancing rules with only virtual IP addresses are configured
2435 in OVN_Northbound database for a Gateway router, a priority-100 flow is
2436 added for each configured virtual IP address VIP. For IPv4 VIPs the
2437 flow matches ip && ip4.dst == VIP. For IPv6 VIPs, the flow matches ip
2438 && ip6.dst == VIP. The flow applies the action reg0 = VIP; ct_dnat; (or
2439 xxreg0 for IPv6) to send IP packets to the connection tracker for
2440 packet de-fragmentation and to dnat the destination IP for the commit‐
2441 ted connection before sending it to the next table.
2442
2443 If load balancing rules with virtual IP addresses and ports are config‐
2444 ured in OVN_Northbound database for a Gateway router, a priority-110
2445 flow is added for each configured virtual IP address VIP, protocol
2446 PROTO and port PORT. For IPv4 VIPs the flow matches ip && ip4.dst ==
2447 VIP && PROTO && PROTO.dst == PORT. For IPv6 VIPs, the flow matches ip
2448 && ip6.dst == VIP && PROTO && PROTO.dst == PORT. The flow applies the
2449 action reg0 = VIP; reg9[16..31] = PROTO.dst; ct_dnat; (or xxreg0 for
2450 IPv6) to send IP packets to the connection tracker for packet de-frag‐
2451 mentation and to dnat the destination IP for the committed connection
2452 before sending it to the next table.
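The two flow shapes just described (VIP plus L4 port at priority 110, VIP only at priority 100) can be sketched as follows. This is an illustrative model, not the ovn-northd implementation; the input dictionaries are invented for the example.

```python
# Illustrative model of the DEFRAG flows above: VIPs with an L4 port get
# a priority-110 flow that also saves the destination port in
# reg9[16..31]; port-less VIPs get a priority-100 flow.  IPv6 VIPs use
# xxreg0 instead of reg0.
def defrag_flows(vips):
    flows = []
    for vip in vips:
        ver = "ip6" if ":" in vip["ip"] else "ip4"
        reg = "xxreg0" if ver == "ip6" else "reg0"
        if "port" in vip:
            match = "ip && %s.dst == %s && %s && %s.dst == %d" % (
                ver, vip["ip"], vip["proto"], vip["proto"], vip["port"])
            action = "%s = %s; reg9[16..31] = %s.dst; ct_dnat;" % (
                reg, vip["ip"], vip["proto"])
            flows.append((110, match, action))
        else:
            flows.append((100, "ip && %s.dst == %s" % (ver, vip["ip"]),
                          "%s = %s; ct_dnat;" % (reg, vip["ip"])))
    return flows

flows = defrag_flows([{"ip": "10.0.0.10", "proto": "tcp", "port": 80},
                      {"ip": "10.0.0.20"}])
```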
2453
2454 If ECMP routes with symmetric reply are configured in the OVN_North‐
2455 bound database for a gateway router, a priority-100 flow is added for
2456 each router port on which symmetric replies are configured. The match‐
2457 ing logic for these ports essentially reverses the configured logic of
2458 the ECMP route. So for instance, a route with a destination routing
2459 policy will instead match if the source IP address matches the static
2460 route’s prefix. The flow uses the actions chk_ecmp_nh_mac(); ct_next or
2461 chk_ecmp_nh(); ct_next to send IP packets to table 76 or to table 77 in
2462   order to check whether source info is already stored by OVN and then to the
2463 connection tracker for packet de-fragmentation and tracking before
2464 sending it to the next table.
2465
2466       If load balancing rules are configured in the OVN_Northbound data‐
2467       base for a Gateway router, a priority-50 flow is added that matches
2468       icmp || icmp6 with an action of ct_dnat;. This allows potentially
2469       related ICMP traffic to pass through CT.
2470
2471 Ingress Table 6: Load balancing affinity check
2472
2473 Load balancing affinity check table contains the following logical
2474 flows:
2475
2476                • For all the configured load balancing rules for a logical
2477                  router where a positive affinity timeout is specified in
2478                  options column, that includes an L4 port PORT of protocol
2479                  P and IPv4 or IPv6 address VIP, a priority-100 flow that
2480                  matches on ct.new && ip && reg0 == VIP && P &&
2481                  reg9[16..31] == PORT (xxreg0 == VIP in the IPv6 case)
2482                  with an action of reg9[6] = chk_lb_aff();
2483                  next;.
2484
2485                • A priority-0 flow is added which matches on all packets
2486 and applies the action next;.
2487
2488 Ingress Table 7: DNAT
2489
2490       Packets enter the pipeline with a destination IP address that needs
2491       to be DNATted from a virtual IP address to a real IP address. Pack‐
2492       ets in the reverse direction need to be unDNATted.
2493
2494 Ingress Table 7: Load balancing DNAT rules
2495
2496       The following load balancing DNAT flows are added for a Gateway
2497       router or a router with a gateway port. These flows are programmed
2498       only on the gateway chassis. These flows do not get programmed for
2499       load balancers with IPv6 VIPs.
2500
2501 • For all the configured load balancing rules for a logical
2502 router where a positive affinity timeout is specified in
2503                  options column, that includes an L4 port PORT of protocol
2504 P and IPv4 or IPv6 address VIP, a priority-150 flow that
2505 matches on reg9[6] == 1 && ct.new && ip && reg0 == VIP &&
2506 P && reg9[16..31] == PORT (xxreg0 == VIP in the IPv6
2507 case) with an action of ct_lb_mark(args) , where args
2508                  contains comma-separated IP addresses (and optional port
2509 numbers) to load balance to. The address family of the IP
2510 addresses of args is the same as the address family of
2511 VIP.
2512
2513 • If controller_event has been enabled for all the config‐
2514 ured load balancing rules for a Gateway router or Router
2515 with gateway port in OVN_Northbound database that does
2516 not have configured backends, a priority-130 flow is
2517 added to trigger ovn-controller events whenever the chas‐
2518                  sis receives a packet for that particular VIP. If the
2519                  event-elb meter has been previously created, it will be
2520                  associated with the empty_lb logical flow.
2521
2522 • For all the configured load balancing rules for a Gateway
2523 router or Router with gateway port in OVN_Northbound
2524                  database that includes an L4 port PORT of protocol P and
2525                  IPv4 or IPv6 address VIP, a priority-120 flow that
2526                  matches on ct.new && !ct.rel && ip && reg0 == VIP && P &&
2527                  reg9[16..31] == PORT
2528                  (xxreg0 == VIP in the IPv6 case) with an action of
2529                  ct_lb_mark(args), where args contains comma-separated
2530                  IPv4 or IPv6 addresses (and optional port numbers) to
2531                  load balance to.
2532 If the router is configured to force SNAT any load-bal‐
2533 anced packets, the above action will be replaced by
2534 flags.force_snat_for_lb = 1; ct_lb_mark(args);. If the
2535 load balancing rule is configured with skip_snat set to
2536 true, the above action will be replaced by
2537 flags.skip_snat_for_lb = 1; ct_lb_mark(args);. If health
2538 check is enabled, then args will only contain those end‐
2539 points whose service monitor status entry in OVN_South‐
2540 bound db is either online or empty.
2541
2542 The previous table lr_in_defrag sets the register reg0
2543 (or xxreg0 for IPv6) and does ct_dnat. Hence for estab‐
2544 lished traffic, this table just advances the packet to
2545 the next stage.
2546
2547 • For all the configured load balancing rules for a router
2548                  in OVN_Northbound database that includes an L4 port PORT
2549 of protocol P and IPv4 or IPv6 address VIP, a prior‐
2550 ity-120 flow that matches on ct.est && !ct.rel && ip4 &&
2551 reg0 == VIP && P && reg9[16..31] ==
2552 PORT (ip6 and xxreg0 == VIP in the IPv6 case) with an
2553 action of next;. If the router is configured to force
2554 SNAT any load-balanced packets, the above action will be
2555 replaced by flags.force_snat_for_lb = 1; next;. If the
2556 load balancing rule is configured with skip_snat set to
2557 true, the above action will be replaced by
2558 flags.skip_snat_for_lb = 1; next;.
2559
2560 The previous table lr_in_defrag sets the register reg0
2561 (or xxreg0 for IPv6) and does ct_dnat. Hence for estab‐
2562 lished traffic, this table just advances the packet to
2563 the next stage.
2564
2565 • For all the configured load balancing rules for a router
2566 in OVN_Northbound database that includes just an IP ad‐
2567 dress VIP to match on, a priority-110 flow that matches
2568 on ct.new && !ct.rel && ip4 && reg0 == VIP (ip6 and
2569 xxreg0 == VIP in the IPv6 case) with an action of
2570                  ct_lb_mark(args), where args contains comma-separated
2571 IPv4 or IPv6 addresses. If the router is configured to
2572 force SNAT any load-balanced packets, the above action
2573 will be replaced by flags.force_snat_for_lb = 1;
2574 ct_lb_mark(args);. If the load balancing rule is config‐
2575 ured with skip_snat set to true, the above action will be
2576 replaced by flags.skip_snat_for_lb = 1;
2577 ct_lb_mark(args);.
2578
2579 The previous table lr_in_defrag sets the register reg0
2580 (or xxreg0 for IPv6) and does ct_dnat. Hence for estab‐
2581 lished traffic, this table just advances the packet to
2582 the next stage.
2583
2584 • For all the configured load balancing rules for a router
2585 in OVN_Northbound database that includes just an IP ad‐
2586 dress VIP to match on, a priority-110 flow that matches
2587 on ct.est && !ct.rel && ip4 && reg0 == VIP (or ip6 and
2588 xxreg0 == VIP) with an action of next;. If the router is
2589 configured to force SNAT any load-balanced packets, the
2590 above action will be replaced by flags.force_snat_for_lb
2591 = 1; next;. If the load balancing rule is configured with
2592 skip_snat set to true, the above action will be replaced
2593 by flags.skip_snat_for_lb = 1; next;.
2594
2595 The previous table lr_in_defrag sets the register reg0
2596 (or xxreg0 for IPv6) and does ct_dnat. Hence for estab‐
2597 lished traffic, this table just advances the packet to
2598 the next stage.
2599
2600                • If the load balancer is created with the --reject option
2601                  and it has no active backends, a TCP reset segment (for
2602                  tcp) or an ICMP port unreachable packet (for all other
2603                  kinds of traffic) will be sent whenever an incoming
2604                  packet is received for this load balancer. Note that us‐
2605                  ing the --reject option disables the empty_lb SB con‐
2606                  troller event for this load balancer.
2607
2608                • For related traffic, if the router has a load balancer
2609                  assigned to it, a priority-50 flow is added that matches
2610                  ct.rel && !ct.est && !ct.new with an action of ct_com‐
2611                  mit_nat;, along with two priority-70 flows that match the
2612                  skip_snat and force_snat flags.
2613
2614 Ingress Table 7: DNAT on Gateway Routers
2615
2616                • For each configuration in the OVN Northbound database
2617 that asks to change the destination IP address of a
2618 packet from A to B, a priority-100 flow matches ip &&
2619 ip4.dst == A or ip && ip6.dst == A with an action
2620 flags.loopback = 1; ct_dnat(B);. If the Gateway router is
2621 configured to force SNAT any DNATed packet, the above ac‐
2622 tion will be replaced by flags.force_snat_for_dnat = 1;
2623 flags.loopback = 1; ct_dnat(B);. If the NAT rule is of
2624 type dnat_and_snat and has stateless=true in the options,
2625                  then the action would be ip4/6.dst=(B).
2626
2627                  If the NAT rule has allowed_ext_ips configured, then
2628                  there is an additional match ip4.src == allowed_ext_ips.
2629                  Similarly, for IPv6, the match would be ip6.src == al‐
2630                  lowed_ext_ips.
2631
2632                  If the NAT rule has exempted_ext_ips set, then there is
2633                  an additional flow configured at priority 101. The flow
2634                  matches if the source IP is an exempted_ext_ip and the
2635                  action is next;. This flow is used to bypass the ct_dnat
2636                  action for a packet originating from exempted_ext_ips.
2637
2638 • A priority-0 logical flow with match 1 has actions next;.
2639
2640 Ingress Table 7: DNAT on Distributed Routers
2641
2642 On distributed routers, the DNAT table only handles packets with desti‐
2643 nation IP address that needs to be DNATted from a virtual IP address to
2644 a real IP address. The unDNAT processing in the reverse direction is
2645 handled in a separate table in the egress pipeline.
2646
2647                • For each configuration in the OVN Northbound database
2648 that asks to change the destination IP address of a
2649 packet from A to B, a priority-100 flow matches ip &&
2650 ip4.dst == B && inport == GW, where GW is the logical
2651 router gateway port corresponding to the NAT rule (speci‐
2652 fied or inferred), with an action ct_dnat(B);. The match
2653 will include ip6.dst == B in the IPv6 case. If the NAT
2654 rule is of type dnat_and_snat and has stateless=true in
2655 the options, then the action would be ip4/6.dst=(B).
2656
2657 If the NAT rule cannot be handled in a distributed man‐
2658 ner, then the priority-100 flow above is only programmed
2659 on the gateway chassis.
2660
2661                  If the NAT rule has allowed_ext_ips configured, then
2662                  there is an additional match ip4.src == allowed_ext_ips.
2663                  Similarly, for IPv6, the match would be ip6.src == al‐
2664                  lowed_ext_ips.
2665
2666                  If the NAT rule has exempted_ext_ips set, then there is
2667                  an additional flow configured at priority 101. The flow
2668                  matches if the source IP is an exempted_ext_ip and the
2669                  action is next;. This flow is used to bypass the ct_dnat
2670                  action for a packet originating from exempted_ext_ips.
2671
2672 A priority-0 logical flow with match 1 has actions next;.
2673
2674 Ingress Table 8: Load balancing affinity learn
2675
2676 Load balancing affinity learn table contains the following logical
2677 flows:
2678
2679                • For all the configured load balancing rules for a logical
2680                  router where a positive affinity timeout T is specified
2681                  in options column, that includes an L4 port PORT of pro‐
2682                  tocol P and IPv4 or IPv6 address VIP, a priority-100
2683                  flow that matches on reg9[6] == 0 && ct.new && ip &&
2684                  reg0 == VIP && P && reg9[16..31] == PORT (xxreg0 == VIP
2685                  in the IPv6 case) with an action of
2686                  commit_lb_aff(vip = VIP:PORT,
2687                  backend = backend ip:backend port, proto = P,
2688                  timeout = T);.
2689
2690                • A priority-0 flow is added which matches on all packets
2691 and applies the action next;.
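The chk_lb_aff()/commit_lb_aff() pair used by the affinity check and affinity learn tables behaves like a keyed cache with expiry. The following toy Python model is an assumption-laden sketch of that behavior (the real mechanism is implemented inside OVS/OVN, not in Python):

```python
# Toy model: the learn stage records client->backend mappings with a
# timeout; the check stage consults them so later connections from the
# same client stick to the same backend while the entry is fresh.
import time

class AffinityTable:
    def __init__(self):
        self._entries = {}          # (client, vip) -> (backend, expiry)

    def commit_lb_aff(self, client, vip, backend, timeout):
        self._entries[(client, vip)] = (backend, time.monotonic() + timeout)

    def chk_lb_aff(self, client, vip):
        entry = self._entries.get((client, vip))
        if entry and entry[1] > time.monotonic():
            return entry[0]         # like reg9[6] = 1: reuse the backend
        return None                 # like reg9[6] = 0: fall through to DNAT

aff = AffinityTable()
aff.commit_lb_aff("192.0.2.1", "10.0.0.10:80", "172.16.0.2:8080", timeout=60)
```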
2692
2693 Ingress Table 9: ECMP symmetric reply processing
2694
2695 • If ECMP routes with symmetric reply are configured in the
2696 OVN_Northbound database for a gateway router, a prior‐
2697 ity-100 flow is added for each router port on which sym‐
2698 metric replies are configured. The matching logic for
2699 these ports essentially reverses the configured logic of
2700 the ECMP route. So for instance, a route with a destina‐
2701 tion routing policy will instead match if the source IP
2702 address matches the static route’s prefix. The flow uses
2703 the action ct_commit { ct_label.ecmp_reply_eth =
2704                  eth.src; ct_mark.ecmp_reply_port = K;}; com‐
2705 mit_ecmp_nh(); next;
2706                  to commit the connection, storing eth.src and the ECMP
2707                  reply port binding tunnel key K in ct_label and ct_mark,
2708                  and the traffic pattern to table 76 or 77.
2709
2710 Ingress Table 10: IPv6 ND RA option processing
2711
2712 • A priority-50 logical flow is added for each logical
2713 router port configured with IPv6 ND RA options which
2714 matches IPv6 ND Router Solicitation packet and applies
2715 the action put_nd_ra_opts and advances the packet to the
2716 next table.
2717
2718                      reg0[5] = put_nd_ra_opts(options); next;
2719
2720
2721 For a valid IPv6 ND RS packet, this transforms the packet
2722 into an IPv6 ND RA reply and sets the RA options to the
2723 packet and stores 1 into reg0[5]. For other kinds of
2724 packets, it just stores 0 into reg0[5]. Either way, it
2725 continues to the next table.
2726
2727 • A priority-0 logical flow with match 1 has actions next;.
2728
2729 Ingress Table 11: IPv6 ND RA responder
2730
2731 This table implements IPv6 ND RA responder for the IPv6 ND RA replies
2732 generated by the previous table.
2733
2734 • A priority-50 logical flow is added for each logical
2735 router port configured with IPv6 ND RA options which
2736 matches IPv6 ND RA packets and reg0[5] == 1 and responds
2737 back to the inport after applying these actions. If
2738 reg0[5] is set to 1, it means that the action
2739 put_nd_ra_opts was successful.
2740
2741 eth.dst = eth.src;
2742 eth.src = E;
2743 ip6.dst = ip6.src;
2744 ip6.src = I;
2745 outport = P;
2746 flags.loopback = 1;
2747 output;
2748
2749
2750 where E is the MAC address and I is the IPv6 link local
2751 address of the logical router port.
2752
2753 (This terminates packet processing in ingress pipeline;
2754 the packet does not go to the next ingress table.)
2755
2756 • A priority-0 logical flow with match 1 has actions next;.
2757
2758 Ingress Table 12: IP Routing Pre
2759
2760       If a packet arrives at this table from a logical router port P that
2761       has an options:route_table value set, a priority-100 logical flow
2762       with match inport == "P" sets a uniquely generated, non-zero, per-
2763       datapath 32-bit value in OVS register 7. This register’s value is
2764       checked in the next table. If the packet didn’t match any configured
2765       inport (<main> route table), the register 7 value is set to 0.
2766
2767 This table contains the following logical flows:
2768
2769                • A priority-100 flow with match inport == "LRP_NAME" and
2770                  an action that sets the route table identifier in reg7.
2771
2772 A priority-0 logical flow with match 1 has actions reg7 =
2773 0; next;.
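The route-table id scheme above can be sketched as follows. The specific ids assigned here are arbitrary; the text only requires a unique non-zero 32-bit value per datapath, with 0 reserved for the <main> table, so this is an illustration rather than ovn-northd's actual allocation logic:

```python
# Assign each named route table a unique non-zero 32-bit id; the <main>
# table (no route_table option) is always id 0.
def assign_route_table_ids(route_tables):
    ids = {"<main>": 0}
    next_id = 1
    for name in route_tables:
        if name not in ids:
            ids[name] = next_id & 0xFFFFFFFF  # keep within 32 bits
            next_id += 1
    return ids

ids = assign_route_table_ids(["rtb-1", "rtb-2", "rtb-1"])
```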
2774
2775 Ingress Table 13: IP Routing
2776
2777 A packet that arrives at this table is an IP packet that should be
2778 routed to the address in ip4.dst or ip6.dst. This table implements IP
2779 routing, setting reg0 (or xxreg0 for IPv6) to the next-hop IP address
2780 (leaving ip4.dst or ip6.dst, the packet’s final destination, unchanged)
2781 and advances to the next table for ARP resolution. It also sets reg1
2782 (or xxreg1) to the IP address owned by the selected router port
2783 (ingress table ARP Request will generate an ARP request, if needed,
2784 with reg0 as the target protocol address and reg1 as the source proto‐
2785 col address).
2786
2787       For ECMP routes, i.e. multiple static routes with the same policy and
2788       prefix but different nexthops, the above actions are deferred to the
2789       next table. That table instead determines the ECMP group id and se‐
2790       lects a member id within the group based on 5-tuple hashing. It
2791       stores the group id in reg8[0..15] and the member id in reg8[16..31].
2792       This step is skipped with a priority-10300 rule if the traffic going
2793       out the ECMP route is reply traffic and the ECMP route was configured
2794       to use symmetric replies. Instead, the stored values in conntrack are
2795       used to choose the destination. The ct_label.ecmp_reply_eth tells
2796       the destination MAC address to which the packet should be sent. The
2797       ct_mark.ecmp_reply_port tells the logical router port on which the
2798       packet should be sent. These values are saved to the conntrack fields
2799       when the initial ingress traffic is received over the ECMP route and
2800       committed to conntrack. If REGBIT_KNOWN_ECMP_NH is set, the prior‐
2801       ity-10300 flows in this stage set the outport, while eth.dst is set
2802       by flows at the ARP/ND Resolution stage.
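The 5-tuple hashing step can be sketched in Python as below. This only models the "same flow always maps to the same member" property; OVS uses its own hash function and the select action, not SHA-256:

```python
# Pick an ECMP member id for a flow by hashing its 5-tuple.  sha256 is
# used (rather than Python's hash()) so the result is deterministic
# across runs, mirroring how a given flow always hashes to one member.
import hashlib

def select_ecmp_member(five_tuple, member_ids):
    """five_tuple: (src_ip, dst_ip, proto, src_port, dst_port)."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(member_ids)
    return member_ids[index]

members = [1, 2, 3]
a = select_ecmp_member(("10.0.0.1", "8.8.8.8", "tcp", 1234, 443), members)
b = select_ecmp_member(("10.0.0.1", "8.8.8.8", "tcp", 1234, 443), members)
```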
2803
2804 This table contains the following logical flows:
2805
2806 • Priority-10550 flow that drops IPv6 Router Solicita‐
2807 tion/Advertisement packets that were not processed in
2808 previous tables.
2809
2810 • Priority-10550 flows that drop IGMP and MLD packets with
2811 source MAC address owned by the router. These are used to
2812 prevent looping statically forwarded IGMP and MLD packets
2813 for which TTL is not decremented (it is always 1).
2814
2815 • Priority-10500 flows that match IP multicast traffic des‐
2816 tined to groups registered on any of the attached
2817 switches and sets outport to the associated multicast
2818 group that will eventually flood the traffic to all in‐
2819 terested attached logical switches. The flows also decre‐
2820 ment TTL.
2821
2822 • Priority-10460 flows that match IGMP and MLD control
2823 packets, set outport to the MC_STATIC multicast group,
2824 which ovn-northd populates with the logical ports that
2825 have options :mcast_flood=’true’. If no router ports are
2826 configured to flood multicast traffic the packets are
2827 dropped.
2828
2829 • Priority-10450 flow that matches unregistered IP multi‐
2830 cast traffic decrements TTL and sets outport to the
2831 MC_STATIC multicast group, which ovn-northd populates
2832 with the logical ports that have options
2833 :mcast_flood=’true’. If no router ports are configured to
2834 flood multicast traffic the packets are dropped.
2835
2836 • IPv4 routing table. For each route to IPv4 network N with
2837 netmask M, on router port P with IP address A and Ether‐
2838 net address E, a logical flow with match ip4.dst == N/M,
2839 whose priority is the number of 1-bits in M, has the fol‐
2840 lowing actions:
2841
2842 ip.ttl--;
2843 reg8[0..15] = 0;
2844 reg0 = G;
2845 reg1 = A;
2846 eth.src = E;
2847 outport = P;
2848 flags.loopback = 1;
2849 next;
2850
2851
2852 (Ingress table 1 already verified that ip.ttl--; will not
2853 yield a TTL exceeded error.)
2854
2855 If the route has a gateway, G is the gateway IP address.
2856 Instead, if the route is from a configured static route,
2857 G is the next hop IP address. Else it is ip4.dst.
2858
2859 • IPv6 routing table. For each route to IPv6 network N with
2860 netmask M, on router port P with IP address A and Ether‐
2861 net address E, a logical flow with match in CIDR notation
2862 ip6.dst == N/M, whose priority is the integer value of M,
2863 has the following actions:
2864
2865 ip.ttl--;
2866 reg8[0..15] = 0;
2867 xxreg0 = G;
2868 xxreg1 = A;
2869 eth.src = E;
2870                        outport = P;
2871 flags.loopback = 1;
2872 next;
2873
2874
2875 (Ingress table 1 already verified that ip.ttl--; will not
2876 yield a TTL exceeded error.)
2877
2878 If the route has a gateway, G is the gateway IP address.
2879 Instead, if the route is from a configured static route,
2880 G is the next hop IP address. Else it is ip6.dst.
2881
2882 If the address A is in the link-local scope, the route
2883 will be limited to sending on the ingress port.
2884
2885                  For each static route, reg7 == id && is prefixed to the
2886                  logical flow match portion. For routes with a route_table
2887                  value set, a unique non-zero id is used. For routes
2888                  within the <main> route table (no route table set), this
2889                  id value is 0.
2890
2891                  For each connected route (a route to the LRP’s subnet
2892                  CIDR), the match has no reg7 == id && prefix, so that
2893                  routes to the LRP’s subnets appear in all routing tables.
2894
2895                • ECMP routes are grouped by policy and prefix. A unique
2896                  id (non-zero) is assigned to each group, and each member
2897                  is also assigned a unique id (non-zero) within its
2898                  group.
2899
2900 For each IPv4/IPv6 ECMP group with group id GID and mem‐
2901 ber ids MID1, MID2, ..., a logical flow with match in
2902 CIDR notation ip4.dst == N/M, or ip6.dst == N/M, whose
2903 priority is the integer value of M, has the following ac‐
2904 tions:
2905
2906 ip.ttl--;
2907 flags.loopback = 1;
2908 reg8[0..15] = GID;
2909 select(reg8[16..31], MID1, MID2, ...);
2910
2911
2912 • A priority-0 logical flow that matches all packets not
2913 already handled (match 1) and drops them (action drop;).
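The reason flow priority equals the number of 1-bits in the netmask is that ranking candidate routes by that count yields longest-prefix-match ordering. A small Python illustration (not northd code; for a valid CIDR mask, the prefix length equals the popcount of the mask):

```python
# Longest-prefix match via "priority = prefix length": among all routes
# whose network contains the destination, the one with the most 1-bits
# in its mask (i.e. the highest flow priority) wins.
import ipaddress

def route_priority(network_cidr):
    # prefixlen == number of 1-bits in the netmask for a CIDR network
    return ipaddress.ip_network(network_cidr).prefixlen

def best_route(dst, routes):
    dst = ipaddress.ip_address(dst)
    matches = [r for r in routes if dst in ipaddress.ip_network(r)]
    return max(matches, key=route_priority) if matches else None

routes = ["0.0.0.0/0", "10.0.0.0/8", "10.1.2.0/24"]
```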
2914
2915 Ingress Table 14: IP_ROUTING_ECMP
2916
2917 This table implements the second part of IP routing for ECMP routes
2918       following the previous table. If a packet matched an ECMP group in the
2919 previous table, this table matches the group id and member id stored
2920 from the previous table, setting reg0 (or xxreg0 for IPv6) to the next-
2921 hop IP address (leaving ip4.dst or ip6.dst, the packet’s final destina‐
2922 tion, unchanged) and advances to the next table for ARP resolution. It
2923 also sets reg1 (or xxreg1) to the IP address owned by the selected
2924 router port (ingress table ARP Request will generate an ARP request, if
2925 needed, with reg0 as the target protocol address and reg1 as the source
2926 protocol address).
2927
2928 This processing is skipped for reply traffic being sent out of an ECMP
2929 route if the route was configured to use symmetric replies.
2930
2931 This table contains the following logical flows:
2932
2933 • A priority-150 flow that matches reg8[0..15] == 0 with
2934 action next; directly bypasses packets of non-ECMP
2935 routes.
2936
2937 • For each member with ID MID in each ECMP group with ID
2938 GID, a priority-100 flow with match reg8[0..15] == GID &&
2939 reg8[16..31] == MID has following actions:
2940
2941 [xx]reg0 = G;
2942 [xx]reg1 = A;
2943 eth.src = E;
2944 outport = P;
2945
2946
2947 • A priority-0 logical flow that matches all packets not
2948 already handled (match 1) and drops them (action drop;).
2949
2950 Ingress Table 15: Router policies
2951
2952 This table adds flows for the logical router policies configured on the
2953 logical router. Please see the OVN_Northbound database Logi‐
2954 cal_Router_Policy table documentation in ovn-nb for supported actions.
2955
2956 • For each router policy configured on the logical router,
2957 a logical flow is added with specified priority, match
2958 and actions.
2959
2960 • If the policy action is reroute with 2 or more nexthops
2961 defined, then the logical flow is added with the follow‐
2962 ing actions:
2963
2964 reg8[0..15] = GID;
2965 reg8[16..31] = select(1,..n);
2966
2967
2968                  where GID is the ECMP group id generated by ovn-northd
2969                  for this policy and n is the number of nexthops. The
2970                  select action selects one of the nexthop member ids,
2971                  stores it in the register reg8[16..31] and advances the
2972                  packet to the next stage.
2973
2974                • If the policy action is reroute with just one nexthop,
2975 then the logical flow is added with the following ac‐
2976 tions:
2977
2978 [xx]reg0 = H;
2979 eth.src = E;
2980 outport = P;
2981 reg8[0..15] = 0;
2982 flags.loopback = 1;
2983 next;
2984
2985
2986 where H is the nexthop defined in the router policy, E
2987 is the ethernet address of the logical router port from
2988 which the nexthop is reachable and P is the logical
2989 router port from which the nexthop is reachable.
2990
2991 • If a router policy has the option pkt_mark=m set and if
2992 the action is not drop, then the action also includes
2993 pkt.mark = m to mark the packet with the marker m.
2994
2995 Ingress Table 16: ECMP handling for router policies
2996
2997 This table handles the ECMP for the router policies configured with
2998 multiple nexthops.
2999
3000 • A priority-150 flow is added to advance the packet to the
3001 next stage if the ECMP group id register reg8[0..15] is
3002 0.
3003
3004 • For each ECMP reroute router policy with multiple nex‐
3005 thops, a priority-100 flow is added for each nexthop H
3006 with the match reg8[0..15] == GID && reg8[16..31] == M
3007 where GID is the router policy group id generated by
3008 ovn-northd and M is the member id of the nexthop H gener‐
3009 ated by ovn-northd. The following actions are added to
3010 the flow:
3011
3012                  [xx]reg0 = H;
3013                  eth.src = E;
3014                  outport = P;
3015                  flags.loopback = 1;
3016                  next;
3017
3018
3019 where H is the nexthop defined in the router policy, E
3020 is the ethernet address of the logical router port from
3021 which the nexthop is reachable and P is the logical
3022 router port from which the nexthop is reachable.
3023
3024 • A priority-0 logical flow that matches all packets not
3025 already handled (match 1) and drops them (action drop;).
3026
3027 Ingress Table 17: ARP/ND Resolution
3028
3029 Any packet that reaches this table is an IP packet whose next-hop IPv4
3030 address is in reg0 or IPv6 address is in xxreg0. (ip4.dst or ip6.dst
3031 contains the final destination.) This table resolves the IP address in
3032 reg0 (or xxreg0) into an output port in outport and an Ethernet address
3033 in eth.dst, using the following flows:
3034
3035 • A priority-500 flow that matches IP multicast traffic
3036 that was allowed in the routing pipeline. For this kind
3037 of traffic the outport was already set so the flow just
3038 advances to the next table.
3039
3040 • Priority-200 flows that match ECMP reply traffic for the
3041 routes configured to use symmetric replies, with actions
3042 push(xxreg1); xxreg1 = ct_label; eth.dst =
3043 xxreg1[32..79]; pop(xxreg1); next;. xxreg1 is used here
3044 to avoid masked access to ct_label, to make the flow HW-
3045 offloading friendly.
3046
3047 • Static MAC bindings. MAC bindings can be known statically
3048 based on data in the OVN_Northbound database. For router
3049 ports connected to logical switches, MAC bindings can be
3050 known statically from the addresses column in the Logi‐
3051 cal_Switch_Port table. For router ports connected to
3052 other logical routers, MAC bindings can be known stati‐
3053 cally from the mac and networks columns in the Logi‐
3054 cal_Router_Port table. (Note: the flow is NOT installed
3055 for the IP addresses that belong to a neighbor logical
3056 router port if the current router has options:dy‐
3057 namic_neigh_routers set to true.)
3058
3059 For each IPv4 address A whose host is known to have Eth‐
3060 ernet address E on router port P, a priority-100 flow
3061 with match outport == P && reg0 == A has actions eth.dst
3062 = E; next;.
3063
3064 For each virtual ip A configured on a logical port of
3065 type virtual and its virtual parent set in its corre‐
3066 sponding Port_Binding record and the virtual parent with
3067 the Ethernet address E and the virtual ip is reachable
3068 via the router port P, a priority-100 flow with match
3069 outport == P && xxreg0/reg0 == A has actions eth.dst =
3070 E; next;.
3071
3072 For each virtual ip A configured on a logical port of
3073 type virtual and its virtual parent not set in its corre‐
3074 sponding Port_Binding record and the virtual ip A is
3075 reachable via the router port P, a priority-100 flow with
3076 match outport == P && xxreg0/reg0 == A has actions
3077 eth.dst = 00:00:00:00:00:00; next;. This flow is added so
3078 that the ARP is always resolved for the virtual ip A by
3079 generating ARP request and not consulting the MAC_Binding
3080 table as it can have incorrect value for the virtual ip
3081 A.
3082
3083 For each IPv6 address A whose host is known to have Eth‐
3084 ernet address E on router port P, a priority-100 flow
3085 with match outport == P && xxreg0 == A has actions
3086 eth.dst = E; next;.
3087
3088 For each logical router port with an IPv4 address A and a
3089 mac address of E that is reachable via a different logi‐
3090 cal router port P, a priority-100 flow with match outport
3091 == P && reg0 == A has actions eth.dst = E; next;.
3092
3093 For each logical router port with an IPv6 address A and a
3094 mac address of E that is reachable via a different logi‐
3095 cal router port P, a priority-100 flow with match outport
3096 == P && xxreg0 == A has actions eth.dst = E; next;.
3097
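             For example, if a host with IPv4 address 192.168.0.10 and
             Ethernet address 00:00:00:00:00:01 (hypothetical values) is
             known statically to be reachable via router port lr0-ls0, the
             resulting flow would be:

                     match:  outport == "lr0-ls0" && reg0 == 192.168.0.10
                     action: eth.dst = 00:00:00:00:00:01; next;
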
3098 • Static MAC bindings from NAT entries. MAC bindings can
3099 also be known for the entries in the NAT table. The fol‐
3100 lowing flows are programmed for distributed logical
3101 routers, i.e., routers with a distributed gateway port.
3102
3103 For each row in the NAT table with IPv4 address A in the
3104 external_ip column of NAT table, a priority-100 flow with
3105 the match outport == P && reg0 == A has actions eth.dst =
3106 E; next;, where P is the distributed logical router port
3107 and E is the Ethernet address set in the external_mac
3108 column of the NAT table for rules of type dnat_and_snat,
3109 otherwise the Ethernet address of the distributed logical
3110 router port. Note that if the external_ip is not within a
3111 subnet on the owning logical router, then OVN will only
3112 create ARP resolution flows if the options:add_route is
3113 set to true. Otherwise, no ARP resolution flows will be
3114 added.
3115
3116 For IPv6 NAT entries, the same flows are added, but using the
3117 register xxreg0 for the match.
3118
3119 • If the router datapath runs a port with redirect-type set
3120 to bridged, for each distributed NAT rule with IP A in
3121 the logical_ip column and logical port P in the logi‐
3122 cal_port column of NAT table, a priority-90 flow is added
3123 with the match outport == Q && ip.src == A && is_chas‐
3124 sis_resident(P), where Q is the distributed logical
3125 router port, with the action get_arp(outport, reg0);
3126 next; for IPv4 and get_nd(outport, xxreg0); next; for IPv6.
3127
3128 • Traffic whose IP destination is an address owned by the
3129 router should be dropped. Such traffic is normally
3130 dropped in ingress table IP Input except for IPs that are
3131 also shared with SNAT rules. However, if no unSNAT oper‐
3132 ation has happened successfully by this point in the
3133 pipeline and the destination IP of the packet is still a
3134 router-owned IP, the packets can be safely dropped.
3136
3137 A priority-2 logical flow with match ip4.dst == {..}
3138 matches on traffic destined to router owned IPv4 ad‐
3139 dresses which are also SNAT IPs. This flow has action
3140 drop;.
3141
3142 A priority-2 logical flow with match ip6.dst == {..}
3143 matches on traffic destined to router owned IPv6 ad‐
3144 dresses which are also SNAT IPs. This flow has action
3145 drop;.
3146
3147 A priority-0 logical flow that matches all packets not
3148 already handled (match 1) and drops them (action drop;).
3149
3150 • Dynamic MAC bindings. These flows resolve MAC-to-IP bind‐
3151 ings that have become known dynamically through ARP or
3152 neighbor discovery. (The ingress table ARP Request will
3153 issue an ARP or neighbor solicitation request for cases
3154 where the binding is not yet known.)
3155
3156 A priority-0 logical flow with match ip4 has actions
3157 get_arp(outport, reg0); next;.
3158
3159 A priority-0 logical flow with match ip6 has actions
3160 get_nd(outport, xxreg0); next;.
3161
3162 • For a distributed gateway LRP with redirect-type set to
3163 bridged, a priority-50 flow matches outport ==
3164 "ROUTER_PORT" && !is_chassis_resident("cr-ROUTER_PORT")
3165 and has actions eth.dst = E; next;, where E is the Ether‐
3166 net address of the logical router port.
3167
3168 Ingress Table 18: Check packet length
3169
3170 For distributed logical routers or gateway routers whose gateway port
3171 has options:gateway_mtu set to a valid integer value, this table adds
3172 a priority-50 logical flow with the match outport == GW_PORT, where
3173 GW_PORT is the gateway router port. The flow applies the action
3174 check_pkt_larger and advances the packet to the next table.
3175
3176 REGBIT_PKT_LARGER = check_pkt_larger(L); next;
3177
3178
3179 where L is the packet length to check for. If the packet is larger than
3180 L, it stores 1 in the register bit REGBIT_PKT_LARGER. The value of L is
3181 taken from the options:gateway_mtu column of the Logical_Router_Port row.
3182
3183 If the port is also configured with options:gateway_mtu_bypass then an‐
3184 other flow is added, with priority-55, to bypass the check_pkt_larger
3185 flow.
3186
3187 This table adds one priority-0 fallback flow that matches all packets
3188 and advances to the next table.
3189
3190 Ingress Table 19: Handle larger packets
3191
3192 For distributed logical routers or gateway routers with gateway port
3193 configured with options:gateway_mtu to a valid integer value, this ta‐
3194 ble adds the following priority-150 logical flow for each logical
3195 router port with the match inport == LRP && outport == GW_PORT && REG‐
3196 BIT_PKT_LARGER && !REGBIT_EGRESS_LOOPBACK, where LRP is the logical
3197 router port and GW_PORT is the gateway port and applies the following
3198 action for IPv4 and IPv6 respectively:
3199
3200 icmp4 {
3201 icmp4.type = 3; /* Destination Unreachable. */
3202 icmp4.code = 4; /* Frag Needed and DF was Set. */
3203 icmp4.frag_mtu = M;
3204 eth.dst = E;
3205 ip4.dst = ip4.src;
3206 ip4.src = I;
3207 ip.ttl = 255;
3208 REGBIT_EGRESS_LOOPBACK = 1;
3209 REGBIT_PKT_LARGER = 0;
3210 next(pipeline=ingress, table=0);
3211 };
3212 icmp6 {
3213 icmp6.type = 2;
3214 icmp6.code = 0;
3215 icmp6.frag_mtu = M;
3216 eth.dst = E;
3217 ip6.dst = ip6.src;
3218 ip6.src = I;
3219 ip.ttl = 255;
3220 REGBIT_EGRESS_LOOPBACK = 1;
3221 REGBIT_PKT_LARGER = 0;
3222 next(pipeline=ingress, table=0);
3223 };
3224
3225
3226 • Where M is the fragment MTU: the value of the op‐
3227 tions:gateway_mtu column of the Logical_Router_Port row
3228 minus 58.
3229
3230 • E is the Ethernet address of the logical router port.
3231
3232 • I is the IPv4/IPv6 address of the logical router port.
3233
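The relationship between options:gateway_mtu and the advertised fragment MTU M can be sketched as follows (illustrative Python, not part of OVN; the function name is hypothetical):

```python
def frag_mtu(gateway_mtu: int) -> int:
    """Fragment MTU M advertised in the ICMP error above:
    the configured options:gateway_mtu minus 58 bytes."""
    return gateway_mtu - 58

print(frag_mtu(1500))  # 1442
```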
3234 This table adds one priority-0 fallback flow that matches all packets
3235 and advances to the next table.
3236
3237 Ingress Table 20: Gateway Redirect
3238
3239 For distributed logical routers where one or more of the logical router
3240 ports specifies a gateway chassis, this table redirects certain packets
3241 to the distributed gateway port instances on the gateway chassis.
3242 This table has the following flows:
3243
3244 • For each NAT rule in the OVN Northbound database that can
3245 be handled in a distributed manner, a priority-100 logi‐
3246 cal flow is added with match ip4.src == B && outport ==
3247 GW && is_chassis_resident(P), where GW is the distributed
3248 gateway port specified in the NAT rule and P is the NAT
3249 logical port. IP traffic matching this rule is handled
3250 locally, setting reg1 to C and eth.src to D, where C is
3251 the NAT external IP and D is the NAT external MAC.
3252
3253 • For each dnat_and_snat NAT rule with stateless=true and
3254 allowed_ext_ips configured, a priority-75 flow is pro‐
3255 grammed with match ip4.dst == B and action outport = CR;
3256 next; where B is the NAT rule external IP and CR is the
3257 chassisredirect port representing the instance of the
3258 logical router distributed gateway port on the gateway
3259 chassis. Moreover a priority-70 flow is programmed with
3260 same match and action drop;. For each dnat_and_snat NAT
3261 rule with stateless=true and exempted_ext_ips configured,
3262 a priority-75 flow is programmed with match ip4.dst == B
3263 and action drop; where B is the NAT rule external IP. A
3264 similar flow is added for IPv6 traffic.
3265
3266 • For each NAT rule in the OVN Northbound database that can
3267 be handled in a distributed manner, a priority-80 logical
3268 flow with drop action if the NAT logical port is a vir‐
3269 tual port not claimed by any chassis yet.
3270
3271 • A priority-50 logical flow with match outport == GW has
3272 actions outport = CR; next;, where GW is the logical
3273 router distributed gateway port and CR is the chas‐
3274 sisredirect port representing the instance of the logical
3275 router distributed gateway port on the gateway chassis.
3276
3277 • A priority-0 logical flow with match 1 has actions next;.
3278
3279 Ingress Table 21: ARP Request
3280
3281 In the common case where the Ethernet destination has been resolved,
3282 this table outputs the packet. Otherwise, it composes and sends an ARP
3283 or IPv6 Neighbor Solicitation request. It holds the following flows:
3284
3285 • Unknown MAC address. A priority-100 flow for IPv4 packets
3286 with match eth.dst == 00:00:00:00:00:00 has the following
3287 actions:
3288
3289 arp {
3290 eth.dst = ff:ff:ff:ff:ff:ff;
3291 arp.spa = reg1;
3292 arp.tpa = reg0;
3293 arp.op = 1; /* ARP request. */
3294 output;
3295 };
3296
3297
3298 Unknown MAC address. For each IPv6 static route associ‐
3299 ated with the router with nexthop IP G, a prior‐
3300 ity-200 flow for IPv6 packets with match eth.dst ==
3301 00:00:00:00:00:00 && xxreg0 == G with the following ac‐
3302 tions is added:
3303
3304 nd_ns {
3305 eth.dst = E;
3306 ip6.dst = I;
3307 nd.target = G;
3308 output;
3309 };
3310
3311
3312 Where E is the multicast MAC address derived from the
3313 gateway IP, and I is the solicited-node multicast address
3314 corresponding to the target address G.
3315
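The derivation of I and E above follows the standard IPv6 rules (RFC 4291 for solicited-node multicast addresses, RFC 2464 for the multicast MAC mapping); a sketch in Python with hypothetical helper names:

```python
import ipaddress

def solicited_node_addr(target: str) -> str:
    """Solicited-node multicast address for an IPv6 target (RFC 4291):
    ff02::1:ff00:0/104 plus the low 24 bits of the target."""
    t = int(ipaddress.IPv6Address(target))
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return str(ipaddress.IPv6Address(base | (t & 0xFFFFFF)))

def ipv6_multicast_mac(addr: str) -> str:
    """Ethernet multicast MAC for an IPv6 multicast address (RFC 2464):
    33:33 followed by the low 32 bits of the address."""
    low32 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFFFF
    return "33:33:" + ":".join(f"{(low32 >> s) & 0xff:02x}" for s in (24, 16, 8, 0))

i = solicited_node_addr("fe80::1")   # the solicited-node address I
e = ipv6_multicast_mac(i)            # the multicast MAC E
```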
3316 Unknown MAC address. A priority-100 flow for IPv6 packets
3317 with match eth.dst == 00:00:00:00:00:00 has the following
3318 actions:
3319
3320 nd_ns {
3321 nd.target = xxreg0;
3322 output;
3323 };
3324
3325
3326 (Ingress table IP Routing initialized reg1 with the IP
3327 address owned by outport and (xx)reg0 with the next-hop
3328 IP address.)
3329
3330 The IP packet that triggers the ARP/IPv6 NS request is
3331 dropped.
3332
3333 • Known MAC address. A priority-0 flow with match 1 has ac‐
3334 tions output;.
3335
3336 Egress Table 0: Check DNAT local
3337
3338 This table checks if the packet needs to be DNATed in the router
3339 ingress table lr_in_dnat after it is SNATed and looped back to the
3340 ingress pipeline. The check is done only for routers configured with
3341 distributed gateway ports and NAT entries, so that SNAT and DNAT are
3342 performed in different conntrack zones instead of a common zone.
3343
3344 • For each NAT rule in the OVN Northbound database on a
3345 distributed router, a priority-50 logical flow is added
3346 with match ip4.dst == E && is_chassis_resident(P), where
3347 E is the external IP address specified in the NAT rule
3348 and GW is the logical router distributed gateway port.
3349 For a dnat_and_snat NAT rule, P is the logical port spec‐
3350 ified in the NAT rule; if the logical_port column of the
3351 NAT table is NOT set, then P is the chassisredirect port
3352 of GW. The flow has actions REGBIT_DST_NAT_IP_LOCAL = 1; next;.
3353
3354 • A priority-0 logical flow with match 1 has actions REG‐
3355 BIT_DST_NAT_IP_LOCAL = 0; next;.
3356
3357 This table also installs a priority-50 logical flow for each logical
3358 router that has NATs configured on it. The flow has match ip && ct_la‐
3359 bel.natted == 1 and action REGBIT_DST_NAT_IP_LOCAL = 1; next;. This is
3360 intended to ensure that traffic that was DNATted locally will use a
3361 separate conntrack zone for SNAT if SNAT is required later in the
3362 egress pipeline. Note that this flow checks the value of ct_label.nat‐
3363 ted, which is set in the ingress pipeline. This means that ovn-northd
3364 assumes that this value is carried over from the ingress pipeline to
3365 the egress pipeline and is not altered or cleared. If conntrack label
3366 values are ever changed to be cleared between the ingress and egress
3367 pipelines, then the match conditions of this flow will need to be up‐
3368 dated accordingly.
3369
3370 Egress Table 1: UNDNAT
3371
3372 This table handles reverse traffic for already established connec‐
3373 tions; i.e., DNAT has already been done in the ingress pipeline and
3374 the packet has now entered the egress pipeline as part of a reply.
3375 This traffic is unDNATed here.
3376
3377 • A priority-0 logical flow with match 1 has actions next;.
3378
3379 Egress Table 1: UNDNAT on Gateway Routers
3380
3381 • For all IP packets, a priority-50 flow with an action
3382 flags.loopback = 1; ct_dnat;.
3383
3384 Egress Table 1: UNDNAT on Distributed Routers
3385
3386 • For all the configured load balancing rules for a router
3387 with gateway port in OVN_Northbound database that in‐
3388 cludes an IPv4 address VIP, for every backend IPv4 ad‐
3389 dress B defined for the VIP a priority-120 flow is pro‐
3390 grammed on gateway chassis that matches ip && ip4.src ==
3391 B && outport == GW, where GW is the logical router gate‐
3392 way port with an action ct_dnat_in_czone;. If the backend
3393 IPv4 address B is also configured with L4 port PORT of
3394 protocol P, then the match also includes P.src == PORT.
3395 These flows are not added for load balancers with IPv6
3396 VIPs.
3397
3398 If the router is configured to force SNAT any load-bal‐
3399 anced packets, above action will be replaced by
3400 flags.force_snat_for_lb = 1; ct_dnat;.
3401
3402 • For each configuration in the OVN Northbound database
3403 that asks to change the destination IP address of a
3404 packet from an IP address of A to B, a priority-100 flow
3405 matches ip && ip4.src == B && outport == GW, where GW is
3406 the logical router gateway port, with an action
3407 ct_dnat_in_czone;. If the NAT rule is of type
3408 dnat_and_snat and has stateless=true in the options, then
3409 the action would be next;.
3410
3411 If the NAT rule cannot be handled in a distributed man‐
3412 ner, then the priority-100 flow above is only programmed
3413 on the gateway chassis with the action ct_dnat_in_czone;.
3414
3415 If the NAT rule can be handled in a distributed manner,
3416 then there is an additional action eth.src = EA;, where
3417 EA is the ethernet address associated with the IP address
3418 A in the NAT rule. This allows upstream MAC learning to
3419 point to the correct chassis.
3420
3421 Egress Table 2: Post UNDNAT
3422
3423 • A priority-50 logical flow is added that commits any un‐
3424 tracked flows from the previous table lr_out_undnat for
3425 Gateway routers. This flow matches on ct.new && ip with
3426 action ct_commit { }; next;.
3427
3428 • A priority-0 logical flow with match 1 has actions next;.
3429
3430 Egress Table 3: SNAT
3431
3432 Packets that are configured to be SNATed get their source IP address
3433 changed based on the configuration in the OVN Northbound database.
3434
3435 • A priority-120 flow advances IPv6 Neighbor Solicitation
3436 packets to the next table to skip SNAT. In the case
3437 where ovn-controller injects an IPv6 Neighbor Solicita‐
3438 tion packet (for nd_ns action) we don’t want the packet
3439 to go through conntrack.
3440
3441 Egress Table 3: SNAT on Gateway Routers
3442
3443 • If the Gateway router in the OVN Northbound database has
3444 been configured to force SNAT a packet (that has been
3445 previously DNATted) to B, a priority-100 flow matches
3446 flags.force_snat_for_dnat == 1 && ip with an action
3447 ct_snat(B);.
3448
3449 • If a load balancer configured to skip snat has been ap‐
3450 plied to the Gateway router pipeline, a priority-120 flow
3451 matches flags.skip_snat_for_lb == 1 && ip with an action
3452 next;.
3453
3454 • If the Gateway router in the OVN Northbound database has
3455 been configured to force SNAT a packet (that has been
3456 previously load-balanced) using router IP (i.e., op‐
3457 tions:lb_force_snat_ip=router_ip), then for each logical
3458 router port P attached to the Gateway router, a prior‐
3459 ity-110 flow matches flags.force_snat_for_lb == 1 && out‐
3460 port == P
3461 with an action ct_snat(R); where R is the IP configured
3462 on the router port. If R is an IPv4 address then the
3463 match will also include ip4 and if it is an IPv6 address,
3464 then the match will also include ip6.
3465
3466 If the logical router port P is configured with multiple
3467 IPv4 and multiple IPv6 addresses, only the first IPv4 ad‐
3468 dress and the first IPv6 address are considered.
3469
3470 • If the Gateway router in the OVN Northbound database has
3471 been configured to force SNAT a packet (that has been
3472 previously load-balanced) to B, a priority-100 flow
3473 matches flags.force_snat_for_lb == 1 && ip with an action
3474 ct_snat(B);.
3475
3476 • For each configuration in the OVN Northbound database
3477 that asks to change the source IP address of a packet
3478 from an IP address of A or to change the source IP ad‐
3479 dress of a packet that belongs to network A to B, a flow
3480 matches ip && ip4.src == A && (!ct.trk || !ct.rpl) with
3481 an action ct_snat(B);. The priority of the flow is calcu‐
3482 lated based on the mask of A, with matches having larger
3483 masks getting higher priorities. If the NAT rule is of
3484 type dnat_and_snat and has stateless=true in the options,
3485 then the action would be ip4/6.src=(B).
3486
3487 • If the NAT rule has allowed_ext_ips configured, then
3488 there is an additional match ip4.dst == allowed_ext_ips.
3489 Similarly, for IPv6, the match would be ip6.dst == al‐
3490 lowed_ext_ips.
3491
3492 • If the NAT rule has exempted_ext_ips set, then there is
3493 an additional flow configured at priority + 1 of the cor‐
3494 responding NAT rule. The flow matches if the destination
3495 IP is an exempted_ext_ip and the action is next;. This
3496 flow is used to bypass the ct_snat action for a packet
3497 which is destined to exempted_ext_ips.
3498
3499 • A priority-0 logical flow with match 1 has actions next;.
3500
3501 Egress Table 3: SNAT on Distributed Routers
3502
3503 • For each configuration in the OVN Northbound database
3504 that asks to change the source IP address of a packet
3505 from an IP address of A or to change the source IP ad‐
3506 dress of a packet that belongs to network A to B, two
3507 flows are added. The priority P of these flows is calcu‐
3508 lated based on the mask of A, with matches having larger
3509 masks getting higher priorities.
3510
3511 If the NAT rule cannot be handled in a distributed man‐
3512 ner, then the flows below are only programmed on the
3513 gateway chassis, with the flow priority increased by 128
3514 so that they are evaluated first.
3515
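One plausible way to derive such priorities can be sketched as follows (illustrative Python only; the function name is hypothetical and this is not necessarily the exact formula ovn-northd uses, only the ordering properties described above: longer masks win, and gateway-chassis-only flows get +128):

```python
import ipaddress

def snat_priority(network: str, on_gateway_only: bool = False) -> int:
    """Longer prefixes (larger masks) yield higher priorities; flows
    restricted to the gateway chassis are bumped by 128 to run first."""
    p = ipaddress.ip_network(network).prefixlen
    return p + (128 if on_gateway_only else 0)

# A /24 SNAT rule outranks a /16 one, and a centralized rule outranks both.
assert snat_priority("10.0.0.0/24") > snat_priority("10.0.0.0/16")
```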
3516 • The first flow is added with the calculated prior‐
3517 ity P and match ip && ip4.src == A && outport ==
3518 GW, where GW is the logical router gateway port,
3519 with an action ct_snat_in_czone(B); to SNAT in
3520 the common zone. If the NAT rule is of type
3521 dnat_and_snat and has stateless=true in the op‐
3522 tions, then the action would be ip4/6.src=(B).
3523
3524 • The second flow is added with the calculated pri‐
3525 ority P + 1 and match ip && ip4.src == A && out‐
3526 port == GW && REGBIT_DST_NAT_IP_LOCAL == 0, where
3527 GW is the logical router gateway port, with an ac‐
3528 tion ct_snat(B); to SNAT in the snat zone. If the
3529 NAT rule is of type dnat_and_snat and has state‐
3530 less=true in the options, then the action would be
3531 ip4/6.src=(B).
3532
3533 If the NAT rule can be handled in a distributed manner,
3534 then there is an additional action (for both the flows)
3535 eth.src = EA;, where EA is the ethernet address associ‐
3536 ated with the IP address A in the NAT rule. This allows
3537 upstream MAC learning to point to the correct chassis.
3538
3539 If the NAT rule has allowed_ext_ips configured, then
3540 there is an additional match ip4.dst == allowed_ext_ips.
3541 Similarly, for IPv6, the match would be ip6.dst == al‐
3542 lowed_ext_ips.
3543
3544 If the NAT rule has exempted_ext_ips set, then there is
3545 an additional flow configured at priority P + 2 of the
3546 corresponding NAT rule. The flow matches if the destina‐
3547 tion IP is an exempted_ext_ip and the action is next;.
3548 This flow is used to bypass the ct_snat action for a flow
3549 which is destined to exempted_ext_ips.
3550
3551 • A priority-0 logical flow with match 1 has actions next;.
3552
3553 Egress Table 4: Egress Loopback
3554
3555 This table applies to distributed logical routers where one of the
3556 logical router ports specifies a gateway chassis.
3557
3558 While UNDNAT and SNAT processing have already occurred by this point,
3559 this traffic needs to be forced through egress loopback on this dis‐
3560 tributed gateway port instance, in order for UNSNAT and DNAT processing
3561 to be applied, and also for IP routing and ARP resolution after all of
3562 the NAT processing, so that the packet can be forwarded to the destina‐
3563 tion.
3564
3565 This table has the following flows:
3566
3567 • For each NAT rule in the OVN Northbound database on a
3568 distributed router, a priority-100 logical flow with
3569 match ip4.dst == E && outport == GW && is_chassis_resi‐
3570 dent(P), where E is the external IP address specified in
3571 the NAT rule, GW is the distributed gateway port corre‐
3572 sponding to the NAT rule (specified or inferred). For
3573 dnat_and_snat NAT rule, P is the logical port specified
3574 in the NAT rule. If logical_port column of NAT table is
3575 NOT set, then P is the chassisredirect port of GW with
3576 the following actions:
3577
3578 clone {
3579 ct_clear;
3580 inport = outport;
3581 outport = "";
3582 flags = 0;
3583 flags.loopback = 1;
3584 flags.use_snat_zone = REGBIT_DST_NAT_IP_LOCAL;
3585 reg0 = 0;
3586 reg1 = 0;
3587 ...
3588 reg9 = 0;
3589 REGBIT_EGRESS_LOOPBACK = 1;
3590 next(pipeline=ingress, table=0);
3591 };
3592
3593
3594 flags.loopback is set since inport is unchanged and the
3595 packet may return to that port after NAT processing.
3596 REGBIT_EGRESS_LOOPBACK is set to indicate that egress
3597 loopback has occurred, in order to skip the source IP ad‐
3598 dress check against the router address.
3599
3600 • A priority-0 logical flow with match 1 has actions next;.
3601
3602 Egress Table 5: Delivery
3603
3604 Packets that reach this table are ready for delivery. It contains:
3605
3606 • Priority-110 logical flows that match IP multicast pack‐
3607 ets on each enabled logical router port and modify the
3608 Ethernet source address of the packets to the Ethernet
3609 address of the port and then execute action output;.
3610
3611 • Priority-100 logical flows that match packets on each en‐
3612 abled logical router port, with action output;.
3613
3614 • A priority-0 logical flow that matches all packets not
3615 already handled (match 1) and drops them (action drop;).
3616
3618 As described in the previous section, there are several places where
3619 ovn-northd might decide to drop a packet by explicitly creating a Log‐
3620 ical_Flow with the drop; action.
3621
3622 When debug drop-sampling has been configured in the OVN Northbound data‐
3623 base, ovn-northd replaces all the drop; actions with a sam‐
3624 ple(priority=65535, collector_set=id, obs_domain=obs_id,
3625 obs_point=@cookie) action, where:
3626
3627 • id is the value of the debug_drop_collector_set option
3628 configured in the OVN Northbound database.
3629
3630 • obs_id has its 8 most significant bits equal to the
3631 value of the debug_drop_domain_id option in the OVN
3632 Northbound database and its 24 least significant bits
3633 equal to the datapath’s tunnel key.
3634
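The obs_id layout described above can be expressed as a simple bit-packing (illustrative Python; the function name is hypothetical):

```python
def build_obs_id(debug_drop_domain_id: int, dp_tunnel_key: int) -> int:
    """obs_id: 8 most significant bits hold debug_drop_domain_id,
    24 least significant bits hold the datapath's tunnel key."""
    return ((debug_drop_domain_id & 0xFF) << 24) | (dp_tunnel_key & 0xFFFFFF)
```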
3635
3636
3637OVN 22.12.0 ovn-northd ovn-northd(8)