ovn-northd(8)                     OVN Manual                     ovn-northd(8)
2
3
4

NAME

6       ovn-northd  and ovn-northd-ddlog - Open Virtual Network central control
7       daemon
8

SYNOPSIS

10       ovn-northd [options]
11

DESCRIPTION

13       ovn-northd is a centralized  daemon  responsible  for  translating  the
14       high-level  OVN  configuration into logical configuration consumable by
15       daemons such as ovn-controller. It translates the logical network  con‐
16       figuration  in  terms  of conventional network concepts, taken from the
17       OVN Northbound Database (see ovn-nb(5)), into logical datapath flows in
18       the OVN Southbound Database (see ovn-sb(5)) below it.
19
20       ovn-northd is implemented in C. ovn-northd-ddlog is a compatible imple‐
21       mentation written in DDlog, a language for  incremental  database  pro‐
22       cessing.  This documentation applies to both implementations, with dif‐
23       ferences indicated where relevant.
24

OPTIONS

26       --ovnnb-db=database
27              The OVSDB database containing the OVN  Northbound  Database.  If
28              the  OVN_NB_DB environment variable is set, its value is used as
29              the default. Otherwise, the default is unix:/ovnnb_db.sock.
30
31       --ovnsb-db=database
32              The OVSDB database containing the OVN  Southbound  Database.  If
33              the  OVN_SB_DB environment variable is set, its value is used as
34              the default. Otherwise, the default is unix:/ovnsb_db.sock.
35
36       --ddlog-record=file
37              This option is for ovn-northd-ddlog only. It causes the daemon to
38              record  the  initial database state and later changes to file in
39              the text-based DDlog command format. The ovn_northd_cli  program
40              can  later replay these changes for debugging purposes. This op‐
41              tion has a performance impact. See  debugging-ddlog.rst  in  the
42              OVN documentation for more details.
43
44       --dry-run
45              Causes   ovn-northd  to  start  paused.  In  the  paused  state,
46              ovn-northd does not apply any changes to the databases, although
47              it  continues  to  monitor  them.  For more information, see the
48              pause command, under Runtime Management Commands below.
49
50              For  ovn-northd-ddlog,  one   could   use   this   option   with
51              --ddlog-record  to  generate  a  replay log without restarting a
52              process or disturbing a running system.
53
54       --n-threads N
55              In certain situations, it may  be  desirable  to  enable  paral‐
56              lelization  on  a  system  to decrease latency (at the potential
57              cost of increasing CPU usage).
58
59              This option will cause ovn-northd to use N threads when building
60              logical flows, when N is within [2-256]. If N is 1, paralleliza‐
61              tion is disabled (default behavior). If N is less than 1, then N
62              is  set  to  1,  parallelization  is  disabled  and a warning is
63              logged. If N is more than 256, then N  is  set  to  256,  paral‐
64              lelization  is  enabled  (with  256  threads)  and  a warning is
65              logged.
66
67              ovn-northd-ddlog does not support this option.
68
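The clamping rules above can be sketched as follows (an illustrative Python model of the documented behavior; clamp_n_threads is a hypothetical name, not part of ovn-northd):

```python
def clamp_n_threads(n):
    """Model the documented handling of --n-threads N (not ovn-northd
    code).  Returns (threads, parallelization_enabled, warning_logged)."""
    if n < 1:
        return 1, False, True       # clamped up to 1, warning logged
    if n == 1:
        return 1, False, False      # default: parallelization disabled
    if n > 256:
        return 256, True, True      # clamped down to 256, warning logged
    return n, True, False           # 2..256: parallelization enabled
```
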
69       database in the above options must be an OVSDB active or  passive  con‐
70       nection method, as described in ovsdb(7).
71
72   Daemon Options
73       --pidfile[=pidfile]
74              Causes a file (by default, program.pid) to be created indicating
75              the PID of the running process. If the pidfile argument  is  not
76              specified, or if it does not begin with /, then it is created in
77              .
78
79              If --pidfile is not specified, no pidfile is created.
80
81       --overwrite-pidfile
82              By default, when --pidfile is specified and the  specified  pid‐
83              file already exists and is locked by a running process, the dae‐
84              mon refuses to start. Specify --overwrite-pidfile to cause it to
85              instead overwrite the pidfile.
86
87              When --pidfile is not specified, this option has no effect.
88
89       --detach
90              Runs  this  program  as a background process. The process forks,
91              and in the child it starts a new session,  closes  the  standard
92              file descriptors (which has the side effect of disabling logging
93              to the console), and changes its current directory to  the  root
94              (unless  --no-chdir is specified). After the child completes its
95              initialization, the parent exits.
96
97       --monitor
98              Creates an additional process to monitor  this  program.  If  it
99              dies  due  to a signal that indicates a programming error (SIGA‐
100              BRT, SIGALRM, SIGBUS, SIGFPE, SIGILL, SIGPIPE, SIGSEGV, SIGXCPU,
101              or SIGXFSZ) then the monitor process starts a new copy of it. If
102              the daemon dies or exits for another reason, the monitor process
103              exits.
104
105              This  option  is  normally used with --detach, but it also func‐
106              tions without it.
107
108       --no-chdir
109              By default, when --detach is specified, the daemon  changes  its
110              current  working  directory  to  the root directory after it de‐
111              taches. Otherwise, invoking the daemon from a carelessly  chosen
112              directory  would  prevent  the administrator from unmounting the
113              file system that holds that directory.
114
115              Specifying --no-chdir suppresses this behavior,  preventing  the
116              daemon  from changing its current working directory. This may be
117              useful for collecting core files, since it is common behavior to
118              write core dumps into the current working directory and the root
119              directory is not a good directory to use.
120
121              This option has no effect when --detach is not specified.
122
123       --no-self-confinement
124              By default this daemon will try to self-confine itself  to  work
125              with  files  under  well-known  directories  determined at build
126              time. It is better to stick with this default behavior  and  not
127              to  use  this  flag  unless some other Access Control is used to
128              confine daemon. Note that in contrast to  other  access  control
129              implementations  that  are  typically enforced from kernel-space
130              (e.g. DAC or MAC), self-confinement is imposed  from  the  user-
131              space daemon itself and hence should not be considered as a full
132              confinement strategy, but instead should be viewed as  an  addi‐
133              tional layer of security.
134
135       --user=user:group
136              Causes  this  program  to  run  as a different user specified in
137              user:group, thus dropping most of  the  root  privileges.  Short
138              forms  user  and  :group  are also allowed, with current user or
139              group assumed, respectively. Only daemons started by the root
140              user accept this argument.
141
142              On   Linux,   daemons   will   be   granted   CAP_IPC_LOCK   and
143              CAP_NET_BIND_SERVICE before dropping root  privileges.  Daemons
144              that  interact  with  a  datapath, such as ovs-vswitchd, will be
145              granted three  additional  capabilities,  namely  CAP_NET_ADMIN,
146              CAP_NET_BROADCAST  and  CAP_NET_RAW.  The capability change will
147              apply even if the new user is root.
148
149              On Windows, this option is not currently supported. For security
150              reasons,  specifying  this  option will cause the daemon process
151              not to start.
152
153   Logging Options
154       -v[spec]
155       --verbose=[spec]
156            Sets logging levels. Without any spec, sets the log level for  ev‐
157            ery  module  and  destination to dbg. Otherwise, spec is a list of
158            words separated by spaces or commas or colons, up to one from each
159            category below:
160
161            •      A  valid module name, as displayed by the vlog/list command
162                   on ovs-appctl(8), limits the log level change to the speci‐
163                   fied module.
164
165            •      syslog, console, or file, to limit the log level change
166                   to only the system log, to the console, or to a file,
167                   respectively. (If --detach is specified, the daemon closes
168                   its standard file descriptors, so logging  to  the  console
169                   will have no effect.)
170
171                   On  Windows  platform,  syslog is accepted as a word and is
172                   only useful along with the --syslog-target option (the word
173                   has no effect otherwise).
174
175            •      off, emer, err, warn, info, or dbg, to control the log
176                   level. Messages of the given severity  or  higher  will  be
177                   logged,  and  messages  of  lower severity will be filtered
178                   out. off filters out all messages. See ovs-appctl(8) for  a
179                   definition of each log level.
180
181            Case is not significant within spec.
182
183            Regardless  of the log levels set for file, logging to a file will
184            not take place unless --log-file is also specified (see below).
185
186            For compatibility with older versions of OVS, any is accepted as a
187            word but has no effect.
188
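The word categories above can be illustrated with a small parser (a sketch under the documented separator and category rules; parse_verbose_spec is a hypothetical name, and real ovn-northd additionally validates module names against its module list):

```python
import re

DESTINATIONS = {"syslog", "console", "file"}
LEVELS = {"off", "emer", "err", "warn", "info", "dbg"}

def parse_verbose_spec(spec):
    """Split a -v/--verbose spec into (module, destination, level).

    Words may be separated by spaces, commas, or colons; case is not
    significant; "any" is accepted for compatibility but has no effect.
    Anything that is neither a destination nor a level is treated here
    as a module name.
    """
    module = destination = level = None
    for word in filter(None, re.split(r"[ ,:]+", spec.lower())):
        if word == "any":
            continue                  # accepted but has no effect
        elif word in DESTINATIONS:
            destination = word
        elif word in LEVELS:
            level = word
        else:
            module = word
    return module, destination, level
```

For example, `-vconsole:dbg` sets only the console destination to dbg, while `-vvlog:file:info` limits the change to the vlog module's file logging.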
189       -v
190       --verbose
191            Sets  the  maximum  logging  verbosity level, equivalent to --ver‐
192            bose=dbg.
193
194       -vPATTERN:destination:pattern
195       --verbose=PATTERN:destination:pattern
196            Sets the log pattern for destination to pattern. Refer to  ovs-ap‐
197            pctl(8) for a description of the valid syntax for pattern.
198
199       -vFACILITY:facility
200       --verbose=FACILITY:facility
201            Sets  the RFC5424 facility of the log message. facility can be one
202            of kern, user, mail, daemon, auth, syslog, lpr, news, uucp, clock,
203            ftp,  ntp,  audit,  alert, clock2, local0, local1, local2, local3,
204            local4, local5, local6 or local7. If this option is not specified,
205            daemon  is used as the default for the local system syslog and lo‐
206            cal0 is used while sending a message to the  target  provided  via
207            the --syslog-target option.
208
209       --log-file[=file]
210            Enables  logging  to a file. If file is specified, then it is used
211            as the exact name for the log file. The default log file name used
212            if file is omitted is /var/log/ovn/program.log.
213
214       --syslog-target=host:port
215            Send  syslog messages to UDP port on host, in addition to the sys‐
216            tem syslog. The host must be a numerical IP address, not  a  host‐
217            name.
218
219       --syslog-method=method
220            Specify  method  as  how  syslog messages should be sent to syslog
221            daemon. The following forms are supported:
222
223            •      libc, to use the libc syslog() function. The downside
224                   of this option is that libc adds a fixed prefix to every
225                   message before it is actually sent to the syslog daemon
226                   over the /dev/log UNIX domain socket.
227
228            •      unix:file, to use a UNIX domain socket directly. It is
229                   possible to specify an arbitrary message format with this
230                   option. However, rsyslogd 8.9 and older versions use a
231                   hard-coded parser function that limits UNIX domain socket
232                   use. If you want to use an arbitrary message format with
233                   older rsyslogd versions, then use a UDP socket to the
234                   localhost IP address instead.
235
236            •      udp:ip:port, to use a UDP socket. With this method it is
237                   possible to use an arbitrary message format even with
238                   older rsyslogd. When sending syslog messages over a UDP
239                   socket, extra precautions need to be taken: the syslog
240                   daemon needs to be configured to listen on the specified
241                   UDP port, accidental iptables rules could interfere with
242                   local syslog traffic, and there are some security
243                   considerations that apply to UDP sockets but not to UNIX
244                   domain sockets.
245
246            •      null, to discard all messages logged to syslog.
247
248            The  default is taken from the OVS_SYSLOG_METHOD environment vari‐
249            able; if it is unset, the default is libc.
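
The default-selection rule can be sketched as (illustrative Python; syslog_method is a hypothetical helper, not part of ovn-northd):

```python
import os

def syslog_method(cli_value=None, environ=os.environ):
    """Resolve the effective syslog method per the precedence above:
    an explicit --syslog-method value wins, then the OVS_SYSLOG_METHOD
    environment variable, then the built-in default "libc"."""
    if cli_value is not None:
        return cli_value
    return environ.get("OVS_SYSLOG_METHOD", "libc")
```
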
250
251   PKI Options
252       PKI configuration is required in order to use SSL for  the  connections
253       to the Northbound and Southbound databases.
254
255              -p privkey.pem
256              --private-key=privkey.pem
257                   Specifies  a  PEM  file  containing the private key used as
258                   identity for outgoing SSL connections.
259
260              -c cert.pem
261              --certificate=cert.pem
262                   Specifies a PEM file containing a certificate  that  certi‐
263                   fies the private key specified on -p or --private-key to be
264                   trustworthy. The certificate must be signed by the certifi‐
265                   cate  authority  (CA) that the peer in SSL connections will
266                   use to verify it.
267
268              -C cacert.pem
269              --ca-cert=cacert.pem
270                   Specifies a PEM file containing the CA certificate for ver‐
271                   ifying certificates presented to this program by SSL peers.
272                   (This may be the same certificate that  SSL  peers  use  to
273                   verify the certificate specified on -c or --certificate, or
274                   it may be a different one, depending on the PKI  design  in
275                   use.)
276
277              -C none
278              --ca-cert=none
279                   Disables  verification  of  certificates  presented  by SSL
280                   peers. This introduces a security risk,  because  it  means
281                   that  certificates  cannot be verified to be those of known
282                   trusted hosts.
283
284   Other Options
285       --unixctl=socket
286              Sets the name of the control socket on which program listens for
287              runtime  management  commands  (see RUNTIME MANAGEMENT COMMANDS,
288              below). If socket does not begin with /, it  is  interpreted  as
289              relative  to  .  If  --unixctl  is  not used at all, the default
290              socket is /program.pid.ctl, where pid is program’s process ID.
291
292              On Windows a local named pipe is used to listen for runtime man‐
293              agement  commands.  A  file  is  created in the absolute path as
294              pointed by socket or if --unixctl is not used at all, a file  is
295              created  as  program in the configured OVS_RUNDIR directory. The
296              file exists just to mimic the behavior of a Unix domain socket.
297
298              Specifying none for socket disables the control socket feature.
299
300
301
302       -h
303       --help
304            Prints a brief help message to the console.
305
306       -V
307       --version
308            Prints version information to the console.
309

RUNTIME MANAGEMENT COMMANDS

311       ovs-appctl can send commands to a running ovn-northd process. The  cur‐
312       rently supported commands are described below.
313
314              exit   Causes ovn-northd to gracefully terminate.
315
316              pause  Pauses ovn-northd. When paused, ovn-northd continues to
317                     receive changes from the Northbound and Southbound
318                     databases as usual, but it does not send any updates. A
319                     paused ovn-northd also drops database locks, which allows
320                     any other non-paused instance of ovn-northd to take over.
321
322              resume Resumes ovn-northd operation, processing Northbound
323                     and Southbound database contents and generating logical
324                     flows. This also instructs ovn-northd to attempt to
325                     acquire the lock on the SB DB.
326
327              is-paused
328                     Returns "true" if ovn-northd is currently paused, "false"
329                     otherwise.
330
331              status Prints this server’s status: "active" if ovn-northd
332                     has acquired the OVSDB lock on the SB DB, "standby" if
333                     it has not, or "paused" if this instance is paused.
334
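The three states reported by status can be modeled as follows (an illustrative sketch; northd_status is a hypothetical helper, not ovn-northd code):

```python
def northd_status(paused, has_sb_lock):
    """Model the documented "status" output (not ovn-northd code):
    "paused" if this instance is paused, otherwise "active" when it
    holds the OVSDB lock on the SB DB and "standby" when it does not."""
    if paused:
        return "paused"
    return "active" if has_sb_lock else "standby"
```

Note that a paused instance drops its database locks, so "paused" takes precedence regardless of lock state.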
335              sb-cluster-state-reset
336                     Reset  southbound  database cluster status when databases
337                     are destroyed and rebuilt.
338
339                     If all databases in a clustered southbound  database  are
340                     removed from disk, then the stored index of all databases
341                     will be reset to zero. This will cause ovn-northd  to  be
342                     unable  to  read or write to the southbound database, be‐
343                     cause it will always detect the data as stale. In such  a
344                     case,  run this command so that ovn-northd will reset its
345                     local index so that it can interact with  the  southbound
346                     database again.
347
348              nb-cluster-state-reset
349                     Reset  northbound  database cluster status when databases
350                     are destroyed and rebuilt.
351
352                     This performs the same task as sb-cluster-state-reset ex‐
353                     cept for the northbound database client.
354
355              set-n-threads N
356                     Set  the  number  of  threads  used  for building logical
357                     flows. When N is within [2-256], parallelization  is  en‐
358                     abled. When N is 1 parallelization is disabled. When N is
359                     less than 1 or more than 256, an error is returned. If
360                     ovn-northd fails to start parallelization (e.g., it
361                     fails to set up semaphores), parallelization is
362                     disabled and an error is returned.
363
364              get-n-threads
365                     Return  the  number  of threads used for building logical
366                     flows.
367
368       Only ovn-northd-ddlog supports the following commands:
369
370              enable-cpu-profiling
371              disable-cpu-profiling
372                   Enables or disables profiling of CPU time used by the DDlog
373                   engine.  When CPU profiling is enabled, the profile command
374                   (see below) will include DDlog CPU usage statistics in  its
375                   output.  Enabling CPU profiling will slow ovn-northd-ddlog.
376                   Disabling CPU  profiling  does  not  clear  any  previously
377                   recorded statistics.
378
379              profile
380                   Outputs a profile of the current and peak sizes of arrange‐
381                   ments inside DDlog. This profiling data can be  useful  for
382                   optimizing  DDlog code. If CPU profiling was previously en‐
383                   abled (even if it was later disabled), the output also  in‐
384                   cludes  a  CPU time profile. See Profiling inside the tuto‐
385                   rial in the DDlog repository for an introduction to profil‐
386                   ing DDlog.
387

ACTIVE-STANDBY FOR HIGH AVAILABILITY

389       You  may  run ovn-northd more than once in an OVN deployment. When con‐
390       nected to a standalone or clustered DB setup,  OVN  will  automatically
391       ensure that only one of them is active at a time. If multiple instances
392       of ovn-northd are running and the active ovn-northd fails, one  of  the
393       hot standby instances of ovn-northd will automatically take over.
394
395   Active-Standby with multiple OVN DB servers
396       You may run multiple OVN DB servers in an OVN deployment with:
397
398              •      OVN  DB  servers deployed in active/passive mode with one
399                     active and multiple passive ovsdb-servers.
400
401              •      ovn-northd also deployed on all these nodes, using
402                     unix ctl sockets to connect to the local OVN DB servers.
403
404       In such deployments, the ovn-northds on the passive nodes will
405       process the DB changes and compute logical flows that are simply
406       discarded, because write transactions are not allowed by the passive
407       ovsdb-servers. This results in unnecessary CPU usage.
408
409       With the help of the runtime management command pause, you can
410       pause ovn-northd on these nodes. When a passive node becomes master,
411       you can use the runtime management command resume to resume
412       ovn-northd’s processing of the DB changes.
413

LOGICAL FLOW TABLE STRUCTURE

415       One  of the main purposes of ovn-northd is to populate the Logical_Flow
416       table in  the  OVN_Southbound  database.  This  section  describes  how
417       ovn-northd does this for switch and router logical datapaths.
418
419   Logical Switch Datapaths
420     Ingress Table 0: Admission Control and Ingress Port Security check
421
422       Ingress table 0 contains these logical flows:
423
424              •      Priority 100 flows to drop packets with VLAN tags or mul‐
425                     ticast Ethernet source addresses.
426
427              •      For each disabled logical port, a priority 100 flow is
428                     added which matches on all packets and applies the
429                     action REGBIT_PORT_SEC_DROP = 1; next; so that the
430                     packets are dropped in the next stage.
431
432              •      For each (enabled) vtep logical port, a priority 70
433                     flow is added which matches on all packets and applies
434                     the action next(pipeline=ingress,
435                     table=S_SWITCH_IN_L2_LKUP); to skip most stages of the
436                     ingress pipeline and go directly to the ingress L2
437                     lookup table to determine the output port. Packets from
438                     a VTEP (RAMP) switch should not be subjected to any ACL
439                     checks; the egress pipeline will do the ACL checks.
440
441              •      For each enabled logical port configured with a qdisc
442                     queue id in the options:qdisc_queue_id column of
443                     Logical_Switch_Port, a priority 70 flow is added which
444                     matches on all packets and applies the action
445                     set_queue(id); REGBIT_PORT_SEC_DROP =
446                     check_in_port_sec(); next;.
447
448              •      A priority 1 flow is added which matches on all packets
449                     for all the logical ports and applies the action
450                     REGBIT_PORT_SEC_DROP = check_in_port_sec(); next; to
451                     evaluate the port security. The action check_in_port_sec
452                     applies the port security rules defined in the
453                     port_security column of the Logical_Switch_Port table.
454
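Within a logical flow table, the highest-priority flow whose match fits the packet is the one applied. A minimal model of the table 0 flows above (illustrative Python; the match functions, packet dict keys, and port names are invented for the example):

```python
def best_flow(flows, packet):
    """Return the highest-priority flow whose match accepts the packet.

    Each flow is a (priority, match_fn, actions) tuple.  This is only an
    illustrative model of logical-flow selection, not OVN code.
    """
    matching = [f for f in flows if f[1](packet)]
    return max(matching, key=lambda f: f[0], default=None)

# Simplified model of the ingress table 0 flows described above.
disabled_ports = {"lp-disabled"}                  # hypothetical port name
table0 = [
    (100, lambda p: p.get("has_vlan_tag") or p.get("multicast_eth_src"),
     "drop;"),
    (100, lambda p: p["inport"] in disabled_ports,
     "REGBIT_PORT_SEC_DROP = 1; next;"),
    (1, lambda p: True,
     "REGBIT_PORT_SEC_DROP = check_in_port_sec(); next;"),
]
```

An ordinary packet falls through to the priority-1 port-security flow, while VLAN-tagged packets and packets on disabled ports hit the priority-100 flows first.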
455     Ingress Table 1: Ingress Port Security - Apply
456
457       This table drops the packets if the port security check failed in the
458       previous stage, i.e., the register bit REGBIT_PORT_SEC_DROP is set to 1.
459
460       Ingress table 1 contains these logical flows:
461
462              •      A  priority-50 fallback flow that drops the packet if the
463                     register bit REGBIT_PORT_SEC_DROP is set to 1.
464
465              •      One priority-0 fallback flow that matches all packets and
466                     advances to the next table.
467
468     Ingress Table 2: Lookup MAC address learning table
469
470       This table looks up the MAC learning table of the logical switch
471       datapath to check whether the port-MAC pair is present. MACs are
472       learnt only for logical switch VIF ports whose port security is
473       disabled and that have the ’unknown’ address set.
474
475              •      For each such logical port p (port security disabled
476                     and ’unknown’ address set), the following flow is added:
477
478                     •      Priority  100  flow with the match inport == p and
479                            action  reg0[11]  =  lookup_fdb(inport,  eth.src);
480                            next;
481
482              •      One priority-0 fallback flow that matches all packets and
483                     advances to the next table.
484
485     Ingress Table 3: Learn MAC of ’unknown’ ports.
486
487       This table learns the MAC addresses seen on  the  logical  ports  whose
488       port  security  is disabled and ’unknown’ address set if the lookup_fdb
489       action returned false in the previous table.
490
491              •      For each such logical port p (port security disabled
492                     and ’unknown’ address set), the following flow is added:
493
494                     •      Priority  100  flow  with the match inport == p &&
495                            reg0[11] == 0 and action put_fdb(inport, eth.src);
496                            next;  which stores the port-mac in the mac learn‐
497                            ing table of the logical switch datapath  and  ad‐
498                            vances the packet to the next table.
499
500              •      One priority-0 fallback flow that matches all packets and
501                     advances to the next table.
502
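Tables 2 and 3 together implement a lookup-then-learn cycle. A minimal sketch of that cycle (illustrative Python; process_packet and its arguments are hypothetical, not OVN code):

```python
def process_packet(fdb, inport, eth_src, port_sec_disabled, has_unknown):
    """Illustrative model of ingress tables 2 and 3 (not OVN code).

    Table 2 (lookup_fdb) sets reg0[11] to whether the (port, MAC) pair
    is already in the switch's MAC learning table; table 3 (put_fdb)
    learns the pair when the lookup failed.  Only ports with port
    security disabled and the 'unknown' address set participate.
    Returns (new_fdb, reg0_11); reg0_11 is None for other ports.
    """
    if not (port_sec_disabled and has_unknown):
        return fdb, None                   # no learning flows for this port
    reg0_11 = (inport, eth_src) in fdb     # table 2: lookup_fdb(inport, eth.src)
    if not reg0_11:
        fdb = fdb | {(inport, eth_src)}    # table 3: put_fdb(inport, eth.src)
    return fdb, reg0_11
```

The first packet from an unseen MAC fails the lookup and gets learnt; subsequent packets find the pair and skip put_fdb.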
503     Ingress Table 4: from-lport Pre-ACLs
504
505       This table prepares flows  for  possible  stateful  ACL  processing  in
506       ingress  table  ACLs.  It  contains a priority-0 flow that simply moves
507       traffic to the next table. If stateful ACLs are  used  in  the  logical
508       datapath, a priority-100 flow is added that sets a hint (with reg0[0] =
509       1; next;) for table Pre-stateful to send IP packets to  the  connection
510       tracker  before  eventually advancing to ingress table ACLs. If special
511       ports such as route ports or localnet ports can’t use  ct(),  a  prior‐
512       ity-110  flow  is  added  to  skip  over stateful ACLs. Multicast, IPv6
513       Neighbor Discovery and MLD traffic also skips stateful ACLs.  For  "al‐
514       low-stateless"  ACLs,  a  flow  is added to bypass setting the hint for
515       connection tracker processing.
516
517       This table also has a priority-110 flow with the match eth.dst == E
518       for all logical switch datapaths to move traffic to the next table,
519       where E is the service monitor MAC defined in the
520       options:svc_monitor_mac column of the NB_Global table.
521
522     Ingress Table 5: Pre-LB
523
524       This table prepares flows for possible stateful load balancing process‐
525       ing in ingress table LB and Stateful. It  contains  a  priority-0  flow
526       that  simply  moves traffic to the next table. Moreover it contains two
527       priority-110 flows to move multicast, IPv6 Neighbor Discovery  and  MLD
528       traffic  to the next table. If load balancing rules with virtual IP ad‐
529       dresses (and ports) are configured in  OVN_Northbound  database  for  a
530       logical switch datapath, a priority-100 flow is added with the match ip
531       to match on IP packets and sets the action reg0[2] = 1; next; to act as
532       a  hint  for  table  Pre-stateful  to send IP packets to the connection
533       tracker for packet de-fragmentation (and to possibly do  DNAT  for  al‐
534       ready established load balanced traffic) before eventually advancing to
535       ingress table Stateful. If controller_event has been enabled and
536       load balancing rules with empty backends have been added in
537       OVN_Northbound, a priority-130 flow is added to trigger
538       ovn-controller events whenever the chassis receives a packet for that
539       particular VIP. If the event-elb meter has been previously created,
540       it will be associated with the empty_lb logical flow.
541
542       Prior to OVN 20.09 we were setting reg0[0] = 1 only if the IP
543       destination matched the load balancer VIP. However, this had issues in
544       cases where a logical switch doesn’t have any ACLs with allow-related
545       action. To understand the issue, let’s take a TCP load balancer,
546       10.0.0.10:80=10.0.0.3:80. If a logical port p1 with IP 10.0.0.5
547       opens a TCP connection to the VIP 10.0.0.10, then the packet in the
548       ingress pipeline of p1 is sent to p1’s conntrack zone id and the
549       packet is load balanced to the backend 10.0.0.3. The reply packet
550       from the backend lport is not sent to the conntrack of the backend
551       lport’s zone id. This is fine as long as the packet is valid. But
552       suppose the backend lport sends an invalid TCP packet (such as one
553       with an incorrect sequence number); the packet then gets delivered to
554       the lport p1 without being unDNATted back to the VIP 10.0.0.10, and
555       this causes the connection to be reset by the lport p1’s VIF.
556
557       We can’t fix this issue by adding a logical flow to drop ct.inv packets
558       in  the  egress  pipeline  since it will drop all other connections not
559       destined to the load balancers. To fix this  issue,  we  send  all  the
560       packets  to the conntrack in the ingress pipeline if a load balancer is
561       configured. We can now add a lflow to drop ct.inv packets.
562
563       This table also has priority-120 flows that punt all  IGMP/MLD  packets
564       to  ovn-controller  if the switch is an interconnect switch with multi‐
565       cast snooping enabled.
566
567       This table also has a priority-110 flow with the match eth.dst == E
568       for all logical switch datapaths to move traffic to the next table,
569       where E is the service monitor MAC defined in the
570       options:svc_monitor_mac column of the NB_Global table.
571
572       This table also has a priority-110 flow with the match inport == I
573       for all logical switch datapaths to move traffic to the next table,
574       where I is the peer of a logical router port. This flow is added to skip the
575       connection tracking of packets which enter from logical router datapath
576       to logical switch datapath.
577
578     Ingress Table 6: Pre-stateful
579
580       This  table prepares flows for all possible stateful processing in next
581       tables. It contains a priority-0 flow that simply moves traffic to  the
582       next table.
583
584              •      Priority-120 flows that send the packets to the con‐
585                     nection tracker using the ct_lb_mark; action so that the al‐
586                     ready  established  traffic destined to the load balancer
587                     VIP gets DNATted based on a hint provided by the previous
588                     tables  (with  a  match for reg0[2] == 1 and on supported
589                     load balancer protocols and address families).  For  IPv4
590                     traffic  the  flows also load the original destination IP
591                     and transport port in registers reg1 and reg2.  For  IPv6
592                     traffic  the  flows also load the original destination IP
593                     and transport port in registers xxreg1 and reg2.
594
595              •      A priority-110  flow  sends  the  packets  to  connection
596                     tracker  based  on a hint provided by the previous tables
597                     (with a match for reg0[2] == 1) by using the  ct_lb_mark;
598                     action. This flow is added to handle the traffic for load
599                     balancer VIPs whose protocol is not defined  (mainly  for
600                     ICMP traffic).
601
602              •      A  priority-100  flow  sends  the  packets  to connection
603                     tracker based on a hint provided by the  previous  tables
604                     (with a match for reg0[0] == 1) by using the ct_next; ac‐
605                     tion.
606
607     Ingress Table 7: from-lport ACL hints
608
609       This table consists of logical flows that set hints (reg0 bits)  to  be
610       used  in  the next stage, in the ACL processing table, if stateful ACLs
611       or load balancers are configured. Multiple hints can  be  set  for  the
612       same packet. The possible hints are:
613
614              •      reg0[7]: the packet might match an allow-related ACL and
615                     might have to commit the connection to conntrack.
616
617              •      reg0[8]: the packet might match an allow-related ACL but
618                     there  will  be  no need to commit the connection to con‐
619                     ntrack because it already exists.
620
621              •      reg0[9]: the packet might match a drop/reject ACL.
622
623              •      reg0[10]: the packet might match a drop/reject ACL but
624                     the connection was previously allowed so it might have to
625                     be committed again with ct_label=1/1.
626
627       The table contains the following flows:
628
629              •      A priority-65535 flow to advance to the next table if the
630                     logical switch has no ACLs configured, otherwise a prior‐
631                     ity-0 flow to advance to the next table.
632
633              •      A priority-7 flow that matches on packets that initiate a
634                     new  session. This flow sets reg0[7] and reg0[9] and then
635                     advances to the next table.
636
637              •      A priority-6 flow that matches on packets that are in the
638                     request direction of an already existing session that has
639                     been marked  as  blocked.  This  flow  sets  reg0[7]  and
640                     reg0[9] and then advances to the next table.
641
642              •      A  priority-5  flow  that matches untracked packets. This
643                     flow sets reg0[8] and reg0[9] and then  advances  to  the
644                     next table.
645
646              •      A priority-4 flow that matches on packets that are in the
647                     request direction of an already existing session that has
648                     not  been  marked  as blocked. This flow sets reg0[8] and
649                     reg0[10] and then advances to the next table.
650
651              •      A priority-3 flow that matches on packets that are not
652                     part of established sessions. This flow sets reg0[9] and
653                     then advances to the next table.
654
655              •      A priority-2 flow that matches on packets that  are  part
656                     of  an  established  session  that  has  been  marked  as
657                     blocked. This flow sets reg0[9] and then advances to  the
658                     next table.
659
660              •      A  priority-1  flow that matches on packets that are part
661                     of an established session that has  not  been  marked  as
662                     blocked. This flow sets reg0[10] and then advances to the
663                     next table.
664
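       As an illustrative sketch (not taken from this manual), a stateful
       ACL that exercises these hint flows can be created with ovn-nbctl,
       and the resulting ls_in_acl_hint flows inspected in the southbound
       database; the switch sw0 and port p1 below are example names:

              # Allow-related HTTP traffic from p1 on example switch sw0.
              ovn-nbctl acl-add sw0 from-lport 1002 \
                  'inport == "p1" && tcp.dst == 80' allow-related

              # List the hint flows ovn-northd generated for sw0.
              ovn-sbctl lflow-list sw0 | grep ls_in_acl_hint
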
665     Ingress table 8: from-lport ACLs before LB
666
667       Logical flows in this table closely reproduce those in the ACL table in
668       the  OVN_Northbound  database  for the from-lport direction without the
669       option apply-after-lb set or set to false. The priority values from the
670       ACL  table  have  a  limited range and have 1000 added to them to leave
671       room for OVN default flows at both higher and lower priorities.
672
673              •      allow ACLs translate into logical flows with the next;
674                     action.  If there are any stateful ACLs on this datapath,
675                     then allow ACLs translate to ct_commit; next; (which acts
676                     as a hint for the next tables to commit the connection to
677                     conntrack). In case the ACL has  a  label  then  reg3  is
678                     loaded  with the label value and reg0[13] bit is set to 1
679                     (which acts as a hint for the next tables to  commit  the
680                     label to conntrack).
681
682              •      allow-related ACLs translate into logical flows with the
683                     ct_commit(ct_label=0/1); next; actions  for  new  connec‐
684                     tions and reg0[1] = 1; next; for existing connections. In
685                     case the ACL has a label then reg3 is loaded with the la‐
686                     bel  value  and reg0[13] bit is set to 1 (which acts as a
687                     hint for the next tables to  commit  the  label  to  con‐
688                     ntrack).
689
690              •      allow-stateless ACLs translate into logical flows with
691                     the next; action.
692
693              •      reject ACLs translate into logical flows with the tcp_re‐
694                     set { output <-> inport; next(pipeline=egress,table=5);}
695                     action for TCP connections, the icmp4/icmp6 action for
696                     UDP connections, and the sctp_abort { output <-> inport;
697                     next(pipeline=egress,table=5);} action for SCTP associa‐
698                     tions.
699
700              •      Other ACLs translate to drop; for new or untracked con‐
701                     nections and ct_commit(ct_label=1/1); for  known  connec‐
702                     tions.  Setting  ct_label  marks a connection as one that
703                     was previously allowed, but should no longer  be  allowed
704                     due to a policy change.
705
706       This table contains a priority-65535 flow to advance to the next table
707       if the logical switch has no ACLs configured; otherwise it contains a
708       priority-0 flow to advance to the next table so that ACLs allow packets
709       by default if the options:default_acl_drop column of NB_Global is false
710       or not set. Otherwise the flow action is set to drop; to implement a
711       default drop behavior.
712
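       The default ACL behavior described above can be switched from the
       northbound database; for example (a sketch; "." selects the single
       NB_Global row):

              # Make ACLs drop packets by default instead of allowing them.
              ovn-nbctl set NB_Global . options:default_acl_drop=true
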
713       If the logical datapath has a stateful ACL or a load balancer with  VIP
714       configured, the following flows will also be added:
715
716              •      If the options:default_acl_drop column of NB_Global is
717                     false or not set, a priority-1 flow that sets the hint
718                     to commit IP traffic that is not part of established sessions
719                     to the connection  tracker  (with  action  reg0[1]  =  1;
720                     next;).  This  is needed for the default allow policy be‐
721                     cause, while the initiator’s direction may not  have  any
722                     stateful  rules,  the  server’s  may  and then its return
723                     traffic would not be known and marked as invalid.
724
725              •      If the options:default_acl_drop column of NB_Global is true, a
726                     priority-1 flow that drops IP traffic that is not part of
727                     established sessions.
728
729              •      A priority-1 flow that sets the hint to commit IP traffic
730                     to  the  connection  tracker  (with  action  reg0[1] = 1;
731                     next;). This is needed for the default allow  policy  be‐
732                     cause,  while  the initiator’s direction may not have any
733                     stateful rules, the server’s  may  and  then  its  return
734                     traffic would not be known and marked as invalid.
735
736              •      A  priority-65532 flow that allows any traffic in the re‐
737                     ply direction for a connection that has been committed to
738                     the connection tracker (i.e., established flows), as long
739                     as the committed flow does not have ct_mark.blocked  set.
740                     We  only  handle  traffic in the reply direction here be‐
741                     cause we want all packets going in the request  direction
742                     to  still  go  through  the flows that implement the cur‐
743                     rently defined policy based on ACLs. If a  connection  is
744                     no longer allowed by policy, ct_mark.blocked will get set
745                     and packets in the reply direction will no longer be  al‐
746                     lowed,  either.  This  flow also clears the register bits
747                     reg0[9] and reg0[10]. If ACL logging and logging  of  re‐
748                     lated packets is enabled, then a companion priority-65533
749                     flow will be installed that accomplishes the  same  thing
750                     but also logs the traffic.
751
752              •      A  priority-65532  flow  that  allows any traffic that is
753                     considered related to a committed flow in the  connection
754                     tracker  (e.g.,  an ICMP Port Unreachable from a non-lis‐
755                     tening UDP port), as long as the committed flow does  not
756                     have  ct_mark.blocked  set. If ACL logging and logging of
757                     related packets  is  enabled,  then  a  companion  prior‐
758                     ity-65533  flow  will  be installed that accomplishes the
759                     same thing but also logs the traffic.
760
761              •      A priority-65532 flow that drops all  traffic  marked  by
762                     the connection tracker as invalid.
763
764              •      A priority-65532 flow that drops all traffic in the reply
765                     direction with ct_mark.blocked set meaning that the  con‐
766                     nection  should  no  longer  be  allowed  due to a policy
767                     change. Packets in the request direction are skipped here
768                     to let a newly created ACL re-allow this connection.
769
770              •      A priority-65532 flow that allows IPv6 Neighbor solici‐
771                     tation, Neighbor discovery, Router solicitation, Router
772                     advertisement and MLD packets.
773
774       If the logical datapath has any ACL or a load balancer with VIP config‐
775       ured, the following flow will also be added:
776
777              •      A priority-34000 logical flow is added for each logical
778                     switch datapath with the match eth.dst == E to allow the
779                     service monitor reply packet destined to ovn-controller
780                     with the action next;, where E is the service monitor MAC
781                     defined in the options:svc_monitor_mac column of the
782                     NB_Global table.
783
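       The service monitor MAC referenced above is configured in the north‐
       bound database; for example (a sketch; the MAC address below is only
       an illustration):

              # Use a locally administered MAC for service monitor packets.
              ovn-nbctl set NB_Global . \
                  'options:svc_monitor_mac="0a:00:00:a8:01:01"'
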
784     Ingress Table 9: from-lport QoS Marking
785
786       Logical  flows  in  this table closely reproduce those in the QoS table
787       with the action column set  in  the  OVN_Northbound  database  for  the
788       from-lport direction.
789
790              •      For  every  qos_rules entry in a logical switch with DSCP
791                     marking enabled, a flow will be  added  at  the  priority
792                     mentioned in the QoS table.
793
794              •      One priority-0 fallback flow that matches all packets and
795                     advances to the next table.
796
797     Ingress Table 10: from-lport QoS Meter
798
799       Logical flows in this table closely reproduce those in  the  QoS  table
800       with  the  bandwidth  column set in the OVN_Northbound database for the
801       from-lport direction.
802
803              •      For every qos_rules entry in a logical switch with meter‐
804                     ing  enabled,  a  flow will be added at the priority men‐
805                     tioned in the QoS table.
806
807              •      One priority-0 fallback flow that matches all packets and
808                     advances to the next table.
809
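       As a sketch, QoS rules for the two tables above can be created with
       ovn-nbctl qos-add; the switch name and match below are examples:

              # DSCP marking rule (QoS Marking table).
              ovn-nbctl qos-add sw0 from-lport 300 'ip4.src == 10.0.0.3' dscp=48

              # Rate limiting rule (QoS Meter table); rate and burst are
              # in kbps.
              ovn-nbctl qos-add sw0 from-lport 200 'ip4.src == 10.0.0.3' \
                  rate=102400 burst=10240
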
810     Ingress Table 11: LB
811
812              •      For  all the configured load balancing rules for a switch
813                     in the OVN_Northbound database that includes an L4 port
814                     PORT of protocol P and IP address VIP, a priority-120
815                     flow is added. For IPv4 VIPs, the flow matches ct.new &&
816                     ip && ip4.dst == VIP && P && P.dst == PORT. For IPv6
817                     VIPs, the flow matches ct.new && ip && ip6.dst == VIP &&
818                     P && P.dst == PORT. The flow’s action is ct_lb_mark(args), where
819                     args contains comma separated IP addresses (and  optional
820                     port  numbers)  to load balance to. The address family of
821                     the IP addresses of args is the same as the address  fam‐
822                     ily  of  VIP.  If health check is enabled, then args will
823                     only contain those endpoints whose service monitor status
824                     entry in OVN_Southbound db is either online or empty. For
825                     IPv4 traffic the flow also loads the original destination
826                     IP  and  transport  port  in registers reg1 and reg2. For
827                     IPv6 traffic the flow also loads the original destination
828                     IP and transport port in registers xxreg1 and reg2.
829
830              •      For  all the configured load balancing rules for a switch
831                     in OVN_Northbound database that includes just an  IP  ad‐
832                     dress  VIP to match on, OVN adds a priority-110 flow. For
833                     IPv4 VIPs, the flow matches ct.new && ip  &&  ip4.dst  ==
834                     VIP.  For  IPv6  VIPs,  the  flow matches ct.new && ip &&
835                     ip6.dst  ==   VIP.   The   action   on   this   flow   is
836                     ct_lb_mark(args),  where args contains comma separated IP
837                     addresses of the same address family  as  VIP.  For  IPv4
838                     traffic  the  flow also loads the original destination IP
839                     and transport port in registers reg1 and reg2.  For  IPv6
840                     traffic  the  flow also loads the original destination IP
841                     and transport port in registers xxreg1 and reg2.
842
843              •      If the load balancer is created with the --reject op‐
844                     tion and it has no active backends, a TCP reset segment
845                     (for TCP) or an ICMP port unreachable packet (for all
846                     other kinds of traffic) will be sent whenever an incom‐
847                     ing packet is received for this load balancer. Note that
848                     using the --reject option disables the empty_lb SB con‐
849                     troller event for this load balancer.
850
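       For reference, a load balancer of this kind can be created and at‐
       tached to a switch as follows (a sketch; names and addresses are
       examples):

              # VIP 10.0.0.10:80 balanced across two TCP backends.
              ovn-nbctl lb-add lb0 10.0.0.10:80 10.0.0.2:80,10.0.0.3:80 tcp
              ovn-nbctl ls-lb-add sw0 lb0

              # With --reject, packets to a VIP with no active backends
              # get a TCP RST / ICMP unreachable instead of an empty_lb
              # event.
              ovn-nbctl --reject lb-add lb1 10.0.0.20:80 10.0.0.4:80 tcp
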
851     Ingress table 12: from-lport ACLs after LB
852
853       Logical flows in this table closely reproduce those in the ACL table in
854       the  OVN_Northbound  database for the from-lport direction with the op‐
855       tion apply-after-lb set to true. The priority values from the ACL table
856       have  a limited range and have 1000 added to them to leave room for OVN
857       default flows at both higher and lower priorities.
858
859              •      allow apply-after-lb ACLs translate into logical flows
860                     with  the  next;  action.  If there are any stateful ACLs
861                     (including both before-lb  and  after-lb  ACLs)  on  this
862                     datapath,  then  allow ACLs translate to ct_commit; next;
863                     (which acts as a hint for the next tables to  commit  the
864                     connection  to  conntrack).  In  case the ACL has a label
865                     then reg3 is loaded with the label value and reg0[13] bit
866                     is  set to 1 (which acts as a hint for the next tables to
867                     commit the label to conntrack).
868
869              •      allow-related apply-after-lb ACLs translate into logical
870                     flows with the ct_commit(ct_label=0/1); next; actions for
871                     new connections and reg0[1] = 1; next; for existing  con‐
872                     nections. In case the ACL has a label then reg3 is loaded
873                     with the label value and reg0[13] bit is set to 1  (which
874                     acts as a hint for the next tables to commit the label to
875                     conntrack).
876
877              •      allow-stateless apply-after-lb ACLs translate into logi‐
878                     cal flows with the next; action.
879
880              •      reject apply-after-lb ACLs translate into logical flows
881                     with the tcp_reset { output <-> inport; next(pipe‐
882                     line=egress,table=5);} action for TCP connections, the
883                     icmp4/icmp6 action for UDP connections, and the
884                     sctp_abort { output <-> inport; next(pipe‐
885                     line=egress,table=5);} action for SCTP associations.
886
887              •      Other apply-after-lb ACLs translate to drop; for new or
888                     untracked  connections  and  ct_commit(ct_label=1/1); for
889                     known connections. Setting ct_label marks a connection as
890                     one  that was previously allowed, but should no longer be
891                     allowed due to a policy change.
892
893              •      One priority-0 fallback flow that matches all packets and
894                     advances to the next table.
895
896     Ingress Table 13: Stateful
897
898              •      A priority-100 flow is added which commits the packet to
899                     the conntrack and sets the most  significant  32-bits  of
900                     ct_label  with  the reg3 value based on the hint provided
901                     by previous tables (with a match  for  reg0[1]  ==  1  &&
902                     reg0[13]  ==  1).  This is used by the ACLs with label to
903                     commit the label value to conntrack.
904
905              •      For ACLs without label, a second priority-100  flow  com‐
906                     mits packets to connection tracker using ct_commit; next;
907                     action based on a hint provided by  the  previous  tables
908                     (with a match for reg0[1] == 1 && reg0[13] == 0).
909
910              •      A  priority-0  flow that simply moves traffic to the next
911                     table.
912
913     Ingress Table 14: Pre-Hairpin
914
915              •      If the logical switch has  load  balancer(s)  configured,
916                     then  a  priority-100  flow is added with the match ip &&
917                     ct.trk to check if the packet needs to be hairpinned  (if
918                     after  load  balancing  the  destination  IP  matches the
919                     source IP) or not by  executing  the  actions  reg0[6]  =
920                     chk_lb_hairpin();  and reg0[12] = chk_lb_hairpin_reply();
921                     and advances the packet to the next table.
922
923              •      A priority-0 flow that simply moves traffic to  the  next
924                     table.
925
926     Ingress Table 15: Nat-Hairpin
927
928              •      If  the  logical  switch has load balancer(s) configured,
929                     then a priority-100 flow is added with the  match  ip  &&
930                     ct.new && ct.trk && reg0[6] == 1 which hairpins the traf‐
931                     fic by NATting source IP to the load balancer VIP by exe‐
932                     cuting  the action ct_snat_to_vip and advances the packet
933                     to the next table.
934
935              •      If the logical switch has  load  balancer(s)  configured,
936                     then  a  priority-100  flow is added with the match ip &&
937                     ct.est && ct.trk && reg0[6] == 1 which hairpins the traf‐
938                     fic by NATting source IP to the load balancer VIP by exe‐
939                     cuting the action ct_snat and advances the packet to  the
940                     next table.
941
942              •      If  the  logical  switch has load balancer(s) configured,
943                     then a priority-90 flow is added with  the  match  ip  &&
944                     reg0[12]  == 1 which matches on the replies of hairpinned
945                     traffic (i.e., destination IP is VIP, source  IP  is  the
946                     backend IP and source L4 port is backend port for L4 load
947                     balancers) and executes ct_snat and advances  the  packet
948                     to the next table.
949
950              •      A  priority-0  flow that simply moves traffic to the next
951                     table.
952
953     Ingress Table 16: Hairpin
954
955              •      For each distributed gateway router port RP attached to
956                     the logical switch, a priority-2000 flow is added with
957                     the match reg0[14] == 1 && is_chassis_resident(RP) and
958                     action next; to pass the traffic to the next table to
959                     respond to the ARP requests for the router port IPs.
960
961                     The reg0[14] register bit is set in the ingress L2 port
962                     security check table for traffic received from HW VTEP
963                     (ramp) ports.
964
965              •      A  priority-1000  flow  that matches on reg0[14] register
966                     bit for the traffic received from HW VTEP  (ramp)  ports.
967                     This traffic is passed to ingress table ls_in_l2_lkup.
968
969              •      A  priority-1  flow that hairpins traffic matched by non-
970                     default flows in the Pre-Hairpin table. Hairpinning is
971                     done at L2: Ethernet addresses are swapped and the pack‐
972                     ets are looped back on the input port.
973
974              •      A priority-0 flow that simply moves traffic to  the  next
975                     table.
976
977     Ingress Table 17: ARP/ND responder
978
979       This  table  implements  ARP/ND responder in a logical switch for known
980       IPs. The advantage of the ARP responder flow is to limit ARP broadcasts
981       by locally responding to ARP requests without the need to send to other
982       hypervisors. One common case is when the inport is a logical port asso‐
983       ciated with a VIF and the broadcast is responded to on the local hyper‐
984       visor rather than broadcast across the whole network and responded to
985       by the destination VM. This behavior is known as proxy ARP.
986
987       ARP requests arrive from VMs on a logical switch inport of type de‐
988       fault. For this case, the logical switch proxy ARP rules can be for
989       other  VMs  or logical router ports. Logical switch proxy ARP rules may
990       be programmed both for mac binding of IP  addresses  on  other  logical
991       switch  VIF  ports  (which are of the default logical switch port type,
992       representing connectivity to VMs or containers), and for mac binding of
993       IP  addresses  on  logical switch router type ports, representing their
994       logical router port peers. In order to support proxy  ARP  for  logical
995       router  ports,  an  IP address must be configured on the logical switch
996       router type port, with the same value as the peer logical router  port.
997       The configured MAC addresses must match as well. When a VM sends an ARP
998       request for a distributed logical router port and if  the  peer  router
999       type  port  of  the attached logical switch does not have an IP address
1000       configured, the ARP request will be broadcast on  the  logical  switch.
1001       One of the copies of the ARP request will go through the logical switch
1002       router type port to the logical  router  datapath,  where  the  logical
1003       router  ARP  responder will generate a reply. The MAC binding of a dis‐
1004       tributed logical router, once learned by an associated VM, is used  for
1005       all  that VM’s communication needing routing. Hence, the action of a VM
1006       re-arping for the mac binding of the  logical  router  port  should  be
1007       rare.
1008
1009       Logical  switch  ARP responder proxy ARP rules can also be hit when re‐
1010       ceiving ARP requests externally on a L2 gateway port. In this case, the
1011       hypervisor  acting as an L2 gateway, responds to the ARP request on be‐
1012       half of a destination VM.
1013
1014       Note that ARP requests received from localnet logical inports can ei‐
1015       ther go directly to VMs, in which case the VM responds, or can hit an
1016       ARP responder for a logical router port if the packet is used to re‐
1017       solve a logical router port next hop address. In either case, logical
1018       switch ARP responder rules will not be hit. This table contains these
1019       logical flows:
1020
1021              •      Priority-100 flows to skip the ARP responder if the in‐
1022                     port is of type localnet; these advance directly to the
1023                     next table. ARP requests sent to localnet ports can be
1024                     received by multiple hypervisors. Because the same mac
1025                     binding rules are downloaded to all hypervisors, each of
1026                     them would respond, which would confuse L2 learning on
1027                     the source of the ARP requests. ARP requests received
1028                     on an inport of type router are not expected to hit any
1029                     logical switch ARP responder flows. However, no skip
1030                     flows are installed for these packets, as there would be
1031                     some additional flow cost for this and the value appears
1032                     limited.
1033
1034              •      If inport V is of type virtual, a priority-100 logi‐
1035                     cal flow is added for each P configured in the op‐
1036                     tions:virtual-parents column with the match
1037
1038                     inport == P && ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP))
1039                     inport == P && ((nd_ns && ip6.dst == {VIP, NS_MULTICAST_ADDR} && nd.target == VIP) || (nd_na && nd.target == VIP))
1040
1041
1042                     and applies the action
1043
1044                     bind_vport(V, inport);
1045
1046
1047                     and advances the packet to the next table.
1048
1049                     Here VIP is the virtual IP configured in the column op‐
1050                     tions:virtual-ip and NS_MULTICAST_ADDR is the solicited-
1051                     node multicast address corresponding to the VIP.
1052
1053              •      Priority-50 flows that match ARP requests to  each  known
1054                     IP  address  A  of every logical switch port, and respond
1055                     with ARP replies directly with corresponding Ethernet ad‐
1056                     dress E:
1057
1058                     eth.dst = eth.src;
1059                     eth.src = E;
1060                     arp.op = 2; /* ARP reply. */
1061                     arp.tha = arp.sha;
1062                     arp.sha = E;
1063                     arp.tpa = arp.spa;
1064                     arp.spa = A;
1065                     outport = inport;
1066                     flags.loopback = 1;
1067                     output;
1068
1069
1070                     These  flows  are  omitted  for logical ports (other than
1071                     router ports or localport ports) that  are  down  (unless
1072                     ignore_lsp_down  is  configured as true in options column
1073                     of NB_Global table of the Northbound database), for logi‐
1074                     cal  ports  of  type virtual, for logical ports with ’un‐
1075                     known’ address set and for logical  ports  of  a  logical
1076                     switch configured with other_config:vlan-passthru=true.
1077
1078                     The  above  ARP responder flows are added for the list of
1079                     IPv4 addresses if defined in options:arp_proxy column  of
1080                     Logical_Switch_Port  table  for  logical  switch ports of
1081                     type router.
1082
1083              •      Priority-50 flows that match IPv6 ND  neighbor  solicita‐
1084                     tions  to each known IP address A (and A’s solicited node
1085                     address) of every logical  switch  port  except  of  type
1086                     router, and respond with neighbor advertisements directly
1087                     with corresponding Ethernet address E:
1088
1089                     nd_na {
1090                         eth.src = E;
1091                         ip6.src = A;
1092                         nd.target = A;
1093                         nd.tll = E;
1094                         outport = inport;
1095                         flags.loopback = 1;
1096                         output;
1097                     };
1098
1099
1100                     Priority-50 flows that match IPv6 ND  neighbor  solicita‐
1101                     tions  to each known IP address A (and A’s solicited node
1102                     address) of logical switch port of type router,  and  re‐
1103                     spond  with  neighbor advertisements directly with corre‐
1104                     sponding Ethernet address E:
1105
1106                     nd_na_router {
1107                         eth.src = E;
1108                         ip6.src = A;
1109                         nd.target = A;
1110                         nd.tll = E;
1111                         outport = inport;
1112                         flags.loopback = 1;
1113                         output;
1114                     };
1115
1116
1117                     These flows are omitted for logical ports (other than
1118                     router ports or localport ports) that are down (unless
1119                     ignore_lsp_down is configured as true in the options
1120                     column of the NB_Global table of the Northbound
1121                     database), for logical ports of type virtual, and for
1122                     logical ports with ’unknown’ address set.
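       The "solicited node address" of A referred to above is the IPv6
       solicited-node multicast group defined by RFC 4291, section 2.7.1:
       ff02::1:ff00:0/104 combined with the low-order 24 bits of the unicast
       address. As an illustration (not part of OVN itself), the mapping can
       be sketched in Python:

```python
import ipaddress

def solicited_node_addr(ip: str) -> str:
    """Compute the solicited-node multicast address for a unicast IPv6
    address (RFC 4291, section 2.7.1): ff02::1:ff00:0/104 plus the
    low-order 24 bits of the unicast address."""
    low24 = int(ipaddress.IPv6Address(ip)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return str(ipaddress.IPv6Address(base | low24))

# A neighbor solicitation for fe80::5054:ff:fe12:3456 targets:
print(solicited_node_addr("fe80::5054:ff:fe12:3456"))  # ff02::1:ff12:3456
```

       The flows above therefore match both A itself and this derived
       multicast address as the solicitation's destination.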
1123
1124              •      Priority-100  flows  with match criteria like the ARP and
1125                     ND flows above, except that they only match packets  from
1126                     the  inport  that owns the IP addresses in question, with
1127                     action next;. These flows prevent OVN from  replying  to,
1128                     for  example,  an ARP request emitted by a VM for its own
1129                     IP address. A VM only makes this kind of request  to  at‐
1130                     tempt  to  detect  a  duplicate IP address assignment, so
1131                     sending a reply will prevent the VM from accepting the IP
1132                     address that it owns.
1133
1134                     In  place  of  next;, it would be reasonable to use drop;
1135                     for the flows’ actions. If everything is working as it is
1136                     configured,  then  this would produce equivalent results,
1137                     since no host should reply to the request. But ARPing for
1138                     one’s  own  IP  address  is intended to detect situations
1139                     where the network is not working as configured, so  drop‐
1140                     ping the request would frustrate that intent.
1141
1142              •      For each SVC_MON_SRC_IP defined in the value of the
1143                     ip_port_mappings:ENDPOINT_IP column of the Load_Balancer
1144                     table, a priority-110 logical flow is added with the
1145                     match arp.tpa == SVC_MON_SRC_IP && arp.op == 1 and
1146                     applies the action
1147
1148                     eth.dst = eth.src;
1149                     eth.src = E;
1150                     arp.op = 2; /* ARP reply. */
1151                     arp.tha = arp.sha;
1152                     arp.sha = E;
1153                     arp.tpa = arp.spa;
1154                     arp.spa = A;
1155                     outport = inport;
1156                     flags.loopback = 1;
1157                     output;
1158
1159
1160                     where  E is the service monitor source mac defined in the
1161                     options:svc_monitor_mac column in  the  NB_Global  table.
1162                     This mac is used as the source mac in the service monitor
1163                     packets for the load balancer endpoint IP health checks.
1164
1165                     SVC_MON_SRC_IP is used as the source ip  in  the  service
1166                     monitor  IPv4  packets  for the load balancer endpoint IP
1167                     health checks.
1168
1169                     These flows allow OVN to respond to an ARP request
1170                     sent for the IP SVC_MON_SRC_IP.
1171
1172              •      For each VIP configured in the Forwarding_Group table,
1173                     a priority-50 logical flow is added with the match
1174                     arp.tpa == vip && arp.op == 1 and applies the action
1176
1177                     eth.dst = eth.src;
1178                     eth.src = E;
1179                     arp.op = 2; /* ARP reply. */
1180                     arp.tha = arp.sha;
1181                     arp.sha = E;
1182                     arp.tpa = arp.spa;
1183                     arp.spa = A;
1184                     outport = inport;
1185                     flags.loopback = 1;
1186                     output;
1187
1188
1189                     where E is the forwarding group’s mac defined in its
1190                     vmac column.
1191
1192                     A is used as either the destination ip for load balancing
1193                     traffic  to child ports or as nexthop to hosts behind the
1194                     child ports.
1195
1196                     These flows allow OVN to respond to an ARP request
1197                     sent for the IP vip.
1198
1199              •      One priority-0 fallback flow that matches all packets and
1200                     advances to the next table.
1201
1202     Ingress Table 18: DHCP option processing
1203
1204       This table adds the DHCPv4 options to a DHCPv4 packet from the  logical
1205       ports  configured  with  IPv4 address(es) and DHCPv4 options, and simi‐
1206       larly for DHCPv6 options. This table also adds flows  for  the  logical
1207       ports of type external.
1208
1209              •      A  priority-100  logical  flow is added for these logical
1210                     ports which matches the IPv4 packet with udp.src = 68 and
1211                     udp.dst = 67 and applies the action put_dhcp_opts and ad‐
1212                     vances the packet to the next table.
1213
1214                     reg0[3] = put_dhcp_opts(offer_ip = ip, options...);
1215                     next;
1216
1217
1218                     For DHCPDISCOVER and  DHCPREQUEST,  this  transforms  the
1219                     packet  into  a DHCP reply, adds the DHCP offer IP ip and
1220                     options to the packet, and stores  1  into  reg0[3].  For
1221                     other  kinds  of  packets, it just stores 0 into reg0[3].
1222                     Either way, it continues to the next table.
1223
1224              •      A priority-100 logical flow is added  for  these  logical
1225                     ports  which  matches  the IPv6 packet with udp.src = 546
1226                     and udp.dst = 547 and applies the action  put_dhcpv6_opts
1227                     and advances the packet to the next table.
1228
1229                     reg0[3] = put_dhcpv6_opts(ia_addr = ip, options...);
1230                     next;
1231
1232
1233                     For  DHCPv6  Solicit/Request/Confirm packets, this trans‐
1234                     forms the packet into a DHCPv6 Advertise/Reply, adds  the
1235                     DHCPv6  offer IP ip and options to the packet, and stores
1236                     1 into reg0[3]. For  other  kinds  of  packets,  it  just
1237                     stores  0  into  reg0[3]. Either way, it continues to the
1238                     next table.
1239
1240              •      A priority-0 flow that matches all packets and advances
1241                     to the next table.
1242
1243     Ingress Table 19: DHCP responses
1244
1245       This  table implements DHCP responder for the DHCP replies generated by
1246       the previous table.
1247
1248              •      A priority 100 logical flow  is  added  for  the  logical
1249                     ports  configured  with DHCPv4 options which matches IPv4
1250                     packets with udp.src == 68 && udp.dst == 67 && reg0[3] ==
1251                     1  and  responds  back to the inport after applying these
1252                     actions. If reg0[3] is set to 1, it means that the action
1253                     put_dhcp_opts was successful.
1254
1255                     eth.dst = eth.src;
1256                     eth.src = E;
1257                     ip4.src = S;
1258                     udp.src = 67;
1259                     udp.dst = 68;
1260                     outport = P;
1261                     flags.loopback = 1;
1262                     output;
1263
1264
1265                     where  E  is  the  server MAC address and S is the server
1266                     IPv4 address defined in the  DHCPv4  options.  Note  that
1267                     ip4.dst field is handled by put_dhcp_opts.
1268
1269                     (This  terminates  ingress  packet processing; the packet
1270                     does not go to the next ingress table.)
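       The interplay between this flow and the previous table can be modeled
       as follows. This is an illustrative sketch, not OVN code: reg0[3] is
       the success bit stored by put_dhcp_opts, and the server MAC E and
       server IP S are placeholder values.

```python
def dhcpv4_response(pkt, reg0_3, e, s):
    """Model of the priority-100 DHCPv4 response flow: only packets for
    which put_dhcp_opts succeeded (reg0[3] == 1) are turned around.
    E is the server MAC and S the server IPv4 address from the DHCPv4
    options (placeholder values here)."""
    if not (pkt["udp_src"] == 68 and pkt["udp_dst"] == 67 and reg0_3 == 1):
        return None  # no match; the packet continues down the pipeline
    out = dict(pkt)
    out["eth_dst"] = pkt["eth_src"]       # reply to the requester's MAC
    out["eth_src"] = e
    out["ip4_src"] = s                    # ip4.dst was set by put_dhcp_opts
    out["udp_src"], out["udp_dst"] = 67, 68   # server -> client ports
    out["outport"] = pkt["inport"]        # hairpin back to the requester
    return out
```

       The DHCPv6 flow follows the same shape with ports 547/546 and the
       IPv6 server address.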
1271
1272              •      A priority 100 logical flow  is  added  for  the  logical
1273                     ports  configured  with DHCPv6 options which matches IPv6
1274                     packets with udp.src == 546 && udp.dst == 547 &&  reg0[3]
1275                     == 1 and responds back to the inport after applying these
1276                     actions. If reg0[3] is set to 1, it means that the action
1277                     put_dhcpv6_opts was successful.
1278
1279                     eth.dst = eth.src;
1280                     eth.src = E;
1281                     ip6.dst = A;
1282                     ip6.src = S;
1283                     udp.src = 547;
1284                     udp.dst = 546;
1285                     outport = P;
1286                     flags.loopback = 1;
1287                     output;
1288
1289
1290                     where  E  is  the  server MAC address and S is the server
1291                     IPv6 LLA address generated from the server_id defined  in
1292                     the  DHCPv6  options and A is the IPv6 address defined in
1293                     the logical port’s addresses column.
1294
1295                     (This terminates packet processing; the packet  does  not
1296                     go on the next ingress table.)
1297
1298              •      A priority-0 flow that matches all packets and advances
1299                     to the next table.
1300
1301     Ingress Table 20: DNS Lookup
1302
1303       This table looks up and resolves the DNS  names  to  the  corresponding
1304       configured IP address(es).
1305
1306              •      A priority-100 logical flow for each logical switch data‐
1307                     path if it is configured with DNS records, which  matches
1308                     the  IPv4  and IPv6 packets with udp.dst = 53 and applies
1309                     the action dns_lookup and advances the packet to the next
1310                     table.
1311
1312                     reg0[4] = dns_lookup(); next;
1313
1314
1315                     For  valid DNS packets, this transforms the packet into a
1316                     DNS reply if the DNS name can be resolved, and  stores  1
1317                     into reg0[4]. For failed DNS resolution or other kinds of
1318                     packets, it just stores 0 into reg0[4].  Either  way,  it
1319                     continues to the next table.
1320
1321     Ingress Table 21: DNS Responses
1322
1323       This  table  implements  DNS responder for the DNS replies generated by
1324       the previous table.
1325
1326              •      A priority-100 logical flow for each logical switch data‐
1327                     path  if it is configured with DNS records, which matches
1328                     the IPv4 and IPv6 packets with udp.dst = 53 && reg0[4] ==
1329                     1  and  responds  back to the inport after applying these
1330                     actions. If reg0[4] is set to 1, it means that the action
1331                     dns_lookup was successful.
1332
1333                     eth.dst <-> eth.src;
1334                     ip4.src <-> ip4.dst;
1335                     udp.dst = udp.src;
1336                     udp.src = 53;
1337                     outport = P;
1338                     flags.loopback = 1;
1339                     output;
1340
1341
1342                     (This  terminates  ingress  packet processing; the packet
1343                     does not go to the next ingress table.)
1344
1345     Ingress Table 22: External ports
1346
1347       Traffic from external logical ports enters the ingress datapath
1348       pipeline via the localnet port. This table adds the following
1349       logical flows to handle traffic from these ports.
1350
1351              •      A priority-100 flow is added for  each  external  logical
1352                     port  which  doesn’t  reside  on  a  chassis  to drop the
1353                     ARP/IPv6 NS request to the router IP(s) (of  the  logical
1354                     switch) which matches on the inport of the external logi‐
1355                     cal port and the valid eth.src address(es) of the  exter‐
1356                     nal logical port.
1357
1358                     This flow guarantees that an ARP/NS request to the
1359                     router IP address from an external port is responded to
1360                     only by the chassis that has claimed that external port.
1361                     All other chassis drop these packets.
1362
1363                     A priority-100 flow is added for  each  external  logical
1364                     port which doesn’t reside on a chassis to drop any packet
1365                     destined to the router mac - with the match inport == ex‐
1366                     ternal  &&  eth.src  ==  E  &&  eth.dst == R && !is_chas‐
1367                     sis_resident("external") where E is the external port mac
1368                     and R is the router port mac.
1369
1370              •      A priority-0 flow that matches all packets and advances
1371                     to the next table.
1372
1373     Ingress Table 23: Destination Lookup
1374
1375       This table implements switching behavior.  It  contains  these  logical
1376       flows:
1377
1378              •      A priority-110 flow with the match eth.src == E for all
1379                     logical switch datapaths that applies the action
1380                     handle_svc_check(inport), where E is the service monitor
1381                     mac defined in the options:svc_monitor_mac column of the
1382                     NB_Global table.
1383
1384              •      A priority-100 flow that punts all IGMP/MLD packets to
1385                     ovn-controller if multicast snooping is enabled on the
1386                     logical switch. The flow also forwards the IGMP/MLD
1387                     packets to the MC_MROUTER_STATIC multicast group, which
1388                     ovn-northd populates with all the logical ports that
1389                     have options:mcast_flood_reports=’true’.
1390
1391              •      Priority-90 flows that forward  registered  IP  multicast
1392                     traffic  to  their  corresponding  multicast group, which
1393                     ovn-northd creates based on  learnt  IGMP_Group  entries.
1394                     The flows also forward packets to the MC_MROUTER_FLOOD
1395                     multicast group, which ovn-northd populates with all the
1396                     logical ports that are connected to logical routers with
1397                     options:mcast_relay=’true’.
1398
1399              •      A priority-85 flow that forwards all IP multicast traffic
1400                     destined  to  224.0.0.X  to the MC_FLOOD multicast group,
1401                     which  ovn-northd  populates  with  all  enabled  logical
1402                     ports.
1403
1404              •      A priority-85 flow that forwards all IP multicast traffic
1405                     destined to reserved multicast IPv6 addresses (RFC  4291,
1406                     2.7.1,  e.g.,  Solicited-Node  multicast) to the MC_FLOOD
1407                     multicast group, which ovn-northd populates with all  en‐
1408                     abled logical ports.
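       The 224.0.0.X range in the priority-85 flow above is the IPv4
       link-local multicast block (224.0.0.0/24), which is why such traffic
       is flooded unconditionally rather than subjected to snooping. As a
       one-line illustration (not OVN code) of the membership test:

```python
import ipaddress

def flood_to_all_ports(ip_dst: str) -> bool:
    """Model the priority-85 match: IPv4 link-local multicast
    (224.0.0.0/24) is always flooded to every enabled port via the
    MC_FLOOD group, regardless of snooping state."""
    return ipaddress.IPv4Address(ip_dst) in ipaddress.IPv4Network("224.0.0.0/24")
```

       Addresses outside this block instead take the registered or
       unregistered multicast paths described in the surrounding bullets.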
1409
1410              •      A priority-80 flow that forwards all unregistered IP
1411                     multicast traffic to the MC_STATIC multicast group,
1412                     which ovn-northd populates with all the logical ports
1413                     that have options:mcast_flood=’true’. The flow also
1414                     forwards unregistered IP multicast traffic to the
1415                     MC_MROUTER_FLOOD multicast group, which ovn-northd
1416                     populates with all the logical ports connected to
1417                     logical routers that have options:mcast_relay=’true’.
1418
1419              •      A priority-80 flow that drops all unregistered IP
1420                     multicast traffic if other_config:mcast_snoop=’true’ and
1421                     other_config:mcast_flood_unregistered=’false’ and the
1422                     switch is not connected to a logical router that has
1423                     options:mcast_relay=’true’ and the switch doesn’t have
1424                     any logical port with options:mcast_flood=’true’.
1425
1426              •      Priority-80  flows  for  each  IP address/VIP/NAT address
1427                     owned by a router port connected  to  the  switch.  These
1428                     flows  match ARP requests and ND packets for the specific
1429                     IP addresses. Matched packets are forwarded only  to  the
1430                     router  that  owns  the IP address and to the MC_FLOOD_L2
1431                     multicast group which  contains  all  non-router  logical
1432                     ports.
1433
1434              •      Priority-75 flows for each port connected to a logical
1435                     router matching self-originated ARP request/ND packets.
1436                     These packets are flooded to the MC_FLOOD_L2 multicast
1437                     group, which contains all non-router logical ports.
1438
1439              •      A priority-70 flow that outputs all packets with an  Eth‐
1440                     ernet broadcast or multicast eth.dst to the MC_FLOOD mul‐
1441                     ticast group.
1442
1443              •      One priority-50 flow that matches each known Ethernet ad‐
1444                     dress  against eth.dst and outputs the packet to the sin‐
1445                     gle associated output port.
1446
1447                     For the Ethernet address on a logical switch port of type
1448                     router,  when that logical switch port’s addresses column
1449                     is set to router and the connected  logical  router  port
1450                     has a gateway chassis:
1451
1452                     •      The  flow  for the connected logical router port’s
1453                            Ethernet address is only programmed on the gateway
1454                            chassis.
1455
1456                     •      If  the  logical router has rules specified in nat
1457                            with external_mac, then those addresses  are  also
1458                            used  to  populate the switch’s destination lookup
1459                            on the chassis where logical_port is resident.
1460
1461                     For the Ethernet address on a logical switch port of type
1462                     router, when that logical switch port’s addresses column
1463                     is set to router and the connected logical router port
1464                     specifies a reside-on-redirect-chassis and the logical
1465                     router to which the connected logical router port belongs
1466                     has a distributed gateway LRP:
1467
1468                     •      The  flow  for the connected logical router port’s
1469                            Ethernet address is only programmed on the gateway
1470                            chassis.
1471
1472                     For each forwarding group configured on the logical
1473                     switch datapath, a priority-50 flow that matches on
1474                     eth.dst == VIP with an action of
1475                     fwd_group(childports=args), where args contains the
1476                     comma-separated logical switch child ports to load
1477                     balance to. If liveness is enabled, then the action
1478                     also includes liveness=true.
1479
1480              •      One priority-0 fallback flow  that  matches  all  packets
1481                     with  the  action  outport = get_fdb(eth.dst); next;. The
1482                     action get_fdb gets the port for the eth.dst in  the  MAC
1483                     learning  table  of the logical switch datapath. If there
1484                     is no entry for eth.dst in the MAC learning  table,  then
1485                     it stores none in the outport.
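       The get_fdb fallback can be pictured as a plain dictionary lookup.
       This is only a sketch of the described behavior, with a hypothetical
       learned entry; the real MAC learning table lives in the southbound
       database:

```python
def get_fdb(fdb, eth_dst):
    """Model of the get_fdb fallback: look eth.dst up in the switch's
    MAC learning table; a miss stores "none" in outport, which the next
    table ("Destination unknown") then matches on."""
    return fdb.get(eth_dst, "none")

fdb = {"50:54:00:00:00:01": "lsp1"}  # hypothetical learned entry
```

       A hit sends the packet straight to the learned port; a miss hands it
       to the unknown-destination logic of the next table.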
1486
1487     Ingress Table 24: Destination unknown
1488
1489       This table handles packets whose destination was not found or
1490       looked up in the MAC learning table of the logical switch datapath.
1491       It contains the following flows.
1492
1493              •      If the logical switch has logical ports with ’unknown’
1494                     addresses set, then the following logical flow is added:
1495
1496                     •      A priority-50 flow with the match outport == none
1497                            that outputs the packet to the MC_UNKNOWN multicast
1498                            group, which ovn-northd populates with all enabled
1499                            logical  ports  that  accept  unknown  destination
1500                            packets. As a small optimization,  if  no  logical
1501                            ports    accept   unknown   destination   packets,
1502                            ovn-northd omits this multicast group and  logical
1503                            flow.
1504
1505                     If the logical switch has no logical ports with ’unknown’
1506                     address set, then the following logical flow is added:
1507
1508                     •      A priority-50 flow with the match outport == none
1509                            that drops the packets.
1510
1511              •      One  priority-0  fallback flow that outputs the packet to
1512                     the egress stage with the outport learnt from get_fdb ac‐
1513                     tion.
1514
1515     Egress Table 0: Pre-LB
1516
1517       This table is similar to ingress table Pre-LB. It contains a priority-0
1518       flow that simply moves traffic to the next table. Moreover it  contains
1519       two  priority-110  flows to move multicast, IPv6 Neighbor Discovery and
1520       MLD traffic to the next table. If any load balancing  rules  exist  for
1521       the  datapath,  a priority-100 flow is added with a match of ip and ac‐
1522       tion of reg0[2] = 1; next; to act as a hint for table  Pre-stateful  to
1523       send  IP  packets to the connection tracker for packet de-fragmentation
1524       and possibly DNAT the destination VIP to one of the selected
1525       backends for already committed load-balanced traffic.
1526
1527       This table also has a priority-110 flow with the match eth.src == E
1528       for all logical switch datapaths to move traffic to the next table,
1529       where E is the service monitor mac defined in the
1530       options:svc_monitor_mac column of the NB_Global table.
1531
1532     Egress Table 1: to-lport Pre-ACLs
1533
1534       This is similar to ingress table Pre-ACLs except for to-lport traffic.
1535
1536       This table also has a priority-110 flow with the match eth.src == E
1537       for all logical switch datapaths to move traffic to the next table,
1538       where E is the service monitor mac defined in the
1539       options:svc_monitor_mac column of the NB_Global table.
1540
1541       This table also has a priority-110 flow with the match outport == I
1542       for all logical switch datapaths to move traffic to the next table,
1543       where I is the peer of a logical router port. This flow is added to
1544       skip connection tracking of packets which will be entering the
1545       logical router datapath from the logical switch datapath for routing.
1546
1547     Egress Table 2: Pre-stateful
1548
1549       This is similar to ingress table Pre-stateful. This table adds the
1550       following three logical flows.
1551
1552              •      A priority-120 flow that sends the packets to the
1553                     connection tracker using ct_lb_mark; as the action, so
1554                     that already established traffic gets unDNATted from
1555                     the backend IP to the load balancer VIP, based on a
1556                     hint provided by the previous tables with a match for
1557                     reg0[2] == 1. If the packet was not DNATted earlier,
1558                     then ct_lb_mark functions like ct_next.
1559
1560              •      A priority-100  flow  sends  the  packets  to  connection
1561                     tracker  based  on a hint provided by the previous tables
1562                     (with a match for reg0[0] == 1) by using the ct_next; ac‐
1563                     tion.
1564
1565              •      A  priority-0 flow that matches all packets to advance to
1566                     the next table.
1567
1568     Egress Table 3: from-lport ACL hints
1569
1570       This is similar to ingress table ACL hints.
1571
1572     Egress Table 4: to-lport ACLs
1573
1574       This is similar to ingress table ACLs except for to-lport ACLs.
1575
1576       In addition, the following flows are added.
1577
1578              •      A priority-34000 logical flow is added for each logical
1579                     port which has DHCPv4 options defined, to allow the
1580                     DHCPv4 reply packet, and for each logical port which has
1581                     DHCPv6 options defined, to allow the DHCPv6 reply
1582                     packet, from Ingress Table 19: DHCP responses.
1583
1584              •      A priority-34000 logical flow is added for each logical
1585                     switch datapath configured with DNS records with the
1586                     match udp.dst == 53 to allow the DNS reply packet from
1587                     Ingress Table 21: DNS Responses.
1588
1589              •      A priority-34000 logical flow is added for each logical
1590                     switch datapath with the match eth.src == E to allow the
1591                     service monitor request packet generated by
1592                     ovn-controller with the action next, where E is the
1593                     service monitor mac defined in the
1594                     options:svc_monitor_mac column of the NB_Global table.
1595
1596     Egress Table 5: to-lport QoS Marking
1597
1598       This is similar to ingress table  QoS  marking  except  they  apply  to
1599       to-lport QoS rules.
1600
1601     Egress Table 6: to-lport QoS Meter
1602
1603       This  is  similar  to  ingress  table  QoS  meter  except they apply to
1604       to-lport QoS rules.
1605
1606     Egress Table 7: Stateful
1607
1608       This is similar to ingress table Stateful  except  that  there  are  no
1609       rules added for load balancing new connections.
1610
1611     Egress Table 8: Egress Port Security - check
1612
1613       This  is similar to the port security logic in table Ingress Port Secu‐
1614       rity check except that action check_out_port_sec is used to  check  the
1615       port security rules. This table adds the below logical flows.
1616
1617              •      A priority-100 flow which matches on the multicast
1618                     traffic and applies the action REGBIT_PORT_SEC_DROP = 0;
1619                     next; to skip the out port security checks.
1620
1621              •      For each disabled logical port, a priority-150 flow is
1622                     added which matches on all packets and applies the action
1623                     REGBIT_PORT_SEC_DROP = 1; next; so that the packets are
1624                     dropped in the next stage.
1625
1626              •      A priority-0 logical flow is added which matches on all
1627                     the packets and applies the action REGBIT_PORT_SEC_DROP =
1628                     check_out_port_sec(); next;. The action
1629                     check_out_port_sec  applies the port security rules based
1630                     on the addresses defined in the port_security  column  of
1631                     Logical_Switch_Port table before delivering the packet to
1632                     the outport.
1633
1634     Egress Table 9: Egress Port Security - Apply
1635
1636       This is similar to the ingress port security logic in ingress table
1637       Ingress Port Security - Apply. This table drops the packets if the
1638       port security check failed in the previous stage, i.e., the register
1639       bit REGBIT_PORT_SEC_DROP is set to 1.
1640
1641       The following flows are added.
1642
1643              •      For each localnet port configured with egress QoS in the
1644                     options:qdisc_queue_id column of Logical_Switch_Port, a
1645                     priority-100 flow is added which matches on the localnet
1646                     outport and applies the action set_queue(id); output;.
1647
1648                     Please remember to mark the corresponding physical inter‐
1649                     face with ovn-egress-iface set to true in external_ids.
1650
1651              •      A  priority-50 flow that drops the packet if the register
1652                     bit REGBIT_PORT_SEC_DROP is set to 1.
1653
1654              •      A priority-0 flow that outputs the packet to the outport.
1655
1656   Logical Router Datapaths
1657       Logical router datapaths will only exist for Logical_Router rows in
1658       the OVN_Northbound database that do not have enabled set to false.
1659
1660     Ingress Table 0: L2 Admission Control
1661
1662       This  table drops packets that the router shouldn’t see at all based on
1663       their Ethernet headers. It contains the following flows:
1664
1665              •      Priority-100 flows to drop packets with VLAN tags or mul‐
1666                     ticast Ethernet source addresses.
1667
1668              •      For each enabled router port P with Ethernet address E,
1669                     a priority-50 flow that matches inport == P && (eth.mcast
1670                     || eth.dst == E), stores the router port’s Ethernet
1671                     address, and advances to the next table with the action
1672                     xreg0[0..47] = E; next;.
1673
1674                     For  the  gateway  port  on  a distributed logical router
1675                     (where one of the logical router ports specifies a  gate‐
1676                     way  chassis),  the  above  flow matching eth.dst == E is
1677                     only programmed on the gateway port instance on the gate‐
1678                     way chassis.
1679
1680                     For a distributed logical router, or for a gateway router
1681                     where the port is configured with options:gateway_mtu,
1682                     the action of the above flow is modified to add
1683                     check_pkt_larger, which sets REGBIT_PKT_LARGER if the
1684                     packet size is greater than the MTU. If the port is also
1685                     configured with options:gateway_mtu_bypass, then another
1686                     flow is added, with priority-55, to bypass the
1687                     check_pkt_larger flow. This is useful for traffic that
1688                     normally doesn’t need to be fragmented and for which
1689                     check_pkt_larger, which might not be offloadable, is not
1690                     really needed. One such example is TCP traffic.
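       Assuming, as the text suggests for TCP, that the bypass flow matches
       traffic that need not be fragmented, the interaction of
       check_pkt_larger and gateway_mtu_bypass can be sketched as follows.
       This is an illustrative model, not OVN code; the is_tcp stand-in for
       the bypass match is an assumption of the sketch.

```python
def pkt_larger_bit(pkt_len, gateway_mtu, is_tcp, gateway_mtu_bypass):
    """Model the gateway_mtu handling: the priority-55 bypass flow
    (modeled here as matching TCP traffic, the example the text gives)
    skips check_pkt_larger entirely; otherwise REGBIT_PKT_LARGER is set
    when the packet exceeds the configured MTU."""
    if gateway_mtu_bypass and is_tcp:
        return 0  # bypass flow: no check performed, bit stays clear
    return 1 if pkt_len > gateway_mtu else 0  # check_pkt_larger result
```

       Skipping the check matters when check_pkt_larger cannot be offloaded
       to hardware, since the bypass keeps such flows on the fast path.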
1691
1692              •      For  each  dnat_and_snat NAT rule on a distributed router
1693                     that specifies an external Ethernet address E,  a  prior‐
1694                     ity-50  flow  that  matches inport == GW && eth.dst == E,
1695                     where GW is the logical router gateway port, with  action
1696                     xreg0[0..47]=E; next;.
1697
1698                     This flow is only programmed on the gateway port instance
1699                     on the chassis where the logical_port  specified  in  the
1700                     NAT rule resides.
1701
1702       Other packets are implicitly dropped.
1703
1704     Ingress Table 1: Neighbor lookup
1705
1706       For  ARP and IPv6 Neighbor Discovery packets, this table looks into the
1707       MAC_Binding records to determine if OVN needs to learn  the  mac  bind‐
1708       ings. The following flows are added:
1709
1710              •      For  each router port P that owns IP address A, which be‐
1711                     longs to subnet S with prefix length L, if the option al‐
1712                     ways_learn_from_arp_request  is  true  for this router, a
1713                     priority-100 flow is added which matches inport ==  P  &&
1714                     arp.spa == S/L && arp.op == 1 (ARP request) with the fol‐
1715                     lowing actions:
1716
1717                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
1718                     next;
1719
1720
1721                     If the option always_learn_from_arp_request is false, the
1722                     following two flows are added.
1723
1724                     A priority-110 flow is added which matches inport == P &&
1725                     arp.spa == S/L && arp.tpa == A && arp.op ==  1  (ARP  re‐
1726                     quest) with the following actions:
1727
1728                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
1729                     reg9[3] = 1;
1730                     next;
1731
1732
1733                     A priority-100 flow is added which matches inport == P &&
1734                     arp.spa == S/L && arp.op == 1 (ARP request) with the fol‐
1735                     lowing actions:
1736
1737                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
1738                     reg9[3] = lookup_arp_ip(inport, arp.spa);
1739                     next;
1740
1741
1742                     If the logical router port P is a distributed gateway
1743                     router port, an additional match is_chassis_resident(cr-P)
1744                     is added to all these flows.
1745
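       The reg9 bit assignments in the flows above can be sketched as a
       decision function (a hypothetical helper, not OVN code; lookup_hit
       and ip_hit stand for the results of the lookup_arp and
       lookup_arp_ip actions):

```python
# Hypothetical sketch of how the flows above set reg9[2] and reg9[3]
# for an ARP request arriving on router port P that owns address A.

def neighbor_lookup(always_learn, lookup_hit, ip_hit, tpa_is_router_ip):
    reg9_2 = lookup_hit                  # reg9[2] = lookup_arp(...)
    if always_learn:
        # Single priority-100 flow; reg9[3] is not set.
        return reg9_2, None
    if tpa_is_router_ip:
        # Priority-110 flow (arp.tpa == A): reg9[3] = 1.
        return reg9_2, True
    # Priority-100 flow: reg9[3] = lookup_arp_ip(...).
    return reg9_2, ip_hit
```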
1746              •      A  priority-100  flow  which matches on ARP reply packets
1747                     and   applies   the   actions   if   the    option    al‐
1748                     ways_learn_from_arp_request is true:
1749
1750                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
1751                     next;
1752
1753
1754                     If the option always_learn_from_arp_request is false, the
1755                     above actions will be:
1756
1757                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
1758                     reg9[3] = 1;
1759                     next;
1760
1761
1762              •      A priority-100 flow which matches on IPv6  Neighbor  Dis‐
1763                     covery  advertisement  packet  and applies the actions if
1764                     the option always_learn_from_arp_request is true:
1765
1766                     reg9[2] = lookup_nd(inport, nd.target, nd.tll);
1767                     next;
1768
1769
1770                     If the option always_learn_from_arp_request is false, the
1771                     above actions will be:
1772
1773                     reg9[2] = lookup_nd(inport, nd.target, nd.tll);
1774                     reg9[3] = 1;
1775                     next;
1776
1777
1778              •      A  priority-100  flow which matches on IPv6 Neighbor Dis‐
1779                     covery solicitation packet and applies the actions if the
1780                     option always_learn_from_arp_request is true:
1781
1782                     reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
1783                     next;
1784
1785
1786                     If the option always_learn_from_arp_request is false, the
1787                     above actions will be:
1788
1789                     reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
1790                     reg9[3] = lookup_nd_ip(inport, ip6.src);
1791                     next;
1792
1793
1794              •      A priority-0 fallback flow that matches all  packets  and
1795                     applies  the  action  reg9[2]  =  1;  next; advancing the
1796                     packet to the next table.
1797
1798     Ingress Table 2: Neighbor learning
1799
1800       This table adds flows to learn the mac bindings from the ARP  and  IPv6
1801       Neighbor  Solicitation/Advertisement  packets if it is needed according
1802       to the lookup results from the previous stage.
1803
1804       reg9[2] will be 1 if the lookup_arp/lookup_nd in the previous table was
1805       successful  or  skipped,  meaning no need to learn mac binding from the
1806       packet.
1807
1808       reg9[3] will be 1 if the lookup_arp_ip/lookup_nd_ip in the previous ta‐
1809       ble  was  successful  or skipped, meaning it is ok to learn mac binding
1810       from the packet (if reg9[2] is 0).
1811
1812              •      A priority-100 flow with the match reg9[2] == 1 ||
1813                     reg9[3] == 0 advances the packet to the next table,
1814                     as there is no need to learn the neighbor.
1815
1816              •      A priority-90 flow with the match arp and applies the ac‐
1817                     tion put_arp(inport, arp.spa, arp.sha); next;
1818
1819              •      A  priority-95  flow with the match nd_na  && nd.tll == 0
1820                     and  applies   the   action   put_nd(inport,   nd.target,
1821                     eth.src); next;
1822
1823              •      A  priority-90  flow with the match nd_na and applies the
1824                     action put_nd(inport, nd.target, nd.tll); next;
1825
1826              •      A priority-90 flow with the match nd_ns and  applies  the
1827                     action put_nd(inport, ip6.src, nd.sll); next;
1828
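       Combining the two register bits, the skip/learn decision of this
       table can be sketched as (an assumed helper, not OVN code):

```python
# Sketch (not OVN code) of Table 2's decision: the priority-100 flow
# skips learning when reg9[2] == 1 || reg9[3] == 0; otherwise one of
# the put_arp/put_nd flows learns the mac binding.

def should_learn(reg9_2, reg9_3):
    """True when the binding should be learned from this packet."""
    return not (reg9_2 or not reg9_3)
```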
1829     Ingress Table 3: IP Input
1830
1831       This table is the core of the logical router datapath functionality. It
1832       contains the following flows to implement very basic IP host  function‐
1833       ality.
1834
1835              •      For each dnat_and_snat NAT rule on a distributed
1836                     logical router or gateway router whose gateway port
1837                     has options:gateway_mtu set to a valid integer value
1838                     M, a priority-160 flow matches inport == LRP &&
1839                     REGBIT_PKT_LARGER && REGBIT_EGRESS_LOOPBACK == 0,
1840                     where LRP is the logical router port, and applies the
1841                     following action for IPv4 and IPv6 respectively:
1842
1843                     icmp4_error {
1844                         icmp4.type = 3; /* Destination Unreachable. */
1845                         icmp4.code = 4;  /* Frag Needed and DF was Set. */
1846                         icmp4.frag_mtu = M;
1847                         eth.dst = eth.src;
1848                         eth.src = E;
1849                         ip4.dst = ip4.src;
1850                         ip4.src = I;
1851                         ip.ttl = 255;
1852                         REGBIT_EGRESS_LOOPBACK = 1;
1853                         REGBIT_PKT_LARGER = 0;
1854                         outport = LRP;
1855                         flags.loopback = 1;
1856                         output;
1857                     };
1858                     icmp6_error {
1859                         icmp6.type = 2;
1860                         icmp6.code = 0;
1861                         icmp6.frag_mtu = M;
1862                         eth.dst = eth.src;
1863                         eth.src = E;
1864                         ip6.dst = ip6.src;
1865                         ip6.src = I;
1866                         ip.ttl = 255;
1867                         REGBIT_EGRESS_LOOPBACK = 1;
1868                         REGBIT_PKT_LARGER = 0;
1869                         outport = LRP;
1870                         flags.loopback = 1;
1871                         output;
1872                     };
1873
1874
1875                     where E and I are the NAT rule's external MAC and IP
1876                     addresses, respectively.
1877
1878              •      For distributed logical routers or gateway routers
1879                     whose gateway port has options:gateway_mtu set to a
1880                     valid integer value M, a priority-150 flow matches
1881                     inport == LRP && REGBIT_PKT_LARGER &&
1882                     REGBIT_EGRESS_LOOPBACK == 0, where LRP is the logical
1883                     router port, and applies the following action for
1884                     IPv4 and IPv6 respectively:
1885
1886                     icmp4_error {
1887                         icmp4.type = 3; /* Destination Unreachable. */
1888                         icmp4.code = 4;  /* Frag Needed and DF was Set. */
1889                         icmp4.frag_mtu = M;
1890                         eth.dst = E;
1891                         ip4.dst = ip4.src;
1892                         ip4.src = I;
1893                         ip.ttl = 255;
1894                         REGBIT_EGRESS_LOOPBACK = 1;
1895                         REGBIT_PKT_LARGER = 0;
1896                         next(pipeline=ingress, table=0);
1897                     };
1898                     icmp6_error {
1899                         icmp6.type = 2;
1900                         icmp6.code = 0;
1901                         icmp6.frag_mtu = M;
1902                         eth.dst = E;
1903                         ip6.dst = ip6.src;
1904                         ip6.src = I;
1905                         ip.ttl = 255;
1906                         REGBIT_EGRESS_LOOPBACK = 1;
1907                         REGBIT_PKT_LARGER = 0;
1908                         next(pipeline=ingress, table=0);
1909                     };
1910
1911
1912              •      For each NAT entry of a distributed logical router  (with
1913                     distributed  gateway  router port) of type snat, a prior‐
1914                     ity-120 flow with the match inport == P && ip4.src  ==  A
1915                     advances  the packet to the next pipeline, where P is the
1916                     distributed logical router port and A is the  external_ip
1917                     set  in  the  NAT  entry.  If  A is an IPv6 address, then
1918                     ip6.src is used for the match.
1919
1920                     The above flow is required to handle the routing of
1921                     east/west NAT traffic.
1922
1923              •      For  each  BFD  port the two following priority-110 flows
1924                     are added to manage BFD traffic:
1925
1926                     •      if ip4.src or ip6.src is any IP address owned
1927                            by the router port and udp.dst == 3784, the
1928                            packet is advanced to the next pipeline stage.
1929
1930                     •      if ip4.dst or ip6.dst is any IP address owned
1931                            by the router port and udp.dst == 3784, the
1932                            handle_bfd_msg action is executed.
1933
1934              •      L3 admission control: Priority-120 flows allow IGMP
1935                     and MLD packets if the router has logical ports that
1936                     have options:mcast_flood=’true’.
1937
1938              •      L3 admission control: A priority-100 flow  drops  packets
1939                     that match any of the following:
1940
1941ip4.src[28..31] == 0xe (multicast source)
1942
1943ip4.src == 255.255.255.255 (broadcast source)
1944
1945ip4.src  ==  127.0.0.0/8 || ip4.dst == 127.0.0.0/8
1946                            (localhost source or destination)
1947
1948ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8 (zero
1949                            network source or destination)
1950
1951ip4.src  or ip6.src is any IP address owned by the
1952                            router, unless the packet was recirculated due  to
1953                            egress    loopback    as    indicated    by   REG‐
1954                            BIT_EGRESS_LOOPBACK.
1955
1956ip4.src is the broadcast address of any IP network
1957                            known to the router.
1958
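       The address-only checks in the drop list above can be illustrated
       with a small sketch (an assumed helper, not OVN code; the
       router-owned-address and broadcast-per-subnet cases are omitted
       since they depend on router state):

```python
# Sketch (not OVN code) of the priority-100 L3 admission checks that
# depend only on the packet's own IPv4 addresses.
import ipaddress

def drop_by_admission_control(src, dst):
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    return (
        s.is_multicast                                   # ip4.src[28..31] == 0xe
        or s == ipaddress.ip_address("255.255.255.255")  # broadcast source
        or s in ipaddress.ip_network("127.0.0.0/8")      # localhost source
        or d in ipaddress.ip_network("127.0.0.0/8")      # localhost destination
        or s in ipaddress.ip_network("0.0.0.0/8")        # zero network source
        or d in ipaddress.ip_network("0.0.0.0/8")        # zero network destination
    )
```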
1959              •      A priority-100 flow parses DHCPv6 replies from IPv6
1960                     prefix delegation routers (udp.src == 547 && udp.dst
1961                     == 546). The handle_dhcpv6_reply action sends IPv6
1962                     prefix delegation messages to the delegation router.
1963
1964              •      ICMP echo reply. These flows reply to ICMP echo  requests
1965                     received  for the router’s IP address. Let A be an IP ad‐
1966                     dress owned by a router port. Then, for each A that is an
1967                     IPv4  address, a priority-90 flow matches on ip4.dst == A
1968                     and icmp4.type == 8 && icmp4.code ==  0  (ICMP  echo  re‐
1969                     quest). For each A that is an IPv6 address, a priority-90
1970                     flow matches on ip6.dst == A and  icmp6.type  ==  128  &&
1971                     icmp6.code  ==  0  (ICMPv6 echo request). The port of the
1972                     router that receives the echo request  does  not  matter.
1973                     Also,  the  ip.ttl  of  the  echo  request  packet is not
1974                     checked, so it complies with RFC 1812,  section  4.2.2.9.
1975                     Flows for ICMPv4 echo requests use the following actions:
1976
1977                     ip4.dst <-> ip4.src;
1978                     ip.ttl = 255;
1979                     icmp4.type = 0;
1980                     flags.loopback = 1;
1981                     next;
1982
1983
1984                     Flows for ICMPv6 echo requests use the following actions:
1985
1986                     ip6.dst <-> ip6.src;
1987                     ip.ttl = 255;
1988                     icmp6.type = 129;
1989                     flags.loopback = 1;
1990                     next;
1991
1992
1993              •      Reply to ARP requests.
1994
1995                     These flows reply to ARP requests for the router’s own IP
1996                     address. ARP requests are handled only if the
1997                     requestor’s IP belongs to a subnet of the logical
1998                     router port. For each router port P that owns IP address
1999                     A,  which  belongs  to subnet S with prefix length L, and
2000                     Ethernet address E, a priority-90 flow matches inport  ==
2001                     P  &&  arp.spa == S/L && arp.op == 1 && arp.tpa == A (ARP
2002                     request) with the following actions:
2003
2004                     eth.dst = eth.src;
2005                     eth.src = xreg0[0..47];
2006                     arp.op = 2; /* ARP reply. */
2007                     arp.tha = arp.sha;
2008                     arp.sha = xreg0[0..47];
2009                     arp.tpa = arp.spa;
2010                     arp.spa = A;
2011                     outport = inport;
2012                     flags.loopback = 1;
2013                     output;
2014
2015
2016                     For the gateway port  on  a  distributed  logical  router
2017                     (where  one of the logical router ports specifies a gate‐
2018                     way chassis), the above flows are only programmed on  the
2019                     gateway port instance on the gateway chassis. This behav‐
2020                     ior avoids generation of multiple ARP responses from dif‐
2021                     ferent chassis, and allows upstream MAC learning to point
2022                     to the gateway chassis.
2023
2024                     For the logical router port with the option reside-on-re‐
2025                     direct-chassis  set  (which  is  centralized),  the above
2026                     flows are only programmed on the gateway port instance on
2027                     the gateway chassis (if the logical router has a distrib‐
2028                     uted gateway port). This behavior  avoids  generation  of
2029                     multiple ARP responses from different chassis, and allows
2030                     upstream MAC learning to point to the gateway chassis.
2031
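       The ARP-reply rewrite above amounts to the following field mapping
       (a sketch, not OVN code; xreg0[0..47] carries the Ethernet address
       E loaded in table 0):

```python
# Sketch (not OVN code) of the ARP-reply actions above as a pure
# function on header fields.

def arp_reply(pkt, router_mac, router_ip):
    return {
        "eth.dst": pkt["eth.src"],
        "eth.src": router_mac,       # xreg0[0..47], i.e. E
        "arp.op": 2,                 # ARP reply
        "arp.tha": pkt["arp.sha"],
        "arp.sha": router_mac,
        "arp.tpa": pkt["arp.spa"],
        "arp.spa": router_ip,        # A
    }
```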
2032              •      Reply to IPv6 Neighbor Solicitations. These  flows  reply
2033                     to  Neighbor  Solicitation  requests for the router’s own
2034                     IPv6 address and populate the logical router’s mac  bind‐
2035                     ing table.
2036
2037                     For  each  router  port  P  that owns IPv6 address A, so‐
2038                     licited node address S, and Ethernet address E, a  prior‐
2039                     ity-90  flow  matches  inport == P && nd_ns && ip6.dst ==
2040                     {A, S} && nd.target == A with the following actions:
2041
2042                     nd_na_router {
2043                         eth.src = xreg0[0..47];
2044                         ip6.src = A;
2045                         nd.target = A;
2046                         nd.tll = xreg0[0..47];
2047                         outport = inport;
2048                         flags.loopback = 1;
2049                         output;
2050                     };
2051
2052
2053                     For the gateway port  on  a  distributed  logical  router
2054                     (where  one of the logical router ports specifies a gate‐
2055                     way chassis), the above flows replying to  IPv6  Neighbor
2056                     Solicitations are only programmed on the gateway port in‐
2057                     stance on the gateway chassis. This behavior avoids  gen‐
2058                     eration  of  multiple replies from different chassis, and
2059                     allows upstream MAC learning  to  point  to  the  gateway
2060                     chassis.
2061
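       The solicited-node address S used in matches like the one above is
       a fixed function of the IPv6 address A (RFC 4291): the prefix
       ff02::1:ff00:0/104 combined with the low 24 bits of A. A quick
       illustration:

```python
# Compute the solicited-node multicast address for an IPv6 address,
# per RFC 4291: ff02::1:ffXX:XXXX, where XX:XXXX are the low 24 bits.
import ipaddress

def solicited_node(addr):
    a = int(ipaddress.IPv6Address(addr))
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | (a & 0xFFFFFF))
```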
2062              •      These flows reply to ARP requests or IPv6 neighbor solic‐
2063                     itation for the virtual IP addresses  configured  in  the
2064                     router for NAT (both DNAT and SNAT) or load balancing.
2065
2066                     IPv4:  For  a  configured NAT (both DNAT and SNAT) IP ad‐
2067                     dress or a load balancer IPv4 VIP A, for each router port
2068                     P  with  Ethernet  address  E, a priority-90 flow matches
2069                     arp.op == 1 && arp.tpa == A (ARP request) with  the  fol‐
2070                     lowing actions:
2071
2072                     eth.dst = eth.src;
2073                     eth.src = xreg0[0..47];
2074                     arp.op = 2; /* ARP reply. */
2075                     arp.tha = arp.sha;
2076                     arp.sha = xreg0[0..47];
2077                     arp.tpa <-> arp.spa;
2078                     outport = inport;
2079                     flags.loopback = 1;
2080                     output;
2081
2082
2083                     IPv4:  For a configured load balancer IPv4 VIP, a similar
2084                     flow is added with the additional match inport  ==  P  if
2085                     the  VIP is reachable from any logical router port of the
2086                     logical router.
2087
2088                     If the router port P  is  a  distributed  gateway  router
2089                     port,  then  the  is_chassis_resident(P) is also added in
2090                     the match condition for the load balancer IPv4 VIP A.
2091
2092                     IPv6: For a configured NAT (both DNAT and  SNAT)  IP  ad‐
2093                     dress or a load balancer IPv6 VIP A (if the VIP is reach‐
2094                     able from any logical router port of the logical router),
2095                     solicited  node  address  S,  for each router port P with
2096                     Ethernet address E, a priority-90 flow matches inport  ==
2097                     P  &&  nd_ns  && ip6.dst == {A, S} && nd.target == A with
2098                     the following actions:
2099
2100                     eth.dst = eth.src;
2101                     nd_na {
2102                         eth.src = xreg0[0..47];
2103                         nd.tll = xreg0[0..47];
2104                         ip6.src = A;
2105                         nd.target = A;
2106                         outport = inport;
2107                         flags.loopback = 1;
2108                         output;
2109                     }
2110
2111
2112                     If the router port P  is  a  distributed  gateway  router
2113                     port,  then  the  is_chassis_resident(P) is also added in
2114                     the match condition for the load balancer IPv6 VIP A.
2115
2116                     For the gateway port on a distributed logical router with
2117                     NAT  (where  one  of the logical router ports specifies a
2118                     gateway chassis):
2119
2120                     •      If the corresponding NAT rule cannot be handled in
2121                            a  distributed  manner, then a priority-92 flow is
2122                            programmed on the gateway  port  instance  on  the
2123                            gateway  chassis.  A priority-91 drop flow is pro‐
2124                            grammed on the other chassis when ARP  requests/NS
2125                            packets are received on the gateway port. This be‐
2126                            havior avoids generation of multiple ARP responses
2127                            from  different  chassis,  and allows upstream MAC
2128                            learning to point to the gateway chassis.
2129
2130                     •      If the corresponding NAT rule can be handled in  a
2131                            distributed  manner,  then  this flow is only pro‐
2132                            grammed on the gateway  port  instance  where  the
2133                            logical_port specified in the NAT rule resides.
2134
2135                            Some  of  the actions are different for this case,
2136                            using the external_mac specified in the  NAT  rule
2137                            rather than the gateway port’s Ethernet address E:
2138
2139                            eth.src = external_mac;
2140                            arp.sha = external_mac;
2141
2142
2143                            or in the case of IPv6 neighbor solicitation:
2144
2145                            eth.src = external_mac;
2146                            nd.tll = external_mac;
2147
2148
2149                            This  behavior  avoids  generation of multiple ARP
2150                            responses from different chassis, and  allows  up‐
2151                            stream  MAC learning to point to the correct chas‐
2152                            sis.
2153
2154              •      Priority-85 flows drop ARP and IPv6 Neighbor
2155                     Discovery packets.
2156
2157              •      A priority-84 flow explicitly allows IPv6 multicast traf‐
2158                     fic that is supposed to reach the router pipeline  (i.e.,
2159                     router solicitation and router advertisement packets).
2160
2161              •      A  priority-83 flow explicitly drops IPv6 multicast traf‐
2162                     fic that is destined to reserved multicast groups.
2163
2164              •      A priority-82 flow allows IP  multicast  traffic  if  op‐
2165                     tions:mcast_relay=’true’, otherwise drops it.
2166
2167              •      UDP  port  unreachable.  Priority-80  flows generate ICMP
2168                     port unreachable messages in reply to UDP  datagrams  di‐
2169                     rected  to the router’s IP address, except in the special
2170                     case of gateways, which  accept  traffic  directed  to  a
2171                     router IP for load balancing and NAT purposes.
2172
2173                     These  flows  should  not match IP fragments with nonzero
2174                     offset.
2175
2176              •      TCP reset. Priority-80 flows generate TCP reset  messages
2177                     in reply to TCP datagrams directed to the router’s IP ad‐
2178                     dress, except in the special case of gateways, which  ac‐
2179                     cept  traffic  directed to a router IP for load balancing
2180                     and NAT purposes.
2181
2182                     These flows should not match IP  fragments  with  nonzero
2183                     offset.
2184
2185              •      Protocol or address unreachable. Priority-70 flows gener‐
2186                     ate ICMP protocol or  address  unreachable  messages  for
2187                     IPv4  and  IPv6 respectively in reply to packets directed
2188                     to the router’s IP address on  IP  protocols  other  than
2189                     UDP,  TCP,  and ICMP, except in the special case of gate‐
2190                     ways, which accept traffic directed to a  router  IP  for
2191                     load balancing purposes.
2192
2193                     These  flows  should  not match IP fragments with nonzero
2194                     offset.
2195
2196              •      Drop other IP traffic to this router.  These  flows  drop
2197                     any  other  traffic  destined  to  an  IP address of this
2198                     router that is not already handled by one  of  the  flows
2199                     above,  which  amounts to ICMP (other than echo requests)
2200                     and fragments with nonzero offsets. For each IP address A
2201                     owned  by  the router, a priority-60 flow matches ip4.dst
2202                     == A or ip6.dst == A and drops the traffic. An  exception
2203                     is  made  and  the  above flow is not added if the router
2204                     port’s own IP address is used  to  SNAT  packets  passing
2205                     through that router.
2206
2207       The flows above handle all of the traffic that might be directed to the
2208       router itself. The following flows (with lower priorities)  handle  the
2209       remaining traffic, potentially for forwarding:
2210
2211              •      Drop  Ethernet  local  broadcast. A priority-50 flow with
2212                     match eth.bcast drops traffic destined to the local  Eth‐
2213                     ernet  broadcast  address.  By  definition  this  traffic
2214                     should not be forwarded.
2215
2216              •      ICMP time exceeded. For each router port P whose IP
2217                     address is A, a priority-100 flow with match inport == P &&
2218                     ip.ttl == {0, 1} && !ip.later_frag matches packets  whose
2219                     TTL  has  expired,  with the following actions to send an
2220                     ICMP time exceeded reply for IPv4 and IPv6 respectively:
2221
2222                     icmp4 {
2223                         icmp4.type = 11; /* Time exceeded. */
2224                         icmp4.code = 0;  /* TTL exceeded in transit. */
2225                         ip4.dst = ip4.src;
2226                         ip4.src = A;
2227                         ip.ttl = 254;
2228                         next;
2229                     };
2230                     icmp6 {
2231                         icmp6.type = 3; /* Time exceeded. */
2232                         icmp6.code = 0;  /* TTL exceeded in transit. */
2233                         ip6.dst = ip6.src;
2234                         ip6.src = A;
2235                         ip.ttl = 254;
2236                         next;
2237                     };
2238
2239
2240              •      TTL discard. A priority-30 flow with match ip.ttl == {0,
2241                     1} and action drop; drops other packets whose TTL has
2242                     expired and that should not receive an ICMP error reply
2243                     (i.e., fragments with nonzero offset).
2244
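       Taken together, the TTL flows above dispose of packets as sketched
       below (a hypothetical helper, not OVN code):

```python
# Sketch (not OVN code) of how the TTL flows dispose of a packet:
# priority-100 sends an ICMP time exceeded reply, priority-30 silently
# drops expired fragments with nonzero offset, everything else moves on.

def ttl_disposition(ttl, later_frag):
    if ttl in (0, 1) and not later_frag:
        return "icmp-time-exceeded"   # priority-100 flow
    if ttl in (0, 1):
        return "drop"                 # priority-30 flow
    return "next"                     # priority-0 flow
```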
2245              •      Next table. A priority-0 flow matches all packets that
2246                     aren’t already handled and uses the action next; to feed
2247                     them to the next table.
2248
2249     Ingress Table 4: UNSNAT
2250
2251       This is for already established connections’ reverse traffic; i.e.,
2252       SNAT has already been done in egress pipeline and now  the  packet  has
2253       entered the ingress pipeline as part of a reply. It is unSNATted here.
2254
2255       Ingress Table 4: UNSNAT on Gateway and Distributed Routers
2256
2257              •      If the router (gateway or distributed) is configured
2258                     with load balancers, then the lflows below are added:
2259
2260                     For each IPv4 address A that is defined as a load
2261                     balancer VIP with protocol P and is also present as an
2262                     external_ip in the NAT table, a priority-120 logical
2263                     flow is added with the match ip4 && ip4.dst == A && P
2264                     and the action next; to advance the packet to the next
2265                     table. If the load balancer has a protocol port T
2266                     defined, then the match also has P.dst == T.
2267
2268                     The above flows are also added for IPv6 load balancers.
2269
2270       Ingress Table 4: UNSNAT on Gateway Routers
2271
2272              •      If the Gateway router has been configured to  force  SNAT
2273                     any  previously DNATted packets to B, a priority-110 flow
2274                     matches ip && ip4.dst == B or ip && ip6.dst == B with  an
2275                     action ct_snat;.
2276
2277                     If    the    Gateway    router    is    configured   with
2278                     lb_force_snat_ip=router_ip then for every logical  router
2279                     port  P attached to the Gateway router with the router ip
2280                     B, a priority-110 flow is added with the match inport  ==
2281                     P  && ip4.dst == B or inport == P && ip6.dst == B with an
2282                     action ct_snat;.
2283
2284                     If the Gateway router has been configured to  force  SNAT
2285                     any previously load-balanced packets to B, a priority-100
2286                     flow matches ip && ip4.dst == B or ip  &&  ip6.dst  ==  B
2287                     with an action ct_snat;.
2288
2289                     For each NAT configuration in the OVN Northbound
2290                     database that asks to change the source IP address of
2291                     a packet from A to B, a priority-90 flow matches ip &&
2292                     ip4.dst == B or ip && ip6.dst == B with an action
2293                     ct_snat;. If the NAT rule is of type dnat_and_snat and
2294                     has stateless=true in the options, then the action
2295                     would be ip4/6.dst=(B).
2296
2297                     A priority-0 logical flow with match 1 has actions next;.
2298
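       The choice of UNSNAT action described above for gateway routers
       can be sketched as (an assumed helper, not OVN code; the action
       string format is illustrative):

```python
# Sketch (not OVN code) of the UNSNAT action selection: stateless
# dnat_and_snat rules rewrite the destination address directly, all
# other NAT rules go through conntrack with ct_snat.

def unsnat_action(nat_type, stateless, external_ip):
    if nat_type == "dnat_and_snat" and stateless:
        # Corresponds to the documented ip4/6.dst=(B) action.
        return f"ip4.dst = {external_ip};"
    return "ct_snat;"
```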
2299       Ingress Table 4: UNSNAT on Distributed Routers
2300
2301              •      For each NAT configuration in the OVN Northbound
2302                     database that asks to change the source IP address of a
2303                     packet from A to B, two priority-100 flows are added.
2304
2305                     If  the  NAT rule cannot be handled in a distributed man‐
2306                     ner, then the below  priority-100  flows  are  only  pro‐
2307                     grammed on the gateway chassis.
2308
2309                     •      The  first  flow matches ip && ip4.dst == B && in‐
2310                            port == GW && flags.loopback == 0 or ip && ip6.dst
2311                            ==  B && inport == GW && flags.loopback == 0 where
2312                            GW is the distributed gateway  port  specified  in
2313                            the  NAT rule, with an action ct_snat_in_czone; to
2314                            unSNAT in the common zone. If the NAT rule  is  of
2315                            type  dnat_and_snat  and has stateless=true in the
2316                            options, then the action would be ip4/6.dst=(B).
2317
                            If the NAT entry is of type snat, then there is
                            an additional match is_chassis_resident(cr-GW),
                            where cr-GW is the chassis resident port of GW.
2321
                     •      The second flow matches ip && ip4.dst == B &&
                            inport == GW && flags.loopback == 1 &&
                            flags.use_snat_zone == 1 or ip && ip6.dst == B &&
                            inport == GW && flags.loopback == 1 &&
                            flags.use_snat_zone == 1, where GW is the
                            distributed gateway port specified in the NAT
                            rule, with an action ct_snat; to unSNAT in the
                            snat zone. If the NAT rule is of type
                            dnat_and_snat and has stateless=true in the
                            options, then the action would be ip4/6.dst=(B).
2332
                            If the NAT entry is of type snat, then there is
                            an additional match is_chassis_resident(cr-GW),
                            where cr-GW is the chassis resident port of GW.
2336
2337                     A priority-0 logical flow with match 1 has actions next;.
2338
2339     Ingress Table 5: DEFRAG
2340
       This table sends packets to the connection tracker for tracking and
       defragmentation. It contains a priority-0 flow that simply moves
       traffic to the next table.
2344
2345       If  load  balancing rules with only virtual IP addresses are configured
2346       in OVN_Northbound database for a Gateway router, a priority-100 flow is
2347       added  for  each  configured  virtual IP address VIP. For IPv4 VIPs the
2348       flow matches ip && ip4.dst == VIP. For IPv6 VIPs, the flow  matches  ip
2349       && ip6.dst == VIP. The flow applies the action reg0 = VIP; ct_dnat; (or
2350       xxreg0 for IPv6) to send IP  packets  to  the  connection  tracker  for
2351       packet  de-fragmentation and to dnat the destination IP for the commit‐
2352       ted connection before sending it to the next table.
2353
2354       If load balancing rules with virtual IP addresses and ports are config‐
2355       ured  in  OVN_Northbound  database for a Gateway router, a priority-110
2356       flow is added for each configured  virtual  IP  address  VIP,  protocol
2357       PROTO  and  port  PORT. For IPv4 VIPs the flow matches ip && ip4.dst ==
2358       VIP && PROTO && PROTO.dst == PORT. For IPv6 VIPs, the flow  matches  ip
2359       &&  ip6.dst  == VIP && PROTO && PROTO.dst == PORT. The flow applies the
2360       action reg0 = VIP; reg9[16..31] = PROTO.dst; ct_dnat;  (or  xxreg0  for
2361       IPv6)  to send IP packets to the connection tracker for packet de-frag‐
2362       mentation and to dnat the destination IP for the  committed  connection
2363       before sending it to the next table.
2364
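       As a sketch, the per-VIP flow construction described above can be
       expressed as follows. This is illustrative Python only; defrag_flow
       and its signature are hypothetical helpers, not part of ovn-northd:

```python
import ipaddress

def defrag_flow(vip, proto=None, port=None):
    """Sketch of one DEFRAG-stage flow for a load-balancer VIP.

    Hypothetical helper: returns (priority, match, actions) strings in
    the shapes described by the manual, not ovn-northd's real code."""
    addr = ipaddress.ip_address(vip)
    ip_match = "ip4.dst" if addr.version == 4 else "ip6.dst"
    reg = "reg0" if addr.version == 4 else "xxreg0"
    if proto and port is not None:
        # VIP with protocol and L4 port: priority 110, also save the
        # L4 destination port in reg9[16..31].
        match = f"ip && {ip_match} == {vip} && {proto} && {proto}.dst == {port}"
        actions = f"{reg} = {vip}; reg9[16..31] = {proto}.dst; ct_dnat;"
        return 110, match, actions
    # VIP-only rule: priority 100, just save the VIP and ct_dnat.
    match = f"ip && {ip_match} == {vip}"
    actions = f"{reg} = {vip}; ct_dnat;"
    return 100, match, actions
```

       For example, an IPv6 VIP yields an xxreg0 assignment while an IPv4
       VIP uses reg0, matching the register choice described above.
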
2365       If  ECMP  routes  with symmetric reply are configured in the OVN_North‐
2366       bound database for a gateway router, a priority-100 flow is  added  for
2367       each  router port on which symmetric replies are configured. The match‐
2368       ing logic for these ports essentially reverses the configured logic  of
2369       the  ECMP  route.  So  for instance, a route with a destination routing
2370       policy will instead match if the source IP address matches  the  static
2371       route’s  prefix. The flow uses the action ct_next to send IP packets to
2372       the connection tracker for packet de-fragmentation and tracking  before
2373       sending it to the next table.
2374
2375     Ingress Table 6: DNAT
2376
       Packets enter the pipeline with a destination IP address that needs
       to be DNATted from a virtual IP address to a real IP address. Packets
       in the reverse direction need to be unDNATted.
2380
2381       Ingress Table 6: Load balancing DNAT rules
2382
       The following load balancing DNAT flows are added for a Gateway
       router or a router with a gateway port. These flows are programmed
       only on the gateway chassis. These flows are not programmed for load
       balancers with IPv6 VIPs.
2387
              •      If controller_event has been enabled for all the
                     configured load balancing rules without configured
                     backends for a Gateway router or router with a gateway
                     port in the OVN_Northbound database, a priority-130
                     flow is added to trigger ovn-controller events whenever
                     the chassis receives a packet for that particular VIP.
                     If the event-elb meter has been previously created, it
                     will be associated with the empty_lb logical flow.
2396
2397              •      For all the configured load balancing rules for a Gateway
2398                     router or Router  with  gateway  port  in  OVN_Northbound
                     database that includes an L4 port PORT of protocol P and
2400                     IPv4 or  IPv6  address  VIP,  a  priority-120  flow  that
2401                     matches  on  ct.new  &&  ip  &&  reg0  ==  VIP  &&  P  &&
2402                     reg9[16..31] ==  PORT (xxreg0 == VIP in  the  IPv6  case)
2403                     with  an  action of ct_lb_mark(args), where args contains
2404                     comma separated IPv4 or IPv6 addresses (and optional port
2405                     numbers)  to load balance to. If the router is configured
2406                     to force SNAT any load-balanced packets, the above action
2407                     will   be   replaced   by  flags.force_snat_for_lb  =  1;
2408                     ct_lb_mark(args);. If the load balancing rule is  config‐
2409                     ured with skip_snat set to true, the above action will be
2410                     replaced     by     flags.skip_snat_for_lb      =      1;
2411                     ct_lb_mark(args);.  If health check is enabled, then args
2412                     will only contain those endpoints whose  service  monitor
2413                     status  entry  in  OVN_Southbound  db is either online or
2414                     empty.
2415
2416                     The previous table lr_in_defrag sets  the  register  reg0
2417                     (or  xxreg0  for IPv6) and does ct_dnat. Hence for estab‐
2418                     lished traffic, this table just advances  the  packet  to
2419                     the next stage.
2420
2421              •      For  all the configured load balancing rules for a router
                     in the OVN_Northbound database that includes an L4 port PORT
2423                     of  protocol  P  and  IPv4  or IPv6 address VIP, a prior‐
2424                     ity-120 flow that matches on ct.est && ip4 && reg0 == VIP
2425                     &&  P  && reg9[16..31] ==  PORT (ip6 and xxreg0 == VIP in
2426                     the IPv6 case) with an action of next;. If the router  is
2427                     configured  to  force SNAT any load-balanced packets, the
2428                     above action will be replaced by  flags.force_snat_for_lb
2429                     = 1; next;. If the load balancing rule is configured with
2430                     skip_snat set to true, the above action will be  replaced
2431                     by flags.skip_snat_for_lb = 1; next;.
2432
2433                     The  previous  table  lr_in_defrag sets the register reg0
2434                     (or xxreg0 for IPv6) and does ct_dnat. Hence  for  estab‐
2435                     lished  traffic,  this  table just advances the packet to
2436                     the next stage.
2437
2438              •      For all the configured load balancing rules for a  router
2439                     in  OVN_Northbound  database that includes just an IP ad‐
2440                     dress VIP to match on, a priority-110 flow  that  matches
2441                     on ct.new && ip4 && reg0 == VIP (ip6 and xxreg0 == VIP in
2442                     the IPv6 case) with an action of ct_lb_mark(args),  where
2443                     args  contains comma separated IPv4 or IPv6 addresses. If
2444                     the router is configured to force SNAT any  load-balanced
2445                     packets,   the   above   action   will   be  replaced  by
2446                     flags.force_snat_for_lb = 1;  ct_lb_mark(args);.  If  the
2447                     load  balancing  rule is configured with skip_snat set to
2448                     true,   the   above   action   will   be   replaced    by
2449                     flags.skip_snat_for_lb = 1; ct_lb_mark(args);.
2450
2451                     The  previous  table  lr_in_defrag sets the register reg0
2452                     (or xxreg0 for IPv6) and does ct_dnat. Hence  for  estab‐
2453                     lished  traffic,  this  table just advances the packet to
2454                     the next stage.
2455
2456              •      For all the configured load balancing rules for a  router
2457                     in  OVN_Northbound  database that includes just an IP ad‐
2458                     dress VIP to match on, a priority-110 flow  that  matches
2459                     on  ct.est  &&  ip4  && reg0 == VIP (or ip6 and xxreg0 ==
2460                     VIP) with an action of next;. If the router is configured
2461                     to force SNAT any load-balanced packets, the above action
2462                     will be replaced by flags.force_snat_for_lb =  1;  next;.
2463                     If  the  load balancing rule is configured with skip_snat
2464                     set to  true,  the  above  action  will  be  replaced  by
2465                     flags.skip_snat_for_lb = 1; next;.
2466
2467                     The  previous  table  lr_in_defrag sets the register reg0
2468                     (or xxreg0 for IPv6) and does ct_dnat. Hence  for  estab‐
2469                     lished  traffic,  this  table just advances the packet to
2470                     the next stage.
2471
              •      If the load balancer is created with the --reject
                     option and it has no active backends, a TCP reset
                     segment (for tcp) or an ICMP port unreachable packet
                     (for all other kinds of traffic) will be sent whenever
                     an incoming packet is received for this load balancer.
                     Please note that using the --reject option disables the
                     empty_lb SB controller event for this load balancer.
2479
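       As a sketch, the load-balancer action selection described in the
       bullets above (filtering by health status, then adding the SNAT
       flags) can be written as follows. The function and argument names
       are hypothetical, for illustration only:

```python
def lb_action(backends, health=None, force_snat=False, skip_snat=False):
    """Sketch of the DNAT-stage load-balancer action string.

    If health checks are enabled, keep only endpoints whose service
    monitor status is online or unset, as described above. This is an
    illustration of the documented behavior, not ovn-northd's code."""
    if health is not None:
        backends = [b for b in backends if health.get(b) in (None, "online")]
    # args: comma-separated backend addresses (and optional ports).
    action = f"ct_lb_mark({','.join(backends)});"
    if force_snat:
        action = "flags.force_snat_for_lb = 1; " + action
    elif skip_snat:
        action = "flags.skip_snat_for_lb = 1; " + action
    return action
```

       An offline backend is dropped from the ct_lb_mark argument list,
       while backends with no service-monitor entry are kept.
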
2480       Ingress Table 6: DNAT on Gateway Routers
2481
              •      For each configuration in the OVN Northbound database
                     that asks to change the destination IP address of a
                     packet from A to B, a priority-100 flow matches ip &&
                     ip4.dst == A or ip && ip6.dst == A with an action
                     flags.loopback = 1; ct_dnat(B);. If the Gateway router
                     is configured to force SNAT any DNATed packet, the
                     above action will be replaced by
                     flags.force_snat_for_dnat = 1; flags.loopback = 1;
                     ct_dnat(B);. If the NAT rule is of type dnat_and_snat
                     and has stateless=true in the options, then the action
                     would be ip4/6.dst=(B).
2492
                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.src ==
                     allowed_ext_ips. Similarly, for IPv6, the match would
                     be ip6.src == allowed_ext_ips.
2497
                     If the NAT rule has exempted_ext_ips set, then there is
                     an additional flow configured at priority 101. The flow
                     matches if the source IP is an exempted_ext_ip and the
                     action is next;. This flow is used to bypass the
                     ct_dnat action for a packet originating from
                     exempted_ext_ips.
2503
2504              •      A priority-0 logical flow with match 1 has actions next;.
2505
2506       Ingress Table 6: DNAT on Distributed Routers
2507
2508       On distributed routers, the DNAT table only handles packets with desti‐
2509       nation IP address that needs to be DNATted from a virtual IP address to
2510       a real IP address. The unDNAT processing in the  reverse  direction  is
2511       handled in a separate table in the egress pipeline.
2512
              •      For each configuration in the OVN Northbound database
                     that asks to change the destination IP address of a
                     packet from A to B, a priority-100 flow matches ip &&
2516                     ip4.dst == B && inport == GW, where  GW  is  the  logical
2517                     router  gateway port configured for the NAT rule, with an
2518                     action ct_dnat(B);. The match will include ip6.dst  ==  B
2519                     in   the   IPv6   case.  If  the  NAT  rule  is  of  type
2520                     dnat_and_snat and has stateless=true in the options, then
2521                     the action would be ip4/6.dst=(B).
2522
2523                     If  the  NAT rule cannot be handled in a distributed man‐
2524                     ner, then the priority-100 flow above is only  programmed
2525                     on the gateway chassis.
2526
                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.src ==
                     allowed_ext_ips. Similarly, for IPv6, the match would
                     be ip6.src == allowed_ext_ips.
2531
                     If the NAT rule has exempted_ext_ips set, then there is
                     an additional flow configured at priority 101. The flow
                     matches if the source IP is an exempted_ext_ip and the
                     action is next;. This flow is used to bypass the
                     ct_dnat action for a packet originating from
                     exempted_ext_ips.
2537
2538                     A priority-0 logical flow with match 1 has actions next;.
2539
2540     Ingress Table 7: ECMP symmetric reply processing
2541
2542              •      If ECMP routes with symmetric reply are configured in the
2543                     OVN_Northbound  database  for  a gateway router, a prior‐
2544                     ity-100 flow is added for each router port on which  sym‐
2545                     metric  replies  are  configured.  The matching logic for
2546                     these ports essentially reverses the configured logic  of
2547                     the  ECMP route. So for instance, a route with a destina‐
2548                     tion routing policy will instead match if the  source  IP
2549                     address  matches the static route’s prefix. The flow uses
                     the action ct_commit { ct_label.ecmp_reply_eth =
                     eth.src; ct_mark.ecmp_reply_port = K; }; next; to
                     commit the connection and store eth.src and the ECMP
                     reply port binding tunnel key K in the conntrack entry.
2554
2555     Ingress Table 8: IPv6 ND RA option processing
2556
2557              •      A  priority-50  logical  flow  is  added for each logical
2558                     router port configured with  IPv6  ND  RA  options  which
2559                     matches  IPv6  ND  Router Solicitation packet and applies
2560                     the action put_nd_ra_opts and advances the packet to  the
2561                     next table.
2562
                     reg0[5] = put_nd_ra_opts(options); next;
2564
2565
2566                     For a valid IPv6 ND RS packet, this transforms the packet
2567                     into an IPv6 ND RA reply and sets the RA options  to  the
2568                     packet  and  stores  1  into  reg0[5]. For other kinds of
2569                     packets, it just stores 0 into reg0[5].  Either  way,  it
2570                     continues to the next table.
2571
2572              •      A priority-0 logical flow with match 1 has actions next;.
2573
2574     Ingress Table 9: IPv6 ND RA responder
2575
2576       This  table  implements IPv6 ND RA responder for the IPv6 ND RA replies
2577       generated by the previous table.
2578
2579              •      A priority-50 logical flow  is  added  for  each  logical
2580                     router  port  configured  with  IPv6  ND RA options which
2581                     matches IPv6 ND RA packets and reg0[5] == 1 and  responds
2582                     back  to  the  inport  after  applying  these actions. If
2583                     reg0[5]  is  set  to  1,  it  means   that   the   action
2584                     put_nd_ra_opts was successful.
2585
2586                     eth.dst = eth.src;
2587                     eth.src = E;
2588                     ip6.dst = ip6.src;
2589                     ip6.src = I;
2590                     outport = P;
2591                     flags.loopback = 1;
2592                     output;
2593
2594
2595                     where  E  is the MAC address and I is the IPv6 link local
2596                     address of the logical router port.
2597
2598                     (This terminates packet processing in  ingress  pipeline;
2599                     the packet does not go to the next ingress table.)
2600
2601              •      A priority-0 logical flow with match 1 has actions next;.
2602
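       As a sketch, the address swap performed by the RA responder above can
       be written as follows. The packet dictionary, field names as
       dictionary keys, and the function itself are illustrative only:

```python
def nd_ra_respond(pkt, lrp_mac, lrp_ll_addr, lrp_name):
    """Sketch of the RA responder rewrite described above.

    Swaps L2/L3 addresses so the RA reply goes back out the inport,
    using the router port's MAC (E) and link-local address (I).
    Illustrative model only, not ovn-northd's implementation."""
    return {
        "eth.dst": pkt["eth.src"],
        "eth.src": lrp_mac,          # E: router port Ethernet address
        "ip6.dst": pkt["ip6.src"],
        "ip6.src": lrp_ll_addr,      # I: router port link-local address
        "outport": lrp_name,         # P: reply sent back out the inport
        "flags.loopback": 1,
    }
```
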
2603     Ingress Table 10: IP Routing Pre
2604
       If a packet arrived at this table from a logical router port P that
       has an options:route_table value set, a priority-100 logical flow
       with match inport == "P" sets a unique, non-zero, per-datapath 32-bit
       value in OVS register 7. This register’s value is checked in the next
       table. If the packet did not match any configured inport (the <main>
       route table), register 7 is set to 0.
2611
2612       This table contains the following logical flows:
2613
              •      A priority-100 flow with match inport == "LRP_NAME" and
                     an action that sets the route table identifier in reg7.

                     A priority-0 logical flow with match 1 has actions
                     reg7 = 0; next;.
2619
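       A minimal sketch of the id assignment described above: each named
       route table gets a unique non-zero reg7 value, while the <main>
       table (no route table set) keeps reg7 == 0. The function is
       illustrative; ovn-northd's actual allocation logic differs in detail:

```python
import itertools

def route_table_ids(route_tables):
    """Assign each named route table a unique non-zero id for reg7.

    Sorting makes the assignment deterministic for this sketch; the
    <main> table is implicitly id 0 and is not in the mapping."""
    counter = itertools.count(1)
    return {name: next(counter) for name in sorted(route_tables)}
```
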
2620     Ingress Table 11: IP Routing
2621
2622       A packet that arrives at this table is an  IP  packet  that  should  be
2623       routed  to  the address in ip4.dst or ip6.dst. This table implements IP
2624       routing, setting reg0 (or xxreg0 for IPv6) to the next-hop  IP  address
2625       (leaving ip4.dst or ip6.dst, the packet’s final destination, unchanged)
2626       and advances to the next table for ARP resolution. It  also  sets  reg1
2627       (or  xxreg1)  to  the  IP  address  owned  by  the selected router port
2628       (ingress table ARP Request will generate an  ARP  request,  if  needed,
2629       with  reg0 as the target protocol address and reg1 as the source proto‐
2630       col address).
2631
       For ECMP routes, i.e. multiple static routes with the same policy
       and prefix but different nexthops, the above actions are deferred to
       the next table. This table, instead, is responsible for determining
       the ECMP group id and selecting a member id within the group based
       on 5-tuple hashing. It stores the group id in reg8[0..15] and the
       member id in reg8[16..31]. This step is skipped with a
       priority-10300 rule if the traffic going out the ECMP route is reply
       traffic and the ECMP route was configured to use symmetric replies.
       Instead, the values stored in conntrack are used to choose the
       destination. The ct_label.ecmp_reply_eth field tells the destination
       MAC address to which the packet should be sent. The
       ct_mark.ecmp_reply_port field tells the logical router port on which
       the packet should be sent. These values are saved to the conntrack
       fields when the initial ingress traffic is received over the ECMP
       route and committed to conntrack. The priority-10300 flows in this
       stage set the outport, while the eth.dst is set by flows at the
       ARP/ND Resolution stage.
2647
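       As a sketch, member selection and the reg8 packing described above
       can be modeled as follows. Note that OVS computes the hash itself
       via the select action; the hash function here is purely for
       illustration, and the helper is hypothetical:

```python
import hashlib

def ecmp_select(group_id, n_members, five_tuple):
    """Sketch: hash the 5-tuple, pick a member id in 1..n_members, and
    pack the ids as reg8[0..15] (group) and reg8[16..31] (member).

    Illustration only: real selection uses OVS's select/dp_hash
    mechanism, not SHA-256."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    member_id = int.from_bytes(digest[:4], "big") % n_members + 1
    # Low 16 bits: group id; next 16 bits: member id.
    return (member_id << 16) | group_id
```

       Because the member id is derived from the 5-tuple, all packets of
       one flow consistently pick the same member.
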
2648       This table contains the following logical flows:
2649
2650              •      Priority-10550 flow  that  drops  IPv6  Router  Solicita‐
2651                     tion/Advertisement  packets  that  were  not processed in
2652                     previous tables.
2653
2654              •      Priority-10550 flows that drop IGMP and MLD packets  with
2655                     source MAC address owned by the router. These are used to
2656                     prevent looping statically forwarded IGMP and MLD packets
2657                     for which TTL is not decremented (it is always 1).
2658
2659              •      Priority-10500 flows that match IP multicast traffic des‐
2660                     tined  to  groups  registered  on  any  of  the  attached
2661                     switches  and  sets  outport  to the associated multicast
2662                     group that will eventually flood the traffic to  all  in‐
2663                     terested attached logical switches. The flows also decre‐
2664                     ment TTL.
2665
2666              •      Priority-10460 flows that  match  IGMP  and  MLD  control
2667                     packets,  set  outport  to the MC_STATIC multicast group,
2668                     which ovn-northd populates with the  logical  ports  that
                     have options:mcast_flood=’true’. If no router ports are
2670                     configured to flood multicast  traffic  the  packets  are
2671                     dropped.
2672
2673              •      Priority-10450  flow  that matches unregistered IP multi‐
2674                     cast traffic decrements  TTL  and  sets  outport  to  the
2675                     MC_STATIC  multicast  group,  which  ovn-northd populates
                     with the logical ports that have
                     options:mcast_flood=’true’. If no router ports are
                     configured to
2678                     flood multicast traffic the packets are dropped.
2679
2680              •      IPv4 routing table. For each route to IPv4 network N with
2681                     netmask  M, on router port P with IP address A and Ether‐
2682                     net address E, a logical flow with match ip4.dst ==  N/M,
2683                     whose priority is the number of 1-bits in M, has the fol‐
2684                     lowing actions:
2685
2686                     ip.ttl--;
2687                     reg8[0..15] = 0;
2688                     reg0 = G;
2689                     reg1 = A;
2690                     eth.src = E;
2691                     outport = P;
2692                     flags.loopback = 1;
2693                     next;
2694
2695
2696                     (Ingress table 1 already verified that ip.ttl--; will not
2697                     yield a TTL exceeded error.)
2698
2699                     If  the route has a gateway, G is the gateway IP address.
2700                     Instead, if the route is from a configured static  route,
2701                     G is the next hop IP address. Else it is ip4.dst.
2702
2703              •      IPv6 routing table. For each route to IPv6 network N with
2704                     netmask M, on router port P with IP address A and  Ether‐
2705                     net address E, a logical flow with match in CIDR notation
2706                     ip6.dst == N/M, whose priority is the integer value of M,
2707                     has the following actions:
2708
2709                     ip.ttl--;
2710                     reg8[0..15] = 0;
2711                     xxreg0 = G;
2712                     xxreg1 = A;
2713                     eth.src = E;
                     outport = P;
2715                     flags.loopback = 1;
2716                     next;
2717
2718
2719                     (Ingress table 1 already verified that ip.ttl--; will not
2720                     yield a TTL exceeded error.)
2721
2722                     If the route has a gateway, G is the gateway IP  address.
2723                     Instead,  if the route is from a configured static route,
2724                     G is the next hop IP address. Else it is ip6.dst.
2725
2726                     If the address A is in the link-local  scope,  the  route
2727                     will be limited to sending on the ingress port.
2728
                     For each static route, reg7 == id && is prefixed to
                     the logical flow match. For routes with a route_table
                     value set, a unique non-zero id is used. For routes
                     within the <main> route table (no route table set),
                     this id value is 0.
2734
                     For each connected route (a route to the LRP’s subnet
                     CIDR), the logical flow match has no reg7 == id &&
                     prefix, so that routes to the LRP’s subnets are present
                     in all routing tables.
2738
              •      ECMP routes are grouped by policy and prefix. A unique
                     non-zero id is assigned to each group, and each member
                     is also assigned a unique non-zero id within its group.
2743
2744                     For each IPv4/IPv6 ECMP group with group id GID and  mem‐
2745                     ber  ids  MID1,  MID2,  ..., a logical flow with match in
2746                     CIDR notation ip4.dst == N/M, or ip6.dst  ==  N/M,  whose
2747                     priority is the integer value of M, has the following ac‐
2748                     tions:
2749
2750                     ip.ttl--;
2751                     flags.loopback = 1;
2752                     reg8[0..15] = GID;
2753                     select(reg8[16..31], MID1, MID2, ...);
2754
2755
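       The longest-prefix-match encoding used by the IPv4 routing table
       above (flow priority equals the number of 1-bits in the netmask M)
       can be sketched as follows. The function and its return shape are
       illustrative, not ovn-northd internals:

```python
import ipaddress

def route_flow(network, port, port_ip, port_mac, gateway=None):
    """Sketch of one IPv4 routing flow from the description above.

    Returns (priority, match, actions). Priority is the prefix length,
    so more specific routes win; G falls back to ip4.dst for connected
    routes, per the text. Illustration only."""
    net = ipaddress.ip_network(network)
    priority = net.prefixlen            # number of 1-bits in netmask M
    g = gateway if gateway else "ip4.dst"
    match = f"ip4.dst == {net}"
    actions = (f"ip.ttl--; reg8[0..15] = 0; reg0 = {g}; "
               f"reg1 = {port_ip}; eth.src = {port_mac}; "
               f"outport = {port}; flags.loopback = 1; next;")
    return priority, match, actions
```

       A /24 route thus gets priority 24 and a default route priority 0,
       so any more specific match takes precedence.
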
2756     Ingress Table 12: IP_ROUTING_ECMP
2757
2758       This table implements the second part of IP  routing  for  ECMP  routes
       following the previous table. If a packet matched an ECMP group in the
2760       previous table, this table matches the group id and  member  id  stored
2761       from the previous table, setting reg0 (or xxreg0 for IPv6) to the next-
2762       hop IP address (leaving ip4.dst or ip6.dst, the packet’s final destina‐
2763       tion,  unchanged) and advances to the next table for ARP resolution. It
2764       also sets reg1 (or xxreg1) to the IP  address  owned  by  the  selected
2765       router port (ingress table ARP Request will generate an ARP request, if
2766       needed, with reg0 as the target protocol address and reg1 as the source
2767       protocol address).
2768
2769       This  processing is skipped for reply traffic being sent out of an ECMP
2770       route if the route was configured to use symmetric replies.
2771
2772       This table contains the following logical flows:
2773
              •      A priority-150 flow that matches reg8[0..15] == 0,
                     with action next;, so that packets of non-ECMP routes
                     bypass this table.
2777
2778              •      For each member with ID MID in each ECMP  group  with  ID
2779                     GID, a priority-100 flow with match reg8[0..15] == GID &&
2780                     reg8[16..31] == MID has following actions:
2781
2782                     [xx]reg0 = G;
2783                     [xx]reg1 = A;
2784                     eth.src = E;
2785                     outport = P;
2786
2787
2788     Ingress Table 13: Router policies
2789
2790       This table adds flows for the logical router policies configured on the
2791       logical   router.   Please   see   the  OVN_Northbound  database  Logi‐
2792       cal_Router_Policy table documentation in ovn-nb for supported actions.
2793
2794              •      For each router policy configured on the logical  router,
2795                     a  logical  flow  is added with specified priority, match
2796                     and actions.
2797
2798              •      If the policy action is reroute with 2 or  more  nexthops
2799                     defined,  then the logical flow is added with the follow‐
2800                     ing actions:
2801
                     reg8[0..15] = GID;
                     select(reg8[16..31], 1, ..., n);
2804
2805
                     where GID is the ECMP group id generated by ovn-northd
                     for this policy and n is the number of nexthops. The
                     select action selects one of the nexthop member ids,
                     stores it in register reg8[16..31], and advances the
                     packet to the next stage.
2811
              •      If the policy action is reroute with just one nexthop,
2813                     then  the  logical  flow  is added with the following ac‐
2814                     tions:
2815
2816                     [xx]reg0 = H;
2817                     eth.src = E;
2818                     outport = P;
2819                     reg8[0..15] = 0;
2820                     flags.loopback = 1;
2821                     next;
2822
2823
                     where H is the nexthop defined in the router policy, E
                     is the Ethernet address of the logical router port from
                     which the nexthop is reachable, and P is the logical
                     router port from which the nexthop is reachable.
2828
2829              •      If  a  router policy has the option pkt_mark=m set and if
2830                     the action is not drop, then  the  action  also  includes
2831                     pkt.mark = m to mark the packet with the marker m.

     Ingress Table 14: ECMP handling for router policies

       This table handles the ECMP for the router policies configured with
       multiple nexthops.

              •      A priority-150 flow is added to advance the packet to
                     the next stage if the ECMP group id register
                     reg8[0..15] is 0.

              •      For each ECMP reroute router policy with multiple
                     nexthops, a priority-100 flow is added for each nexthop
                     H with the match reg8[0..15] == GID && reg8[16..31] ==
                     M, where GID is the router policy group id generated by
                     ovn-northd and M is the member id of the nexthop H
                     generated by ovn-northd. The following actions are
                     added to the flow:

                     [xx]reg0 = H;
                     eth.src = E;
                     outport = P;
                     flags.loopback = 1;
                     next;

                     where H is the nexthop defined in the router policy, E
                     is the Ethernet address of the logical router port from
                     which the nexthop is reachable and P is that logical
                     router port.
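
       The member id selection in the previous stage is conceptually a hash
       over packet headers reduced modulo the ECMP group size, so that a
       given connection always maps to the same nexthop. A minimal sketch
       of that idea (the hash inputs and function name are illustrative,
       not ovn-northd's actual implementation):

       ```python
       import hashlib

       def select_ecmp_member(src_ip, dst_ip, proto, sport, dport, n_members):
           """Hash the 5-tuple and map it onto one of n_members member ids.

           Member ids are 1-based here, mirroring how each nexthop in an
           ECMP group gets a distinct id matched against reg8[16..31]."""
           key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
           digest = hashlib.sha256(key).digest()
           return int.from_bytes(digest[:4], "big") % n_members + 1
       ```

       Because the hash is over the 5-tuple, repeated packets of one flow
       pick the same member, which is what makes the per-member
       priority-100 flows in this table stable for a connection.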

     Ingress Table 15: ARP/ND Resolution

       Any packet that reaches this table is an IP packet whose next-hop
       IPv4 address is in reg0 or IPv6 address is in xxreg0. (ip4.dst or
       ip6.dst contains the final destination.) This table resolves the IP
       address in reg0 (or xxreg0) into an output port in outport and an
       Ethernet address in eth.dst, using the following flows:

              •      A priority-500 flow that matches IP multicast traffic
                     that was allowed in the routing pipeline. For this
                     kind of traffic the outport was already set so the
                     flow just advances to the next table.

              •      Priority-200 flows that match ECMP reply traffic for
                     the routes configured to use symmetric replies, with
                     actions push(xxreg1); xxreg1 = ct_label; eth.dst =
                     xxreg1[32..79]; pop(xxreg1); next;. xxreg1 is used
                     here to avoid masked access to ct_label, to make the
                     flow HW-offloading friendly.

              •      Static MAC bindings. MAC bindings can be known
                     statically based on data in the OVN_Northbound
                     database. For router ports connected to logical
                     switches, MAC bindings can be known statically from
                     the addresses column in the Logical_Switch_Port table.
                     For router ports connected to other logical routers,
                     MAC bindings can be known statically from the mac and
                     networks columns in the Logical_Router_Port table.
                     (Note: the flow is NOT installed for the IP addresses
                     that belong to a neighbor logical router port if the
                     current router has options:dynamic_neigh_routers set
                     to true.)

                     For each IPv4 address A whose host is known to have
                     Ethernet address E on router port P, a priority-100
                     flow with match outport == P && reg0 == A has actions
                     eth.dst = E; next;.

                     For each virtual IP A configured on a logical port of
                     type virtual whose virtual parent is set in its
                     corresponding Port_Binding record, where the virtual
                     parent has Ethernet address E and the virtual IP is
                     reachable via the router port P, a priority-100 flow
                     with match outport == P && xxreg0/reg0 == A has
                     actions eth.dst = E; next;.

                     For each virtual IP A configured on a logical port of
                     type virtual whose virtual parent is not set in its
                     corresponding Port_Binding record, where the virtual
                     IP A is reachable via the router port P, a
                     priority-100 flow with match outport == P &&
                     xxreg0/reg0 == A has actions eth.dst =
                     00:00:00:00:00:00; next;. This flow is added so that
                     ARP is always resolved for the virtual IP A by
                     generating an ARP request, rather than by consulting
                     the MAC_Binding table, which can hold an incorrect
                     value for the virtual IP A.

                     For each IPv6 address A whose host is known to have
                     Ethernet address E on router port P, a priority-100
                     flow with match outport == P && xxreg0 == A has
                     actions eth.dst = E; next;.

                     For each logical router port with an IPv4 address A
                     and a MAC address of E that is reachable via a
                     different logical router port P, a priority-100 flow
                     with match outport == P && reg0 == A has actions
                     eth.dst = E; next;.

                     For each logical router port with an IPv6 address A
                     and a MAC address of E that is reachable via a
                     different logical router port P, a priority-100 flow
                     with match outport == P && xxreg0 == A has actions
                     eth.dst = E; next;.

              •      Static MAC bindings from NAT entries. MAC bindings
                     can also be known for the entries in the NAT table.
                     The flows below are programmed for distributed logical
                     routers, i.e. routers with a distributed router port.

                     For each row in the NAT table with IPv4 address A in
                     the external_ip column, a priority-100 flow with the
                     match outport == P && reg0 == A has actions eth.dst =
                     E; next;, where P is the distributed logical router
                     port and E is the Ethernet address set in the
                     external_mac column of the NAT table for rules of
                     type dnat_and_snat, otherwise the Ethernet address of
                     the distributed logical router port. Note that if the
                     external_ip is not within a subnet on the owning
                     logical router, then OVN will only create ARP
                     resolution flows if options:add_route is set to true.
                     Otherwise, no ARP resolution flows will be added.

                     For IPv6 NAT entries, the same flows are added, but
                     using the register xxreg0 for the match.

              •      Traffic whose IP destination is an address owned by
                     the router should be dropped. Such traffic is
                     normally dropped in the ingress table IP Input, except
                     for IPs that are also shared with SNAT rules. However,
                     if no unSNAT operation has succeeded by this point in
                     the pipeline and the destination IP of the packet is
                     still a router-owned IP, the packet can be safely
                     dropped.

                     A priority-1 logical flow with match ip4.dst = {..}
                     matches on traffic destined to router-owned IPv4
                     addresses which are also SNAT IPs. This flow has
                     action drop;.

                     A priority-1 logical flow with match ip6.dst = {..}
                     matches on traffic destined to router-owned IPv6
                     addresses which are also SNAT IPs. This flow has
                     action drop;.

              •      Dynamic MAC bindings. These flows resolve IP-to-MAC
                     bindings that have become known dynamically through
                     ARP or neighbor discovery. (The ingress table ARP
                     Request will issue an ARP or neighbor solicitation
                     request for cases where the binding is not yet known.)

                     A priority-0 logical flow with match ip4 has actions
                     get_arp(outport, reg0); next;.

                     A priority-0 logical flow with match ip6 has actions
                     get_nd(outport, xxreg0); next;.

              •      For a distributed gateway LRP with redirect-type set
                     to bridged, a priority-50 flow matches outport ==
                     "ROUTER_PORT" &&
                     !is_chassis_resident("cr-ROUTER_PORT") and has
                     actions eth.dst = E; next;, where E is the Ethernet
                     address of the logical router port.
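
       The static-binding flows above all share one shape: match the
       resolved output port plus the next-hop register, set eth.dst, and
       continue. A small sketch of that flow construction (the helper name
       is illustrative; only the match/action syntax comes from this
       section):

       ```python
       def static_arp_resolution_flow(port, addr, mac, ipv6=False):
           """Build the (match, actions) pair for one static MAC binding,
           mirroring the priority-100 flows described above. The next-hop
           register is reg0 for IPv4 and xxreg0 for IPv6."""
           reg = "xxreg0" if ipv6 else "reg0"
           match = f'outport == "{port}" && {reg} == {addr}'
           actions = f"eth.dst = {mac}; next;"
           return match, actions
       ```

       The unresolved-virtual-IP case in this table is the same shape with
       mac set to 00:00:00:00:00:00, which is what later triggers the ARP
       Request table.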

     Ingress Table 16: Check packet length

       For distributed logical routers or gateway routers with a gateway
       port whose options:gateway_mtu is set to a valid integer value, this
       table adds a priority-50 logical flow with the match outport ==
       GW_PORT, where GW_PORT is the gateway router port, and applies the
       check_pkt_larger action, advancing the packet to the next table:

       REGBIT_PKT_LARGER = check_pkt_larger(L); next;

       where L is the packet length to check for. If the packet is larger
       than L, it stores 1 in the register bit REGBIT_PKT_LARGER. The value
       of L is taken from the options:gateway_mtu column of the
       Logical_Router_Port row.

       If the port is also configured with options:gateway_mtu_bypass then
       another flow is added, with priority 55, to bypass the
       check_pkt_larger flow.

       This table adds one priority-0 fallback flow that matches all
       packets and advances to the next table.

     Ingress Table 17: Handle larger packets

       For distributed logical routers or gateway routers with a gateway
       port whose options:gateway_mtu is set to a valid integer value, this
       table adds the following priority-150 logical flow for each logical
       router port with the match inport == LRP && outport == GW_PORT &&
       REGBIT_PKT_LARGER && !REGBIT_EGRESS_LOOPBACK, where LRP is the
       logical router port and GW_PORT is the gateway port, and applies the
       following action for IPv4 and IPv6 respectively:

       icmp4 {
           icmp4.type = 3; /* Destination Unreachable. */
           icmp4.code = 4;  /* Frag Needed and DF was Set. */
           icmp4.frag_mtu = M;
           eth.dst = E;
           ip4.dst = ip4.src;
           ip4.src = I;
           ip.ttl = 255;
           REGBIT_EGRESS_LOOPBACK = 1;
           REGBIT_PKT_LARGER = 0;
           next(pipeline=ingress, table=0);
       };
       icmp6 {
           icmp6.type = 2;
           icmp6.code = 0;
           icmp6.frag_mtu = M;
           eth.dst = E;
           ip6.dst = ip6.src;
           ip6.src = I;
           ip.ttl = 255;
           REGBIT_EGRESS_LOOPBACK = 1;
           REGBIT_PKT_LARGER = 0;
           next(pipeline=ingress, table=0);
       };

       where M is the fragment MTU, i.e. the value taken from the
       options:gateway_mtu column of the Logical_Router_Port row minus 58;
       E is the Ethernet address of the logical router port; and I is the
       IPv4/IPv6 address of the logical router port.

       This table adds one priority-0 fallback flow that matches all
       packets and advances to the next table.
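
       The relationship between the two tables can be summarized in a
       short sketch: table 16 compares the packet length against L (the
       configured gateway MTU), and table 17 advertises M = L - 58 in the
       ICMP error so the sender fragments with room to spare. This only
       restates the arithmetic above; the function names are illustrative:

       ```python
       GATEWAY_MTU_HEADROOM = 58  # subtracted from options:gateway_mtu to get M

       def frag_mtu(gateway_mtu):
           """Fragment MTU M placed in icmp4.frag_mtu / icmp6.frag_mtu."""
           return gateway_mtu - GATEWAY_MTU_HEADROOM

       def pkt_larger(pkt_len, gateway_mtu):
           """Mirrors check_pkt_larger(L): REGBIT_PKT_LARGER is set only
           when the packet exceeds L."""
           return pkt_len > gateway_mtu
       ```

       So with options:gateway_mtu=1500, a 1501-byte packet triggers the
       priority-150 flow in table 17 and the ICMP error advertises an MTU
       of 1442.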

     Ingress Table 18: Gateway Redirect

       For distributed logical routers where one or more of the logical
       router ports specifies a gateway chassis, this table redirects
       certain packets to the distributed gateway port instances on the
       gateway chassis. This table has the following flows:

              •      For each NAT rule in the OVN Northbound database that
                     can be handled in a distributed manner, a priority-100
                     logical flow with match ip4.src == B && outport == GW
                     && is_chassis_resident(P), where GW is the distributed
                     gateway port specified in the NAT rule and P is the
                     NAT logical port. IP traffic matching the above rule
                     will be handled locally, setting reg1 to C and eth.src
                     to D, where C is the NAT external IP and D is the NAT
                     external MAC.

              •      For each NAT rule in the OVN Northbound database that
                     can be handled in a distributed manner, a priority-80
                     logical flow with a drop action if the NAT logical
                     port is a virtual port not yet claimed by any chassis.

              •      A priority-50 logical flow with match outport == GW
                     has actions outport = CR; next;, where GW is the
                     logical router distributed gateway port and CR is the
                     chassisredirect port representing the instance of the
                     logical router distributed gateway port on the gateway
                     chassis.

              •      A priority-0 logical flow with match 1 has actions
                     next;.

     Ingress Table 19: ARP Request

       In the common case where the Ethernet destination has been
       resolved, this table outputs the packet. Otherwise, it composes and
       sends an ARP or IPv6 Neighbor Solicitation request. It holds the
       following flows:

              •      Unknown MAC address. A priority-100 flow for IPv4
                     packets with match eth.dst == 00:00:00:00:00:00 has
                     the following actions:

                     arp {
                         eth.dst = ff:ff:ff:ff:ff:ff;
                         arp.spa = reg1;
                         arp.tpa = reg0;
                         arp.op = 1;  /* ARP request. */
                         output;
                     };

                     Unknown MAC address. For each IPv6 static route
                     associated with the router with the nexthop IP G, a
                     priority-200 flow for IPv6 packets with match eth.dst
                     == 00:00:00:00:00:00 && xxreg0 == G is added with the
                     following actions:

                     nd_ns {
                         eth.dst = E;
                         ip6.dst = I;
                         nd.target = G;
                         output;
                     };

                     where E is the multicast MAC address derived from the
                     gateway IP and I is the solicited-node multicast
                     address corresponding to the target address G.

                     Unknown MAC address. A priority-100 flow for IPv6
                     packets with match eth.dst == 00:00:00:00:00:00 has
                     the following actions:

                     nd_ns {
                         nd.target = xxreg0;
                         output;
                     };

                     (The ingress table IP Routing initialized reg1 with
                     the IP address owned by outport and (xx)reg0 with the
                     next-hop IP address.)

                     The IP packet that triggers the ARP/IPv6 NS request
                     is dropped.

              •      Known MAC address. A priority-0 flow with match 1 has
                     actions output;.
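
       The derivation of I and E from the target address G follows the
       standard IPv6 solicited-node multicast rules (RFC 4291): I is
       ff02::1:ff00:0 with the low 24 bits of G merged in, and E is the
       multicast MAC 33:33: followed by the low 32 bits of I. A sketch:

       ```python
       import ipaddress

       def solicited_node_addrs(target):
           """Return (I, E): the solicited-node multicast IPv6 address and
           the corresponding multicast Ethernet address for target G."""
           g = int(ipaddress.IPv6Address(target))
           base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
           i = ipaddress.IPv6Address(base | (g & 0xFFFFFF))
           low32 = int(i) & 0xFFFFFFFF
           e = "33:33:" + ":".join(
               f"{(low32 >> s) & 0xFF:02x}" for s in (24, 16, 8, 0))
           return str(i), e
       ```

       For example, a nexthop G of 2001:db8::1:2:3 yields I =
       ff02::1:ff02:3 and E = 33:33:ff:02:00:03, which is exactly what the
       priority-200 nd_ns flow above places in ip6.dst and eth.dst.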

     Egress Table 0: Check DNAT local

       This table checks if the packet needs to be DNATed in the router
       ingress table lr_in_dnat after it is SNATed and looped back to the
       ingress pipeline. This check is done only for routers configured
       with distributed gateway ports and NAT entries. This check is done
       so that SNAT and DNAT are done in different zones instead of a
       common zone.

              •      For each NAT rule in the OVN Northbound database on a
                     distributed router, a priority-50 logical flow with
                     match ip4.dst == E && is_chassis_resident(P), where E
                     is the external IP address specified in the NAT rule
                     and GW is the logical router distributed gateway port.
                     For a dnat_and_snat NAT rule, P is the logical port
                     specified in the NAT rule; if the logical_port column
                     of the NAT table is NOT set, then P is the
                     chassisredirect port of GW. The flow has the actions
                     REGBIT_DST_NAT_IP_LOCAL = 1; next;.

              •      A priority-0 logical flow with match 1 has actions
                     REGBIT_DST_NAT_IP_LOCAL = 0; next;.

     Egress Table 1: UNDNAT

       This table handles the reverse traffic of already established
       connections, i.e., DNAT has already been done in the ingress
       pipeline and now the packet has entered the egress pipeline as part
       of a reply. This traffic is unDNATed here.

              •      A priority-0 logical flow with match 1 has actions
                     next;.

     Egress Table 1: UNDNAT on Gateway Routers

              •      For all IP packets, a priority-50 flow with an action
                     flags.loopback = 1; ct_dnat;.

     Egress Table 1: UNDNAT on Distributed Routers

              •      For all the configured load balancing rules for a
                     router with a gateway port in the OVN_Northbound
                     database that include an IPv4 address VIP, for every
                     backend IPv4 address B defined for the VIP a
                     priority-120 flow is programmed on the gateway
                     chassis that matches ip && ip4.src == B && outport ==
                     GW, where GW is the logical router gateway port, with
                     an action ct_dnat_in_czone;. If the backend IPv4
                     address B is also configured with L4 port PORT of
                     protocol P, then the match also includes P.src ==
                     PORT. These flows are not added for load balancers
                     with IPv6 VIPs.

                     If the router is configured to force SNAT any
                     load-balanced packets, the above action will be
                     replaced by flags.force_snat_for_lb = 1; ct_dnat;.

              •      For each configuration in the OVN Northbound database
                     that asks to change the destination IP address of a
                     packet from an IP address of A to B, a priority-100
                     flow matches ip && ip4.src == B && outport == GW,
                     where GW is the logical router gateway port, with an
                     action ct_dnat_in_czone;. If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options,
                     then the action would be ip4/6.src=(B).

                     If the NAT rule cannot be handled in a distributed
                     manner, then the priority-100 flow above is only
                     programmed on the gateway chassis with the action
                     ct_dnat_in_czone.

                     If the NAT rule can be handled in a distributed
                     manner, then there is an additional action eth.src =
                     EA;, where EA is the Ethernet address associated with
                     the IP address A in the NAT rule. This allows
                     upstream MAC learning to point to the correct
                     chassis.
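
       The priority-120 load-balancer UNDNAT match above is reply traffic
       identified by its backend source address, optionally narrowed to
       the backend's L4 port. A sketch of how such a match string is
       assembled (the helper name is illustrative; the match syntax is the
       one quoted above):

       ```python
       def lb_undnat_match(backend_ip, gw_port, proto=None, port=None):
           """Match for the priority-120 UNDNAT flow: reply traffic from
           backend B leaving through gateway port GW, plus P.src == PORT
           when the backend is defined with an L4 port."""
           match = f'ip && ip4.src == {backend_ip} && outport == "{gw_port}"'
           if proto and port:
               match += f" && {proto}.src == {port}"
           return match
       ```

       For a backend 10.0.0.10:8080 behind a TCP VIP, this produces the
       match with the extra tcp.src == 8080 term; without an L4 port the
       narrower term is simply omitted.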

     Egress Table 2: Post UNDNAT

              •      A priority-50 logical flow is added that commits any
                     untracked flows from the previous table lr_out_undnat
                     for Gateway routers. This flow matches on ct.new &&
                     ip with action ct_commit { } ; next;.

              •      A priority-0 logical flow with match 1 has actions
                     next;.

     Egress Table 3: SNAT

       Packets that are configured to be SNATed get their source IP
       address changed based on the configuration in the OVN Northbound
       database.

              •      A priority-120 flow to advance the IPv6 Neighbor
                     Solicitation packet to the next table to skip SNAT.
                     In the case where ovn-controller injects an IPv6
                     Neighbor Solicitation packet (for the nd_ns action)
                     we don't want the packet to go through conntrack.

       Egress Table 3: SNAT on Gateway Routers

              •      If the Gateway router in the OVN Northbound database
                     has been configured to force SNAT a packet (that has
                     been previously DNATted) to B, a priority-100 flow
                     matches flags.force_snat_for_dnat == 1 && ip with an
                     action ct_snat(B);.

              •      If a load balancer configured to skip snat has been
                     applied to the Gateway router pipeline, a
                     priority-120 flow matches flags.skip_snat_for_lb == 1
                     && ip with an action next;.

              •      If the Gateway router in the OVN Northbound database
                     has been configured to force SNAT a packet (that has
                     been previously load-balanced) using the router IP
                     (i.e. options:lb_force_snat_ip=router_ip), then for
                     each logical router port P attached to the Gateway
                     router, a priority-110 flow matches
                     flags.force_snat_for_lb == 1 && outport == P with an
                     action ct_snat(R);, where R is the IP configured on
                     the router port. If R is an IPv4 address then the
                     match will also include ip4 and if it is an IPv6
                     address, then the match will also include ip6.

                     If the logical router port P is configured with
                     multiple IPv4 and multiple IPv6 addresses, only the
                     first IPv4 and first IPv6 address is considered.

              •      If the Gateway router in the OVN Northbound database
                     has been configured to force SNAT a packet (that has
                     been previously load-balanced) to B, a priority-100
                     flow matches flags.force_snat_for_lb == 1 && ip with
                     an action ct_snat(B);.

              •      For each configuration in the OVN Northbound database
                     that asks to change the source IP address of a packet
                     from an IP address of A, or to change the source IP
                     address of a packet that belongs to network A, to B,
                     a flow matches ip && ip4.src == A && (!ct.trk ||
                     !ct.rpl) with an action ct_snat(B);. The priority of
                     the flow is calculated based on the mask of A, with
                     matches having larger masks getting higher
                     priorities. If the NAT rule is of type dnat_and_snat
                     and has stateless=true in the options, then the
                     action would be ip4/6.src=(B).

              •      If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.dst ==
                     allowed_ext_ips. Similarly, for IPv6, the match would
                     be ip6.dst == allowed_ext_ips.

              •      If the NAT rule has exempted_ext_ips set, then there
                     is an additional flow configured at the priority + 1
                     of the corresponding NAT rule. The flow matches if
                     the destination IP is an exempted_ext_ip and the
                     action is next;. This flow is used to bypass the
                     ct_snat action for a packet which is destined to
                     exempted_ext_ips.

              •      A priority-0 logical flow with match 1 has actions
                     next;.
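
       The mask-based priority ordering above is a longest-prefix-match
       scheme: an SNAT rule for a single host (/32) must beat a rule for
       the network containing it (/24). A minimal sketch of a priority
       function with that ordering property (the exact formula ovn-northd
       uses is an internal detail; this only illustrates the ordering):

       ```python
       import ipaddress

       def snat_flow_priority(network, base=1):
           """Larger masks (more specific source networks) yield higher
           priorities, so the most specific SNAT rule matches first."""
           net = ipaddress.ip_network(network, strict=False)
           return base + net.prefixlen
       ```

       With this ordering, a packet from 10.0.0.5 hits the /32 rule's
       ct_snat before the broader /24 rule is ever considered.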

       Egress Table 3: SNAT on Distributed Routers

              •      For each configuration in the OVN Northbound database
                     that asks to change the source IP address of a packet
                     from an IP address of A, or to change the source IP
                     address of a packet that belongs to network A, to B,
                     two flows are added. The priority P of these flows is
                     calculated based on the mask of A, with matches
                     having larger masks getting higher priorities.

                     If the NAT rule cannot be handled in a distributed
                     manner, then the flows below are only programmed on
                     the gateway chassis, with the flow priority increased
                     by 128 in order to be run first.

                     •      The first flow is added with the calculated
                            priority P and match ip && ip4.src == A &&
                            outport == GW, where GW is the logical router
                            gateway port, with an action
                            ct_snat_in_czone(B); to SNAT in the common
                            zone. If the NAT rule is of type dnat_and_snat
                            and has stateless=true in the options, then
                            the action would be ip4/6.src=(B).

                     •      The second flow is added with the calculated
                            priority P + 1 and match ip && ip4.src == A &&
                            outport == GW && REGBIT_DST_NAT_IP_LOCAL == 0,
                            where GW is the logical router gateway port,
                            with an action ct_snat(B); to SNAT in the snat
                            zone. If the NAT rule is of type dnat_and_snat
                            and has stateless=true in the options, then
                            the action would be ip4/6.src=(B).

                     If the NAT rule can be handled in a distributed
                     manner, then there is an additional action (for both
                     the flows) eth.src = EA;, where EA is the Ethernet
                     address associated with the IP address A in the NAT
                     rule. This allows upstream MAC learning to point to
                     the correct chassis.

                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.dst ==
                     allowed_ext_ips. Similarly, for IPv6, the match would
                     be ip6.dst == allowed_ext_ips.

                     If the NAT rule has exempted_ext_ips set, then there
                     is an additional flow configured at the priority P +
                     2 of the corresponding NAT rule. The flow matches if
                     the destination IP is an exempted_ext_ip and the
                     action is next;. This flow is used to bypass the
                     ct_snat action for a flow which is destined to
                     exempted_ext_ips.

              •      A priority-0 logical flow with match 1 has actions
                     next;.

     Egress Table 4: Egress Loopback

       For distributed logical routers where one of the logical router
       ports specifies a gateway chassis.

       While UNDNAT and SNAT processing have already occurred by this
       point, this traffic needs to be forced through egress loopback on
       this distributed gateway port instance, in order for UNSNAT and
       DNAT processing to be applied, and also for IP routing and ARP
       resolution after all of the NAT processing, so that the packet can
       be forwarded to the destination.

       This table has the following flows:

              •      For each NAT rule in the OVN Northbound database on a
                     distributed router, a priority-100 logical flow with
                     match ip4.dst == E && outport == GW &&
                     is_chassis_resident(P), where E is the external IP
                     address specified in the NAT rule and GW is the
                     distributed gateway port specified in the NAT rule.
                     For a dnat_and_snat NAT rule, P is the logical port
                     specified in the NAT rule; if the logical_port column
                     of the NAT table is NOT set, then P is the
                     chassisredirect port of GW. The flow has the
                     following actions:

                     clone {
                         ct_clear;
                         inport = outport;
                         outport = "";
                         flags = 0;
                         flags.loopback = 1;
                         flags.use_snat_zone = REGBIT_DST_NAT_IP_LOCAL;
                         reg0 = 0;
                         reg1 = 0;
                         ...
                         reg9 = 0;
                         REGBIT_EGRESS_LOOPBACK = 1;
                         next(pipeline=ingress, table=0);
                     };

                     flags.loopback is set since in_port is unchanged and
                     the packet may return to that port after NAT
                     processing. REGBIT_EGRESS_LOOPBACK is set to indicate
                     that egress loopback has occurred, in order to skip
                     the source IP address check against the router
                     address.

              •      A priority-0 logical flow with match 1 has actions
                     next;.

     Egress Table 5: Delivery

       Packets that reach this table are ready for delivery. It contains:

              •      Priority-110 logical flows that match IP multicast
                     packets on each enabled logical router port and
                     modify the Ethernet source address of the packets to
                     the Ethernet address of the port and then execute the
                     action output;.

              •      Priority-100 logical flows that match packets on each
                     enabled logical router port, with action output;.

3412OVN 22.06.1                       ovn-northd                     ovn-northd(8)