ovn-northd(8)                     OVN Manual                     ovn-northd(8)
2
3
4

NAME

6       ovn-northd  and ovn-northd-ddlog - Open Virtual Network central control
7       daemon
8

SYNOPSIS

10       ovn-northd [options]
11

DESCRIPTION

13       ovn-northd is a centralized  daemon  responsible  for  translating  the
14       high-level  OVN  configuration into logical configuration consumable by
15       daemons such as ovn-controller. It translates the logical network  con‐
16       figuration  in  terms  of conventional network concepts, taken from the
17       OVN Northbound Database (see ovn-nb(5)), into logical datapath flows in
18       the OVN Southbound Database (see ovn-sb(5)) below it.
19
20       ovn-northd is implemented in C. ovn-northd-ddlog is a compatible imple‐
21       mentation written in DDlog, a language for  incremental  database  pro‐
22       cessing.  This documentation applies to both implementations, with dif‐
23       ferences indicated where relevant.
24

OPTIONS

26       --ovnnb-db=database
27              The OVSDB database containing the OVN  Northbound  Database.  If
28              the  OVN_NB_DB environment variable is set, its value is used as
29              the default. Otherwise, the default is unix:/ovnnb_db.sock.
30
31       --ovnsb-db=database
32              The OVSDB database containing the OVN  Southbound  Database.  If
33              the  OVN_SB_DB environment variable is set, its value is used as
34              the default. Otherwise, the default is unix:/ovnsb_db.sock.
35
36       --ddlog-record=file
37              This option is for ovn-northd-ddlog only. It causes the daemon to
38              record  the  initial database state and later changes to file in
39              the text-based DDlog command format. The ovn_northd_cli  program
40              can  later replay these changes for debugging purposes. This op‐
41              tion has a performance impact. See  debugging-ddlog.rst  in  the
42              OVN documentation for more details.
43
44       --dry-run
45              Causes   ovn-northd  to  start  paused.  In  the  paused  state,
46              ovn-northd does not apply any changes to the databases, although
47              it  continues  to  monitor  them.  For more information, see the
48              pause command, under Runtime Management Commands below.
49
50              For ovn-northd-ddlog, this option can be combined with
51              --ddlog-record to generate a replay log without restarting a
52              process or disturbing a running system.
53
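              For example (the paths and socket names below are illustrative,
              not defaults), a replay log could be captured from a paused
              daemon like this:

                     ovn-northd-ddlog --dry-run \
                         --ddlog-record=/tmp/replay.dat \
                         --ovnnb-db=unix:/ovnnb_db.sock \
                         --ovnsb-db=unix:/ovnsb_db.sock
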
54       --n-threads N
55              In certain situations, it may  be  desirable  to  enable  paral‐
56              lelization  on  a  system  to decrease latency (at the potential
57              cost of increasing CPU usage).
58
59              This option will cause ovn-northd to use N threads when building
60              logical flows, when N is within [2-256]. If N is 1, paralleliza‐
61              tion is disabled (default behavior). If N is less than 1, then N
62              is  set  to  1,  parallelization  is  disabled  and a warning is
63              logged. If N is more than 256, then N  is  set  to  256,  paral‐
64              lelization  is  enabled  (with  256  threads)  and  a warning is
65              logged.
66
67              ovn-northd-ddlog does not support this option.
68
69       database in the above options must be an OVSDB active or  passive  con‐
70       nection method, as described in ovsdb(7).
71
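       For example (the addresses are illustrative), ovn-northd might be
       started against TCP connections to central database servers, with
       parallelization enabled:

              ovn-northd --ovnnb-db=tcp:192.0.2.10:6641 \
                  --ovnsb-db=tcp:192.0.2.10:6642 --n-threads=4
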
72   Daemon Options
73       --pidfile[=pidfile]
74              Causes a file (by default, program.pid) to be created indicating
75              the PID of the running process. If the pidfile argument  is  not
76              specified, or if it does not begin with /, then it is created in
77              .
78
79              If --pidfile is not specified, no pidfile is created.
80
81       --overwrite-pidfile
82              By default, when --pidfile is specified and the  specified  pid‐
83              file already exists and is locked by a running process, the dae‐
84              mon refuses to start. Specify --overwrite-pidfile to cause it to
85              instead overwrite the pidfile.
86
87              When --pidfile is not specified, this option has no effect.
88
89       --detach
90              Runs  this  program  as a background process. The process forks,
91              and in the child it starts a new session,  closes  the  standard
92              file descriptors (which has the side effect of disabling logging
93              to the console), and changes its current directory to  the  root
94              (unless  --no-chdir is specified). After the child completes its
95              initialization, the parent exits.
96
97       --monitor
98              Creates an additional process to monitor  this  program.  If  it
99              dies  due  to a signal that indicates a programming error (SIGA‐
100              BRT, SIGALRM, SIGBUS, SIGFPE, SIGILL, SIGPIPE, SIGSEGV, SIGXCPU,
101              or SIGXFSZ) then the monitor process starts a new copy of it. If
102              the daemon dies or exits for another reason, the monitor process
103              exits.
104
105              This  option  is  normally used with --detach, but it also func‐
106              tions without it.
107
108       --no-chdir
109              By default, when --detach is specified, the daemon  changes  its
110              current  working  directory  to  the root directory after it de‐
111              taches. Otherwise, invoking the daemon from a carelessly  chosen
112              directory  would  prevent  the administrator from unmounting the
113              file system that holds that directory.
114
115              Specifying --no-chdir suppresses this behavior,  preventing  the
116              daemon  from changing its current working directory. This may be
117              useful for collecting core files, since it is common behavior to
118              write core dumps into the current working directory and the root
119              directory is not a good directory to use.
120
121              This option has no effect when --detach is not specified.
122
123       --no-self-confinement
124              By default this daemon will try to self-confine itself  to  work
125              with  files  under  well-known  directories  determined at build
126              time. It is better to stick with this default behavior  and  not
127              to  use  this  flag  unless some other Access Control is used to
128              confine the daemon. Note that in contrast to other access control
129              implementations  that  are  typically enforced from kernel-space
130              (e.g. DAC or MAC), self-confinement is imposed  from  the  user-
131              space daemon itself and hence should not be considered as a full
132              confinement strategy, but instead should be viewed as  an  addi‐
133              tional layer of security.
134
135       --user=user:group
136              Causes  this  program  to  run  as a different user specified in
137              user:group, thus dropping most of  the  root  privileges.  Short
138              forms  user  and  :group  are also allowed, with current user or
139              group assumed, respectively. Only daemons started  by  the  root
140              user accept this argument.
141
142              On   Linux,   daemons   will   be   granted   CAP_IPC_LOCK   and
143              CAP_NET_BIND_SERVICES before dropping root  privileges.  Daemons
144              that  interact  with  a  datapath, such as ovs-vswitchd, will be
145              granted three  additional  capabilities,  namely  CAP_NET_ADMIN,
146              CAP_NET_BROADCAST  and  CAP_NET_RAW.  The capability change will
147              apply even if the new user is root.
148
149              On Windows, this option is not currently supported. For security
150              reasons,  specifying  this  option will cause the daemon process
151              not to start.
152
153   Logging Options
154       -v[spec]
155       --verbose=[spec]
156            Sets logging levels. Without any spec, sets the log level for  ev‐
157            ery  module  and  destination to dbg. Otherwise, spec is a list of
158            words separated by spaces or commas or colons, up to one from each
159            category below:
160
161            •      A  valid module name, as displayed by the vlog/list command
162                   on ovs-appctl(8), limits the log level change to the speci‐
163                   fied module.
164
165            •      syslog, console, or file, to limit the log level change
166                   to only the system log, the console, or the file, re‐
167                   spectively. (If --detach is specified, the daemon closes
168                   its standard file descriptors, so logging to the console
169                   will have no effect.)
170
171                   On  Windows  platform,  syslog is accepted as a word and is
172                   only useful along with the --syslog-target option (the word
173                   has no effect otherwise).
174
175            •      off, emer, err, warn, info, or dbg, to control the log
176                   level. Messages of the given severity or higher will be
177                   logged, and messages of lower severity will be filtered
178                   out. off filters out all messages. See ovs-appctl(8) for a
179                   definition of each log level.
180
181            Case is not significant within spec.
182
183            Regardless  of the log levels set for file, logging to a file will
184            not take place unless --log-file is also specified (see below).
185
186            For compatibility with older versions of OVS, any is accepted as a
187            word but has no effect.
188
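            For example (an illustrative combination of the spec forms
            above), to log debug messages to the log file while keeping the
            console at warnings only:

                   ovn-northd --log-file -vfile:dbg -vconsole:warn
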
189       -v
190       --verbose
191            Sets  the  maximum  logging  verbosity level, equivalent to --ver‐
192            bose=dbg.
193
194       -vPATTERN:destination:pattern
195       --verbose=PATTERN:destination:pattern
196            Sets the log pattern for destination to pattern. Refer to  ovs-ap‐
197            pctl(8) for a description of the valid syntax for pattern.
198
199       -vFACILITY:facility
200       --verbose=FACILITY:facility
201            Sets  the RFC5424 facility of the log message. facility can be one
202            of kern, user, mail, daemon, auth, syslog, lpr, news, uucp, clock,
203            ftp,  ntp,  audit,  alert, clock2, local0, local1, local2, local3,
204            local4, local5, local6 or local7. If this option is not specified,
205            daemon  is used as the default for the local system syslog and lo‐
206            cal0 is used while sending a message to the  target  provided  via
207            the --syslog-target option.
208
209       --log-file[=file]
210            Enables  logging  to a file. If file is specified, then it is used
211            as the exact name for the log file. The default log file name used
212            if file is omitted is /var/log/ovn/program.log.
213
214       --syslog-target=host:port
215            Send  syslog messages to UDP port on host, in addition to the sys‐
216            tem syslog. The host must be a numerical IP address, not  a  host‐
217            name.
218
219       --syslog-method=method
220            Specify method as the way syslog messages should be sent to the
221            syslog daemon. The following forms are supported:
222
223            •      libc, to use the libc syslog() function. The downside of
224                   using this option is that libc adds a fixed prefix to
225                   every message before it is actually sent to the syslog
226                   daemon over the /dev/log UNIX domain socket.
227
228            •      unix:file, to use a UNIX domain socket directly. It is
229                   possible to specify an arbitrary message format with this
230                   option. However, rsyslogd 8.9 and older versions use a
231                   hard-coded parser function anyway that limits UNIX domain
232                   socket use. If you want to use an arbitrary message format
233                   with older rsyslogd versions, then use a UDP socket to the
234                   localhost IP address instead.
235
236            •      udp:ip:port, to use a UDP socket. With this method it is
237                   possible to use an arbitrary message format also with
238                   older rsyslogd. When sending syslog messages over a UDP
239                   socket, extra precautions need to be taken: for example,
240                   the syslog daemon needs to be configured to listen on the
241                   specified UDP port, accidental iptables rules could be
242                   interfering with local syslog traffic, and there are some
243                   security considerations that apply to UDP sockets but do
244                   not apply to UNIX domain sockets.
245
246            •      null, to discard all messages logged to syslog.
247
248            The  default is taken from the OVS_SYSLOG_METHOD environment vari‐
249            able; if it is unset, the default is libc.
250
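            For example (the address and port are illustrative), to send
            arbitrary-format messages over UDP to a local rsyslogd:

                   ovn-northd --syslog-method=udp:127.0.0.1:514
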
251   PKI Options
252       PKI configuration is required in order to use SSL for  the  connections
253       to the Northbound and Southbound databases.
254
255              -p privkey.pem
256              --private-key=privkey.pem
257                   Specifies  a  PEM  file  containing the private key used as
258                   identity for outgoing SSL connections.
259
260              -c cert.pem
261              --certificate=cert.pem
262                   Specifies a PEM file containing a certificate  that  certi‐
263                   fies the private key specified on -p or --private-key to be
264                   trustworthy. The certificate must be signed by the certifi‐
265                   cate  authority  (CA) that the peer in SSL connections will
266                   use to verify it.
267
268              -C cacert.pem
269              --ca-cert=cacert.pem
270                   Specifies a PEM file containing the CA certificate for ver‐
271                   ifying certificates presented to this program by SSL peers.
272                   (This may be the same certificate that  SSL  peers  use  to
273                   verify the certificate specified on -c or --certificate, or
274                   it may be a different one, depending on the PKI  design  in
275                   use.)
276
277              -C none
278              --ca-cert=none
279                   Disables  verification  of  certificates  presented  by SSL
280                   peers. This introduces a security risk,  because  it  means
281                   that  certificates  cannot be verified to be those of known
282                   trusted hosts.
283
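       For example (the file names and address are illustrative), SSL con‐
       nections to both databases could be configured as:

              ovn-northd --ovnnb-db=ssl:192.0.2.10:6641 \
                  --ovnsb-db=ssl:192.0.2.10:6642 \
                  -p privkey.pem -c cert.pem -C cacert.pem
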
284   Other Options
285       --unixctl=socket
286              Sets the name of the control socket on which program listens for
287              runtime  management  commands  (see RUNTIME MANAGEMENT COMMANDS,
288              below). If socket does not begin with /, it  is  interpreted  as
289              relative  to  .  If  --unixctl  is  not used at all, the default
290              socket is /program.pid.ctl, where pid is program’s process ID.
291
292              On Windows a local named pipe is used to listen for runtime man‐
293              agement commands. A file is created at the absolute path pointed
294              to by socket or, if --unixctl is not used at all, a file named
295              program is created in the configured OVS_RUNDIR directory. The
296              file exists just to mimic the behavior of a Unix domain socket.
297
298              Specifying none for socket disables the control socket feature.
299
300
301
302       -h
303       --help
304            Prints a brief help message to the console.
305
306       -V
307       --version
308            Prints version information to the console.
309

RUNTIME MANAGEMENT COMMANDS

311       ovs-appctl can send commands to a running ovn-northd process. The  cur‐
312       rently supported commands are described below.
313
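       For example (an illustrative session; the -t argument names the dae‐
       mon's control socket), the commands described below are invoked as:

              ovs-appctl -t ovn-northd status
              ovs-appctl -t ovn-northd is-paused
              ovs-appctl -t ovn-northd set-n-threads 4
              ovs-appctl -t ovn-northd inc-engine/show-stats
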
314              exit   Causes ovn-northd to gracefully terminate.
315
316              pause  Pauses ovn-northd. When it is paused, ovn-northd receives
317                     changes from the Northbound and Southbound databases as
318                     usual, but it does not send any updates. A paused
319                     ovn-northd also drops database locks, which allows any
320                     other non-paused instance of ovn-northd to take over.
321
322              resume Resumes ovn-northd operation, processing Northbound and
323                     Southbound database contents and generating logical
324                     flows. This also instructs ovn-northd to try to acquire
325                     the lock on the SB DB.
326
327              is-paused
328                     Returns "true" if ovn-northd is currently paused, "false"
329                     otherwise.
330
331              status Prints  this  server’s status. Status will be "active" if
332                     ovn-northd has acquired OVSDB lock on SB DB, "standby" if
333                     it has not, or "paused" if this instance is paused.
334
335              sb-cluster-state-reset
336                     Reset  southbound  database cluster status when databases
337                     are destroyed and rebuilt.
338
339                     If all databases in a clustered southbound  database  are
340                     removed from disk, then the stored index of all databases
341                     will be reset to zero. This will cause ovn-northd  to  be
342                     unable  to  read or write to the southbound database, be‐
343                     cause it will always detect the data as stale. In such  a
344                     case,  run this command so that ovn-northd will reset its
345                     local index so that it can interact with  the  southbound
346                     database again.
347
348              nb-cluster-state-reset
349                     Reset  northbound  database cluster status when databases
350                     are destroyed and rebuilt.
351
352                     This performs the same task as sb-cluster-state-reset ex‐
353                     cept for the northbound database client.
354
355              set-n-threads N
356                     Set  the  number  of  threads  used  for building logical
357                     flows. When N is within [2-256], parallelization  is  en‐
358                     abled. When N is 1 parallelization is disabled. When N is
359                     less than 1 or more than 256, an error  is  returned.  If
360                     ovn-northd fails to start parallelization (e.g., it
361                     fails to set up semaphores), parallelization is disabled
362                     and an error is returned.
363
364              get-n-threads
365                     Return  the  number  of threads used for building logical
366                     flows.
367
368              inc-engine/show-stats
369                     Display ovn-northd engine counters. For each engine  node
370                     the following counters have been added:
371
372                     •      recompute
373
374                     •      compute
375
376                     •      abort
377
378              inc-engine/show-stats engine_node_name counter_name
379                     Display  the  ovn-northd engine counter(s) for the speci‐
380                     fied engine_node_name. counter_name is optional  and  can
381                     be one of recompute, compute or abort.
382
383              inc-engine/clear-stats
384                     Reset ovn-northd engine counters.
385
386       Only ovn-northd-ddlog supports the following commands:
387
388              enable-cpu-profiling
389              disable-cpu-profiling
390                   Enables or disables profiling of CPU time used by the DDlog
391                   engine. When CPU profiling is enabled, the profile  command
392                   (see  below) will include DDlog CPU usage statistics in its
393                   output. Enabling CPU profiling will slow  ovn-northd-ddlog.
394                   Disabling  CPU  profiling  does  not  clear  any previously
395                   recorded statistics.
396
397              profile
398                   Outputs a profile of the current and peak sizes of arrange‐
399                   ments  inside  DDlog. This profiling data can be useful for
400                   optimizing DDlog code. If CPU profiling was previously  en‐
401                   abled  (even if it was later disabled), the output also in‐
402                   cludes a CPU time profile. See Profiling inside  the  tuto‐
403                   rial in the DDlog repository for an introduction to profil‐
404                   ing DDlog.
405

ACTIVE-STANDBY FOR HIGH AVAILABILITY

407       You may run ovn-northd more than once in an OVN deployment.  When  con‐
408       nected  to  a  standalone or clustered DB setup, OVN will automatically
409       ensure that only one of them is active at a time. If multiple instances
410       of  ovn-northd  are running and the active ovn-northd fails, one of the
411       hot standby instances of ovn-northd will automatically take over.
412
413   Active-Standby with multiple OVN DB servers
414       You may run multiple OVN DB servers in an OVN deployment with:
415
416              •      OVN DB servers deployed in active/passive mode  with  one
417                     active and multiple passive ovsdb-servers.
418
419              •      ovn-northd also deployed on all these nodes, using unix
420                     ctl sockets to connect to the local OVN DB servers.
421
422       In such deployments, the ovn-northds on the passive nodes will  process
423       the DB changes and compute logical flows that are later thrown away,
424       because write transactions are not allowed by the passive ovsdb-servers.
425       This results in unnecessary CPU usage.
426
427       With  the  help  of  runtime  management  command  pause, you can pause
428       ovn-northd on these nodes. When a passive node becomes master, you  can
429       use the runtime management command resume so that ovn-northd resumes
430       processing the DB changes.
431
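       For example (an illustrative sketch of such a deployment), an orches‐
       tration script might run on each passive node:

              ovs-appctl -t ovn-northd pause

       and, when the local node is promoted to master:

              ovs-appctl -t ovn-northd resume
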

LOGICAL FLOW TABLE STRUCTURE

433       One of the main purposes of ovn-northd is to populate the  Logical_Flow
434       table  in  the  OVN_Southbound  database.  This  section  describes how
435       ovn-northd does this for switch and router logical datapaths.
436
437   Logical Switch Datapaths
438     Ingress Table 0: Admission Control and Ingress Port Security check
439
440       Ingress table 0 contains these logical flows:
441
442              •      Priority 100 flows to drop packets with VLAN tags or mul‐
443                     ticast Ethernet source addresses.
444
445              •      For  each  disabled  logical port, a priority 100 flow is
446                     added which matches on all packets and applies the action
447                     REGBIT_PORT_SEC_DROP = 1; next; so that the packets are
448                     dropped in the next stage.
449
450              •      For each (enabled) vtep logical port, a priority 70  flow
451                     is added which matches on all packets and applies the ac‐
452                     tion next(pipeline=ingress, table=S_SWITCH_IN_L2_LKUP)  =
453                     1;  to  skip  most  stages of ingress pipeline and go di‐
454                     rectly to ingress L2 lookup table to determine the output
455                     port. Packets from a VTEP (RAMP) switch should not be sub‐
456                     jected to any ACL checks; the egress pipeline will do the ACL
457                     checks.
458
459              •      For each enabled logical port configured with qdisc queue
460                     id  in  the  options:qdisc_queue_id   column   of   Logi‐
461                     cal_Switch_Port,  a  priority  70  flow  is  added  which
462                     matches  on  all   packets   and   applies   the   action
463                     set_queue(id); REGBIT_PORT_SEC_DROP =
464                     check_in_port_sec(); next;.
465
466              •      A priority 1 flow is added which matches on  all  packets
467                     for  all  the  logical  ports and applies the action REG‐
468                     BIT_PORT_SEC_DROP = check_in_port_sec(); next; to evalu‐
469                     ate  the  port security. The action check_in_port_sec ap‐
470                     plies the port security rules defined in  the  port_secu‐
471                     rity column of Logical_Switch_Port table.
472
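       As a sketch (the stage name and exact rendering are assumptions, not
       literal ovn-northd output), flows like the ones above appear in
       ovn-sbctl dump-flows output along these lines:

              table=0 (ls_in_check_port_sec), priority=100,
                match=(vlan.present || eth.src[40]), action=(drop;)
              table=0 (ls_in_check_port_sec), priority=1,
                match=(1), action=(REGBIT_PORT_SEC_DROP = check_in_port_sec(); next;)
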
473     Ingress Table 1: Ingress Port Security - Apply
474
475       This  table  drops the packets if the port security check failed in the
476       previous stage, i.e., the register bit REGBIT_PORT_SEC_DROP is set to 1.
477
478       Ingress table 1 contains these logical flows:
479
480              •      A priority-50 fallback flow that drops the packet if  the
481                     register bit REGBIT_PORT_SEC_DROP is set to 1.
482
483              •      One priority-0 fallback flow that matches all packets and
484                     advances to the next table.
485
486     Ingress Table 2: Lookup MAC address learning table
487
488       This table looks up the MAC learning table of the logical switch  data‐
489       path to check if the port-mac pair is present or not. MAC is learnt for
490       logical switch VIF ports whose port security is disabled and  ’unknown’
491       address set, as well as for localnet ports  with  option  local‐
492       net_learn_fdb. A localnet port entry does not overwrite a VIF port  en‐
493       try.
494
495              •      For  each  such VIF logical port p whose port security is
496                     disabled and ’unknown’ address set, the following flow
497                     is added.
498
499                     •      Priority  100  flow with the match inport == p and
500                            action  reg0[11]  =  lookup_fdb(inport,  eth.src);
501                            next;
502
503              •      For each such localnet logical port p, the following
504                     flow is added.
505
506                     •      Priority 100 flow with the match inport ==  p  and
507                            action    flags.localnet    =    1;   reg0[11]   =
508                            lookup_fdb(inport, eth.src); next;
509
510              •      One priority-0 fallback flow that matches all packets and
511                     advances to the next table.
512
513     Ingress Table 3: Learn MAC of ’unknown’ ports.
514
515       This table learns the MAC addresses seen on the VIF logical ports whose
516       port security is disabled and ’unknown’ address set as well as  on  lo‐
517       calnet  ports  with localnet_learn_fdb option set if the lookup_fdb ac‐
518       tion returned false in the previous table.  For  localnet  ports  (with
519       flags.localnet = 1), lookup_fdb returns true if (port, mac) is found or
520       if a mac is found for a port of type vif.
521
522              •      For each such VIF logical port p whose port security is
523                     disabled and ’unknown’ address set, and for each such
524                     localnet port, the following flow is added.
525
526                     •      Priority 100 flow with the match inport  ==  p  &&
527                            reg0[11] == 0 and action put_fdb(inport, eth.src);
528                            next; which stores the port-mac in the mac  learn‐
529                            ing  table  of the logical switch datapath and ad‐
530                            vances the packet to the next table.
531
532              •      One priority-0 fallback flow that matches all packets and
533                     advances to the next table.
534
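       For example (the port names and commands below are illustrative), a
       VIF port becomes eligible for MAC learning by clearing its port se‐
       curity and including unknown among its addresses, and a localnet
       port opts in with the localnet_learn_fdb option:

              ovn-nbctl lsp-set-addresses lp1 unknown
              ovn-nbctl lsp-set-port-security lp1
              ovn-nbctl set Logical_Switch_Port ln-port \
                  options:localnet_learn_fdb=true
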
535     Ingress Table 4: from-lport Pre-ACLs
536
537       This  table  prepares  flows  for  possible  stateful ACL processing in
538       ingress table ACLs. It contains a priority-0  flow  that  simply  moves
539       traffic  to  the  next  table. If stateful ACLs are used in the logical
540       datapath, a priority-100 flow is added that sets a hint (with reg0[0] =
541       1;  next;)  for table Pre-stateful to send IP packets to the connection
542       tracker before eventually advancing to ingress table ACLs.  If  special
543       ports  such  as  route ports or localnet ports can’t use ct(), a prior‐
544       ity-110 flow is added to  skip  over  stateful  ACLs.  Multicast,  IPv6
545       Neighbor  Discovery  and MLD traffic also skips stateful ACLs. For "al‐
546       low-stateless" ACLs, a flow is added to bypass  setting  the  hint  for
547       connection tracker processing when there are stateful ACLs or LB rules;
548       REGBIT_ACL_STATELESS is set for traffic matching stateless ACL flows.
549
550       This table also has a priority-110 flow with the match eth.dst == E for
551       all logical switch datapaths to move traffic to the next table, where E
552       is the service monitor mac defined in the options:svc_monitor_mac col‐
553       umn of the NB_Global table.
554
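       As a sketch (the MAC value is illustrative), the service monitor MAC
       referenced above is configured in the northbound database with:

              ovn-nbctl set NB_Global . options:svc_monitor_mac="12:34:56:78:9a:bc"
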
555     Ingress Table 5: Pre-LB
556
557       This table prepares flows for possible stateful load balancing process‐
558       ing in ingress table LB and Stateful. It  contains  a  priority-0  flow
559       that  simply  moves traffic to the next table. Moreover it contains two
560       priority-110 flows to move multicast, IPv6 Neighbor Discovery  and  MLD
561       traffic  to  the next table. It also contains two priority-110 flows to
562       move stateless traffic, i.e., traffic for which REGBIT_ACL_STATELESS is
563       set,  to  the  next  table. If load balancing rules with virtual IP ad‐
564       dresses (and ports) are configured in  OVN_Northbound  database  for  a
565       logical switch datapath, a priority-100 flow is added with the match ip
566       to match on IP packets and sets the action reg0[2] = 1; next; to act as
567       a  hint  for  table  Pre-stateful  to send IP packets to the connection
568       tracker for packet de-fragmentation (and to possibly do  DNAT  for  al‐
569       ready established load balanced traffic) before eventually advancing to
570       ingress table Stateful. If controller_event has been enabled  and  load
571       balancing  rules with empty backends have been added in OVN_Northbound,
572       a priority-130 flow is added to trigger ovn-controller events whenever
573       the chassis receives a packet for that particular VIP. If the event-elb
574       meter has been previously created, it will be associated with the
575       empty_lb logical flow.
576
577       Prior to OVN 20.09 we were setting reg0[0] = 1 only if the IP desti‐
578       nation matched the load balancer VIP. However, this had issues in
579       cases where a logical switch doesn’t have any ACLs with allow-related
580       action. To understand the issue, let’s take a TCP load balancer -
581       10.0.0.10:80=10.0.0.3:80.  If  a  logical  port - p1 with IP - 10.0.0.5
582       opens a TCP connection with the VIP - 10.0.0.10, then the packet in the
583       ingress  pipeline of ’p1’ is sent to the p1’s conntrack zone id and the
584       packet is load balanced to the backend - 10.0.0.3. For the reply packet
585       from  the  backend  lport,  it  is not sent to the conntrack of backend
586       lport’s zone id. This is fine as long as the packet is valid. If
587       the backend lport sends an invalid TCP packet (say, an incorrect
588       sequence number), the packet gets delivered to the lport ’p1’ without unDNATing
589       the packet to the VIP - 10.0.0.10. And this causes the connection to be
590       reset by the lport p1’s VIF.
591
592       We can’t fix this issue by adding a logical flow to drop ct.inv packets
593       in  the  egress  pipeline  since it will drop all other connections not
594       destined to the load balancers. To fix this  issue,  we  send  all  the
595       packets  to the conntrack in the ingress pipeline if a load balancer is
596       configured. We can now add a lflow to drop ct.inv packets.
597
598       This table also has priority-120 flows that punt all  IGMP/MLD  packets
599       to  ovn-controller  if the switch is an interconnect switch with multi‐
600       cast snooping enabled.
601
602       This table also has a priority-110 flow with the match eth.dst == E for
603       all logical switch datapaths to move traffic to the next table, where E
604       is the service monitor mac defined in the options:svc_monitor_mac col‐
605       umn of the NB_Global table.
606
607       This table also has a priority-110 flow with the match inport == I for
608       all logical switch datapaths to move traffic to the next table, where I
609       is the peer of a logical router port. This flow is added to skip the
610       connection tracking of packets which enter from logical router datapath
611       to logical switch datapath.
612
613     Ingress Table 6: Pre-stateful
614
615       This  table prepares flows for all possible stateful processing in next
616       tables. It contains a priority-0 flow that simply moves traffic to  the
617       next table.
618
619              •      Priority-120  flows  that  send the packets to connection
620                     tracker using ct_lb_mark; as the action so that  the  al‐
621                     ready  established  traffic destined to the load balancer
622                     VIP gets DNATted. These flows match each VIP’s IP and
623                     port.  For  IPv4 traffic the flows also load the original
624                     destination IP and transport port in registers  reg1  and
625                     reg2.  For  IPv6 traffic the flows also load the original
626                     destination IP and transport port in registers xxreg1 and
627                     reg2.
628
629              •      A  priority-110  flow  sends the packets that don’t match
630                     the above flows to connection tracker  based  on  a  hint
631                     provided by the previous tables (with a match for reg0[2]
632                     == 1) by using the ct_lb_mark; action.
633
634              •      A priority-100  flow  sends  the  packets  to  connection
635                     tracker  based  on a hint provided by the previous tables
636                     (with a match for reg0[0] == 1) by using the ct_next; ac‐
637                     tion.
638
639     Ingress Table 7: from-lport ACL hints
640
641       This  table  consists of logical flows that set hints (reg0 bits) to be
642       used in the next stage, in the ACL processing table, if  stateful  ACLs
643       or  load  balancers  are  configured. Multiple hints can be set for the
644       same packet. The possible hints are:
645
646              •      reg0[7]: the packet might match an allow-related ACL and
647                     might have to commit the connection to conntrack.
648
649              •      reg0[8]: the packet might match an allow-related ACL but
650                     there will be no need to commit the connection to con‐
651                     ntrack because it already exists.
652
653              •      reg0[9]: the packet might match a drop/reject ACL.
654
655              •      reg0[10]: the packet might match a drop/reject ACL but
656                     the connection was previously allowed so it might have to
657                     be committed again with ct_label=1/1.
658
659       The table contains the following flows:
660
661              •      A priority-65535 flow to advance to the next table if the
662                     logical switch has no ACLs configured, otherwise a prior‐
663                     ity-0 flow to advance to the next table.
664
665              •      A priority-7 flow that matches on packets that initiate a
666                     new session. This flow sets reg0[7] and reg0[9] and  then
667                     advances to the next table.
668
669              •      A priority-6 flow that matches on packets that are in the
670                     request direction of an already existing session that has
671                     been  marked  as  blocked.  This  flow  sets  reg0[7] and
672                     reg0[9] and then advances to the next table.
673
674              •      A priority-5 flow that matches  untracked  packets.  This
675                     flow  sets  reg0[8]  and reg0[9] and then advances to the
676                     next table.
677
678              •      A priority-4 flow that matches on packets that are in the
679                     request direction of an already existing session that has
680                     not been marked as blocked. This flow  sets  reg0[8]  and
681                     reg0[10] and then advances to the next table.
682
683              •      A priority-3 flow that matches on packets that are not
684                     part of established sessions. This flow sets reg0[9]  and
685                     then advances to the next table.
686
687              •      A  priority-2  flow that matches on packets that are part
688                     of  an  established  session  that  has  been  marked  as
689                     blocked.  This flow sets reg0[9] and then advances to the
690                     next table.
691
692              •      A priority-1 flow that matches on packets that  are  part
693                     of  an  established  session  that has not been marked as
694                     blocked. This flow sets reg0[10] and then advances to the
695                     next table.
696
697     Ingress table 8: from-lport ACL evaluation before LB
698
699       Logical flows in this table closely reproduce those in the ACL table in
700       the OVN_Northbound database for the from-lport  direction  without  the
701       option apply-after-lb set or set to false. The priority values from the
702       ACL table have a limited range and have 1000 added  to  them  to  leave
703       room for OVN default flows at both higher and lower priorities.
704
705              •      This  table  is responsible for evaluating ACLs, and set‐
706                     ting a register bit to indicate whether the  ACL  decided
707                     to  allow,  drop, or reject the traffic. The allow bit is
708                     reg8[16]. The drop bit is reg8[17]. All flows in this ta‐
709                     ble  will advance the packet to the next table, where the
710                     bits from before are evaluated to determine  what  to  do
711                     with  the packet. Any flows in this table that intend for
712                     the packet to pass will set reg8[16] to 1, even if an ACL
713                     with  an allow-type action was not matched. This lets the
714                     next table know to allow the traffic to pass. These  bits
715                     will  be referred to as the "allow", "drop", and "reject"
716                     bits in the upcoming paragraphs.
717
718              •      If the tier column has been configured on the  ACL,  then
719                     OVN  will also match the current tier counter against the
720                     configured ACL tier. OVN keeps count of the current  tier
721                     in reg8[30..31].
722
723              •      allow ACLs translate into logical flows that set the al‐
724                     low bit to 1 and advance the packet to the next table. If
725                     there  are any stateful ACLs on this datapath, then allow
726                     ACLs set the allow bit to one  and  in  addition  perform
727                     ct_commit;  (which  acts  as  a hint for future tables to
728                     commit the connection to conntrack). In case the ACL  has
729                     a  label  then  reg3  is  loaded with the label value and
730                     reg0[13] bit is set to 1 (which acts as a  hint  for  the
731                     next tables to commit the label to conntrack).
732
733              •      allow-related ACLs translate into logical flows that set
734                     the allow  bit  and  additionally  have  ct_commit(ct_la‐
735                     bel=0/1); next; actions for new connections and reg0[1] =
736                     1; next; for existing connections. In case the ACL has  a
737                     label  then  reg3  is  loaded  with  the  label value and
738                     reg0[13] bit is set to 1 (which acts as a  hint  for  the
739                     next tables to commit the label to conntrack).
740
741              •      allow-stateless ACLs translate into logical flows that
742                     set the allow bit and advance to the next table.
743
744              •      reject ACLs translate into logical flows that set the
745                     reject bit and advance to the next table.
746
747              •      pass ACLs translate into logical flows that do not set
748                     the allow, drop, or reject bit and advance  to  the  next
749                     table.
750
751              •      Other ACLs set the drop bit and advance to the next table
752                     for new or untracked connections. For known  connections,
753                     they  set  the  drop  bit, as well as running the ct_com‐
754                     mit(ct_label=1/1); action. Setting ct_label marks a  con‐
755                     nection as one that was previously allowed, but should no
756                     longer be allowed due to a policy change.
757
758       This table contains a priority-65535 flow to set the allow bit and  ad‐
759       vance  to  the next table if the logical switch has no ACLs configured,
760       otherwise a priority-0 flow to advance to the next table is added. This
761       flow  does  not  set  the  allow bit, so that the next table can decide
762       whether to allow or drop the packet based  on  the  value  of  the  op‐
763       tions:default_acl_drop column of the NB_Global table.
764
765       A  priority-65532 flow is added that sets the allow bit for IPv6 Neigh‐
766       bor solicitation, Neighbor discovery, Router solicitation, Router adver‐
767       tisement and MLD packets regardless of other ACLs defined.
768
769       If  the logical datapath has a stateful ACL or a load balancer with VIP
770       configured, the following flows will also be added:
771
772              •      If options:default_acl_drop column of NB_Global is  false
773                     or  not set, a priority-1 flow that sets the hint to com‐
774                     mit IP traffic that is not part of  established  sessions
775                     to  the  connection  tracker  (with  action  reg0[1] = 1;
776                     next;). This is needed for the default allow  policy  be‐
777                     cause,  while  the initiator’s direction may not have any
778                     stateful rules, the server’s  may  and  then  its  return
779                     traffic would not be known and marked as invalid.
780
781              •      A  priority-1  flow  that sets the allow bit and sets the
782                     hint to commit IP traffic to the connection tracker (with
783                     action  reg0[1]  =  1; next;). This is needed for the de‐
784                     fault allow policy because, while the initiator’s  direc‐
785                     tion  may  not  have any stateful rules, the server’s may
786                     and then its return traffic would not be known and marked
787                     as invalid.
788
789              •      A  priority-65532  flow  that  sets the allow bit for any
790                     traffic in the reply direction for a connection that  has
791                     been  committed  to  the connection tracker (i.e., estab‐
792                     lished flows), as long as the  committed  flow  does  not
793                     have  ct_mark.blocked  set. We only handle traffic in the
794                     reply direction here because we want all packets going in
795                     the  request direction to still go through the flows that
796                     implement the currently defined policy based on ACLs.  If
797                     a   connection   is   no   longer   allowed   by  policy,
798                     ct_mark.blocked will get set and packets in the reply di‐
799                     rection will no longer be allowed, either. This flow also
800                     clears the register bits reg0[9] and  reg0[10]  and  sets
801                     register  bit reg0[17]. If ACL logging and logging of re‐
802                     lated packets is enabled, then a companion priority-65533
803                     flow  will  be installed that accomplishes the same thing
804                     but also logs the traffic.
805
806              •      A priority-65532 flow that sets the  allow  bit  for  any
807                     traffic that is considered related to a committed flow in
808                     the connection tracker (e.g., an  ICMP  Port  Unreachable
809                     from  a non-listening UDP port), as long as the committed
810                     flow does not have ct_mark.blocked set.  This  flow  also
811                     applies  NAT  to the related traffic so that ICMP headers
812                     and the inner packet have correct addresses. If ACL  log‐
813                     ging  and  logging  of related packets is enabled, then a
814                     companion priority-65533 flow will be installed that  ac‐
815                     complishes the same thing but also logs the traffic.
816
817              •      A  priority-65532  flow  that  sets  the drop bit for all
818                     traffic marked by the connection tracker as invalid.
819
820              •      A priority-65532 flow that sets  the  drop  bit  for  all
821                     traffic  in  the reply direction with ct_mark.blocked set
822                     meaning that the connection should no longer  be  allowed
823                     due  to a policy change. Packets in the request direction
824                     are skipped here to let a newly created ACL re-allow this
825                     connection.
826
827       If the logical datapath has any ACL or a load balancer with VIP config‐
828       ured, the following flow will also be added:
829
830              •      A priority 34000 logical flow is added for  each  logical
831                     switch datapath with the match eth.dst == E to allow the
832                     service monitor reply packet destined  to  ovn-controller
833                     that  sets  the allow bit, where E is the service monitor
834                     mac defined  in  the  options:svc_monitor_mac  column  of
835                     NB_Global table.
836
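       For example (the switch and port names are illustrative), a from-lport
       ACL created with NB priority 1002 appears in this table as a logical
       flow with priority 2002, since 1000 is added:

              ovn-nbctl acl-add sw0 from-lport 1002 \
                  'inport == "lp1" && ip4 && tcp.dst == 80' allow-related
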
837     Ingress Table 9: from-lport ACL action
838
839       Logical  flows  in this table decide how to proceed based on the values
840       of the allow, drop, and reject bits that may have been set in the  pre‐
841       vious table.
842
843              •      If  no ACLs are configured, then a priority 0 flow is in‐
844                     stalled that matches everything and advances to the  next
845                     table.
846
847              •      A  priority  1000 flow is installed that will advance the
848                     packet to the next table if the allow bit is set.
849
850              •      A priority 1000 flow is installed that will run the drop;
851                     action if the drop bit is set.
852
853              •      A priority 1000 flow is installed that will run the
854                     tcp_reset { output <-> inport; next(pipeline=egress,ta‐
855                     ble=5);} action for TCP connections, icmp4/icmp6 action
856                     for UDP connections, and sctp_abort {output <-> in‐
857                     port; next(pipeline=egress,table=5);} action for SCTP as‐
858                     sociations.
859
860              •      If any ACLs have tiers configured  on  them,  then  three
861                     priority  500  flows  are  installed. If the current tier
862                     counter is 0, 1, or 2, then the current tier  counter  is
863                     incremented  by  one  and  the packet is sent back to the
864                     previous table for re-evaluation.
865
866     Ingress Table 10: from-lport QoS Marking
867
868       Logical flows in this table closely reproduce those in  the  QoS  table
869       with  the  action  column  set  in  the OVN_Northbound database for the
870       from-lport direction.
871
872              •      For every qos_rules entry in a logical switch  with  DSCP
873                     marking  enabled,  a  flow  will be added at the priority
874                     mentioned in the QoS table.
875
876              •      One priority-0 fallback flow that matches all packets and
877                     advances to the next table.
878
879     Ingress Table 11: from-lport QoS Meter
880
881       Logical  flows  in  this table closely reproduce those in the QoS table
882       with the bandwidth column set in the OVN_Northbound  database  for  the
883       from-lport direction.
884
885              •      For every qos_rules entry in a logical switch with meter‐
886                     ing enabled, a flow will be added at  the  priority  men‐
887                     tioned in the QoS table.
888
889              •      One priority-0 fallback flow that matches all packets and
890                     advances to the next table.
891
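       As a sketch (the names and values are illustrative), QoS entries
       that populate the two tables above can be created with:

              ovn-nbctl qos-add sw0 from-lport 100 'ip4.src == 10.0.0.5' dscp=12
              ovn-nbctl qos-add sw0 from-lport 100 'ip4.src == 10.0.0.5' \
                  rate=1000 burst=1000
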
892     Ingress Table 12: Load balancing affinity check
893
894       Load balancing affinity check  table  contains  the  following  logical
895       flows:
896
897              •      For  all the configured load balancing rules for a switch
898                     in OVN_Northbound  database  where  a  positive  affinity
899                     timeout  is  specified in options column, that includes a
900                     L4 port PORT of protocol P and IP address VIP,  a  prior‐
901                     ity-100  flow  is  added. For IPv4 VIPs, the flow matches
902                     ct.new && ip && ip4.dst == VIP && P.dst == PORT. For IPv6
903                     VIPs, the flow matches ct.new && ip && ip6.dst == VIP && P
904                     && P.dst  ==   PORT.  The  flow’s  action  is  reg9[6]  =
905                     chk_lb_aff(); next;.
906
907              •      A  priority  0 flow is added which matches on all packets
908                     and applies the action next;.
909
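       For example (the load balancer name and timeout value are illustra‐
       tive, and the affinity_timeout key is assumed as the relevant op‐
       tions column entry), a positive affinity timeout is set with:

              ovn-nbctl set Load_Balancer lb0 options:affinity_timeout=60
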
910     Ingress Table 13: LB
911
912              •      For all the configured load balancing rules for a  switch
913                     in  OVN_Northbound  database  where  a  positive affinity
914                     timeout is specified in options column, that  includes  a
915                     L4  port  PORT of protocol P and IP address VIP, a prior‐
916                     ity-150 flow is added. For IPv4 VIPs,  the  flow  matches
917                     reg9[6]  ==  1 && ct.new && ip && ip4.dst == VIP && P.dst
918                     == PORT. For IPv6 VIPs, the flow matches reg9[6] == 1 &&
919                     ct.new  &&  ip && ip6.dst ==  VIP && P && P.dst ==  PORT.
920                     The flow’s action is ct_lb_mark(args),  where  args  con‐
921                     tains  comma  separated  IP  addresses (and optional port
922                     numbers) to load balance to. The address family of the IP
923                     addresses  of  args  is the same as the address family of
924                     VIP.
925
926              •      For all the configured load balancing rules for a  switch
927                     in  OVN_Northbound  database that includes a L4 port PORT
928                     of protocol P and IP address VIP, a priority-120 flow  is
929                     added. For IPv4 VIPs, the flow matches ct.new && ip &&
930                     ip4.dst == VIP && P.dst == PORT. For IPv6 VIPs, the flow
931                     matches ct.new && ip && ip6.dst == VIP && P && P.dst ==
932                     PORT. The flow’s action is ct_lb_mark(args), where args
933                     contains  comma separated IP addresses (and optional port
934                     numbers) to load balance to. The address family of the IP
935                     addresses  of  args  is the same as the address family of
936                     VIP. If health check is enabled, then args will only con‐
937                     tain  those  endpoints whose service monitor status entry
938                     in OVN_Southbound db is either online or empty. For  IPv4
939                     traffic  the  flow also loads the original destination IP
940                     and transport port in registers reg1 and reg2.  For  IPv6
941                     traffic  the  flow also loads the original destination IP
942                     and transport port in  registers  xxreg1  and  reg2.  The
943                     above  flow  is  created even if the load balancer is at‐
944                     tached to a logical router connected to the current logi‐
945                     cal  switch and the install_ls_lb_from_router variable in
946                     options is set to true.
947
948              •      For all the configured load balancing rules for a  switch
949                     in  OVN_Northbound  database that includes just an IP ad‐
950                     dress VIP to match on, OVN adds a priority-110 flow.  For
951                     IPv4  VIPs,  the  flow matches ct.new && ip && ip4.dst ==
952                     VIP. For IPv6 VIPs, the flow  matches  ct.new  &&  ip  &&
953                     ip6.dst   ==   VIP.   The   action   on   this   flow  is
954                     ct_lb_mark(args), where args contains comma separated  IP
955                     addresses  of  the  same  address family as VIP. For IPv4
956                     traffic the flow also loads the original  destination  IP
957                     and  transport  port in registers reg1 and reg2. For IPv6
958                     traffic the flow also loads the original  destination  IP
959                     and  transport  port  in  registers  xxreg1 and reg2. The
960                     above flow is created even if the load  balancer  is  at‐
961                     tached to a logical router connected to the current logi‐
962                     cal switch and the install_ls_lb_from_router variable  in
963                     options is set to true.
964
965              •      If  the load balancer is created with --reject option and
966                     it has no active backends, a TCP reset segment (for  tcp)
967                     or an ICMP port unreachable packet (for all other kind of
968                     traffic) will be sent whenever an incoming packet is  re‐
969                     ceived for this load-balancer. Note that using the --reject
970                     option will disable empty_lb SB controller event for this
971                     load balancer.
972
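       For example (the names and addresses are illustrative), the flows
       above would be generated for a load balancer such as:

              ovn-nbctl lb-add lb0 10.0.0.10:80 10.0.0.3:80,10.0.0.4:80 tcp
              ovn-nbctl ls-lb-add sw0 lb0
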
973     Ingress Table 14: Load balancing affinity learn
974
975       Load  balancing  affinity  learn  table  contains the following logical
976       flows:
977
978              •      For all the configured load balancing rules for a  switch
979                     in  OVN_Northbound  database  where  a  positive affinity
980                     timeout T is specified in options column, that includes a
981                     L4  port  PORT of protocol P and IP address VIP, a prior‐
982                     ity-100 flow is added. For IPv4 VIPs,  the  flow  matches
983                     reg9[6]  ==  0 && ct.new && ip && ip4.dst == VIP && P.dst
984                     == PORT. For IPv6 VIPs, the flow matches ct.new && ip &&
985                     ip6.dst == VIP && P && P.dst == PORT. The flow’s action
986                     is commit_lb_aff(vip = VIP:PORT, backend = backend ip:
987                     backend port, proto = P, timeout = T);.
988
989              •      A  priority  0 flow is added which matches on all packets
990                     and applies the action next;.
991
992     Ingress Table 15: Pre-Hairpin
993
              •      If the logical switch has load balancer(s) configured,
                     then a priority-100 flow is added with the match ip &&
                     ct.trk that checks whether the packet needs to be
                     hairpinned (that is, whether after load balancing the
                     destination IP matches the source IP) by executing the
                     actions reg0[6] = chk_lb_hairpin(); and reg0[12] =
                     chk_lb_hairpin_reply(); and advancing the packet to the
                     next table. (See the sketch after this list.)
1001
1002              •      A priority-0 flow that simply moves traffic to  the  next
1003                     table.
1004
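       In flow form, the priority-100 flow described above looks roughly like
       this sketch:

              ip && ct.trk

       with the action

              /* reg0[6]: set if, after load balancing, destination equals
               * source; reg0[12]: set for replies to hairpinned traffic. */
              reg0[6] = chk_lb_hairpin(); reg0[12] = chk_lb_hairpin_reply(); next;
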
1005     Ingress Table 16: Nat-Hairpin
1006
1007              •      If  the  logical  switch has load balancer(s) configured,
1008                     then a priority-100 flow is added with the  match  ip  &&
1009                     ct.new && ct.trk && reg0[6] == 1 which hairpins the traf‐
1010                     fic by NATting source IP to the load balancer VIP by exe‐
1011                     cuting  the action ct_snat_to_vip and advances the packet
1012                     to the next table.
1013
1014              •      If the logical switch has  load  balancer(s)  configured,
1015                     then  a  priority-100  flow is added with the match ip &&
1016                     ct.est && ct.trk && reg0[6] == 1 which hairpins the traf‐
1017                     fic by NATting source IP to the load balancer VIP by exe‐
1018                     cuting the action ct_snat and advances the packet to  the
1019                     next table.
1020
1021              •      If  the  logical  switch has load balancer(s) configured,
1022                     then a priority-90 flow is added with  the  match  ip  &&
1023                     reg0[12]  == 1 which matches on the replies of hairpinned
1024                     traffic (i.e., destination IP is VIP, source  IP  is  the
1025                     backend IP and source L4 port is backend port for L4 load
1026                     balancers) and executes ct_snat and advances  the  packet
1027                     to the next table.
1028
1029              •      A  priority-0  flow that simply moves traffic to the next
1030                     table.
1031
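       For example, the first priority-100 flow above is, schematically:

              ip && ct.new && ct.trk && reg0[6] == 1

       with the action

              /* SNAT the hairpinned packet so its source becomes the VIP. */
              ct_snat_to_vip; next;
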
1032     Ingress Table 17: Hairpin
1033
              •      If the logical switch has an attached logical switch
                     port of vtep type, then for each distributed gateway
                     router port RP attached to this logical switch that has
                     a chassis redirect port cr-RP, a priority-2000 flow is
                     added with the match

                     reg0[14] == 1 && is_chassis_resident(cr-RP)

                     and action next;.

                     The reg0[14] register bit is set in the ingress L2 port
                     security check table for traffic received from HW VTEP
                     (ramp) ports.
1046
              •      If the logical switch has an attached logical switch
                     port of vtep type, then a priority-1000 flow is added
                     that matches on the reg0[14] register bit for traffic
                     received from HW VTEP (ramp) ports. This traffic is
                     passed to ingress table ls_in_l2_lkup.
1052
              •      A priority-1 flow that hairpins traffic matched by non-
                     default flows in the Pre-Hairpin table. Hairpinning is
                     done at L2: the Ethernet addresses are swapped and the
                     packets are looped back on the input port.
1057
1058              •      A  priority-0  flow that simply moves traffic to the next
1059                     table.
1060
1061     Ingress table 18: from-lport ACL evaluation after LB
1062
1063       Logical flows in this table closely reproduce those in the ACL eval ta‐
1064       ble  in  the  OVN_Northbound database for the from-lport direction with
1065       the option apply-after-lb set to true. The priority values from the ACL
1066       table  have  a  limited range and have 1000 added to them to leave room
1067       for OVN default flows at both higher and lower priorities. The flows in
       this table indicate the ACL verdict by setting reg8[16] for allow-type
       ACLs, reg8[17] for drop ACLs, and reg8[18] for reject ACLs, and then
       advancing the packet to the next table. These will be referred to as
       the allow bit, drop bit, and reject bit throughout the documentation
       for this table and the next one.
1073
1074       Like  with ACLs that are evaluated before load balancers, if the ACL is
       configured with a tier value, then the current tier counter, supplied
       in reg8[30..31], is matched against the ACL’s configured tier in
       addition to the ACL’s match.
1078
1079allow apply-after-lb ACLs translate  into  logical  flows
1080                     that  set  the  allow bit. If there are any stateful ACLs
1081                     (including both before-lb  and  after-lb  ACLs)  on  this
1082                     datapath,  then  allow  ACLs  also  run  ct_commit; next;
1083                     (which acts as a hint for an upcoming table to commit the
1084                     connection  to  conntrack).  In  case the ACL has a label
1085                     then reg3 is loaded with the label value and reg0[13] bit
1086                     is  set to 1 (which acts as a hint for the next tables to
1087                     commit the label to conntrack).
1088
1089allow-related apply-after-lb ACLs translate into  logical
1090                     flows that set the allow bit and run the ct_commit(ct_la‐
1091                     bel=0/1); next; actions for new connections and reg0[1] =
1092                     1;  next; for existing connections. In case the ACL has a
1093                     label then reg3  is  loaded  with  the  label  value  and
1094                     reg0[13]  bit  is  set to 1 (which acts as a hint for the
1095                     next tables to commit the label to conntrack).
1096
1097allow-stateless apply-after-lb ACLs translate into  logi‐
1098                     cal  flows that set the allow bit and advance to the next
1099                     table.
1100
1101reject apply-after-lb ACLs translate into  logical  flows
1102                     that set the reject bit and advance to the next table.
1103
1104pass  apply-after-lb  ACLs  translate  into logical flows
1105                     that do not set the allow, drop, or reject  bit  and  ad‐
1106                     vance to the next table.
1107
1108              •      Other apply-after-lb ACLs set the drop bit for new or un‐
1109                     tracked  connections  and  ct_commit(ct_label=1/1);   for
1110                     known connections. Setting ct_label marks a connection as
1111                     one that was previously allowed, but should no longer  be
1112                     allowed due to a policy change.
1113
1114              •      One  priority-65532  flow  matching packets with reg0[17]
1115                     set (either replies to existing sessions or  traffic  re‐
1116                     lated  to  existing sessions) and allows these by setting
1117                     the allow bit and advancing to the next table.
1118
1119              •      One priority-0 fallback flow that matches all packets and
1120                     advances to the next table.
1121
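       As a sketch, a hypothetical priority-1000 allow-related apply-after-lb
       ACL with the match ip4.src == 10.0.0.4 would produce, for new
       connections, a priority-2000 logical flow along these lines:

              ip4.src == 10.0.0.4 && ct.new

       with the action

              /* Set the allow bit, then commit the connection. */
              reg8[16] = 1; ct_commit(ct_label=0/1); next;
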
1122     Ingress Table 19: from-lport ACL action after LB
1123
1124       Logical  flows  in this table decide how to proceed based on the values
1125       of the allow, drop, and reject bits that may have been set in the  pre‐
1126       vious table.
1127
1128              •      If  no ACLs are configured, then a priority 0 flow is in‐
1129                     stalled that matches everything and advances to the  next
1130                     table.
1131
1132              •      A  priority  1000 flow is installed that will advance the
1133                     packet to the next table if the allow bit is set.
1134
1135              •      A priority 1000 flow is installed that will run the drop;
1136                     action if the drop bit is set.
1137
              •      A priority 1000 flow is installed that will run the
                     tcp_reset { output <-> inport; next(pipeline=egress,
                     table=5); } action for TCP connections, the icmp4/icmp6
                     action for UDP connections, and the sctp_abort { output
                     <-> inport; next(pipeline=egress,table=5); } action for
                     SCTP associations, if the reject bit is set.
1144
1145              •      If any ACLs have tiers configured  on  them,  then  three
1146                     priority  500  flows  are  installed. If the current tier
1147                     counter is 0, 1, or 2, then the current tier  counter  is
1148                     incremented  by  one  and  the packet is sent back to the
1149                     previous table for re-evaluation.
1150
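       A sketch of one of the tier re-evaluation flows described above, for a
       current tier counter of 0 (the ingress table number used in the action
       is illustrative):

              reg8[30..31] == 0

       with the action

              /* Bump the tier counter and re-run ACL evaluation. */
              reg8[30..31] = 1; next(pipeline=ingress, table=18);
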
1151     Ingress Table 20: Stateful
1152
1153              •      A priority 100 flow is added which commits the packet  to
1154                     the  conntrack  and  sets the most significant 32-bits of
1155                     ct_label with the reg3 value based on the  hint  provided
1156                     by  previous  tables  (with  a  match for reg0[1] == 1 &&
1157                     reg0[13] == 1). This is used by the ACLs  with  label  to
1158                     commit the label value to conntrack.
1159
1160              •      For  ACLs  without label, a second priority-100 flow com‐
1161                     mits packets to connection tracker using ct_commit; next;
1162                     action  based  on  a hint provided by the previous tables
1163                     (with a match for reg0[1] == 1 && reg0[13] == 0).
1164
1165              •      A priority-0 flow that simply moves traffic to  the  next
1166                     table.
1167
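       A sketch of the label-committing flow above; the braced ct_commit form
       follows the syntax used elsewhere by OVN and should be read as
       illustrative:

              reg0[1] == 1 && reg0[13] == 1

       with the action

              /* Store the ACL label from reg3 into the conntrack label. */
              ct_commit { ct_label.label = reg3; }; next;
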
1168     Ingress Table 21: ARP/ND responder
1169
1170       This  table  implements  ARP/ND responder in a logical switch for known
1171       IPs. The advantage of the ARP responder flow is to limit ARP broadcasts
1172       by locally responding to ARP requests without the need to send to other
1173       hypervisors. One common case is when the inport is a logical port asso‐
1174       ciated with a VIF and the broadcast is responded to on the local hyper‐
1175       visor rather than broadcast across the whole network and  responded  to
1176       by the destination VM. This behavior is proxy ARP.
1177
1178       ARP  requests  arrive from VMs from a logical switch inport of type de‐
1179       fault. For this case, the logical switch proxy ARP  rules  can  be  for
1180       other  VMs  or logical router ports. Logical switch proxy ARP rules may
1181       be programmed both for mac binding of IP  addresses  on  other  logical
1182       switch  VIF  ports  (which are of the default logical switch port type,
1183       representing connectivity to VMs or containers), and for mac binding of
1184       IP  addresses  on  logical switch router type ports, representing their
1185       logical router port peers. In order to support proxy  ARP  for  logical
1186       router  ports,  an  IP address must be configured on the logical switch
1187       router type port, with the same value as the peer logical router  port.
1188       The configured MAC addresses must match as well. When a VM sends an ARP
1189       request for a distributed logical router port and if  the  peer  router
1190       type  port  of  the attached logical switch does not have an IP address
1191       configured, the ARP request will be broadcast on  the  logical  switch.
1192       One of the copies of the ARP request will go through the logical switch
1193       router type port to the logical  router  datapath,  where  the  logical
1194       router  ARP  responder will generate a reply. The MAC binding of a dis‐
1195       tributed logical router, once learned by an associated VM, is used  for
1196       all  that VM’s communication needing routing. Hence, the action of a VM
1197       re-arping for the mac binding of the  logical  router  port  should  be
1198       rare.
1199
1200       Logical  switch  ARP responder proxy ARP rules can also be hit when re‐
1201       ceiving ARP requests externally on a L2 gateway port. In this case, the
1202       hypervisor  acting as an L2 gateway, responds to the ARP request on be‐
1203       half of a destination VM.
1204
1205       Note that ARP requests received from localnet logical inports  can  ei‐
1206       ther  go  directly  to VMs, in which case the VM responds or can hit an
1207       ARP responder for a logical router port if the packet is  used  to  re‐
1208       solve  a  logical router port next hop address. In either case, logical
1209       switch ARP responder rules will not be hit. It contains  these  logical
1210       flows:
1211
              •      If a packet was received from a HW VTEP (ramp switch)
                     and it is an ARP or Neighbor Solicitation packet, it is
                     passed to the next table with maximum priority. ARP/ND
                     requests from HW VTEP must be handled in the logical
                     router ingress pipeline.
1217
              •      If the logical switch has no router ports with
                     options:arp_proxy configured, a priority-100 flow is
                     added that skips the ARP responder if the inport is of
                     type localnet, advancing directly to the next table. ARP
                     requests sent to localnet ports can be received by
                     multiple hypervisors, and because the same mac binding
                     rules are downloaded to all hypervisors, each of them
                     would respond. This would confuse L2 learning on the
                     source of the ARP requests. ARP requests received on an
                     inport of type router are not expected to hit any
                     logical switch ARP responder flows. However, no skip
                     flows are installed for these packets, as there would be
                     some additional flow cost for this and the value appears
                     limited.
1231
              •      If inport V is of type virtual, a priority-100 logical
                     flow is added for each P configured in the
                     options:virtual-parents column with the match

                     inport == P && ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP))
                     inport == P && ((nd_ns && ip6.dst == {VIP, NS_MULTICAST_ADDR} && nd.target == VIP) || (nd_na && nd.target == VIP))
1238
1239
1240                     and applies the action
1241
1242                     bind_vport(V, inport);
1243
1244
1245                     and advances the packet to the next table.
1246
                     Where VIP is the virtual IP configured in the column
                     options:virtual-ip and NS_MULTICAST_ADDR is the
                     solicited-node multicast address corresponding to the
                     VIP.
1250
1251              •      Priority-50  flows  that match ARP requests to each known
1252                     IP address A of every logical switch  port,  and  respond
1253                     with ARP replies directly with corresponding Ethernet ad‐
1254                     dress E:
1255
1256                     eth.dst = eth.src;
1257                     eth.src = E;
1258                     arp.op = 2; /* ARP reply. */
1259                     arp.tha = arp.sha;
1260                     arp.sha = E;
1261                     arp.tpa = arp.spa;
1262                     arp.spa = A;
1263                     outport = inport;
1264                     flags.loopback = 1;
1265                     output;
1266
1267
1268                     These flows are omitted for  logical  ports  (other  than
1269                     router  ports  or  localport ports) that are down (unless
1270                     ignore_lsp_down is configured as true in  options  column
1271                     of NB_Global table of the Northbound database), for logi‐
1272                     cal ports of type virtual, for logical  ports  with  ’un‐
1273                     known’  address  set  and  for logical ports of a logical
1274                     switch configured with other_config:vlan-passthru=true.
1275
1276                     The above ARP responder flows are added for the  list  of
1277                     IPv4  addresses if defined in options:arp_proxy column of
1278                     Logical_Switch_Port table for  logical  switch  ports  of
1279                     type router.
1280
1281              •      Priority-50  flows  that match IPv6 ND neighbor solicita‐
1282                     tions to each known IP address A (and A’s solicited  node
1283                     address)  of  every  logical  switch  port except of type
1284                     router, and respond with neighbor advertisements directly
1285                     with corresponding Ethernet address E:
1286
1287                     nd_na {
1288                         eth.src = E;
1289                         ip6.src = A;
1290                         nd.target = A;
1291                         nd.tll = E;
1292                         outport = inport;
1293                         flags.loopback = 1;
1294                         output;
1295                     };
1296
1297
1298                     Priority-50  flows  that match IPv6 ND neighbor solicita‐
1299                     tions to each known IP address A (and A’s solicited  node
1300                     address)  of  logical switch port of type router, and re‐
1301                     spond with neighbor advertisements directly  with  corre‐
1302                     sponding Ethernet address E:
1303
1304                     nd_na_router {
1305                         eth.src = E;
1306                         ip6.src = A;
1307                         nd.target = A;
1308                         nd.tll = E;
1309                         outport = inport;
1310                         flags.loopback = 1;
1311                         output;
1312                     };
1313
1314
1315                     These  flows  are  omitted  for logical ports (other than
1316                     router ports or localport ports) that  are  down  (unless
1317                     ignore_lsp_down  is  configured as true in options column
1318                     of NB_Global table of the Northbound database), for logi‐
1319                     cal ports of type virtual and for logical ports with ’un‐
1320                     known’ address set.
1321
1322                     The above NDP responder flows are added for the  list  of
1323                     IPv6  addresses if defined in options:arp_proxy column of
1324                     Logical_Switch_Port table for  logical  switch  ports  of
1325                     type router.
1326
1327              •      Priority-100  flows  with match criteria like the ARP and
1328                     ND flows above, except that they only match packets  from
1329                     the  inport  that owns the IP addresses in question, with
1330                     action next;. These flows prevent OVN from  replying  to,
1331                     for  example,  an ARP request emitted by a VM for its own
1332                     IP address. A VM only makes this kind of request  to  at‐
1333                     tempt  to  detect  a  duplicate IP address assignment, so
1334                     sending a reply will prevent the VM from accepting the IP
1335                     address that it owns.
1336
1337                     In  place  of  next;, it would be reasonable to use drop;
1338                     for the flows’ actions. If everything is working as it is
1339                     configured,  then  this would produce equivalent results,
1340                     since no host should reply to the request. But ARPing for
1341                     one’s  own  IP  address  is intended to detect situations
1342                     where the network is not working as configured, so  drop‐
1343                     ping the request would frustrate that intent.
1344
              •      For each SVC_MON_SRC_IP defined in the value of the
                     ip_port_mappings:ENDPOINT_IP column of the Load_Balancer
                     table, a priority-110 logical flow is added with the
                     match arp.tpa == SVC_MON_SRC_IP && arp.op == 1 and
                     applies the action
1350
1351                     eth.dst = eth.src;
1352                     eth.src = E;
1353                     arp.op = 2; /* ARP reply. */
1354                     arp.tha = arp.sha;
1355                     arp.sha = E;
1356                     arp.tpa = arp.spa;
1357                     arp.spa = A;
1358                     outport = inport;
1359                     flags.loopback = 1;
1360                     output;
1361
1362
1363                     where  E is the service monitor source mac defined in the
1364                     options:svc_monitor_mac column in  the  NB_Global  table.
1365                     This mac is used as the source mac in the service monitor
1366                     packets for the load balancer endpoint IP health checks.
1367
1368                     SVC_MON_SRC_IP is used as the source ip  in  the  service
1369                     monitor  IPv4  packets  for the load balancer endpoint IP
1370                     health checks.
1371
1372                     These flows are required if an ARP request  is  sent  for
1373                     the IP SVC_MON_SRC_IP.
1374
                     For IPv6, a similar flow is added with the following
                     action:
1377
1378                     nd_na {
1379                         eth.dst = eth.src;
1380                         eth.src = E;
1381                         ip6.src = A;
1382                         nd.target = A;
1383                         nd.tll = E;
1384                         outport = inport;
1385                         flags.loopback = 1;
1386                         output;
1387                     };
1388
1389
              •      For each VIP configured in the table Forwarding_Group, a
                     priority-50 logical flow is added with the match arp.tpa
                     == vip && arp.op == 1 and applies the action
1394
1395                     eth.dst = eth.src;
1396                     eth.src = E;
1397                     arp.op = 2; /* ARP reply. */
1398                     arp.tha = arp.sha;
1399                     arp.sha = E;
1400                     arp.tpa = arp.spa;
1401                     arp.spa = A;
1402                     outport = inport;
1403                     flags.loopback = 1;
1404                     output;
1405
1406
                     where E is the forwarding group’s mac defined in the
                     vmac column.

                     A is used either as the destination IP for load
                     balancing traffic to child ports or as the nexthop to
                     hosts behind the child ports.

                     These flows are required to respond to ARP requests sent
                     for the IP vip.
1416
1417              •      One priority-0 fallback flow that matches all packets and
1418                     advances to the next table.
1419
1420     Ingress Table 22: DHCP option processing
1421
1422       This  table adds the DHCPv4 options to a DHCPv4 packet from the logical
1423       ports configured with IPv4 address(es) and DHCPv4  options,  and  simi‐
1424       larly  for  DHCPv6  options. This table also adds flows for the logical
1425       ports of type external.
1426
1427              •      A priority-100 logical flow is added  for  these  logical
1428                     ports which matches the IPv4 packet with udp.src = 68 and
1429                     udp.dst = 67 and applies the action put_dhcp_opts and ad‐
1430                     vances the packet to the next table.
1431
1432                     reg0[3] = put_dhcp_opts(offer_ip = ip, options...);
1433                     next;
1434
1435
1436                     For  DHCPDISCOVER  and  DHCPREQUEST,  this transforms the
1437                     packet into a DHCP reply, adds the DHCP offer IP  ip  and
1438                     options  to  the  packet,  and stores 1 into reg0[3]. For
1439                     other kinds of packets, it just stores  0  into  reg0[3].
1440                     Either way, it continues to the next table.
1441
1442              •      A  priority-100  logical  flow is added for these logical
1443                     ports which matches the IPv6 packet with  udp.src  =  546
1444                     and  udp.dst = 547 and applies the action put_dhcpv6_opts
1445                     and advances the packet to the next table.
1446
1447                     reg0[3] = put_dhcpv6_opts(ia_addr = ip, options...);
1448                     next;
1449
1450
1451                     For DHCPv6 Solicit/Request/Confirm packets,  this  trans‐
1452                     forms  the packet into a DHCPv6 Advertise/Reply, adds the
1453                     DHCPv6 offer IP ip and options to the packet, and  stores
1454                     1  into  reg0[3].  For  other  kinds  of packets, it just
1455                     stores 0 into reg0[3]. Either way, it  continues  to  the
1456                     next table.
1457
              •      A priority-0 flow that matches all packets and advances
                     to the next table.
1460
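       As a concrete illustration of the DHCPv4 flow above, a logical port
       with a hypothetical offer IP of 10.0.0.2 might use an action such as
       the following (the option names are standard put_dhcp_opts options;
       the values are made up):

              reg0[3] = put_dhcp_opts(offer_ip = 10.0.0.2,
                                      router = 10.0.0.1,
                                      netmask = 255.255.255.0,
                                      server_id = 10.0.0.1,
                                      lease_time = 3600);
              next;
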
1461     Ingress Table 23: DHCP responses
1462
1463       This table implements DHCP responder for the DHCP replies generated  by
1464       the previous table.
1465
1466              •      A  priority  100  logical  flow  is added for the logical
1467                     ports configured with DHCPv4 options which  matches  IPv4
1468                     packets with udp.src == 68 && udp.dst == 67 && reg0[3] ==
1469                     1 and responds back to the inport  after  applying  these
1470                     actions. If reg0[3] is set to 1, it means that the action
1471                     put_dhcp_opts was successful.
1472
1473                     eth.dst = eth.src;
1474                     eth.src = E;
1475                     ip4.src = S;
1476                     udp.src = 67;
1477                     udp.dst = 68;
1478                     outport = P;
1479                     flags.loopback = 1;
1480                     output;
1481
1482
1483                     where E is the server MAC address and  S  is  the  server
1484                     IPv4  address  defined  in  the DHCPv4 options. Note that
1485                     ip4.dst field is handled by put_dhcp_opts.
1486
1487                     (This terminates ingress packet  processing;  the  packet
1488                     does not go to the next ingress table.)
1489
1490              •      A  priority  100  logical  flow  is added for the logical
1491                     ports configured with DHCPv6 options which  matches  IPv6
1492                     packets  with udp.src == 546 && udp.dst == 547 && reg0[3]
1493                     == 1 and responds back to the inport after applying these
1494                     actions. If reg0[3] is set to 1, it means that the action
1495                     put_dhcpv6_opts was successful.
1496
1497                     eth.dst = eth.src;
1498                     eth.src = E;
1499                     ip6.dst = A;
1500                     ip6.src = S;
1501                     udp.src = 547;
1502                     udp.dst = 546;
1503                     outport = P;
1504                     flags.loopback = 1;
1505                     output;
1506
1507
1508                     where E is the server MAC address and  S  is  the  server
1509                     IPv6  LLA address generated from the server_id defined in
1510                     the DHCPv6 options and A is the IPv6 address  defined  in
1511                     the logical port’s addresses column.
1512
                     (This terminates ingress packet processing; the packet
                     does not go to the next ingress table.)
1515
              •      A priority-0 flow that matches all packets and advances
                     to the next table.
1518
     Ingress Table 24: DNS Lookup
1520
1521       This  table  looks  up  and resolves the DNS names to the corresponding
1522       configured IP address(es).
1523
1524              •      A priority-100 logical flow for each logical switch data‐
1525                     path  if it is configured with DNS records, which matches
1526                     the IPv4 and IPv6 packets with udp.dst = 53  and  applies
1527                     the action dns_lookup and advances the packet to the next
1528                     table.
1529
1530                     reg0[4] = dns_lookup(); next;
1531
1532
1533                     For valid DNS packets, this transforms the packet into  a
1534                     DNS  reply  if the DNS name can be resolved, and stores 1
1535                     into reg0[4]. For failed DNS resolution or other kinds of
1536                     packets,  it  just  stores 0 into reg0[4]. Either way, it
1537                     continues to the next table.
1538
     Ingress Table 25: DNS Responses
1540
1541       This table implements DNS responder for the DNS  replies  generated  by
1542       the previous table.
1543
1544              •      A priority-100 logical flow for each logical switch data‐
1545                     path if it is configured with DNS records, which  matches
1546                     the IPv4 and IPv6 packets with udp.dst = 53 && reg0[4] ==
1547                     1 and responds back to the inport  after  applying  these
1548                     actions. If reg0[4] is set to 1, it means that the action
1549                     dns_lookup was successful.
1550
1551                     eth.dst <-> eth.src;
1552                     ip4.src <-> ip4.dst;
1553                     udp.dst = udp.src;
1554                     udp.src = 53;
1555                     outport = P;
1556                     flags.loopback = 1;
1557                     output;
1558
1559
1560                     (This terminates ingress packet  processing;  the  packet
1561                     does not go to the next ingress table.)
1562
     Ingress Table 26: External ports
1564
       Traffic from the external logical ports enters the ingress datapath
       pipeline via the localnet port. This table adds the below logical
       flows to handle the traffic from these ports.
1568
1569              •      A  priority-100  flow  is added for each external logical
1570                     port which doesn’t  reside  on  a  chassis  to  drop  the
1571                     ARP/IPv6  NS  request to the router IP(s) (of the logical
1572                     switch) which matches on the inport of the external logi‐
1573                     cal  port and the valid eth.src address(es) of the exter‐
1574                     nal logical port.
1575
                     This flow guarantees that ARP/NS requests to the router
                     IP address from the external ports are responded to only
                     by the chassis that has claimed these external ports.
                     All the other chassis drop these packets.

                     A priority-100 flow is added for each external logical
                     port which doesn’t reside on a chassis to drop any
                     packet destined to the router mac, with the match inport
                     == external && eth.src == E && eth.dst == R &&
                     !is_chassis_resident("external") where E is the external
                     port mac and R is the router port mac.
1587
              •      A priority-0 flow that matches all packets and advances
                     to the next table.
1590
     Ingress Table 27: Destination Lookup
1592
1593       This  table  implements  switching  behavior. It contains these logical
1594       flows:
1595
              •      A priority-110 flow with the match eth.src == E for all
                     logical switch datapaths that applies the action
                     handle_svc_check(inport), where E is the service monitor
                     mac defined in the options:svc_monitor_mac column of the
                     NB_Global table.
1601
1602              •      A priority-100 flow that punts all  IGMP/MLD  packets  to
1603                     ovn-controller  if  multicast  snooping is enabled on the
1604                     logical switch.
1605
              •      Priority-90 flows that forward registered IP multicast
                     traffic to their corresponding multicast group, which
                     ovn-northd creates based on learnt IGMP_Group entries.
                     The flows also forward packets to the MC_MROUTER_FLOOD
                     multicast group, which ovn-northd populates with all the
                     logical ports that are connected to logical routers with
                     options:mcast_relay=’true’.
1613
1614              •      A priority-85 flow that forwards all IP multicast traffic
1615                     destined to 224.0.0.X to the MC_FLOOD_L2 multicast group,
1616                     which ovn-northd populates with  all  non-router  logical
1617                     ports.
1618
1619              •      A priority-85 flow that forwards all IP multicast traffic
1620                     destined to reserved multicast IPv6 addresses (RFC  4291,
1621                     2.7.1,  e.g.,  Solicited-Node  multicast) to the MC_FLOOD
1622                     multicast group, which ovn-northd populates with all  en‐
1623                     abled logical ports.
1624
              •      A priority-80 flow that forwards all unregistered IP
                     multicast traffic to the MC_STATIC multicast group,
                     which ovn-northd populates with all the logical ports
                     that have options:mcast_flood=’true’. The flow also
                     forwards unregistered IP multicast traffic to the
                     MC_MROUTER_FLOOD multicast group, which ovn-northd
                     populates with all the logical ports connected to
                     logical routers that have options:mcast_relay=’true’.

              •      A priority-80 flow that drops all unregistered IP
                     multicast traffic if other_config:mcast_snoop=’true’ and
                     other_config:mcast_flood_unregistered=’false’ and the
                     switch is not connected to a logical router that has
                     options:mcast_relay=’true’ and the switch doesn’t have
                     any logical port with options:mcast_flood=’true’.
1640
1641              •      Priority-80  flows  for  each  IP address/VIP/NAT address
1642                     owned by a router port connected  to  the  switch.  These
1643                     flows  match ARP requests and ND packets for the specific
1644                     IP addresses. Matched packets are forwarded only  to  the
1645                     router  that  owns  the IP address and to the MC_FLOOD_L2
1646                     multicast group which  contains  all  non-router  logical
1647                     ports.
1648
1649              •      Priority-75  flows  for  each port connected to a logical
1650                     router matching  self  originated  ARP  request/RARP  re‐
1651                     quest/ND  packets.  These  packets  are  flooded  to  the
1652                     MC_FLOOD_L2 which contains all non-router logical ports.
1653
1654              •      A priority-72 flow that outputs all ARP requests  and  ND
1655                     packets  with  an Ethernet broadcast or multicast eth.dst
1656                     to the MC_FLOOD_L2 multicast group if other_config:broad‐
1657                     cast-arps-to-all-routers=true.
1658
1659              •      A  priority-70 flow that outputs all packets with an Eth‐
1660                     ernet broadcast or multicast eth.dst to the MC_FLOOD mul‐
1661                     ticast group.
1662
              •      One priority-50 flow that matches each known Ethernet
                     address against eth.dst. The action of this flow outputs
                     the packet to the single associated output port if it is
                     enabled; a drop; action is applied if the LSP is
                     disabled. (See the sketch after this list.)
1667
1668                     For the Ethernet address on a logical switch port of type
1669                     router,  when that logical switch port’s addresses column
1670                     is set to router and the connected  logical  router  port
1671                     has a gateway chassis:
1672
1673                     •      The  flow  for the connected logical router port’s
1674                            Ethernet address is only programmed on the gateway
1675                            chassis.
1676
1677                     •      If  the  logical router has rules specified in nat
1678                            with external_mac, then those addresses  are  also
1679                            used  to  populate the switch’s destination lookup
1680                            on the chassis where logical_port is resident.
1681
1682                     For the Ethernet address on a logical switch port of type
1683                     router,  when that logical switch port’s addresses column
1684                     is set to router and the connected  logical  router  port
                     specifies a reside-on-redirect-chassis and the logical
                     router to which the connected logical router port
                     belongs has a distributed gateway LRP:
1688
1689                     •      The  flow  for the connected logical router port’s
1690                            Ethernet address is only programmed on the gateway
1691                            chassis.
1692
                     For each forwarding group configured on the logical
                     switch datapath, a priority-50 flow that matches on
                     eth.dst == VIP with an action of
                     fwd_group(childports=args), where args contains comma-
                     separated logical switch child ports to load balance to.
                     If liveness is enabled, then the action also includes
                     liveness=true.
1700
1701              •      One priority-0 fallback flow  that  matches  all  packets
1702                     with  the  action  outport = get_fdb(eth.dst); next;. The
1703                     action get_fdb gets the port for the eth.dst in  the  MAC
1704                     learning  table  of the logical switch datapath. If there
1705                     is no entry for eth.dst in the MAC learning  table,  then
1706                     it stores none in the outport.
1707
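       For instance, the priority-50 unicast flow described above, for a
       hypothetical enabled port lsp1 owning Ethernet address
       50:54:00:00:00:01, would look roughly like:

              eth.dst == 50:54:00:00:00:01

       with the action

              /* Deliver to the port that owns this address. */
              outport = "lsp1"; output;
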
     Ingress Table 28: Destination unknown
1709
       This table handles the packets whose destination was not found in the
       MAC learning table of the logical switch datapath. It contains the
       following flows.
1713
1714              •      Priority 50 flow with the match outport == P is added for
1715                     each disabled Logical Switch Port P. This flow has action
1716                     drop;.
1717
1718              •      If  the  logical  switch has logical ports with ’unknown’
1719                     addresses set, then the below logical flow is added
1720
                     •      Priority 50 flow with the match outport == "none"
                            that outputs the packets to the MC_UNKNOWN
                            multicast group, which ovn-northd populates with
                            all enabled logical ports that accept unknown
                            destination packets. As a small optimization, if
                            no logical ports accept unknown destination
                            packets, ovn-northd omits this multicast group
                            and logical flow.

                     If the logical switch has no logical ports with
                     ’unknown’ address set, then the below logical flow is
                     added

                     •      Priority 50 flow with the match outport == "none"
                            that drops the packets.
1735
1736              •      One  priority-0  fallback flow that outputs the packet to
1737                     the egress stage with the outport learnt from get_fdb ac‐
1738                     tion.
1739
1740     Egress Table 0: to-lport Pre-ACLs
1741
1742       This is similar to ingress table Pre-ACLs except for to-lport traffic.
1743
1744       This table also has a priority-110 flow with the match eth.src == E for
1745       all logical switch datapaths to move traffic to the next table. Where E
1746       is  the service monitor mac defined in the options:svc_monitor_mac col‐
1747       umn of NB_Global table.
1748
1749       This table also has a priority-110 flow with the match outport == I for
1750       all logical switch datapaths to move traffic to the next table. Where I
1751       is the peer of a logical router port. This flow is added  to  skip  the
1752       connection  tracking  of  packets which will be entering logical router
1753       datapath from logical switch datapath for routing.
1754
1755     Egress Table 1: Pre-LB
1756
1757       This table is similar to ingress table Pre-LB. It contains a priority-0
1758       flow  that simply moves traffic to the next table. Moreover it contains
1759       two priority-110 flows to move multicast, IPv6 Neighbor  Discovery  and
1760       MLD  traffic  to  the next table. If any load balancing rules exist for
1761       the datapath, a priority-100 flow is added with a match of ip  and  ac‐
1762       tion  of  reg0[2] = 1; next; to act as a hint for table Pre-stateful to
1763       send IP packets to the connection tracker for  packet  de-fragmentation
1764       and  possibly  DNAT  the destination VIP to one of the selected backend
1765       for already committed load balanced traffic.
1766
1767       This table also has a priority-110 flow with the match eth.src == E for
1768       all logical switch datapaths to move traffic to the next table. Where E
1769       is the service monitor mac defined in the options:svc_monitor_mac  col‐
1770       umn of NB_Global table.
1771
       This table also has a priority-110 flow with the match outport == I
       for all logical switch datapaths to move traffic to the next table
       and, if there are no stateful ACLs, clear the ct_state. Where I is the
       peer of a logical router port. This flow is added to skip the
       connection tracking of packets which will be entering the logical
       router datapath from the logical switch datapath for routing.
1778
1779     Egress Table 2: Pre-stateful
1780
1781       This is similar to ingress table Pre-stateful. This table adds the  be‐
1782       low 3 logical flows.
1783
              •      A priority-120 flow that sends the packets to the
                     connection tracker using ct_lb_mark; as the action, so
                     that already established traffic gets unDNATted from the
                     backend IP to the load balancer VIP based on a hint
                     provided by the previous tables with a match for reg0[2]
                     == 1. If the packet was not DNATted earlier, then
                     ct_lb_mark functions like ct_next.
1791
1792              •      A  priority-100  flow  sends  the  packets  to connection
1793                     tracker based on a hint provided by the  previous  tables
1794                     (with a match for reg0[0] == 1) by using the ct_next; ac‐
1795                     tion.
1796
1797              •      A priority-0 flow that matches all packets to advance  to
1798                     the next table.
1799
1800     Egress Table 3: from-lport ACL hints
1801
1802       This is similar to ingress table ACL hints.
1803
1804     Egress Table 4: to-lport ACL evaluation
1805
1806       This  is similar to ingress table ACL eval except for to-lport ACLs. As
1807       a reminder, these flows use the following  register  bits  to  indicate
1808       their  verdicts.  Allow-type ACLs set reg8[16], drop ACLs set reg8[17],
1809       and reject ACLs set reg8[18].
1810
1811       Also like with ingress ACLs, egress ACLs can have a configured tier. If
1812       a  tier  is  configured,  then  the  current  tier counter is evaluated
1813       against the ACL’s configured tier in addition to the ACL’s  match.  The
1814       current tier counter is stored in reg8[30..31].
1815
       Similar to the ingress table, a priority-65532 flow is added to allow
       IPv6 Neighbor solicitation, Neighbor advertisement, Router
       solicitation, Router advertisement and MLD packets regardless of other
       ACLs defined.
1819
1820       In addition, the following flows are added.
1821
              •      A priority 34000 logical flow is added for each logical
                     port which has DHCPv4 options defined to allow the
                     DHCPv4 reply packet and which has DHCPv6 options defined
                     to allow the DHCPv6 reply packet from Ingress Table 23:
                     DHCP responses. This is indicated by setting the allow
                     bit.
1828
              •      A priority 34000 logical flow is added for each logical
                     switch datapath configured with DNS records with the
                     match udp.dst = 53 to allow the DNS reply packet from
                     Ingress Table 25: DNS Responses. This is indicated by
                     setting the allow bit.
1834
1835              •      A priority 34000 logical flow is added for  each  logical
1836                     switch  datapath  with the match eth.src = E to allow the
1837                     service monitor  request  packet  generated  by  ovn-con‐
1838                     troller with the action next, where E is the service mon‐
1839                     itor mac defined in the options:svc_monitor_mac column of
1840                     NB_Global  table.  This is indicated by setting the allow
1841                     bit.
1842
1843     Egress Table 5: to-lport ACL action
1844
1845       This is similar to ingress table ACL action.
1846
1847     Egress Table 6: to-lport QoS Marking
1848
1849       This is similar to ingress table  QoS  marking  except  they  apply  to
1850       to-lport QoS rules.
1851
1852     Egress Table 7: to-lport QoS Meter
1853
1854       This  is  similar  to  ingress  table  QoS  meter  except they apply to
1855       to-lport QoS rules.
1856
1857     Egress Table 8: Stateful
1858
1859       This is similar to ingress table Stateful  except  that  there  are  no
1860       rules added for load balancing new connections.
1861
1862     Egress Table 9: Egress Port Security - check
1863
1864       This  is similar to the port security logic in table Ingress Port Secu‐
1865       rity check except that action check_out_port_sec is used to  check  the
1866       port security rules. This table adds the below logical flows.
1867
              •      A priority 100 flow which matches on the multicast
                     traffic and applies the action REGBIT_PORT_SEC_DROP = 0;
                     next; to skip the out port security checks.

              •      A priority 0 logical flow is added which matches on all
                     the packets and applies the action REGBIT_PORT_SEC_DROP
                     = check_out_port_sec(); next;. The action
                     check_out_port_sec applies the port security rules based
                     on the addresses defined in the port_security column of
                     the Logical_Switch_Port table before delivering the
                     packet to the outport.
1879
1880     Egress Table 10: Egress Port Security - Apply
1881
       This is similar to the ingress port security logic in the ingress
       table Ingress Port Security - Apply. This table drops the packets if
       the port security check failed in the previous stage, i.e., the
       register bit REGBIT_PORT_SEC_DROP is set to 1.
1886
1887       The following flows are added.
1888
              •      For each port configured with egress QoS in the
                     options:qdisc_queue_id column of Logical_Switch_Port, on
                     a logical switch that also has a localnet port, a
                     priority 110 flow is added which matches on the localnet
                     outport and on the port’s inport and applies the action
                     set_queue(id); output;.

              •      For each localnet port configured with egress QoS in the
                     options:qdisc_queue_id column of Logical_Switch_Port, a
                     priority 100 flow is added which matches on the localnet
                     outport and applies the action set_queue(id); output;.
1900
1901                     Please remember to mark the corresponding physical inter‐
1902                     face with ovn-egress-iface set to true in external_ids.
1903
1904              •      A  priority-50 flow that drops the packet if the register
1905                     bit REGBIT_PORT_SEC_DROP is set to 1.
1906
1907              •      A priority-0 flow that outputs the packet to the outport.
1908
1909   Logical Router Datapaths
       Logical router datapaths will only exist for Logical_Router rows in
       the OVN_Northbound database that do not have enabled set to false.
1912
1913     Ingress Table 0: L2 Admission Control
1914
1915       This  table drops packets that the router shouldn’t see at all based on
1916       their Ethernet headers. It contains the following flows:
1917
1918              •      Priority-100 flows to drop packets with VLAN tags or mul‐
1919                     ticast Ethernet source addresses.
1920
1921              •      For each enabled router port P with Ethernet address E, a
1922                     priority-50 flow that matches inport == P  &&  (eth.mcast
1923                     || eth.dst == E), stores the router port ethernet address
1924                     and advances to next table, with  action  xreg0[0..47]=E;
1925                     next;.
1926
1927                     For  the  gateway  port  on  a distributed logical router
1928                     (where one of the logical router ports specifies a  gate‐
1929                     way  chassis),  the  above  flow matching eth.dst == E is
1930                     only programmed on the gateway port instance on the gate‐
                     way chassis. If the LRP’s logical switch has an attached
                     LSP of vtep type, the is_chassis_resident() part is not
                     added to the logical flow, to allow traffic originating
                     from the logical switch to reach LR services (LBs, NAT).
1935
1936                     For each gateway port GW on a distributed logical  router
1937                     a  priority-120  flow  that  matches  inport  == cr-GW &&
1938                     !is_chassis_resident(cr-GW) where cr-GW  is  the  chassis
1939                     resident  port of GW, stores GW as inport and advances to
1940                     the next table.
1941
1942                     For a distributed logical router or  for  gateway  router
1943                     where the port is configured with options:gateway_mtu the
1944                     action   of   the   above   flow   is   modified   adding
1945                     check_pkt_larger in order to mark the packet setting REG‐
1946                     BIT_PKT_LARGER if the size is greater than  the  MTU.  If
1947                     the  port is also configured with options:gateway_mtu_by‐
1948                     pass then another flow is added, with priority-55, to by‐
1949                     pass  the check_pkt_larger flow. This is useful for traf‐
1950                     fic that normally doesn’t need to be fragmented  and  for
1951                     which  check_pkt_larger,  which might not be offloadable,
1952                     is not really needed. One such example is TCP traffic.
1953
1954              •      For each dnat_and_snat NAT rule on a  distributed  router
1955                     that  specifies  an external Ethernet address E, a prior‐
1956                     ity-50 flow that matches inport == GW &&  eth.dst  ==  E,
1957                     where  GW  is the logical router distributed gateway port
1958                     corresponding to the NAT rule  (specified  or  inferred),
1959                     with action xreg0[0..47]=E; next;.
1960
1961                     This flow is only programmed on the gateway port instance
1962                     on the chassis where the logical_port  specified  in  the
1963                     NAT rule resides.
1964
1965              •      A  priority-0  logical  flow that matches all packets not
1966                     already handled (match 1) and drops them (action drop;).
1967
1969
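       As an illustration of the options:gateway_mtu behavior above, with a
       hypothetical MTU of 1500 the modified flow action becomes, roughly
       (the extra bytes allow for the L2 header; the exact allowance is an
       implementation detail):

              REGBIT_PKT_LARGER = check_pkt_larger(1518); xreg0[0..47] = E;
              next;
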
1970     Ingress Table 1: Neighbor lookup
1971
1972       For ARP and IPv6 Neighbor Discovery packets, this table looks into  the
1973       MAC_Binding  records  to  determine if OVN needs to learn the mac bind‐
1974       ings. Following flows are added:
1975
1976              •      For each router port P that owns IP address A, which  be‐
1977                     longs to subnet S with prefix length L, if the option al‐
1978                     ways_learn_from_arp_request is true for  this  router,  a
1979                     priority-100  flow  is added which matches inport == P &&
1980                     arp.spa == S/L && arp.op == 1 (ARP request) with the fol‐
1981                     lowing actions:
1982
1983                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
1984                     next;
1985
1986
1987                     If the option always_learn_from_arp_request is false, the
1988                     following two flows are added.
1989
1990                     A priority-110 flow is added which matches inport == P &&
1991                     arp.spa  ==  S/L  && arp.tpa == A && arp.op == 1 (ARP re‐
1992                     quest) with the following actions:
1993
1994                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
1995                     reg9[3] = 1;
1996                     next;
1997
1998
1999                     A priority-100 flow is added which matches inport == P &&
2000                     arp.spa == S/L && arp.op == 1 (ARP request) with the fol‐
2001                     lowing actions:
2002
2003                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
2004                     reg9[3] = lookup_arp_ip(inport, arp.spa);
2005                     next;
2006
2007
                     If the logical router port P is a distributed gateway
                     router port, an additional match is_chassis_resident(cr-P)
                     is added to all of these flows.
2011
2012              •      A priority-100 flow which matches on  ARP  reply  packets
2013                     and    applies    the   actions   if   the   option   al‐
2014                     ways_learn_from_arp_request is true:
2015
2016                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
2017                     next;
2018
2019
2020                     If the option always_learn_from_arp_request is false, the
2021                     above actions will be:
2022
2023                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
2024                     reg9[3] = 1;
2025                     next;
2026
2027
              •      A priority-100 flow which matches on IPv6 Neighbor
                     Discovery advertisement packets and applies the actions
                     if the option always_learn_from_arp_request is true:
2031
2032                     reg9[2] = lookup_nd(inport, nd.target, nd.tll);
2033                     next;
2034
2035
2036                     If the option always_learn_from_arp_request is false, the
2037                     above actions will be:
2038
2039                     reg9[2] = lookup_nd(inport, nd.target, nd.tll);
2040                     reg9[3] = 1;
2041                     next;
2042
2043
              •      A priority-100 flow which matches on IPv6 Neighbor
                     Discovery solicitation packets and applies the actions if
                     the option always_learn_from_arp_request is true:
2047
2048                     reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
2049                     next;
2050
2051
2052                     If the option always_learn_from_arp_request is false, the
2053                     above actions will be:
2054
2055                     reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
2056                     reg9[3] = lookup_nd_ip(inport, ip6.src);
2057                     next;
2058
2059
2060              •      A  priority-0  fallback flow that matches all packets and
2061                     applies the action  reg9[2]  =  1;  next;  advancing  the
2062                     packet to the next table.
2063
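       The option controlling the flows above is a per-router setting; a
       minimal sketch, assuming a logical router named lr0:

              # Only learn mac bindings from ARP requests that target one of
              # the router's own IP addresses (the priority-110/100 pair).
              ovn-nbctl set Logical_Router lr0 \
                  options:always_learn_from_arp_request=false
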
2064     Ingress Table 2: Neighbor learning
2065
2066       This  table  adds flows to learn the mac bindings from the ARP and IPv6
2067       Neighbor Solicitation/Advertisement packets if it is  needed  according
2068       to the lookup results from the previous stage.
2069
2070       reg9[2] will be 1 if the lookup_arp/lookup_nd in the previous table was
2071       successful or skipped, meaning no need to learn mac  binding  from  the
2072       packet.
2073
2074       reg9[3] will be 1 if the lookup_arp_ip/lookup_nd_ip in the previous ta‐
2075       ble was successful or skipped, meaning it is ok to  learn  mac  binding
2076       from the packet (if reg9[2] is 0).
2077
              •      A priority-100 flow with the match reg9[2] == 1 ||
                     reg9[3] == 0 that advances the packet to the next table,
                     as there is no need to learn the neighbor.

              •      A priority-95 flow with the match nd_ns && (ip6.src == 0
                     || nd.sll == 0) that applies the action next;

              •      A priority-90 flow with the match arp that applies the
                     action put_arp(inport, arp.spa, arp.sha); next;

              •      A priority-95 flow with the match nd_na && nd.tll == 0
                     that applies the action put_nd(inport, nd.target,
                     eth.src); next;

              •      A priority-90 flow with the match nd_na that applies the
                     action put_nd(inport, nd.target, nd.tll); next;

              •      A priority-90 flow with the match nd_ns that applies the
                     action put_nd(inport, ip6.src, nd.sll); next;
2097
2098              •      A  priority-0  logical  flow that matches all packets not
2099                     already handled (match 1) and drops them (action drop;).
2100
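       The flows of this table can be inspected in the Southbound database; a
       sketch, assuming a logical router named lr0 (lr_in_learn_neighbor is
       the stage name used for this table):

              # Show only the neighbor learning stage of lr0's pipeline.
              ovn-sbctl lflow-list lr0 | grep lr_in_learn_neighbor
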
2101     Ingress Table 3: IP Input
2102
2103       This table is the core of the logical router datapath functionality. It
2104       contains  the following flows to implement very basic IP host function‐
2105       ality.
2106
              •      For each dnat_and_snat NAT rule on a distributed logical
                     router or gateway router with a gateway port configured
                     with options:gateway_mtu set to a valid integer value M,
                     a priority-160 flow with the match inport == LRP &&
                     REGBIT_PKT_LARGER && REGBIT_EGRESS_LOOPBACK == 0, where
                     LRP is the logical router port, applies the following
                     actions for IPv4 and IPv6 respectively:
2114
2115                     icmp4_error {
2116                         icmp4.type = 3; /* Destination Unreachable. */
2117                         icmp4.code = 4;  /* Frag Needed and DF was Set. */
2118                         icmp4.frag_mtu = M;
2119                         eth.dst = eth.src;
2120                         eth.src = E;
2121                         ip4.dst = ip4.src;
2122                         ip4.src = I;
2123                         ip.ttl = 255;
2124                         REGBIT_EGRESS_LOOPBACK = 1;
                         REGBIT_PKT_LARGER = 0;
2126                         outport = LRP;
2127                         flags.loopback = 1;
2128                         output;
2129                     };
2130                     icmp6_error {
2131                         icmp6.type = 2;
2132                         icmp6.code = 0;
2133                         icmp6.frag_mtu = M;
2134                         eth.dst = eth.src;
2135                         eth.src = E;
2136                         ip6.dst = ip6.src;
2137                         ip6.src = I;
2138                         ip.ttl = 255;
2139                         REGBIT_EGRESS_LOOPBACK = 1;
                         REGBIT_PKT_LARGER = 0;
2141                         outport = LRP;
2142                         flags.loopback = 1;
2143                         output;
2144                     };
2145
2146
2147                     where E and I are the NAT rule external mac  and  IP  re‐
2148                     spectively.
2149
              •      For distributed logical routers or gateway routers with
                     a gateway port configured with options:gateway_mtu set
                     to a valid integer value M, a priority-150 flow with the
                     match inport == LRP && REGBIT_PKT_LARGER &&
                     REGBIT_EGRESS_LOOPBACK == 0, where LRP is the logical
                     router port, applies the following actions for IPv4 and
                     IPv6 respectively:
2157
2158                     icmp4_error {
2159                         icmp4.type = 3; /* Destination Unreachable. */
2160                         icmp4.code = 4;  /* Frag Needed and DF was Set. */
2161                         icmp4.frag_mtu = M;
2162                         eth.dst = E;
2163                         ip4.dst = ip4.src;
2164                         ip4.src = I;
2165                         ip.ttl = 255;
2166                         REGBIT_EGRESS_LOOPBACK = 1;
                         REGBIT_PKT_LARGER = 0;
2168                         next(pipeline=ingress, table=0);
2169                     };
2170                     icmp6_error {
2171                         icmp6.type = 2;
2172                         icmp6.code = 0;
2173                         icmp6.frag_mtu = M;
2174                         eth.dst = E;
2175                         ip6.dst = ip6.src;
2176                         ip6.src = I;
2177                         ip.ttl = 255;
2178                         REGBIT_EGRESS_LOOPBACK = 1;
                         REGBIT_PKT_LARGER = 0;
2180                         next(pipeline=ingress, table=0);
2181                     };
2182
2183
2184              •      For  each NAT entry of a distributed logical router (with
2185                     distributed gateway router port(s)) of type snat, a  pri‐
2186                     ority-120 flow with the match inport == P && ip4.src == A
2187                     advances the packet to the next pipeline, where P is  the
2188                     distributed  logical router port corresponding to the NAT
2189                     entry (specified or inferred) and A  is  the  external_ip
2190                     set  in  the  NAT  entry.  If  A is an IPv6 address, then
2191                     ip6.src is used for the match.
2192
                     The above flow is required to handle the routing of
                     east/west NAT traffic.
2195
              •      For each BFD port, the two following priority-110 flows
                     are added to manage BFD traffic (see the configuration
                     sketch at the end of this table's flows):

                     •      If ip4.src or ip6.src is any IP address owned by
                            the router port and udp.dst == 3784, the packet
                            is advanced to the next pipeline stage.

                     •      If ip4.dst or ip6.dst is any IP address owned by
                            the router port and udp.dst == 3784, the
                            handle_bfd_msg action is executed.
2206
              •      L3 admission control: Priority-120 flows allow IGMP and
                     MLD packets if the router has logical ports that have
                     options:mcast_flood=’true’.
2210
2211              •      L3 admission control: A priority-100 flow  drops  packets
2212                     that match any of the following:
2213
2214ip4.src[28..31] == 0xe (multicast source)
2215
2216ip4.src == 255.255.255.255 (broadcast source)
2217
2218ip4.src  ==  127.0.0.0/8 || ip4.dst == 127.0.0.0/8
2219                            (localhost source or destination)
2220
2221ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8 (zero
2222                            network source or destination)
2223
2224ip4.src  or ip6.src is any IP address owned by the
2225                            router, unless the packet was recirculated due  to
2226                            egress    loopback    as    indicated    by   REG‐
2227                            BIT_EGRESS_LOOPBACK.
2228
2229ip4.src is the broadcast address of any IP network
2230                            known to the router.
2231
              •      A priority-100 flow parses DHCPv6 replies from IPv6
                     prefix delegation routers (udp.src == 547 && udp.dst ==
                     546). The handle_dhcpv6_reply action is used to send
                     IPv6 prefix delegation messages to the delegation
                     router.
2236
2237              •      ICMP echo reply. These flows reply to ICMP echo  requests
2238                     received  for the router’s IP address. Let A be an IP ad‐
2239                     dress owned by a router port. Then, for each A that is an
2240                     IPv4  address, a priority-90 flow matches on ip4.dst == A
2241                     and icmp4.type == 8 && icmp4.code ==  0  (ICMP  echo  re‐
2242                     quest). For each A that is an IPv6 address, a priority-90
2243                     flow matches on ip6.dst == A and  icmp6.type  ==  128  &&
2244                     icmp6.code  ==  0  (ICMPv6 echo request). The port of the
2245                     router that receives the echo request  does  not  matter.
2246                     Also,  the  ip.ttl  of  the  echo  request  packet is not
2247                     checked, so it complies with RFC 1812,  section  4.2.2.9.
2248                     Flows for ICMPv4 echo requests use the following actions:
2249
2250                     ip4.dst <-> ip4.src;
2251                     ip.ttl = 255;
2252                     icmp4.type = 0;
2253                     flags.loopback = 1;
2254                     next;
2255
2256
2257                     Flows for ICMPv6 echo requests use the following actions:
2258
2259                     ip6.dst <-> ip6.src;
2260                     ip.ttl = 255;
2261                     icmp6.type = 129;
2262                     flags.loopback = 1;
2263                     next;
2264
2265
2266              •      Reply to ARP requests.
2267
                     These flows reply to ARP requests for the router’s own
                     IP address. The ARP requests are handled only if the
                     requestor’s IP belongs to one of the subnets of the
                     logical router port. For each router port P that owns IP
                     address A, which belongs to subnet S with prefix length
                     L, and Ethernet address E, a priority-90 flow matches
                     inport == P && arp.spa == S/L && arp.op == 1 && arp.tpa
                     == A (ARP request) with the following actions:
2276
2277                     eth.dst = eth.src;
2278                     eth.src = xreg0[0..47];
2279                     arp.op = 2; /* ARP reply. */
2280                     arp.tha = arp.sha;
2281                     arp.sha = xreg0[0..47];
2282                     arp.tpa = arp.spa;
2283                     arp.spa = A;
2284                     outport = inport;
2285                     flags.loopback = 1;
2286                     output;
2287
2288
2289                     For the gateway port  on  a  distributed  logical  router
2290                     (where  one of the logical router ports specifies a gate‐
2291                     way chassis), the above flows are only programmed on  the
2292                     gateway port instance on the gateway chassis. This behav‐
2293                     ior avoids generation of multiple ARP responses from dif‐
2294                     ferent chassis, and allows upstream MAC learning to point
2295                     to the gateway chassis.
2296
2297                     For the logical router port with the option reside-on-re‐
2298                     direct-chassis  set  (which  is  centralized),  the above
2299                     flows are only programmed on the gateway port instance on
2300                     the gateway chassis (if the logical router has a distrib‐
2301                     uted gateway port). This behavior  avoids  generation  of
2302                     multiple ARP responses from different chassis, and allows
2303                     upstream MAC learning to point to the gateway chassis.
2304
2305              •      Reply to IPv6 Neighbor Solicitations. These  flows  reply
2306                     to  Neighbor  Solicitation  requests for the router’s own
2307                     IPv6 address and populate the logical router’s mac  bind‐
2308                     ing table.
2309
2310                     For  each  router  port  P  that owns IPv6 address A, so‐
2311                     licited node address S, and Ethernet address E, a  prior‐
2312                     ity-90  flow  matches  inport == P && nd_ns && ip6.dst ==
                     {A, S} && nd.target == A with the following actions:
2314
2315                     nd_na_router {
2316                         eth.src = xreg0[0..47];
2317                         ip6.src = A;
2318                         nd.target = A;
2319                         nd.tll = xreg0[0..47];
2320                         outport = inport;
2321                         flags.loopback = 1;
2322                         output;
2323                     };
2324
2325
2326                     For the gateway port  on  a  distributed  logical  router
2327                     (where  one of the logical router ports specifies a gate‐
2328                     way chassis), the above flows replying to  IPv6  Neighbor
2329                     Solicitations are only programmed on the gateway port in‐
2330                     stance on the gateway chassis. This behavior avoids  gen‐
2331                     eration  of  multiple replies from different chassis, and
2332                     allows upstream MAC learning  to  point  to  the  gateway
2333                     chassis.
2334
2335              •      These flows reply to ARP requests or IPv6 neighbor solic‐
2336                     itation for the virtual IP addresses  configured  in  the
2337                     router for NAT (both DNAT and SNAT) or load balancing.
2338
2339                     IPv4:  For  a  configured NAT (both DNAT and SNAT) IP ad‐
2340                     dress or a load balancer IPv4 VIP A, for each router port
2341                     P  with  Ethernet  address  E, a priority-90 flow matches
2342                     arp.op == 1 && arp.tpa == A (ARP request) with  the  fol‐
2343                     lowing actions:
2344
2345                     eth.dst = eth.src;
2346                     eth.src = xreg0[0..47];
2347                     arp.op = 2; /* ARP reply. */
2348                     arp.tha = arp.sha;
2349                     arp.sha = xreg0[0..47];
2350                     arp.tpa <-> arp.spa;
2351                     outport = inport;
2352                     flags.loopback = 1;
2353                     output;
2354
2355
2356                     IPv4:  For a configured load balancer IPv4 VIP, a similar
2357                     flow is added with the additional match inport  ==  P  if
2358                     the  VIP is reachable from any logical router port of the
2359                     logical router.
2360
                     If the router port P is a distributed gateway router
                     port, then is_chassis_resident(P) is also added to the
                     match condition for the load balancer IPv4 VIP A.
2364
2365                     IPv6: For a configured NAT (both DNAT and  SNAT)  IP  ad‐
2366                     dress or a load balancer IPv6 VIP A (if the VIP is reach‐
2367                     able from any logical router port of the logical router),
2368                     solicited  node  address  S,  for each router port P with
2369                     Ethernet address E, a priority-90 flow matches inport  ==
2370                     P  &&  nd_ns  && ip6.dst == {A, S} && nd.target == A with
2371                     the following actions:
2372
2373                     eth.dst = eth.src;
2374                     nd_na {
2375                         eth.src = xreg0[0..47];
2376                         nd.tll = xreg0[0..47];
2377                         ip6.src = A;
2378                         nd.target = A;
2379                         outport = inport;
2380                         flags.loopback = 1;
2381                         output;
2382                     }
2383
2384
                     If the router port P is a distributed gateway router
                     port, then is_chassis_resident(P) is also added to the
                     match condition for the load balancer IPv6 VIP A.
2388
2389                     For the gateway port on a distributed logical router with
2390                     NAT  (where  one  of the logical router ports specifies a
2391                     gateway chassis):
2392
2393                     •      If the corresponding NAT rule cannot be handled in
2394                            a  distributed  manner, then a priority-92 flow is
2395                            programmed on the gateway  port  instance  on  the
2396                            gateway  chassis.  A priority-91 drop flow is pro‐
2397                            grammed on the other chassis when ARP  requests/NS
2398                            packets are received on the gateway port. This be‐
2399                            havior avoids generation of multiple ARP responses
2400                            from  different  chassis,  and allows upstream MAC
2401                            learning to point to the gateway chassis.
2402
2403                     •      If the corresponding NAT rule can be handled in  a
2404                            distributed  manner,  then  this flow is only pro‐
2405                            grammed on the gateway  port  instance  where  the
2406                            logical_port specified in the NAT rule resides.
2407
2408                            Some  of  the actions are different for this case,
2409                            using the external_mac specified in the  NAT  rule
2410                            rather than the gateway port’s Ethernet address E:
2411
2412                            eth.src = external_mac;
2413                            arp.sha = external_mac;
2414
2415
                            or in the case of IPv6 neighbor solicitation:
2417
2418                            eth.src = external_mac;
2419                            nd.tll = external_mac;
2420
2421
2422                            This  behavior  avoids  generation of multiple ARP
2423                            responses from different chassis, and  allows  up‐
2424                            stream  MAC learning to point to the correct chas‐
2425                            sis.
2426
              •      Priority-85 flows which drop ARP and IPv6 Neighbor
                     Discovery packets.
2429
2430              •      A priority-84 flow explicitly allows IPv6 multicast traf‐
2431                     fic that is supposed to reach the router pipeline  (i.e.,
2432                     router solicitation and router advertisement packets).
2433
2434              •      A  priority-83 flow explicitly drops IPv6 multicast traf‐
2435                     fic that is destined to reserved multicast groups.
2436
2437              •      A priority-82 flow allows IP  multicast  traffic  if  op‐
2438                     tions:mcast_relay=’true’, otherwise drops it.
2439
2440              •      UDP  port  unreachable.  Priority-80  flows generate ICMP
2441                     port unreachable messages in reply to UDP  datagrams  di‐
2442                     rected  to the router’s IP address, except in the special
2443                     case of gateways, which  accept  traffic  directed  to  a
2444                     router IP for load balancing and NAT purposes.
2445
2446                     These  flows  should  not match IP fragments with nonzero
2447                     offset.
2448
2449              •      TCP reset. Priority-80 flows generate TCP reset  messages
2450                     in reply to TCP datagrams directed to the router’s IP ad‐
2451                     dress, except in the special case of gateways, which  ac‐
2452                     cept  traffic  directed to a router IP for load balancing
2453                     and NAT purposes.
2454
2455                     These flows should not match IP  fragments  with  nonzero
2456                     offset.
2457
2458              •      Protocol or address unreachable. Priority-70 flows gener‐
2459                     ate ICMP protocol or  address  unreachable  messages  for
2460                     IPv4  and  IPv6 respectively in reply to packets directed
2461                     to the router’s IP address on  IP  protocols  other  than
2462                     UDP,  TCP,  and ICMP, except in the special case of gate‐
2463                     ways, which accept traffic directed to a  router  IP  for
2464                     load balancing purposes.
2465
2466                     These  flows  should  not match IP fragments with nonzero
2467                     offset.
2468
2469              •      Drop other IP traffic to this router.  These  flows  drop
2470                     any  other  traffic  destined  to  an  IP address of this
2471                     router that is not already handled by one  of  the  flows
2472                     above,  which  amounts to ICMP (other than echo requests)
2473                     and fragments with nonzero offsets. For each IP address A
2474                     owned  by  the router, a priority-60 flow matches ip4.dst
2475                     == A or ip6.dst == A and drops the traffic. An  exception
2476                     is  made  and  the  above flow is not added if the router
2477                     port’s own IP address is used  to  SNAT  packets  passing
2478                     through  that  router or if it is used as a load balancer
2479                     VIP.
2480
2481       The flows above handle all of the traffic that might be directed to the
2482       router  itself.  The following flows (with lower priorities) handle the
2483       remaining traffic, potentially for forwarding:
2484
2485              •      Drop Ethernet local broadcast. A  priority-50  flow  with
2486                     match  eth.bcast drops traffic destined to the local Eth‐
2487                     ernet  broadcast  address.  By  definition  this  traffic
2488                     should not be forwarded.
2489
2490              •      Avoid  ICMP  time  exceeded  for multicast. A priority-32
2491                     flow with match ip.ttl == {0,  1}  &&  !ip.later_frag  &&
2492                     (ip4.mcast  ||  ip6.mcast) and actions drop; drops multi‐
2493                     cast packets whose TTL has expired without  sending  ICMP
2494                     time exceeded.
2495
2496              •      ICMP  time exceeded. For each router port P, whose IP ad‐
2497                     dress is A, a priority-31 flow with match inport == P  &&
2498                     ip.ttl  == {0, 1} && !ip.later_frag matches packets whose
2499                     TTL has expired, with the following actions  to  send  an
2500                     ICMP time exceeded reply for IPv4 and IPv6 respectively:
2501
2502                     icmp4 {
2503                         icmp4.type = 11; /* Time exceeded. */
2504                         icmp4.code = 0;  /* TTL exceeded in transit. */
2505                         ip4.dst = ip4.src;
2506                         ip4.src = A;
2507                         ip.ttl = 254;
2508                         next;
2509                     };
2510                     icmp6 {
2511                         icmp6.type = 3; /* Time exceeded. */
2512                         icmp6.code = 0;  /* TTL exceeded in transit. */
2513                         ip6.dst = ip6.src;
2514                         ip6.src = A;
2515                         ip.ttl = 254;
2516                         next;
2517                     };
2518
2519
              •      TTL discard. A priority-30 flow with match ip.ttl == {0,
                     1} and actions drop; drops other packets whose TTL has
                     expired and that should not receive an ICMP error reply
                     (i.e. fragments with nonzero offset).

              •      Next table. A priority-0 flow matches all packets that
                     aren’t already handled and uses actions next; to feed
                     them to the next table.
2528
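       As referenced under the BFD flows earlier in this table, BFD sessions
       are rows in the northbound BFD table; a sketch with hypothetical port
       and peer values:

              # Create a BFD session on logical router port lrp-ext towards
              # peer 172.16.0.50; ovn-northd then adds the priority-110 BFD
              # flows for this port.
              ovn-nbctl create BFD logical_port=lrp-ext dst_ip=172.16.0.50
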
2529     Ingress Table 4: UNSNAT
2530
       This is for already established connections’ reverse traffic, i.e.,
       SNAT has already been done in the egress pipeline and now the packet
       has entered the ingress pipeline as part of a reply. It is unSNATted
       here.
2534
2535       Ingress Table 4: UNSNAT on Gateway and Distributed Routers
2536
              •      If the Router (Gateway or Distributed) is configured
                     with load balancers, then the below logical flows are
                     added:

                     For each IPv4 address A defined as a load balancer VIP
                     with protocol P (and protocol port T if defined) that is
                     also present as an external_ip in the NAT table, a
                     priority-120 logical flow is added with the match ip4 &&
                     ip4.dst == A && P with the action next; to advance the
                     packet to the next table. If the load balancer has
                     protocol port T defined, then the match also has P.dst
                     == T.
2547
2548                     The above flows are also added for IPv6 load balancers.
2549
2550       Ingress Table 4: UNSNAT on Gateway Routers
2551
              •      If the Gateway router has been configured to force SNAT
                     any previously DNATted packets to B, a priority-110 flow
                     matches ip && ip4.dst == B or ip && ip6.dst == B with an
                     action ct_snat;.

                     If the Gateway router is configured with
                     lb_force_snat_ip=router_ip, then for every logical
                     router port P attached to the Gateway router with the
                     router IP B, a priority-110 flow is added with the match
                     inport == P && ip4.dst == B or inport == P && ip6.dst ==
                     B with an action ct_snat;.

                     If the Gateway router has been configured to force SNAT
                     any previously load-balanced packets to B, a
                     priority-100 flow matches ip && ip4.dst == B or ip &&
                     ip6.dst == B with an action ct_snat;.

                     For each NAT configuration in the OVN Northbound
                     database that asks to change the source IP address of a
                     packet from A to B, a priority-90 flow matches ip &&
                     ip4.dst == B or ip && ip6.dst == B with an action
                     ct_snat;. If the NAT rule is of type dnat_and_snat and
                     has stateless=true in the options, then the action would
                     be next;.

                     A priority-0 logical flow with match 1 has actions next;.
2578
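       The force-SNAT behaviors described above are driven by Logical_Router
       options; a sketch with hypothetical values:

              # Force SNAT of previously DNATted packets to 172.16.0.10.
              ovn-nbctl set Logical_Router lr0 \
                  options:dnat_force_snat_ip=172.16.0.10
              # Force SNAT of load-balanced packets to the IP of the logical
              # router port the packet egresses through.
              ovn-nbctl set Logical_Router lr0 \
                  options:lb_force_snat_ip=router_ip
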
2579       Ingress Table 4: UNSNAT on Distributed Routers
2580
              •      For each configuration in the OVN Northbound database
                     that asks to change the source IP address of a packet
                     from A to B, two priority-100 flows are added.

                     If the NAT rule cannot be handled in a distributed
                     manner, then the below priority-100 flows are only
                     programmed on the gateway chassis.

                     •      The first flow matches ip && ip4.dst == B &&
                            inport == GW or ip && ip6.dst == B && inport ==
                            GW, where GW is the distributed gateway port
                            corresponding to the NAT rule (specified or
                            inferred), with an action ct_snat; to unSNAT in
                            the common zone. If the NAT rule is of type
                            dnat_and_snat and has stateless=true in the
                            options, then the action would be next;.

                            If the NAT entry is of type snat, then there is
                            an additional match is_chassis_resident(cr-GW),
                            where cr-GW is the chassis resident port of GW.
2602
2603                     A priority-0 logical flow with match 1 has actions next;.
2604
2605     Ingress Table 5: DEFRAG
2606
       This is to send packets to the connection tracker for tracking and
       defragmentation. It contains a priority-0 flow that simply moves
       traffic to the next table.

       For all load balancing rules that are configured in the OVN_Northbound
       database for a Gateway router, a priority-100 flow is added for each
       configured virtual IP address VIP. For IPv4 VIPs the flow matches ip
       && ip4.dst == VIP. For IPv6 VIPs, the flow matches ip && ip6.dst ==
       VIP. The flow applies the action ct_dnat; to send IP packets to the
       connection tracker for packet de-fragmentation and to dnat the
       destination IP for the committed connection before sending it to the
       next table.
2618
       If ECMP routes with symmetric reply are configured in the
       OVN_Northbound database for a gateway router, a priority-100 flow is
       added for each router port on which symmetric replies are configured.
       The matching logic for these ports essentially reverses the configured
       logic of the ECMP route. So, for instance, a route with a destination
       routing policy will instead match if the source IP address matches the
       static route’s prefix. The flow uses the actions chk_ecmp_nh_mac();
       ct_next or chk_ecmp_nh(); ct_next to send IP packets to table 76 or to
       table 77 in order to check whether source info is already stored by
       OVN, and then to the connection tracker for packet de-fragmentation
       and tracking before sending them to the next table.

       If load balancing rules are configured in the OVN_Northbound database
       for a Gateway router, a priority-50 flow that matches icmp || icmp6
       with an action of ct_dnat; is added; this allows potentially related
       ICMP traffic to pass through CT.
2635
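       A load balancer whose VIP produces the priority-100 ct_dnat; flow
       above could be configured as follows (names and addresses are
       hypothetical):

              # Create a TCP load balancer with VIP 172.16.0.100:80 and two
              # backends, then attach it to gateway router lr0.
              ovn-nbctl lb-add lb0 172.16.0.100:80 10.0.0.2:80,10.0.0.3:80 tcp
              ovn-nbctl lr-lb-add lr0 lb0
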
2636     Ingress Table 6: Load balancing affinity check
2637
2638       Load  balancing  affinity  check  table  contains the following logical
2639       flows:
2640
              •      For all the configured load balancing rules for a
                     logical router where a positive affinity timeout is
                     specified in the options column, that include a L4 port
                     PORT of protocol P and IPv4 or IPv6 address VIP, a
                     priority-100 flow that matches on ct.new && ip && ip.dst
                     == VIP && P && P.dst == PORT (xxreg0 == VIP in the IPv6
                     case) with an action of reg0 = ip.dst; reg9[16..31] =
                     P.dst; reg9[6] = chk_lb_aff(); next; (xxreg0 = ip6.dst
                     in the IPv6 case).
2650
2651              •      A  priority  0 flow is added which matches on all packets
2652                     and applies the action next;.
2653
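       The affinity timeout checked by chk_lb_aff() comes from the load
       balancer’s options column; a sketch with a hypothetical load balancer
       and timeout:

              # Keep a client pinned to the same backend for 60 seconds.
              ovn-nbctl set Load_Balancer lb0 options:affinity_timeout=60
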
2654     Ingress Table 7: DNAT
2655
       Packets enter the pipeline with a destination IP address that needs to
       be DNATted from a virtual IP address to a real IP address. Packets in
       the reverse direction need to be unDNATted.
2659
2660       Ingress Table 7: Load balancing DNAT rules
2661
       The following load balancing DNAT flows are added for a Gateway router
       or a Router with a gateway port. These flows are programmed only on
       the gateway chassis. These flows do not get programmed for load
       balancers with IPv6 VIPs.
2666
              •      For all the configured load balancing rules for a
                     logical router where a positive affinity timeout is
                     specified in the options column, that include a L4 port
                     PORT of protocol P and IPv4 or IPv6 address VIP, a
                     priority-150 flow that matches on reg9[6] == 1 && ct.new
                     && ip && ip.dst == VIP && P && P.dst == PORT with an
                     action of ct_lb_mark(args), where args contains comma
                     separated IP addresses (and optional port numbers) to
                     load balance to. The address family of the IP addresses
                     of args is the same as the address family of VIP.
2677
              •      If controller_event has been enabled for all the
                     configured load balancing rules for a Gateway router or
                     Router with a gateway port in the OVN_Northbound
                     database that do not have configured backends, a
                     priority-130 flow is added to trigger ovn-controller
                     events whenever the chassis receives a packet for that
                     particular VIP. If the event-elb meter has been
                     previously created, it will be associated with the
                     empty_lb logical flow.
2686
              •      For all the configured load balancing rules for a
                     Gateway router or Router with a gateway port in the
                     OVN_Northbound database that include a L4 port PORT of
                     protocol P and IPv4 or IPv6 address VIP, a priority-120
                     flow that matches on ct.new && !ct.rel && ip && ip.dst
                     == VIP && P && P.dst == PORT with an action of
                     ct_lb_mark(args), where args contains comma separated
                     IPv4 or IPv6 addresses (and optional port numbers) to
                     load balance to. If the router is configured to force
                     SNAT any load-balanced packets, the above action will be
                     replaced by flags.force_snat_for_lb = 1;
                     ct_lb_mark(args; force_snat);. If the load balancing
                     rule is configured with skip_snat set to true, the above
                     action will be replaced by flags.skip_snat_for_lb = 1;
                     ct_lb_mark(args; skip_snat);. If health check is
                     enabled, then args will only contain those endpoints
                     whose service monitor status entry in the OVN_Southbound
                     db is either online or empty.
2705
2706              •      For  all the configured load balancing rules for a router
2707                     in OVN_Northbound database that includes just an  IP  ad‐
2708                     dress  VIP  to match on, a priority-110 flow that matches
2709                     on ct.new && !ct.rel && ip4 && ip.dst == VIP with an  ac‐
2710                     tion of ct_lb_mark(args), where args contains comma sepa‐
2711                     rated IPv4 or IPv6 addresses. If the router is configured
2712                     to force SNAT any load-balanced packets, the above action
2713                     will  be  replaced  by   flags.force_snat_for_lb   =   1;
2714                     ct_lb_mark(args; force_snat);. If the load balancing rule
2715                     is configured with skip_snat set to true, the  above  ac‐
2716                     tion  will  be  replaced  by  flags.skip_snat_for_lb = 1;
2717                     ct_lb_mark(args; skip_snat);.
2718
2719                     The previous table lr_in_defrag sets  the  register  reg0
2720                     (or  xxreg0  for IPv6) and does ct_dnat. Hence for estab‐
2721                     lished traffic, this table just advances  the  packet  to
2722                     the next stage.
2723
              •      If the load balancer is created with the --reject option
                     and it has no active backends, a TCP reset segment (for
                     tcp) or an ICMP port unreachable packet (for all other
                     kinds of traffic) will be sent whenever an incoming
                     packet is received for this load-balancer. Note that
                     using the --reject option disables the empty_lb SB
                     controller event for this load balancer. (See the sketch
                     at the end of this subsection.)
2731
              •      For related traffic, a priority-50 flow that matches
                     ct.rel && !ct.est && !ct.new with an action of
                     ct_commit_nat; is added if the router has a load
                     balancer assigned to it, along with two priority-70
                     flows that match the skip_snat and force_snat flags and
                     set flags.force_snat_for_lb = 1 or
                     flags.skip_snat_for_lb = 1 accordingly.
2738
              •      For established traffic, a priority-50 flow that matches
                     ct.est && !ct.rel && !ct.new && ct_mark.natted with an
                     action of next; is added if the router has a load
                     balancer assigned to it, along with two priority-70
                     flows that match the skip_snat and force_snat flags and
                     set flags.force_snat_for_lb = 1 or
                     flags.skip_snat_for_lb = 1 accordingly.
2746
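       As noted in the --reject bullet above, the reject behavior is chosen
       when the load balancer is created; a sketch with hypothetical names:

              # If no backends are active (e.g. all marked offline by health
              # checks), incoming connections get a TCP RST or ICMP port
              # unreachable instead of being silently dropped.
              ovn-nbctl --reject lb-add lb0 172.16.0.100:80 10.0.0.2:80 tcp
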
2747       Ingress Table 7: DNAT on Gateway Routers
2748
              •      For each configuration in the OVN Northbound database
                     that asks to change the destination IP address of a
                     packet from A to B, a priority-100 flow matches ip &&
                     ip4.dst == A or ip && ip6.dst == A with an action
                     flags.loopback = 1; ct_dnat(B);. If the Gateway router
                     is configured to force SNAT any DNATed packet, the above
                     action will be replaced by flags.force_snat_for_dnat =
                     1; flags.loopback = 1; ct_dnat(B);. If the NAT rule is
                     of type dnat_and_snat and has stateless=true in the
                     options, then the action would be ip4/6.dst = B.

                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.src == allowed_ext_ips.
                     Similarly, for IPv6, the match would be ip6.src ==
                     allowed_ext_ips.

                     If the NAT rule has exempted_ext_ips set, then there is
                     an additional flow configured at priority 101. The flow
                     matches if the source IP is an exempted_ext_ip and the
                     action is next;. This flow is used to bypass the ct_dnat
                     action for a packet originating from exempted_ext_ips.
                     (See the sketch at the end of this subsection.)
2770
2771              •      A priority-0 logical flow with match 1 has actions next;.
2772
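       The NAT variants above map onto ovn-nbctl roughly as sketched below;
       the router, addresses, and UUID placeholders are hypothetical, and
       allowed_ext_ips/exempted_ext_ips reference Address_Set rows:

              # Stateless dnat_and_snat: flows rewrite ip4.dst/ip4.src
              # directly instead of using ct_dnat/ct_snat.
              ovn-nbctl --stateless lr-nat-add lr0 dnat_and_snat \
                  172.16.0.10 10.0.0.5
              # Restrict the rule to external sources in an address set.
              ovn-nbctl set NAT <nat-uuid> allowed_ext_ips=<address-set-uuid>
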
2773       Ingress Table 7: DNAT on Distributed Routers
2774
2775       On distributed routers, the DNAT table only handles packets with desti‐
2776       nation IP address that needs to be DNATted from a virtual IP address to
2777       a real IP address. The unDNAT processing in the  reverse  direction  is
2778       handled in a separate table in the egress pipeline.
2779
              •      For each configuration in the OVN Northbound database
                     that asks to change the destination IP address of a
                     packet from A to B, a priority-100 flow matches ip &&
                     ip4.dst == B && inport == GW, where GW is the logical
                     router gateway port corresponding to the NAT rule
                     (specified or inferred), with an action ct_dnat(B);. The
                     match will include ip6.dst == B in the IPv6 case. If the
                     NAT rule is of type dnat_and_snat and has stateless=true
                     in the options, then the action would be ip4/6.dst = B.

                     If the NAT rule cannot be handled in a distributed
                     manner, then the priority-100 flow above is only
                     programmed on the gateway chassis.

                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.src == allowed_ext_ips.
                     Similarly, for IPv6, the match would be ip6.src ==
                     allowed_ext_ips.

                     If the NAT rule has exempted_ext_ips set, then there is
                     an additional flow configured at priority 101. The flow
                     matches if the source IP is an exempted_ext_ip and the
                     action is next;. This flow is used to bypass the ct_dnat
                     action for a packet originating from exempted_ext_ips.
2804
2805                     A priority-0 logical flow with match 1 has actions next;.
2806
2807     Ingress Table 8: Load balancing affinity learn
2808
2809       Load balancing affinity learn  table  contains  the  following  logical
2810       flows:
2811
              •      For all the configured load balancing rules for a
                     logical router where a positive affinity timeout T is
                     specified in the options column, that include a L4 port
                     PORT of protocol P and IPv4 or IPv6 address VIP, a
                     priority-100 flow that matches on reg9[6] == 0 && ct.new
                     && ip && reg0 == VIP && P && reg9[16..31] == PORT
                     (xxreg0 == VIP in the IPv6 case) with an action of
                     commit_lb_aff(vip = VIP:PORT, backend = backend
                     ip:backend port, proto = P, timeout = T);.
2822
2823              •      A  priority  0 flow is added which matches on all packets
2824                     and applies the action next;.
2825
2826     Ingress Table 9: ECMP symmetric reply processing
2827
              •      If ECMP routes with symmetric reply are configured in
                     the OVN_Northbound database for a gateway router, a
                     priority-100 flow is added for each router port on which
                     symmetric replies are configured. The matching logic for
                     these ports essentially reverses the configured logic of
                     the ECMP route. So, for instance, a route with a
                     destination routing policy will instead match if the
                     source IP address matches the static route’s prefix. The
                     flow uses the action ct_commit {
                     ct_label.ecmp_reply_eth = eth.src;
                     ct_mark.ecmp_reply_port = K; }; commit_ecmp_nh(); next;
                     to commit the connection, storing eth.src and the ECMP
                     reply port binding tunnel key K in the ct_label and
                     ct_mark, and the traffic pattern in table 76 or 77.
2842
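       An ECMP route with symmetric replies, which produces both the DEFRAG
       flows described earlier and the priority-100 commit flows above, could
       be configured as follows (router and addresses hypothetical):

              # Two equal-cost routes to 10.0.0.0/24; reply traffic follows
              # the path recorded in conntrack instead of being re-hashed.
              ovn-nbctl --ecmp-symmetric-reply lr-route-add \
                  lr0 10.0.0.0/24 192.168.1.1
              ovn-nbctl --ecmp-symmetric-reply lr-route-add \
                  lr0 10.0.0.0/24 192.168.1.2
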
2843     Ingress Table 10: IPv6 ND RA option processing
2844
              •      A priority-50 logical flow is added for each logical
                     router port configured with IPv6 ND RA options, which
                     matches IPv6 ND Router Solicitation packets, applies the
                     action put_nd_ra_opts, and advances the packet to the
                     next table.
2850
                     reg0[5] = put_nd_ra_opts(options); next;
2852
2853
2854                     For a valid IPv6 ND RS packet, this transforms the packet
2855                     into  an  IPv6 ND RA reply and sets the RA options to the
2856                     packet and stores 1 into  reg0[5].  For  other  kinds  of
2857                     packets,  it  just  stores 0 into reg0[5]. Either way, it
2858                     continues to the next table.
2859
2860              •      A priority-0 logical flow with match 1 has actions next;.
2861
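       The RA options consumed by put_nd_ra_opts come from the port’s
       ipv6_ra_configs column; a sketch with hypothetical values:

              # Advertise SLAAC addressing with a 30-60 second RA interval.
              ovn-nbctl set Logical_Router_Port lrp0 \
                  ipv6_ra_configs:address_mode=slaac \
                  ipv6_ra_configs:min_interval=30 \
                  ipv6_ra_configs:max_interval=60
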
2862     Ingress Table 11: IPv6 ND RA responder
2863
2864       This table implements IPv6 ND RA responder for the IPv6 ND  RA  replies
2865       generated by the previous table.
2866
2867              •      A  priority-50  logical  flow  is  added for each logical
2868                     router port configured with  IPv6  ND  RA  options  which
2869                     matches  IPv6 ND RA packets and reg0[5] == 1 and responds
2870                     back to the  inport  after  applying  these  actions.  If
2871                     reg0[5]   is   set   to  1,  it  means  that  the  action
2872                     put_nd_ra_opts was successful.
2873
2874                     eth.dst = eth.src;
2875                     eth.src = E;
2876                     ip6.dst = ip6.src;
2877                     ip6.src = I;
2878                     outport = P;
2879                     flags.loopback = 1;
2880                     output;
2881
2882
2883                     where E is the MAC address and I is the IPv6  link  local
2884                     address of the logical router port.
2885
2886                     (This  terminates  packet processing in ingress pipeline;
2887                     the packet does not go to the next ingress table.)
2888
2889              •      A priority-0 logical flow with match 1 has actions next;.
2890
2891     Ingress Table 12: IP Routing Pre
2892
       If a packet arrives at this table from a Logical Router Port P that
       has an options:route_table value set, a priority-100 logical flow with
       match inport == "P" sets a uniquely generated, non-zero, per-datapath
       32-bit value in OVS register 7. This register’s value is checked in
       the next table. If the packet did not match any configured inport (the
       <main> route table), the register 7 value is set to 0.
2899
2900       This table contains the following logical flows:
2901
              •      A priority-100 flow with match inport == "LRP_NAME" and
                     an action that sets the route table identifier in reg7.

                     A priority-0 logical flow with match 1 has actions reg7 =
                     0; next;.
2907
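       A sketch of the route_table mechanism with hypothetical names: the
       port option selects the routing table whose identifier is loaded into
       reg7, and routes are placed into that table with --route-table:

              # Packets entering via lrp0 use routing table rtb-1.
              ovn-nbctl set Logical_Router_Port lrp0 options:route_table=rtb-1
              # Install a route visible only from routing table rtb-1.
              ovn-nbctl --route-table=rtb-1 lr-route-add lr0 \
                  192.168.0.0/24 10.0.0.1
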
2908     Ingress Table 13: IP Routing
2909
2910       A  packet  that  arrives  at  this table is an IP packet that should be
2911       routed to the address in ip4.dst or ip6.dst. This table  implements  IP
2912       routing,  setting  reg0 (or xxreg0 for IPv6) to the next-hop IP address
2913       (leaving ip4.dst or ip6.dst, the packet’s final destination, unchanged)
2914       and  advances  to  the next table for ARP resolution. It also sets reg1
2915       (or xxreg1) to the  IP  address  owned  by  the  selected  router  port
2916       (ingress  table  ARP  Request  will generate an ARP request, if needed,
2917       with reg0 as the target protocol address and reg1 as the source  proto‐
2918       col address).
2919
       For ECMP routes, i.e. multiple static routes with the same policy and
       prefix but different nexthops, the above actions are deferred to the
       next table. This table, instead, is responsible for determining the
       ECMP group id and selecting a member id within the group based on
       5-tuple hashing. It stores the group id in reg8[0..15] and the member
       id in reg8[16..31]. This step is skipped with a priority-10300 rule if
       the traffic going out the ECMP route is reply traffic and the ECMP
       route was configured to use symmetric replies. Instead, the values
       stored in conntrack are used to choose the destination. The
       ct_label.ecmp_reply_eth tells the destination MAC address to which the
       packet should be sent. The ct_mark.ecmp_reply_port tells the logical
       router port on which the packet should be sent. These values are saved
       to the conntrack fields when the initial ingress traffic is received
       over the ECMP route and committed to conntrack. If
       REGBIT_KNOWN_ECMP_NH is set, the priority-10300 flows in this stage
       set the outport, while the eth.dst is set by flows at the ARP/ND
       Resolution stage.
2936
2937       This table contains the following logical flows:
2938
2939              •      Priority-10550 flow  that  drops  IPv6  Router  Solicita‐
2940                     tion/Advertisement  packets  that  were  not processed in
2941                     previous tables.
2942
2943              •      Priority-10550 flows that drop IGMP and MLD packets  with
2944                     source MAC address owned by the router. These are used to
2945                     prevent looping statically forwarded IGMP and MLD packets
2946                     for which TTL is not decremented (it is always 1).
2947
              •      Priority-10500 flows that match IP multicast traffic
                     destined to groups registered on any of the attached
                     switches and set outport to the associated multicast
                     group that will eventually flood the traffic to all
                     interested attached logical switches. The flows also
                     decrement TTL.
2954
              •      Priority-10460 flows that match IGMP and MLD control
                     packets, set outport to the MC_STATIC multicast group,
                     which ovn-northd populates with the logical ports that
                     have options:mcast_flood=’true’. If no router ports are
                     configured to flood multicast traffic the packets are
                     dropped.
2961
              •      Priority-10450 flow that matches unregistered IP
                     multicast traffic, decrements TTL, and sets outport to
                     the MC_STATIC multicast group, which ovn-northd
                     populates with the logical ports that have
                     options:mcast_flood=’true’. If no router ports are
                     configured to flood multicast traffic the packets are
                     dropped.
2968
2969              •      IPv4 routing table. For each route to IPv4 network N with
2970                     netmask  M, on router port P with IP address A and Ether‐
2971                     net address E, a logical flow with match ip4.dst ==  N/M,
2972                     whose priority is the number of 1-bits in M, has the fol‐
2973                     lowing actions:
2974
2975                     ip.ttl--;
2976                     reg8[0..15] = 0;
2977                     reg0 = G;
2978                     reg1 = A;
2979                     eth.src = E;
2980                     outport = P;
2981                     flags.loopback = 1;
2982                     next;
2983
2984
2985                     (Ingress table 1 already verified that ip.ttl--; will not
2986                     yield a TTL exceeded error.)
2987
2988                     If  the route has a gateway, G is the gateway IP address.
2989                     Instead, if the route is from a configured static  route,
                     G is the next hop IP address. Otherwise it is ip4.dst.
2991
2992              •      IPv6 routing table. For each route to IPv6 network N with
2993                     netmask M, on router port P with IP address A and  Ether‐
2994                     net address E, a logical flow with match in CIDR notation
2995                     ip6.dst == N/M, whose priority is the integer value of M,
2996                     has the following actions:
2997
2998                     ip.ttl--;
2999                     reg8[0..15] = 0;
3000                     xxreg0 = G;
3001                     xxreg1 = A;
3002                     eth.src = E;
                     outport = P;
3004                     flags.loopback = 1;
3005                     next;
3006
3007
3008                     (Ingress table 1 already verified that ip.ttl--; will not
3009                     yield a TTL exceeded error.)
3010
3011                     If the route has a gateway, G is the gateway IP  address.
3012                     Instead,  if the route is from a configured static route,
                     G is the next hop IP address. Otherwise it is ip6.dst.
3014
3015                     If the address A is in the link-local  scope,  the  route
3016                     will be limited to sending on the ingress port.
3017
                     For each static route, reg7 == id && is prefixed to the
                     logical flow match portion. For routes with a
                     route_table value set, a unique non-zero id is used. For
                     routes within the <main> route table (no route table
                     set), this id value is 0.

                     For each connected route (a route to the LRP’s subnet
                     CIDR), the logical flow match portion has no reg7 == id
                     && prefix, so that routes to the LRP’s subnets are
                     reachable from all routing tables.
3027
              •      ECMP routes are grouped by policy and prefix. A
                     unique non-zero id is assigned to each group, and
                     each member is also assigned a unique non-zero id
                     within its group.
3032
3033                     For each IPv4/IPv6 ECMP group with group id GID and  mem‐
3034                     ber  ids  MID1,  MID2,  ..., a logical flow with match in
3035                     CIDR notation ip4.dst == N/M, or ip6.dst  ==  N/M,  whose
3036                     priority is the integer value of M, has the following ac‐
3037                     tions:
3038
3039                     ip.ttl--;
3040                     flags.loopback = 1;
3041                     reg8[0..15] = GID;
3042                     select(reg8[16..31], MID1, MID2, ...);
3043
3044
3045              •      A priority-0 logical flow that matches  all  packets  not
3046                     already handled (match 1) and drops them (action drop;).
3047
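       As an illustrative sketch (lr0, rtb1 and the addresses are
       placeholders; the --route-table and --ecmp options are described
       in ovn-nbctl(8)), routes that exercise the flows above could be
       configured and inspected as follows:

              ovn-nbctl lr-route-add lr0 192.168.10.0/24 10.0.0.1
              ovn-nbctl --route-table=rtb1 lr-route-add lr0 \
                      192.168.20.0/24 10.0.0.2
              ovn-nbctl --ecmp lr-route-add lr0 203.0.113.0/24 10.0.0.3
              ovn-nbctl --ecmp lr-route-add lr0 203.0.113.0/24 10.0.0.4
              ovn-sbctl lflow-list lr0 | grep lr_in_ip_routing

       The first route lands in the <main> route table (id 0), the
       second in route table rtb1 (a unique non-zero id), and the last
       two form an ECMP group for the same prefix.
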
3048     Ingress Table 14: IP_ROUTING_ECMP
3049
       This table implements the second part of IP routing for ECMP
       routes following the previous table. If a packet matched an ECMP
       group in the previous table, this table matches the group id and
       member id stored by the previous table, sets reg0 (or xxreg0 for
       IPv6) to the next-hop IP address (leaving ip4.dst or ip6.dst, the
       packet’s final destination, unchanged) and advances to the next
       table for ARP resolution. It also sets reg1 (or xxreg1) to the IP
       address owned by the selected router port (ingress table ARP
       Request will generate an ARP request, if needed, with reg0 as the
       target protocol address and reg1 as the source protocol address).
3060
3061       This processing is skipped for reply traffic being sent out of an  ECMP
3062       route if the route was configured to use symmetric replies.
3063
3064       This table contains the following logical flows:
3065
              •      A priority-150 flow that matches reg8[0..15] == 0
                     with action next;, so that packets of non-ECMP
                     routes bypass this table.
3069
              •      For each member with ID MID in each ECMP group with
                     ID GID, a priority-100 flow with match reg8[0..15]
                     == GID && reg8[16..31] == MID has the following
                     actions:
3073
3074                     [xx]reg0 = G;
3075                     [xx]reg1 = A;
3076                     eth.src = E;
3077                     outport = P;
3078
3079
3080              •      A  priority-0  logical  flow that matches all packets not
3081                     already handled (match 1) and drops them (action drop;).
3082
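       For example (a sketch with placeholder names), symmetric replies
       can be requested when creating ECMP routes, which sets
       options:ecmp_symmetric_reply on the resulting static routes:

              ovn-nbctl --ecmp-symmetric-reply lr-route-add lr0 \
                      0.0.0.0/0 10.0.0.3
              ovn-nbctl --ecmp-symmetric-reply lr-route-add lr0 \
                      0.0.0.0/0 10.0.0.4
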
3083     Ingress Table 15: Router policies
3084
3085       This table adds flows for the logical router policies configured on the
3086       logical   router.   Please   see   the  OVN_Northbound  database  Logi‐
3087       cal_Router_Policy table documentation in ovn-nb for supported actions.
3088
3089              •      For each router policy configured on the logical  router,
3090                     a  logical  flow  is added with specified priority, match
3091                     and actions.
3092
3093              •      If the policy action is reroute with 2 or  more  nexthops
3094                     defined,  then the logical flow is added with the follow‐
3095                     ing actions:
3096
3097                     reg8[0..15] = GID;
3098                     reg8[16..31] = select(1,..n);
3099
3100
                     where GID is the ECMP group id generated by
                     ovn-northd for this policy and n is the number of
                     nexthops. The select action selects one of the
                     nexthop member ids, stores it in the register
                     reg8[16..31] and advances the packet to the next
                     stage. (A configuration sketch follows this list.)
3106
              •      If the policy action is reroute with just one
                     nexthop, then the logical flow is added with the
                     following actions:
3110
3111                     [xx]reg0 = H;
3112                     eth.src = E;
3113                     outport = P;
3114                     reg8[0..15] = 0;
3115                     flags.loopback = 1;
3116                     next;
3117
3118
                     where H is the nexthop defined in the router policy,
                     E is the Ethernet address of the logical router port
                     from which the nexthop is reachable, and P is that
                     logical router port.
3123
              •      If a router policy has the option pkt_mark=m set and
                     the action is not drop, then the action also
                     includes pkt.mark = m to mark the packet with the
                     mark m.
3127
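       As a sketch (placeholder names and addresses), a reroute policy
       with multiple nexthops, which produces the select flow above and
       the per-member flows in the next table, could be added with:

              ovn-nbctl lr-policy-add lr0 1000 \
                      "ip4.src == 10.0.0.0/24" reroute \
                      172.16.0.10,172.16.0.11

       The pkt_mark option mentioned above is a key in the options
       column of the resulting Logical_Router_Policy row (see
       ovn-nb(5)).
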
3128     Ingress Table 16: ECMP handling for router policies
3129
       This table handles ECMP for the router policies configured with
       multiple nexthops.
3132
3133              •      A priority-150 flow is added to advance the packet to the
3134                     next  stage  if the ECMP group id register reg8[0..15] is
3135                     0.
3136
3137              •      For each ECMP reroute router policy  with  multiple  nex‐
3138                     thops,  a  priority-100  flow is added for each nexthop H
3139                     with the match reg8[0..15] == GID &&  reg8[16..31]  ==  M
3140                     where  GID  is  the  router  policy group id generated by
3141                     ovn-northd and M is the member id of the nexthop H gener‐
3142                     ated  by  ovn-northd.  The following actions are added to
3143                     the flow:
3144
                     [xx]reg0 = H;
                     eth.src = E;
                     outport = P;
                     flags.loopback = 1;
                     next;
3150
3151
                     where H is the nexthop defined in the router policy,
                     E is the Ethernet address of the logical router port
                     from which the nexthop is reachable, and P is that
                     logical router port.
3156
3157              •      A  priority-0  logical  flow that matches all packets not
3158                     already handled (match 1) and drops them (action drop;).
3159
3160     Ingress Table 17: ARP/ND Resolution
3161
3162       Any packet that reaches this table is an IP packet whose next-hop  IPv4
3163       address  is  in  reg0 or IPv6 address is in xxreg0. (ip4.dst or ip6.dst
3164       contains the final destination.) This table resolves the IP address  in
3165       reg0 (or xxreg0) into an output port in outport and an Ethernet address
3166       in eth.dst, using the following flows:
3167
3168              •      A priority-500 flow that  matches  IP  multicast  traffic
3169                     that  was  allowed in the routing pipeline. For this kind
3170                     of traffic the outport was already set so the  flow  just
3171                     advances to the next table.
3172
3173              •      Priority-200  flows that match ECMP reply traffic for the
3174                     routes configured to use symmetric replies, with  actions
3175                     push(xxreg1);    xxreg1    =    ct_label;    eth.dst    =
3176                     xxreg1[32..79]; pop(xxreg1); next;. xxreg1 is  used  here
3177                     to  avoid masked access to ct_label, to make the flow HW-
3178                     offloading friendly.
3179
3180              •      Static MAC bindings. MAC bindings can be known statically
3181                     based  on data in the OVN_Northbound database. For router
3182                     ports connected to logical switches, MAC bindings can  be
3183                     known  statically  from the addresses column in the Logi‐
3184                     cal_Switch_Port table. (Note: the flow is  not  installed
3185                     for  IPs of logical switch ports of type virtual, and dy‐
3186                     namic MAC binding is used for those IPs instead, so  that
3187                     virtual parent failover does not depend on ovn-northd, to
3188                     achieve better failover performance.)  For  router  ports
                     connected to other logical routers, MAC bindings can
                     be known statically from the mac and networks
                     columns in the Logical_Router_Port table. (Note: the
                     flow is NOT installed for the IP addresses that
                     belong to a neighbor logical router port if the
                     current router has options:dynamic_neigh_routers set
                     to true.) A configuration sketch follows this list.
3195
                     For each IPv4 address A whose host is known to have
                     Ethernet address E on router port P, a priority-100
                     flow with match outport == P && reg0 == A has
                     actions eth.dst = E; next;.

                     For each IPv6 address A whose host is known to have
                     Ethernet address E on router port P, a priority-100
                     flow with match outport == P && xxreg0 == A has
                     actions eth.dst = E; next;.

                     For each logical router port with an IPv4 address A
                     and a MAC address of E that is reachable via a
                     different logical router port P, a priority-100 flow
                     with match outport == P && reg0 == A has actions
                     eth.dst = E; next;.

                     For each logical router port with an IPv6 address A
                     and a MAC address of E that is reachable via a
                     different logical router port P, a priority-100 flow
                     with match outport == P && xxreg0 == A has actions
                     eth.dst = E; next;.
3215
              •      Static MAC bindings from NAT entries. MAC bindings
                     can also be known for the entries in the NAT table.
                     The following flows are programmed for distributed
                     logical routers, i.e., routers with a distributed
                     router port.

                     For each row in the NAT table with IPv4 address A in
                     the external_ip column of the NAT table, the
                     following two flows are programmed:

                     A priority-100 flow with the match outport == P &&
                     reg0 == A has actions eth.dst = E; next;, where P is
                     the distributed logical router port, and E is the
                     Ethernet address set in the external_mac column of
                     the NAT table for rules of type dnat_and_snat, or
                     otherwise the Ethernet address of the distributed
                     logical router port. Note that if the external_ip is
                     not within a subnet on the owning logical router,
                     then OVN will only create ARP resolution flows if
                     options:add_route is set to true. Otherwise, no ARP
                     resolution flows will be added.

                     Corresponding to the above flow, a priority-150 flow
                     with the match inport == P && outport == P &&
                     ip4.dst == A has actions drop; to exclude packets
                     that have gone through the DNAT/unSNAT stage but
                     failed to convert the destination, to avoid a loop.

                     For IPv6 NAT entries, the same flows are added, but
                     using the register xxreg0 and the ip6 fields in the
                     match.
3244
              •      If the router datapath runs a port with
                     redirect-type set to bridged, then for each
                     distributed NAT rule with IP A in the logical_ip
                     column and logical port P in the logical_port column
                     of the NAT table, a priority-90 flow is added with
                     the match outport == Q && ip.src == A &&
                     is_chassis_resident(P), where Q is the distributed
                     logical router port, and action get_arp(outport,
                     reg0); next; for IPv4 and get_nd(outport, xxreg0);
                     next; for IPv6.
3253
              •      Traffic whose IP destination is an address owned by
                     the router should be dropped. Such traffic is
                     normally dropped in ingress table IP Input, except
                     for IPs that are also shared with SNAT rules.
                     However, if no unSNAT operation has happened
                     successfully up to this point in the pipeline and
                     the destination IP of the packet is still a
                     router-owned IP, the packets can be safely dropped.

                     A priority-2 logical flow with match ip4.dst == {..}
                     matches on traffic destined to router-owned IPv4
                     addresses which are also SNAT IPs. This flow has
                     action drop;.

                     A priority-2 logical flow with match ip6.dst == {..}
                     matches on traffic destined to router-owned IPv6
                     addresses which are also SNAT IPs. This flow has
                     action drop;.

                     A priority-0 logical flow that matches all packets
                     not already handled (match 1) and drops them (action
                     drop;).
3275
3276              •      Dynamic MAC bindings. These flows resolve MAC-to-IP bind‐
3277                     ings  that  have  become known dynamically through ARP or
3278                     neighbor discovery. (The ingress table ARP  Request  will
3279                     issue  an  ARP or neighbor solicitation request for cases
3280                     where the binding is not yet known.)
3281
3282                     A priority-0 logical flow  with  match  ip4  has  actions
3283                     get_arp(outport, reg0); next;.
3284
3285                     A  priority-0  logical  flow  with  match ip6 has actions
3286                     get_nd(outport, xxreg0); next;.
3287
              •      For a distributed gateway LRP with redirect-type set
                     to bridged, a priority-50 flow matches outport ==
                     "ROUTER_PORT" &&
                     !is_chassis_resident("cr-ROUTER_PORT") and has
                     actions eth.dst = E; next;, where E is the Ethernet
                     address of the logical router port.
3293
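       As a configuration sketch (lr0 is a placeholder router name), the
       static bindings toward neighbor routers described above can be
       suppressed in favor of dynamic resolution with:

              ovn-nbctl set logical_router lr0 \
                      options:dynamic_neigh_routers=true
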
3294     Ingress Table 18: Check packet length
3295
       For distributed logical routers or gateway routers with a gateway
       port configured with options:gateway_mtu set to a valid integer
       value, this table adds a priority-50 logical flow with the match
       outport == GW_PORT, where GW_PORT is the gateway router port, and
       applies the following action, advancing the packet to the next
       table:

       REGBIT_PKT_LARGER = check_pkt_larger(L); next;


       where L is the packet length to check for. If the packet is
       larger than L, it stores 1 in the register bit REGBIT_PKT_LARGER.
       The value of L is taken from the options:gateway_mtu column of
       the Logical_Router_Port row.

       If the port is also configured with options:gateway_mtu_bypass,
       then another flow is added, with priority 55, to bypass the
       check_pkt_larger flow.
3312
3313       This table adds one priority-0 fallback flow that matches  all  packets
3314       and advances to the next table.
3315
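       For instance (a sketch; lrp-ext is a placeholder port name), the
       packet length check can be enabled on a gateway port with:

              ovn-nbctl set logical_router_port lrp-ext \
                      options:gateway_mtu=1500
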
3316     Ingress Table 19: Handle larger packets
3317
       For distributed logical routers or gateway routers with a gateway
       port configured with options:gateway_mtu set to a valid integer
       value, this table adds the following priority-150 logical flow
       for each logical router port with the match inport == LRP &&
       outport == GW_PORT && REGBIT_PKT_LARGER &&
       !REGBIT_EGRESS_LOOPBACK, where LRP is the logical router port and
       GW_PORT is the gateway port, and applies the following action for
       IPv4 and IPv6 respectively:
3325
3326       icmp4 {
3327           icmp4.type = 3; /* Destination Unreachable. */
3328           icmp4.code = 4;  /* Frag Needed and DF was Set. */
3329           icmp4.frag_mtu = M;
3330           eth.dst = E;
3331           ip4.dst = ip4.src;
3332           ip4.src = I;
3333           ip.ttl = 255;
3334           REGBIT_EGRESS_LOOPBACK = 1;
3335           REGBIT_PKT_LARGER = 0;
3336           next(pipeline=ingress, table=0);
3337       };
3338       icmp6 {
3339           icmp6.type = 2;
3340           icmp6.code = 0;
3341           icmp6.frag_mtu = M;
3342           eth.dst = E;
3343           ip6.dst = ip6.src;
3344           ip6.src = I;
3345           ip.ttl = 255;
3346           REGBIT_EGRESS_LOOPBACK = 1;
3347           REGBIT_PKT_LARGER = 0;
3348           next(pipeline=ingress, table=0);
3349       };
3350
3351
              •      Where M is the (fragment MTU - 58) whose value is
                     taken from the options:gateway_mtu column of the
                     Logical_Router_Port row.

                     E is the Ethernet address of the logical router
                     port.

                     I is the IPv4/IPv6 address of the logical router
                     port.
3359
3360       This  table  adds one priority-0 fallback flow that matches all packets
3361       and advances to the next table.
3362
3363     Ingress Table 20: Gateway Redirect
3364
       For distributed logical routers where one or more of the logical
       router ports specifies a gateway chassis, this table redirects
       certain packets to the distributed gateway port instances on the
       gateway chassis. This table has the following flows (a
       configuration sketch follows the list):
3369
              •      For all the configured load balancing rules that
                     include an IPv4 address VIP, and a list of IPv4
                     backend addresses B0, B1 .. Bn defined for the VIP,
                     a priority-200 flow is added that matches ip4 &&
                     (ip4.src == B0 || ip4.src == B1 || ... || ip4.src ==
                     Bn) with an action outport = CR; next;, where CR is
                     the chassisredirect port representing the instance
                     of the logical router distributed gateway port on
                     the gateway chassis. If the backend IPv4 address Bx
                     is also configured with L4 port PORT of protocol P,
                     then the match also includes P.src == PORT. Similar
                     flows are added for IPv6.
3381
              •      For each NAT rule in the OVN Northbound database
                     that can be handled in a distributed manner, a
                     priority-100 logical flow is added with match
                     ip4.src == B && outport == GW &&
                     is_chassis_resident(P), where GW is the distributed
                     gateway port specified in the NAT rule and P is the
                     NAT logical port. IP traffic matching the above rule
                     will be managed locally, setting reg1 to C and
                     eth.src to D, where C is the NAT external IP and D
                     is the NAT external MAC.
3390
              •      For each dnat_and_snat NAT rule with stateless=true
                     and allowed_ext_ips configured, a priority-75 flow
                     is programmed with match ip4.dst == B and action
                     outport = CR; next;, where B is the NAT rule
                     external IP and CR is the chassisredirect port
                     representing the instance of the logical router
                     distributed gateway port on the gateway chassis.
                     Moreover, a priority-70 flow is programmed with the
                     same match and action drop;. For each dnat_and_snat
                     NAT rule with stateless=true and exempted_ext_ips
                     configured, a priority-75 flow is programmed with
                     match ip4.dst == B and action drop;, where B is the
                     NAT rule external IP. A similar flow is added for
                     IPv6 traffic.
3403
              •      For each NAT rule in the OVN Northbound database
                     that can be handled in a distributed manner, a
                     priority-80 logical flow with a drop action is added
                     if the NAT logical port is a virtual port not yet
                     claimed by any chassis.
3408
3409              •      A priority-50 logical flow with match outport ==  GW  has
3410                     actions  outport  =  CR;  next;,  where GW is the logical
3411                     router distributed gateway  port  and  CR  is  the  chas‐
3412                     sisredirect port representing the instance of the logical
3413                     router distributed gateway port on the gateway chassis.
3414
3415              •      A priority-0 logical flow with match 1 has actions next;.
3416
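       As a sketch with placeholder names, scheduling a distributed
       gateway port onto a chassis, which creates the chassisredirect
       port (cr-lrp-ext) referenced by the flows above, could look like:

              ovn-nbctl lrp-set-gateway-chassis lrp-ext chassis-1 20
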
3417     Ingress Table 21: ARP Request
3418
3419       In the common case where the Ethernet destination  has  been  resolved,
3420       this  table outputs the packet. Otherwise, it composes and sends an ARP
3421       or IPv6 Neighbor Solicitation request. It holds the following flows:
3422
3423              •      Unknown MAC address. A priority-100 flow for IPv4 packets
3424                     with match eth.dst == 00:00:00:00:00:00 has the following
3425                     actions:
3426
3427                     arp {
3428                         eth.dst = ff:ff:ff:ff:ff:ff;
3429                         arp.spa = reg1;
3430                         arp.tpa = reg0;
3431                         arp.op = 1;  /* ARP request. */
3432                         output;
3433                     };
3434
3435
                     Unknown MAC address. For each IPv6 static route
                     associated with the router with nexthop IP G, a
                     priority-200 flow for IPv6 packets with match
                     eth.dst == 00:00:00:00:00:00 && xxreg0 == G is added
                     with the following actions:
3441
                     nd_ns {
                         eth.dst = E;
                         ip6.dst = I;
                         nd.target = G;
                         output;
                     };
3448
3449
                     where E is the multicast MAC address derived from
                     the gateway IP, and I is the solicited-node
                     multicast address corresponding to the target
                     address G.
3453
3454                     Unknown MAC address. A priority-100 flow for IPv6 packets
3455                     with match eth.dst == 00:00:00:00:00:00 has the following
3456                     actions:
3457
3458                     nd_ns {
3459                         nd.target = xxreg0;
3460                         output;
3461                     };
3462
3463
                     (Ingress table IP Routing initialized reg1 with the
                     IP address owned by outport and (xx)reg0 with the
                     next-hop IP address.)
3467
3468                     The IP packet that triggers the ARP/IPv6  NS  request  is
3469                     dropped.
3470
3471              •      Known MAC address. A priority-0 flow with match 1 has ac‐
3472                     tions output;.
3473
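       The dynamically learned bindings that back the get_arp and get_nd
       lookups in the ARP/ND Resolution table, and that are populated as
       a result of the requests sent by this table, are stored in the
       southbound MAC_Binding table; they can be inspected, for example,
       with:

              ovn-sbctl list MAC_Binding
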
3474     Egress Table 0: Check DNAT local
3475
       This table checks if the packet needs to be DNATed in the router
       ingress table lr_in_dnat after it is SNATed and looped back to
       the ingress pipeline. This check is done only for routers
       configured with distributed gateway ports and NAT entries. This
       check is done so that SNAT and DNAT are done in different zones
       instead of a common zone.
3481
3482              •      A priority-0 logical flow with match 1 has  actions  REG‐
3483                     BIT_DST_NAT_IP_LOCAL = 0; next;.
3484
3485     Egress Table 1: UNDNAT
3486
       This is for reverse traffic of already established connections,
       i.e., DNAT has already been done in the ingress pipeline and the
       packet has now entered the egress pipeline as part of a reply.
       This traffic is unDNATed here.
3491
3492              •      A priority-0 logical flow with match 1 has actions next;.
3493
3494     Egress Table 1: UNDNAT on Gateway Routers
3495
3496              •      For IPv6 Neighbor Discovery or Router Solicitation/Adver‐
3497                     tisement traffic, a priority-100 flow with action next;.
3498
3499              •      For  all  IP  packets,  a priority-50 flow with an action
3500                     flags.loopback = 1; ct_dnat;.
3501
3502     Egress Table 1: UNDNAT on Distributed Routers
3503
              •      For all the configured load balancing rules for a
                     router with gateway port in the OVN_Northbound
                     database that include an IPv4 address VIP, for every
                     backend IPv4 address B defined for the VIP, a
                     priority-120 flow is programmed on the gateway
                     chassis that matches ip && ip4.src == B && outport
                     == GW, where GW is the logical router gateway port,
                     with an action ct_dnat;. If the backend IPv4 address
                     B is also configured with L4 port PORT of protocol
                     P, then the match also includes P.src == PORT. These
                     flows are not added for load balancers with IPv6
                     VIPs. (A configuration sketch follows this list.)
3514
                     If the router is configured to force SNAT any
                     load-balanced packets, the above action will be
                     replaced by flags.force_snat_for_lb = 1; ct_dnat;.
3518
3519              •      For  each  configuration  in  the OVN Northbound database
3520                     that asks to change  the  destination  IP  address  of  a
3521                     packet  from an IP address of A to B, a priority-100 flow
3522                     matches ip && ip4.src == B && outport == GW, where GW  is
3523                     the logical router gateway port, with an action ct_dnat;.
3524                     If the NAT rule is of type dnat_and_snat and  has  state‐
3525                     less=true in the options, then the action would be next;.
3526
                     If the NAT rule cannot be handled in a distributed
                     manner, then the priority-100 flow above is only
                     programmed on the gateway chassis with the action
                     ct_dnat;.
3530
3531                     If  the  NAT rule can be handled in a distributed manner,
3532                     then there is an additional action eth.src =  EA;,  where
3533                     EA is the ethernet address associated with the IP address
3534                     A in the NAT rule. This allows upstream MAC  learning  to
3535                     point to the correct chassis.
3536
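       As a sketch (placeholder names; syntax per ovn-nbctl(8)), a load
       balancer that produces the flows above, applied to a router that
       force-SNATs load-balanced traffic to its router IP, could be
       configured with:

              ovn-nbctl lb-add lb0 172.16.1.100:80 \
                      10.0.0.2:80,10.0.0.3:80 tcp
              ovn-nbctl lr-lb-add lr0 lb0
              ovn-nbctl set logical_router lr0 \
                      options:lb_force_snat_ip=router_ip
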
3537     Egress Table 2: Post UNDNAT
3538
              •      A priority-50 logical flow is added that commits any
                     untracked flows from the previous table
                     lr_out_undnat for Gateway routers. This flow matches
                     on ct.new && ip with action ct_commit { }; next;.
3543
3544              •      A priority-0 logical flow with match 1 has actions next;.
3545
3546     Egress Table 3: SNAT
3547
3548       Packets that are configured to be SNATed get their  source  IP  address
3549       changed based on the configuration in the OVN Northbound database.
3550
              •      A priority-120 flow to advance the IPv6 Neighbor
                     Solicitation packet to the next table to skip SNAT.
                     In the case where ovn-controller injects an IPv6
                     Neighbor Solicitation packet (for the nd_ns action)
                     we don’t want the packet to go through conntrack.
3556
     Egress Table 3: SNAT on Gateway Routers
3558
3559              •      If  the Gateway router in the OVN Northbound database has
3560                     been configured to force SNAT a  packet  (that  has  been
3561                     previously  DNATted)  to  B,  a priority-100 flow matches
3562                     flags.force_snat_for_dnat ==  1  &&  ip  with  an  action
3563                     ct_snat(B);.
3564
3565              •      If  a  load balancer configured to skip snat has been ap‐
3566                     plied to the Gateway router pipeline, a priority-120 flow
3567                     matches  flags.skip_snat_for_lb == 1 && ip with an action
3568                     next;.
3569
              •      If the Gateway router in the OVN Northbound database
                     has been configured to force SNAT a packet (that has
                     been previously load-balanced) using router IP
                     (i.e., options:lb_force_snat_ip=router_ip), then for
                     each logical router port P attached to the Gateway
                     router, a priority-110 flow matches
                     flags.force_snat_for_lb == 1 && outport == P with an
                     action ct_snat(R);, where R is the IP configured on
                     the router port. If R is an IPv4 address then the
                     match will also include ip4 and if it is an IPv6
                     address, then the match will also include ip6.

                     If the logical router port P is configured with
                     multiple IPv4 and multiple IPv6 addresses, only the
                     first IPv4 and first IPv6 address is considered.
3585
3586              •      If  the Gateway router in the OVN Northbound database has
3587                     been configured to force SNAT a  packet  (that  has  been
3588                     previously  load-balanced)  to  B,  a  priority-100  flow
3589                     matches flags.force_snat_for_lb == 1 && ip with an action
3590                     ct_snat(B);.
3591
              •      For each configuration in the OVN Northbound
                     database that asks to change the source IP address
                     of a packet from an IP address of A, or to change
                     the source IP address of a packet that belongs to
                     network A, to B, a flow matches ip && ip4.src == A
                     && (!ct.trk || !ct.rpl) with an action ct_snat(B);.
                     The priority of the flow is calculated based on the
                     mask of A, with matches having larger masks getting
                     higher priorities. If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options,
                     then the action would be ip4/6.src=(B).
3602
              •      If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.dst ==
                     allowed_ext_ips. Similarly, for IPv6, the match
                     would be ip6.dst == allowed_ext_ips.
3607
              •      If the NAT rule has exempted_ext_ips set, then there
                     is an additional flow configured at a priority one
                     higher than that of the corresponding NAT rule. The
                     flow matches if the destination IP is an
                     exempted_ext_ip and the action is next;. This flow
                     is used to bypass the ct_snat action for a packet
                     which is destined to exempted_ext_ips.
3614
3615              •      A priority-0 logical flow with match 1 has actions next;.
3616
     Egress Table 3: SNAT on Distributed Routers
3618
              •      For each configuration in the OVN Northbound
                     database that asks to change the source IP address
                     of a packet from an IP address of A, or to change
                     the source IP address of a packet that belongs to
                     network A, to B, two flows are added. The priority P
                     of these flows is calculated based on the mask of A,
                     with matches having larger masks getting higher
                     priorities.

                     If the NAT rule cannot be handled in a distributed
                     manner, then the below flows are only programmed on
                     the gateway chassis, with the flow priority
                     increased by 128 so that they are evaluated first.
3631
                     •      The first flow is added with the calculated
                            priority P and match ip && ip4.src == A &&
                            outport == GW, where GW is the logical router
                            gateway port, with an action ct_snat(B); so
                            that the packet is SNATed in the common zone.
                            If the NAT rule is of type dnat_and_snat and
                            has stateless=true in the options, then the
                            action would be ip4/6.src=(B).
3639
3640                     If  the  NAT rule can be handled in a distributed manner,
3641                     then there is an additional action (for both  the  flows)
3642                     eth.src  =  EA;, where EA is the ethernet address associ‐
3643                     ated with the IP address A in the NAT rule.  This  allows
3644                     upstream MAC learning to point to the correct chassis.
3645
                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.dst ==
                     allowed_ext_ips. Similarly, for IPv6, the match
                     would be ip6.dst == allowed_ext_ips.
3650
                     If the NAT rule has exempted_ext_ips set, then there
                     is an additional flow configured at priority P + 2
                     of the corresponding NAT rule. The flow matches if
                     the destination IP is an exempted_ext_ip and the
                     action is next;. This flow is used to bypass the
                     ct_snat action for a flow which is destined to
                     exempted_ext_ips. (A configuration sketch follows
                     this list.)
3657
3658              •      A priority-0 logical flow with match 1 has actions next;.
3659
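       For illustration (placeholder names and addresses), a distributed
       dnat_and_snat rule and a stateless variant, whose SNAT flows are
       described above, could be added with:

              ovn-nbctl lr-nat-add lr0 dnat_and_snat \
                      172.16.1.10 10.0.0.10 lsp1 02:ac:10:01:00:0a
              ovn-nbctl --stateless lr-nat-add lr0 dnat_and_snat \
                      172.16.1.11 10.0.0.11
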
3660     Egress Table 4: Post SNAT
3661
3662       Packets reaching this table are processed according to the flows below:
3663
              •      A priority-0 logical flow that matches all packets
                     not already handled (match 1) with action next;.
3666
3667     Egress Table 5: Egress Loopback
3668
       This table applies to distributed logical routers where one of
       the logical router ports specifies a gateway chassis.
3671
3672       While  UNDNAT  and SNAT processing have already occurred by this point,
3673       this traffic needs to be forced through egress loopback  on  this  dis‐
3674       tributed gateway port instance, in order for UNSNAT and DNAT processing
3675       to be applied, and also for IP routing and ARP resolution after all  of
3676       the NAT processing, so that the packet can be forwarded to the destina‐
3677       tion.
3678
3679       This table has the following flows:
3680
              •      For each NAT rule in the OVN Northbound database on
                     a distributed router, a priority-100 logical flow
                     with match ip4.dst == E && outport == GW &&
                     is_chassis_resident(P), where E is the external IP
                     address specified in the NAT rule and GW is the
                     distributed gateway port corresponding to the NAT
                     rule (specified or inferred). For dnat_and_snat NAT
                     rules, P is the logical port specified in the NAT
                     rule; if the logical_port column of the NAT table is
                     not set, then P is the chassisredirect port of GW.
                     The flow has the following actions:
3691
3692                     clone {
3693                         ct_clear;
3694                         inport = outport;
3695                         outport = "";
3696                         flags = 0;
3697                         flags.loopback = 1;
3698                         reg0 = 0;
3699                         reg1 = 0;
3700                         ...
3701                         reg9 = 0;
3702                         REGBIT_EGRESS_LOOPBACK = 1;
3703                         next(pipeline=ingress, table=0);
3704                     };
3705
3706
                     flags.loopback is set since inport is unchanged and
                     the packet may return to that port after NAT
                     processing. REGBIT_EGRESS_LOOPBACK is set to
                     indicate that egress loopback has occurred, in order
                     to skip the source IP address check against the
                     router address.
3712
3713              •      A priority-0 logical flow with match 1 has actions next;.
3714
3715     Egress Table 6: Delivery
3716
3717       Packets that reach this table are ready for delivery. It contains:
3718
3719              •      Priority-110  logical flows that match IP multicast pack‐
3720                     ets on each enabled logical router port  and  modify  the
3721                     Ethernet  source  address  of the packets to the Ethernet
3722                     address of the port and then execute action output;.
3723
3724              •      Priority-100 logical flows that match packets on each en‐
3725                     abled logical router port, with action output;.
3726
3727              •      A  priority-0  logical  flow that matches all packets not
3728                     already handled (match 1) and drops them (action drop;).
3729

DROP SAMPLING

       As described in the previous section, there are several places
       where ovn-northd might decide to drop a packet by explicitly
       creating a Logical_Flow with the drop; action.

       When debug drop-sampling has been configured in the OVN
       Northbound database, ovn-northd will replace all the drop;
       actions with a sample(priority=65535, collector_set=id,
       obs_domain=obs_id, obs_point=@cookie) action, where:

              •      id is the value of the debug_drop_collector_set
                     option configured in the OVN Northbound database.

              •      obs_id has its 8 most significant bits equal to the
                     value of the debug_drop_domain_id option in the OVN
                     Northbound database and its 24 least significant
                     bits equal to the datapath’s tunnel key.
3747
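       As a sketch, assuming the debug_drop_collector_set and
       debug_drop_domain_id keys are set in the options column of the
       northbound NB_Global table (see ovn-nb(5)), drop sampling could
       be enabled with:

              ovn-nbctl set NB_Global . \
                      options:debug_drop_collector_set=2 \
                      options:debug_drop_domain_id=5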
3748
3749
3750OVN 23.09.2                       ovn-northd                     ovn-northd(8)