ovn-northd(8)                     OVN Manual                     ovn-northd(8)

NAME

       ovn-northd and ovn-northd-ddlog - Open Virtual Network central control
       daemon

SYNOPSIS

       ovn-northd [options]

DESCRIPTION

       ovn-northd is a centralized daemon responsible for translating the
       high-level OVN configuration into logical configuration consumable by
       daemons such as ovn-controller. It translates the logical network
       configuration in terms of conventional network concepts, taken from
       the OVN Northbound Database (see ovn-nb(5)), into logical datapath
       flows in the OVN Southbound Database (see ovn-sb(5)) below it.

       ovn-northd is implemented in C. ovn-northd-ddlog is a compatible
       implementation written in DDlog, a language for incremental database
       processing. This documentation applies to both implementations, with
       differences indicated where relevant.
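As a minimal sketch, a typical invocation points the daemon at the two
databases (the socket paths shown are the documented defaults and may differ
per installation):

```shell
# Start ovn-northd against explicit northbound and southbound database
# sockets. The paths below are illustrative defaults; adjust them for
# your installation.
ovn-northd \
    --ovnnb-db=unix:/ovnnb_db.sock \
    --ovnsb-db=unix:/ovnsb_db.sock \
    --pidfile --detach --log-file
```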

OPTIONS

       --ovnnb-db=database
              The OVSDB database containing the OVN Northbound Database. If
              the OVN_NB_DB environment variable is set, its value is used
              as the default. Otherwise, the default is unix:/ovnnb_db.sock.

       --ovnsb-db=database
              The OVSDB database containing the OVN Southbound Database. If
              the OVN_SB_DB environment variable is set, its value is used
              as the default. Otherwise, the default is unix:/ovnsb_db.sock.

       --ddlog-record=file
              This option is for ovn-northd-ddlog only. It causes the
              daemon to record the initial database state and later changes
              to file in the text-based DDlog command format. The
              ovn_northd_cli program can later replay these changes for
              debugging purposes. This option has a performance impact. See
              debugging-ddlog.rst in the OVN documentation for more details.

       --dry-run
              Causes ovn-northd to start paused. In the paused state,
              ovn-northd does not apply any changes to the databases,
              although it continues to monitor them. For more information,
              see the pause command, under Runtime Management Commands
              below.

              For ovn-northd-ddlog, one could use this option with
              --ddlog-record to generate a replay log without restarting a
              process or disturbing a running system.

       database in the above options must be an OVSDB active or passive
       connection method, as described in ovsdb(7).
   Daemon Options
       --pidfile[=pidfile]
              Causes a file (by default, program.pid) to be created
              indicating the PID of the running process. If the pidfile
              argument is not specified, or if it does not begin with /,
              then it is created in .

              If --pidfile is not specified, no pidfile is created.
       --overwrite-pidfile
              By default, when --pidfile is specified and the specified
              pidfile already exists and is locked by a running process,
              the daemon refuses to start. Specify --overwrite-pidfile to
              cause it to instead overwrite the pidfile.

              When --pidfile is not specified, this option has no effect.
       --detach
              Runs this program as a background process. The process forks,
              and in the child it starts a new session, closes the standard
              file descriptors (which has the side effect of disabling
              logging to the console), and changes its current directory to
              the root (unless --no-chdir is specified). After the child
              completes its initialization, the parent exits.

       --monitor
              Creates an additional process to monitor this program. If it
              dies due to a signal that indicates a programming error
              (SIGABRT, SIGALRM, SIGBUS, SIGFPE, SIGILL, SIGPIPE, SIGSEGV,
              SIGXCPU, or SIGXFSZ) then the monitor process starts a new
              copy of it. If the daemon dies or exits for another reason,
              the monitor process exits.

              This option is normally used with --detach, but it also
              functions without it.
       --no-chdir
              By default, when --detach is specified, the daemon changes
              its current working directory to the root directory after it
              detaches. Otherwise, invoking the daemon from a carelessly
              chosen directory would prevent the administrator from
              unmounting the file system that holds that directory.

              Specifying --no-chdir suppresses this behavior, preventing
              the daemon from changing its current working directory. This
              may be useful for collecting core files, since it is common
              behavior to write core dumps into the current working
              directory and the root directory is not a good directory to
              use.

              This option has no effect when --detach is not specified.
       --no-self-confinement
              By default this daemon will try to confine itself to work
              with files under well-known directories determined at build
              time. It is better to stick with this default behavior and
              not to use this flag unless some other access control
              mechanism is used to confine the daemon. Note that in
              contrast to other access control implementations that are
              typically enforced from kernel space (e.g. DAC or MAC),
              self-confinement is imposed from the user-space daemon itself
              and hence should not be considered a full confinement
              strategy, but instead should be viewed as an additional layer
              of security.
       --user=user:group
              Causes this program to run as a different user specified in
              user:group, thus dropping most of the root privileges. Short
              forms user and :group are also allowed, with the current user
              or group assumed, respectively. Only daemons started by the
              root user accept this argument.

              On Linux, daemons will be granted CAP_IPC_LOCK and
              CAP_NET_BIND_SERVICE before dropping root privileges. Daemons
              that interact with a datapath, such as ovs-vswitchd, will be
              granted three additional capabilities, namely CAP_NET_ADMIN,
              CAP_NET_BROADCAST and CAP_NET_RAW. The capability change will
              apply even if the new user is root.

              On Windows, this option is not currently supported. For
              security reasons, specifying this option will cause the
              daemon process not to start.
   Logging Options
       -v[spec]
       --verbose=[spec]
            Sets logging levels. Without any spec, sets the log level for
            every module and destination to dbg. Otherwise, spec is a list
            of words separated by spaces or commas or colons, up to one
            from each category below:

            •      A valid module name, as displayed by the vlog/list
                   command on ovs-appctl(8), limits the log level change to
                   the specified module.

            •      syslog, console, or file, to limit the log level change
                   to only the system log, the console, or a file,
                   respectively. (If --detach is specified, the daemon
                   closes its standard file descriptors, so logging to the
                   console will have no effect.)

                   On the Windows platform, syslog is accepted as a word
                   and is only useful along with the --syslog-target option
                   (the word has no effect otherwise).

            •      off, emer, err, warn, info, or dbg, to control the log
                   level. Messages of the given severity or higher will be
                   logged, and messages of lower severity will be filtered
                   out. off filters out all messages. See ovs-appctl(8) for
                   a definition of each log level.

            Case is not significant within spec.

            Regardless of the log levels set for file, logging to a file
            will not take place unless --log-file is also specified (see
            below).

            For compatibility with older versions of OVS, any is accepted
            as a word but has no effect.
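As an illustrative sketch, spec words can be combined per the categories
above (the specs shown are examples, not a complete list):

```shell
# Example -v specs: destination plus level. List real module names
# with "ovs-appctl -t ovn-northd vlog/list".
ovn-northd -vconsole:info            # console gets info and above
ovn-northd -vfile:dbg --log-file     # debug-level logging to a file
```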
       -v
       --verbose
            Sets the maximum logging verbosity level, equivalent to
            --verbose=dbg.

       -vPATTERN:destination:pattern
       --verbose=PATTERN:destination:pattern
            Sets the log pattern for destination to pattern. Refer to
            ovs-appctl(8) for a description of the valid syntax for
            pattern.

       -vFACILITY:facility
       --verbose=FACILITY:facility
            Sets the RFC5424 facility of the log message. facility can be
            one of kern, user, mail, daemon, auth, syslog, lpr, news, uucp,
            clock, ftp, ntp, audit, alert, clock2, local0, local1, local2,
            local3, local4, local5, local6 or local7. If this option is not
            specified, daemon is used as the default for the local system
            syslog and local0 is used while sending a message to the target
            provided via the --syslog-target option.
       --log-file[=file]
            Enables logging to a file. If file is specified, then it is
            used as the exact name for the log file. The default log file
            name used if file is omitted is /var/log/ovn/program.log.

       --syslog-target=host:port
            Send syslog messages to UDP port on host, in addition to the
            system syslog. The host must be a numerical IP address, not a
            hostname.

       --syslog-method=method
            Specify method as how syslog messages should be sent to the
            syslog daemon. The following forms are supported:

            •      libc, to use the libc syslog() function. The downside
                   of using this option is that libc adds a fixed prefix
                   to every message before it is actually sent to the
                   syslog daemon over the /dev/log UNIX domain socket.

            •      unix:file, to use a UNIX domain socket directly. It is
                   possible to specify an arbitrary message format with
                   this option. However, rsyslogd 8.9 and older versions
                   use a hard-coded parser function that limits UNIX
                   domain socket use. If you want to use an arbitrary
                   message format with older rsyslogd versions, then use a
                   UDP socket to a localhost IP address instead.

            •      udp:ip:port, to use a UDP socket. With this method it
                   is possible to use an arbitrary message format also
                   with older rsyslogd. When sending syslog messages over
                   a UDP socket, extra precautions need to be taken: for
                   example, the syslog daemon needs to be configured to
                   listen on the specified UDP port, accidental iptables
                   rules could be interfering with local syslog traffic,
                   and there are some security considerations that apply
                   to UDP sockets but do not apply to UNIX domain sockets.

            •      null, to discard all messages logged to syslog.

            The default is taken from the OVS_SYSLOG_METHOD environment
            variable; if it is unset, the default is libc.
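A hedged sketch combining the two syslog options above; the addresses and
ports are examples only, and the collector must already be listening:

```shell
# Deliver local syslog over a UDP socket (useful with older rsyslogd)
# and additionally forward to a remote collector. Addresses and ports
# are illustrative.
ovn-northd \
    --syslog-method=udp:127.0.0.1:514 \
    --syslog-target=192.168.0.20:514
```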
   PKI Options
       PKI configuration is required in order to use SSL for the
       connections to the Northbound and Southbound databases.

              -p privkey.pem
              --private-key=privkey.pem
                   Specifies a PEM file containing the private key used as
                   identity for outgoing SSL connections.

              -c cert.pem
              --certificate=cert.pem
                   Specifies a PEM file containing a certificate that
                   certifies the private key specified on -p or
                   --private-key to be trustworthy. The certificate must be
                   signed by the certificate authority (CA) that the peer
                   in SSL connections will use to verify it.

              -C cacert.pem
              --ca-cert=cacert.pem
                   Specifies a PEM file containing the CA certificate for
                   verifying certificates presented to this program by SSL
                   peers. (This may be the same certificate that SSL peers
                   use to verify the certificate specified on -c or
                   --certificate, or it may be a different one, depending
                   on the PKI design in use.)

              -C none
              --ca-cert=none
                   Disables verification of certificates presented by SSL
                   peers. This introduces a security risk, because it means
                   that certificates cannot be verified to be those of
                   known trusted hosts.
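For illustration, the PKI options are combined with ssl: connection methods;
every path and address below is an example to be replaced with values from
your own PKI setup:

```shell
# Connect to both databases over SSL. Paths and addresses are
# illustrative placeholders.
ovn-northd \
    --ovnnb-db=ssl:192.168.0.10:6641 \
    --ovnsb-db=ssl:192.168.0.10:6642 \
    --private-key=/etc/ovn/privkey.pem \
    --certificate=/etc/ovn/cert.pem \
    --ca-cert=/etc/ovn/cacert.pem
```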
   Other Options
       --unixctl=socket
              Sets the name of the control socket on which program listens
              for runtime management commands (see RUNTIME MANAGEMENT
              COMMANDS, below). If socket does not begin with /, it is
              interpreted as relative to . If --unixctl is not used at all,
              the default socket is /program.pid.ctl, where pid is
              program’s process ID.

              On Windows a local named pipe is used to listen for runtime
              management commands. A file is created in the absolute path
              as pointed to by socket or, if --unixctl is not used at all,
              a file is created as program in the configured OVS_RUNDIR
              directory. The file exists just to mimic the behavior of a
              Unix domain socket.

              Specifying none for socket disables the control socket
              feature.

       -h
       --help
            Prints a brief help message to the console.

       -V
       --version
            Prints version information to the console.

RUNTIME MANAGEMENT COMMANDS

       ovs-appctl can send commands to a running ovn-northd process. The
       currently supported commands are described below.

              exit   Causes ovn-northd to gracefully terminate.

              pause  Pauses ovn-northd. When it is paused, ovn-northd
                     receives changes from the Northbound and Southbound
                     databases as usual, but it does not send any updates.
                     A paused ovn-northd also drops database locks, which
                     allows any other non-paused instance of ovn-northd to
                     take over.

              resume Resumes the ovn-northd operation to process Northbound
                     and Southbound database contents and generate logical
                     flows. This will also instruct ovn-northd to attempt
                     to acquire the lock on the SB DB.

              is-paused
                     Returns "true" if ovn-northd is currently paused,
                     "false" otherwise.

              status Prints this server’s status. Status will be "active"
                     if ovn-northd has acquired the OVSDB lock on the SB
                     DB, "standby" if it has not, or "paused" if this
                     instance is paused.

              sb-cluster-state-reset
                     Reset southbound database cluster status when
                     databases are destroyed and rebuilt.

                     If all databases in a clustered southbound database
                     are removed from disk, then the stored index of all
                     databases will be reset to zero. This will cause
                     ovn-northd to be unable to read or write to the
                     southbound database, because it will always detect the
                     data as stale. In such a case, run this command so
                     that ovn-northd will reset its local index so that it
                     can interact with the southbound database again.

              nb-cluster-state-reset
                     Reset northbound database cluster status when
                     databases are destroyed and rebuilt.

                     This performs the same task as sb-cluster-state-reset
                     except for the northbound database client.
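A minimal sketch of issuing these commands, assuming ovn-northd is running
with its default control socket (pass a different target to -t otherwise):

```shell
# Query a running ovn-northd instance. Target discovery relies on the
# default pidfile location; adjust -t if your paths differ.
ovs-appctl -t ovn-northd status
ovs-appctl -t ovn-northd is-paused
```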
       Only ovn-northd-ddlog supports the following commands:

              enable-cpu-profiling
              disable-cpu-profiling
                   Enables or disables profiling of CPU time used by the
                   DDlog engine. When CPU profiling is enabled, the profile
                   command (see below) will include DDlog CPU usage
                   statistics in its output. Enabling CPU profiling will
                   slow ovn-northd-ddlog. Disabling CPU profiling does not
                   clear any previously recorded statistics.

              profile
                   Outputs a profile of the current and peak sizes of
                   arrangements inside DDlog. This profiling data can be
                   useful for optimizing DDlog code. If CPU profiling was
                   previously enabled (even if it was later disabled), the
                   output also includes a CPU time profile. See Profiling
                   inside the tutorial in the DDlog repository for an
                   introduction to profiling DDlog.

ACTIVE-STANDBY FOR HIGH AVAILABILITY

       You may run ovn-northd more than once in an OVN deployment. When
       connected to a standalone or clustered DB setup, OVN will
       automatically ensure that only one of them is active at a time. If
       multiple instances of ovn-northd are running and the active
       ovn-northd fails, one of the hot standby instances of ovn-northd
       will automatically take over.

   Active-Standby with multiple OVN DB servers
       You may run multiple OVN DB servers in an OVN deployment with:

              •      OVN DB servers deployed in active/passive mode with
                     one active and multiple passive ovsdb-servers.

              •      ovn-northd also deployed on all these nodes, using
                     unix ctl sockets to connect to the local OVN DB
                     servers.

       In such deployments, the ovn-northds on the passive nodes will
       process the DB changes and compute logical flows that are then
       discarded, because the passive ovsdb-servers do not allow write
       transactions. This results in unnecessary CPU usage.

       With the help of the runtime management command pause, you can pause
       ovn-northd on these nodes. When a passive node becomes master, you
       can use the runtime management command resume to resume ovn-northd
       so that it processes the DB changes.
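The workflow on a passive node could be sketched as follows (illustrative;
adjust the -t target for your deployment):

```shell
# On a node whose local ovsdb-server is passive, park the local
# ovn-northd; after the node is promoted to master, wake it up again.
ovs-appctl -t ovn-northd pause     # standby node: avoid wasted work
ovs-appctl -t ovn-northd resume    # after promotion to master
```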

LOGICAL FLOW TABLE STRUCTURE

       One of the main purposes of ovn-northd is to populate the
       Logical_Flow table in the OVN_Southbound database. This section
       describes how ovn-northd does this for switch and router logical
       datapaths.
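The flows described below can be inspected on a live system; as a sketch,
assuming a reachable southbound database:

```shell
# Dump the logical flows that ovn-northd has installed, grouped by
# datapath and table. Requires access to the southbound database.
ovn-sbctl lflow-list
```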
   Logical Switch Datapaths
     Ingress Table 0: Admission Control and Ingress Port Security - L2

       Ingress table 0 contains these logical flows:

              •      Priority 100 flows to drop packets with VLAN tags or
                     multicast Ethernet source addresses.

              •      Priority 50 flows that implement ingress port security
                     for each enabled logical port. For logical ports on
                     which port security is enabled, these match the inport
                     and the valid eth.src address(es) and advance only
                     those packets to the next flow table. For logical
                     ports on which port security is not enabled, these
                     advance all packets that match the inport.

              •      For logical ports of type vtep, the above logical flow
                     will also apply the action REGBIT_FROM_RAMP = 1; to
                     indicate that the packet is coming from a RAMP
                     (controller-vtep) device. Later pipelines will use
                     this information to skip sending the packet to the
                     conntrack. Packets from vtep logical ports should go
                     through the ingress pipeline only to determine the
                     output port and they should not be subjected to any
                     ACL checks. The egress pipeline will do the ACL
                     checks.

       There are no flows for disabled logical ports because the
       default-drop behavior of logical flow tables causes packets that
       ingress from them to be dropped.
     Ingress Table 1: Ingress Port Security - IP

       Ingress table 1 contains these logical flows:

              •      For each element in the port security set having one
                     or more IPv4 or IPv6 addresses (or both),

                     •      Priority 90 flow to allow IPv4 traffic if it
                            has IPv4 addresses which match the inport,
                            valid eth.src and valid ip4.src address(es).

                     •      Priority 90 flow to allow IPv4 DHCP discovery
                            traffic if it has a valid eth.src. This is
                            necessary since DHCP discovery messages are
                            sent from the unspecified IPv4 address
                            (0.0.0.0) because the IPv4 address has not yet
                            been assigned.

                     •      Priority 90 flow to allow IPv6 traffic if it
                            has IPv6 addresses which match the inport,
                            valid eth.src and valid ip6.src address(es).

                     •      Priority 90 flow to allow IPv6 DAD (Duplicate
                            Address Detection) traffic if it has a valid
                            eth.src. This is necessary since DAD requires
                            joining a multicast group and sending neighbor
                            solicitations for the newly assigned address.
                            Since no address is yet assigned, these are
                            sent from the unspecified IPv6 address (::).

                     •      Priority 80 flow to drop IP (both IPv4 and
                            IPv6) traffic which matches the inport and
                            valid eth.src.

              •      One priority-0 fallback flow that matches all packets
                     and advances to the next table.
     Ingress Table 2: Ingress Port Security - Neighbor discovery

       Ingress table 2 contains these logical flows:

              •      For each element in the port security set,

                     •      Priority 90 flow to allow ARP traffic which
                            matches the inport and valid eth.src and
                            arp.sha. If the element has one or more IPv4
                            addresses, then it also matches the valid
                            arp.spa.

                     •      Priority 90 flow to allow IPv6 Neighbor
                            Solicitation and Advertisement traffic which
                            matches the inport, valid eth.src and
                            nd.sll/nd.tll. If the element has one or more
                            IPv6 addresses, then it also matches the valid
                            nd.target address(es) for Neighbor
                            Advertisement traffic.

                     •      Priority 80 flow to drop ARP and IPv6 Neighbor
                            Solicitation and Advertisement traffic which
                            matches the inport and valid eth.src.

              •      One priority-0 fallback flow that matches all packets
                     and advances to the next table.
     Ingress Table 3: Lookup MAC address learning table

       This table looks up the MAC learning table of the logical switch
       datapath to check whether the port-MAC pair is present. A MAC is
       learnt only for logical switch VIF ports whose port security is
       disabled and which have the ’unknown’ address set.

              •      For each such logical port p whose port security is
                     disabled and which has the ’unknown’ address set, the
                     following flow is added.

                     •      Priority 100 flow with the match inport == p
                            and action reg0[11] = lookup_fdb(inport,
                            eth.src); next;

              •      One priority-0 fallback flow that matches all packets
                     and advances to the next table.
     Ingress Table 4: Learn MAC of ’unknown’ ports.

       This table learns the MAC addresses seen on the logical ports whose
       port security is disabled and which have the ’unknown’ address set,
       if the lookup_fdb action returned false in the previous table.

              •      For each such logical port p whose port security is
                     disabled and which has the ’unknown’ address set, the
                     following flow is added.

                     •      Priority 100 flow with the match inport == p &&
                            reg0[11] == 0 and action put_fdb(inport,
                            eth.src); next; which stores the port-MAC pair
                            in the MAC learning table of the logical switch
                            datapath and advances the packet to the next
                            table.

              •      One priority-0 fallback flow that matches all packets
                     and advances to the next table.
     Ingress Table 5: from-lport Pre-ACLs

       This table prepares flows for possible stateful ACL processing in
       ingress table ACLs. It contains a priority-0 flow that simply moves
       traffic to the next table. If stateful ACLs are used in the logical
       datapath, a priority-100 flow is added that sets a hint (with
       reg0[0] = 1; next;) for table Pre-stateful to send IP packets to the
       connection tracker before eventually advancing to ingress table
       ACLs. If special ports such as route ports or localnet ports can’t
       use ct(), a priority-110 flow is added to skip over stateful ACLs.
       IPv6 Neighbor Discovery and MLD traffic also skips stateful ACLs.
       For "allow-stateless" ACLs, a flow is added to bypass setting the
       hint for connection tracker processing.

       This table has a priority-110 flow with the match REGBIT_FROM_RAMP
       == 1 for all logical switch datapaths to resubmit traffic to the
       next table. REGBIT_FROM_RAMP indicates that the packet was received
       from vtep logical ports and it can be skipped from the stateful ACL
       processing in the ingress pipeline.

       This table also has a priority-110 flow with the match eth.dst == E
       for all logical switch datapaths to move traffic to the next table,
       where E is the service monitor MAC defined in the
       options:svc_monitor_mac column of the NB_Global table.
     Ingress Table 6: Pre-LB

       This table prepares flows for possible stateful load balancing
       processing in ingress table LB and Stateful. It contains a
       priority-0 flow that simply moves traffic to the next table.
       Moreover, it contains a priority-110 flow to move IPv6 Neighbor
       Discovery and MLD traffic to the next table. If load balancing
       rules with virtual IP addresses (and ports) are configured in the
       OVN_Northbound database for a logical switch datapath, a
       priority-100 flow is added with the match ip to match on IP packets
       and sets the action reg0[2] = 1; next; to act as a hint for table
       Pre-stateful to send IP packets to the connection tracker for
       packet de-fragmentation (and to possibly do DNAT for already
       established load balanced traffic) before eventually advancing to
       ingress table Stateful. If controller_event has been enabled and
       load balancing rules with empty backends have been added in
       OVN_Northbound, a priority-130 flow is added to trigger
       ovn-controller events whenever the chassis receives a packet for
       that particular VIP. If the event-elb meter has been previously
       created, it will be associated with the empty_lb logical flow.
       Prior to OVN 20.09 we were setting the reg0[0] = 1 only if the IP
       destination matches the load balancer VIP. However, this had issues
       in cases where a logical switch doesn’t have any ACLs with the
       allow-related action. To understand the issue, let us take a TCP
       load balancer - 10.0.0.10:80=10.0.0.3:80. If a logical port - p1
       with IP - 10.0.0.5 opens a TCP connection with the VIP - 10.0.0.10,
       then the packet in the ingress pipeline of ’p1’ is sent to p1’s
       conntrack zone id and the packet is load balanced to the backend -
       10.0.0.3. The reply packet from the backend lport is not sent to
       the conntrack of the backend lport’s zone id. This is fine as long
       as the packet is valid. Suppose the backend lport sends an invalid
       TCP packet (like an incorrect sequence number); the packet gets
       delivered to the lport ’p1’ without unDNATing the packet to the VIP
       - 10.0.0.10. This causes the connection to be reset by the lport
       p1’s VIF.

       We can’t fix this issue by adding a logical flow to drop ct.inv
       packets in the egress pipeline since it would drop all other
       connections not destined to the load balancers. To fix this issue,
       we send all the packets to the conntrack in the ingress pipeline if
       a load balancer is configured. We can now add an lflow to drop
       ct.inv packets.
581       This table has a priority-110 flow with the match REGBIT_FROM_RAMP == 1
582       for all logical switch datapaths to resubmit traffic to the next table.
583       REGBIT_FROM_RAMP  indicates  that packet was received from vtep logical
584       ports and it can be skipped from the load balancer  processing  in  the
585       ingress pipeline.
586
587       This table also has a priority-110 flow with the match eth.dst == E for
588       all logical switch datapaths to move traffic to the next table, where E
589       is the service monitor mac defined in the options:svc_monitor_mac col‐
590       umn of the NB_Global table.
591
592       This table also has a priority-110 flow with the match inport == I  for
593       all logical switch datapaths to move traffic to the next table, where I
594       is the peer of a logical router port. This flow is added  to  skip  the
595       connection tracking of packets which enter from a logical router data‐
596       path to a logical switch datapath.
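
       The conntrack hints above are only installed when a load balancer is
       attached to the logical switch. As a hedged illustration (the names
       sw0 and lb0 are assumptions, not taken from this manual), a TCP load
       balancer can be attached and controller events enabled with:

```shell
# Hypothetical names (sw0, lb0): enable empty-lb controller events
# globally, then attach a TCP load balancer to logical switch sw0.
ovn-nbctl set NB_Global . options:controller_event=true
ovn-nbctl lb-add lb0 10.0.0.10:80 10.0.0.3:80 tcp
ovn-nbctl ls-lb-add sw0 lb0
```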
597
598     Ingress Table 7: Pre-stateful
599
600       This table prepares flows for all possible stateful processing in  next
601       tables.  It contains a priority-0 flow that simply moves traffic to the
602       next table.
603
604              •      Priority-120 flows that send the  packets  to  connection
605                     tracker  using  ct_lb;  as the action so that the already
606                     established traffic destined to  the  load  balancer  VIP
607                     gets DNATted based on a hint provided by the previous ta‐
608                     bles (with a match for reg0[2] == 1 and on supported load
609                     balancer  protocols and address families). For IPv4 traf‐
610                     fic the flows also load the original destination  IP  and
611                     transport port in registers reg1 and reg2. For IPv6 traf‐
612                     fic the flows also load the original destination  IP  and
613                     transport port in registers xxreg1 and reg2.
614
615              •      A  priority-110  flow  sends  the  packets  to connection
616                     tracker based on a hint provided by the  previous  tables
617                     (with  a  match for reg0[2] == 1) by using the ct_lb; ac‐
618                     tion. This flow is added to handle the traffic  for  load
619                     balancer  VIPs  whose protocol is not defined (mainly for
620                     ICMP traffic).
621
622              •      A priority-100  flow  sends  the  packets  to  connection
623                     tracker  based  on a hint provided by the previous tables
624                     (with a match for reg0[0] == 1) by using the ct_next; ac‐
625                     tion.
626
627     Ingress Table 8: from-lport ACL hints
628
629       This  table  consists of logical flows that set hints (reg0 bits) to be
630       used in the next stage, in the ACL processing table, if  stateful  ACLs
631       or  load  balancers  are  configured. Multiple hints can be set for the
632       same packet. The possible hints are:
633
634reg0[7]: the packet might match an allow-related ACL  and
635                     might have to commit the connection to conntrack.
636
637reg0[8]:  the packet might match an allow-related ACL but
638                     there will be no need to commit the  connection  to  con‐
639                     ntrack because it already exists.
640
641              •      reg0[9]: the packet might match a drop/reject ACL.
642
643reg0[10]:  the  packet  might match a drop/reject ACL but
644                     the connection was previously allowed so it might have to
645                     be committed again with ct_label=1/1.
646
647       The table contains the following flows:
648
649              •      A priority-65535 flow to advance to the next table if the
650                     logical switch has no ACLs configured, otherwise a prior‐
651                     ity-0 flow to advance to the next table.
652
653              •      A priority-7 flow that matches on packets that initiate a
654                     new session. This flow sets reg0[7] and reg0[9] and  then
655                     advances to the next table.
656
657              •      A priority-6 flow that matches on packets that are in the
658                     request direction of an already existing session that has
659                     been  marked  as  blocked.  This  flow  sets  reg0[7] and
660                     reg0[9] and then advances to the next table.
661
662              •      A priority-5 flow that matches  untracked  packets.  This
663                     flow  sets  reg0[8]  and reg0[9] and then advances to the
664                     next table.
665
666              •      A priority-4 flow that matches on packets that are in the
667                     request direction of an already existing session that has
668                     not been marked as blocked. This flow  sets  reg0[8]  and
669                     reg0[10] and then advances to the next table.
670
671              •      A priority-3 flow that matches on packets that are  not
672                     part of established sessions. This flow sets reg0[9] and
673                     then advances to the next table.
674
675              •      A  priority-2  flow that matches on packets that are part
676                     of  an  established  session  that  has  been  marked  as
677                     blocked.  This flow sets reg0[9] and then advances to the
678                     next table.
679
680              •      A priority-1 flow that matches on packets that  are  part
681                     of  an  established  session  that has not been marked as
682                     blocked. This flow sets reg0[10] and then advances to the
683                     next table.
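
       The hint bits and priority ordering above can be summarized with a
       small model (illustrative Python, not OVN code; the flow matches are
       reduced to boolean conntrack flags), evaluating flows from priority 7
       down to 1:

```python
def acl_hints(trk, new, est, rpl, blocked):
    """Model of the from-lport ACL hints table: return the set of
    reg0 hint bits set by the first matching flow, checked in
    priority order (7 down to 1).  Illustrative only."""
    if trk and new:                          # priority 7: new session
        return {7, 9}
    if trk and est and not rpl and blocked:  # priority 6: blocked, request dir
        return {7, 9}
    if not trk:                              # priority 5: untracked
        return {8, 9}
    if est and not rpl and not blocked:      # priority 4: allowed, request dir
        return {8, 10}
    if not est:                              # priority 3: not established
        return {9}
    if blocked:                              # priority 2: established, blocked
        return {9}
    return {10}                              # priority 1: established, allowed
```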
684
685     Ingress table 9: from-lport ACLs
686
687       Logical flows in this table closely reproduce those in the ACL table in
688       the OVN_Northbound database for the from-lport direction. The  priority
689       values  from  the ACL table have a limited range and have 1000 added to
690       them to leave room for OVN default flows at both higher and lower  pri‐
691       orities.
692
693allow  ACLs  translate  into logical flows with the next;
694                     action. If there are any stateful ACLs on this  datapath,
695                     then allow ACLs translate to ct_commit; next; (which acts
696                     as a hint for the next tables to commit the connection to
697                     conntrack).  In  case  the  ACL  has a label then reg3 is
698                     loaded with the label value and reg0[13] bit is set to  1
699                     (which  acts  as a hint for the next tables to commit the
700                     label to conntrack).
701
702allow-related ACLs translate into logical flows with  the
703                     ct_commit(ct_label=0/1);  next;  actions  for new connec‐
704                     tions and reg0[1] = 1; next; for existing connections. In
705                     case the ACL has a label then reg3 is loaded with the la‐
706                     bel value and reg0[13] bit is set to 1 (which acts  as  a
707                     hint  for  the  next  tables  to commit the label to con‐
708                     ntrack).
709
710allow-stateless ACLs translate into  logical  flows  with
711                     the next; action.
712
713              •      reject ACLs translate into logical flows with the tcp_re‐
714                     set { output <-> inport; next(pipeline=egress,table=5); }
715                     action for TCP connections, icmp4/icmp6 action  for  UDP
716                     connections, and sctp_abort { output <-> inport;
717                     next(pipeline=egress,table=5); } action for SCTP associa‐
718                     tions.
719
720              •      Other ACLs translate to drop; for new or  untracked  con‐
721                     nections  and  ct_commit(ct_label=1/1); for known connec‐
722                     tions. Setting ct_label marks a connection  as  one  that
723                     was  previously  allowed, but should no longer be allowed
724                     due to a policy change.
725
726       This table contains a priority-65535 flow to advance to the next  table
727       if  the  logical  switch has no ACLs configured, otherwise a priority-0
728       flow to advance to the next table so that ACLs  allow  packets  by  de‐
729       fault.
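
       For example (a hedged sketch; the switch name sw0 and the match are
       assumptions), an ACL created in the Northbound database as:

```shell
# Hypothetical example: a from-lport allow-related ACL at NB priority
# 1000 on switch sw0; ovn-northd renders it in this table at logical
# flow priority 1000 + 1000 = 2000.
ovn-nbctl acl-add sw0 from-lport 1000 'ip4 && tcp.dst == 80' allow-related
```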
730
731       If  the logical datapath has a stateful ACL or a load balancer with VIP
732       configured, the following flows will also be added:
733
734              •      A priority-1 flow that sets the hint to commit IP traffic
735                     to  the  connection  tracker  (with  action  reg0[1] = 1;
736                     next;). This is needed for the default allow  policy  be‐
737                     cause,  while  the initiator’s direction may not have any
738                     stateful rules, the server’s direction may, and then its
739                     return traffic would be unknown and marked as invalid.
740
741              •      A  priority-65532 flow that allows any traffic in the re‐
742                     ply direction for a connection that has been committed to
743                     the connection tracker (i.e., established flows), as long
744                     as the committed flow does not have ct_label.blocked set.
745                     We  only  handle  traffic in the reply direction here be‐
746                     cause we want all packets going in the request  direction
747                     to  still  go  through  the flows that implement the cur‐
748                     rently defined policy based on ACLs. If a  connection  is
749                     no  longer  allowed  by policy, ct_label.blocked will get
750                     set and packets in the reply direction will no longer  be
751                     allowed, either.
752
753              •      A  priority-65532  flow  that  allows any traffic that is
754                     considered related to a committed flow in the  connection
755                     tracker  (e.g.,  an ICMP Port Unreachable from a non-lis‐
756                     tening UDP port), as long as the committed flow does  not
757                     have ct_label.blocked set.
758
759              •      A  priority-65532  flow  that drops all traffic marked by
760                     the connection tracker as invalid.
761
762              •      A priority-65532 flow that drops all traffic in the reply
763                     direction with ct_label.blocked set meaning that the con‐
764                     nection should no longer  be  allowed  due  to  a  policy
765                     change. Packets in the request direction are skipped here
766                     to let a newly created ACL re-allow this connection.
767
768              •      A priority-65532 flow that allows IPv6 Neighbor solicita‐
769                     tion, Neighbor advertisement, Router solicitation, Router
770                     advertisement and MLD packets.
771
772       If the logical datapath has any ACL or a load balancer with VIP config‐
773       ured, the following flow will also be added:
774
775              •      A priority-34000 logical flow is added for each  logical
776                     switch datapath with the match eth.dst == E to allow the
777                     service monitor reply packet destined to  ovn-controller
778                     with the action next;, where E is the service monitor mac
779                     defined in the options:svc_monitor_mac column of the
780                     NB_Global table.
781
782     Ingress Table 10: from-lport QoS Marking
783
784       Logical flows in this table closely reproduce those in  the  QoS  table
785       with  the  action  column  set  in  the OVN_Northbound database for the
786       from-lport direction.
787
788              •      For every qos_rules entry in a logical switch  with  DSCP
789                     marking  enabled,  a  flow  will be added at the priority
790                     mentioned in the QoS table.
791
792              •      One priority-0 fallback flow that matches all packets and
793                     advances to the next table.
794
795     Ingress Table 11: from-lport QoS Meter
796
797       Logical  flows  in  this table closely reproduce those in the QoS table
798       with the bandwidth column set in the OVN_Northbound  database  for  the
799       from-lport direction.
800
801              •      For every qos_rules entry in a logical switch with meter‐
802                     ing enabled, a flow will be added at  the  priority  men‐
803                     tioned in the QoS table.
804
805              •      One priority-0 fallback flow that matches all packets and
806                     advances to the next table.
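
       A QoS rule populating both tables can be created in one command (a
       hedged sketch; the switch name and match are assumptions):

```shell
# Hypothetical example on switch sw0: dscp=26 feeds the QoS Marking
# table, while rate/burst feed the QoS Meter table.
ovn-nbctl qos-add sw0 from-lport 200 'ip4.src == 10.0.0.5' dscp=26 rate=10000 burst=1000
```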
807
808     Ingress Table 12: Stateful
809
810              •      For all the configured load balancing rules for a  switch
811                     in  OVN_Northbound  database that includes a L4 port PORT
812                     of protocol P and IP address VIP, a priority-120 flow  is
813                     added. For IPv4 VIPs, the flow matches ct.new && ip  &&
814                     ip4.dst == VIP && P && P.dst == PORT. For IPv6 VIPs,  the
815                     flow matches ct.new && ip && ip6.dst == VIP && P && P.dst
816                     == PORT. The flow’s action is ct_lb(args), where  args
817                     contains  comma separated IP addresses (and optional port
818                     numbers) to load balance to. The address family of the IP
819                     addresses  of  args  is the same as the address family of
820                     VIP. If health check is enabled, then args will only con‐
821                     tain  those  endpoints whose service monitor status entry
822                     in OVN_Southbound db is either online or empty. For  IPv4
823                     traffic  the  flow also loads the original destination IP
824                     and transport port in registers reg1 and reg2.  For  IPv6
825                     traffic  the  flow also loads the original destination IP
826                     and transport port in registers xxreg1 and reg2.
827
828              •      For all the configured load balancing rules for a  switch
829                     in  OVN_Northbound  database that includes just an IP ad‐
830                     dress VIP to match on, OVN adds a priority-110 flow.  For
831                     IPv4  VIPs,  the  flow matches ct.new && ip && ip4.dst ==
832                     VIP. For IPv6 VIPs, the flow  matches  ct.new  &&  ip  &&
833                     ip6.dst  ==  VIP. The action on this flow is ct_lb(args),
834                     where args contains comma separated IP addresses  of  the
835                     same  address  family  as  VIP. For IPv4 traffic the flow
836                     also loads the original destination IP and transport port
837                     in  registers  reg1  and  reg2. For IPv6 traffic the flow
838                     also loads the original destination IP and transport port
839                     in registers xxreg1 and reg2.
840
841              •      If the load balancer is created with the --reject option
842                     and it has no active backends, a TCP reset segment  (for
843                     TCP) or an ICMP port unreachable packet (for  all  other
844                     kinds of traffic) will be sent whenever an incoming pack‐
845                     et is received for this load balancer. Please note  that
846                     using the --reject option disables the empty_lb SB  con‐
847                     troller event for this load balancer.
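
       Such a load balancer can be created with (a hedged sketch; the name
       lb0 is an assumption):

```shell
# Hypothetical example: --reject makes OVN answer with a TCP RST (or
# ICMP port unreachable) while lb0 has no active backends.
ovn-nbctl --reject lb-add lb0 10.0.0.10:80 10.0.0.3:80 tcp
```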
848
849              •      A priority-100 flow is added which commits the packet to
850                     the conntrack and sets the most  significant  32-bits  of
851                     ct_label  with  the reg3 value based on the hint provided
852                     by previous tables (with a match  for  reg0[1]  ==  1  &&
853                     reg0[13]  ==  1).  This is used by the ACLs with label to
854                     commit the label value to conntrack.
855
856              •      For ACLs without label, a second priority-100  flow  com‐
857                     mits packets to connection tracker using ct_commit; next;
858                     action based on a hint provided by  the  previous  tables
859                     (with a match for reg0[1] == 1 && reg0[13] == 0).
860
861              •      A  priority-0  flow that simply moves traffic to the next
862                     table.
863
864     Ingress Table 13: Pre-Hairpin
865
866              •      If the logical switch has  load  balancer(s)  configured,
867                     then  a  priority-100  flow is added with the match ip &&
868                     ct.trk to check if the packet needs to be hairpinned  (if
869                     after  load  balancing  the  destination  IP  matches the
870                     source IP) or not by  executing  the  actions  reg0[6]  =
871                     chk_lb_hairpin();  and reg0[12] = chk_lb_hairpin_reply();
872                     and advances the packet to the next table.
873
874              •      A priority-0 flow that simply moves traffic to  the  next
875                     table.
876
877     Ingress Table 14: Nat-Hairpin
878
879              •      If  the  logical  switch has load balancer(s) configured,
880                     then a priority-100 flow is added with the  match  ip  &&
881                     ct.new && ct.trk && reg0[6] == 1 which hairpins the traf‐
882                     fic by NATting source IP to the load balancer VIP by exe‐
883                     cuting  the action ct_snat_to_vip and advances the packet
884                     to the next table.
885
886              •      If the logical switch has  load  balancer(s)  configured,
887                     then  a  priority-100  flow is added with the match ip &&
888                     ct.est && ct.trk && reg0[6] == 1 which hairpins the traf‐
889                     fic by NATting source IP to the load balancer VIP by exe‐
890                     cuting the action ct_snat and advances the packet to  the
891                     next table.
892
893              •      If  the  logical  switch has load balancer(s) configured,
894                     then a priority-90 flow is added with  the  match  ip  &&
895                     reg0[12]  == 1 which matches on the replies of hairpinned
896                     traffic (i.e., destination IP is VIP, source  IP  is  the
897                     backend IP and source L4 port is backend port for L4 load
898                     balancers) and executes ct_snat and advances  the  packet
899                     to the next table.
900
901              •      A  priority-0  flow that simply moves traffic to the next
902                     table.
903
904     Ingress Table 15: Hairpin
905
906              •      A priority-1 flow that hairpins traffic matched by  non-
907                     default flows in the Pre-Hairpin table.  Hairpinning  is
908                     done at L2: Ethernet addresses are swapped and the pack‐
909                     ets are looped back on the input port.
910
911              •      A  priority-0  flow that simply moves traffic to the next
912                     table.
913
914     Ingress Table 16: ARP/ND responder
915
916       This table implements ARP/ND responder in a logical  switch  for  known
917       IPs. The advantage of the ARP responder flow is to limit ARP broadcasts
918       by locally responding to ARP requests without the need to send to other
919       hypervisors. One common case is when the inport is a logical port asso‐
920       ciated with a VIF and the broadcast is responded to on the local hyper‐
921       visor  rather  than broadcast across the whole network and responded to
922       by the destination VM. This behavior is proxy ARP.
923
924       ARP requests arrive from VMs from a logical switch inport of  type  de‐
925       fault.  For  this  case,  the logical switch proxy ARP rules can be for
926       other VMs or logical router ports. Logical switch proxy ARP  rules  may
927       be  programmed  both  for  mac binding of IP addresses on other logical
928       switch VIF ports (which are of the default logical  switch  port  type,
929       representing connectivity to VMs or containers), and for mac binding of
930       IP addresses on logical switch router type  ports,  representing  their
931       logical  router  port  peers. In order to support proxy ARP for logical
932       router ports, an IP address must be configured on  the  logical  switch
933       router  type port, with the same value as the peer logical router port.
934       The configured MAC addresses must match as well. When a VM sends an ARP
935       request  for  a  distributed logical router port and if the peer router
936       type port of the attached logical switch does not have  an  IP  address
937       configured,  the  ARP  request will be broadcast on the logical switch.
938       One of the copies of the ARP request will go through the logical switch
939       router  type  port  to  the  logical router datapath, where the logical
940       router ARP responder will generate a reply. The MAC binding of  a  dis‐
941       tributed  logical router, once learned by an associated VM, is used for
942       all that VM’s communication needing routing. Hence, the action of a  VM
943       re-arping  for  the  mac  binding  of the logical router port should be
944       rare.
945
946       Logical switch ARP responder proxy ARP rules can also be hit  when  re‐
947       ceiving ARP requests externally on a L2 gateway port. In this case, the
948       hypervisor acting as an L2 gateway, responds to the ARP request on  be‐
949       half of a destination VM.
950
951       Note  that  ARP requests received from localnet or vtep logical inports
952       can either go directly to VMs, in which case the VM responds,  or  hit
953       an ARP responder for a logical router port if the packet is used to re‐
954       solve a logical router port next hop address. In either case,  logical
955       switch ARP responder rules will not be hit. This table contains  these
956       logical flows:
957
958              •      Priority-100 flows to skip the ARP responder if inport is
959                     of  type  localnet  or  vtep and advances directly to the
960                     next table. ARP requests sent to localnet or  vtep  ports
961                     can be received by multiple hypervisors. Now, because the
962                     same mac binding rules are downloaded to all hypervisors,
963                     each  of the multiple hypervisors will respond. This will
964                     confuse L2 learning on the source of  the  ARP  requests.
965                     ARP requests received on an inport of type router are not
966                     expected to hit any logical switch ARP  responder  flows.
967                     However,  no  skip flows are installed for these packets,
968                     as there would be some additional flow cost for this  and
969                     the value appears limited.
970
971              •      If inport V is of type virtual, a priority-100  logical
972                     flow is added for each P configured in the options:vir‐
973                     tual-parents column with the match
974
975                     inport == P && ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP))
976
977
978                     and applies the action
979
980                     bind_vport(V, inport);
981
982
983                     and advances the packet to the next table.
984
985                     Where  VIP is the virtual ip configured in the column op‐
986                     tions:virtual-ip.
987
988              •      Priority-50 flows that match ARP requests to  each  known
989                     IP  address  A  of every logical switch port, and respond
990                     with ARP replies directly with corresponding Ethernet ad‐
991                     dress E:
992
993                     eth.dst = eth.src;
994                     eth.src = E;
995                     arp.op = 2; /* ARP reply. */
996                     arp.tha = arp.sha;
997                     arp.sha = E;
998                     arp.tpa = arp.spa;
999                     arp.spa = A;
1000                     outport = inport;
1001                     flags.loopback = 1;
1002                     output;
1003
1004
1005                     These  flows  are  omitted  for logical ports (other than
1006                     router ports or localport ports) that  are  down  (unless
1007                     ignore_lsp_down  is  configured as true in options column
1008                     of NB_Global table of the Northbound database), for logi‐
1009                     cal  ports  of  type virtual, for logical ports with ’un‐
1010                     known’ address set and for logical  ports  of  a  logical
1011                     switch configured with other_config:vlan-passthru=true.
1012
1013                     The  above  ARP responder flows are added for the list of
1014                     IPv4 addresses if defined in options:arp_proxy column  of
1015                     Logical_Switch_Port  table  for  logical  switch ports of
1016                     type router.
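
       The ARP responder action sequence above amounts to a simple field re‐
       write. A sketch in Python (illustrative only, not OVN code; the field
       names mirror the logical actions) for a known IP A owned by Ethernet
       address E:

```python
def arp_responder_reply(request, A, E):
    """Model of the ARP responder actions: swap Ethernet addresses,
    turn the request into a reply for IP A with Ethernet address E,
    and loop the packet back on the inport.  Illustrative only."""
    out = dict(request)
    out["eth.dst"] = request["eth.src"]
    out["eth.src"] = E
    out["arp.op"] = 2                    # ARP reply
    out["arp.tha"] = request["arp.sha"]
    out["arp.sha"] = E
    out["arp.tpa"] = request["arp.spa"]
    out["arp.spa"] = A
    out["outport"] = request["inport"]
    out["flags.loopback"] = 1
    return out
```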
1017
1018              •      Priority-50 flows that match IPv6 ND  neighbor  solicita‐
1019                     tions  to each known IP address A (and A’s solicited node
1020                     address) of every logical  switch  port  except  of  type
1021                     router, and respond with neighbor advertisements directly
1022                     with corresponding Ethernet address E:
1023
1024                     nd_na {
1025                         eth.src = E;
1026                         ip6.src = A;
1027                         nd.target = A;
1028                         nd.tll = E;
1029                         outport = inport;
1030                         flags.loopback = 1;
1031                         output;
1032                     };
1033
1034
1035                     Priority-50 flows that match IPv6 ND  neighbor  solicita‐
1036                     tions  to each known IP address A (and A’s solicited node
1037                     address) of logical switch port of type router,  and  re‐
1038                     spond  with  neighbor advertisements directly with corre‐
1039                     sponding Ethernet address E:
1040
1041                     nd_na_router {
1042                         eth.src = E;
1043                         ip6.src = A;
1044                         nd.target = A;
1045                         nd.tll = E;
1046                         outport = inport;
1047                         flags.loopback = 1;
1048                         output;
1049                     };
1050
1051
1052                     These flows are omitted for  logical  ports  (other  than
1053                     router  ports  or  localport ports) that are down (unless
1054                     ignore_lsp_down is configured as true in  options  column
1055                     of NB_Global table of the Northbound database), for logi‐
1056                     cal ports of type virtual and for logical ports with ’un‐
1057                     known’ address set.
1058
1059              •      Priority-100  flows  with match criteria like the ARP and
1060                     ND flows above, except that they only match packets  from
1061                     the  inport  that owns the IP addresses in question, with
1062                     action next;. These flows prevent OVN from  replying  to,
1063                     for  example,  an ARP request emitted by a VM for its own
1064                     IP address. A VM only makes this kind of request  to  at‐
1065                     tempt  to  detect  a  duplicate IP address assignment, so
1066                     sending a reply will prevent the VM from accepting the IP
1067                     address that it owns.
1068
1069                     In  place  of  next;, it would be reasonable to use drop;
1070                     for the flows’ actions. If everything is working as it is
1071                     configured,  then  this would produce equivalent results,
1072                     since no host should reply to the request. But ARPing for
1073                     one’s  own  IP  address  is intended to detect situations
1074                     where the network is not working as configured, so  drop‐
1075                     ping the request would frustrate that intent.
1076
1077              •      For  each  SVC_MON_SRC_IP  defined  in  the  value of the
1078                     ip_port_mappings:ENDPOINT_IP column of Load_Balancer  ta‐
1079                     ble, a priority-110 logical flow is added with the match
1080                     arp.tpa == SVC_MON_SRC_IP && arp.op == 1  and  applies
1081                     the action
1082
1083                     eth.dst = eth.src;
1084                     eth.src = E;
1085                     arp.op = 2; /* ARP reply. */
1086                     arp.tha = arp.sha;
1087                     arp.sha = E;
1088                     arp.tpa = arp.spa;
1089                     arp.spa = A;
1090                     outport = inport;
1091                     flags.loopback = 1;
1092                     output;
1093
1094
1095                     where  E is the service monitor source mac defined in the
1096                     options:svc_monitor_mac column in  the  NB_Global  table.
1097                     This mac is used as the source mac in the service monitor
1098                     packets for the load balancer endpoint IP health checks.
1099
1100                     SVC_MON_SRC_IP is used as the source ip  in  the  service
1101                     monitor  IPv4  packets  for the load balancer endpoint IP
1102                     health checks.
1103
1104                     These flows are required if an ARP request  is  sent  for
1105                     the IP SVC_MON_SRC_IP.
1106
1107              •      For each VIP configured in the table Forwarding_Group,
1108                     a priority-50 logical flow is added with the match
1109                     arp.tpa == vip && arp.op == 1
1110                     and applies the action
1111
1112                     eth.dst = eth.src;
1113                     eth.src = E;
1114                     arp.op = 2; /* ARP reply. */
1115                     arp.tha = arp.sha;
1116                     arp.sha = E;
1117                     arp.tpa = arp.spa;
1118                     arp.spa = A;
1119                     outport = inport;
1120                     flags.loopback = 1;
1121                     output;
1122
1123
1124                     where E is the forwarding group’s mac  defined  in  the
1125                     vmac column.
1126
1127                     A is used as either the destination ip for load balancing
1128                     traffic  to child ports or as nexthop to hosts behind the
1129                     child ports.
1130
1131                     These flows are required to respond to ARP requests
1132                     sent for the IP vip.
1133
1134              •      One priority-0 fallback flow that matches all packets and
1135                     advances to the next table.
1136
     Ingress Table 17: DHCP option processing

       This table adds the DHCPv4 options to a DHCPv4 packet from the
       logical ports configured with IPv4 address(es) and DHCPv4 options,
       and similarly for DHCPv6 options. This table also adds flows for the
       logical ports of type external.

              •      A priority-100 logical flow is added for these logical
                     ports which matches the IPv4 packet with udp.src == 68
                     and udp.dst == 67 and applies the action put_dhcp_opts
                     and advances the packet to the next table.

                     reg0[3] = put_dhcp_opts(offer_ip = ip, options...);
                     next;

                     For DHCPDISCOVER and DHCPREQUEST, this transforms the
                     packet into a DHCP reply, adds the DHCP offer IP ip and
                     options to the packet, and stores 1 into reg0[3]. For
                     other kinds of packets, it just stores 0 into reg0[3].
                     Either way, it continues to the next table.

              •      A priority-100 logical flow is added for these logical
                     ports which matches the IPv6 packet with udp.src == 546
                     and udp.dst == 547 and applies the action
                     put_dhcpv6_opts and advances the packet to the next
                     table.

                     reg0[3] = put_dhcpv6_opts(ia_addr = ip, options...);
                     next;

                     For DHCPv6 Solicit/Request/Confirm packets, this
                     transforms the packet into a DHCPv6 Advertise/Reply,
                     adds the DHCPv6 offer IP ip and options to the packet,
                     and stores 1 into reg0[3]. For other kinds of packets,
                     it just stores 0 into reg0[3]. Either way, it continues
                     to the next table.

              •      A priority-0 flow that matches all packets and advances
                     to the next table.

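       For reference, a minimal sketch of configuring the DHCPv4 options
       that these flows act on (the port name, subnet, and addresses below
       are hypothetical):

```shell
# Create a DHCP_Options row for the subnet and capture its UUID.
uuid=$(ovn-nbctl create DHCP_Options cidr=192.168.1.0/24)

# Populate the mandatory options; values containing ':' must be quoted.
ovn-nbctl set DHCP_Options "$uuid" \
    options:server_id=192.168.1.1 \
    options:server_mac='"00:00:00:00:01:01"' \
    options:lease_time=3600 \
    options:router=192.168.1.1

# Attach the options to logical switch port lsp1; ovn-northd then
# generates the put_dhcp_opts flows described above for this port.
ovn-nbctl lsp-set-dhcpv4-options lsp1 "$uuid"
```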
     Ingress Table 18: DHCP responses

       This table implements a DHCP responder for the DHCP replies generated
       by the previous table.

              •      A priority-100 logical flow is added for the logical
                     ports configured with DHCPv4 options which matches IPv4
                     packets with udp.src == 68 && udp.dst == 67 && reg0[3]
                     == 1 and responds back to the inport after applying
                     these actions. If reg0[3] is set to 1, it means that
                     the action put_dhcp_opts was successful.

                     eth.dst = eth.src;
                     eth.src = E;
                     ip4.src = S;
                     udp.src = 67;
                     udp.dst = 68;
                     outport = P;
                     flags.loopback = 1;
                     output;

                     where E is the server MAC address and S is the server
                     IPv4 address defined in the DHCPv4 options. Note that
                     the ip4.dst field is handled by put_dhcp_opts.

                     (This terminates ingress packet processing; the packet
                     does not go to the next ingress table.)

              •      A priority-100 logical flow is added for the logical
                     ports configured with DHCPv6 options which matches IPv6
                     packets with udp.src == 546 && udp.dst == 547 &&
                     reg0[3] == 1 and responds back to the inport after
                     applying these actions. If reg0[3] is set to 1, it
                     means that the action put_dhcpv6_opts was successful.

                     eth.dst = eth.src;
                     eth.src = E;
                     ip6.dst = A;
                     ip6.src = S;
                     udp.src = 547;
                     udp.dst = 546;
                     outport = P;
                     flags.loopback = 1;
                     output;

                     where E is the server MAC address, S is the server IPv6
                     LLA address generated from the server_id defined in the
                     DHCPv6 options, and A is the IPv6 address defined in
                     the logical port’s addresses column.

                     (This terminates packet processing; the packet does not
                     go on to the next ingress table.)

              •      A priority-0 flow that matches all packets and advances
                     to the next table.

     Ingress Table 19: DNS Lookup

       This table looks up and resolves the DNS names to the corresponding
       configured IP address(es).

              •      A priority-100 logical flow for each logical switch
                     datapath if it is configured with DNS records, which
                     matches the IPv4 and IPv6 packets with udp.dst == 53
                     and applies the action dns_lookup and advances the
                     packet to the next table.

                     reg0[4] = dns_lookup(); next;

                     For valid DNS packets, this transforms the packet into
                     a DNS reply if the DNS name can be resolved, and stores
                     1 into reg0[4]. For failed DNS resolution or other
                     kinds of packets, it just stores 0 into reg0[4]. Either
                     way, it continues to the next table.

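       A sketch of configuring the DNS records these flows resolve (the
       switch name, hostname, and address below are hypothetical):

```shell
# Create a DNS row, add a name-to-address record, and attach it to
# switch sw0; ovn-northd then generates the dns_lookup flows above.
dns=$(ovn-nbctl create DNS)
ovn-nbctl set DNS "$dns" records:vm1.example.org=10.0.0.4
ovn-nbctl add Logical_Switch sw0 dns_records "$dns"
```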
     Ingress Table 20: DNS Responses

       This table implements a DNS responder for the DNS replies generated
       by the previous table.

              •      A priority-100 logical flow for each logical switch
                     datapath if it is configured with DNS records, which
                     matches the IPv4 and IPv6 packets with udp.dst == 53 &&
                     reg0[4] == 1 and responds back to the inport after
                     applying these actions. If reg0[4] is set to 1, it
                     means that the action dns_lookup was successful.

                     eth.dst <-> eth.src;
                     ip4.src <-> ip4.dst;
                     udp.dst = udp.src;
                     udp.src = 53;
                     outport = P;
                     flags.loopback = 1;
                     output;

                     (This terminates ingress packet processing; the packet
                     does not go to the next ingress table.)

     Ingress Table 21: External ports

       Traffic from the external logical ports enters the ingress datapath
       pipeline via the localnet port. This table adds the below logical
       flows to handle the traffic from these ports.

              •      A priority-100 flow is added for each external logical
                     port which doesn’t reside on a chassis to drop the
                     ARP/IPv6 NS request to the router IP(s) (of the logical
                     switch) which matches on the inport of the external
                     logical port and the valid eth.src address(es) of the
                     external logical port.

                     This flow guarantees that the ARP/NS request to the
                     router IP address from the external ports is responded
                     to only by the chassis which has claimed these external
                     ports. All the other chassis drop these packets.

                     A priority-100 flow is added for each external logical
                     port which doesn’t reside on a chassis to drop any
                     packet destined to the router mac - with the match
                     inport == external && eth.src == E && eth.dst == R &&
                     !is_chassis_resident("external") where E is the
                     external port mac and R is the router port mac.

              •      A priority-0 flow that matches all packets and advances
                     to the next table.

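       As an illustration (the port, group, and chassis names below are
       hypothetical), an external port is typically claimed by a chassis
       through an HA chassis group referenced from the port’s
       ha_chassis_group column:

```shell
# Make lsp-ext an external port.
ovn-nbctl lsp-set-type lsp-ext external

# Create an HA chassis group and add chassis hv1 with priority 30; the
# highest-priority live chassis claims the external ports that reference
# this group.
ovn-nbctl ha-chassis-group-add hagrp1
ovn-nbctl ha-chassis-group-add-chassis hagrp1 hv1 30
```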
     Ingress Table 22: Destination Lookup

       This table implements switching behavior. It contains these logical
       flows:

              •      A priority-110 flow with the match eth.src == E for all
                     logical switch datapaths and applies the action
                     handle_svc_check(inport), where E is the service
                     monitor mac defined in the options:svc_monitor_mac
                     column of the NB_Global table.

              •      A priority-100 flow that punts all IGMP/MLD packets to
                     ovn-controller if multicast snooping is enabled on the
                     logical switch. The flow also forwards the IGMP/MLD
                     packets to the MC_MROUTER_STATIC multicast group, which
                     ovn-northd populates with all the logical ports that
                     have options:mcast_flood_reports=’true’.

              •      Priority-90 flows that forward registered IP multicast
                     traffic to their corresponding multicast group, which
                     ovn-northd creates based on learnt IGMP_Group entries.
                     The flows also forward packets to the MC_MROUTER_FLOOD
                     multicast group, which ovn-northd populates with all
                     the logical ports that are connected to logical routers
                     with options:mcast_relay=’true’.

              •      A priority-85 flow that forwards all IP multicast
                     traffic destined to 224.0.0.X to the MC_FLOOD multicast
                     group, which ovn-northd populates with all enabled
                     logical ports.

              •      A priority-85 flow that forwards all IP multicast
                     traffic destined to reserved multicast IPv6 addresses
                     (RFC 4291, 2.7.1, e.g., Solicited-Node multicast) to
                     the MC_FLOOD multicast group, which ovn-northd
                     populates with all enabled logical ports.

              •      A priority-80 flow that forwards all unregistered IP
                     multicast traffic to the MC_STATIC multicast group,
                     which ovn-northd populates with all the logical ports
                     that have options:mcast_flood=’true’. The flow also
                     forwards unregistered IP multicast traffic to the
                     MC_MROUTER_FLOOD multicast group, which ovn-northd
                     populates with all the logical ports connected to
                     logical routers that have options:mcast_relay=’true’.

              •      A priority-80 flow that drops all unregistered IP
                     multicast traffic if other_config:mcast_snoop=’true’
                     and other_config:mcast_flood_unregistered=’false’ and
                     the switch is not connected to a logical router that
                     has options:mcast_relay=’true’ and the switch doesn’t
                     have any logical port with options:mcast_flood=’true’.

              •      Priority-80 flows for each IP address/VIP/NAT address
                     owned by a router port connected to the switch. These
                     flows match ARP requests and ND packets for the
                     specific IP addresses. Matched packets are forwarded
                     only to the router that owns the IP address and to the
                     MC_FLOOD_L2 multicast group which contains all
                     non-router logical ports.

              •      Priority-90 flows for each IP address/VIP/NAT address
                     configured outside its owning router port’s subnet.
                     These flows match ARP requests and ND packets for the
                     specific IP addresses. Matched packets are forwarded to
                     the MC_FLOOD multicast group which contains all
                     connected logical ports.

              •      Priority-75 flows for each port connected to a logical
                     router matching self originated ARP request/ND packets.
                     These packets are flooded to the MC_FLOOD_L2 multicast
                     group which contains all non-router logical ports.

              •      A priority-70 flow that outputs all packets with an
                     Ethernet broadcast or multicast eth.dst to the MC_FLOOD
                     multicast group.

              •      One priority-50 flow that matches each known Ethernet
                     address against eth.dst and outputs the packet to the
                     single associated output port.

                     For the Ethernet address on a logical switch port of
                     type router, when that logical switch port’s addresses
                     column is set to router and the connected logical
                     router port has a gateway chassis:

                     •      The flow for the connected logical router port’s
                            Ethernet address is only programmed on the
                            gateway chassis.

                     •      If the logical router has rules specified in nat
                            with external_mac, then those addresses are also
                            used to populate the switch’s destination lookup
                            on the chassis where logical_port is resident.

                     For the Ethernet address on a logical switch port of
                     type router, when that logical switch port’s addresses
                     column is set to router and the connected logical
                     router port specifies a reside-on-redirect-chassis and
                     the logical router to which the connected logical
                     router port belongs has a distributed gateway LRP:

                     •      The flow for the connected logical router port’s
                            Ethernet address is only programmed on the
                            gateway chassis.

                     For each forwarding group configured on the logical
                     switch datapath, a priority-50 flow that matches on
                     eth.dst == VIP with an action of
                     fwd_group(childports=args), where args contains comma
                     separated logical switch child ports to load balance
                     to. If liveness is enabled, then the action also
                     includes liveness=true.

              •      One priority-0 fallback flow that matches all packets
                     with the action outport = get_fdb(eth.dst); next;. The
                     action get_fdb gets the port for the eth.dst in the MAC
                     learning table of the logical switch datapath. If there
                     is no entry for eth.dst in the MAC learning table, then
                     it stores none in the outport.

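       A sketch of the multicast snooping configuration these flows depend
       on (the switch and port names below are hypothetical):

```shell
# Enable IGMP/MLD snooping on sw0 and stop flooding unregistered
# multicast traffic; this activates the priority-100/90/80 flows above.
ovn-nbctl set Logical_Switch sw0 \
    other_config:mcast_snoop=true \
    other_config:mcast_flood_unregistered=false

# Flood IGMP/MLD reports toward a designated port (added to the
# MC_MROUTER_STATIC multicast group by ovn-northd).
ovn-nbctl lsp-set-options lsp-uplink mcast_flood_reports=true
```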
     Ingress Table 24: Destination unknown

       This table handles the packets whose destination was not found or
       looked up in the MAC learning table of the logical switch datapath.
       It contains the following flows.

              •      If the logical switch has logical ports with ’unknown’
                     addresses set, then the below logical flow is added:

                     •      A priority-50 flow with the match outport ==
                            none that outputs the packets to the MC_UNKNOWN
                            multicast group, which ovn-northd populates with
                            all enabled logical ports that accept unknown
                            destination packets. As a small optimization, if
                            no logical ports accept unknown destination
                            packets, ovn-northd omits this multicast group
                            and logical flow.

                     If the logical switch has no logical ports with
                     ’unknown’ address set, then the below logical flow is
                     added:

                     •      A priority-50 flow with the match outport ==
                            none that drops the packets.

              •      One priority-0 fallback flow that outputs the packet to
                     the egress stage with the outport learnt from the
                     get_fdb action.

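       For example (the port name and addresses below are hypothetical), a
       port opts in to unknown-destination traffic via its addresses
       column:

```shell
# Let lsp1 receive packets with unknown destination MACs in addition to
# its own address; ovn-northd then adds lsp1 to the MC_UNKNOWN group.
ovn-nbctl lsp-set-addresses lsp1 "00:00:00:00:00:01 10.0.0.11" unknown
```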
     Egress Table 0: Pre-LB

       This table is similar to ingress table Pre-LB. It contains a
       priority-0 flow that simply moves traffic to the next table. Moreover
       it contains a priority-110 flow to move IPv6 Neighbor Discovery
       traffic to the next table. If any load balancing rules exist for the
       datapath, a priority-100 flow is added with a match of ip and action
       of reg0[2] = 1; next; to act as a hint for table Pre-stateful to send
       IP packets to the connection tracker for packet de-fragmentation and
       possibly DNAT the destination VIP to one of the selected backends for
       already committed load balanced traffic.

       This table also has a priority-110 flow with the match eth.src == E
       for all logical switch datapaths to move traffic to the next table,
       where E is the service monitor mac defined in the
       options:svc_monitor_mac column of the NB_Global table.

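       The service monitor mac referenced throughout these tables is a
       single global setting; a sketch of configuring it (the MAC value
       below is arbitrary):

```shell
# Set the source MAC used for service monitor health-check packets; the
# priority-110 eth.src == E flows above match on this address. Values
# containing ':' must be quoted for the OVSDB map column.
ovn-nbctl set NB_Global . options:svc_monitor_mac='"32:31:32:31:32:31"'
```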
     Egress Table 1: to-lport Pre-ACLs

       This is similar to ingress table Pre-ACLs except for to-lport
       traffic.

       This table also has a priority-110 flow with the match eth.src == E
       for all logical switch datapaths to move traffic to the next table,
       where E is the service monitor mac defined in the
       options:svc_monitor_mac column of the NB_Global table.

       This table also has a priority-110 flow with the match outport == I
       for all logical switch datapaths to move traffic to the next table,
       where I is the peer of a logical router port. This flow is added to
       skip the connection tracking of packets which will be entering the
       logical router datapath from the logical switch datapath for routing.

     Egress Table 2: Pre-stateful

       This is similar to ingress table Pre-stateful. This table adds the
       below 3 logical flows.

              •      A priority-120 flow that sends the packets to the
                     connection tracker using ct_lb; as the action so that
                     the already established traffic gets unDNATted from the
                     backend IP to the load balancer VIP based on a hint
                     provided by the previous tables with a match for
                     reg0[2] == 1. If the packet was not DNATted earlier,
                     then ct_lb functions like ct_next.

              •      A priority-100 flow that sends the packets to the
                     connection tracker based on a hint provided by the
                     previous tables (with a match for reg0[0] == 1) by
                     using the ct_next; action.

              •      A priority-0 flow that matches all packets to advance
                     to the next table.

     Egress Table 3: from-lport ACL hints

       This is similar to ingress table ACL hints.

     Egress Table 4: to-lport ACLs

       This is similar to ingress table ACLs except for to-lport ACLs.

       In addition, the following flows are added.

              •      A priority 34000 logical flow is added for each logical
                     port which has DHCPv4 options defined to allow the
                     DHCPv4 reply packet and which has DHCPv6 options
                     defined to allow the DHCPv6 reply packet from Ingress
                     Table 18: DHCP responses.

              •      A priority 34000 logical flow is added for each logical
                     switch datapath configured with DNS records with the
                     match udp.dst == 53 to allow the DNS reply packet from
                     Ingress Table 20: DNS Responses.

              •      A priority 34000 logical flow is added for each logical
                     switch datapath with the match eth.src == E to allow
                     the service monitor request packet generated by
                     ovn-controller with the action next, where E is the
                     service monitor mac defined in the
                     options:svc_monitor_mac column of the NB_Global table.

     Egress Table 5: to-lport QoS Marking

       This is similar to ingress table QoS marking except that it applies
       to to-lport QoS rules.

     Egress Table 6: to-lport QoS Meter

       This is similar to ingress table QoS meter except that it applies to
       to-lport QoS rules.

     Egress Table 7: Stateful

       This is similar to ingress table Stateful except that there are no
       rules added for load balancing new connections.

     Egress Table 8: Egress Port Security - IP

       This is similar to the port security logic in table Ingress Port
       Security - IP except that outport, eth.dst, ip4.dst and ip6.dst are
       checked instead of inport, eth.src, ip4.src and ip6.src.

     Egress Table 9: Egress Port Security - L2

       This is similar to the ingress port security logic in ingress table
       Admission Control and Ingress Port Security - L2, but with important
       differences. Most obviously, outport and eth.dst are checked instead
       of inport and eth.src. Second, packets directed to broadcast or
       multicast eth.dst are always accepted instead of being subject to
       the port security rules; this is implemented through a priority-100
       flow that matches on eth.mcast with action output;. Moreover, to
       ensure that even broadcast and multicast packets are not delivered
       to disabled logical ports, a priority-150 flow for each disabled
       logical outport overrides the priority-100 flow with a drop; action.
       Finally, if egress QoS has been enabled on a localnet port, the
       outgoing queue id is set through the set_queue action. Please
       remember to mark the corresponding physical interface with
       ovn-egress-iface set to true in external_ids.

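       For example, on the chassis carrying the localnet traffic (the
       interface name below is hypothetical):

```shell
# Mark the physical interface so that set_queue-based egress QoS on the
# localnet port takes effect, as required by the paragraph above.
ovs-vsctl set Interface eth1 external-ids:ovn-egress-iface=true
```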
   Logical Router Datapaths
       Logical router datapaths will only exist for Logical_Router rows in
       the OVN_Northbound database that do not have enabled set to false.

     Ingress Table 0: L2 Admission Control

       This table drops packets that the router shouldn’t see at all based
       on their Ethernet headers. It contains the following flows:

              •      Priority-100 flows to drop packets with VLAN tags or
                     multicast Ethernet source addresses.

              •      For each enabled router port P with Ethernet address E,
                     a priority-50 flow that matches inport == P &&
                     (eth.mcast || eth.dst == E), stores the router port
                     Ethernet address and advances to the next table, with
                     action xreg0[0..47]=E; next;.

                     For the gateway port on a distributed logical router
                     (where one of the logical router ports specifies a
                     gateway chassis), the above flow matching eth.dst == E
                     is only programmed on the gateway port instance on the
                     gateway chassis.

                     For a distributed logical router or for a gateway
                     router where the port is configured with
                     options:gateway_mtu, the action of the above flow is
                     modified by adding check_pkt_larger in order to mark
                     the packet, setting REGBIT_PKT_LARGER if the size is
                     greater than the MTU.

              •      For each dnat_and_snat NAT rule on a distributed router
                     that specifies an external Ethernet address E, a
                     priority-50 flow that matches inport == GW && eth.dst
                     == E, where GW is the logical router gateway port, with
                     action xreg0[0..47]=E; next;.

                     This flow is only programmed on the gateway port
                     instance on the chassis where the logical_port
                     specified in the NAT rule resides.

       Other packets are implicitly dropped.

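       A sketch of enabling the gateway MTU handling described above (the
       router port name and MTU value below are hypothetical):

```shell
# Configure a gateway MTU on the router port; ovn-northd then adds
# check_pkt_larger to the admission flow and, in the IP input table,
# the ICMP "frag needed" / "packet too big" handling for oversized
# packets.
ovn-nbctl set Logical_Router_Port lrp0 options:gateway_mtu=1500
```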
     Ingress Table 1: Neighbor lookup

       For ARP and IPv6 Neighbor Discovery packets, this table looks into
       the MAC_Binding records to determine if OVN needs to learn the mac
       bindings. The following flows are added:

              •      For each router port P that owns IP address A, which
                     belongs to subnet S with prefix length L, if the option
                     always_learn_from_arp_request is true for this router,
                     a priority-100 flow is added which matches inport == P
                     && arp.spa == S/L && arp.op == 1 (ARP request) with the
                     following actions:

                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
                     next;

                     If the option always_learn_from_arp_request is false,
                     the following two flows are added.

                     A priority-110 flow is added which matches inport == P
                     && arp.spa == S/L && arp.tpa == A && arp.op == 1 (ARP
                     request) with the following actions:

                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
                     reg9[3] = 1;
                     next;

                     A priority-100 flow is added which matches inport == P
                     && arp.spa == S/L && arp.op == 1 (ARP request) with the
                     following actions:

                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
                     reg9[3] = lookup_arp_ip(inport, arp.spa);
                     next;

                     If the logical router port P is a distributed gateway
                     router port, the additional match
                     is_chassis_resident(cr-P) is added for all these flows.

              •      A priority-100 flow which matches on ARP reply packets
                     and applies the following actions if the option
                     always_learn_from_arp_request is true:

                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
                     next;

                     If the option always_learn_from_arp_request is false,
                     the above actions will be:

                     reg9[2] = lookup_arp(inport, arp.spa, arp.sha);
                     reg9[3] = 1;
                     next;

              •      A priority-100 flow which matches on IPv6 Neighbor
                     Discovery advertisement packets and applies the
                     following actions if the option
                     always_learn_from_arp_request is true:

                     reg9[2] = lookup_nd(inport, nd.target, nd.tll);
                     next;

                     If the option always_learn_from_arp_request is false,
                     the above actions will be:

                     reg9[2] = lookup_nd(inport, nd.target, nd.tll);
                     reg9[3] = 1;
                     next;

              •      A priority-100 flow which matches on IPv6 Neighbor
                     Discovery solicitation packets and applies the
                     following actions if the option
                     always_learn_from_arp_request is true:

                     reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
                     next;

                     If the option always_learn_from_arp_request is false,
                     the above actions will be:

                     reg9[2] = lookup_nd(inport, ip6.src, nd.sll);
                     reg9[3] = lookup_nd_ip(inport, ip6.src);
                     next;

              •      A priority-0 fallback flow that matches all packets
                     and applies the action reg9[2] = 1; next;, advancing
                     the packet to the next table.

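       The option that selects between these flow variants is set per
       router, for example (the router name below is hypothetical):

```shell
# Restrict neighbor learning: only learn from ARP/ND packets whose
# target is one of the router's own IP addresses, enabling the
# priority-110/100 flow pair described above.
ovn-nbctl set Logical_Router lr0 options:always_learn_from_arp_request=false
```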
1711     Ingress Table 2: Neighbor learning
1712
1713       This table adds flows to learn the mac bindings from the ARP  and  IPv6
1714       Neighbor  Solicitation/Advertisement  packets if it is needed according
1715       to the lookup results from the previous stage.
1716
1717       reg9[2] will be 1 if the lookup_arp/lookup_nd in the previous table was
1718       successful  or  skipped,  meaning no need to learn mac binding from the
1719       packet.
1720
1721       reg9[3] will be 1 if the lookup_arp_ip/lookup_nd_ip in the previous ta‐
1722       ble  was  successful  or skipped, meaning it is ok to learn mac binding
1723       from the packet (if reg9[2] is 0).
1724
1725              •      A priority-100 flow  with  the  match  reg9[2]  ==  1  ||
1726                     reg9[3] == 0 and advances the packet to the next table as
1727                     there is no need to learn the neighbor.
1728
1729              •      A priority-90 flow with the match arp and applies the ac‐
1730                     tion put_arp(inport, arp.spa, arp.sha); next;
1731
1732              •      A  priority-90  flow with the match nd_na and applies the
1733                     action put_nd(inport, nd.target, nd.tll); next;
1734
1735              •      A priority-90 flow with the match nd_ns and  applies  the
1736                     action put_nd(inport, ip6.src, nd.sll); next;
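
       As a concrete illustration, these learning flows might appear in
       the output of ovn-sbctl lflow-list roughly as follows (stage
       names, table numbers, and exact formatting vary between OVN
       versions; this is a sketch, not literal output):

              table=2 (lr_in_learn_neighbor), priority=100,
                match=(reg9[2] == 1 || reg9[3] == 0), action=(next;)
              table=2 (lr_in_learn_neighbor), priority=90,
                match=(arp), action=(put_arp(inport, arp.spa, arp.sha); next;)
              table=2 (lr_in_learn_neighbor), priority=90,
                match=(nd_na), action=(put_nd(inport, nd.target, nd.tll); next;)
              table=2 (lr_in_learn_neighbor), priority=90,
                match=(nd_ns), action=(put_nd(inport, ip6.src, nd.sll); next;)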
1737
1738     Ingress Table 3: IP Input
1739
1740       This table is the core of the logical router datapath functionality. It
1741       contains the following flows to implement very basic IP host  function‐
1742       ality.
1743
              •      For distributed logical routers or gateway routers
                     whose gateway port has options:gateway_mtu set to a
                     valid integer value, a priority-150 flow with the
                     match inport == LRP && REGBIT_PKT_LARGER &&
                     REGBIT_EGRESS_LOOPBACK == 0, where LRP is the
                     logical router port, applies the following actions
                     for IPv4 and IPv6 respectively:
1751
1752                     icmp4 {
1753                         icmp4.type = 3; /* Destination Unreachable. */
1754                         icmp4.code = 4;  /* Frag Needed and DF was Set. */
1755                         icmp4.frag_mtu = M;
1756                         eth.dst = E;
1757                         ip4.dst = ip4.src;
1758                         ip4.src = I;
1759                         ip.ttl = 255;
1760                         REGBIT_EGRESS_LOOPBACK = 1;
                         REGBIT_PKT_LARGER = 0;
1762                         next(pipeline=ingress, table=0);
1763                     };
1764                     icmp6 {
1765                         icmp6.type = 2;
1766                         icmp6.code = 0;
1767                         icmp6.frag_mtu = M;
1768                         eth.dst = E;
1769                         ip6.dst = ip6.src;
1770                         ip6.src = I;
1771                         ip.ttl = 255;
1772                         REGBIT_EGRESS_LOOPBACK = 1;
                         REGBIT_PKT_LARGER = 0;
1774                         next(pipeline=ingress, table=0);
1775                     };
1776
1777
1778              •      For  each NAT entry of a distributed logical router (with
1779                     distributed gateway router port) of type snat,  a  prior‐
1780                     ity-120  flow  with the match inport == P && ip4.src == A
1781                     advances the packet to the next pipeline, where P is  the
1782                     distributed  logical router port and A is the external_ip
1783                     set in the NAT entry. If  A  is  an  IPv6  address,  then
1784                     ip6.src is used for the match.
1785
                     The above flow is required to handle routing of
                     east/west NAT traffic.
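
                     For example, for a hypothetical distributed gateway
                     port lr0-public with an snat entry whose external_ip
                     is 172.16.1.10, the flow would be roughly:

                     match: inport == "lr0-public" && ip4.src == 172.16.1.10
                     action: next;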
1788
1789              •      For each BFD port the two  following  priority-110  flows
1790                     are added to manage BFD traffic:
1791
                     •      if ip4.src or ip6.src is any IP address owned
                            by the router port and udp.dst == 3784, the
                            packet is advanced to the next pipeline stage.

                     •      if ip4.dst or ip6.dst is any IP address owned
                            by the router port and udp.dst == 3784, the
                            handle_bfd_msg action is executed.
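
                     As an illustrative sketch, assuming a hypothetical
                     router port that owns the IP address 10.0.0.1, the
                     two flows would look roughly like:

                     match: ip4.src == 10.0.0.1 && udp.dst == 3784
                     action: next;

                     match: ip4.dst == 10.0.0.1 && udp.dst == 3784
                     action: handle_bfd_msg();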
1799
1800              •      L3  admission  control: A priority-100 flow drops packets
1801                     that match any of the following:
1802
1803ip4.src[28..31] == 0xe (multicast source)
1804
1805ip4.src == 255.255.255.255 (broadcast source)
1806
1807ip4.src == 127.0.0.0/8 || ip4.dst  ==  127.0.0.0/8
1808                            (localhost source or destination)
1809
1810ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8 (zero
1811                            network source or destination)
1812
1813ip4.src or ip6.src is any IP address owned by  the
1814                            router,  unless the packet was recirculated due to
1815                            egress   loopback    as    indicated    by    REG‐
1816                            BIT_EGRESS_LOOPBACK.
1817
1818ip4.src is the broadcast address of any IP network
1819                            known to the router.
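
                     Expressed as a single illustrative match (the actual
                     implementation may use several separate flows), the
                     static IPv4 part of this admission check is roughly:

                     ip4.src[28..31] == 0xe ||
                     ip4.src == 255.255.255.255 ||
                     ip4.src == 127.0.0.0/8 || ip4.dst == 127.0.0.0/8 ||
                     ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8

                     with the action drop;.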
1820
              •      A priority-100 flow parses DHCPv6 replies from IPv6
                     prefix delegation routers (udp.src == 547 && udp.dst
                     == 546). The handle_dhcpv6_reply action is used to
                     send IPv6 prefix delegation messages to the
                     delegation router.
1825
1826              •      ICMP  echo reply. These flows reply to ICMP echo requests
1827                     received for the router’s IP address. Let A be an IP  ad‐
1828                     dress owned by a router port. Then, for each A that is an
1829                     IPv4 address, a priority-90 flow matches on ip4.dst ==  A
1830                     and  icmp4.type  ==  8  && icmp4.code == 0 (ICMP echo re‐
1831                     quest). For each A that is an IPv6 address, a priority-90
1832                     flow  matches  on  ip6.dst  == A and icmp6.type == 128 &&
1833                     icmp6.code == 0 (ICMPv6 echo request). The  port  of  the
1834                     router  that  receives  the echo request does not matter.
1835                     Also, the ip.ttl  of  the  echo  request  packet  is  not
1836                     checked,  so  it complies with RFC 1812, section 4.2.2.9.
1837                     Flows for ICMPv4 echo requests use the following actions:
1838
1839                     ip4.dst <-> ip4.src;
1840                     ip.ttl = 255;
1841                     icmp4.type = 0;
1842                     flags.loopback = 1;
1843                     next;
1844
1845
1846                     Flows for ICMPv6 echo requests use the following actions:
1847
1848                     ip6.dst <-> ip6.src;
1849                     ip.ttl = 255;
1850                     icmp6.type = 129;
1851                     flags.loopback = 1;
1852                     next;
1853
1854
1855              •      Reply to ARP requests.
1856
                     These flows reply to ARP requests for the router’s
                     own IP address. The ARP requests are handled only if
                     the requestor’s IP belongs to the same subnet as the
                     logical router port. For each router port P that
                     owns IP address A, which belongs to subnet S with
                     prefix length L, and
1862                     Ethernet  address E, a priority-90 flow matches inport ==
1863                     P && arp.spa == S/L && arp.op == 1 && arp.tpa ==  A  (ARP
1864                     request) with the following actions:
1865
1866                     eth.dst = eth.src;
1867                     eth.src = xreg0[0..47];
1868                     arp.op = 2; /* ARP reply. */
1869                     arp.tha = arp.sha;
1870                     arp.sha = xreg0[0..47];
1871                     arp.tpa = arp.spa;
1872                     arp.spa = A;
1873                     outport = inport;
1874                     flags.loopback = 1;
1875                     output;
1876
1877
1878                     For  the  gateway  port  on  a distributed logical router
1879                     (where one of the logical router ports specifies a  gate‐
1880                     way  chassis), the above flows are only programmed on the
1881                     gateway port instance on the gateway chassis. This behav‐
1882                     ior avoids generation of multiple ARP responses from dif‐
1883                     ferent chassis, and allows upstream MAC learning to point
1884                     to the gateway chassis.
1885
1886                     For the logical router port with the option reside-on-re‐
1887                     direct-chassis set  (which  is  centralized),  the  above
1888                     flows are only programmed on the gateway port instance on
1889                     the gateway chassis (if the logical router has a distrib‐
1890                     uted  gateway  port).  This behavior avoids generation of
1891                     multiple ARP responses from different chassis, and allows
1892                     upstream MAC learning to point to the gateway chassis.
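
                     For instance, for a hypothetical router port lrp0
                     that owns 192.168.1.1 on the subnet 192.168.1.0/24,
                     the ARP responder match would be instantiated
                     roughly as:

                     inport == "lrp0" && arp.spa == 192.168.1.0/24 &&
                     arp.op == 1 && arp.tpa == 192.168.1.1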
1893
1894              •      Reply  to  IPv6 Neighbor Solicitations. These flows reply
1895                     to Neighbor Solicitation requests for  the  router’s  own
1896                     IPv6  address and populate the logical router’s mac bind‐
1897                     ing table.
1898
1899                     For each router port P that  owns  IPv6  address  A,  so‐
1900                     licited  node address S, and Ethernet address E, a prior‐
1901                     ity-90 flow matches inport == P && nd_ns  &&  ip6.dst  ==
1902                     {A, E} && nd.target == A with the following actions:
1903
1904                     nd_na_router {
1905                         eth.src = xreg0[0..47];
1906                         ip6.src = A;
1907                         nd.target = A;
1908                         nd.tll = xreg0[0..47];
1909                         outport = inport;
1910                         flags.loopback = 1;
1911                         output;
1912                     };
1913
1914
1915                     For  the  gateway  port  on  a distributed logical router
1916                     (where one of the logical router ports specifies a  gate‐
1917                     way  chassis),  the above flows replying to IPv6 Neighbor
1918                     Solicitations are only programmed on the gateway port in‐
1919                     stance  on the gateway chassis. This behavior avoids gen‐
1920                     eration of multiple replies from different  chassis,  and
1921                     allows  upstream  MAC  learning  to  point to the gateway
1922                     chassis.
1923
1924              •      These flows reply to ARP requests or IPv6 neighbor solic‐
1925                     itation  for  the  virtual IP addresses configured in the
1926                     router for NAT (both DNAT and SNAT) or load balancing.
1927
1928                     IPv4: For a configured NAT (both DNAT and  SNAT)  IP  ad‐
1929                     dress or a load balancer IPv4 VIP A, for each router port
1930                     P with Ethernet address E,  a  priority-90  flow  matches
1931                     arp.op  ==  1 && arp.tpa == A (ARP request) with the fol‐
1932                     lowing actions:
1933
1934                     eth.dst = eth.src;
1935                     eth.src = xreg0[0..47];
1936                     arp.op = 2; /* ARP reply. */
1937                     arp.tha = arp.sha;
1938                     arp.sha = xreg0[0..47];
1939                     arp.tpa <-> arp.spa;
1940                     outport = inport;
1941                     flags.loopback = 1;
1942                     output;
1943
1944
1945                     IPv4: For a configured load balancer IPv4 VIP, a  similar
1946                     flow is added with the additional match inport == P.
1947
                     If the router port P is a distributed gateway router
                     port, then is_chassis_resident(P) is also added to
                     the match condition for the load balancer IPv4 VIP A.
1951
1952                     IPv6:  For  a  configured NAT (both DNAT and SNAT) IP ad‐
1953                     dress or a load balancer IPv6 VIP A, solicited  node  ad‐
1954                     dress  S, for each router port P with Ethernet address E,
1955                     a priority-90 flow  matches  inport  ==  P  &&  nd_ns  &&
1956                     ip6.dst  ==  {A,  S} && nd.target == A with the following
1957                     actions:
1958
1959                     eth.dst = eth.src;
1960                     nd_na {
1961                         eth.src = xreg0[0..47];
1962                         nd.tll = xreg0[0..47];
1963                         ip6.src = A;
1964                         nd.target = A;
1965                         outport = inport;
1966                         flags.loopback = 1;
1967                         output;
1968                     }
1969
1970
                     If the router port P is a distributed gateway router
                     port, then is_chassis_resident(P) is also added to
                     the match condition for the load balancer IPv6 VIP A.
1974
1975                     For the gateway port on a distributed logical router with
1976                     NAT  (where  one  of the logical router ports specifies a
1977                     gateway chassis):
1978
1979                     •      If the corresponding NAT rule cannot be handled in
1980                            a  distributed  manner, then a priority-92 flow is
1981                            programmed on the gateway  port  instance  on  the
1982                            gateway  chassis.  A priority-91 drop flow is pro‐
1983                            grammed on the other chassis when ARP  requests/NS
1984                            packets are received on the gateway port. This be‐
1985                            havior avoids generation of multiple ARP responses
1986                            from  different  chassis,  and allows upstream MAC
1987                            learning to point to the gateway chassis.
1988
1989                     •      If the corresponding NAT rule can be handled in  a
1990                            distributed  manner,  then  this flow is only pro‐
1991                            grammed on the gateway  port  instance  where  the
1992                            logical_port specified in the NAT rule resides.
1993
1994                            Some  of  the actions are different for this case,
1995                            using the external_mac specified in the  NAT  rule
1996                            rather than the gateway port’s Ethernet address E:
1997
1998                            eth.src = external_mac;
1999                            arp.sha = external_mac;
2000
2001
                            or in the case of IPv6 neighbor solicitation:
2003
2004                            eth.src = external_mac;
2005                            nd.tll = external_mac;
2006
2007
2008                            This  behavior  avoids  generation of multiple ARP
2009                            responses from different chassis, and  allows  up‐
2010                            stream  MAC learning to point to the correct chas‐
2011                            sis.
2012
              •      Priority-85 flows that drop ARP and IPv6 Neighbor
                     Discovery packets.
2015
2016              •      A priority-84 flow explicitly allows IPv6 multicast traf‐
2017                     fic that is supposed to reach the router pipeline  (i.e.,
2018                     router solicitation and router advertisement packets).
2019
2020              •      A  priority-83 flow explicitly drops IPv6 multicast traf‐
2021                     fic that is destined to reserved multicast groups.
2022
2023              •      A priority-82 flow allows IP  multicast  traffic  if  op‐
2024                     tions:mcast_relay=’true’, otherwise drops it.
2025
2026              •      UDP  port  unreachable.  Priority-80  flows generate ICMP
2027                     port unreachable messages in reply to UDP  datagrams  di‐
2028                     rected  to the router’s IP address, except in the special
2029                     case of gateways, which  accept  traffic  directed  to  a
2030                     router IP for load balancing and NAT purposes.
2031
2032                     These  flows  should  not match IP fragments with nonzero
2033                     offset.
2034
2035              •      TCP reset. Priority-80 flows generate TCP reset  messages
2036                     in reply to TCP datagrams directed to the router’s IP ad‐
2037                     dress, except in the special case of gateways, which  ac‐
2038                     cept  traffic  directed to a router IP for load balancing
2039                     and NAT purposes.
2040
2041                     These flows should not match IP  fragments  with  nonzero
2042                     offset.
2043
2044              •      Protocol or address unreachable. Priority-70 flows gener‐
2045                     ate ICMP protocol or  address  unreachable  messages  for
2046                     IPv4  and  IPv6 respectively in reply to packets directed
2047                     to the router’s IP address on  IP  protocols  other  than
2048                     UDP,  TCP,  and ICMP, except in the special case of gate‐
2049                     ways, which accept traffic directed to a  router  IP  for
2050                     load balancing purposes.
2051
2052                     These  flows  should  not match IP fragments with nonzero
2053                     offset.
2054
2055              •      Drop other IP traffic to this router.  These  flows  drop
2056                     any  other  traffic  destined  to  an  IP address of this
2057                     router that is not already handled by one  of  the  flows
2058                     above,  which  amounts to ICMP (other than echo requests)
2059                     and fragments with nonzero offsets. For each IP address A
2060                     owned  by  the router, a priority-60 flow matches ip4.dst
2061                     == A or ip6.dst == A and drops the traffic. An  exception
2062                     is  made  and  the  above flow is not added if the router
2063                     port’s own IP address is used  to  SNAT  packets  passing
2064                     through that router.
2065
2066       The flows above handle all of the traffic that might be directed to the
2067       router itself. The following flows (with lower priorities)  handle  the
2068       remaining traffic, potentially for forwarding:
2069
2070              •      Drop  Ethernet  local  broadcast. A priority-50 flow with
2071                     match eth.bcast drops traffic destined to the local  Eth‐
2072                     ernet  broadcast  address.  By  definition  this  traffic
2073                     should not be forwarded.
2074
2075              •      ICMP time exceeded. For each router port P, whose IP  ad‐
2076                     dress  is A, a priority-40 flow with match inport == P &&
2077                     ip.ttl == {0, 1} && !ip.later_frag matches packets  whose
2078                     TTL  has  expired,  with the following actions to send an
2079                     ICMP time exceeded reply for IPv4 and IPv6 respectively:
2080
2081                     icmp4 {
2082                         icmp4.type = 11; /* Time exceeded. */
2083                         icmp4.code = 0;  /* TTL exceeded in transit. */
2084                         ip4.dst = ip4.src;
2085                         ip4.src = A;
2086                         ip.ttl = 255;
2087                         next;
2088                     };
2089                     icmp6 {
2090                         icmp6.type = 3; /* Time exceeded. */
2091                         icmp6.code = 0;  /* TTL exceeded in transit. */
2092                         ip6.dst = ip6.src;
2093                         ip6.src = A;
2094                         ip.ttl = 255;
2095                         next;
2096                     };
2097
2098
              •      TTL discard. A priority-30 flow with match ip.ttl ==
                     {0, 1} and action drop; drops other packets whose
                     TTL has expired and that should not receive an ICMP
                     error reply (i.e., fragments with nonzero offset).
2103
              •      Next table. A priority-0 flow matches all packets
                     that aren’t already handled and uses the action
                     next; to feed them to the next table.
2107
2108     Ingress Table 4: UNSNAT
2109
       This is for the reverse traffic of already established connections;
       that is, SNAT has already been performed in the egress pipeline,
       and the packet has now entered the ingress pipeline as part of a
       reply. It is unSNATted here.
2113
2114       Ingress Table 4: UNSNAT on Gateway and Distributed Routers
2115
              •      If the Router (Gateway or Distributed) is configured
                     with load balancers, then the following logical
                     flows are added:

                     For each IPv4 address A that is defined as a load
                     balancer VIP with protocol P and is also present as
                     an external_ip in the NAT table, a priority-120
                     logical flow is added with the match ip4 && ip4.dst
                     == A && P and the action next; to advance the packet
                     to the next table. If the load balancer has a
                     protocol port B defined, then the match also
                     includes P.dst == B.
2126
2127                     The above flows are also added for IPv6 load balancers.
2128
2129       Ingress Table 4: UNSNAT on Gateway Routers
2130
              •      If the Gateway router has been configured to force
                     SNAT any previously DNATted packets to B, a
                     priority-110 flow matches ip && ip4.dst == B or ip
                     && ip6.dst == B with the action ct_snat;.

                     If the Gateway router is configured with
                     lb_force_snat_ip=router_ip, then for every logical
                     router port P attached to the Gateway router with
                     router IP B, a priority-110 flow is added with the
                     match inport == P && ip4.dst == B or inport == P &&
                     ip6.dst == B with the action ct_snat;.

                     If the Gateway router has been configured to force
                     SNAT any previously load-balanced packets to B, a
                     priority-100 flow matches ip && ip4.dst == B or ip
                     && ip6.dst == B with the action ct_snat;.

                     For each NAT configuration in the OVN Northbound
                     database that asks to change the source IP address
                     of a packet from A to B, a priority-90 flow matches
                     ip && ip4.dst == B or ip && ip6.dst == B with the
                     action ct_snat;. If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options,
                     then the action would be ip4/6.dst = (B).
2155
2156                     A priority-0 logical flow with match 1 has actions next;.
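
                     For example, a NAT rule of type snat that rewrites
                     sources to the hypothetical external address
                     172.16.1.100 would produce a priority-90 flow
                     roughly like:

                     match: ip && ip4.dst == 172.16.1.100
                     action: ct_snat;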
2157
2158       Ingress Table 4: UNSNAT on Distributed Routers
2159
              •      For each NAT configuration in the OVN Northbound
                     database that asks to change the source IP address
                     of a packet from A to B, a priority-100 flow matches
                     ip && ip4.dst == B && inport == GW or ip && ip6.dst
                     == B && inport == GW, where GW is the logical router
                     gateway port, with the action ct_snat;. If the NAT
                     rule is of type dnat_and_snat and has stateless=true
                     in the options, then the action would be ip4/6.dst =
                     (B).
2168
2169                     If the NAT rule cannot be handled in a  distributed  man‐
2170                     ner,  then the priority-100 flow above is only programmed
2171                     on the gateway chassis.
2172
2173                     A priority-0 logical flow with match 1 has actions next;.
2174
2175     Ingress Table 5: DEFRAG
2176
2177       This is to send packets to connection tracker for tracking and  defrag‐
2178       mentation.  It  contains a priority-0 flow that simply moves traffic to
2179       the next table.
2180
2181       If load balancing rules with only virtual IP addresses  are  configured
2182       in OVN_Northbound database for a Gateway router, a priority-100 flow is
2183       added for each configured virtual IP address VIP.  For  IPv4  VIPs  the
2184       flow  matches  ip && ip4.dst == VIP. For IPv6 VIPs, the flow matches ip
2185       && ip6.dst == VIP. The flow applies the action reg0 = VIP; ct_dnat; (or
2186       xxreg0  for  IPv6)  to  send  IP  packets to the connection tracker for
2187       packet de-fragmentation and to dnat the destination IP for the  commit‐
2188       ted connection before sending it to the next table.
2189
2190       If load balancing rules with virtual IP addresses and ports are config‐
2191       ured in OVN_Northbound database for a Gateway  router,  a  priority-110
2192       flow  is  added  for  each  configured virtual IP address VIP, protocol
2193       PROTO and port PORT. For IPv4 VIPs the flow matches ip  &&  ip4.dst  ==
2194       VIP  &&  PROTO && PROTO.dst == PORT. For IPv6 VIPs, the flow matches ip
2195       && ip6.dst == VIP && PROTO && PROTO.dst == PORT. The flow  applies  the
2196       action  reg0  =  VIP; reg9[16..31] = PROTO.dst; ct_dnat; (or xxreg0 for
2197       IPv6) to send IP packets to the connection tracker for packet  de-frag‐
2198       mentation  and  to dnat the destination IP for the committed connection
2199       before sending it to the next table.
2200
2201       If ECMP routes with symmetric reply are configured  in  the  OVN_North‐
2202       bound  database  for a gateway router, a priority-300 flow is added for
2203       each router port on which symmetric replies are configured. The  match‐
2204       ing  logic for these ports essentially reverses the configured logic of
2205       the ECMP route. So for instance, a route  with  a  destination  routing
2206       policy  will  instead match if the source IP address matches the static
2207       route’s prefix. The flow uses the action ct_next to send IP packets  to
2208       the  connection tracker for packet de-fragmentation and tracking before
2209       sending it to the next table.
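
       As an illustrative sketch, a load balancer with the hypothetical
       VIP 30.0.0.1, protocol tcp, and port 80 would produce a
       priority-110 flow roughly like:

              match: ip && ip4.dst == 30.0.0.1 && tcp && tcp.dst == 80
              action: reg0 = 30.0.0.1; reg9[16..31] = tcp.dst; ct_dnat;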
2210
2211     Ingress Table 6: DNAT
2212
       Packets enter the pipeline with a destination IP address that needs
       to be DNATted from a virtual IP address to a real IP address.
       Packets in the reverse direction need to be unDNATted.
2216
2217       Ingress Table 6: Load balancing DNAT rules
2218
       The following load balancing DNAT flows are added for a Gateway
       router or a Router with a gateway port. These flows are programmed
       only on the gateway chassis. These flows are not programmed for
       load balancers with IPv6 VIPs.
2223
              •      If controller_event has been enabled for all the
                     configured load balancing rules for a Gateway router
                     or a Router with a gateway port in the
                     OVN_Northbound database that do not have configured
                     backends, a priority-130 flow is added to trigger
                     ovn-controller events whenever the chassis receives
                     a packet for that particular VIP. If the event-elb
                     meter has been previously created, it will be
                     associated with the empty_lb logical flow.
2232
2233              •      For all the configured load balancing rules for a Gateway
2234                     router  or  Router  with  gateway  port in OVN_Northbound
2235                     database that includes a L4 port PORT of protocol  P  and
2236                     IPv4  or  IPv6  address  VIP,  a  priority-120  flow that
2237                     matches  on  ct.new  &&  ip  &&  reg0  ==  VIP  &&  P  &&
2238                     reg9[16..31]  ==   PORT  (xxreg0 == VIP in the IPv6 case)
2239                     with an action of ct_lb(args), where args contains  comma
2240                     separated  IPv4 or IPv6 addresses (and optional port num‐
2241                     bers) to load balance to. If the router is configured  to
2242                     force  SNAT  any  load-balanced packets, the above action
2243                     will  be  replaced  by   flags.force_snat_for_lb   =   1;
2244                     ct_lb(args);.  If  the  load balancing rule is configured
2245                     with skip_snat set to true, the above action will be  re‐
2246                     placed  by  flags.skip_snat_for_lb  = 1; ct_lb(args);. If
2247                     health check is enabled,  then  args  will  only  contain
2248                     those  endpoints  whose  service  monitor status entry in
2249                     OVN_Southbound db is either online or empty.
2250
2251                     The previous table lr_in_defrag sets  the  register  reg0
2252                     (or  xxreg0  for IPv6) and does ct_dnat. Hence for estab‐
2253                     lished traffic, this table just advances  the  packet  to
2254                     the next stage.
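
                     For example, a load balancer mapping the
                     hypothetical VIP 30.0.0.1:80 to the backends
                     192.168.1.2:8080 and 192.168.1.3:8080 over TCP would
                     produce roughly:

                     match: ct.new && ip && reg0 == 30.0.0.1 && tcp &&
                            reg9[16..31] == 80
                     action: ct_lb(192.168.1.2:8080,192.168.1.3:8080);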
2255
2256              •      For  all the configured load balancing rules for a router
2257                     in OVN_Northbound database that includes a L4  port  PORT
2258                     of  protocol  P  and  IPv4  or IPv6 address VIP, a prior‐
2259                     ity-120 flow that matches on ct.est && ip4 && reg0 == VIP
2260                     &&  P  && reg9[16..31] ==  PORT (ip6 and xxreg0 == VIP in
2261                     the IPv6 case) with an action of next;. If the router  is
2262                     configured  to  force SNAT any load-balanced packets, the
2263                     above action will be replaced by  flags.force_snat_for_lb
2264                     = 1; next;. If the load balancing rule is configured with
2265                     skip_snat set to true, the above action will be  replaced
2266                     by flags.skip_snat_for_lb = 1; next;.
2267
2268                     The  previous  table  lr_in_defrag sets the register reg0
2269                     (or xxreg0 for IPv6) and does ct_dnat. Hence  for  estab‐
2270                     lished  traffic,  this  table just advances the packet to
2271                     the next stage.
2272
2273              •      For all the configured load balancing rules for a  router
2274                     in  OVN_Northbound  database that includes just an IP ad‐
2275                     dress VIP to match on, a priority-110 flow  that  matches
2276                     on ct.new && ip4 && reg0 == VIP (ip6 and xxreg0 == VIP in
2277                     the IPv6 case) with an action of ct_lb(args), where  args
2278                     contains  comma  separated IPv4 or IPv6 addresses. If the
2279                     router is configured  to  force  SNAT  any  load-balanced
2280                     packets,   the   above   action   will   be  replaced  by
2281                     flags.force_snat_for_lb = 1; ct_lb(args);.  If  the  load
2282                     balancing  rule is configured with skip_snat set to true,
2283                     the    above    action    will     be     replaced     by
2284                     flags.skip_snat_for_lb = 1; ct_lb(args);.
2285
2286                     The  previous  table  lr_in_defrag sets the register reg0
2287                     (or xxreg0 for IPv6) and does ct_dnat. Hence  for  estab‐
2288                     lished  traffic,  this  table just advances the packet to
2289                     the next stage.
2290
              •      For each configured load balancing rule on a router
                     in the OVN_Northbound database that includes just an
                     IP address VIP to match on, a priority-110 flow that
                     matches on ct.est && ip4 && reg0 == VIP (or ip6 and
                     xxreg0 == VIP) with an action of next;. If the
                     router is configured to force SNAT any load-balanced
                     packets, the above action will be replaced by
                     flags.force_snat_for_lb = 1; next;. If the load
                     balancing rule is configured with skip_snat set to
                     true, the above action will be replaced by
                     flags.skip_snat_for_lb = 1; next;.
2301
2302                     The  previous  table  lr_in_defrag sets the register reg0
2303                     (or xxreg0 for IPv6) and does ct_dnat. Hence  for  estab‐
2304                     lished  traffic,  this  table just advances the packet to
2305                     the next stage.
2306
              •      If the load balancer is created with the --reject
                     option and it has no active backends, a TCP reset
                     segment (for TCP) or an ICMP port unreachable packet
                     (for all other kinds of traffic) will be sent
                     whenever an incoming packet is received for this
                     load balancer. Please note that using the --reject
                     option disables the empty_lb SB controller event for
                     this load balancer.
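
       The action selection in the load-balancing flows above can be
       summarized as a small decision function. The sketch below is
       illustrative only; the helper and its arguments are assumptions,
       not ovn-northd internals:

```python
def lb_new_conn_action(args, force_snat=False, skip_snat=False):
    """Pick the action for a new (ct.new) load-balanced connection,
    mirroring the replacements described above. Hypothetical helper,
    not ovn-northd code; the two flags are treated as exclusive."""
    if skip_snat:
        return "flags.skip_snat_for_lb = 1; ct_lb(%s);" % args
    if force_snat:
        return "flags.force_snat_for_lb = 1; ct_lb(%s);" % args
    return "ct_lb(%s);" % args
```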
2314
2315       Ingress Table 6: DNAT on Gateway Routers
2316
              •      For each configuration in the OVN Northbound
                     database that asks to change the destination IP
                     address of a packet from A to B, a priority-100 flow
                     matches ip && ip4.dst == A or ip && ip6.dst == A
                     with an action flags.loopback = 1; ct_dnat(B);. If
                     the Gateway router is configured to force SNAT any
                     DNATed packet, the above action will be replaced by
                     flags.force_snat_for_dnat = 1; flags.loopback = 1;
                     ct_dnat(B);. If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options,
                     then the action would be ip4/6.dst = (B).
2327
                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.src ==
                     allowed_ext_ips. Similarly, for IPv6, the match
                     would be ip6.src == allowed_ext_ips.
2332
                     If the NAT rule has exempted_ext_ips set, then there
                     is an additional flow configured at priority 101.
                     The flow matches if the source IP is an
                     exempted_ext_ip and the action is next;. This flow
                     is used to bypass the ct_dnat action for a packet
                     originating from exempted_ext_ips.
2338
2339              •      A priority-0 logical flow with match 1 has actions next;.
2340
2341       Ingress Table 6: DNAT on Distributed Routers
2342
       On distributed routers, the DNAT table only handles packets with
       a destination IP address that needs to be DNATted from a virtual
       IP address to a real IP address. The unDNAT processing in the
       reverse direction is handled in a separate table in the egress
       pipeline.
2347
              •      For each configuration in the OVN Northbound
                     database that asks to change the destination IP
                     address of a packet from A to B, a priority-100 flow
                     matches ip && ip4.dst == B && inport == GW, where GW
                     is the logical router gateway port, with an action
                     ct_dnat(B);. The match will include ip6.dst == B in
                     the IPv6 case. If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options,
                     then the action would be ip4/6.dst = (B).
2356
2357                     If the NAT rule cannot be handled in a  distributed  man‐
2358                     ner,  then the priority-100 flow above is only programmed
2359                     on the gateway chassis.
2360
                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.src ==
                     allowed_ext_ips. Similarly, for IPv6, the match
                     would be ip6.src == allowed_ext_ips.
2365
                     If the NAT rule has exempted_ext_ips set, then there
                     is an additional flow configured at priority 101.
                     The flow matches if the source IP is an
                     exempted_ext_ip and the action is next;. This flow
                     is used to bypass the ct_dnat action for a packet
                     originating from exempted_ext_ips.
2371
              •      A priority-0 logical flow with match 1 has actions next;.
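
       The stateless/stateful distinction above amounts to choosing
       between a direct destination rewrite and a ct_dnat action. A
       minimal sketch, assuming a hypothetical helper (not ovn-northd
       code):

```python
def dnat_action(b, family=4, stateless=False):
    """Choose the DNAT action for a rule translating to address B:
    stateless dnat_and_snat rules rewrite the destination directly,
    while stateful rules go through connection tracking."""
    if stateless:
        return "ip%d.dst = %s;" % (family, b)
    return "ct_dnat(%s);" % b
```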
2373
2374     Ingress Table 7: ECMP symmetric reply processing
2375
2376              •      If ECMP routes with symmetric reply are configured in the
2377                     OVN_Northbound database for a gateway  router,  a  prior‐
2378                     ity-100  flow is added for each router port on which sym‐
2379                     metric replies are configured.  The  matching  logic  for
2380                     these  ports essentially reverses the configured logic of
2381                     the ECMP route. So for instance, a route with a  destina‐
2382                     tion  routing  policy will instead match if the source IP
                     address matches the static route’s prefix. The flow
                     uses the action ct_commit { ct_label.ecmp_reply_eth
                     = eth.src; ct_label.ecmp_reply_port = K; }; next; to
                     commit the connection, storing eth.src and the ECMP
                     reply port binding tunnel key K in the ct_label.
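
       The ct_label layout written here is consumed later by the IP
       Routing table: the low 48 bits hold the Ethernet address and the
       next 16 bits hold the reply port’s tunnel key. A sketch of that
       packing, with hypothetical helper names:

```python
def pack_ecmp_reply(mac, port_key):
    """Lay out ct_label as described above: Ethernet address in the
    low 48 bits, the port's tunnel key in the next 16 bits."""
    assert 0 <= mac < 1 << 48 and 0 <= port_key < 1 << 16
    return (port_key << 48) | mac

def unpack_ecmp_reply(label):
    """Recover (mac, port_key) from a committed ct_label value."""
    return label & ((1 << 48) - 1), (label >> 48) & 0xffff
```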
2388
2389     Ingress Table 8: IPv6 ND RA option processing
2390
              •      A priority-50 logical flow is added for each logical
                     router port configured with IPv6 ND RA options. It
                     matches IPv6 ND Router Solicitation packets, applies
                     the action put_nd_ra_opts, and advances the packet
                     to the next table.

                     reg0[5] = put_nd_ra_opts(options); next;
2398
2399
2400                     For a valid IPv6 ND RS packet, this transforms the packet
2401                     into  an  IPv6 ND RA reply and sets the RA options to the
2402                     packet and stores 1 into  reg0[5].  For  other  kinds  of
2403                     packets,  it  just  stores 0 into reg0[5]. Either way, it
2404                     continues to the next table.
2405
2406              •      A priority-0 logical flow with match 1 has actions next;.
2407
2408     Ingress Table 9: IPv6 ND RA responder
2409
       This table implements an IPv6 ND RA responder for the IPv6 ND RA
       replies generated by the previous table.
2412
              •      A priority-50 logical flow is added for each logical
                     router port configured with IPv6 ND RA options. It
                     matches IPv6 ND RA packets with reg0[5] == 1
                     (meaning that the earlier put_nd_ra_opts action was
                     successful) and responds back on the inport after
                     applying these actions.
2419
2420                     eth.dst = eth.src;
2421                     eth.src = E;
2422                     ip6.dst = ip6.src;
2423                     ip6.src = I;
2424                     outport = P;
2425                     flags.loopback = 1;
2426                     output;
2427
2428
2429                     where E is the MAC address and I is the IPv6  link  local
2430                     address of the logical router port.
2431
2432                     (This  terminates  packet processing in ingress pipeline;
2433                     the packet does not go to the next ingress table.)
2434
2435              •      A priority-0 logical flow with match 1 has actions next;.
2436
2437     Ingress Table 10: IP Routing
2438
2439       A packet that arrives at this table is an  IP  packet  that  should  be
2440       routed  to  the address in ip4.dst or ip6.dst. This table implements IP
2441       routing, setting reg0 (or xxreg0 for IPv6) to the next-hop  IP  address
2442       (leaving ip4.dst or ip6.dst, the packet’s final destination, unchanged)
2443       and advances to the next table for ARP resolution. It  also  sets  reg1
2444       (or  xxreg1)  to  the  IP  address  owned  by  the selected router port
2445       (ingress table ARP Request will generate an  ARP  request,  if  needed,
2446       with  reg0 as the target protocol address and reg1 as the source proto‐
2447       col address).
2448
       For ECMP routes, i.e. multiple static routes with the same policy
       and prefix but different nexthops, the above actions are deferred
       to the next table. This table, instead, is responsible for
       determining the ECMP group id and selecting a member id within
       the group based on 5-tuple hashing. It stores the group id in
       reg8[0..15] and the member id in reg8[16..31]. This step
2454       is  skipped  if  the traffic going out the ECMP route is reply traffic,
2455       and the ECMP route was configured to use  symmetric  replies.  Instead,
2456       the  stored ct_label value is used to choose the destination. The least
2457       significant 48 bits of the ct_label tell the destination MAC address to
2458       which  the  packet  should  be  sent. The next 16 bits tell the logical
2459       router port on which the packet should be sent.  These  values  in  the
2460       ct_label  are set when the initial ingress traffic is received over the
2461       ECMP route.
2462
2463       This table contains the following logical flows:
2464
2465              •      Priority-550 flow that drops IPv6 Router Solicitation/Ad‐
2466                     vertisement  packets  that were not processed in previous
2467                     tables.
2468
2469              •      Priority-500 flows that match IP multicast  traffic  des‐
2470                     tined  to  groups  registered  on  any  of  the  attached
2471                     switches and sets outport  to  the  associated  multicast
2472                     group  that  will eventually flood the traffic to all in‐
2473                     terested attached logical switches. The flows also decre‐
2474                     ment TTL.
2475
2476              •      Priority-450  flow that matches unregistered IP multicast
2477                     traffic and  sets  outport  to  the  MC_STATIC  multicast
                     group, which ovn-northd populates with the logical
                     ports that have options:mcast_flood=’true’. If no
                     router ports are configured to flood multicast
                     traffic, the packets are dropped.
2482
2483              •      IPv4 routing table. For each route to IPv4 network N with
2484                     netmask  M, on router port P with IP address A and Ether‐
2485                     net address E, a logical flow with match ip4.dst ==  N/M,
2486                     whose priority is the number of 1-bits in M, has the fol‐
2487                     lowing actions:
2488
2489                     ip.ttl--;
2490                     reg8[0..15] = 0;
2491                     reg0 = G;
2492                     reg1 = A;
2493                     eth.src = E;
2494                     outport = P;
2495                     flags.loopback = 1;
2496                     next;
2497
2498
2499                     (Ingress table 1 already verified that ip.ttl--; will not
2500                     yield a TTL exceeded error.)
2501
                     If the route has a gateway, G is the gateway IP
                     address. Otherwise, if the route is a configured
                     static route, G is the next hop IP address. Else it
                     is ip4.dst.
2505
2506              •      IPv6 routing table. For each route to IPv6 network N with
2507                     netmask M, on router port P with IP address A and  Ether‐
2508                     net address E, a logical flow with match in CIDR notation
2509                     ip6.dst == N/M, whose priority is the integer value of M,
2510                     has the following actions:
2511
2512                     ip.ttl--;
2513                     reg8[0..15] = 0;
2514                     xxreg0 = G;
2515                     xxreg1 = A;
2516                     eth.src = E;
                     outport = P;
2518                     flags.loopback = 1;
2519                     next;
2520
2521
2522                     (Ingress table 1 already verified that ip.ttl--; will not
2523                     yield a TTL exceeded error.)
2524
                     If the route has a gateway, G is the gateway IP
                     address. Otherwise, if the route is a configured
                     static route, G is the next hop IP address. Else it
                     is ip6.dst.
2528
2529                     If the address A is in the link-local  scope,  the  route
2530                     will be limited to sending on the ingress port.
2531
              •      ECMP routes are grouped by policy and prefix. A
                     unique non-zero id is assigned to each group, and
                     each member is also assigned a unique non-zero id
                     within its group.
2536
2537                     For each IPv4/IPv6 ECMP group with group id GID and  mem‐
2538                     ber  ids  MID1,  MID2,  ..., a logical flow with match in
2539                     CIDR notation ip4.dst == N/M, or ip6.dst  ==  N/M,  whose
2540                     priority is the integer value of M, has the following ac‐
2541                     tions:
2542
2543                     ip.ttl--;
2544                     flags.loopback = 1;
2545                     reg8[0..15] = GID;
2546                     select(reg8[16..31], MID1, MID2, ...);
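
       Two details of this table lend themselves to a short sketch: the
       flow priority equals the prefix length, so longer prefixes win,
       and the select action hashes the 5-tuple to pick a member id.
       The helper names and the concrete hash below are assumptions;
       the real hash is computed by the datapath:

```python
import hashlib

def route_priority(prefix_len):
    # The flow priority is the number of 1-bits in the netmask,
    # i.e. the prefix length, so more specific routes match first.
    return prefix_len

def select_member(five_tuple, member_ids):
    # Stand-in for select(reg8[16..31], MID1, MID2, ...): a stable
    # hash of the 5-tuple picks one member id, so all packets of a
    # given flow take the same ECMP path.
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return member_ids[int.from_bytes(digest[:4], "big") % len(member_ids)]
```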
2547
2548
2549     Ingress Table 11: IP_ROUTING_ECMP
2550
2551       This table implements the second part of IP  routing  for  ECMP  routes
       previous table. If a packet matched an ECMP group in the
2553       previous table, this table matches the group id and  member  id  stored
2554       from the previous table, setting reg0 (or xxreg0 for IPv6) to the next-
2555       hop IP address (leaving ip4.dst or ip6.dst, the packet’s final destina‐
2556       tion,  unchanged) and advances to the next table for ARP resolution. It
2557       also sets reg1 (or xxreg1) to the IP  address  owned  by  the  selected
2558       router port (ingress table ARP Request will generate an ARP request, if
2559       needed, with reg0 as the target protocol address and reg1 as the source
2560       protocol address).
2561
2562       This  processing is skipped for reply traffic being sent out of an ECMP
2563       route if the route was configured to use symmetric replies.
2564
2565       This table contains the following logical flows:
2566
              •      A priority-150 flow that matches reg8[0..15] == 0
                     with action next;, letting packets of non-ECMP
                     routes bypass this table.
2570
2571              •      For each member with ID MID in each ECMP  group  with  ID
2572                     GID, a priority-100 flow with match reg8[0..15] == GID &&
                     reg8[16..31] == MID has the following actions:
2574
2575                     [xx]reg0 = G;
2576                     [xx]reg1 = A;
2577                     eth.src = E;
2578                     outport = P;
2579
2580
2581     Ingress Table 12: Router policies
2582
2583       This table adds flows for the logical router policies configured on the
2584       logical   router.   Please   see   the  OVN_Northbound  database  Logi‐
2585       cal_Router_Policy table documentation in ovn-nb for supported actions.
2586
2587              •      For each router policy configured on the logical  router,
2588                     a  logical  flow  is added with specified priority, match
2589                     and actions.
2590
2591              •      If the policy action is reroute with 2 or  more  nexthops
2592                     defined,  then the logical flow is added with the follow‐
2593                     ing actions:
2594
2595                     reg8[0..15] = GID;
2596                     reg8[16..31] = select(1,..n);
2597
2598
                     where GID is the ECMP group id generated by
                     ovn-northd for this policy and n is the number of
                     nexthops. The select action selects one of the
                     nexthop member ids, stores it in the register
                     reg8[16..31], and advances the packet to the next
                     stage.
2604
              •      If the policy action is reroute with just one nexthop,
2606                     then  the  logical  flow  is added with the following ac‐
2607                     tions:
2608
2609                     [xx]reg0 = H;
2610                     eth.src = E;
2611                     outport = P;
2612                     reg8[0..15] = 0;
2613                     flags.loopback = 1;
2614                     next;
2615
2616
                     where H is the nexthop defined in the router policy,
                     E is the Ethernet address of the logical router port
                     from which the nexthop is reachable, and P is that
                     logical router port.
2621
2622              •      If  a  router policy has the option pkt_mark=m set and if
2623                     the action is not drop, then  the  action  also  includes
2624                     pkt.mark = m to mark the packet with the marker m.
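
       The two reroute cases differ only in whether the nexthop is
       resolved immediately or deferred to the ECMP handling stage. A
       schematic sketch; the helper and the literal H, E and P
       placeholders are illustrative, not ovn-northd internals:

```python
def reroute_actions(nexthops, gid):
    """Single-nexthop policies resolve immediately; with two or more
    nexthops only group/member selection happens here, and resolution
    is deferred to the next stage."""
    if len(nexthops) == 1:
        # H, E and P stand for the nexthop, the Ethernet address and
        # the logical router port from which it is reachable.
        return ("[xx]reg0 = H; eth.src = E; outport = P; "
                "reg8[0..15] = 0; flags.loopback = 1; next;")
    return "reg8[0..15] = %d; reg8[16..31] = select(1,..%d);" % (
        gid, len(nexthops))
```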
2625
2626     Ingress Table 13: ECMP handling for router policies
2627
2628       This  table  handles  the  ECMP for the router policies configured with
2629       multiple nexthops.
2630
2631              •      A priority-150 flow is added to advance the packet to the
2632                     next  stage  if the ECMP group id register reg8[0..15] is
2633                     0.
2634
2635              •      For each ECMP reroute router policy  with  multiple  nex‐
2636                     thops,  a  priority-100  flow is added for each nexthop H
2637                     with the match reg8[0..15] == GID &&  reg8[16..31]  ==  M
2638                     where  GID  is  the  router  policy group id generated by
2639                     ovn-northd and M is the member id of the nexthop H gener‐
2640                     ated  by  ovn-northd.  The following actions are added to
2641                     the flow:
2642
                     [xx]reg0 = H;
                     eth.src = E;
                     outport = P;
                     flags.loopback = 1;
                     next;
2648
2649
                     where H is the nexthop defined in the router policy,
                     E is the Ethernet address of the logical router port
                     from which the nexthop is reachable, and P is that
                     logical router port.
2654
2655     Ingress Table 14: ARP/ND Resolution
2656
2657       Any  packet that reaches this table is an IP packet whose next-hop IPv4
2658       address is in reg0 or IPv6 address is in xxreg0.  (ip4.dst  or  ip6.dst
2659       contains  the final destination.) This table resolves the IP address in
2660       reg0 (or xxreg0) into an output port in outport and an Ethernet address
2661       in eth.dst, using the following flows:
2662
2663              •      A  priority-500  flow  that  matches IP multicast traffic
2664                     that was allowed in the routing pipeline. For  this  kind
2665                     of  traffic  the outport was already set so the flow just
2666                     advances to the next table.
2667
2668              •      Static MAC bindings. MAC bindings can be known statically
2669                     based  on data in the OVN_Northbound database. For router
2670                     ports connected to logical switches, MAC bindings can  be
2671                     known  statically  from the addresses column in the Logi‐
2672                     cal_Switch_Port table.  For  router  ports  connected  to
                     other logical routers, MAC bindings can be known
                     statically from the mac and networks columns in the
                     Logical_Router_Port table. (Note: the flow is NOT
                     installed for the IP addresses that belong to a
                     neighbor logical router port if the current router
                     has options:dynamic_neigh_routers set to true.)
2679
2680                     For each IPv4 address A whose host is known to have  Eth‐
2681                     ernet  address  E  on  router port P, a priority-100 flow
                     with match outport == P && reg0 == A has actions
                     eth.dst = E; next;.
2684
                     For each virtual IP A configured on a logical port
                     of type virtual, with its virtual parent set in its
                     corresponding Port_Binding record, where the virtual
                     parent has Ethernet address E and the virtual IP is
                     reachable via the router port P, a priority-100 flow
                     with match outport == P && reg0 == A has actions
                     eth.dst = E; next;.
2692
                     For each virtual IP A configured on a logical port
                     of type virtual whose virtual parent is not set in
                     its corresponding Port_Binding record, where the
                     virtual IP A is reachable via the router port P, a
                     priority-100 flow with match outport == P && reg0 ==
                     A has actions eth.dst = 00:00:00:00:00:00; next;.
                     This flow is added so that ARP is always resolved
                     for the virtual IP A by generating an ARP request,
                     rather than by consulting the MAC_Binding table,
                     which can have an incorrect value for the virtual
                     IP A.
2702
                     For each IPv6 address A whose host is known to have
                     Ethernet address E on router port P, a priority-100
                     flow with match outport == P && xxreg0 == A has
                     actions eth.dst = E; next;.
2707
                     For each logical router port with an IPv4 address A
                     and MAC address E that is reachable via a different
                     logical router port P, a priority-100 flow with
                     match outport == P && reg0 == A has actions eth.dst
                     = E; next;.
2712
                     For each logical router port with an IPv6 address A
                     and MAC address E that is reachable via a different
                     logical router port P, a priority-100 flow with
                     match outport == P && xxreg0 == A has actions
                     eth.dst = E; next;.
2717
              •      Static MAC bindings from NAT entries. MAC bindings
                     can also be known for the entries in the NAT table.
                     The flows below are programmed for distributed
                     logical routers, i.e. routers with a distributed
                     router port.
2722
                     For each row in the NAT table with IPv4 address A in
                     the external_ip column of the NAT table, a
                     priority-100 flow with the match outport == P &&
                     reg0 == A has actions eth.dst = E; next;, where P is
                     the distributed logical router port and E is the
                     Ethernet address set in the external_mac column of
                     the NAT table for rules of type dnat_and_snat, or
                     otherwise the Ethernet address of the distributed
                     logical router port. Note that if the external_ip is
                     not within a subnet on the owning logical router,
                     then OVN will only create ARP resolution flows if
                     options:add_route is set to true. Otherwise, no ARP
                     resolution flows will be added.
2735
                     For IPv6 NAT entries, the same flows are added, but
                     using the register xxreg0 for the match.
2738
              •      Traffic whose destination IP is an address owned by
                     the router should be dropped. Such traffic is
                     normally dropped in the ingress table IP Input,
                     except for IPs that are also shared with SNAT rules.
                     However, if no unSNAT operation has happened
                     successfully up to this point in the pipeline and
                     the destination IP of the packet is still a
                     router-owned IP, the packets can be safely dropped.
2747
2748                     A priority-1 logical  flow  with  match  ip4.dst  =  {..}
2749                     matches  on  traffic  destined  to  router owned IPv4 ad‐
2750                     dresses which are also SNAT IPs.  This  flow  has  action
2751                     drop;.
2752
2753                     A  priority-1  logical  flow  with  match  ip6.dst = {..}
2754                     matches on traffic destined  to  router  owned  IPv6  ad‐
2755                     dresses  which  are  also  SNAT IPs. This flow has action
2756                     drop;.
2757
2758              •      Dynamic MAC bindings. These flows resolve MAC-to-IP bind‐
2759                     ings  that  have  become known dynamically through ARP or
2760                     neighbor discovery. (The ingress table ARP  Request  will
2761                     issue  an  ARP or neighbor solicitation request for cases
2762                     where the binding is not yet known.)
2763
2764                     A priority-0 logical flow  with  match  ip4  has  actions
2765                     get_arp(outport, reg0); next;.
2766
2767                     A  priority-0  logical  flow  with  match ip6 has actions
2768                     get_nd(outport, xxreg0); next;.
2769
              •      For a distributed gateway LRP with redirect-type set
                     to bridged, a priority-50 flow matches outport ==
                     "ROUTER_PORT" && !is_chassis_resident
                     ("cr-ROUTER_PORT") and has actions eth.dst = E;
                     next;, where E is the Ethernet address of the
                     logical router port.
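
       The resolution order of this table, static bindings first and
       the dynamic lookup as the priority-0 fallback, can be sketched
       as follows; the helper and its binding-table argument are
       illustrative, not ovn-northd data structures:

```python
def resolve_eth_dst(static_bindings, outport, nexthop_ip):
    """Resolve the next-hop IP in reg0/xxreg0 into an eth.dst action,
    preferring static MAC bindings over the dynamic get_arp/get_nd
    lookup. Illustrative sketch only."""
    mac = static_bindings.get((outport, nexthop_ip))
    if mac is not None:
        return "eth.dst = %s; next;" % mac
    if ":" in nexthop_ip:  # crude IPv6 literal check
        return "get_nd(outport, xxreg0); next;"
    return "get_arp(outport, reg0); next;"
```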
2775
2776     Ingress Table 15: Check packet length
2777
       For distributed logical routers or gateway routers with a gateway
       port whose options:gateway_mtu is set to a valid integer value,
       this table adds a priority-50 logical flow with the match outport
       == GW_PORT, where GW_PORT is the gateway router port, and applies
       the action check_pkt_larger, advancing the packet to the next
       table.
2783
2784       REGBIT_PKT_LARGER = check_pkt_larger(L); next;
2785
2786
       where L is the packet length to check for. If the packet is
       larger than L, it stores 1 in the register bit REGBIT_PKT_LARGER.
       The value of L is taken from the options:gateway_mtu column of
       the Logical_Router_Port row.
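
       As a simplified model of this check (the helper is a sketch, not
       the real check_pkt_larger implementation):

```python
def pkt_larger_bit(pkt_len, gateway_mtu):
    # REGBIT_PKT_LARGER becomes 1 when the packet exceeds L, the
    # value taken from options:gateway_mtu; 0 otherwise.
    return 1 if pkt_len > gateway_mtu else 0
```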
2790
2791       This table adds one priority-0 fallback flow that matches  all  packets
2792       and advances to the next table.
2793
2794     Ingress Table 16: Handle larger packets
2795
       For distributed logical routers or gateway routers with a gateway
       port whose options:gateway_mtu is set to a valid integer value,
       this table adds the following priority-150 logical flow for each
       logical router port with the match inport == LRP && outport ==
       GW_PORT && REGBIT_PKT_LARGER && !REGBIT_EGRESS_LOOPBACK, where
       LRP is the logical router port and GW_PORT is the gateway port,
       and applies the following action for IPv4 and IPv6 respectively:
2803
2804       icmp4 {
2805           icmp4.type = 3; /* Destination Unreachable. */
2806           icmp4.code = 4;  /* Frag Needed and DF was Set. */
2807           icmp4.frag_mtu = M;
2808           eth.dst = E;
2809           ip4.dst = ip4.src;
2810           ip4.src = I;
2811           ip.ttl = 255;
2812           REGBIT_EGRESS_LOOPBACK = 1;
2813           REGBIT_PKT_LARGER = 0;
2814           next(pipeline=ingress, table=0);
2815       };
2816       icmp6 {
2817           icmp6.type = 2;
2818           icmp6.code = 0;
2819           icmp6.frag_mtu = M;
2820           eth.dst = E;
2821           ip6.dst = ip6.src;
2822           ip6.src = I;
2823           ip.ttl = 255;
2824           REGBIT_EGRESS_LOOPBACK = 1;
2825           REGBIT_PKT_LARGER = 0;
2826           next(pipeline=ingress, table=0);
2827       };
2828
2829
       where M is the fragment MTU, computed as the value of the
       options:gateway_mtu column of the Logical_Router_Port row minus
       58; E is the Ethernet address of the logical router port; and I
       is the IPv4/IPv6 address of the logical router port.
2837
2838       This  table  adds one priority-0 fallback flow that matches all packets
2839       and advances to the next table.
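
       The advertised frag_mtu follows directly from the configured
       MTU. A sketch per the definition of M above; the fixed 58-byte
       allowance is taken from the text, and the helper name is an
       assumption:

```python
def icmp_frag_mtu(gateway_mtu, allowance=58):
    # M, the MTU advertised in the ICMP "fragmentation needed" /
    # "packet too big" reply, is options:gateway_mtu minus a fixed
    # 58-byte allowance.
    return gateway_mtu - allowance
```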
2840
2841     Ingress Table 17: Gateway Redirect
2842
       For distributed logical routers where one or more of the logical
       router ports specifies a gateway chassis, this table redirects
       certain packets to the distributed gateway port instances on the
       gateway chassis. This table has the following flows:
2847
2848              •      For each NAT rule in the OVN Northbound database that can
2849                     be handled in a distributed manner, a priority-100  logi‐
2850                     cal  flow  with  match  ip4.src  == B && outport == GW &&
2851                     is_chassis_resident(P), where GW is  the  logical  router
2852                     distributed  gateway  port and P is the NAT logical port.
                     IP traffic matching the above rule will be managed
                     locally, setting reg1 to C and eth.src to D, where C
                     is the NAT external IP and D is the NAT external
                     MAC.
2856
2857              •      For each NAT rule in the OVN Northbound database that can
2858                     be handled in a distributed manner, a priority-80 logical
2859                     flow with drop action if the NAT logical port is  a  vir‐
2860                     tual port not claimed by any chassis yet.
2861
2862              •      A  priority-50  logical flow with match outport == GW has
2863                     actions outport = CR; next;,  where  GW  is  the  logical
2864                     router  distributed  gateway  port  and  CR  is the chas‐
2865                     sisredirect port representing the instance of the logical
2866                     router distributed gateway port on the gateway chassis.
2867
2868              •      A priority-0 logical flow with match 1 has actions next;.
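       The interaction of these priorities can be modeled as follows. This is
       an illustrative sketch, not ovn-northd code; the NAT-rule dictionary
       keys (nat_ip, resident_here, virtual, claimed, external_ip,
       external_mac) are assumptions made for the example:

```python
def gateway_redirect(pkt, nat_rules, gw, cr):
    """Model of the Gateway Redirect table's priority order."""
    # Priority-100: traffic for a distributed NAT rule is kept local,
    # setting reg1/eth.src to the NAT external IP and MAC.
    for nat in nat_rules:
        if (pkt.get("ip4.src") == nat.get("nat_ip")
                and pkt["outport"] == gw and nat.get("resident_here")):
            return {"reg1": nat["external_ip"],
                    "eth.src": nat["external_mac"], "action": "next"}
    # Priority-80: drop if the NAT logical port is a virtual port not
    # yet claimed by any chassis.
    for nat in nat_rules:
        if nat.get("virtual") and not nat.get("claimed"):
            return {"action": "drop"}
    # Priority-50: everything else bound for GW is redirected to the
    # chassisredirect port CR on the gateway chassis.
    if pkt["outport"] == gw:
        return {"outport": cr, "action": "next"}
    return {"action": "next"}  # priority-0 fallback
```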
2869
2870     Ingress Table 18: ARP Request
2871
2872       In  the  common  case where the Ethernet destination has been resolved,
2873       this table outputs the packet. Otherwise, it composes and sends an  ARP
2874       or IPv6 Neighbor Solicitation request. It holds the following flows:
2875
2876              •      Unknown MAC address. A priority-100 flow for IPv4 packets
2877                     with match eth.dst == 00:00:00:00:00:00 has the following
2878                     actions:
2879
2880                     arp {
2881                         eth.dst = ff:ff:ff:ff:ff:ff;
2882                         arp.spa = reg1;
2883                         arp.tpa = reg0;
2884                         arp.op = 1;  /* ARP request. */
2885                         output;
2886                     };
2887
2888
                     Unknown MAC address. For each IPv6 static route
                     associated with the router whose nexthop IP is G, a
                     priority-200 flow for IPv6 packets with match eth.dst ==
                     00:00:00:00:00:00 && xxreg0 == G is added with the
                     following actions:
2894
2895                     nd_ns {
2896                         eth.dst = E;
                         ip6.dst = I;
2898                         nd.target = G;
2899                         output;
2900                     };
2901
2902
                     Where E is the multicast MAC address derived from the
                     gateway IP, and I is the solicited-node multicast address
                     corresponding to the target address G.
2906
2907                     Unknown MAC address. A priority-100 flow for IPv6 packets
2908                     with match eth.dst == 00:00:00:00:00:00 has the following
2909                     actions:
2910
2911                     nd_ns {
2912                         nd.target = xxreg0;
2913                         output;
2914                     };
2915
2916
                     (The ingress IP Routing table initialized reg1 with the
                     IP address owned by outport and (xx)reg0 with the
                     next-hop IP address.)
2920
2921                     The  IP  packet  that triggers the ARP/IPv6 NS request is
2922                     dropped.
2923
2924              •      Known MAC address. A priority-0 flow with match 1 has ac‐
2925                     tions output;.
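       The decision made by this table can be sketched as follows (IPv6
       static-route priority-200 flows are omitted; this is a model, not
       ovn-northd code):

```python
def arp_request_action(pkt):
    """An all-zero destination MAC means address resolution has not
    happened yet, so an ARP or Neighbor Solicitation request is sent
    and the original packet is dropped; otherwise output directly."""
    if pkt["eth.dst"] == "00:00:00:00:00:00":
        # priority-100 flows: generate a resolution request.
        return "arp" if pkt["version"] == 4 else "nd_ns"
    return "output"  # priority-0: MAC already resolved
```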
2926
2927     Egress Table 0: UNDNAT
2928
       This table handles the reverse traffic of already established
       connections; i.e., DNAT has already been done in the ingress pipeline,
       and the packet has now entered the egress pipeline as part of a reply.
       This traffic is unDNATed here.
2933
              •      For all the configured load balancing rules for a router
                     with a gateway port in the OVN_Northbound database that
                     include an IPv4 address VIP, for every backend IPv4
                     address B defined for the VIP, a priority-120 flow is
                     programmed on the gateway chassis that matches ip &&
                     ip4.src == B && outport == GW, where GW is the logical
                     router gateway port, with an action ct_dnat;. If the
                     backend IPv4 address B is also configured with L4 port
                     PORT of protocol P, then the match also includes P.src ==
                     PORT. These flows are not added for load balancers with
                     IPv6 VIPs.

                     If the router is configured to force SNAT any
                     load-balanced packets, the above action is replaced by
                     flags.force_snat_for_lb = 1; ct_dnat;.
2948
2949              •      For  each  configuration  in  the OVN Northbound database
2950                     that asks to change  the  destination  IP  address  of  a
2951                     packet  from an IP address of A to B, a priority-100 flow
2952                     matches ip && ip4.src == B && outport == GW, where GW  is
2953                     the logical router gateway port, with an action ct_dnat;.
2954                     If the NAT rule is of type dnat_and_snat and  has  state‐
2955                     less=true  in  the  options,  then  the  action  would be
2956                     ip4/6.src= (B).
2957
2958                     If the NAT rule cannot be handled in a  distributed  man‐
2959                     ner,  then the priority-100 flow above is only programmed
2960                     on the gateway chassis.
2961
2962                     If the NAT rule can be handled in a  distributed  manner,
2963                     then  there  is an additional action eth.src = EA;, where
2964                     EA is the ethernet address associated with the IP address
2965                     A  in  the NAT rule. This allows upstream MAC learning to
2966                     point to the correct chassis.
2967
2968              •      For all IP packets, a priority-50  flow  with  an  action
2969                     flags.loopback = 1; ct_dnat;.
2970
2971              •      A priority-0 logical flow with match 1 has actions next;.
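       The construction of the priority-120 load-balancer match described
       above can be sketched as follows (a model of the quoted match text,
       not of ovn-northd's actual flow-building code):

```python
def undnat_lb_match(backend_ip, gw, proto=None, port=None):
    """Build the UNDNAT match for an IPv4 backend B behind gateway
    port GW; the L4 clause P.src == PORT is appended only when the
    backend is configured with a protocol and port."""
    match = f"ip && ip4.src == {backend_ip} && outport == {gw}"
    if proto is not None and port is not None:
        match += f" && {proto}.src == {port}"
    return match
```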
2972
2973     Egress Table 1: Post UNDNAT
2974
2975              •      A  priority-50 logical flow is added that commits any un‐
2976                     tracked flows from the previous table lr_out_undnat. This
2977                     flow  matches on ct.new && ip with action ct_commit { } ;
2978                     next; .
2979
2980              •      A priority-0 logical flow with match 1 has actions next;.
2981
2982     Egress Table 2: SNAT
2983
2984       Packets that are configured to be SNATed get their  source  IP  address
2985       changed based on the configuration in the OVN Northbound database.
2986
              •      A priority-120 flow to advance the IPv6 Neighbor
                     Solicitation packet to the next table to skip SNAT. In
                     the case where ovn-controller injects an IPv6 Neighbor
                     Solicitation packet (for the nd_ns action) we don’t want
                     the packet to go through conntrack.
2992
2993       Egress Table 2: SNAT on Gateway Routers
2994
2995              •      If  the Gateway router in the OVN Northbound database has
2996                     been configured to force SNAT a  packet  (that  has  been
2997                     previously  DNATted)  to  B,  a priority-100 flow matches
2998                     flags.force_snat_for_dnat ==  1  &&  ip  with  an  action
2999                     ct_snat(B);.
3000
3001              •      If  a  load balancer configured to skip snat has been ap‐
3002                     plied to the Gateway router pipeline, a priority-120 flow
3003                     matches  flags.skip_snat_for_lb == 1 && ip with an action
3004                     next;.
3005
              •      If the Gateway router in the OVN Northbound database has
                     been configured to force SNAT a packet (that has been
                     previously load-balanced) using the router IP (i.e.,
                     options:lb_force_snat_ip=router_ip), then for each
                     logical router port P attached to the Gateway router, a
                     priority-110 flow matches flags.force_snat_for_lb == 1 &&
                     outport == P with an action ct_snat(R);, where R is the
                     IP configured on the router port. If R is an IPv4 address
                     then the match also includes ip4, and if it is an IPv6
                     address, then the match also includes ip6.

                     If the logical router port P is configured with multiple
                     IPv4 and multiple IPv6 addresses, only the first IPv4 and
                     the first IPv6 address are considered.
3021
3022              •      If  the Gateway router in the OVN Northbound database has
3023                     been configured to force SNAT a  packet  (that  has  been
3024                     previously  load-balanced)  to  B,  a  priority-100  flow
3025                     matches flags.force_snat_for_lb == 1 && ip with an action
3026                     ct_snat(B);.
3027
              •      For each configuration in the OVN Northbound database
                     that asks to change the source IP address of a packet
                     from an IP address A, or to change the source IP address
                     of a packet that belongs to network A, to B, a flow
                     matches ip && ip4.src == A with an action ct_snat(B);.
                     The priority of the flow is calculated based on the mask
                     of A, with matches having larger masks getting higher
                     priorities. If the NAT rule is of type dnat_and_snat and
                     has stateless=true in the options, then the action is
                     ip4/6.src= (B).
3038
              •      If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.dst == allowed_ext_ips.
                     Similarly, for IPv6, the match would be ip6.dst ==
                     allowed_ext_ips.
3043
              •      If the NAT rule has exempted_ext_ips set, then there is
                     an additional flow configured at priority + 1 of the
                     corresponding NAT rule. The flow matches if the
                     destination IP is an exempted_ext_ip, and the action is
                     next;. This flow is used to bypass the ct_snat action for
                     a packet which is destined to exempted_ext_ips.
3050
3051              •      A priority-0 logical flow with match 1 has actions next;.
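       The first-address selection described for
       options:lb_force_snat_ip=router_ip can be sketched as follows (a
       model using Python's standard ipaddress module; the "addr/prefix"
       input format is an assumption of the example):

```python
import ipaddress

def lb_force_snat_ips(port_addresses):
    """Per the text above, when a router port has multiple addresses
    only the first IPv4 and the first IPv6 address are used as the
    ct_snat target R."""
    first4 = first6 = None
    for addr in port_addresses:
        ip = ipaddress.ip_interface(addr).ip
        if ip.version == 4 and first4 is None:
            first4 = str(ip)
        elif ip.version == 6 and first6 is None:
            first6 = str(ip)
    return first4, first6
```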
3052
3053       Egress Table 2: SNAT on Distributed Routers
3054
              •      For each configuration in the OVN Northbound database
                     that asks to change the source IP address of a packet
                     from an IP address A, or to change the source IP address
                     of a packet that belongs to network A, to B, a flow
                     matches ip && ip4.src == A && outport == GW, where GW is
                     the logical router gateway port, with an action
                     ct_snat(B);. The priority of the flow is calculated based
                     on the mask of A, with matches having larger masks
                     getting higher priorities. If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options, then
                     the action is ip4/6.src= (B).
3066
                     If the NAT rule cannot be handled in a distributed
                     manner, then the flow above is only programmed on the
                     gateway chassis, with the flow priority increased by 128
                     so that it is evaluated first.
3071
3072                     If  the  NAT rule can be handled in a distributed manner,
3073                     then there is an additional action eth.src =  EA;,  where
3074                     EA is the ethernet address associated with the IP address
3075                     A in the NAT rule. This allows upstream MAC  learning  to
3076                     point to the correct chassis.
3077
                     If the NAT rule has allowed_ext_ips configured, then
                     there is an additional match ip4.dst == allowed_ext_ips.
                     Similarly, for IPv6, the match would be ip6.dst ==
                     allowed_ext_ips.
3082
                     If the NAT rule has exempted_ext_ips set, then there is
                     an additional flow configured at priority + 1 of the
                     corresponding NAT rule. The flow matches if the
                     destination IP is an exempted_ext_ip, and the action is
                     next;. This flow is used to bypass the ct_snat action for
                     a flow which is destined to exempted_ext_ips.
3089
3090              •      A priority-0 logical flow with match 1 has actions next;.
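       The mask-based priority calculation, together with the +128 bump for
       flows pinned to the gateway chassis, can be sketched as follows. The
       base value (prefix length + 1) is an assumption made for the sketch,
       not the exact formula ovn-northd uses; only the ordering properties
       stated above are taken from the text:

```python
import ipaddress

def snat_flow_priority(network, gateway_only=False):
    """Larger masks (longer prefixes) yield higher priorities; a flow
    that must run only on the gateway chassis has its priority raised
    by 128 so it is evaluated first."""
    prio = ipaddress.ip_network(network).prefixlen + 1  # assumed base
    if gateway_only:
        prio += 128
    return prio
```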
3091
3092     Egress Table 3: Egress Loopback
3093
       This table is used for distributed logical routers where one of the
       logical router ports specifies a gateway chassis.
3096
3097       While UNDNAT and SNAT processing have already occurred by  this  point,
3098       this  traffic  needs  to be forced through egress loopback on this dis‐
3099       tributed gateway port instance, in order for UNSNAT and DNAT processing
3100       to  be applied, and also for IP routing and ARP resolution after all of
3101       the NAT processing, so that the packet can be forwarded to the destina‐
3102       tion.
3103
3104       This table has the following flows:
3105
              •      For each NAT rule in the OVN Northbound database on a
                     distributed router, a priority-100 logical flow with
                     match ip4.dst == E && outport == GW &&
                     is_chassis_resident(P), where E is the external IP
                     address specified in the NAT rule and GW is the logical
                     router distributed gateway port. For a dnat_and_snat NAT
                     rule, P is the logical port specified in the rule; if the
                     logical_port column of the NAT table is not set, then P
                     is the chassisredirect port of GW. The flow has the
                     following actions:
3115
3116                     clone {
3117                         ct_clear;
3118                         inport = outport;
3119                         outport = "";
3120                         flags = 0;
3121                         flags.loopback = 1;
3122                         reg0 = 0;
3123                         reg1 = 0;
3124                         ...
3125                         reg9 = 0;
3126                         REGBIT_EGRESS_LOOPBACK = 1;
3127                         next(pipeline=ingress, table=0);
3128                     };
3129
3130
                     flags.loopback is set since inport is unchanged and the
                     packet may return to that port after NAT processing.
                     REGBIT_EGRESS_LOOPBACK is set to indicate that egress
                     loopback has occurred, in order to skip the source IP
                     address check against the router address.
3136
3137              •      A priority-0 logical flow with match 1 has actions next;.
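       The state reset performed by the clone action above can be modeled as
       follows (a sketch of the quoted action list, not ovn-northd code; the
       dictionary representation is an assumption of the example):

```python
def egress_loopback_clone(outport):
    """Registers reg0..reg9 are zeroed, flags are cleared, loopback
    and REGBIT_EGRESS_LOOPBACK are set, inport takes the old outport,
    and the packet re-enters the ingress pipeline at table 0."""
    state = {f"reg{i}": 0 for i in range(10)}
    state.update({
        "inport": outport,   # inport = outport
        "outport": "",
        "flags": 0,
        "flags.loopback": 1,
        "REGBIT_EGRESS_LOOPBACK": 1,
        "next": ("ingress", 0),
    })
    return state
```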
3138
3139     Egress Table 4: Delivery
3140
3141       Packets that reach this table are ready for delivery. It contains:
3142
3143              •      Priority-110  logical flows that match IP multicast pack‐
3144                     ets on each enabled logical router port  and  modify  the
3145                     Ethernet  source  address  of the packets to the Ethernet
3146                     address of the port and then execute action output;.
3147
3148              •      Priority-100 logical flows that match packets on each en‐
3149                     abled logical router port, with action output;.
3150
3151
3152
3153OVN 21.09.0                       ovn-northd                     ovn-northd(8)