1ovn-northd(8)                 Open vSwitch Manual                ovn-northd(8)
2
3
4

NAME

6       ovn-northd - Open Virtual Network central control daemon
7

SYNOPSIS

9       ovn-northd [options]
10

DESCRIPTION

12       ovn-northd  is  a  centralized  daemon  responsible for translating the
13       high-level OVN configuration into logical configuration  consumable  by
14       daemons  such as ovn-controller. It translates the logical network con‐
15       figuration in terms of conventional network concepts,  taken  from  the
16       OVN Northbound Database (see ovn-nb(5)), into logical datapath flows in
17       the OVN Southbound Database (see ovn-sb(5)) below it.
18

OPTIONS

20       --ovnnb-db=database
21              The OVSDB database containing the OVN  Northbound  Database.  If
22              the  OVN_NB_DB environment variable is set, its value is used as
23              the default. Otherwise, the default is unix:/ovnnb_db.sock.
24
25       --ovnsb-db=database
26              The OVSDB database containing the OVN  Southbound  Database.  If
27              the  OVN_SB_DB environment variable is set, its value is used as
28              the default. Otherwise, the default is unix:/ovnsb_db.sock.
29
30       database in the above options must be an OVSDB active or  passive  con‐
31       nection method, as described in ovsdb(7).
32
33   Daemon Options
34       --pidfile[=pidfile]
35              Causes a file (by default, program.pid) to be created indicating
36              the PID of the running process. If the pidfile argument  is  not
37              specified, or if it does not begin with /, then it is created in
38              .
39
40              If --pidfile is not specified, no pidfile is created.
41
42       --overwrite-pidfile
43              By default, when --pidfile is specified and the  specified  pid‐
44              file already exists and is locked by a running process, the dae‐
45              mon refuses to start. Specify --overwrite-pidfile to cause it to
46              instead overwrite the pidfile.
47
48              When --pidfile is not specified, this option has no effect.
49
50       --detach
51              Runs  this  program  as a background process. The process forks,
52              and in the child it starts a new session,  closes  the  standard
53              file descriptors (which has the side effect of disabling logging
54              to the console), and changes its current directory to  the  root
55              (unless  --no-chdir is specified). After the child completes its
56              initialization, the parent exits.
57
58       --monitor
59              Creates an additional process to monitor  this  program.  If  it
60              dies  due  to a signal that indicates a programming error (SIGA‐
61              BRT, SIGALRM, SIGBUS, SIGFPE, SIGILL, SIGPIPE, SIGSEGV, SIGXCPU,
62              or SIGXFSZ) then the monitor process starts a new copy of it. If
63              the daemon dies or exits for another reason, the monitor process
64              exits.
65
66              This  option  is  normally used with --detach, but it also func‐
67              tions without it.
68
69       --no-chdir
70              By default, when --detach is specified, the daemon  changes  its
71              current  working  directory  to  the  root  directory  after  it
72              detaches. Otherwise, invoking the daemon from a carelessly  cho‐
73              sen  directory  would  prevent the administrator from unmounting
74              the file system that holds that directory.
75
76              Specifying --no-chdir suppresses this behavior,  preventing  the
77              daemon  from changing its current working directory. This may be
78              useful for collecting core files, since it is common behavior to
79              write core dumps into the current working directory and the root
80              directory is not a good directory to use.
81
82              This option has no effect when --detach is not specified.
83
84       --no-self-confinement
85              By default this daemon will try to self-confine itself  to  work
86              with  files  under  well-known  directories whitelisted at build
87              time. It is better to stick with this default behavior  and  not
88              to  use  this  flag  unless some other Access Control is used to
89              confine daemon. Note that in contrast to  other  access  control
90              implementations  that  are  typically enforced from kernel-space
91              (e.g. DAC or MAC), self-confinement is imposed  from  the  user-
92              space daemon itself and hence should not be considered as a full
93              confinement strategy, but instead should be viewed as  an  addi‐
94              tional layer of security.
95
96       --user=user:group
97              Causes  this  program  to  run  as a different user specified in
98              user:group, thus dropping most of  the  root  privileges.  Short
99              forms  user  and  :group  are also allowed, with current user or
100              group assumed, respectively. Only daemons started  by  the  root
101              user accepts this argument.
102
103              On   Linux,   daemons   will   be   granted   CAP_IPC_LOCK   and
104              CAP_NET_BIND_SERVICES before dropping root  privileges.  Daemons
105              that  interact  with  a  datapath, such as ovs-vswitchd, will be
106              granted three  additional  capabilities,  namely  CAP_NET_ADMIN,
107              CAP_NET_BROADCAST  and  CAP_NET_RAW.  The capability change will
108              apply even if the new user is root.
109
110              On Windows, this option is not currently supported. For security
111              reasons,  specifying  this  option will cause the daemon process
112              not to start.
113
114   Logging Options
115       -v[spec]
116       --verbose=[spec]
117            Sets logging levels. Without any spec,  sets  the  log  level  for
118            every  module and destination to dbg. Otherwise, spec is a list of
119            words separated by spaces or commas or colons, up to one from each
120            category below:
121
122            ·      A  valid module name, as displayed by the vlog/list command
123                   on ovs-appctl(8), limits the log level change to the speci‐
124                   fied module.
125
126            ·      syslog,  console, or file, to limit the log level change to
127                   only to the system log, to  the  console,  or  to  a  file,
128                   respectively.  (If --detach is specified, the daemon closes
129                   its standard file descriptors, so logging  to  the  console
130                   will have no effect.)
131
132                   On  Windows  platform,  syslog is accepted as a word and is
133                   only useful along with the --syslog-target option (the word
134                   has no effect otherwise).
135
136            ·      off,  emer,  err,  warn,  info,  or dbg, to control the log
137                   level. Messages of the given severity  or  higher  will  be
138                   logged,  and  messages  of  lower severity will be filtered
139                   out. off filters out all messages. See ovs-appctl(8) for  a
140                   definition of each log level.
141
142            Case is not significant within spec.
143
144            Regardless  of the log levels set for file, logging to a file will
145            not take place unless --log-file is also specified (see below).
146
147            For compatibility with older versions of OVS, any is accepted as a
148            word but has no effect.
149
150       -v
151       --verbose
152            Sets  the  maximum  logging  verbosity level, equivalent to --ver‐
153            bose=dbg.
154
155       -vPATTERN:destination:pattern
156       --verbose=PATTERN:destination:pattern
157            Sets  the  log  pattern  for  destination  to  pattern.  Refer  to
158            ovs-appctl(8) for a description of the valid syntax for pattern.
159
160       -vFACILITY:facility
161       --verbose=FACILITY:facility
162            Sets  the RFC5424 facility of the log message. facility can be one
163            of kern, user, mail, daemon, auth, syslog, lpr, news, uucp, clock,
164            ftp,  ntp,  audit,  alert, clock2, local0, local1, local2, local3,
165            local4, local5, local6 or local7. If this option is not specified,
166            daemon  is  used  as  the  default for the local system syslog and
167            local0 is used while sending a message to the target provided  via
168            the --syslog-target option.
169
170       --log-file[=file]
171            Enables  logging  to a file. If file is specified, then it is used
172            as the exact name for the log file. The default log file name used
173            if file is omitted is /var/log/ovn/program.log.
174
175       --syslog-target=host:port
176            Send  syslog messages to UDP port on host, in addition to the sys‐
177            tem syslog. The host must be a numerical IP address, not  a  host‐
178            name.
179
180       --syslog-method=method
181            Specify  method  as  how  syslog messages should be sent to syslog
182            daemon. The following forms are supported:
183
184            ·      libc, to use the libc syslog() function. Downside of  using
185                   this  options  is that libc adds fixed prefix to every mes‐
186                   sage before it is actually sent to the syslog  daemon  over
187                   /dev/log UNIX domain socket.
188
189            ·      unix:file, to use a UNIX domain socket directly. It is pos‐
190                   sible to specify arbitrary message format with this option.
191                   However,  rsyslogd  8.9  and  older versions use hard coded
192                   parser function anyway that limits UNIX domain socket  use.
193                   If  you  want  to  use  arbitrary message format with older
194                   rsyslogd versions, then use  UDP  socket  to  localhost  IP
195                   address instead.
196
197            ·      udp:ip:port,  to  use  a UDP socket. With this method it is
198                   possible to use arbitrary message format  also  with  older
199                   rsyslogd.  When  sending  syslog  messages  over UDP socket
200                   extra precaution needs to be taken into account, for  exam‐
201                   ple,  syslog daemon needs to be configured to listen on the
202                   specified UDP port,  accidental  iptables  rules  could  be
203                   interfering  with  local  syslog traffic and there are some
204                   security considerations that apply to UDP sockets,  but  do
205                   not apply to UNIX domain sockets.
206
207            ·      null, to discard all messages logged to syslog.
208
209            The  default is taken from the OVS_SYSLOG_METHOD environment vari‐
210            able; if it is unset, the default is libc.
211
212   PKI Options
213       PKI configuration is required in order to use SSL for  the  connections
214       to the Northbound and Southbound databases.
215
216              -p privkey.pem
217              --private-key=privkey.pem
218                   Specifies  a  PEM  file  containing the private key used as
219                   identity for outgoing SSL connections.
220
221              -c cert.pem
222              --certificate=cert.pem
223                   Specifies a PEM file containing a certificate  that  certi‐
224                   fies the private key specified on -p or --private-key to be
225                   trustworthy. The certificate must be signed by the certifi‐
226                   cate  authority  (CA) that the peer in SSL connections will
227                   use to verify it.
228
229              -C cacert.pem
230              --ca-cert=cacert.pem
231                   Specifies a PEM file containing the CA certificate for ver‐
232                   ifying certificates presented to this program by SSL peers.
233                   (This may be the same certificate that  SSL  peers  use  to
234                   verify the certificate specified on -c or --certificate, or
235                   it may be a different one, depending on the PKI  design  in
236                   use.)
237
238              -C none
239              --ca-cert=none
240                   Disables  verification  of  certificates  presented  by SSL
241                   peers. This introduces a security risk,  because  it  means
242                   that  certificates  cannot be verified to be those of known
243                   trusted hosts.
244
245   Other Options
246       --unixctl=socket
247              Sets the name of the control socket on which program listens for
248              runtime  management  commands  (see RUNTIME MANAGEMENT COMMANDS,
249              below). If socket does not begin with /, it  is  interpreted  as
250              relative  to  .  If  --unixctl  is  not used at all, the default
251              socket is /program.pid.ctl, where pid is program’s process ID.
252
253              On Windows a local named pipe is used to listen for runtime man‐
254              agement  commands.  A  file  is  created in the absolute path as
255              pointed by socket or if --unixctl is not used at all, a file  is
256              created  as  program in the configured OVS_RUNDIR directory. The
257              file exists just to mimic the behavior of a Unix domain socket.
258
259              Specifying none for socket disables the control socket feature.
260
261
262
263       -h
264       --help
265            Prints a brief help message to the console.
266
267       -V
268       --version
269            Prints version information to the console.
270

RUNTIME MANAGEMENT COMMANDS

272       ovs-appctl can send commands to a running ovn-northd process. The  cur‐
273       rently supported commands are described below.
274
275              exit   Causes ovn-northd to gracefully terminate.
276
277              pause  Pauses  the  ovn-northd  operation  from  processing  any
278                     Northbound and Southbound database changes.
279
280              resume Resumes the ovn-northd operation  to  process  Northbound
281                     and  Southbound  database  contents  and generate logical
282                     flows.
283
284              is-paused
285                     Returns "true" if ovn-northd is currently paused, "false"
286                     otherwise.
287

ACTIVE-STANDBY FOR HIGH AVAILABILITY

289       You  may  run  ovn-northd more than once in an OVN deployment. OVN will
290       automatically ensure that only one of them is active at a time. If mul‐
291       tiple  instances  of  ovn-northd  are running and the active ovn-northd
292       fails, one of the hot standby instances of  ovn-northd  will  automati‐
293       cally take over.
294
295    Active-Standby with multiple OVN DB servers
296       You may run multiple OVN DB servers in an OVN deployment with:
297
298              ·      OVN  DB  servers deployed in active/passive mode with one
299                     active and multiple passive ovsdb-servers.
300
301              ·      ovn-northd also deployed on all these nodes,  using  unix
302                     ctl sockets to connect to the local OVN DB servers.
303
304       In  such deployments, the ovn-northds on the passive nodes will process
305       the DB changes and compute  logical  flows  to  be  thrown  out  later,
306       because  write  transactions  are  not  allowed  by  the passive ovsdb-
307       servers. It results in unnecessary CPU usage.
308
309       With the help of  runtime  management  command  pause,  you  can  pause
310       ovn-northd  on these nodes. When a passive node becomes master, you can
311       use the runtime management command resume to resume the  ovn-northd  to
312       process the DB changes.
313

LOGICAL FLOW TABLE STRUCTURE

315       One  of the main purposes of ovn-northd is to populate the Logical_Flow
316       table in  the  OVN_Southbound  database.  This  section  describes  how
317       ovn-northd does this for switch and router logical datapaths.
318
319   Logical Switch Datapaths
320     Ingress Table 0: Admission Control and Ingress Port Security - L2
321
322       Ingress table 0 contains these logical flows:
323
324              ·      Priority 100 flows to drop packets with VLAN tags or mul‐
325                     ticast Ethernet source addresses.
326
327              ·      Priority 50 flows that implement  ingress  port  security
328                     for each enabled logical port. For logical ports on which
329                     port security is enabled, these match the inport and  the
330                     valid  eth.src address(es) and advance only those packets
331                     to the next flow table. For logical ports on  which  port
332                     security  is  not enabled, these advance all packets that
333                     match the inport.
334
335       There are no flows for disabled logical ports because the  default-drop
336       behavior  of  logical flow tables causes packets that ingress from them
337       to be dropped.
338
339     Ingress Table 1: Ingress Port Security - IP
340
341       Ingress table 1 contains these logical flows:
342
343              ·      For each element in the port security set having  one  or
344                     more IPv4 or IPv6 addresses (or both),
345
346                     ·      Priority  90  flow to allow IPv4 traffic if it has
347                            IPv4  addresses  which  match  the  inport,  valid
348                            eth.src and valid ip4.src address(es).
349
350                     ·      Priority  90  flow  to  allow  IPv4 DHCP discovery
351                            traffic if it has a valid eth.src. This is  neces‐
352                            sary  since  DHCP discovery messages are sent from
353                            the unspecified IPv4 address (0.0.0.0)  since  the
354                            IPv4 address has not yet been assigned.
355
356                     ·      Priority  90  flow to allow IPv6 traffic if it has
357                            IPv6  addresses  which  match  the  inport,  valid
358                            eth.src and valid ip6.src address(es).
359
360                     ·      Priority  90  flow  to  allow  IPv6 DAD (Duplicate
361                            Address Detection)  traffic  if  it  has  a  valid
362                            eth.src.  This  is  is necessary since DAD include
363                            requires joining an multicast  group  and  sending
364                            neighbor  solicitations  for  the  newly  assigned
365                            address. Since no address is yet  assigned,  these
366                            are sent from the unspecified IPv6 address (::).
367
368                     ·      Priority  80  flow to drop IP (both IPv4 and IPv6)
369                            traffic which match the inport and valid eth.src.
370
371              ·      One priority-0 fallback flow that matches all packets and
372                     advances to the next table.
373
374     Ingress Table 2: Ingress Port Security - Neighbor discovery
375
376       Ingress table 2 contains these logical flows:
377
378              ·      For each element in the port security set,
379
380                     ·      Priority  90 flow to allow ARP traffic which match
381                            the inport and valid eth.src and arp.sha.  If  the
382                            element  has  one  or more IPv4 addresses, then it
383                            also matches the valid arp.spa.
384
385                     ·      Priority 90 flow to allow IPv6 Neighbor  Solicita‐
386                            tion  and  Advertisement  traffic  which match the
387                            inport, valid eth.src and  nd.sll/nd.tll.  If  the
388                            element  has  one  or more IPv6 addresses, then it
389                            also matches the valid nd.target  address(es)  for
390                            Neighbor Advertisement traffic.
391
392                     ·      Priority  80  flow  to  drop ARP and IPv6 Neighbor
393                            Solicitation and Advertisement traffic which match
394                            the inport and valid eth.src.
395
396              ·      One priority-0 fallback flow that matches all packets and
397                     advances to the next table.
398
399     Ingress Table 3: from-lport Pre-ACLs
400
401       This table prepares flows  for  possible  stateful  ACL  processing  in
402       ingress  table  ACLs.  It  contains a priority-0 flow that simply moves
403       traffic to the next table. If stateful ACLs are  used  in  the  logical
404       datapath, a priority-100 flow is added that sets a hint (with reg0[0] =
405       1; next;) for table Pre-stateful to send IP packets to  the  connection
406       tracker  before  eventually advancing to ingress table ACLs. If special
407       ports such as route ports or localnet ports can’t use  ct(),  a  prior‐
408       ity-110 flow is added to skip over stateful ACLs.
409
410     Ingress Table 4: Pre-LB
411
412       This table prepares flows for possible stateful load balancing process‐
413       ing in ingress table LB and Stateful. It  contains  a  priority-0  flow
414       that  simply  moves  traffic  to the next table. Moreover it contains a
415       priority-110 flow to move IPv6 Neighbor Discovery traffic to  the  next
416       table.  If  load  balancing rules with virtual IP addresses (and ports)
417       are configured in OVN_Northbound database for a  logical  switch  data‐
418       path,  a  priority-100  flow  is  added  for each configured virtual IP
419       address VIP. For IPv4 VIPs, the match is ip && ip4.dst == VIP. For IPv6
420       VIPs,  the  match  is  ip  &&  ip6.dst  == VIP. The flow sets an action
421       reg0[0] = 1; next; to act as a hint for table Pre-stateful to  send  IP
422       packets  to  the  connection tracker for packet de-fragmentation before
423       eventually advancing to ingress table LB. If controller_event has  been
424       enabled and load balancing rules with empty backends have been added in
425       OVN_Northbound, a 130 flow is added to  trigger  ovn-controller  events
426       whenever  the  chassis  receives  a  packet for that particular VIP. If
427       event-elb meter has been previously created, it will be  associated  to
428       the empty_lb logical flow
429
430     Ingress Table 5: Pre-stateful
431
432       This  table prepares flows for all possible stateful processing in next
433       tables. It contains a priority-0 flow that simply moves traffic to  the
434       next table. A priority-100 flow sends the packets to connection tracker
435       based on a hint provided by the  previous  tables  (with  a  match  for
436       reg0[0] == 1) by using the ct_next; action.
437
438     Ingress table 6: from-lport ACLs
439
440       Logical flows in this table closely reproduce those in the ACL table in
441       the OVN_Northbound database for the from-lport direction. The  priority
442       values  from  the ACL table have a limited range and have 1000 added to
443       them to leave room for OVN default flows at both higher and lower  pri‐
444       orities.
445
446              ·      allow  ACLs  translate  into logical flows with the next;
447                     action. If there are any stateful ACLs on this  datapath,
448                     then allow ACLs translate to ct_commit; next; (which acts
449                     as a hint for the next tables to commit the connection to
450                     conntrack),
451
452              ·      allow-related  ACLs translate into logical flows with the
453                     ct_commit(ct_label=0/1); next; actions  for  new  connec‐
454                     tions and reg0[1] = 1; next; for existing connections.
455
456              ·      Other  ACLs  translate to drop; for new or untracked con‐
457                     nections and ct_commit(ct_label=1/1); for  known  connec‐
458                     tions.  Setting  ct_label  marks a connection as one that
459                     was previously allowed, but should no longer  be  allowed
460                     due to a policy change.
461
462       This  table  also contains a priority 0 flow with action next;, so that
463       ACLs allow packets by default. If the logical datapath has a  statetful
464       ACL, the following flows will also be added:
465
466              ·      A priority-1 flow that sets the hint to commit IP traffic
467                     to the connection  tracker  (with  action  reg0[1]  =  1;
468                     next;).  This  is  needed  for  the  default allow policy
469                     because, while the initiator’s direction may not have any
470                     stateful  rules,  the  server’s  may  and then its return
471                     traffic would not be known and marked as invalid.
472
473              ·      A priority-65535 flow that  allows  any  traffic  in  the
474                     reply  direction for a connection that has been committed
475                     to the connection tracker (i.e., established  flows),  as
476                     long as the committed flow does not have ct_label.blocked
477                     set. We only handle traffic in the reply  direction  here
478                     because  we  want all packets going in the request direc‐
479                     tion to still go through the  flows  that  implement  the
480                     currently  defined  policy based on ACLs. If a connection
481                     is no longer allowed by policy, ct_label.blocked will get
482                     set  and packets in the reply direction will no longer be
483                     allowed, either.
484
485              ·      A priority-65535 flow that allows  any  traffic  that  is
486                     considered  related to a committed flow in the connection
487                     tracker (e.g., an ICMP Port Unreachable from  a  non-lis‐
488                     tening  UDP port), as long as the committed flow does not
489                     have ct_label.blocked set.
490
491              ·      A priority-65535 flow that drops all  traffic  marked  by
492                     the connection tracker as invalid.
493
494              ·      A priority-65535 flow that drops all traffic in the reply
495                     direction with ct_label.blocked set meaning that the con‐
496                     nection  should  no  longer  be  allowed  due to a policy
497                     change. Packets in the request direction are skipped here
498                     to let a newly created ACL re-allow this connection.
499
500     Ingress Table 7: from-lport QoS Marking
501
502       Logical  flows  in  this table closely reproduce those in the QoS table
503       with the action column set  in  the  OVN_Northbound  database  for  the
504       from-lport direction.
505
506              ·      For  every  qos_rules entry in a logical switch with DSCP
507                     marking enabled, a flow will be  added  at  the  priority
508                     mentioned in the QoS table.
509
510              ·      One priority-0 fallback flow that matches all packets and
511                     advances to the next table.
512
513     Ingress Table 8: from-lport QoS Meter
514
515       Logical flows in this table closely reproduce those in  the  QoS  table
516       with  the  bandwidth  column set in the OVN_Northbound database for the
517       from-lport direction.
518
519              ·      For every qos_rules entry in a logical switch with meter‐
520                     ing  enabled,  a flow will be added at the priorirty men‐
521                     tioned in the QoS table.
522
523              ·      One priority-0 fallback flow that matches all packets and
524                     advances to the next table.
525
526     Ingress Table 9: LB
527
528       It contains a priority-0 flow that simply moves traffic to the next ta‐
529       ble. For established connections a priority 100 flow matches on  ct.est
530       &&  !ct.rel && !ct.new && !ct.inv and sets an action reg0[2] = 1; next;
531       to act as a hint for table Stateful to send packets through  connection
532       tracker  to  NAT the packets. (The packet will automatically get DNATed
533       to the same IP address as the first packet in that connection.)
534
535     Ingress Table 10: Stateful
536
537              ·      For all the configured load balancing rules for a  switch
538                     in  OVN_Northbound  database that includes a L4 port PORT
539                     of protocol P and IP address VIP, a priority-120 flow  is
540                     added.  For  IPv4 VIPs , the flow matches ct.new && ip &&
541                     ip4.dst == VIP && P && P.dst == PORT. For IPv6 VIPs,  the
542                     flow matches ct.new && ip && ip6.dst == VIP && P && P.dst
543                     == PORT. The flow’s action is ct_lb(args)  ,  where  args
544                     contains  comma separated IP addresses (and optional port
545                     numbers) to load balance to. The address family of the IP
546                     addresses  of  args  is the same as the address family of
547                     VIP
548
549              ·      For all the configured load balancing rules for a  switch
550                     in  OVN_Northbound  database  that  includes  just  an IP
551                     address VIP to match on, OVN adds  a  priority-110  flow.
552                     For  IPv4  VIPs, the flow matches ct.new && ip && ip4.dst
553                     == VIP. For IPv6 VIPs, the flow matches ct.new &&  ip  &&
554                     ip6.dst  ==  VIP. The action on this flow is ct_lb(args),
555                     where args contains comma separated IP addresses  of  the
556                     same address family as VIP.
557
558              ·      A priority-100 flow commits packets to connection tracker
559                     using ct_commit; next; action based on a hint provided by
560                     the previous tables (with a match for reg0[1] == 1).
561
562              ·      A  priority-100  flow  sends  the  packets  to connection
563                     tracker using ct_lb; as the action based on a  hint  pro‐
564                     vided by the previous tables (with a match for reg0[2] ==
565                     1).
566
567              ·      A priority-0 flow that simply moves traffic to  the  next
568                     table.
569
570     Ingress Table 11: ARP/ND responder
571
572       This  table  implements  ARP/ND responder in a logical switch for known
573       IPs. The advantage of the ARP responder flow is to limit ARP broadcasts
574       by locally responding to ARP requests without the need to send to other
575       hypervisors. One common case is when the inport is a logical port asso‐
576       ciated with a VIF and the broadcast is responded to on the local hyper‐
577       visor rather than broadcast across the whole network and  responded  to
578       by the destination VM. This behavior is proxy ARP.
579
580       ARP  requests  arrive  from  VMs  from  a logical switch inport of type
581       default. For this case, the logical switch proxy ARP rules can  be  for
582       other  VMs  or logical router ports. Logical switch proxy ARP rules may
583       be programmed both for mac binding of IP  addresses  on  other  logical
584       switch  VIF  ports  (which are of the default logical switch port type,
585       representing connectivity to VMs or containers), and for mac binding of
586       IP  addresses  on  logical switch router type ports, representing their
587       logical router port peers. In order to support proxy  ARP  for  logical
588       router  ports,  an  IP address must be configured on the logical switch
589       router type port, with the same value as the peer logical router  port.
590       The configured MAC addresses must match as well. When a VM sends an ARP
591       request for a distributed logical router port and if  the  peer  router
592       type  port  of  the attached logical switch does not have an IP address
593       configured, the ARP request will be broadcast on  the  logical  switch.
594       One of the copies of the ARP request will go through the logical switch
595       router type port to the logical  router  datapath,  where  the  logical
596       router  ARP  responder will generate a reply. The MAC binding of a dis‐
597       tributed logical router, once learned by an associated VM, is used  for
598       all  that VM’s communication needing routing. Hence, the action of a VM
599       re-arping for the mac binding of the  logical  router  port  should  be
600       rare.
601
602       Logical  switch  ARP  responder  proxy  ARP  rules can also be hit when
603       receiving ARP requests externally on a L2 gateway port. In  this  case,
604       the  hypervisor acting as an L2 gateway, responds to the ARP request on
605       behalf of a destination VM.
606
607       Note that ARP requests received from localnet or vtep  logical  inports
608       can either go directly to VMs, in which case the VM responds or can hit
609       an ARP responder for a logical router port if the  packet  is  used  to
610       resolve a logical router port next hop address. In either case, logical
611       switch ARP responder rules will not be hit. It contains  these  logical
612       flows:
613
614              ·      Priority-100 flows to skip the ARP responder if inport is
615                     of type localnet or vtep and  advances  directly  to  the
616                     next  table.  ARP requests sent to localnet or vtep ports
617                     can be received by multiple hypervisors. Now, because the
618                     same mac binding rules are downloaded to all hypervisors,
619                     each of the multiple hypervisors will respond. This  will
620                     confuse  L2  learning  on the source of the ARP requests.
621                     ARP requests received on an inport of type router are not
622                     expected  to  hit any logical switch ARP responder flows.
623                     However, no skip flows are installed for  these  packets,
624                     as  there would be some additional flow cost for this and
625                     the value appears limited.
626
627              ·      If inport V is of type virtual adds a priority-100  logi‐
628                     cal  flow  for  each P configured in the options:virtual-
629                     parents column with the match
630
631                     inport == P && && ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP))
632
633
634                     and applies the action
635
636                     bind_vport(V, inport);
637
638
639                     and advances the packet to the next table.
640
641                     Where VIP is the virtual  ip  configured  in  the  column
642                     options:virtual-ip.
643
644              ·      Priority-50  flows  that match ARP requests to each known
645                     IP address A of every logical switch  port,  and  respond
646                     with  ARP  replies  directly  with corresponding Ethernet
647                     address E:
648
649                     eth.dst = eth.src;
650                     eth.src = E;
651                     arp.op = 2; /* ARP reply. */
652                     arp.tha = arp.sha;
653                     arp.sha = E;
654                     arp.tpa = arp.spa;
655                     arp.spa = A;
656                     outport = inport;
657                     flags.loopback = 1;
658                     output;
659
660
661                     These flows are omitted for  logical  ports  (other  than
662                     router  ports  or  localport ports) that are down and for
663                     logical ports of type virtual.
664
665              ·      Priority-50 flows that match IPv6 ND  neighbor  solicita‐
666                     tions  to each known IP address A (and A’s solicited node
667                     address) of every logical  switch  port  except  of  type
668                     router, and respond with neighbor advertisements directly
669                     with corresponding Ethernet address E:
670
671                     nd_na {
672                         eth.src = E;
673                         ip6.src = A;
674                         nd.target = A;
675                         nd.tll = E;
676                         outport = inport;
677                         flags.loopback = 1;
678                         output;
679                     };
680
681
682                     Priority-50 flows that match IPv6 ND  neighbor  solicita‐
683                     tions  to each known IP address A (and A’s solicited node
684                     address) of logical  switch  port  of  type  router,  and
685                     respond with neighbor advertisements directly with corre‐
686                     sponding Ethernet address E:
687
688                     nd_na_router {
689                         eth.src = E;
690                         ip6.src = A;
691                         nd.target = A;
692                         nd.tll = E;
693                         outport = inport;
694                         flags.loopback = 1;
695                         output;
696                     };
697
698
699                     These flows are omitted for  logical  ports  (other  than
700                     router  ports  or  localport ports) that are down and for
701                     logical ports of type virtual.
702
703              ·      Priority-100 flows with match criteria like the  ARP  and
704                     ND  flows above, except that they only match packets from
705                     the inport that owns the IP addresses in  question,  with
706                     action  next;.  These flows prevent OVN from replying to,
707                     for example, an ARP request emitted by a VM for  its  own
708                     IP  address.  A  VM  only  makes  this kind of request to
709                     attempt to detect a duplicate IP address  assignment,  so
710                     sending a reply will prevent the VM from accepting the IP
711                     address that it owns.
712
713                     In place of next;, it would be reasonable  to  use  drop;
714                     for the flows’ actions. If everything is working as it is
715                     configured, then this would produce  equivalent  results,
716                     since no host should reply to the request. But ARPing for
717                     one’s own IP address is  intended  to  detect  situations
718                     where  the network is not working as configured, so drop‐
719                     ping the request would frustrate that intent.
720
721              ·      One priority-0 fallback flow that matches all packets and
722                     advances to the next table.
723
724     Ingress Table 12: DHCP option processing
725
726       This  table adds the DHCPv4 options to a DHCPv4 packet from the logical
727       ports configured with IPv4 address(es) and DHCPv4  options,  and  simi‐
728       larly  for  DHCPv6  options. This table also adds flows for the logical
729       ports of type external.
730
731              ·      A priority-100 logical flow is added  for  these  logical
732                     ports which matches the IPv4 packet with udp.src = 68 and
733                     udp.dst = 67 and applies  the  action  put_dhcp_opts  and
734                     advances the packet to the next table.
735
736                     reg0[3] = put_dhcp_opts(offer_ip = ip, options...);
737                     next;
738
739
740                     For  DHCPDISCOVER  and  DHCPREQUEST,  this transforms the
741                     packet into a DHCP reply, adds the DHCP offer IP  ip  and
742                     options  to  the  packet,  and stores 1 into reg0[3]. For
743                     other kinds of packets, it just stores  0  into  reg0[3].
744                     Either way, it continues to the next table.
745
746              ·      A  priority-100  logical  flow is added for these logical
747                     ports which matches the IPv6 packet with  udp.src  =  546
748                     and  udp.dst = 547 and applies the action put_dhcpv6_opts
749                     and advances the packet to the next table.
750
751                     reg0[3] = put_dhcpv6_opts(ia_addr = ip, options...);
752                     next;
753
754
755                     For DHCPv6 Solicit/Request/Confirm packets,  this  trans‐
756                     forms  the packet into a DHCPv6 Advertise/Reply, adds the
757                     DHCPv6 offer IP ip and options to the packet, and  stores
758                     1  into  reg0[3].  For  other  kinds  of packets, it just
759                     stores 0 into reg0[3]. Either way, it  continues  to  the
760                     next table.
761
762              ·      A priority-0 flow that matches all packets to advances to
763                     table 11.
764
765     Ingress Table 13: DHCP responses
766
767       This table implements DHCP responder for the DHCP replies generated  by
768       the previous table.
769
770              ·      A  priority  100  logical  flow  is added for the logical
771                     ports configured with DHCPv4 options which  matches  IPv4
772                     packets with udp.src == 68 && udp.dst == 67 && reg0[3] ==
773                     1 and responds back to the inport  after  applying  these
774                     actions. If reg0[3] is set to 1, it means that the action
775                     put_dhcp_opts was successful.
776
777                     eth.dst = eth.src;
778                     eth.src = E;
779                     ip4.dst = A;
780                     ip4.src = S;
781                     udp.src = 67;
782                     udp.dst = 68;
783                     outport = P;
784                     flags.loopback = 1;
785                     output;
786
787
788                     where E is the server MAC address and  S  is  the  server
789                     IPv4  address  defined in the DHCPv4 options and A is the
790                     IPv4 address defined in the logical port’s addresses col‐
791                     umn.
792
793                     (This  terminates  ingress  packet processing; the packet
794                     does not go to the next ingress table.)
795
796              ·      A priority 100 logical flow  is  added  for  the  logical
797                     ports  configured  with DHCPv6 options which matches IPv6
798                     packets with udp.src == 546 && udp.dst == 547 &&  reg0[3]
799                     == 1 and responds back to the inport after applying these
800                     actions. If reg0[3] is set to 1, it means that the action
801                     put_dhcpv6_opts was successful.
802
803                     eth.dst = eth.src;
804                     eth.src = E;
805                     ip6.dst = A;
806                     ip6.src = S;
807                     udp.src = 547;
808                     udp.dst = 546;
809                     outport = P;
810                     flags.loopback = 1;
811                     output;
812
813
814                     where  E  is  the  server MAC address and S is the server
815                     IPv6 LLA address generated from the server_id defined  in
816                     the  DHCPv6  options and A is the IPv6 address defined in
817                     the logical port’s addresses column.
818
819                     (This terminates packet processing; the packet  does  not
820                     go on the next ingress table.)
821
822              ·      A priority-0 flow that matches all packets to advances to
823                     table 12.
824
825     Ingress Table 14 DNS Lookup
826
827       This table looks up and resolves the DNS  names  to  the  corresponding
828       configured IP address(es).
829
830              ·      A priority-100 logical flow for each logical switch data‐
831                     path if it is configured with DNS records, which  matches
832                     the  IPv4  and IPv6 packets with udp.dst = 53 and applies
833                     the action dns_lookup and advances the packet to the next
834                     table.
835
836                     reg0[4] = dns_lookup(); next;
837
838
839                     For  valid DNS packets, this transforms the packet into a
840                     DNS reply if the DNS name can be resolved, and  stores  1
841                     into reg0[4]. For failed DNS resolution or other kinds of
842                     packets, it just stores 0 into reg0[4].  Either  way,  it
843                     continues to the next table.
844
845     Ingress Table 15 DNS Responses
846
847       This  table  implements  DNS responder for the DNS replies generated by
848       the previous table.
849
850              ·      A priority-100 logical flow for each logical switch data‐
851                     path  if it is configured with DNS records, which matches
852                     the IPv4 and IPv6 packets with udp.dst = 53 && reg0[4] ==
853                     1  and  responds  back to the inport after applying these
854                     actions. If reg0[4] is set to 1, it means that the action
855                     dns_lookup was successful.
856
857                     eth.dst <-> eth.src;
858                     ip4.src <-> ip4.dst;
859                     udp.dst = udp.src;
860                     udp.src = 53;
861                     outport = P;
862                     flags.loopback = 1;
863                     output;
864
865
866                     (This  terminates  ingress  packet processing; the packet
867                     does not go to the next ingress table.)
868
869     Ingress table 16 External ports
870
871       Traffic from the external logical  ports  enter  the  ingress  datapath
872       pipeline via the localnet port. This table adds the below logical flows
873       to handle the traffic from these ports.
874
875              ·      A priority-100 flow is added for  each  external  logical
876                     port  which  doesn’t  reside  on  a  chassis  to drop the
877                     ARP/IPv6 NS request to the router IP(s) (of  the  logical
878                     switch) which matches on the inport of the external logi‐
879                     cal port and the valid eth.src address(es) of the  exter‐
880                     nal logical port.
881
882                     This  flow  guarantees  that  the  ARP/NS  request to the
883                     router IP address from the external ports is responded by
884                     only  the chassis which has claimed these external ports.
885                     All the other chassis, drops these packets.
886
887              ·      A priority-0 flow that matches all packets to advances to
888                     table 17.
889
890     Ingress Table 17 Destination Lookup
891
892       This  table  implements  switching  behavior. It contains these logical
893       flows:
894
895              ·      A priority-100  flow  that  punts  all  IGMP  packets  to
896                     ovn-controller if IGMP snooping is enabled on the logical
897                     switch. The flow also forwards the IGMP  packets  to  the
898                     MC_MROUTER_STATIC multicast group, which ovn-northd popu‐
899                     lates with  all  the  logical  ports  that  have  options
900                     :mcast_flood_reports=’true’.
901
902              ·      Priority-90  flows  that  forward registered IP multicast
903                     traffic to their  corresponding  multicast  group,  which
904                     ovn-northd  creates  based  on learnt IGMP_Group entries.
905                     The flows also forward packets  to  the  MC_MROUTER_FLOOD
906                     multicast  group, which ovn-nortdh populates with all the
907                     logical ports that are connected to logical routers  with
908                     options:mcast_relay=’true’.
909
910              ·      A priority-85 flow that forwards all IP multicast traffic
911                     destined to 224.0.0.X to the  MC_FLOOD  multicast  group,
912                     which  ovn-northd  populates  with  all  enabled  logical
913                     ports.
914
915              ·      A priority-80 flow that forwards all unregistered IP mul‐
916                     ticast  traffic  to  the MC_STATIC multicast group, which
917                     ovn-northd populates with all the logical ports that have
918                     options   :mcast_flood=’true’.  The  flow  also  forwards
919                     unregistered IP multicast traffic to the MC_MROUTER_FLOOD
920                     multicast  group, which ovn-northd populates with all the
921                     logical ports connected  to  logical  routers  that  have
922                     options :mcast_relay=’true’.
923
924              ·      A  priority-80 flow that drops all unregistered IP multi‐
925                     cast  traffic  if  other_config  :mcast_snoop=’true’  and
926                     other_config  :mcast_flood_unregistered=’false’  and  the
927                     switch is not connected to  a  logical  router  that  has
928                     options  :mcast_relay=’true’  and the switch doesn’t have
929                     any logical port with options :mcast_flood=’true’.
930
931              ·      A priority-70 flow that outputs all packets with an  Eth‐
932                     ernet broadcast or multicast eth.dst to the MC_FLOOD mul‐
933                     ticast group.
934
935              ·      One priority-50 flow that  matches  each  known  Ethernet
936                     address  against  eth.dst  and  outputs the packet to the
937                     single associated output port.
938
939                     For the Ethernet address on a logical switch port of type
940                     router,  when that logical switch port’s addresses column
941                     is set to router and the connected  logical  router  port
942                     specifies a redirect-chassis:
943
944                     ·      The  flow  for the connected logical router port’s
945                            Ethernet address is only programmed on  the  redi‐
946                            rect-chassis.
947
948                     ·      If  the  logical router has rules specified in nat
949                            with external_mac, then those addresses  are  also
950                            used  to  populate the switch’s destination lookup
951                            on the chassis where logical_port is resident.
952
953                     For the Ethernet address on a logical switch port of type
954                     router,  when that logical switch port’s addresses column
955                     is set to router and the connected  logical  router  port
956                     specifies  a  reside-on-redirect-chassis  and the logical
957                     router to which the connected logical router port belongs
958                     to  has  a  redirect-chassis  distributed gateway logical
959                     router port:
960
961                     ·      The flow for the connected logical  router  port’s
962                            Ethernet  address  is only programmed on the redi‐
963                            rect-chassis.
964
965              ·      One priority-0 fallback flow that matches all packets and
966                     outputs  them  to  the  MC_UNKNOWN multicast group, which
967                     ovn-northd populates with all enabled logical ports  that
968                     accept  unknown destination packets. As a small optimiza‐
969                     tion, if no  logical  ports  accept  unknown  destination
970                     packets,  ovn-northd omits this multicast group and logi‐
971                     cal flow.
972
973     Egress Table 0: Pre-LB
974
975       This table is similar to ingress table Pre-LB. It contains a priority-0
976       flow  that simply moves traffic to the next table. Moreover it contains
977       a priority-110 flow to move IPv6 Neighbor Discovery traffic to the next
978       table.  If  any  load  balancing rules exist for the datapath, a prior‐
979       ity-100 flow is added with a match of ip and action  of  reg0[0]  =  1;
980       next; to act as a hint for table Pre-stateful to send IP packets to the
981       connection tracker for packet de-fragmentation.
982
983     Egress Table 1: to-lport Pre-ACLs
984
985       This is similar to ingress table Pre-ACLs except for to-lport traffic.
986
987     Egress Table 2: Pre-stateful
988
989       This is similar to ingress table Pre-stateful.
990
991     Egress Table 3: LB
992
993       This is similar to ingress table LB.
994
995     Egress Table 4: to-lport ACLs
996
997       This is similar to ingress table ACLs except for to-lport ACLs.
998
999       In addition, the following flows are added.
1000
1001              ·      A priority 34000 logical flow is added for  each  logical
1002                     port which has DHCPv4 options defined to allow the DHCPv4
1003                     reply packet and which  has  DHCPv6  options  defined  to
1004                     allow  the DHCPv6 reply packet from the Ingress Table 13:
1005                     DHCP responses.
1006
1007              ·      A priority 34000 logical flow is added for  each  logical
1008                     switch  datapath  configured  with  DNS  records with the
1009                     match udp.dst = 53 to allow the DNS reply packet from the
1010                     Ingress Table 15:DNS responses.
1011
1012     Egress Table 5: to-lport QoS Marking
1013
1014       This  is  similar  to  ingress  table  QoS marking except they apply to
1015       to-lport QoS rules.
1016
1017     Egress Table 6: to-lport QoS Meter
1018
1019       This is similar to  ingress  table  QoS  meter  except  they  apply  to
1020       to-lport QoS rules.
1021
1022     Egress Table 7: Stateful
1023
1024       This  is  similar  to  ingress  table Stateful except that there are no
1025       rules added for load balancing new connections.
1026
1027     Egress Table 8: Egress Port Security - IP
1028
1029       This is similar to the port security logic in table Ingress Port  Secu‐
1030       rity - IP except that outport, eth.dst, ip4.dst and ip6.dst are checked
1031       instead of inport, eth.src, ip4.src and ip6.src
1032
1033     Egress Table 9: Egress Port Security - L2
1034
1035       This is similar to the ingress port security  logic  in  ingress  table
1036       Admission  Control  and  Ingress Port Security - L2, but with important
1037       differences. Most obviously, outport and eth.dst are checked instead of
1038       inport  and eth.src. Second, packets directed to broadcast or multicast
1039       eth.dst are always accepted instead of being subject to the port  secu‐
1040       rity  rules;  this  is  implemented  through  a  priority-100 flow that
1041       matches on eth.mcast with action output;. Moreover, to ensure that even
1042       broadcast  and  multicast packets are not delivered to disabled logical
1043       ports, a priority-150 flow for each disabled logical outport  overrides
1044       the  priority-100  flow  with a drop; action. Finally if egress qos has
1045       been enabled on a localnet port, the outgoing queue id is  set  through
1046       set_queue  action.  Please  remember to mark the corresponding physical
1047       interface with ovn-egress-iface set to true in external_ids
1048
1049   Logical Router Datapaths
1050       Logical router datapaths will only exist for Logical_Router rows in the
1051       OVN_Northbound database that do not have enabled set to false
1052
1053     Ingress Table 0: L2 Admission Control
1054
1055       This  table drops packets that the router shouldn’t see at all based on
1056       their Ethernet headers. It contains the following flows:
1057
1058              ·      Priority-100 flows to drop packets with VLAN tags or mul‐
1059                     ticast Ethernet source addresses.
1060
1061              ·      For each enabled router port P with Ethernet address E, a
1062                     priority-50 flow that matches inport == P  &&  (eth.mcast
1063                     || eth.dst == E), with action next;.
1064
1065                     For  the  gateway  port  on  a distributed logical router
1066                     (where one of the logical router ports specifies a  redi‐
1067                     rect-chassis),  the  above  flow matching eth.dst == E is
1068                     only programmed on the gateway port instance on the redi‐
1069                     rect-chassis.
1070
1071              ·      For  each  dnat_and_snat NAT rule on a distributed router
1072                     that specifies an external Ethernet address E,  a  prior‐
1073                     ity-50  flow  that  matches inport == GW && eth.dst == E,
1074                     where GW is the logical router gateway port, with  action
1075                     next;.
1076
1077                     This flow is only programmed on the gateway port instance
1078                     on the chassis where the logical_port  specified  in  the
1079                     NAT rule resides.
1080
1081       Other packets are implicitly dropped.
1082
1083     Ingress Table 1: Neighbor lookup
1084
1085       For  ARP and IPv6 Neighbor Discovery packets, this table looks into the
1086       MAC_Binding records to determine if OVN needs to learn  the  mac  bind‐
1087       ings. Following flows are added:
1088
1089              ·      For  each  router  port  P  that owns IP address A, which
1090                     belongs to subnet S with prefix length L, a  priority-100
1091                     flow is added which matches inport == P && arp.spa == S/L
1092                     && arp.op == 1 (ARP request) with the following actions:
1093
1094                     reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
1095                     next;
1096
1097
1098                     If the logical router port P  is  a  distributed  gateway
1099                     router  port,  additional match is_chassis_resident(cr-P)
1100                     is added so that the resident gateway chassis handles the
1101                     neighbor lookup.
1102
1103              ·      A  priority-100  flow  which matches on ARP reply packets
1104                     and applies the actions:
1105
1106                     reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
1107                     next;
1108
1109
1110              ·      A priority-100 flow which matches on IPv6  Neighbor  Dis‐
1111                     covery advertisement packet and applies the actions:
1112
1113                     reg9[4] = lookup_nd(inport, nd.target, nd.tll);
1114                     next;
1115
1116
1117              ·      A  priority-100  flow which matches on IPv6 Neighbor Dis‐
1118                     covery solicitation packet and applies the actions:
1119
1120                     reg9[4] = lookup_nd(inport, ip6.src, nd.sll);
1121                     next;
1122
1123
1124              ·      A priority-0 fallback flow that matches all  packets  and
1125                     applies  the  action  reg9[5]  =  1;  next; advancing the
1126                     packet to the next table.
1127
1128     Ingress Table 2: Neighbor learning
1129
1130       This table adds flows to learn the mac bindings from the ARP  and  IPv6
1131       Neighbor  Solicitation/Advertisement packets if ARP/ND lookup failed in
1132       the previous table.
1133
1134       reg9[4] will be 1 if the lookup_arp/lookup_nd in the previous table was
1135       successful.
1136
1137       reg9[5] will be 1 if there was no need to do the lookup.
1138
1139              ·      A  priority-100  flow  with  the  match  reg9[4]  == 1 ||
1140                     reg9[5] == 1 and advances the packet to the next table as
1141                     there is no need to learn the neighbor.
1142
1143              ·      A  priority-90  flow  with  the match arp and applies the
1144                     action put_arp(inport, arp.spa, arp.sha); next;
1145
1146              ·      A priority-90 flow with the match nd_na and  applies  the
1147                     action put_nd(inport, nd.target, nd.tll); next;
1148
1149              ·      A  priority-90  flow with the match nd_ns and applies the
1150                     action put_nd(inport, ip6.src, nd.sll); next;
1151
1152     Ingress Table 3: IP Input
1153
1154       This table is the core of the logical router datapath functionality. It
1155       contains  the following flows to implement very basic IP host function‐
1156       ality.
1157
1158              ·      L3 admission control: A priority-100 flow  drops  packets
1159                     that match any of the following:
1160
1161                     ·      ip4.src[28..31] == 0xe (multicast source)
1162
1163                     ·      ip4.src == 255.255.255.255 (broadcast source)
1164
1165                     ·      ip4.src  ==  127.0.0.0/8 || ip4.dst == 127.0.0.0/8
1166                            (localhost source or destination)
1167
1168                     ·      ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8 (zero
1169                            network source or destination)
1170
1171                     ·      ip4.src  or ip6.src is any IP address owned by the
1172                            router, unless the packet was recirculated due  to
1173                            egress    loopback    as    indicated    by   REG‐
1174                            BIT_EGRESS_LOOPBACK.
1175
1176                     ·      ip4.src is the broadcast address of any IP network
1177                            known to the router.
1178
1179              ·      A   priority-95  flow  allows  IP  multicast  traffic  if
1180                     options:mcast_relay=’true’, otherwise drops it.
1181
1182              ·      ICMP echo reply. These flows reply to ICMP echo  requests
1183                     received  for  the  router’s  IP  address. Let A be an IP
1184                     address owned by a router port. Then, for each A that  is
1185                     an IPv4 address, a priority-90 flow matches on ip4.dst ==
1186                     A and icmp4.type == 8  &&  icmp4.code  ==  0  (ICMP  echo
1187                     request).  For  each  A that is an IPv6 address, a prior‐
1188                     ity-90 flow matches on ip6.dst == A and icmp6.type == 128
1189                     && icmp6.code == 0 (ICMPv6 echo request). The port of the
1190                     router that receives the echo request  does  not  matter.
1191                     Also,  the  ip.ttl  of  the  echo  request  packet is not
1192                     checked, so it complies with RFC 1812,  section  4.2.2.9.
1193                     Flows for ICMPv4 echo requests use the following actions:
1194
1195                     ip4.dst <-> ip4.src;
1196                     ip.ttl = 255;
1197                     icmp4.type = 0;
1198                     flags.loopback = 1;
1199                     next;
1200
1201
1202                     Flows for ICMPv6 echo requests use the following actions:
1203
1204                     ip6.dst <-> ip6.src;
1205                     ip.ttl = 255;
1206                     icmp6.type = 129;
1207                     flags.loopback = 1;
1208                     next;
1209
1210
1211              ·      Reply to ARP requests.
1212
1213                     These flows reply to ARP requests for the router’s own IP
1214                     address.  The  ARP  requests  are  handled  only  if  the
1215                     requestor’s IP belongs to the same subnets of the logical
1216                     router port. For each router port P that owns IP  address
1217                     A,  which  belongs  to subnet S with prefix length L, and
1218                     Ethernet address E, a priority-90 flow matches inport  ==
1219                     P  &&  arp.spa == S/L && arp.op == 1 && arp.tpa == A (ARP
1220                     request) with the following actions:
1221
1222                     eth.dst = eth.src;
1223                     eth.src = E;
1224                     arp.op = 2; /* ARP reply. */
1225                     arp.tha = arp.sha;
1226                     arp.sha = E;
1227                     arp.tpa = arp.spa;
1228                     arp.spa = A;
1229                     outport = P;
1230                     flags.loopback = 1;
1231                     output;
1232
1233
1234                     For the gateway port  on  a  distributed  logical  router
1235                     (where  one of the logical router ports specifies a redi‐
1236                     rect-chassis), the above flows are only programmed on the
1237                     gateway  port  instance  on  the  redirect-chassis.  This
1238                     behavior avoids generation of multiple ARP responses from
1239                     different  chassis,  and  allows upstream MAC learning to
1240                     point to the redirect-chassis.
1241
1242                     For the logical router port with the option reside-on-re‐
1243                     direct-chassis  set  (which  is  centralized),  the above
1244                     flows are only programmed on the gateway port instance on
1245                     the  redirect-chassis  (if  the logical router has a dis‐
1246                     tributed gateway port). This behavior  avoids  generation
1247                     of  multiple  ARP  responses  from different chassis, and
1248                     allows upstream  MAC  learning  to  point  to  the  redi‐
1249                     rect-chassis.
1250
1251              ·      These  flows  reply  to  ARP  requests for the virtual IP
1252                     addresses configured in the router for DNAT or load  bal‐
1253                     ancing.  For  a configured DNAT IP address or a load bal‐
1254                     ancer IPv4 VIP A, for each router port  P  with  Ethernet
1255                     address  E,  a  priority-90  flow  matches inport == P &&
1256                     arp.op == 1 && arp.tpa == A (ARP request) with  the  fol‐
1257                     lowing actions:
1258
1259                     eth.dst = eth.src;
1260                     eth.src = E;
1261                     arp.op = 2; /* ARP reply. */
1262                     arp.tha = arp.sha;
1263                     arp.sha = E;
1264                     arp.tpa = arp.spa;
1265                     arp.spa = A;
1266                     outport = P;
1267                     flags.loopback = 1;
1268                     output;
1269
1270
1271                     For the gateway port on a distributed logical router with
1272                     NAT (where one of the logical router  ports  specifies  a
1273                     redirect-chassis):
1274
1275                     ·      If the corresponding NAT rule cannot be handled in
1276                            a distributed manner, then this flow is only  pro‐
1277                            grammed  on the gateway port instance on the redi‐
1278                            rect-chassis. This behavior avoids  generation  of
1279                            multiple ARP responses from different chassis, and
1280                            allows upstream MAC learning to point to the redi‐
1281                            rect-chassis.
1282
1283                     ·      If  the corresponding NAT rule can be handled in a
1284                            distributed manner, then this flow  is  only  pro‐
1285                            grammed  on  the  gateway  port instance where the
1286                            logical_port specified in the NAT rule resides.
1287
1288                            Some of the actions are different for  this  case,
1289                            using  the  external_mac specified in the NAT rule
1290                            rather than the gateway port’s Ethernet address E:
1291
1292                            eth.src = external_mac;
1293                            arp.sha = external_mac;
1294
1295
1296                            This behavior avoids generation  of  multiple  ARP
1297                            responses   from  different  chassis,  and  allows
1298                            upstream MAC learning  to  point  to  the  correct
1299                            chassis.
1300
1301              ·      Reply  to  IPv6 Neighbor Solicitations. These flows reply
1302                     to Neighbor Solicitation requests for  the  router’s  own
1303                     IPv6  address  and  load balancing IPv6 VIPs and populate
1304                     the logical router’s mac binding table.
1305
1306                     For  each  router  port  P  that  owns  IPv6  address  A,
1307                     solicited  node address S, and Ethernet address E, a pri‐
1308                     ority-90 flow matches inport == P && nd_ns && ip6.dst  ==
1309                     {A, E} && nd.target == A with the following actions:
1310
1311                     nd_na_router {
1312                         eth.src = E;
1313                         ip6.src = A;
1314                         nd.target = A;
1315                         nd.tll = E;
1316                         outport = inport;
1317                         flags.loopback = 1;
1318                         output;
1319                     };
1320
1321
1322                     For  each  router  port  P that has load balancing VIP A,
1323                     solicited node address S, and Ethernet address E, a  pri‐
1324                     ority-90  flow matches inport == P && nd_ns && ip6.dst ==
1325                     {A, E} && nd.target == A with the following actions:
1326
1327                     nd_na {
1328                         eth.src = E;
1329                         ip6.src = A;
1330                         nd.target = A;
1331                         nd.tll = E;
1332                         outport = inport;
1333                         flags.loopback = 1;
1334                         output;
1335                     };
1336
1337
1338                     For the gateway port  on  a  distributed  logical  router
1339                     (where  one of the logical router ports specifies a redi‐
1340                     rect-chassis), the above flows replying to IPv6  Neighbor
1341                     Solicitations  are  only  programmed  on the gateway port
1342                     instance on the redirect-chassis.  This  behavior  avoids
1343                     generation  of  multiple  replies from different chassis,
1344                     and allows upstream MAC learning to point  to  the  redi‐
1345                     rect-chassis.
1346
1347              ·      Priority-85  flows  which drops the ARP and IPv6 Neighbor
1348                     Discovery packets.
1349
1350              ·      UDP port unreachable.  Priority-80  flows  generate  ICMP
1351                     port  unreachable  messages  in  reply  to  UDP datagrams
1352                     directed to the router’s IP address, except in  the  spe‐
1353                     cial case of gateways, which accept traffic directed to a
1354                     router IP for load balancing and NAT purposes.
1355
1356                     These flows should not match IP  fragments  with  nonzero
1357                     offset.
1358
1359              ·      TCP  reset. Priority-80 flows generate TCP reset messages
1360                     in reply to TCP datagrams directed  to  the  router’s  IP
1361                     address,  except  in  the special case of gateways, which
1362                     accept traffic directed to a router IP for load balancing
1363                     and NAT purposes.
1364
1365                     These  flows  should  not match IP fragments with nonzero
1366                     offset.
1367
1368              ·      Protocol or address unreachable. Priority-70 flows gener‐
1369                     ate  ICMP  protocol  or  address unreachable messages for
1370                     IPv4 and IPv6 respectively in reply to  packets  directed
1371                     to  the  router’s  IP  address on IP protocols other than
1372                     UDP, TCP, and ICMP, except in the special case  of  gate‐
1373                     ways,  which  accept  traffic directed to a router IP for
1374                     load balancing purposes.
1375
1376                     These flows should not match IP  fragments  with  nonzero
1377                     offset.
1378
1379              ·      Drop  other  IP  traffic to this router. These flows drop
1380                     any other traffic destined  to  an  IP  address  of  this
1381                     router  that  is  not already handled by one of the flows
1382                     above, which amounts to ICMP (other than  echo  requests)
1383                     and fragments with nonzero offsets. For each IP address A
1384                     owned by the router, a priority-60 flow  matches  ip4.dst
1385                     ==  A and drops the traffic. An exception is made and the
1386                     above flow is not added  if  the  router  port’s  own  IP
1387                     address  is  used  to  SNAT  packets passing through that
1388                     router.
1389
1390       The flows above handle all of the traffic that might be directed to the
1391       router  itself.  The following flows (with lower priorities) handle the
1392       remaining traffic, potentially for forwarding:
1393
1394              ·      Drop Ethernet local broadcast. A  priority-50  flow  with
1395                     match  eth.bcast drops traffic destined to the local Eth‐
1396                     ernet  broadcast  address.  By  definition  this  traffic
1397                     should not be forwarded.
1398
1399              ·      ICMP  time  exceeded.  For  each  router port P, whose IP
1400                     address is A, a priority-40 flow with match inport  ==  P
1401                     &&  ip.ttl  ==  {0,  1} && !ip.later_frag matches packets
1402                     whose TTL has expired, with the following actions to send
1403                     an  ICMP  time  exceeded  reply for IPv4 and IPv6 respec‐
1404                     tively:
1405
1406                     icmp4 {
1407                         icmp4.type = 11; /* Time exceeded. */
1408                         icmp4.code = 0;  /* TTL exceeded in transit. */
1409                         ip4.dst = ip4.src;
1410                         ip4.src = A;
1411                         ip.ttl = 255;
1412                         next;
1413                     };
1414                     icmp6 {
1415                         icmp6.type = 3; /* Time exceeded. */
1416                         icmp6.code = 0;  /* TTL exceeded in transit. */
1417                         ip6.dst = ip6.src;
1418                         ip6.src = A;
1419                         ip.ttl = 255;
1420                         next;
1421                     };
1422
1423
1424              ·      TTL discard. A priority-30 flow with match ip.ttl ==  {0,
1425                     1}  and  actions  drop; drops other packets whose TTL has
1426                     expired, that should not receive a ICMP error reply (i.e.
1427                     fragments with nonzero offset).
1428
1429              ·      Next  table.  A  priority-0  flows match all packets that
1430                     aren’t already handled and uses  actions  next;  to  feed
1431                     them to the next table.
1432
1433     Ingress Table 4: DEFRAG
1434
1435       This  is to send packets to connection tracker for tracking and defrag‐
1436       mentation. It contains a priority-0 flow that simply moves  traffic  to
1437       the  next table. If load balancing rules with virtual IP addresses (and
1438       ports) are configured in OVN_Northbound database for a Gateway  router,
1439       a  priority-100  flow  is  added for each configured virtual IP address
1440       VIP. For IPv4 VIPs the flow matches ip &&  ip4.dst  ==  VIP.  For  IPv6
1441       VIPs,  the  flow matches ip && ip6.dst == VIP. The flow uses the action
1442       ct_next; to send IP packets to the connection tracker  for  packet  de-
1443       fragmentation and tracking before sending it to the next table.
1444
1445     Ingress Table 3: UNSNAT
1446
1447       This  is  for  already  established connections’ reverse traffic. i.e.,
1448       SNAT has already been done in egress pipeline and now  the  packet  has
1449       entered the ingress pipeline as part of a reply. It is unSNATted here.
1450
1451       Ingress Table 3: UNSNAT on Gateway Routers
1452
1453              ·      If  the  Gateway router has been configured to force SNAT
1454                     any previously DNATted packets to B, a priority-110  flow
1455                     matches ip && ip4.dst == B with an action ct_snat; .
1456
1457                     If  the  Gateway router has been configured to force SNAT
1458                     any previously load-balanced packets to B, a priority-100
1459                     flow matches ip && ip4.dst == B with an action ct_snat; .
1460
1461                     For  each  NAT  configuration in the OVN Northbound data‐
1462                     base, that asks to change the  source  IP  address  of  a
1463                     packet  from  A  to  B,  a priority-90 flow matches ip &&
1464                     ip4.dst == B with an action ct_snat; .
1465
1466                     A priority-0 logical flow with match 1 has actions next;.
1467
1468       Ingress Table 5: UNSNAT on Distributed Routers
1469
1470              ·      For each configuration in the  OVN  Northbound  database,
1471                     that  asks  to  change  the source IP address of a packet
1472                     from A to B, a priority-100 flow matches ip && ip4.dst ==
1473                     B && inport == GW, where GW is the logical router gateway
1474                     port, with an action ct_snat;.
1475
1476                     If the NAT rule cannot be handled in a  distributed  man‐
1477                     ner,  then the priority-100 flow above is only programmed
1478                     on the redirect-chassis.
1479
1480                     For each configuration in the  OVN  Northbound  database,
1481                     that  asks  to  change  the source IP address of a packet
1482                     from A to B, a priority-50 flow matches ip && ip4.dst  ==
1483                     B  with  an  action  REGBIT_NAT_REDIRECT = 1; next;. This
1484                     flow is for east/west traffic to a NAT  destination  IPv4
1485                     address.  By setting the REGBIT_NAT_REDIRECT flag, in the
1486                     ingress table Gateway Redirect this will trigger a  redi‐
1487                     rect  to  the  instance  of the gateway port on the redi‐
1488                     rect-chassis.
1489
1490                     A priority-0 logical flow with match 1 has actions next;.
1491
1492     Ingress Table 6: DNAT
1493
1494       Packets enter the pipeline with destination IP address that needs to be
1495       DNATted  from a virtual IP address to a real IP address. Packets in the
1496       reverse direction needs to be unDNATed.
1497
1498       Ingress Table 6: Load balancing DNAT rules
1499
1500       Following load balancing DNAT flows are added  for  Gateway  router  or
1501       Router  with gateway port. These flows are programmed only on the redi‐
1502       rect-chassis. These flows do not get programmed for load balancers with
1503       IPv6 VIPs.
1504
1505              ·      If  controller_event has been enabled for all the config‐
1506                     ured load balancing rules for a Gateway router or  Router
1507                     with  gateway  port  in OVN_Northbound database that does
1508                     not have configured  backends,  a  priority-130  flow  is
1509                     added to trigger ovn-controller events whenever the chas‐
1510                     sis  receives  a  packet  for  that  particular  VIP.  If
1511                     event-elb  meter  has been previously created, it will be
1512                     associated to the empty_lb logical flow
1513
1514              ·      For all the configured load balancing rules for a Gateway
1515                     router  or  Router  with  gateway  port in OVN_Northbound
1516                     database that includes a L4 port PORT of protocol  P  and
1517                     IPv4  address  VIP,  a  priority-120 flow that matches on
1518                     ct.new && ip && ip4.dst == VIP && P && P.dst == PORT
1519                      with an action of ct_lb(args), where args contains comma
1520                     separated  IPv4  addresses (and optional port numbers) to
1521                     load balance to. If the router  is  configured  to  force
1522                     SNAT  any load-balanced packets, the above action will be
1523                     replaced by flags.force_snat_for_lb = 1; ct_lb(args);.
1524
1525              ·      For all the configured load balancing rules for a  router
1526                     in  OVN_Northbound  database that includes a L4 port PORT
1527                     of protocol P and IPv4 address VIP, a  priority-120  flow
1528                     that  matches  on  ct.est && ip && ip4.dst == VIP && P &&
1529                     P.dst == PORT
1530                      with an action of ct_dnat;. If the router is  configured
1531                     to force SNAT any load-balanced packets, the above action
1532                     will  be  replaced  by   flags.force_snat_for_lb   =   1;
1533                     ct_dnat;.
1534
1535              ·      For  all the configured load balancing rules for a router
1536                     in OVN_Northbound  database  that  includes  just  an  IP
1537                     address VIP to match on, a priority-110 flow that matches
1538                     on ct.new && ip && ip4.dst  ==  VIP  with  an  action  of
1539                     ct_lb(args),  where  args  contains  comma separated IPv4
1540                     addresses. If the router is configured to force SNAT  any
1541                     load-balanced  packets, the above action will be replaced
1542                     by flags.force_snat_for_lb = 1; ct_lb(args);.
1543
1544              ·      For all the configured load balancing rules for a  router
1545                     in  OVN_Northbound  database  that  includes  just  an IP
1546                     address VIP to match on, a priority-110 flow that matches
1547                     on  ct.est  &&  ip  &&  ip4.dst  == VIP with an action of
1548                     ct_dnat;. If the router is configured to force  SNAT  any
1549                     load-balanced  packets, the above action will be replaced
1550                     by flags.force_snat_for_lb = 1; ct_dnat;.
1551
1552       Ingress Table 6: DNAT on Gateway Routers
1553
1554              ·      For each configuration in the  OVN  Northbound  database,
1555                     that  asks  to  change  the  destination  IP address of a
1556                     packet from A to B, a priority-100  flow  matches  ip  &&
1557                     ip4.dst   ==   A  with  an  action  flags.loopback  =  1;
1558                     ct_dnat(B);. If the Gateway router is configured to force
1559                     SNAT any DNATed packet, the above action will be replaced
1560                     by flags.force_snat_for_dnat =  1;  flags.loopback  =  1;
1561                     ct_dnat(B);.
1562
1563              ·      For  all  IP  packets  of a Gateway router, a priority-50
1564                     flow with an action flags.loopback = 1; ct_dnat;.
1565
1566              ·      A priority-0 logical flow with match 1 has actions next;.
1567
1568       Ingress Table 6: DNAT on Distributed Routers
1569
1570       On distributed routers, the DNAT table only handles packets with desti‐
1571       nation IP address that needs to be DNATted from a virtual IP address to
1572       a real IP address. The unDNAT processing in the  reverse  direction  is
1573       handled in a separate table in the egress pipeline.
1574
1575              ·      For  each  configuration  in the OVN Northbound database,
1576                     that asks to change  the  destination  IP  address  of  a
1577                     packet  from  A  to  B, a priority-100 flow matches ip &&
1578                     ip4.dst == B && inport == GW, where  GW  is  the  logical
1579                     router gateway port, with an action ct_dnat(B);.
1580
1581                     If  the  NAT rule cannot be handled in a distributed man‐
1582                     ner, then the priority-100 flow above is only  programmed
1583                     on the redirect-chassis.
1584
1585                     For  each  configuration  in the OVN Northbound database,
1586                     that asks to change  the  destination  IP  address  of  a
1587                     packet  from  A  to  B,  a priority-50 flow matches ip &&
1588                     ip4.dst == B with  an  action  REGBIT_NAT_REDIRECT  =  1;
1589                     next;. This flow is for east/west traffic to a NAT desti‐
1590                     nation IPv4 address. By setting  the  REGBIT_NAT_REDIRECT
1591                     flag,  in  the  ingress  table Gateway Redirect this will
1592                     trigger a redirect to the instance of the gateway port on
1593                     the redirect-chassis.
1594
1595                     A priority-0 logical flow with match 1 has actions next;.
1596
1597     Ingress Table 7: IPv6 ND RA option processing
1598
1599              ·      A  priority-50  logical  flow  is  added for each logical
1600                     router port configured with  IPv6  ND  RA  options  which
1601                     matches  IPv6  ND  Router Solicitation packet and applies
1602                     the action put_nd_ra_opts and advances the packet to  the
1603                     next table.
1604
1605                     reg0[5] = put_nd_ra_opts(options);next;
1606
1607
1608                     For a valid IPv6 ND RS packet, this transforms the packet
1609                     into an IPv6 ND RA reply and sets the RA options  to  the
1610                     packet  and  stores  1  into  reg0[5]. For other kinds of
1611                     packets, it just stores 0 into reg0[5].  Either  way,  it
1612                     continues to the next table.
1613
1614              ·      A priority-0 logical flow with match 1 has actions next;.
1615
1616     Ingress Table 8: IPv6 ND RA responder
1617
1618       This  table  implements IPv6 ND RA responder for the IPv6 ND RA replies
1619       generated by the previous table.
1620
1621              ·      A priority-50 logical flow  is  added  for  each  logical
1622                     router  port  configured  with  IPv6  ND RA options which
1623                     matches IPv6 ND RA packets and reg0[5] == 1 and  responds
1624                     back  to  the  inport  after  applying  these actions. If
1625                     reg0[5]  is  set  to  1,  it  means   that   the   action
1626                     put_nd_ra_opts was successful.
1627
1628                     eth.dst = eth.src;
1629                     eth.src = E;
1630                     ip6.dst = ip6.src;
1631                     ip6.src = I;
1632                     outport = P;
1633                     flags.loopback = 1;
1634                     output;
1635
1636
1637                     where  E  is the MAC address and I is the IPv6 link local
1638                     address of the logical router port.
1639
1640                     (This terminates packet processing in  ingress  pipeline;
1641                     the packet does not go to the next ingress table.)
1642
1643              ·      A priority-0 logical flow with match 1 has actions next;.
1644
1645     Ingress Table 9: IP Routing
1646
1647       A  packet  that  arrives  at  this table is an IP packet that should be
1648       routed to the address in ip4.dst or ip6.dst. This table  implements  IP
1649       routing,  setting  reg0 (or xxreg0 for IPv6) to the next-hop IP address
1650       (leaving ip4.dst or ip6.dst, the packet’s final destination, unchanged)
1651       and  advances  to  the next table for ARP resolution. It also sets reg1
1652       (or xxreg1) to the  IP  address  owned  by  the  selected  router  port
1653       (ingress  table  ARP  Request  will generate an ARP request, if needed,
1654       with reg0 as the target protocol address and reg1 as the source  proto‐
1655       col address).
1656
1657       This table contains the following logical flows:
1658
1659              ·      Priority-500  flows  that match IP multicast traffic des‐
1660                     tined  to  groups  registered  on  any  of  the  attached
1661                     switches  and  sets  outport  to the associated multicast
1662                     group that will  eventually  flood  the  traffic  to  all
1663                     interested  attached  logical  switches.  The  flows also
1664                     decrement TTL.
1665
1666              ·      Priority-450 flow that matches unregistered IP  multicast
1667                     traffic  and  sets  outport  to  the  MC_STATIC multicast
1668                     group, which ovn-northd populates with the logical  ports
1669                     that have options :mcast_flood=’true’.
1670
1671              ·      For  distributed logical routers where one of the logical
1672                     router ports specifies a redirect-chassis, a priority-400
1673                     logical  flow  for each ip source/destination couple that
1674                     matches the dnat_and_snat  NAT  rules  configured.  These
1675                     flows  will  allow  to  properly  forward  traffic to the
1676                     external connections if available and  avoid  sending  it
1677                     through  the tunnel. Assuming the two following NAT rules
1678                     have been configured:
1679
1680                     external_ip{0,1} = EIP{0,1};
1681                     external_mac{0,1} = MAC{0,1};
1682                     logical_ip{0,1} = LIP{0,1};
1683
1684
1685                     the following action will be applied:
1686
1687                     eth.dst = MAC0;
1688                     eth.src = MAC1;
1689                     reg0 = ip4.dst;
1690                     reg1 = EIP1;
1691                     outport = redirect-chassis-port;
1692                     REGBIT_DISTRIBUTED_NAT = 1; next;.
1693
1694
1695                     Morover a priority-400 logical  flow  is  configured  for
1696                     each  dnat_and_snat  NAT  rule configured in order to not
1697                     send traffic for local FIP through  the  overlay  tunnels
1698                     but manage it in the local hypervisor
1699
1700              ·      For  distributed logical routers where one of the logical
1701                     router ports specifies a redirect-chassis, a priority-300
1702                     logical  flow  with  match  REGBIT_NAT_REDIRECT  == 1 has
1703                     actions ip.ttl--; next;. The outport will be set later in
1704                     the Gateway Redirect table.
1705
1706              ·      IPv4 routing table. For each route to IPv4 network N with
1707                     netmask M, on router port P with IP address A and  Ether‐
1708                     net  address E, a logical flow with match ip4.dst == N/M,
1709                     whose priority is the number of 1-bits in M, has the fol‐
1710                     lowing actions:
1711
1712                     ip.ttl--;
1713                     reg0 = G;
1714                     reg1 = A;
1715                     eth.src = E;
1716                     outport = P;
1717                     flags.loopback = 1;
1718                     next;
1719
1720
1721                     (Ingress table 1 already verified that ip.ttl--; will not
1722                     yield a TTL exceeded error.)
1723
1724                     If the route has a gateway, G is the gateway IP  address.
1725                     Instead,  if the route is from a configured static route,
1726                     G is the next hop IP address. Else it is ip4.dst.
1727
1728              ·      IPv6 routing table. For each route to IPv6 network N with
1729                     netmask  M, on router port P with IP address A and Ether‐
1730                     net address E, a logical flow with match in CIDR notation
1731                     ip6.dst == N/M, whose priority is the integer value of M,
1732                     has the following actions:
1733
1734                     ip.ttl--;
1735                     xxreg0 = G;
1736                     xxreg1 = A;
1737                     eth.src = E;
1738                     outport = P;
1739                     flags.loopback = 1;
1740                     next;
1741
1742
1743                     (Ingress table 1 already verified that ip.ttl--; will not
1744                     yield a TTL exceeded error.)
1745
1746                     If  the route has a gateway, G is the gateway IP address.
1747                     Instead, if the route is from a configured static  route,
1748                     G is the next hop IP address. Else it is ip6.dst.
1749
1750                     If  the  address  A is in the link-local scope, the route
1751                     will be limited to sending on the ingress port.
1752
1753     Ingress Table 10: ARP/ND Resolution
1754
1755       Any packet that reaches this table is an IP packet whose next-hop  IPv4
1756       address  is  in  reg0 or IPv6 address is in xxreg0. (ip4.dst or ip6.dst
1757       contains the final destination.) This table resolves the IP address  in
1758       reg0 (or xxreg0) into an output port in outport and an Ethernet address
1759       in eth.dst, using the following flows:
1760
1761              ·      A priority-500 flow that  matches  IP  multicast  traffic
1762                     that  was  allowed in the routing pipeline. For this kind
1763                     of traffic the outport was already set so the  flow  just
1764                     advances to the next table.
1765
1766              ·      For  distributed logical routers where one of the logical
1767                     router ports specifies a redirect-chassis, a priority-400
1768                     logical  flow  with match REGBIT_DISTRIBUTED_NAT == 1 has
1769                     action next;
1770
1771                     For distributed logical routers where one of the  logical
1772                     router ports specifies a redirect-chassis, a priority-200
1773                     logical flow with  match  REGBIT_NAT_REDIRECT  ==  1  has
1774                     actions  eth.dst  =  E;  next;,  where  E is the ethernet
1775                     address of the router’s distributed gateway port.
1776
1777              ·      Static MAC bindings. MAC bindings can be known statically
1778                     based  on data in the OVN_Northbound database. For router
1779                     ports connected to logical switches, MAC bindings can  be
1780                     known  statically  from the addresses column in the Logi‐
1781                     cal_Switch_Port table.  For  router  ports  connected  to
1782                     other  logical  routers, MAC bindings can be known stati‐
1783                     cally from the mac  and  networks  column  in  the  Logi‐
1784                     cal_Router_Port table.
1785
1786                     For  each IPv4 address A whose host is known to have Eth‐
1787                     ernet address E on router port  P,  a  priority-100  flow
1788                     with match outport === P && reg0 == A has actions eth.dst
1789                     = E; next;.
1790
1791                     For each virtual ip A configured on  a  logical  port  of
1792                     type  virtual  and  its  virtual parent set in its corre‐
1793                     sponding Port_Binding record and the virtual parent  with
1794                     the  Ethernet  address  E and the virtual ip is reachable
1795                     via the router port P, a  priority-100  flow  with  match
1796                     outport  ===  P  &&  reg0  ==  A has actions eth.dst = E;
1797                     next;.
1798
1799                     For each virtual ip A configured on  a  logical  port  of
1800                     type virtual and its virtual parent not set in its corre‐
1801                     sponding Port_Binding record and  the  virtual  ip  A  is
1802                     reachable via the router port P, a priority-100 flow with
1803                     match outport === P && reg0 == A has  actions  eth.dst  =
1804                     00:00:00:00:00:00;  next;. This flow is added so that the
1805                     ARP is always resolved for the virtual ip A by generating
1806                     ARP  request  and not consulting the MAC_Binding table as
1807                     it can have incorrect value for the virtual ip A.
1808
1809                     For each IPv6 address A whose host is known to have  Eth‐
1810                     ernet  address  E  on  router port P, a priority-100 flow
1811                     with match outport === P  &&  xxreg0  ==  A  has  actions
1812                     eth.dst = E; next;.
1813
1814                     For each logical router port with an IPv4 address A and a
1815                     mac address of E that is reachable via a different  logi‐
1816                     cal router port P, a priority-100 flow with match outport
1817                     === P && reg0 == A has actions eth.dst = E; next;.
1818
1819                     For each logical router port with an IPv6 address A and a
1820                     mac  address of E that is reachable via a different logi‐
1821                     cal router port P, a priority-100 flow with match outport
1822                     === P && xxreg0 == A has actions eth.dst = E; next;.
1823
1824              ·      Dynamic MAC bindings. These flows resolve MAC-to-IP bind‐
1825                     ings that have become known dynamically  through  ARP  or
1826                     neighbor  discovery.  (The ingress table ARP Request will
1827                     issue an ARP or neighbor solicitation request  for  cases
1828                     where the binding is not yet known.)
1829
1830                     A  priority-0  logical  flow  with  match ip4 has actions
1831                     get_arp(outport, reg0); next;.
1832
1833                     A priority-0 logical flow  with  match  ip6  has  actions
1834                     get_nd(outport, xxreg0); next;.
1835
1836              ·      For  logical  router port with redirect-chassis and redi‐
1837                     rect-type being set as bridged, a priority-50  flow  will
1838                     match  outport  == "ROUTER_PORT" and !is_chassis_resident
1839                     ("cr-ROUTER_PORT") has actions eth.dst = E; next;,  where
1840                     E is the ethernet address of the logical router port.
1841
1842     Ingress Table 11: Check packet length
1843
1844       For  distributed  logical routers with distributed gateway port config‐
1845       ured with options:gateway_mtu to a valid integer value, this table adds
1846       a  priority-50  logical  flow  with the match ip4 && outport == GW_PORT
1847       where GW_PORT is the distributed gateway router port  and  applies  the
1848       action check_pkt_larger and advances the packet to the next table.
1849
1850       REGBIT_PKT_LARGER = check_pkt_larger(L); next;
1851
1852
1853       where L is the packet length to check for. If the packet is larger than
1854       L, it stores 1 in the register bit REGBIT_PKT_LARGER. The value of L is
1855       taken from options:gateway_mtu column of Logical_Router_Port row.
1856
1857       This  table  adds one priority-0 fallback flow that matches all packets
1858       and advances to the next table.
1859
1860     Ingress Table 12: Handle larger packets
1861
1862       For distributed logical routers with distributed gateway  port  config‐
1863       ured with options:gateway_mtu to a valid integer value, this table adds
1864       the following priority-50 logical flow for  each  logical  router  port
1865       with  the  match  ip4  &&  inport  == LRP && outport == GW_PORT && REG‐
1866       BIT_PKT_LARGER, where LRP is the logical router port and GW_PORT is the
1867       distributed gateway router port and applies the following action
1868
1869       icmp4 {
1870           icmp4.type = 3; /* Destination Unreachable. */
1871           icmp4.code = 4;  /* Frag Needed and DF was Set. */
1872           icmp4.frag_mtu = M;
1873           eth.dst = E;
1874           ip4.dst = ip4.src;
1875           ip4.src = I;
1876           ip.ttl = 255;
1877           REGBIT_EGRESS_LOOPBACK = 1;
1878           next(pipeline=ingress, table=0);
1879       };
1880
1881
1882              ·      Where  M  is the (fragment MTU - 58) whose value is taken
1883                     from options:gateway_mtu  column  of  Logical_Router_Port
1884                     row.
1885
1886              ·      E is the Ethernet address of the logical router port.
1887
1888              ·      I is the IPv4 address of the logical router port.
1889
1890       This  table  adds one priority-0 fallback flow that matches all packets
1891       and advances to the next table.
1892
1893     Ingress Table 13: Gateway Redirect
1894
1895       For distributed logical routers where one of the logical  router  ports
1896       specifies  a  redirect-chassis, this table redirects certain packets to
1897       the distributed gateway port instance on the redirect-chassis. This ta‐
1898       ble has the following flows:
1899
1900              ·      A  priority-300  logical  flow with match REGBIT_DISTRIB‐
1901                     UTED_NAT == 1 has action next;
1902
1903              ·      A priority-200 logical flow with  match  REGBIT_NAT_REDI‐
1904                     RECT  ==  1  has actions outport = CR; next;, where CR is
1905                     the chassisredirect port representing the instance of the
1906                     logical  router  distributed  gateway  port  on the redi‐
1907                     rect-chassis.
1908
1909              ·      A priority-150 logical flow with match outport ==  GW  &&
1910                     eth.dst  ==  00:00:00:00:00:00  has actions outport = CR;
1911                     next;, where GW is the logical router distributed gateway
1912                     port  and CR is the chassisredirect port representing the
1913                     instance of the logical router distributed  gateway  port
1914                     on the redirect-chassis.
1915
1916              ·      For each NAT rule in the OVN Northbound database that can
1917                     be handled in a distributed manner, a priority-100  logi‐
1918                     cal  flow with match ip4.src == B && outport == GW, where
1919                     GW is the logical router distributed gateway  port,  with
1920                     actions next;.
1921
1922              ·      A  priority-50  logical flow with match outport == GW has
1923                     actions outport = CR; next;,  where  GW  is  the  logical
1924                     router  distributed  gateway  port  and  CR  is the chas‐
1925                     sisredirect port representing the instance of the logical
1926                     router distributed gateway port on the redirect-chassis.
1927
1928              ·      A priority-0 logical flow with match 1 has actions next;.
1929
1930     Ingress Table 14: ARP Request
1931
1932       In  the  common  case where the Ethernet destination has been resolved,
1933       this table outputs the packet. Otherwise, it composes and sends an  ARP
1934       or IPv6 Neighbor Solicitation request. It holds the following flows:
1935
1936              ·      Unknown MAC address. A priority-100 flow for IPv4 packets
1937                     with match eth.dst == 00:00:00:00:00:00 has the following
1938                     actions:
1939
1940                     arp {
1941                         eth.dst = ff:ff:ff:ff:ff:ff;
1942                         arp.spa = reg1;
1943                         arp.tpa = reg0;
1944                         arp.op = 1;  /* ARP request. */
1945                         output;
1946                     };
1947
1948
1949                     Unknown  MAC  address. For each IPv6 static route associ‐
1950                     ated with the router with the nexthop  IP:  G,  a  prior‐
1951                     ity-200  flow  for  IPv6  packets  with  match eth.dst ==
1952                     00:00:00:00:00:00 &&  xxreg0  ==  G  with  the  following
1953                     actions is added:
1954
1955                     nd_ns {
1956                         eth.dst = E;
1957                         ip6.dst = I
1958                         nd.target = G;
1959                         output;
1960                     };
1961
1962
1963                     Where E is the multicast mac derived from the Gateway IP,
1964                     I is the solicited-node multicast  address  corresponding
1965                     to the target address G.
1966
1967                     Unknown MAC address. A priority-100 flow for IPv6 packets
1968                     with match eth.dst == 00:00:00:00:00:00 has the following
1969                     actions:
1970
1971                     nd_ns {
1972                         nd.target = xxreg0;
1973                         output;
1974                     };
1975
1976
1977                     (Ingress  table  IP  Routing initialized reg1 with the IP
1978                     address owned by outport and (xx)reg0 with  the  next-hop
1979                     IP address)
1980
1981                     The  IP  packet  that triggers the ARP/IPv6 NS request is
1982                     dropped.
1983
1984              ·      Known MAC address. A priority-0 flow  with  match  1  has
1985                     actions output;.
1986
1987     Egress Table 0: UNDNAT
1988
1989       This  is  for  already  established connections’ reverse traffic. i.e.,
1990       DNAT has already been done in ingress pipeline and now the  packet  has
1991       entered  the  egress pipeline as part of a reply. For NAT on a distrib‐
1992       uted router, it is unDNATted here. For Gateway routers, the unDNAT pro‐
1993       cessing is carried out in the ingress DNAT table.
1994
1995              ·      For  all the configured load balancing rules for a router
1996                     with  gateway  port  in  OVN_Northbound   database   that
1997                     includes  an  IPv4  address  VIP,  for every backend IPv4
1998                     address B defined for the VIP a priority-120 flow is pro‐
1999                     grammed on redirect-chassis that matches ip && ip4.src ==
2000                     B && outport == GW, where GW is the logical router  gate‐
2001                     way  port  with  an  action ct_dnat;. If the backend IPv4
2002                     address B is also configured with L4 port PORT of  proto‐
2003                     col  P, then the match also includes P.src == PORT. These
2004                     flows are not added for load balancers with IPv6 VIPs.
2005
2006                     If the router is configured to force SNAT  any  load-bal‐
2007                     anced   packets,   above   action  will  be  replaced  by
2008                     flags.force_snat_for_lb = 1; ct_dnat;.
2009
2010              ·      For each configuration in  the  OVN  Northbound  database
2011                     that  asks  to  change  the  destination  IP address of a
2012                     packet from an IP address of A to B, a priority-100  flow
2013                     matches  ip && ip4.src == B && outport == GW, where GW is
2014                     the logical router gateway port, with an action ct_dnat;.
2015
2016                     If the NAT rule cannot be handled in a  distributed  man‐
2017                     ner,  then the priority-100 flow above is only programmed
2018                     on the redirect-chassis.
2019
2020                     If the NAT rule can be handled in a  distributed  manner,
2021                     then  there  is an additional action eth.src = EA;, where
2022                     EA is the ethernet address associated with the IP address
2023                     A  in  the NAT rule. This allows upstream MAC learning to
2024                     point to the correct chassis.
2025
2026              ·      A priority-0 logical flow with match 1 has actions next;.
2027
2028     Egress Table 1: SNAT
2029
2030       Packets that are configured to be SNATed get their  source  IP  address
2031       changed based on the configuration in the OVN Northbound database.
2032
2033       Egress Table 1: SNAT on Gateway Routers
2034
2035              ·      If  the Gateway router in the OVN Northbound database has
2036                     been configured to force SNAT a  packet  (that  has  been
2037                     previously  DNATted)  to  B,  a priority-100 flow matches
2038                     flags.force_snat_for_dnat ==  1  &&  ip  with  an  action
2039                     ct_snat(B);.
2040
2041                     If  the Gateway router in the OVN Northbound database has
2042                     been configured to force SNAT a  packet  (that  has  been
2043                     previously  load-balanced)  to  B,  a  priority-100  flow
2044                     matches flags.force_snat_for_lb == 1 && ip with an action
2045                     ct_snat(B);.
2046
2047                     For  each  configuration  in the OVN Northbound database,
2048                     that asks to change the source IP  address  of  a  packet
2049                     from  an  IP  address  of  A  or  to change the source IP
2050                     address of a packet that belongs to network  A  to  B,  a
2051                     flow   matches   ip  &&  ip4.src  ==  A  with  an  action
2052                     ct_snat(B);. The priority of the flow is calculated based
2053                     on  the  mask of A, with matches having larger masks get‐
2054                     ting higher priorities.
2055
2056                     A priority-0 logical flow with match 1 has actions next;.
2057
2058       Egress Table 1: SNAT on Distributed Routers
2059
2060              ·      For each configuration in the  OVN  Northbound  database,
2061                     that  asks  to  change  the source IP address of a packet
2062                     from an IP address of  A  or  to  change  the  source  IP
2063                     address  of  a  packet  that belongs to network A to B, a
2064                     flow matches ip && ip4.src == A && outport ==  GW,  where
2065                     GW  is  the  logical  router gateway port, with an action
2066                     ct_snat(B);. The priority of the flow is calculated based
2067                     on  the  mask of A, with matches having larger masks get‐
2068                     ting higher priorities.
2069
2070                     If the NAT rule cannot be handled in a  distributed  man‐
2071                     ner,  then the flow above is only programmed on the redi‐
2072                     rect-chassis increasing flow priority by 128 in order  to
2073                     be run first
2074
2075                     If  the  NAT rule can be handled in a distributed manner,
2076                     then there is an additional action eth.src =  EA;,  where
2077                     EA is the ethernet address associated with the IP address
2078                     A in the NAT rule. This allows upstream MAC  learning  to
2079                     point to the correct chassis.
2080
2081              ·      A priority-0 logical flow with match 1 has actions next;.
2082
2083     Egress Table 2: Egress Loopback
2084
2085       For  distributed  logical routers where one of the logical router ports
2086       specifies a redirect-chassis.
2087
2088       Earlier in the ingress pipeline, some east-west traffic was  redirected
2089       to  the  chassisredirect  port,  based  on flows in the UNSNAT and DNAT
2090       ingress tables setting the REGBIT_NAT_REDIRECT flag, which  then  trig‐
2091       gered  a  match  to  a  flow in the Gateway Redirect ingress table. The
2092       intention was not to actually send traffic out the distributed  gateway
2093       port  instance  on  the  redirect-chassis. This traffic was sent to the
2094       distributed gateway port instance in order for DNAT  and/or  SNAT  pro‐
2095       cessing to be applied.
2096
2097       While  UNDNAT  and SNAT processing have already occurred by this point,
2098       this traffic needs to be forced through egress loopback  on  this  dis‐
2099       tributed gateway port instance, in order for UNSNAT and DNAT processing
2100       to be applied, and also for IP routing and ARP resolution after all  of
2101       the NAT processing, so that the packet can be forwarded to the destina‐
2102       tion.
2103
2104       This table has the following flows:
2105
2106              ·      For each dnat_and_snat NAT rule couple in the OVN  North‐
2107                     bound  database  on  a distributed router, a priority-200
2108                     logical with match ip4.dst == external_ip0 && ip4.src  ==
2109                     external_ip1, has action next;
2110
2111                     For  each  NAT  rule  in the OVN Northbound database on a
2112                     distributed router,  a  priority-100  logical  flow  with
2113                     match  ip4.dst  ==  E  &&  outport  == GW, where E is the
2114                     external IP address specified in the NAT rule, and GW  is
2115                     the  logical  router  distributed  gateway port, with the
2116                     following actions:
2117
2118                     clone {
2119                         ct_clear;
2120                         inport = outport;
2121                         outport = "";
2122                         flags = 0;
2123                         flags.loopback = 1;
2124                         reg0 = 0;
2125                         reg1 = 0;
2126                         ...
2127                         reg9 = 0;
2128                         REGBIT_EGRESS_LOOPBACK = 1;
2129                         next(pipeline=ingress, table=0);
2130                     };
2131
2132
2133                     flags.loopback is set since in_port is unchanged and  the
2134                     packet may return back to that port after NAT processing.
2135                     REGBIT_EGRESS_LOOPBACK is set  to  indicate  that  egress
2136                     loopback  has  occurred,  in  order to skip the source IP
2137                     address check against the router address.
2138
2139              ·      A priority-0 logical flow with match 1 has actions next;.
2140
2141     Egress Table 3: Delivery
2142
2143       Packets that reach this table are ready for delivery. It contains:
2144
2145              ·      Priority-110 logical flows that match IP multicast  pack‐
2146                     ets  on  each  enabled logical router port and modify the
2147                     Ethernet source address of the packets  to  the  Ethernet
2148                     address of the port and then execute action output;.
2149
2150              ·      Priority-100  logical  flows  that  match packets on each
2151                     enabled logical router port, with action output;.
2152
2153
2154
2155Open vSwitch 2.12.0               ovn-northd                     ovn-northd(8)
Impressum