[ovs-dev,3/3] ovn: Update TODO, ovn-northd flow table design, ovn-architecture for L3.
diff mbox

Message ID 20151006214510.GD9679@nicira.com
State Not Applicable
Headers show

Commit Message

Ben Pfaff Oct. 6, 2015, 9:45 p.m. UTC
On Tue, Sep 29, 2015 at 03:21:16PM -0700, Ben Pfaff wrote:
> On Tue, Sep 29, 2015 at 03:19:47PM -0700, Ben Pfaff wrote:
> > On Mon, Sep 21, 2015 at 06:06:16PM -0700, Justin Pettit wrote:
> > > > +        <pre>
> > > > +ip4.dst = ip4.src;
> > > > +ip4.src = <var>S</var>;
> > > > +ip4.ttl = 255;
> > > > +icmp4.type = 0;
> > > > +reg0 = ip4.dst;
> > > 
> > > Finally, should it always be the original source IP?  If we have
> > > multiple routers in between, shouldn't it be the gateway's address?
> > > (This question holds for all the places where reg0 is being set.)
> > 
> > XXXXXXXXXXXXx
> 
> Um, that was meant to be an actual reply.  I'll get to that.

It's a real problem.  To resolve it, I think we need to break logical
router ingress table 1 into a pair of tables, one for IP input, one for
IP routing, and enhance the "next" action to allow recirculation within
a logical flow table.  Here's an incremental.  I'll post a full revision
soon.

Patch
diff mbox

diff --git a/ovn/TODO b/ovn/TODO
index c914c10..77f5716 100644
--- a/ovn/TODO
+++ b/ovn/TODO
@@ -75,6 +75,13 @@  See description in ovn-northd(8).
 
 ** New OVN logical actions
 
+*** enhanced "next" action.
+
+OVN logical router flows need to be able to revisit a single logical
+flow table, so that ICMP "destination unreachable" errors generated by
+a logical router can themselves be routed.  One way to do this is to
+enhance the "next" action to take an optional flow table index.
+
 *** arp
 
 Generates an ARP packet based on the current IPv4 packet and allows it
diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index c822dbb..a5abc3b 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -236,7 +236,7 @@ 
       Other packets are implicitly dropped.
     </p>
 
-    <h3>Ingress Table 1: IP Routing</h3>
+    <h3>Ingress Table 1: IP Input</h3>
 
     <p>
       This table is the core of the logical router datapath functionality.  It
@@ -244,12 +244,6 @@ 
       functionality.
     </p>
 
-    <p>
-      Logical flows in this table that advance to the next table set
-      <code>reg0</code> to the next-hop IP address.  (<code>ip4.dst</code>, the
-      final destination, is left unchanged.)
-    </p>
-
     <ul>
       <li>
         <p>
@@ -300,7 +294,6 @@  ip4.dst = ip4.src;
 ip4.src = <var>S</var>;
 ip4.ttl = 255;
 icmp4.type = 0;
-reg0 = ip4.dst;   // Route back to original IP source.
 next;
         </pre>
 
@@ -428,12 +421,27 @@  icmp4 {
     ip4.dst = ip4.src;
     ip4.src = <var>A</var>;
     ip4.ttl = 255;
-    reg0 = ip4.dst;
     next;
 };
         </pre>
       </li>
+    </ul>
+
+    <h3>Ingress Table 2: IP Routing</h3>
 
+    <p>
+      A packet that arrives at this table is an IP packet that should be routed
+      to the address in <code>ip4.dst</code>.  This table implements IP
+      routing, setting <code>reg0</code> to the next-hop IP address (leaving
+      <code>ip4.dst</code>, the packet's final destination, unchanged) and
+      advances to the next table for ARP resolution.
+    </p>
+
+    <p>
+      This table contains the following logical flows:
+    </p>
+
+    <ul>
       <li>
         <p>
           Routing table.  For each route to IPv4 network <var>N</var> with
@@ -449,6 +457,11 @@  next;
         </pre>
 
         <p>
+          (Ingress table 1 already verified that <code>ip4.ttl--;</code> will
+          not yield a TTL exceeded error.)
+        </p>
+
+        <p>
           If the route has a gateway, <var>G</var> is the gateway IP address,
           otherwise it is <code>ip4.dst</code>.
         </p>
@@ -458,8 +471,8 @@  next;
         <p>
           Destination unreachable.  For each router port <var>P</var>, which
           owns IP address <var>A</var>, a priority-0 logical flow with match
-          <code>in_port == <var>P</var> &amp;&amp; !ip.later_frag</code> has
-          the following actions:
+          <code>in_port == <var>P</var> &amp;&amp; !ip.later_frag &amp;&amp;
+          !icmp</code> has the following actions:
         </p>
 
         <pre>
@@ -469,19 +482,23 @@  icmp4 {
     ip4.dst = ip4.src;
     ip4.src = <var>A</var>;
     ip4.ttl = 255;
-    reg0 = ip4.dst;
-    next;
+    next(2);
 };
         </pre>
 
         <p>
+          (The <code>!icmp</code> check prevents recursion if the destination
+          unreachable message itself cannot be routed.)
+        </p>
+
+        <p>
           These flows are omitted if the logical router has a default route,
           that is, a route with netmask 0.0.0.0.
         </p>
       </li>
     </ul>
 
-    <h3>Ingress Table 2: ARP Resolution</h3>
+    <h3>Ingress Table 3: ARP Resolution</h3>
 
     <p>
       Any packet that reaches this table is an IP packet whose next-hop IP