diff mbox

[ovs-dev,v3,5/5] ovn: DNAT and SNAT on a gateway router.

Message ID 1463688154-5102-5-git-send-email-guru@ovn.org
State Changes Requested
Headers show

Commit Message

Gurucharan Shetty May 19, 2016, 8:02 p.m. UTC
For traffic from physical space to virtual space we need DNAT.
The DNAT happens in the gateway router and reaches the logical
port. The return traffic should be unDNATed.

Traffic originating in virtual space heading to physical space
should be SNATed. The return traffic is unSNATted.

East-west traffic with the public destination IP address needs
a DNAT. This traffic is punted to the l3 gateway where DNAT
takes place. This traffic is also SNATed and eventually loops back to
its destination. The SNAT is needed because we need the reverse traffic
to go back to the l3 gateway and not short-circuit directly to the source.

This commit introduces 4 new logical actions.
1. ct_snat: To send the packet through SNAT zone to unSNAT packets.
2. ct_snat(IP): To SNAT to the provided IP address.
3. ct_dnat: To send the packet throgh DNAT zone to unDNAT packets.
4. ct_dnat(IP): To DNAT to the provided IP.

This commit only provides the ability to do IP based NAT. This will
eventually be enhanced to do PORT based NAT too.

Command hints:

Consider a distributed router "R1" that has switch foo (192.168.1.0/24)
with a lport foo1 (192.168.1.2) and bar (192.168.2.0/24) with lport bar1
(192.168.2.2) connected to it. You connect "R1" to
a gateway router "R2" via a switch "join" in (20.0.0.0/24) network.

R2 has a switch "alice" (172.16.1.0/24) connected to it (to simulate
external network).

case: Add pure DNAT (north-south)

Add a DNAT rule in R2:
ovn-nbctl set logical_router R2 dnat:"30.0.0.2"=192.168.1.2

Now alice1 should be able to ping 192.168.1.2 via 30.0.0.2.

case2 : Add pure SNAT (south-north)

Add a SNAT rule in R2:

ovn-nbctl set logical_router R2 snat:"192.168.1.0/24"=30.0.0.1

(You need a static route in R1 to send packets destined to outside
world to go through R2)

When foo1 pings alice1, alice1 receives traffic from 30.0.0.1

case3 : SNAT and DNAT (east-west traffic)

Add a SNAT rule for bar1
ovn-nbctl set logical_router R2 snat:"192.168.2.2"=30.0.0.3

When bar1 pings 30.0.0.2, the traffic jumps to the gateway router
and loops back to foo1 with a source ip address of 30.0.0.3

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
---
 ovn/lib/actions.c           |  83 +++++++++++++++++
 ovn/northd/ovn-northd.8.xml | 129 +++++++++++++++++++++++---
 ovn/northd/ovn-northd.c     | 214 ++++++++++++++++++++++++++++++++++++++++++--
 ovn/ovn-nb.ovsschema        |  12 ++-
 ovn/ovn-nb.xml              |  35 ++++++--
 ovn/ovn-sb.xml              |  38 ++++++++
 6 files changed, 487 insertions(+), 24 deletions(-)

Comments

Mickey Spiegel June 3, 2016, 5 a.m. UTC | #1
For the most part it looks good. I do have a few comments inline, a couple of them towards the bottom being significant.

-----"dev" <dev-bounces@openvswitch.org> wrote: -----

>To: dev@openvswitch.org
>From: Gurucharan Shetty 
>Sent by: "dev" 
>Date: 05/19/2016 10:58PM
>Subject: [ovs-dev] [PATCH v3 5/5] ovn: DNAT and SNAT on a gateway router.
>
>For traffic from physical space to virtual space we need DNAT.
>The DNAT happens in the gateway router and reaches the logical
>port. The return traffic should be unDNATed.
>
>Traffic originating in virtual space heading to physical space
>should be SNATed. The return traffic is unSNATted.
>
>East-west traffic with the public destination IP address needs
>a DNAT. This traffic is punted to the l3 gateway where DNAT
>takes place. This traffic is also SNATed and eventually loops back to
>its destination. The SNAT is needed because we need the reverse traffic
>to go back to the l3 gateway and not short-circuit directly to the source.
>
>This commit introduces 4 new logical actions.
>1. ct_snat: To send the packet through SNAT zone to unSNAT packets.
>2. ct_snat(IP): To SNAT to the provided IP address.
>3. ct_dnat: To send the packet throgh DNAT zone to unDNAT packets.
>4. ct_dnat(IP): To DNAT to the provided IP.
>
>This commit only provides the ability to do IP based NAT. This will
>eventually be enhanced to do PORT based NAT too.
>
>Command hints:
>
>Consider a distributed router "R1" that has switch foo (192.168.1.0/24)
>with a lport foo1 (192.168.1.2) and bar (192.168.2.0/24) with lport bar1
>(192.168.2.2) connected to it. You connect "R1" to
>a gateway router "R2" via a switch "join" in (20.0.0.0/24) network.
>
>R2 has a switch "alice" (172.16.1.0/24) connected to it (to simulate
>external network).
>
>case: Add pure DNAT (north-south)
>
>Add a DNAT rule in R2:
>ovn-nbctl set logical_router R2 dnat:"30.0.0.2"=192.168.1.2
>
>Now alice1 should be able to ping 192.168.1.2 via 30.0.0.2.
>
>case2 : Add pure SNAT (south-north)
>
>Add a SNAT rule in R2:
>
>ovn-nbctl set logical_router R2 snat:"192.168.1.0/24"=30.0.0.1
>
>(You need a static route in R1 to send packets destined to outside
>world to go through R2)
>
>When foo1 pings alice1, alice1 receives traffic from 30.0.0.1
>
>case3 : SNAT and DNAT (east-west traffic)
>
>Add a SNAT rule for bar1
>ovn-nbctl set logical_router R2 snat:"192.168.2.2"=30.0.0.3
>
>When bar1 pings 30.0.0.2, the traffic jumps to the gateway router
>and loops back to foo1 with a source ip address of 30.0.0.3
>
>Signed-off-by: Gurucharan Shetty <guru@ovn.org>
>---
> ovn/lib/actions.c           |  83 +++++++++++++++++
> ovn/northd/ovn-northd.8.xml | 129 +++++++++++++++++++++++---
> ovn/northd/ovn-northd.c     | 214
>++++++++++++++++++++++++++++++++++++++++++--
> ovn/ovn-nb.ovsschema        |  12 ++-
> ovn/ovn-nb.xml              |  35 ++++++--
> ovn/ovn-sb.xml              |  38 ++++++++
> 6 files changed, 487 insertions(+), 24 deletions(-)
>
>diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
>index 5f0bf19..4a486a0 100644
>--- a/ovn/lib/actions.c
>+++ b/ovn/lib/actions.c
>@@ -442,6 +442,85 @@ emit_ct(struct action_context *ctx, bool recirc_next, bool commit)
>     add_prerequisite(ctx, "ip");
> }
> 
>+static void
>+parse_ct_nat(struct action_context *ctx, bool snat)
>+{
>+    const size_t ct_offset = ctx->ofpacts->size;
>+    ofpbuf_pull(ctx->ofpacts, ct_offset);
>+
>+    struct ofpact_conntrack *ct = ofpact_put_CT(ctx->ofpacts);
>+
>+    if (ctx->ap->cur_ltable < ctx->ap->n_tables) {
>+        ct->recirc_table = ctx->ap->first_ptable + ctx->ap->cur_ltable + 1;
>+    } else {
>+        action_error(ctx,
>+                     "\"ct_[sd]nat\" action not allowed in last table.");
>+        return;
>+    }
>+
>+    if (snat) {
>+        ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
>+    } else {
>+        ct->zone_src.field = mf_from_id(MFF_LOG_DNAT_ZONE);
>+    }
>+    ct->zone_src.ofs = 0;
>+    ct->zone_src.n_bits = 16;
>+    ct->flags = 0;
>+    ct->alg = 0;
>+
>+    add_prerequisite(ctx, "ip");
>+
>+    struct ofpact_nat *nat;
>+    size_t nat_offset;
>+    nat_offset = ctx->ofpacts->size;
>+    ofpbuf_pull(ctx->ofpacts, nat_offset);
>+
>+    nat = ofpact_put_NAT(ctx->ofpacts);
>+    nat->flags = 0;
>+    nat->range_af = AF_UNSPEC;
>+
>+    int commit = 0;
>+    if (lexer_match(ctx->lexer, LEX_T_LPAREN)) {
>+        ovs_be32 ip;
>+        if (ctx->lexer->token.type == LEX_T_INTEGER
>+            && ctx->lexer->token.format == LEX_F_IPV4) {
>+            ip = ctx->lexer->token.value.ipv4;
>+        } else {
>+            action_syntax_error(ctx, "invalid ip");
>+            return;
>+        }
>+
>+        nat->range_af = AF_INET;
>+        nat->range.addr.ipv4.min = ip;
>+        if (snat) {
>+            nat->flags |= NX_NAT_F_SRC;
>+        } else {
>+            nat->flags |= NX_NAT_F_DST;
>+        }
>+        commit = NX_CT_F_COMMIT;
>+        lexer_get(ctx->lexer);
>+        if (!lexer_match(ctx->lexer, LEX_T_RPAREN)) {
>+            action_syntax_error(ctx, "expecting `)'");
>+            return;
>+        }
>+    }
>+
>+    ctx->ofpacts->header = ofpbuf_push_uninit(ctx->ofpacts, nat_offset);
>+    ct = ctx->ofpacts->header;
>+    ct->flags |= commit;
>+
>+    /* XXX: For performance reasons, we try to prevent additional
>+     * recirculations.  So far, ct_snat which is used in a gateway router
>+     * does not need a recirculation. ct_snat(IP) does need a recirculation.
>+     * Should we consider a method to let the actions specify whether a action
>+     * needs recirculation if there more use cases?. */
>+    if (!commit && snat) {
>+        ct->recirc_table = NX_CT_RECIRC_NONE;
>+    }
>+    ofpact_finish(ctx->ofpacts, &ct->ofpact);
>+    ofpbuf_push_uninit(ctx->ofpacts, ct_offset);
>+}
>+
> static bool
> parse_action(struct action_context *ctx)
> {
>@@ -469,6 +548,10 @@ parse_action(struct action_context *ctx)
>         emit_ct(ctx, true, false);
>     } else if (lexer_match_id(ctx->lexer, "ct_commit")) {
>         emit_ct(ctx, false, true);
>+    } else if (lexer_match_id(ctx->lexer, "ct_dnat")) {
>+        parse_ct_nat(ctx, false);
>+    } else if (lexer_match_id(ctx->lexer, "ct_snat")) {
>+        parse_ct_nat(ctx, true);
>     } else if (lexer_match_id(ctx->lexer, "arp")) {
>         parse_arp_action(ctx);
>     } else if (lexer_match_id(ctx->lexer, "get_arp")) {
>diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
>index 970c352..838fbb7 100644
>--- a/ovn/northd/ovn-northd.8.xml
>+++ b/ovn/northd/ovn-northd.8.xml
>@@ -500,11 +500,40 @@ next;
> 
>       <li>
>         <p>
>-          Reply to ARP requests.  These flows reply to ARP requests for the
>-          router's own IP address.  For each router port <var>P</var> that owns
>-          IP address <var>A</var> and Ethernet address <var>E</var>, a
>-          priority-90 flow matches <code>inport == <var>P</var> &amp;&amp;
>-          arp.op == 1 &amp;&amp; arp.tpa == <var>A</var></code> (ARP request)
>+          Reply to ARP requests.
>+        </p>
>+
>+        <p>
>+          These flows reply to ARP requests for the router's own IP address.
>+          For each router port <var>P</var> that owns IP address <var>A</var>
>+          and Ethernet address <var>E</var>, a priority-90 flow matches
>+          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
>+          arp.tpa == <var>A</var></code> (ARP request) with the following
>+          actions:
>+        </p>
>+
>+        <pre>
>+eth.dst = eth.src;
>+eth.src = <var>E</var>;
>+arp.op = 2; /* ARP reply. */
>+arp.tha = arp.sha;
>+arp.sha = <var>E</var>;
>+arp.tpa = arp.spa;
>+arp.spa = <var>A</var>;
>+outport = <var>P</var>;
>+inport = ""; /* Allow sending out inport. */
>+output;
>+        </pre>
>+      </li>
>+
>+      <li>
>+        <p>
>+          These flows reply to ARP requests for the virtual IP addresses
>+          configured in the router for DNAT. For a configured DNAT IP address
>+          <var>A</var>, for each router port <var>P</var> with Ethernet
>+          address <var>E</var>, a priority-90 flow matches
>+          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
>+          arp.tpa == <var>A</var></code> (ARP request)
>           with the following actions:
>         </p>
> 
>@@ -646,7 +675,63 @@ icmp4 {
>       </li>
>     </ul>
> 
>-    <h3>Ingress Table 2: IP Routing</h3>
>+    <h3>Ingress Table 2: SNAT</h3>
>+
>+    <p>
>+      This is for already established connections' reverse traffic.
>+      i.e., SNAT has already been done in egress pipeline and now the
>+      packet has entered the ingress pipeline as part of a reply. It is
>+      unSNATted here.
>+    </p>
>+
>+    <ul>
>+      <li>
>+        <p>
>+          For each configuration in the OVN Northdbound database, that asks
>+          to change the source IP address of a packet from <var>A</var> to
>+          <var>B</var>, a priority-100 flow matches <code>ip4.dst ==
>+          <var>B</var></code> with an action <code>ct_snat; next;</code>.
>+        </p>
>+
>+        <p>
>+          A priority-0 logical flow with match <code>1</code> has actions
>+          <code>next;</code>.
>+        </p>
>+      </li>
>+    </ul>
>+
>+    <h3>Ingress Table 3: DNAT</h3>
>+
>+    <p>
>+      Packets enter the pipeline with destination IP address that needs to
>+      be DNATted from a virtual IP address to a real IP address. Packets
>+      in the reverse direction needs to be unDNATed.
>+    </p>
>+    <ul>
>+      <li>
>+        <p>
>+          For each configuration in the OVN Northdbound database, that asks
>+          to change the destination IP address of a packet from <var>A</var> to
>+          <var>B</var>, a priority-100 flow matches <code>ip4.dst ==
>+          <var>A</var></code> with an action <code>ct_dnat(<var>B</var>);
>+          </code>.
>+        </p>
>+
>+        <p>
>+          For each configuration in the OVN Northdbound database, that asks
>+          to change the destination IP address of a packet from <var>A</var> to
>+          <var>B</var>, a priority-50 flow matches every <code>ip</code> packet
>+          with an action <code>ct_dnat;</code>.

Looking at the code, it looks like any IP packet matches the priority-50 flow. Something like:
For all IP packets, a priority-50 flow with an action <code>ct_dnat;</code>.

>+        </p>
>+
>+        <p>
>+          A priority-0 logical flow with match <code>1</code> has actions
>+          <code>next;</code>.
>+        </p>
>+      </li>
>+    </ul>
>+
>+    <h3>Ingress Table 4: IP Routing</h3>
> 
>     <p>
>       A packet that arrives at this table is an IP packet that should be routed
>@@ -655,7 +740,7 @@ icmp4 {
>       <code>ip4.dst</code>, the packet's final destination, unchanged) and
>       advances to the next table for ARP resolution.  It also sets
>       <code>reg1</code> to the IP address owned by the selected router port
>-      (which is used later in table 4 as the IP source address for an ARP
>+      (which is used later in table 6 as the IP source address for an ARP
>       request, if needed).
>     </p>
> 
>@@ -726,7 +811,7 @@ icmp4 {
>       </li>
>     </ul>
> 
>-    <h3>Ingress Table 3: ARP Resolution</h3>
>+    <h3>Ingress Table 5: ARP Resolution</h3>
> 
>     <p>
>       Any packet that reaches this table is an IP packet whose next-hop IP
>@@ -781,7 +866,7 @@ icmp4 {
>       </li>
>     </ul>
> 
>-    <h3>Ingress Table 4: ARP Request</h3>
>+    <h3>Ingress Table 6: ARP Request</h3>
> 
>     <p>
>       In the common case where the Ethernet destination has been resolved, this
>@@ -806,7 +891,7 @@ arp {
>         </pre>
> 
>         <p>
>-          (Ingress table 2 initialized <code>reg1</code> with the IP address
>+          (Ingress table 4 initialized <code>reg1</code> with the IP address
>           owned by <code>outport</code>.)
>         </p>
> 
>@@ -821,7 +906,29 @@ arp {
>       </li>
>     </ul>
> 
>-    <h3>Egress Table 0: Delivery</h3>
>+    <h3>Egress Table 0: SNAT</h3>
>+
>+    <p>
>+      Packets that are configured to be SNATed get their source IP address
>+      changed based on the configuration in the OVN Northdbound database.
>+    </p>
>+    <ul>
>+      <li>
>+        <p>
>+          For each configuration in the OVN Northdbound database, that asks
>+          to change the source IP address of a packet from <var>A</var> to
>+          <var>B</var>, a priority-100 flow matches <code>ip4.src ==
>+          <var>A</var></code> with an action <code>ct_snat(<var>B</var>)
>+          </code>.
>+        </p>
>+        <p>
>+          A priority-0 logical flow with match <code>1</code> has actions
>+          <code>next;</code>.
>+        </p>
>+      </li>
>+    </ul>
>+
>+    <h3>Egress Table 1: Delivery</h3>
> 
>     <p>
>       Packets that reach this table are ready for delivery.  It contains
>diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>index 7852d83..3ce88a7 100644
>--- a/ovn/northd/ovn-northd.c
>+++ b/ovn/northd/ovn-northd.c
>@@ -105,12 +105,15 @@ enum ovn_stage {
>     /* Logical router ingress stages. */ \
>     PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission") \
>     PIPELINE_STAGE(ROUTER, IN,  IP_INPUT,    1, "lr_in_ip_input") \
>-    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  2, "lr_in_ip_routing") \
>-    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 3, "lr_in_arp_resolve") \
>-    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 4, "lr_in_arp_request") \
>+    PIPELINE_STAGE(ROUTER, IN,  SNAT,        2, "lr_in_snat") \

I find calling this table SNAT misleading, since it is actually reverting the destination address.
How about UNDO_SNAT or UNSNAT?

>+    PIPELINE_STAGE(ROUTER, IN,  DNAT,        3, "lr_in_dnat") \
>+    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  4, "lr_in_ip_routing") \
>+    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 5, "lr_in_arp_resolve") \
>+    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 6, "lr_in_arp_request") \
> \
>     /* Logical router egress stages. */ \
>-    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,    0, "lr_out_delivery")
>+    PIPELINE_STAGE(ROUTER, OUT, SNAT,      0, "lr_out_snat") \
>+    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,  1, "lr_out_delivery")
> 
> #define PIPELINE_STAGE(DP_TYPE, PIPELINE, STAGE, TABLE, NAME)   \
>     S_##DP_TYPE##_##PIPELINE##_##STAGE                          \
>@@ -1965,6 +1968,44 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>         free(match);
>         free(actions);
> 
>+        /* ARP handling for virtual IP addresses.
>+         *
>+         * DNAT IP addresses are virtual IP addresses that need ARP
>+         * handling. */
>+        struct smap_node *node;
>+        SMAP_FOR_EACH(node, &op->od->nbr->dnat) {
>+            ovs_be32 ip;
>+            if (!ip_parse(node->key, &ip) || !ip) {
>+                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>+                VLOG_WARN_RL(&rl, "bad ip address %s in dnat configuration"
>+                             "for router %s", node->key, op->key);
>+                continue;
>+            }
>+
>+            match = xasprintf(
>+                "inport == %s && arp.tpa == "IP_FMT" && arp.op == 1",
>+                op->json_key, IP_ARGS(ip));
>+            actions = xasprintf(
>+                "eth.dst = eth.src; "
>+                "eth.src = "ETH_ADDR_FMT"; "
>+                "arp.op = 2; /* ARP reply */ "
>+                "arp.tha = arp.sha; "
>+                "arp.sha = "ETH_ADDR_FMT"; "
>+                "arp.tpa = arp.spa; "
>+                "arp.spa = "IP_FMT"; "
>+                "outport = %s; "
>+                "inport = \"\"; /* Allow sending out inport. */ "
>+                "output;",
>+                ETH_ADDR_ARGS(op->mac),
>+                ETH_ADDR_ARGS(op->mac),
>+                IP_ARGS(ip),
>+                op->json_key);
>+            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
>+                          match, actions);
>+            free(match);
>+            free(actions);
>+        }
>+
>         /* Drop IP traffic to this router. */
>         match = xasprintf("ip4.dst == "IP_FMT, IP_ARGS(op->ip));
>         ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 60,
>@@ -1972,6 +2013,123 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>         free(match);
>     }
> 
>+    /* Ingress SNAT. This is for already established connections' reverse

As mentioned earlier, I prefer calling this Ingress Undo SNAT or something along those lines.

>+     * traffic. i.e., SNAT has already been done in egress pipeline and now
>+     * the packet has entered the ingress pipeline as part of a reply.
>+     * We undo the SNAT here.
>+     *
>+     * Undoing SNAT has to happen before DNAT processing. This is because when
>+     * the packet was DNATed in ingress pipeline, it did not know about the
>+     * possibility of eventual additional SNAT in egress pipeline. */
>+    HMAP_FOR_EACH (od, key_node, datapaths) {
>+        if (!od->nbr) {
>+            continue;
>+        }
>+
>+        /* Packets are allowed by default. */
>+        ovn_lflow_add(lflows, od, S_ROUTER_IN_SNAT, 0, "1", "next;");
>+
>+        /* SNAT rules are only valid on non-distributed routers. */
>+        if (!smap_get(&od->nbr->options, "chassis")) {
>+            continue;
>+        }
>+
>+        struct smap_node *node;
>+        SMAP_FOR_EACH(node, &od->nbr->snat) {
>+            ovs_be32 ip, mask;
>+
>+            char *error = ip_parse_masked(node->key, &ip, &mask);
>+            if (error) {
>+                static struct vlog_rate_limit rl
>+                        = VLOG_RATE_LIMIT_INIT(5, 1);
>+                VLOG_WARN_RL(&rl, "bad ip network or ip %s for snat",
>+                             node->key);
>+                continue;
>+            }
>+
>+            error = ip_parse_masked(node->value, &ip, &mask);
>+            if (error || mask != OVS_BE32_MAX) {
>+                static struct vlog_rate_limit rl
>+                        = VLOG_RATE_LIMIT_INIT(5, 1);
>+                VLOG_WARN_RL(&rl, "bad ip address %s for snat", node->value);
>+                continue;
>+            }
>+
>+            char *match;
>+            match = xasprintf("ip && ip4.dst == %s", node->value);
>+            ovn_lflow_add(lflows, od, S_ROUTER_IN_SNAT, 100,
>+                          match, "ct_snat; next;");
>+            free(match);
>+        }
>+    }
>+
>+    /* Ingress DNAT. Packets enter the pipeline with destination IP address
>+     * that needs to be DNATted from a virtual IP address to a real
>+     * IP address.  Packets in the reverse direction get unDNATed. */
>+    HMAP_FOR_EACH (od, key_node, datapaths) {
>+        if (!od->nbr) {
>+            continue;
>+        }
>+
>+        /* Packets are allowed by default. */
>+        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 0, "1", "next;");
>+
>+        /* DNAT rules are only valid on non-distributed routers. */
>+        if (!smap_get(&od->nbr->options, "chassis")) {
>+            continue;
>+        }
>+
>+        struct smap_node *node;
>+        SMAP_FOR_EACH(node, &od->nbr->dnat) {
>+            char *match, *actions;
>+            ovs_be32 ip, mask;
>+
>+            char *error = ip_parse_masked(node->key, &ip, &mask);
>+            if (error || mask != OVS_BE32_MAX) {
>+                static struct vlog_rate_limit rl
>+                        = VLOG_RATE_LIMIT_INIT(5, 1);
>+                VLOG_WARN_RL(&rl, "bad ip address key %s for dnat", node->key);
>+                continue;
>+            }
>+
>+            error = ip_parse_masked(node->value, &ip, &mask);
>+            if (error || mask != OVS_BE32_MAX) {
>+                static struct vlog_rate_limit rl
>+                        = VLOG_RATE_LIMIT_INIT(5, 1);
>+                VLOG_WARN_RL(&rl, "bad ip address value %s for dnat",
>+                             node->value);
>+                continue;
>+            }
>+
>+            /* Packet when it goes from the initiator to destination.
>+             * We need to zero the inport because the router can
>+             * send the packet back through the same interface. */
>+            match = xasprintf("ip && ip4.dst == %s", node->key);
>+            actions = xasprintf("inport = \"\"; ct_dnat(%s);", node->value);
>+            ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
>+                          match, actions);
>+            free(match);
>+            free(actions);
>+        }
>+
>+        /* Re-circulate every other packet through the DNAT zone.
>+         * This helps with 2 things.
>+         *
>+         * 1. Any packet that needs to be unDNATed in the reverse
>+         * direction gets unDNATed. Ideally this could be done in
>+         * the egress pipeline. But since the gateway router
>+         * does not have any feature that depends on the source
>+         * ip address being virtual IP address for IP routing,
>+         * we can do it here, saving a future re-circulation.
>+         *
>+         * 2. Any packet that was sent through SNAT zone in the
>+         * previous table automatically gets re-circulated to get
>+         * back the new destination IP address that is needed for
>+         * routing in the openflow pipeline. */
>+        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
>+                      "ip", "inport = \"\"; ct_dnat;");
>+    }
>+
>     /* Logical router ingress table 2: IP Routing.
>      *
>      * A packet that arrives at this table is an IP packet that should be
>@@ -2172,7 +2330,53 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>         ovn_lflow_add(lflows, od, S_ROUTER_IN_ARP_REQUEST, 0, "1", "output;");
>     }
> 
>-    /* Logical router egress table 0: Delivery (priority 100).
>+    /* Egress SNAT. Packets enter the pipeline with source ip address
>+     * that needs to be SNATted to a virtual ip address. */
>+    HMAP_FOR_EACH (od, key_node, datapaths) {
>+        if (!od->nbr) {
>+            continue;
>+        }
>+
>+        /* Packets are allowed by default. */
>+        ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 0, "1", "next;");
>+
>+        /* SNAT rules are only valid on non-distributed routers. */
>+        if (!smap_get(&od->nbr->options, "chassis")) {
>+            continue;
>+        }
>+
>+        struct smap_node *node;
>+        SMAP_FOR_EACH(node, &od->nbr->snat) {
>+            char *match, *actions;
>+            ovs_be32 ip, mask;
>+
>+            char *error = ip_parse_masked(node->key, &ip, &mask);
>+            if (error) {
>+                static struct vlog_rate_limit rl
>+                        = VLOG_RATE_LIMIT_INIT(5, 1);
>+                VLOG_WARN_RL(&rl, "bad ip or ip network %s in snat key",
>+                             node->key);
>+                continue;
>+            }
>+
>+            error = ip_parse_masked(node->value, &ip, &mask);
>+            if (error || mask != OVS_BE32_MAX) {
>+                static struct vlog_rate_limit rl
>+                        = VLOG_RATE_LIMIT_INIT(5, 1);
>+                VLOG_WARN_RL(&rl, "bad ip address %s for snat", node->value);
>+                continue;
>+            }
>+
>+            match = xasprintf("ip && ip4.src == %s", node->key);
>+            actions = xasprintf("ct_snat(%s);", node->value);
>+            ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 100,

You are allowing the key to be a prefix, with mask other than OVS_BE32_MAX.
This implies that a user can specify multiple overlapping prefixes.
Conflicts should be resolved by going with the longest matching prefix.

We actually need this behavior for OpenStack. If the source IP address is a
fixed IP that has a corresponding floating IP, then the source IP address
should be mapped to the floating IP address rather than the router gateway
address. We need longest match to win to get this behavior.

So, the priority should be (mask+1).

>+                          match, actions);
>+            free(match);
>+            free(actions);
>+        }
>+    }
>+
>+    /* Logical router egress table 1: Delivery (priority 100).
>      *
>      * Priority 100 rules deliver packets to enabled logical ports. */
>     HMAP_FOR_EACH (op, key_node, ports) {
>diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
>index fa21b30..4d9c613 100644
>--- a/ovn/ovn-nb.ovsschema
>+++ b/ovn/ovn-nb.ovsschema
>@@ -1,7 +1,7 @@
> {
>         "name": "OVN_Northbound",
>-       "version": "2.1.2",
>-       "cksum": "429668869 5325",
>+       "version": "2.1.3",
>+       "cksum": "400575131 5731",
>         "tables": {
>                 "Logical_Switch": {
>                         "columns": {
>@@ -78,6 +78,14 @@
>                                    "max": "unlimited"}},
>                                 "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
>                                 "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
>+                               "dnat" : {
>+                                       "type": {"key": {"type": "string"},
>+                                                        "value": {"type" : "string"},
>+                                                         "min": 0, "max": "unlimited"}},
>+                               "snat" : {
>+                                       "type": {"key": {"type": "string"},
>+                                                         "value": {"type" : "string"},
>+                                                        "min": 0, "max": "unlimited"}},

As mentioned above, in OpenStack, if the source IP address is a fixed IP
that has a corresponding floating IP, then the source IP address should be
mapped to the floating IP address rather than the router gateway address.

Separate dnat and snat lists as defined here can be made to work to support
the OpenStack behavior using this patch, by copying floating IPs into both
the dnat and snat lists. However this is a little unwieldy.

If we ever move to distributed handling of floating IPs, which I am still
working on, then the duplication will be difficult to handle for east/west
flows. It would be easier if one definition could apply to both dnat and snat.
An additional value could indicate if the rule applies to only dnat or both.
Or, perhaps there could just be a single list of nat rules with an additional
value indicating if the rule applies to snat or dnat or both.

Note also that for distributed handling of floating IPs, we will need MAC
addresses to go along with the floating IPs. This makes me think it might
be worth having an indirection to a separate nat table.

Mickey


>                 "options": {
>                                          "type": {"key": "string",
>                                                           "value": "string",
>diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>index d239499..508a865 100644
>--- a/ovn/ovn-nb.xml
>+++ b/ovn/ovn-nb.xml
>@@ -631,18 +631,41 @@
>       router has all ingress and egress traffic dropped.
>     </column>
> 
>+    <column name="dnat">
>+      A map of externally visible IP address to their corresponding IP
>+      addresses in the logical space.  The externally visible IP address
>+      is DNATed to the IP address in the logical space.  DNAT only works
>+      on routers that are non-distributed.
>+    </column>
>+
>+    <column name="snat">
>+      A map of IP network (e.g 192.168.1.0/24) or an IP address to another
>+      IP address.  All IP packets with their source IP address that either
>+      matches the key or is in the provided network is SNATed
>+      into the IP address in the value.  SNAT only works on routers that are
>+      non-distributed.
>+    </column>
>+
>     <group title="Options">
>       <p>
>         Additional options for the logical router.
>       </p>
> 
>       <column name="options" key="chassis">
>-        If set, indicates that the logical router in question is
>-        non-distributed and resides in the set chassis. The same
>-        value is also used by <code>ovn-controller</code> to
>-        uniquely identify the chassis in the OVN deployment and
>-        comes from <code>external_ids:system-id</code> in the
>-        <code>Open_vSwitch</code> table of Open_vSwitch database.
>+        <p>
>+          If set, indicates that the logical router in question is
>+          non-distributed and resides in the set chassis.  The same
>+          value is also used by <code>ovn-controller</code> to
>+          uniquely identify the chassis in the OVN deployment and
>+          comes from <code>external_ids:system-id</code> in the
>+          <code>Open_vSwitch</code> table of Open_vSwitch database.
>+        </p>
>+
>+        <p>
>+          The non-distributed router (also known as Gateway router)
>+          can only be connected to a distributed router via a switch
>+          if SNAT and DNAT is to be configured in the Gateway router.
>+        </p>
>       </column>
>     </group>
>     
>diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
>index 741228c..3a0268f 100644
>--- a/ovn/ovn-sb.xml
>+++ b/ovn/ovn-sb.xml
>@@ -951,6 +951,44 @@
>           </p>
>         </dd>
> 
>+        <dt><code>ct_dnat;</code></dt>
>+        <dt><code>ct_dnat(<var>IP</var>);</code></dt>
>+        <dd>
>+          <p>
>+            <code>ct_dnat</code> sends the packet through the DNAT zone to
>+            unDNAT any packet that was DNATed in a different direction.
>+          </p>
>+          <p>
>+            <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
>+            DNAT zone to change the destination IP address of the packet to
>+            the one provided inside the parenthesis and commits the connection.
>+          </p>
>+          <p>
>+            Both <code>ct_dnat</code> and <code>ct_dnat(<var>IP</var>)</code>
>+            recirculate the packet in the datapath.
>+          </p>
>+        </dd>
>+
>+        <dt><code>ct_snat;</code></dt>
>+        <dt><code>ct_snat(<var>IP</var>);</code></dt>
>+        <dd>
>+          <p>
>+            <code>ct_snat</code> sends the packet through the SNAT zone to
>+            unSNAT any packet that was SNATed in a different direction.
>+          </p>
>+          <p>
>+            <code>ct_snat(<var>IP</var>)</code> sends the packet through the
>+            SNAT zone to change the source IP address of the packet to
>+            the one provided inside the parenthesis and commits the connection.
>+          </p>
>+          <p>
>+            <code>ct_snat</code> does not recirculate the packet in the
>+            datapath and hence should be followed by a <code>next;</code>
>+            action. <code>ct_snat(<var>IP</var>)</code> does
>+            recirculate the packet in the datapath.
>+          </p>
>+        </dd>
>+
>         <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
>         <dd>
>           <p>
>-- 
>1.9.1
>
>_______________________________________________
>dev mailing list
>dev@openvswitch.org
>http://openvswitch.org/mailman/listinfo/dev
>
Gurucharan Shetty June 3, 2016, 5:41 a.m. UTC | #2
>
>
>
> Looking at the code, it looks like any IP packet matches the priority-50
> flow. Something like:
> For all IP packets, a priority-50 flow with an action
> <code>ct_dnat;</code>.
>

Agreed. Will fix this as part of the next version.

>
>
> >index 7852d83..3ce88a7 100644
> >--- a/ovn/northd/ovn-northd.c
> >+++ b/ovn/northd/ovn-northd.c
> >@@ -105,12 +105,15 @@ enum ovn_stage {
> >     /* Logical router ingress stages. */ \
> >     PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission") \
> >     PIPELINE_STAGE(ROUTER, IN,  IP_INPUT,    1, "lr_in_ip_input") \
> >-    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  2, "lr_in_ip_routing") \
> >-    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 3, "lr_in_arp_resolve") \
> >-    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 4, "lr_in_arp_request") \
> >+    PIPELINE_STAGE(ROUTER, IN,  SNAT,        2, "lr_in_snat") \
>
> I find calling this table SNAT misleading, since it is actually reverting
> the destination address.
> How about UNDO_SNAT or UNSNAT?
>
I am okay with UNSNAT. What about DNAT as it is currently doing both unDNAT
and DNAT? Leave it at DNAT?





> You are allowing the key to be a prefix, with mask other than OVS_BE32_MAX.
> This implies that a user can specify multiple overlapping prefixes.
> Conflicts should be resolved by going with the longest matching prefix.
>
> We actually need this behavior for OpenStack. If the source IP address is a
> fixed IP that has a corresponding floating IP, then the source IP address
> should be mapped to the floating IP address rather than the router gateway
> address. We need longest match to win to get this behavior.
>
> So, the priority should be (mask+1).
>

Right. I will get this right in the next version.


>
> >--- a/ovn/ovn-nb.ovsschema
> >+++ b/ovn/ovn-nb.ovsschema
> >@@ -1,7 +1,7 @@
> > {
> >         "name": "OVN_Northbound",
> >-       "version": "2.1.2",
> >-       "cksum": "429668869 5325",
> >+       "version": "2.1.3",
> >+       "cksum": "400575131 5731",
> >         "tables": {
> >                 "Logical_Switch": {
> >                         "columns": {
> >@@ -78,6 +78,14 @@
> >                                    "max": "unlimited"}},
> >                                 "default_gw": {"type": {"key": "string",
> "min": 0, "max": 1}},
> >                                 "enabled": {"type": {"key": "boolean",
> "min": 0, "max": 1}},
> >+                               "dnat" : {
> >+                                       "type": {"key": {"type":
> "string"},
> >+                                                        "value": {"type"
> : "string"},
> >+                                                         "min": 0,
> "max": "unlimited"}},
> >+                               "snat" : {
> >+                                       "type": {"key": {"type":
> "string"},
> >+                                                         "value":
> {"type" : "string"},
> >+                                                        "min": 0, "max":
> "unlimited"}},
>
> As mentioned above, in OpenStack, if the source IP address is a fixed IP
> that has a corresponding floating IP, then the source IP address should be
> mapped to the floating IP address rather than the router gateway address.
>
> Separate dnat and snat lists as defined here can be made to work to support
> the OpenStack behavior using this patch, by copying floating IPs into both
> the dnat and snat lists. However this is a little unwieldy.
>

I see your point. But the behavior you mention is a very OpenStack thing.
OVN config being independent of OpenStack specific behavior looks better to
me, Unless we can think up with a general purpose schema.

>
> If we ever move to distributed handling of floating IPs, which I am still
> working on, then the duplication will be difficult to handle for east/west
> flows. It would be easier if one definition could apply to both dnat and
> snat.
> An additional value could indicate if the rule applies to only dnat or
> both.
> Or, perhaps there could just be a single list of nat rules with an
> additional
> value indicating if the rule applies to snat or dnat or both.
>

This needs some thought. Do you strongly feel that this needs to be handled
now, or would you mind sending a patch that proposes your schema after this
schema goes in?



>
> Note also that for distributed handling of floating IPs, we will need MAC
> addresses to go along with the floating IPs. This makes me think it might
> be worth having an indirection to a separate nat table.
>
> Mickey
>
>
> >                 "options": {
> >                                          "type": {"key": "string",
> >                                                           "value":
> "string",
> >diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
> >index d239499..508a865 100644
> >--- a/ovn/ovn-nb.xml
> >+++ b/ovn/ovn-nb.xml
> >@@ -631,18 +631,41 @@
> >       router has all ingress and egress traffic dropped.
> >     </column>
> >
> >+    <column name="dnat">
> >+      A map of externally visible IP address to their corresponding IP
> >+      addresses in the logical space.  The externally visible IP address
> >+      is DNATed to the IP address in the logical space.  DNAT only works
> >+      on routers that are non-distributed.
> >+    </column>
> >+
> >+    <column name="snat">
> >+      A map of IP network (e.g 192.168.1.0/24) or an IP address to
> another
> >+      IP address.  All IP packets with their source IP address that
> either
> >+      matches the key or is in the provided network is SNATed
> >+      into the IP address in the value.  SNAT only works on routers that
> are
> >+      non-distributed.
> >+    </column>
> >+
> >     <group title="Options">
> >       <p>
> >         Additional options for the logical router.
> >       </p>
> >
> >       <column name="options" key="chassis">
> >-        If set, indicates that the logical router in question is
> >-        non-distributed and resides in the set chassis. The same
> >-        value is also used by <code>ovn-controller</code> to
> >-        uniquely identify the chassis in the OVN deployment and
> >-        comes from <code>external_ids:system-id</code> in the
> >-        <code>Open_vSwitch</code> table of Open_vSwitch database.
> >+        <p>
> >+          If set, indicates that the logical router in question is
> >+          non-distributed and resides in the set chassis.  The same
> >+          value is also used by <code>ovn-controller</code> to
> >+          uniquely identify the chassis in the OVN deployment and
> >+          comes from <code>external_ids:system-id</code> in the
> >+          <code>Open_vSwitch</code> table of Open_vSwitch database.
> >+        </p>
> >+
> >+        <p>
> >+          The non-distributed router (also known as Gateway router)
> >+          can only be connected to a distributed router via a switch
> >+          if SNAT and DNAT is to be configured in the Gateway router.
> >+        </p>
> >       </column>
> >     </group>
> >
> >diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> >index 741228c..3a0268f 100644
> >--- a/ovn/ovn-sb.xml
> >+++ b/ovn/ovn-sb.xml
> >@@ -951,6 +951,44 @@
> >           </p>
> >         </dd>
> >
> >+        <dt><code>ct_dnat;</code></dt>
> >+        <dt><code>ct_dnat(<var>IP</var>);</code></dt>
> >+        <dd>
> >+          <p>
> >+            <code>ct_dnat</code> sends the packet through the DNAT zone
> to
> >+            unDNAT any packet that was DNATed in a different direction.
> >+          </p>
> >+          <p>
> >+            <code>ct_dnat(<var>IP</var>)</code> sends the packet through
> the
> >+            DNAT zone to change the destination IP address of the packet
> to
> >+            the one provided inside the parenthesis and commits the
> connection.
> >+          </p>
> >+          <p>
> >+            Both <code>ct_dnat</code> and
> <code>ct_dnat(<var>IP</var>)</code>
> >+            recirculate the packet in the datapath.
> >+          </p>
> >+        </dd>
> >+
> >+        <dt><code>ct_snat;</code></dt>
> >+        <dt><code>ct_snat(<var>IP</var>);</code></dt>
> >+        <dd>
> >+          <p>
> >+            <code>ct_snat</code> sends the packet through the SNAT zone
> to
> >+            unSNAT any packet that was SNATed in a different direction.
> >+          </p>
> >+          <p>
> >+            <code>ct_snat(<var>IP</var>)</code> sends the packet through
> the
> >+            SNAT zone to change the source IP address of the packet to
> >+            the one provided inside the parenthesis and commits the
> connection.
> >+          </p>
> >+          <p>
> >+            <code>ct_snat</code> does not recirculate the packet in the
> >+            datapath and hence should be followed by a <code>next;</code>
> >+            action. <code>ct_snat(<var>IP</var>)</code> does
> >+            recirculate the packet in the datapath.
> >+          </p>
> >+        </dd>
> >+
> >         <dt><code>arp { <var>action</var>; </code>...<code>
> };</code></dt>
> >         <dd>
> >           <p>
> >--
> >1.9.1
> >
> >_______________________________________________
> >dev mailing list
> >dev@openvswitch.org
> >http://openvswitch.org/mailman/listinfo/dev
> >
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
Gurucharan Shetty June 3, 2016, 4:47 p.m. UTC | #3
>
>
>> As mentioned above, in OpenStack, if the source IP address is a fixed IP
>> that has a corresponding floating IP, then the source IP address should be
>> mapped to the floating IP address rather than the router gateway address.
>>
>> Separate dnat and snat lists as defined here can be made to work to
>> support
>> the OpenStack behavior using this patch, by copying floating IPs into both
>> the dnat and snat lists. However this is a little unwieldy.
>>
>
> I see your point. But the behavior you mention is a very OpenStack thing.
> OVN config being independent of OpenStack specific behavior looks better to
> me, Unless we can think up with a general purpose schema.
>
>>
>> If we ever move to distributed handling of floating IPs, which I am still
>> working on, then the duplication will be difficult to handle for east/west
>> flows. It would be easier if one definition could apply to both dnat and
>> snat.
>> An additional value could indicate if the rule applies to only dnat or
>> both.
>> Or, perhaps there could just be a single list of nat rules with an
>> additional
>> value indicating if the rule applies to snat or dnat or both.
>>
>
> This needs some thought. Do you strongly feel that this needs to be
> handled now, or would you mind sending a patch that proposes your schema
> after this schema goes in?
>

As an addendum, the current schema does dnat:30.0.0.3=192.168.1.2. I would
like to enhance it to also be able to provide ports. Something like
dnat:30.0.0.3:4443=192.168.1.2:80

So we will need to include the above with the new table that you are
thinking too.

>
>
>
>>
>> Note also that for distributed handling of floating IPs, we will need MAC
>> addresses to go along with the floating IPs. This makes me think it might
>> be worth having an indirection to a separate nat table.
>>
>



>
>> Mickey
>>
>>
>> >                 "options": {
>> >                                          "type": {"key": "string",
>> >                                                           "value":
>> "string",
>> >diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
>> >index d239499..508a865 100644
>> >--- a/ovn/ovn-nb.xml
>> >+++ b/ovn/ovn-nb.xml
>> >@@ -631,18 +631,41 @@
>> >       router has all ingress and egress traffic dropped.
>> >     </column>
>> >
>> >+    <column name="dnat">
>> >+      A map of externally visible IP address to their corresponding IP
>> >+      addresses in the logical space.  The externally visible IP address
>> >+      is DNATed to the IP address in the logical space.  DNAT only works
>> >+      on routers that are non-distributed.
>> >+    </column>
>> >+
>> >+    <column name="snat">
>> >+      A map of IP network (e.g 192.168.1.0/24) or an IP address to
>> another
>> >+      IP address.  All IP packets with their source IP address that
>> either
>> >+      matches the key or is in the provided network is SNATed
>> >+      into the IP address in the value.  SNAT only works on routers
>> that are
>> >+      non-distributed.
>> >+    </column>
>> >+
>> >     <group title="Options">
>> >       <p>
>> >         Additional options for the logical router.
>> >       </p>
>> >
>> >       <column name="options" key="chassis">
>> >-        If set, indicates that the logical router in question is
>> >-        non-distributed and resides in the set chassis. The same
>> >-        value is also used by <code>ovn-controller</code> to
>> >-        uniquely identify the chassis in the OVN deployment and
>> >-        comes from <code>external_ids:system-id</code> in the
>> >-        <code>Open_vSwitch</code> table of Open_vSwitch database.
>> >+        <p>
>> >+          If set, indicates that the logical router in question is
>> >+          non-distributed and resides in the set chassis.  The same
>> >+          value is also used by <code>ovn-controller</code> to
>> >+          uniquely identify the chassis in the OVN deployment and
>> >+          comes from <code>external_ids:system-id</code> in the
>> >+          <code>Open_vSwitch</code> table of Open_vSwitch database.
>> >+        </p>
>> >+
>> >+        <p>
>> >+          The non-distributed router (also known as Gateway router)
>> >+          can only be connected to a distributed router via a switch
>> >+          if SNAT and DNAT is to be configured in the Gateway router.
>> >+        </p>
>> >       </column>
>> >     </group>
>> >
>> >diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
>> >index 741228c..3a0268f 100644
>> >--- a/ovn/ovn-sb.xml
>> >+++ b/ovn/ovn-sb.xml
>> >@@ -951,6 +951,44 @@
>> >           </p>
>> >         </dd>
>> >
>> >+        <dt><code>ct_dnat;</code></dt>
>> >+        <dt><code>ct_dnat(<var>IP</var>);</code></dt>
>> >+        <dd>
>> >+          <p>
>> >+            <code>ct_dnat</code> sends the packet through the DNAT zone
>> to
>> >+            unDNAT any packet that was DNATed in a different direction.
>> >+          </p>
>> >+          <p>
>> >+            <code>ct_dnat(<var>IP</var>)</code> sends the packet
>> through the
>> >+            DNAT zone to change the destination IP address of the
>> packet to
>> >+            the one provided inside the parenthesis and commits the
>> connection.
>> >+          </p>
>> >+          <p>
>> >+            Both <code>ct_dnat</code> and
>> <code>ct_dnat(<var>IP</var>)</code>
>> >+            recirculate the packet in the datapath.
>> >+          </p>
>> >+        </dd>
>> >+
>> >+        <dt><code>ct_snat;</code></dt>
>> >+        <dt><code>ct_snat(<var>IP</var>);</code></dt>
>> >+        <dd>
>> >+          <p>
>> >+            <code>ct_snat</code> sends the packet through the SNAT zone
>> to
>> >+            unSNAT any packet that was SNATed in a different direction.
>> >+          </p>
>> >+          <p>
>> >+            <code>ct_snat(<var>IP</var>)</code> sends the packet
>> through the
>> >+            SNAT zone to change the source IP address of the packet to
>> >+            the one provided inside the parenthesis and commits the
>> connection.
>> >+          </p>
>> >+          <p>
>> >+            <code>ct_snat</code> does not recirculate the packet in the
>> >+            datapath and hence should be followed by a
>> <code>next;</code>
>> >+            action. <code>ct_snat(<var>IP</var>)</code> does
>> >+            recirculate the packet in the datapath.
>> >+          </p>
>> >+        </dd>
>> >+
>> >         <dt><code>arp { <var>action</var>; </code>...<code>
>> };</code></dt>
>> >         <dd>
>> >           <p>
>> >--
>> >1.9.1
>> >
>> >_______________________________________________
>> >dev mailing list
>> >dev@openvswitch.org
>> >http://openvswitch.org/mailman/listinfo/dev
>> >
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>>
>
>
Mickey Spiegel June 3, 2016, 8:21 p.m. UTC | #4
Please see replies inline.

-----Guru Shetty <guru@ovn.org> wrote: -----

>To: Mickey Spiegel/San Jose/IBM@IBMUS
>From: Guru Shetty <guru@ovn.org>
>Date: 06/02/2016 10:41PM
>Cc: ovs dev <dev@openvswitch.org>
>Subject: Re: [ovs-dev] [PATCH v3 5/5] ovn: DNAT and SNAT on a gateway
>router.
>
>
>> 
>> Looking at the code, it looks like any IP packet matches the priority-50 flow. Something like:
>> For all IP packets, a priority-50 flow with an action <code>ct_dnat;</code>.
>
>Agreed. Will fix this as part of the next version.  
>
>>
>> >index 7852d83..3ce88a7 100644
>> >--- a/ovn/northd/ovn-northd.c
>> >+++ b/ovn/northd/ovn-northd.c
>> >@@ -105,12 +105,15 @@ enum ovn_stage {
>> >     /* Logical router ingress stages. */ \
>> >     PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission") \
>> >     PIPELINE_STAGE(ROUTER, IN,  IP_INPUT,    1, "lr_in_ip_input") \
>> >-    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  2, "lr_in_ip_routing") \
>> >-    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 3, "lr_in_arp_resolve") \
>> >-    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 4, "lr_in_arp_request") \
>> >+    PIPELINE_STAGE(ROUTER, IN,  SNAT,        2, "lr_in_snat") \
>> 
>> I find calling this table SNAT misleading, since it is actually reverting the destination address.
>> How about UNDO_SNAT or UNSNAT?
>I am okay with UNSNAT. What about DNAT as it is currently doing both unDNAT and DNAT? Leave it at DNAT?

Yes, leaving it at DNAT looks good to me.

>> You are allowing the key to be a prefix, with mask other than OVS_BE32_MAX.
>> This implies that a user can specify multiple overlapping prefixes.
>> Conflicts should be resolved by going with the longest matching prefix.
>> 
>> We actually need this behavior for OpenStack. If the source IP address is a
>> fixed IP that has a corresponding floating IP, then the source IP address
>> should be mapped to the floating IP address rather than the router gateway
>> address. We need longest match to win to get this behavior.
>> 
>> So, the priority should be (mask+1).
>
>Right. I will get this right in the next version. 

Thanks. That will let us move forward with the OpenStack networking-ovn plugin.

>>
>> >--- a/ovn/ovn-nb.ovsschema
>> >+++ b/ovn/ovn-nb.ovsschema
>> >@@ -1,7 +1,7 @@
>> > {
>> >         "name": "OVN_Northbound",
>> >-       "version": "2.1.2",
>> >-       "cksum": "429668869 5325",
>> >+       "version": "2.1.3",
>> >+       "cksum": "400575131 5731",
>> >         "tables": {
>> >                 "Logical_Switch": {
>> >                         "columns": {
>> >@@ -78,6 +78,14 @@
>> >                                    "max": "unlimited"}},
>> >                                 "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
>> >                                 "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
>> >+                               "dnat" : {
>> >+                                       "type": {"key": {"type": "string"},
>> >+                                                        "value": {"type" : "string"},
>> >+                                                         "min": 0, "max": "unlimited"}},
>> >+                               "snat" : {
>> >+                                       "type": {"key": {"type": "string"},
>> >+                                                         "value": {"type" : "string"},
>> >+                                                        "min": 0, "max": "unlimited"}},
>> 
>> As mentioned above, in OpenStack, if the source IP address is a fixed IP
>> that has a corresponding floating IP, then the source IP address should be
>> mapped to the floating IP address rather than the router gateway address.
>> 
>> Separate dnat and snat lists as defined here can be made to work to support
>> the OpenStack behavior using this patch, by copying floating IPs into both
>> the dnat and snat lists. However this is a little unwieldy.
>
>I see your point. But the behavior you mention is a very OpenStack
>thing.  OVN config being independent of OpenStack specific behavior
>looks better to me, Unless we can think up with a general purpose schema. 

I definitely agree that the OVN config should be independent of OpenStack.
See below for the proposed changes.

>>
>> If we ever move to distributed handling of floating IPs, which I am still
>> working on, then the duplication will be difficult to handle for east/west
>> flows. It would be easier if one definition could apply to both dnat and snat.
>> An additional value could indicate if the rule applies to only dnat or both.
>> Or, perhaps there could just be a single list of nat rules with an additional
>> value indicating if the rule applies to snat or dnat or both.
>
>This needs some thought. Do you strongly feel that this needs to be
>handled now, or would you mind sending a patch that proposes your
>schema after this schema goes in?

I am always a little nervous about putting things in schema that I know will change later,
especially when it comes to the structure of the config.

I am thinking of ovn-nb.ovsschema changes along the following lines:

         "Logical_Router": {
             "columns": {
                 "name": {"type": "string"},
                 "ports": {"type": {"key": {"type": "uuid",
                                            "refTable": "Logical_Router_Port",
                                            "refType": "strong"},
                                    "min": 0,
                                    "max": "unlimited"}},
                 "static_routes": {"type": {"key": {"type": "uuid",
                                             "refTable": "Logical_Router_Static_Route",
                                             "refType": "strong"},
                                    "min": 0,
                                    "max": "unlimited"}},
                 "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
                 "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
+                "nat": {"type": {"key": {"type": "uuid",
+                                         "refTable": "NAT8,
+                                         "refType": "strong"},
+                                   "min": 0,
+                                   "max": "unlimited"}},
                 "options": {
                      "type": {"key": "string",
                               "value": "string",
                               "min": 0,
                               "max": "unlimited"}},
                 "external_ids": {
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "isRoot": true},


+        &NAT8: {
+            "columns": {
+                "outside_ip": {"type": {"key": "string", "min": 0, "max": 1}},
+                "inside_ip": {"type": {"key": "string", "min": 0, "max": 1}},
+                "direction": {"type": {"key": {"type": "string",
+                                       "enum": ["set", ["dnat", &snat", "dnat_and_snat"]]}}},
+            "isRoot": false},

DNAT maps from destination outside_ip to inside_ip.
SNAT maps from source inside_ip to outside_ip.
If direction == dnat or direction == dnat_and_snat, then check for inside_ip mask == 32
If direction == snat or direction == dnat_and_snat, then check for outside_ip == 32

>As an addendum, the current schema does dnat:30.0.0.3=192.168.1.2. I
>would like to enhance it to also be able to provide ports. Something
>like dnat:30.0.0.3:4443=192.168.1.2:80
>
>So we will need to include the above with the new table that you are
>thinking too.

You can add outside_protocol, outside_l4_port, inside_protocol, and inside_l4_port.

>> Note also that for distributed handling of floating IPs, we will need MAC
>> addresses to go along with the floating IPs. This makes me think it might
>> be worth having an indirection to a separate nat table.

We will add outside_mac in the patch for distributed handling of floating IPs.

Mickey
Gurucharan Shetty June 7, 2016, 3:13 p.m. UTC | #5
>
>
>
> I am always a little nervous about putting things in schema that I know
> will change later,
> especially when it comes to the structure of the config.
>
> I am thinking of ovn-nb.ovsschema changes along the following lines:
>
>          "Logical_Router": {
>              "columns": {
>                  "name": {"type": "string"},
>                  "ports": {"type": {"key": {"type": "uuid",
>                                             "refTable":
> "Logical_Router_Port",
>                                             "refType": "strong"},
>                                     "min": 0,
>                                     "max": "unlimited"}},
>                  "static_routes": {"type": {"key": {"type": "uuid",
>                                              "refTable":
> "Logical_Router_Static_Route",
>                                              "refType": "strong"},
>                                     "min": 0,
>                                     "max": "unlimited"}},
>                  "default_gw": {"type": {"key": "string", "min": 0, "max":
> 1}},
>                  "enabled": {"type": {"key": "boolean", "min": 0, "max":
> 1}},
> +                "nat": {"type": {"key": {"type": "uuid",
> +                                         "refTable": "NAT”,
> +                                         "refType": "strong"},
> +                                   "min": 0,
> +                                   "max": "unlimited"}},
>                  "options": {
>                       "type": {"key": "string",
>                                "value": "string",
>                                "min": 0,
>                                "max": "unlimited"}},
>                  "external_ids": {
>                      "type": {"key": "string", "value": "string",
>                               "min": 0, "max": "unlimited"}}},
>              "isRoot": true},
>
>
> +        “NAT”: {
> +            "columns": {
> +                "outside_ip": {"type": {"key": "string", "min": 0, "max":
> 1}},
> +                "inside_ip": {"type": {"key": "string", "min": 0, "max":
> 1}},
> +                "direction": {"type": {"key": {"type": "string",
> +                                       "enum": ["set", ["dnat", “snat",
> "dnat_and_snat"]]}}},
> +            "isRoot": false},
>
> DNAT maps from destination outside_ip to inside_ip.
> SNAT maps from source inside_ip to outside_ip.
> If direction == dnat or direction == dnat_and_snat, then check for
> inside_ip mask == 32
> If direction == snat or direction == dnat_and_snat, then check for
> outside_ip == 32


The names "inside_ip" and "outside_ip" keeps tripping me up. The
nomenclature feels similar to tunnel outside_ip and tunnel inside_ip. What
do you think about "logical_ip" and "physical_ip" instead?


>
>
> >As an addendum, the current schema does dnat:30.0.0.3=192.168.1.2. I
> >would like to enhance it to also be able to provide ports. Something
> >like dnat:30.0.0.3:4443=192.168.1.2:80
> >
> >So we will need to include the above with the new table that you are
> >thinking too.
>
> You can add outside_protocol, outside_l4_port, inside_protocol, and
> inside_l4_port.
>
> >> Note also that for distributed handling of floating IPs, we will need
> MAC
> >> addresses to go along with the floating IPs. This makes me think it
> might
> >> be worth having an indirection to a separate nat table.
>
> We will add outside_mac in the patch for distributed handling of floating
> IPs.
>
> Mickey
>
>
>
Mickey Spiegel June 9, 2016, 1:34 a.m. UTC | #6
Guru,

>The  names "inside_ip" and "outside_ip" keeps tripping me up. The  nomenclature feels similar to tunnel outside_ip and tunnel inside_ip.  What do you think about "logical_ip" and "physical_ip" instead?

A quick search revealed that "inside" and "outside" are not always used the way that I thought, so my proposed terminology does not really work.

Thinking about "logical_ip" and "physical_ip", I am concerned that "physical_ip" will confuse people. When I hear "physical_ip", my mind wanders to pieces of hardware, which is not what this is about.

How about "internal_ip"/"external_ip"?

Mickey

-----Guru Shetty <guru@ovn.org> wrote: -----
To: Mickey Spiegel/San Jose/IBM@IBMUS
From: Guru Shetty <guru@ovn.org>
Date: 06/07/2016 08:14AM
Cc: ovs dev <dev@openvswitch.org>
Subject: Re: [ovs-dev] [PATCH v3 5/5] ovn: DNAT and SNAT on a gateway router.



I am always a little nervous about putting things in schema that I know will change later,
especially when it comes to the structure of the config.

I am thinking of ovn-nb.ovsschema changes along the following lines:

         "Logical_Router": {
             "columns": {
                 "name": {"type": "string"},
                 "ports": {"type": {"key": {"type": "uuid",
                                            "refTable": "Logical_Router_Port",
                                            "refType": "strong"},
                                    "min": 0,
                                    "max": "unlimited"}},
                 "static_routes": {"type": {"key": {"type": "uuid",
                                             "refTable": "Logical_Router_Static_Route",
                                             "refType": "strong"},
                                    "min": 0,
                                    "max": "unlimited"}},
                 "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
                 "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
+                "nat": {"type": {"key": {"type": "uuid",
+                                         "refTable": "NAT”,
+                                         "refType": "strong"},
+                                   "min": 0,
+                                   "max": "unlimited"}},
                 "options": {
                      "type": {"key": "string",
                               "value": "string",
                               "min": 0,
                               "max": "unlimited"}},
                 "external_ids": {
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "isRoot": true},


+        “NAT”: {
+            "columns": {
+                "outside_ip": {"type": {"key": "string", "min": 0, "max": 1}},
+                "inside_ip": {"type": {"key": "string", "min": 0, "max": 1}},
+                "direction": {"type": {"key": {"type": "string",
+                                       "enum": ["set", ["dnat", “snat", "dnat_and_snat"]]}}},
+            "isRoot": false},

DNAT maps from destination outside_ip to inside_ip.
SNAT maps from source inside_ip to outside_ip.
If direction == dnat or direction == dnat_and_snat, then check for inside_ip mask == 32
If direction == snat or direction == dnat_and_snat, then check for outside_ip == 32

The names "inside_ip" and "outside_ip" keeps tripping me up. The nomenclature feels similar to tunnel outside_ip and tunnel inside_ip. What do you think about "logical_ip" and "physical_ip" instead? 
 

>As an addendum, the current schema does dnat:30.0.0.3=192.168.1.2. I
>would like to enhance it to also be able to provide ports. Something
>like dnat:30.0.0.3:4443=192.168.1.2:80
>
>So we will need to include the above with the new table that you are
>thinking too.

You can add outside_protocol, outside_l4_port, inside_protocol, and inside_l4_port.

>> Note also that for distributed handling of floating IPs, we will need MAC
>> addresses to go along with the floating IPs. This makes me think it might
>> be worth having an indirection to a separate nat table.

We will add outside_mac in the patch for distributed handling of floating IPs.

Mickey
Gurucharan Shetty June 9, 2016, 5:20 p.m. UTC | #7
>
>
> Thinking about "logical_ip" and "physical_ip", I am concerned that
> "physical_ip" will confuse people. When I hear "physical_ip", my mind
> wanders to pieces of hardware, which is not what this is about.
>
> How about "internal_ip"/"external_ip"?
>

I decided to use "logical_ip"/"external_ip". logical_ip feels like a good
term as I can easily associate it with logical_port.


>
> Mickey
>
> -----Guru Shetty <guru@ovn.org> wrote: -----
> To: Mickey Spiegel/San Jose/IBM@IBMUS
> From: Guru Shetty <guru@ovn.org>
> Date: 06/07/2016 08:14AM
> Cc: ovs dev <dev@openvswitch.org>
> Subject: Re: [ovs-dev] [PATCH v3 5/5] ovn: DNAT and SNAT on a gateway
> router.
>
>
>
> I am always a little nervous about putting things in schema that I know
> will change later,
> especially when it comes to the structure of the config.
>
> I am thinking of ovn-nb.ovsschema changes along the following lines:
>
>          "Logical_Router": {
>              "columns": {
>                  "name": {"type": "string"},
>                  "ports": {"type": {"key": {"type": "uuid",
>                                             "refTable":
> "Logical_Router_Port",
>                                             "refType": "strong"},
>                                     "min": 0,
>                                     "max": "unlimited"}},
>                  "static_routes": {"type": {"key": {"type": "uuid",
>                                              "refTable":
> "Logical_Router_Static_Route",
>                                              "refType": "strong"},
>                                     "min": 0,
>                                     "max": "unlimited"}},
>                  "default_gw": {"type": {"key": "string", "min": 0, "max":
> 1}},
>                  "enabled": {"type": {"key": "boolean", "min": 0, "max":
> 1}},
> +                "nat": {"type": {"key": {"type": "uuid",
> +                                         "refTable": "NAT”,
> +                                         "refType": "strong"},
> +                                   "min": 0,
> +                                   "max": "unlimited"}},
>                  "options": {
>                       "type": {"key": "string",
>                                "value": "string",
>                                "min": 0,
>                                "max": "unlimited"}},
>                  "external_ids": {
>                      "type": {"key": "string", "value": "string",
>                               "min": 0, "max": "unlimited"}}},
>              "isRoot": true},
>
>
> +        “NAT”: {
> +            "columns": {
> +                "outside_ip": {"type": {"key": "string", "min": 0, "max":
> 1}},
> +                "inside_ip": {"type": {"key": "string", "min": 0, "max":
> 1}},
> +                "direction": {"type": {"key": {"type": "string",
> +                                       "enum": ["set", ["dnat", “snat",
> "dnat_and_snat"]]}}},
> +            "isRoot": false},
>
> DNAT maps from destination outside_ip to inside_ip.
> SNAT maps from source inside_ip to outside_ip.
> If direction == dnat or direction == dnat_and_snat, then check for
> inside_ip mask == 32
> If direction == snat or direction == dnat_and_snat, then check for
> outside_ip == 32
>
> The names "inside_ip" and "outside_ip" keeps tripping me up. The
> nomenclature feels similar to tunnel outside_ip and tunnel inside_ip. What
> do you think about "logical_ip" and "physical_ip" instead?
>
>
> >As an addendum, the current schema does dnat:30.0.0.3=192.168.1.2. I
> >would like to enhance it to also be able to provide ports. Something
> >like dnat:30.0.0.3:4443=192.168.1.2:80
> >
> >So we will need to include the above with the new table that you are
> >thinking too.
>
> You can add outside_protocol, outside_l4_port, inside_protocol, and
> inside_l4_port.
>
> >> Note also that for distributed handling of floating IPs, we will need
> MAC
> >> addresses to go along with the floating IPs. This makes me think it
> might
> >> be worth having an indirection to a separate nat table.
>
> We will add outside_mac in the patch for distributed handling of floating
> IPs.
>
> Mickey
>
>
>
>
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
>
Mickey Spiegel June 9, 2016, 5:33 p.m. UTC | #8
Works for me.

Mickey

-----Guru Shetty <guru@ovn.org> wrote: -----
To: Mickey Spiegel/San Jose/IBM@IBMUS
From: Guru Shetty <guru@ovn.org>
Date: 06/09/2016 10:20AM
Cc: ovs dev <dev@openvswitch.org>
Subject: Re: [ovs-dev] [PATCH v3 5/5] ovn: DNAT and SNAT on a gateway router.


 Thinking about "logical_ip" and "physical_ip", I am concerned that "physical_ip" will confuse people. When I hear "physical_ip", my mind wanders to pieces of hardware, which is not what this is about.
 
 How about "internal_ip"/"external_ip"?

I decided to use "logical_ip"/"external_ip". logical_ip feels like a good term as I can easily associate it with logical_port.
  
 Mickey
 
 -----Guru Shetty <guru@ovn.org> wrote: -----
 To: Mickey Spiegel/San Jose/IBM@IBMUS
 From: Guru Shetty <guru@ovn.org>
 Date: 06/07/2016 08:14AM
 Cc: ovs dev <dev@openvswitch.org>
 Subject: Re: [ovs-dev] [PATCH v3 5/5] ovn: DNAT and SNAT on a gateway router.
 
 
 
 
I am always a little nervous about putting things in schema that I know will change later,
 especially when it comes to the structure of the config.
 
 I am thinking of ovn-nb.ovsschema changes along the following lines:
 
          "Logical_Router": {
              "columns": {
                  "name": {"type": "string"},
                  "ports": {"type": {"key": {"type": "uuid",
                                             "refTable": "Logical_Router_Port",
                                             "refType": "strong"},
                                     "min": 0,
                                     "max": "unlimited"}},
                  "static_routes": {"type": {"key": {"type": "uuid",
                                              "refTable": "Logical_Router_Static_Route",
                                              "refType": "strong"},
                                     "min": 0,
                                     "max": "unlimited"}},
                  "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
                  "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
 +                "nat": {"type": {"key": {"type": "uuid",
 +                                         "refTable": "NAT”,
 +                                         "refType": "strong"},
 +                                   "min": 0,
 +                                   "max": "unlimited"}},
                  "options": {
                       "type": {"key": "string",
                                "value": "string",
                                "min": 0,
                                "max": "unlimited"}},
                  "external_ids": {
                      "type": {"key": "string", "value": "string",
                               "min": 0, "max": "unlimited"}}},
              "isRoot": true},
 
 
 +        “NAT”: {
 +            "columns": {
 +                "outside_ip": {"type": {"key": "string", "min": 0, "max": 1}},
 +                "inside_ip": {"type": {"key": "string", "min": 0, "max": 1}},
 +                "direction": {"type": {"key": {"type": "string",
 +                                       "enum": ["set", ["dnat", “snat", "dnat_and_snat"]]}}},
 +            "isRoot": false},
 
 DNAT maps from destination outside_ip to inside_ip.
 SNAT maps from source inside_ip to outside_ip.
 If direction == dnat or direction == dnat_and_snat, then check for inside_ip mask == 32
 If direction == snat or direction == dnat_and_snat, then check for outside_ip == 32
 
 The names "inside_ip" and "outside_ip" keeps tripping me up. The nomenclature feels similar to tunnel outside_ip and tunnel inside_ip. What do you think about "logical_ip" and "physical_ip" instead?
 
 
 >As an addendum, the current schema does dnat:30.0.0.3=192.168.1.2. I
 >would like to enhance it to also be able to provide ports. Something
 >like dnat:30.0.0.3:4443=192.168.1.2:80
 >
 >So we will need to include the above with the new table that you are
 >thinking too.
 
 You can add outside_protocol, outside_l4_port, inside_protocol, and inside_l4_port.
 
 >> Note also that for distributed handling of floating IPs, we will need MAC
 >> addresses to go along with the floating IPs. This makes me think it might
 >> be worth having an indirection to a separate nat table.
 
 We will add outside_mac in the patch for distributed handling of floating IPs.
 
 Mickey
diff mbox

Patch

diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index 5f0bf19..4a486a0 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -442,6 +442,85 @@  emit_ct(struct action_context *ctx, bool recirc_next, bool commit)
     add_prerequisite(ctx, "ip");
 }
 
+static void
+parse_ct_nat(struct action_context *ctx, bool snat)
+{
+    const size_t ct_offset = ctx->ofpacts->size;
+    ofpbuf_pull(ctx->ofpacts, ct_offset);
+
+    struct ofpact_conntrack *ct = ofpact_put_CT(ctx->ofpacts);
+
+    if (ctx->ap->cur_ltable < ctx->ap->n_tables) {
+        ct->recirc_table = ctx->ap->first_ptable + ctx->ap->cur_ltable + 1;
+    } else {
+        action_error(ctx,
+                     "\"ct_[sd]nat\" action not allowed in last table.");
+        return;
+    }
+
+    if (snat) {
+        ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
+    } else {
+        ct->zone_src.field = mf_from_id(MFF_LOG_DNAT_ZONE);
+    }
+    ct->zone_src.ofs = 0;
+    ct->zone_src.n_bits = 16;
+    ct->flags = 0;
+    ct->alg = 0;
+
+    add_prerequisite(ctx, "ip");
+
+    struct ofpact_nat *nat;
+    size_t nat_offset;
+    nat_offset = ctx->ofpacts->size;
+    ofpbuf_pull(ctx->ofpacts, nat_offset);
+
+    nat = ofpact_put_NAT(ctx->ofpacts);
+    nat->flags = 0;
+    nat->range_af = AF_UNSPEC;
+
+    int commit = 0;
+    if (lexer_match(ctx->lexer, LEX_T_LPAREN)) {
+        ovs_be32 ip;
+        if (ctx->lexer->token.type == LEX_T_INTEGER
+            && ctx->lexer->token.format == LEX_F_IPV4) {
+            ip = ctx->lexer->token.value.ipv4;
+        } else {
+            action_syntax_error(ctx, "invalid ip");
+            return;
+        }
+
+        nat->range_af = AF_INET;
+        nat->range.addr.ipv4.min = ip;
+        if (snat) {
+            nat->flags |= NX_NAT_F_SRC;
+        } else {
+            nat->flags |= NX_NAT_F_DST;
+        }
+        commit = NX_CT_F_COMMIT;
+        lexer_get(ctx->lexer);
+        if (!lexer_match(ctx->lexer, LEX_T_RPAREN)) {
+            action_syntax_error(ctx, "expecting `)'");
+            return;
+        }
+    }
+
+    ctx->ofpacts->header = ofpbuf_push_uninit(ctx->ofpacts, nat_offset);
+    ct = ctx->ofpacts->header;
+    ct->flags |= commit;
+
+    /* XXX: For performance reasons, we try to prevent additional
+     * recirculations.  So far, ct_snat which is used in a gateway router
+     * does not need a recirculation. ct_snat(IP) does need a recirculation.
+     * Should we consider a method to let the actions specify whether a action
+     * needs recirculation if there more use cases?. */
+    if (!commit && snat) {
+        ct->recirc_table = NX_CT_RECIRC_NONE;
+    }
+    ofpact_finish(ctx->ofpacts, &ct->ofpact);
+    ofpbuf_push_uninit(ctx->ofpacts, ct_offset);
+}
+
 static bool
 parse_action(struct action_context *ctx)
 {
@@ -469,6 +548,10 @@  parse_action(struct action_context *ctx)
         emit_ct(ctx, true, false);
     } else if (lexer_match_id(ctx->lexer, "ct_commit")) {
         emit_ct(ctx, false, true);
+    } else if (lexer_match_id(ctx->lexer, "ct_dnat")) {
+        parse_ct_nat(ctx, false);
+    } else if (lexer_match_id(ctx->lexer, "ct_snat")) {
+        parse_ct_nat(ctx, true);
     } else if (lexer_match_id(ctx->lexer, "arp")) {
         parse_arp_action(ctx);
     } else if (lexer_match_id(ctx->lexer, "get_arp")) {
diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 970c352..838fbb7 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -500,11 +500,40 @@  next;
 
       <li>
         <p>
-          Reply to ARP requests.  These flows reply to ARP requests for the
-          router's own IP address.  For each router port <var>P</var> that owns
-          IP address <var>A</var> and Ethernet address <var>E</var>, a
-          priority-90 flow matches <code>inport == <var>P</var> &amp;&amp;
-          arp.op == 1 &amp;&amp; arp.tpa == <var>A</var></code> (ARP request)
+          Reply to ARP requests.
+        </p>
+
+        <p>
+          These flows reply to ARP requests for the router's own IP address.
+          For each router port <var>P</var> that owns IP address <var>A</var>
+          and Ethernet address <var>E</var>, a priority-90 flow matches
+          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
+          arp.tpa == <var>A</var></code> (ARP request) with the following
+          actions:
+        </p>
+
+        <pre>
+eth.dst = eth.src;
+eth.src = <var>E</var>;
+arp.op = 2; /* ARP reply. */
+arp.tha = arp.sha;
+arp.sha = <var>E</var>;
+arp.tpa = arp.spa;
+arp.spa = <var>A</var>;
+outport = <var>P</var>;
+inport = ""; /* Allow sending out inport. */
+output;
+        </pre>
+      </li>
+
+      <li>
+        <p>
+          These flows reply to ARP requests for the virtual IP addresses
+          configured in the router for DNAT. For a configured DNAT IP address
+          <var>A</var>, for each router port <var>P</var> with Ethernet
+          address <var>E</var>, a priority-90 flow matches
+          <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
+          arp.tpa == <var>A</var></code> (ARP request)
           with the following actions:
         </p>
 
@@ -646,7 +675,63 @@  icmp4 {
       </li>
     </ul>
 
-    <h3>Ingress Table 2: IP Routing</h3>
+    <h3>Ingress Table 2: SNAT</h3>
+
+    <p>
+      This is for already established connections' reverse traffic.
+      i.e., SNAT has already been done in egress pipeline and now the
+      packet has entered the ingress pipeline as part of a reply. It is
+      unSNATted here.
+    </p>
+
+    <ul>
+      <li>
+        <p>
+          For each configuration in the OVN Northdbound database, that asks
+          to change the source IP address of a packet from <var>A</var> to
+          <var>B</var>, a priority-100 flow matches <code>ip4.dst ==
+          <var>B</var></code> with an action <code>ct_snat; next;</code>.
+        </p>
+
+        <p>
+          A priority-0 logical flow with match <code>1</code> has actions
+          <code>next;</code>.
+        </p>
+      </li>
+    </ul>
+
+    <h3>Ingress Table 3: DNAT</h3>
+
+    <p>
+      Packets enter the pipeline with destination IP address that needs to
+      be DNATted from a virtual IP address to a real IP address. Packets
+      in the reverse direction needs to be unDNATed.
+    </p>
+    <ul>
+      <li>
+        <p>
+          For each configuration in the OVN Northdbound database, that asks
+          to change the destination IP address of a packet from <var>A</var> to
+          <var>B</var>, a priority-100 flow matches <code>ip4.dst ==
+          <var>A</var></code> with an action <code>ct_dnat(<var>B</var>);
+          </code>.
+        </p>
+
+        <p>
+          For each configuration in the OVN Northdbound database, that asks
+          to change the destination IP address of a packet from <var>A</var> to
+          <var>B</var>, a priority-50 flow matches every <code>ip</code> packet
+          with an action <code>ct_dnat;</code>.
+        </p>
+
+        <p>
+          A priority-0 logical flow with match <code>1</code> has actions
+          <code>next;</code>.
+        </p>
+      </li>
+    </ul>
+
+    <h3>Ingress Table 4: IP Routing</h3>
 
     <p>
       A packet that arrives at this table is an IP packet that should be routed
@@ -655,7 +740,7 @@  icmp4 {
       <code>ip4.dst</code>, the packet's final destination, unchanged) and
       advances to the next table for ARP resolution.  It also sets
       <code>reg1</code> to the IP address owned by the selected router port
-      (which is used later in table 4 as the IP source address for an ARP
+      (which is used later in table 6 as the IP source address for an ARP
       request, if needed).
     </p>
 
@@ -726,7 +811,7 @@  icmp4 {
       </li>
     </ul>
 
-    <h3>Ingress Table 3: ARP Resolution</h3>
+    <h3>Ingress Table 5: ARP Resolution</h3>
 
     <p>
       Any packet that reaches this table is an IP packet whose next-hop IP
@@ -781,7 +866,7 @@  icmp4 {
       </li>
     </ul>
 
-    <h3>Ingress Table 4: ARP Request</h3>
+    <h3>Ingress Table 6: ARP Request</h3>
 
     <p>
       In the common case where the Ethernet destination has been resolved, this
@@ -806,7 +891,7 @@  arp {
         </pre>
 
         <p>
-          (Ingress table 2 initialized <code>reg1</code> with the IP address
+          (Ingress table 4 initialized <code>reg1</code> with the IP address
           owned by <code>outport</code>.)
         </p>
 
@@ -821,7 +906,29 @@  arp {
       </li>
     </ul>
 
-    <h3>Egress Table 0: Delivery</h3>
+    <h3>Egress Table 0: SNAT</h3>
+
+    <p>
+      Packets that are configured to be SNATed get their source IP address
+      changed based on the configuration in the OVN Northdbound database.
+    </p>
+    <ul>
+      <li>
+        <p>
+          For each configuration in the OVN Northdbound database, that asks
+          to change the source IP address of a packet from <var>A</var> to
+          <var>B</var>, a priority-100 flow matches <code>ip4.src ==
+          <var>A</var></code> with an action <code>ct_snat(<var>B</var>)
+          </code>.
+        </p>
+        <p>
+          A priority-0 logical flow with match <code>1</code> has actions
+          <code>next;</code>.
+        </p>
+      </li>
+    </ul>
+
+    <h3>Egress Table 1: Delivery</h3>
 
     <p>
       Packets that reach this table are ready for delivery.  It contains
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 7852d83..3ce88a7 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -105,12 +105,15 @@  enum ovn_stage {
     /* Logical router ingress stages. */                              \
     PIPELINE_STAGE(ROUTER, IN,  ADMISSION,   0, "lr_in_admission")    \
     PIPELINE_STAGE(ROUTER, IN,  IP_INPUT,    1, "lr_in_ip_input")     \
-    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  2, "lr_in_ip_routing")   \
-    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 3, "lr_in_arp_resolve")  \
-    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 4, "lr_in_arp_request")  \
+    PIPELINE_STAGE(ROUTER, IN,  SNAT,        2, "lr_in_snat")         \
+    PIPELINE_STAGE(ROUTER, IN,  DNAT,        3, "lr_in_dnat")         \
+    PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  4, "lr_in_ip_routing")   \
+    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 5, "lr_in_arp_resolve")  \
+    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 6, "lr_in_arp_request")  \
                                                                       \
     /* Logical router egress stages. */                               \
-    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,    0, "lr_out_delivery")
+    PIPELINE_STAGE(ROUTER, OUT, SNAT,      0, "lr_out_snat")          \
+    PIPELINE_STAGE(ROUTER, OUT, DELIVERY,  1, "lr_out_delivery")
 
 #define PIPELINE_STAGE(DP_TYPE, PIPELINE, STAGE, TABLE, NAME)   \
     S_##DP_TYPE##_##PIPELINE##_##STAGE                          \
@@ -1965,6 +1968,44 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         free(match);
         free(actions);
 
+        /* ARP handling for virtual IP addresses.
+         *
+         * DNAT IP addresses are virtual IP addresses that need ARP
+         * handling. */
+        struct smap_node *node;
+        SMAP_FOR_EACH(node, &op->od->nbr->dnat) {
+            ovs_be32 ip;
+            if (!ip_parse(node->key, &ip) || !ip) {
+                static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+                VLOG_WARN_RL(&rl, "bad ip address %s in dnat configuration"
+                             "for router %s", node->key, op->key);
+                continue;
+            }
+
+            match = xasprintf(
+                "inport == %s && arp.tpa == "IP_FMT" && arp.op == 1",
+                op->json_key, IP_ARGS(ip));
+            actions = xasprintf(
+                "eth.dst = eth.src; "
+                "eth.src = "ETH_ADDR_FMT"; "
+                "arp.op = 2; /* ARP reply */ "
+                "arp.tha = arp.sha; "
+                "arp.sha = "ETH_ADDR_FMT"; "
+                "arp.tpa = arp.spa; "
+                "arp.spa = "IP_FMT"; "
+                "outport = %s; "
+                "inport = \"\"; /* Allow sending out inport. */ "
+                "output;",
+                ETH_ADDR_ARGS(op->mac),
+                ETH_ADDR_ARGS(op->mac),
+                IP_ARGS(ip),
+                op->json_key);
+            ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 90,
+                          match, actions);
+            free(match);
+            free(actions);
+        }
+
         /* Drop IP traffic to this router. */
         match = xasprintf("ip4.dst == "IP_FMT, IP_ARGS(op->ip));
         ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 60,
@@ -1972,6 +2013,123 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         free(match);
     }
 
+    /* Ingress SNAT. This is for already established connections' reverse
+     * traffic. i.e., SNAT has already been done in egress pipeline and now
+     * the packet has entered the ingress pipeline as part of a reply.
+     * We undo the SNAT here.
+     *
+     * Undoing SNAT has to happen before DNAT processing. This is because when
+     * the packet was DNATed in ingress pipeline, it did not know about the
+     * possibility of eventual additional SNAT in egress pipeline. */
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        if (!od->nbr) {
+            continue;
+        }
+
+        /* Packets are allowed by default. */
+        ovn_lflow_add(lflows, od, S_ROUTER_IN_SNAT, 0, "1", "next;");
+
+        /* SNAT rules are only valid on non-distributed routers. */
+        if (!smap_get(&od->nbr->options, "chassis")) {
+            continue;
+        }
+
+        struct smap_node *node;
+        SMAP_FOR_EACH(node, &od->nbr->snat) {
+            ovs_be32 ip, mask;
+
+            char *error = ip_parse_masked(node->key, &ip, &mask);
+            if (error) {
+                static struct vlog_rate_limit rl
+                        = VLOG_RATE_LIMIT_INIT(5, 1);
+                VLOG_WARN_RL(&rl, "bad ip network or ip %s for snat",
+                             node->key);
+                continue;
+            }
+
+            error = ip_parse_masked(node->value, &ip, &mask);
+            if (error || mask != OVS_BE32_MAX) {
+                static struct vlog_rate_limit rl
+                        = VLOG_RATE_LIMIT_INIT(5, 1);
+                VLOG_WARN_RL(&rl, "bad ip address %s for snat", node->value);
+                continue;
+            }
+
+            char *match;
+            match = xasprintf("ip && ip4.dst == %s", node->value);
+            ovn_lflow_add(lflows, od, S_ROUTER_IN_SNAT, 100,
+                          match, "ct_snat; next;");
+            free(match);
+        }
+    }
+
+    /* Ingress DNAT. Packets enter the pipeline with destination IP address
+     * that needs to be DNATted from a virtual IP address to a real
+     * IP address.  Packets in the reverse direction get unDNATed. */
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        if (!od->nbr) {
+            continue;
+        }
+
+        /* Packets are allowed by default. */
+        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 0, "1", "next;");
+
+        /* DNAT rules are only valid on non-distributed routers. */
+        if (!smap_get(&od->nbr->options, "chassis")) {
+            continue;
+        }
+
+        struct smap_node *node;
+        SMAP_FOR_EACH(node, &od->nbr->dnat) {
+            char *match, *actions;
+            ovs_be32 ip, mask;
+
+            char *error = ip_parse_masked(node->key, &ip, &mask);
+            if (error || mask != OVS_BE32_MAX) {
+                static struct vlog_rate_limit rl
+                        = VLOG_RATE_LIMIT_INIT(5, 1);
+                VLOG_WARN_RL(&rl, "bad ip address key %s for dnat", node->key);
+                continue;
+            }
+
+            error = ip_parse_masked(node->value, &ip, &mask);
+            if (error || mask != OVS_BE32_MAX) {
+                static struct vlog_rate_limit rl
+                        = VLOG_RATE_LIMIT_INIT(5, 1);
+                VLOG_WARN_RL(&rl, "bad ip address value %s for dnat",
+                             node->value);
+                continue;
+            }
+
+            /* Packet when it goes from the initiator to destination.
+             * We need to zero the inport because the router can
+             * send the packet back through the same interface. */
+            match = xasprintf("ip && ip4.dst == %s", node->key);
+            actions = xasprintf("inport = \"\"; ct_dnat(%s);", node->value);
+            ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 100,
+                          match, actions);
+            free(match);
+            free(actions);
+        }
+
+        /* Re-circulate every other packet through the DNAT zone.
+         * This helps with 2 things.
+         *
+         * 1. Any packet that needs to be unDNATed in the reverse
+         * direction gets unDNATed. Ideally this could be done in
+         * the egress pipeline. But since the gateway router
+         * does not have any feature that depends on the source
+         * ip address being virtual IP address for IP routing,
+         * we can do it here, saving a future re-circulation.
+         *
+         * 2. Any packet that was sent through SNAT zone in the
+         * previous table automatically gets re-circulated to get
+         * back the new destination IP address that is needed for
+         * routing in the openflow pipeline. */
+        ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
+                      "ip", "inport = \"\"; ct_dnat;");
+    }
+
     /* Logical router ingress table 2: IP Routing.
      *
      * A packet that arrives at this table is an IP packet that should be
@@ -2172,7 +2330,53 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         ovn_lflow_add(lflows, od, S_ROUTER_IN_ARP_REQUEST, 0, "1", "output;");
     }
 
-    /* Logical router egress table 0: Delivery (priority 100).
+    /* Egress SNAT. Packets enter the pipeline with source ip address
+     * that needs to be SNATted to a virtual ip address. */
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        if (!od->nbr) {
+            continue;
+        }
+
+        /* Packets are allowed by default. */
+        ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 0, "1", "next;");
+
+        /* SNAT rules are only valid on non-distributed routers. */
+        if (!smap_get(&od->nbr->options, "chassis")) {
+            continue;
+        }
+
+        struct smap_node *node;
+        SMAP_FOR_EACH(node, &od->nbr->snat) {
+            char *match, *actions;
+            ovs_be32 ip, mask;
+
+            char *error = ip_parse_masked(node->key, &ip, &mask);
+            if (error) {
+                static struct vlog_rate_limit rl
+                        = VLOG_RATE_LIMIT_INIT(5, 1);
+                VLOG_WARN_RL(&rl, "bad ip or ip network %s in snat key",
+                             node->key);
+                continue;
+            }
+
+            error = ip_parse_masked(node->value, &ip, &mask);
+            if (error || mask != OVS_BE32_MAX) {
+                static struct vlog_rate_limit rl
+                        = VLOG_RATE_LIMIT_INIT(5, 1);
+                VLOG_WARN_RL(&rl, "bad ip address %s for snat", node->value);
+                continue;
+            }
+
+            match = xasprintf("ip && ip4.src == %s", node->key);
+            actions = xasprintf("ct_snat(%s);", node->value);
+            ovn_lflow_add(lflows, od, S_ROUTER_OUT_SNAT, 100,
+                          match, actions);
+            free(match);
+            free(actions);
+        }
+    }
+
+    /* Logical router egress table 1: Delivery (priority 100).
      *
      * Priority 100 rules deliver packets to enabled logical ports. */
     HMAP_FOR_EACH (op, key_node, ports) {
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index fa21b30..4d9c613 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "2.1.2",
-    "cksum": "429668869 5325",
+    "version": "2.1.3",
+    "cksum": "400575131 5731",
     "tables": {
         "Logical_Switch": {
             "columns": {
@@ -78,6 +78,14 @@ 
                                    "max": "unlimited"}},
                 "default_gw": {"type": {"key": "string", "min": 0, "max": 1}},
                 "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
+                "dnat" : {
+                    "type": {"key": {"type": "string"},
+                             "value": {"type" : "string"},
+                             "min": 0, "max": "unlimited"}},
+                "snat" : {
+                    "type": {"key": {"type": "string"},
+                             "value": {"type" : "string"},
+                             "min": 0, "max": "unlimited"}},
                 "options": {
                      "type": {"key": "string",
                               "value": "string",
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index d239499..508a865 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -631,18 +631,41 @@ 
       router has all ingress and egress traffic dropped.
     </column>
 
+    <column name="dnat">
+      A map of externally visible IP address to their corresponding IP
+      addresses in the logical space.  The externally visible IP address
+      is DNATed to the IP address in the logical space.  DNAT only works
+      on routers that are non-distributed.
+    </column>
+
+    <column name="snat">
+      A map of IP network (e.g 192.168.1.0/24) or an IP address to another
+      IP address.  All IP packets with their source IP address that either
+      matches the key or is in the provided network is SNATed
+      into the IP address in the value.  SNAT only works on routers that are
+      non-distributed.
+    </column>
+
     <group title="Options">
       <p>
         Additional options for the logical router.
       </p>
 
       <column name="options" key="chassis">
-        If set, indicates that the logical router in question is
-        non-distributed and resides in the set chassis. The same
-        value is also used by <code>ovn-controller</code> to
-        uniquely identify the chassis in the OVN deployment and
-        comes from <code>external_ids:system-id</code> in the
-        <code>Open_vSwitch</code> table of Open_vSwitch database.
+        <p>
+          If set, indicates that the logical router in question is
+          non-distributed and resides in the set chassis.  The same
+          value is also used by <code>ovn-controller</code> to
+          uniquely identify the chassis in the OVN deployment and
+          comes from <code>external_ids:system-id</code> in the
+          <code>Open_vSwitch</code> table of Open_vSwitch database.
+        </p>
+
+        <p>
+          The non-distributed router (also known as Gateway router)
+          can only be connected to a distributed router via a switch
+          if SNAT and DNAT is to be configured in the Gateway router.
+        </p>
       </column>
     </group>
     
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 741228c..3a0268f 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -951,6 +951,44 @@ 
           </p>
         </dd>
 
+        <dt><code>ct_dnat;</code></dt>
+        <dt><code>ct_dnat(<var>IP</var>);</code></dt>
+        <dd>
+          <p>
+            <code>ct_dnat</code> sends the packet through the DNAT zone to
+            unDNAT any packet that was DNATed in a different direction.
+          </p>
+          <p>
+            <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
+            DNAT zone to change the destination IP address of the packet to
+            the one provided inside the parenthesis and commits the connection.
+          </p>
+          <p>
+            Both <code>ct_dnat</code> and <code>ct_dnat(<var>IP</var>)</code>
+            recirculate the packet in the datapath.
+          </p>
+        </dd>
+
+        <dt><code>ct_snat;</code></dt>
+        <dt><code>ct_snat(<var>IP</var>);</code></dt>
+        <dd>
+          <p>
+            <code>ct_snat</code> sends the packet through the SNAT zone to
+            unSNAT any packet that was SNATed in a different direction.
+          </p>
+          <p>
+            <code>ct_snat(<var>IP</var>)</code> sends the packet through the
+            SNAT zone to change the source IP address of the packet to
+            the one provided inside the parenthesis and commits the connection.
+          </p>
+          <p>
+            <code>ct_snat</code> does not recirculate the packet in the
+            datapath and hence should be followed by a <code>next;</code>
+            action. <code>ct_snat(<var>IP</var>)</code> does
+            recirculate the packet in the datapath.
+          </p>
+        </dd>
+
         <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
         <dd>
           <p>