diff mbox series

[ovs-dev,v2] ovn-northd: Drop IP packets destined to router owned IPs (after NAT).

Message ID 1599554583-1698-1-git-send-email-dceara@redhat.com
State Changes Requested
Headers show
Series [ovs-dev,v2] ovn-northd: Drop IP packets destined to router owned IPs (after NAT). | expand

Commit Message

Dumitru Ceara Sept. 8, 2020, 8:43 a.m. UTC
OVN was dropping IP packets destined to IPs owned by logical routers but
only if those IPs are not used for SNAT rules. However, if a packet
doesn't match an existing NAT session and its destination is still a
router owned IP, it can be safely dropped. Otherwise it will trigger an
unnecessary packet-in in stage lr_in_arp_request.

To achieve that we add flows that drop traffic to router owned IPs in
table lr_in_arp_resolve.

Reported-by: Tim Rozet <trozet@redhat.com>
Reported-at: https://bugzilla.redhat.com/1876174
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
---
V2:
- rebased changes.
---
 northd/ovn-northd.8.xml | 24 ++++++++++++
 northd/ovn-northd.c     | 48 ++++++++++++++++++++++++
 tests/ovn.at            | 98 ++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 168 insertions(+), 2 deletions(-)

Comments

Numan Siddique Sept. 8, 2020, 10:58 a.m. UTC | #1
On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
>
> OVN was dropping IP packets destined to IPs owned by logical routers but
> only if those IPs are not used for SNAT rules. However, if a packet
> doesn't match an existing NAT session and its destination is still a
> router owned IP, it can be safely dropped. Otherwise it will trigger an
> unnecessary packet-in in stage lr_in_arp_request.
>
> To achieve that we add flows that drop traffic to router owned IPs in
> table lr_in_arp_resolve.
>
> Reported-by: Tim Rozet <trozet@redhat.com>
> Reported-at: https://bugzilla.redhat.com/1876174
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>



Hi Dumitru,

Thanks for the fix.

I have few comments.

Suppose we have the below lr and NAT entries

router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
    port lr0-sw0
        mac: "00:00:00:00:ff:01"
        networks: ["10.0.0.1/24"]
    port lr0-sw1
        mac: "00:00:00:00:ff:02"
        networks: ["20.0.0.1/24"]
    port lr0-public
        mac: "00:00:20:20:12:13"
        networks: ["172.168.0.100/24"]
        gateway chassis: [chassis-1]
    nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
        external ip: "172.168.0.110"
        logical ip: "10.0.0.3"
        type: "dnat_and_snat"
    nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
        external ip: "172.168.0.120"
        logical ip: "20.0.0.3"
        type: "dnat_and_snat"
    nat da28129c-4e81-4e20-865a-768bd88e5a26
        external ip: "172.168.0.100"
        logical ip: "10.0.0.0/24"
        type: "snat"

I see the below lflows added in lr_in_ip_input to drop the pkts for
router port IPs

table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
{10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
{20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)

So I think there is no need to add the lflows in lr_in_arp_resolve to
drop for 10.0.0.1 and 20.0.0.1 again.

I think it's sufficient to add lflows in lr_in_arp_resolve to drop the
packet only for packets destined to the router ips which have NAT
entries.

Thanks
Numan

>
> ---
> V2:
> - rebased changes.
> ---
>  northd/ovn-northd.8.xml | 24 ++++++++++++
>  northd/ovn-northd.c     | 48 ++++++++++++++++++++++++
>  tests/ovn.at            | 98 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 168 insertions(+), 2 deletions(-)
>
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 989e364..b9170fc 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -2943,6 +2943,30 @@ outport = <var>P</var>;
>
>        <li>
>          <p>
> +          Traffic with IP destination an address owned by the router should be
> +          dropped. Such traffic is normally dropped in ingress table
> +          <code>IP Input</code> except for IPs that are also shared with SNAT
> +          rules. However, if there was no unSNAT operation that happened
> +          successfully until this point in the pipeline and the destination IP
> +          of the packet is still a router owned IP, the packets can be safely
> +          dropped.
> +        </p>
> +
> +        <p>
> +          A priority-1 logical flow with match <code>ip4.dst = {..}</code>
> +          matches on traffic destined to router owned IPv4 addresses and
> +          has action <code>drop;</code>.
> +        </p>
> +
> +        <p>
> +          A priority-1 logical flow with match <code>ip6.dst = {..}</code>
> +          matches on traffic destined to router owned IPv4 addresses and
> +          has action <code>drop;</code>.
> +        </p>
> +      </li>
> +
> +      <li>
> +        <p>
>            Dynamic MAC bindings.  These flows resolve MAC-to-IP bindings
>            that have become known dynamically through ARP or neighbor
>            discovery.  (The ingress table <code>ARP Request</code> will
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 681c008..6e063ac 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -10588,6 +10588,54 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
>          }
>      }
>
> +    /* Drop IP traffic destined to router owned IPs. Part of it is dropped
> +     * in stage "lr_in_ip_input" but traffic that could have been unSNATed
> +     * but didn't match any existing session might still end up here.
> +     */
> +    HMAP_FOR_EACH (op, key_node, ports) {
> +        if (!op->nbrp) {
> +            continue;
> +        }
> +
> +        if (op->lrp_networks.n_ipv4_addrs) {
> +            ds_clear(&match);
> +            ds_put_cstr(&match, "ip4.dst == {");
> +
> +            for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> +                ds_put_format(&match, "%s, ",
> +                              op->lrp_networks.ipv4_addrs[i].addr_s);
> +            }
> +
> +            ds_chomp(&match, ' ');
> +            ds_chomp(&match, ',');
> +            ds_put_char(&match, '}');
> +
> +            /* Drop traffic with IP.dest == router-ip. */
> +            ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_ARP_RESOLVE, 1,
> +                                    ds_cstr(&match), "drop;",
> +                                    &op->nbrp->header_);
> +        }
> +
> +        if (op->lrp_networks.n_ipv6_addrs) {
> +            ds_clear(&match);
> +            ds_put_cstr(&match, "ip6.dst == {");
> +
> +            for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
> +                ds_put_format(&match, "%s, ",
> +                              op->lrp_networks.ipv6_addrs[i].addr_s);
> +            }
> +
> +            ds_chomp(&match, ' ');
> +            ds_chomp(&match, ',');
> +            ds_put_char(&match, '}');
> +
> +            /* Drop traffic with IP.dest == router-ip. */
> +            ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_ARP_RESOLVE, 1,
> +                                    ds_cstr(&match), "drop;",
> +                                    &op->nbrp->header_);
> +        }
> +    }
> +
>      HMAP_FOR_EACH (od, key_node, datapaths) {
>          if (!od->nbr) {
>              continue;
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 1fe34b7..0e14dfa 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -21377,8 +21377,12 @@ OVN_POPULATE_ARP
>
>  ovn-nbctl --wait=hv sync
>
> -AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.1], [1], [])
> -AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.2], [1], [])
> +AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.1], [0], [dnl
> +  table=13(lr_in_arp_resolve  ), priority=1    , match=(ip4.dst == {10.0.0.1}), action=(drop;)
> +])
> +AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.2], [0], [dnl
> +  table=13(lr_in_arp_resolve  ), priority=1    , match=(ip4.dst == {10.0.0.2}), action=(drop;)
> +])
>
>  ip_to_hex() {
>      printf "%02x%02x%02x%02x" "$@"
> @@ -21447,6 +21451,96 @@ OVS_WAIT_UNTIL([test x$(as hv1 ovn-appctl -t ovn-controller debug/status) = "xru
>  OVN_CLEANUP([hv1])
>  AT_CLEANUP
>
> +# Test dropping traffic destined to router owned IPs.
> +AT_SETUP([ovn -- gateway router drop traffic for own IPs])
> +ovn_start
> +
> +ovn-nbctl lr-add r1 -- set logical_router r1 options:chassis=hv1
> +ovn-nbctl ls-add s1
> +
> +# Connnect r1 to s1.
> +ovn-nbctl lrp-add r1 lrp-r1-s1 00:00:00:00:01:01 10.0.1.1/24
> +ovn-nbctl lsp-add s1 lsp-s1-r1 -- set Logical_Switch_Port lsp-s1-r1 type=router \
> +    options:router-port=lrp-r1-s1 addresses=router
> +
> +# Create logical port p1 in s1
> +ovn-nbctl lsp-add s1 p1 \
> +-- lsp-set-addresses p1 "f0:00:00:00:01:02 10.0.1.2"
> +
> +# Create two hypervisor and create OVS ports corresponding to logical ports.
> +net_add n1
> +
> +sim_add hv1
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.0.1
> +ovs-vsctl -- add-port br-int hv1-vif1 -- \
> +    set interface hv1-vif1 external-ids:iface-id=p1 \
> +    options:tx_pcap=hv1/vif1-tx.pcap \
> +    options:rxq_pcap=hv1/vif1-rx.pcap \
> +    ofport-request=1
> +
> +# Pre-populate the hypervisors' ARP tables so that we don't lose any
> +# packets for ARP resolution (native tunneling doesn't queue packets
> +# for ARP resolution).
> +OVN_POPULATE_ARP
> +
> +ovn-nbctl --wait=hv sync
> +
> +sw_key=$(ovn-sbctl --bare --columns tunnel_key list datapath_binding r1)
> +
> +AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.1.1], [0], [dnl
> +  table=13(lr_in_arp_resolve  ), priority=1    , match=(ip4.dst == {10.0.1.1}), action=(drop;)
> +])
> +
> +ip_to_hex() {
> +    printf "%02x%02x%02x%02x" "$@"
> +}
> +
> +# Send ip packets from p1 to lrp-r1-s1
> +src_mac="f00000000102"
> +dst_mac="000000000101"
> +src_ip=`ip_to_hex 10 0 1 2`
> +dst_ip=`ip_to_hex 10 0 1 1`
> +packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
> +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
> +
> +# No packet-ins should reach ovn-controller.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep -v n_packets=0 -c], [1], [dnl
> +0
> +])
> +
> +# The packet should have been dropped in the lr_in_ip_input stage.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=11, n_packets=1,.* priority=60,ip,metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> +1
> +])
> +
> +# Use the router IP as SNAT IP.
> +ovn-nbctl set logical_router r1 options:lb_force_snat_ip=10.0.1.1
> +ovn-nbctl --wait=hv sync
> +
> +# Send ip packets from p1 to lrp-r1-s1
> +src_mac="f00000000102"
> +dst_mac="000000000101"
> +src_ip=`ip_to_hex 10 0 1 2`
> +dst_ip=`ip_to_hex 10 0 1 1`
> +packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
> +as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
> +
> +# Even after configuring a router owned IP for SNAT, no packet-ins should
> +# reach ovn-controller.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep -v n_packets=0 -c], [1], [dnl
> +0
> +])
> +
> +# The packet should've been dropped in the lr_in_arp_resolve stage.
> +AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=21, n_packets=1,.* priority=1,ip,metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
> +1
> +])
> +
> +OVN_CLEANUP([hv1])
> +AT_CLEANUP
> +
>  AT_SETUP([ovn -- nb_cfg timestamp])
>  ovn_start
>
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Dumitru Ceara Sept. 8, 2020, 11:24 a.m. UTC | #2
On 9/8/20 12:58 PM, Numan Siddique wrote:
> On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> OVN was dropping IP packets destined to IPs owned by logical routers but
>> only if those IPs are not used for SNAT rules. However, if a packet
>> doesn't match an existing NAT session and its destination is still a
>> router owned IP, it can be safely dropped. Otherwise it will trigger an
>> unnecessary packet-in in stage lr_in_arp_request.
>>
>> To achieve that we add flows that drop traffic to router owned IPs in
>> table lr_in_arp_resolve.
>>
>> Reported-by: Tim Rozet <trozet@redhat.com>
>> Reported-at: https://bugzilla.redhat.com/1876174
>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> 
> 
> 
> Hi Dumitru,
> 
> Thanks for the fix.
> 
> I have few comments.
> 
> Suppose we have the below lr and NAT entries
> 
> router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
>     port lr0-sw0
>         mac: "00:00:00:00:ff:01"
>         networks: ["10.0.0.1/24"]
>     port lr0-sw1
>         mac: "00:00:00:00:ff:02"
>         networks: ["20.0.0.1/24"]
>     port lr0-public
>         mac: "00:00:20:20:12:13"
>         networks: ["172.168.0.100/24"]
>         gateway chassis: [chassis-1]
>     nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
>         external ip: "172.168.0.110"
>         logical ip: "10.0.0.3"
>         type: "dnat_and_snat"
>     nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
>         external ip: "172.168.0.120"
>         logical ip: "20.0.0.3"
>         type: "dnat_and_snat"
>     nat da28129c-4e81-4e20-865a-768bd88e5a26
>         external ip: "172.168.0.100"
>         logical ip: "10.0.0.0/24"
>         type: "snat"
> 
> I see the below lflows added in lr_in_ip_input to drop the pkts for
> router port IPs
> 
> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> {10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> {20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)
> 
> So I think there is no need to add the lflows in lr_in_arp_resolve to
> drop for 10.0.0.1 and 20.0.0.1 again.
> 
> I think it's sufficient to add lflows in lr_in_arp_resolve to drop the
> packet only for packets destined to the router ips which have NAT
> entries.
> 

Hi Numan,

Thanks for the review.

You're right we could try to optimize a bit the number of flows.

However, I didn't want to duplicate the code that builds snat_ips [0]
and I thought it complicates the already long build_lrouter_flows()
function if I add flows to stage LR_IN_ARP_RESOLVE in the same place
where we add flows for LR_IN_IP_INPUT.

Also, the number of IP addresses per router port is usually low so the
number of redundant logical flows in lr_in_arp_resolve would be low too.

If you really have a strong preference about this, I can try to change
it though.

Regards,
Dumitru

[0] https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L9015
Numan Siddique Sept. 8, 2020, 12:06 p.m. UTC | #3
On Tue, Sep 8, 2020 at 4:54 PM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 9/8/20 12:58 PM, Numan Siddique wrote:
> > On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
> >>
> >> OVN was dropping IP packets destined to IPs owned by logical routers but
> >> only if those IPs are not used for SNAT rules. However, if a packet
> >> doesn't match an existing NAT session and its destination is still a
> >> router owned IP, it can be safely dropped. Otherwise it will trigger an
> >> unnecessary packet-in in stage lr_in_arp_request.
> >>
> >> To achieve that we add flows that drop traffic to router owned IPs in
> >> table lr_in_arp_resolve.
> >>
> >> Reported-by: Tim Rozet <trozet@redhat.com>
> >> Reported-at: https://bugzilla.redhat.com/1876174
> >> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> >
> >
> >
> > Hi Dumitru,
> >
> > Thanks for the fix.
> >
> > I have few comments.
> >
> > Suppose we have the below lr and NAT entries
> >
> > router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
> >     port lr0-sw0
> >         mac: "00:00:00:00:ff:01"
> >         networks: ["10.0.0.1/24"]
> >     port lr0-sw1
> >         mac: "00:00:00:00:ff:02"
> >         networks: ["20.0.0.1/24"]
> >     port lr0-public
> >         mac: "00:00:20:20:12:13"
> >         networks: ["172.168.0.100/24"]
> >         gateway chassis: [chassis-1]
> >     nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
> >         external ip: "172.168.0.110"
> >         logical ip: "10.0.0.3"
> >         type: "dnat_and_snat"
> >     nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
> >         external ip: "172.168.0.120"
> >         logical ip: "20.0.0.3"
> >         type: "dnat_and_snat"
> >     nat da28129c-4e81-4e20-865a-768bd88e5a26
> >         external ip: "172.168.0.100"
> >         logical ip: "10.0.0.0/24"
> >         type: "snat"
> >
> > I see the below lflows added in lr_in_ip_input to drop the pkts for
> > router port IPs
> >
> > table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> > {10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
> > table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> > {20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)
> >
> > So I think there is no need to add the lflows in lr_in_arp_resolve to
> > drop for 10.0.0.1 and 20.0.0.1 again.
> >
> > I think it's sufficient to add lflows in lr_in_arp_resolve to drop the
> > packet only for packets destined to the router ips which have NAT
> > entries.
> >
>
> Hi Numan,
>
> Thanks for the review.
>
> You're right we could try to optimize a bit the number of flows.
>
> However, I didn't want to duplicate the code that builds snat_ips [0]
> and I thought it complicates the already long build_lrouter_flows()
> function if I add flows to stage LR_IN_ARP_RESOLVE in the same place
> where we add flows for LR_IN_IP_INPUT.
>
> Also, the number of IP addresses per router port is usually low so the
> number of redundant logical flows in lr_in_arp_resolve would be low too.
>
> If you really have a strong preference about this, I can try to change
> it though.
>

Hi Dumitru,

I was thinking more from the scale perspective. I think this issue is
to address ovn-k8s use case.
Although the gateway router on each node will be configured with NAT
entries and it will be
definitely having very few router ports, but I think on a large scale
setup, the cluster router will have
many router ports and we will see these flows on this router even
though there are no NAT entries
for the cluster router.

For the openstack deployment too we will see these flows and a logical
router with gateway router port
may have many other router ports connecting to the tenant logical switches.

It would be good if we address my comment. Or at the least add these
lflows only for routers configured
with NAT entries.

@Han Zhou  Do you have comments or concerns with this patch ?

I am raising these points, because you and Han addressed lflow
explosion issues a while back.
Although in this case, the flows added in this patch do not result in
exponential increase.

Thanks
Numan


> Regards,
> Dumitru
>
> [0] https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L9015
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Dumitru Ceara Sept. 8, 2020, 1:17 p.m. UTC | #4
On 9/8/20 2:06 PM, Numan Siddique wrote:
> On Tue, Sep 8, 2020 at 4:54 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> On 9/8/20 12:58 PM, Numan Siddique wrote:
>>> On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>
>>>> OVN was dropping IP packets destined to IPs owned by logical routers but
>>>> only if those IPs are not used for SNAT rules. However, if a packet
>>>> doesn't match an existing NAT session and its destination is still a
>>>> router owned IP, it can be safely dropped. Otherwise it will trigger an
>>>> unnecessary packet-in in stage lr_in_arp_request.
>>>>
>>>> To achieve that we add flows that drop traffic to router owned IPs in
>>>> table lr_in_arp_resolve.
>>>>
>>>> Reported-by: Tim Rozet <trozet@redhat.com>
>>>> Reported-at: https://bugzilla.redhat.com/1876174
>>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
>>>
>>>
>>>
>>> Hi Dumitru,
>>>
>>> Thanks for the fix.
>>>
>>> I have few comments.
>>>
>>> Suppose we have the below lr and NAT entries
>>>
>>> router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
>>>     port lr0-sw0
>>>         mac: "00:00:00:00:ff:01"
>>>         networks: ["10.0.0.1/24"]
>>>     port lr0-sw1
>>>         mac: "00:00:00:00:ff:02"
>>>         networks: ["20.0.0.1/24"]
>>>     port lr0-public
>>>         mac: "00:00:20:20:12:13"
>>>         networks: ["172.168.0.100/24"]
>>>         gateway chassis: [chassis-1]
>>>     nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
>>>         external ip: "172.168.0.110"
>>>         logical ip: "10.0.0.3"
>>>         type: "dnat_and_snat"
>>>     nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
>>>         external ip: "172.168.0.120"
>>>         logical ip: "20.0.0.3"
>>>         type: "dnat_and_snat"
>>>     nat da28129c-4e81-4e20-865a-768bd88e5a26
>>>         external ip: "172.168.0.100"
>>>         logical ip: "10.0.0.0/24"
>>>         type: "snat"
>>>
>>> I see the below lflows added in lr_in_ip_input to drop the pkts for
>>> router port IPs
>>>
>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
>>> {10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
>>> {20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)
>>>
>>> So I think there is no need to add the lflows in lr_in_arp_resolve to
>>> drop for 10.0.0.1 and 20.0.0.1 again.
>>>
>>> I think it's sufficient to add lflows in lr_in_arp_resolve to drop the
>>> packet only for packets destined to the router ips which have NAT
>>> entries.
>>>
>>
>> Hi Numan,
>>
>> Thanks for the review.
>>
>> You're right we could try to optimize a bit the number of flows.
>>
>> However, I didn't want to duplicate the code that builds snat_ips [0]
>> and I thought it complicates the already long build_lrouter_flows()
>> function if I add flows to stage LR_IN_ARP_RESOLVE in the same place
>> where we add flows for LR_IN_IP_INPUT.
>>
>> Also, the number of IP addresses per router port is usually low so the
>> number of redundant logical flows in lr_in_arp_resolve would be low too.
>>
>> If you really have a strong preference about this, I can try to change
>> it though.
>>
> 
> Hi Dumitru,
> 
> I was thinking more from the scale perspective. I think this issue is
> to address ovn-k8s use case.
> Although the gateway router on each node will be configured with NAT
> entries and it will be
> definitely having very few router ports, but I think on a large scale
> setup, the cluster router will have
> many router ports and we will see these flows on this router even
> though there are no NAT entries
> for the cluster router.
> 
> For the openstack deployment too we will see these flows and a logical
> router with gateway router port
> may have many other router ports connecting to the tenant logical switches.
> 
> It would be good if we address my comment. Or at the least add these
> lflows only for routers configured
> with NAT entries.
> 

We could also just add the flows to lr_in_arp_resolve and remove the
ones that drop traffic destined to router-owned IPs in lr_in_ip_input.

It's a bit ugly because we're forcing this in a stage that should
resolve ARP but lr_in_ip_input is too early for traffic that might get
NAT-ed.

What do you think?

> @Han Zhou  Do you have comments or concerns with this patch ?
> 
> I am raising these points, because you and Han addressed lflow
> explosion issues a while back.
> Although in this case, the flows added in this patch do not result in
> exponential increase.
> 

Right, here there should be max 1 flow per IP address so it shouldn't
create a scale issue.

Thanks,
Dumitru

> Thanks
> Numan
> 
> 
>> Regards,
>> Dumitru
>>
>> [0] https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L9015
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
Numan Siddique Sept. 8, 2020, 1:42 p.m. UTC | #5
On Tue, Sep 8, 2020 at 6:48 PM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 9/8/20 2:06 PM, Numan Siddique wrote:
> > On Tue, Sep 8, 2020 at 4:54 PM Dumitru Ceara <dceara@redhat.com> wrote:
> >>
> >> On 9/8/20 12:58 PM, Numan Siddique wrote:
> >>> On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
> >>>>
> >>>> OVN was dropping IP packets destined to IPs owned by logical routers but
> >>>> only if those IPs are not used for SNAT rules. However, if a packet
> >>>> doesn't match an existing NAT session and its destination is still a
> >>>> router owned IP, it can be safely dropped. Otherwise it will trigger an
> >>>> unnecessary packet-in in stage lr_in_arp_request.
> >>>>
> >>>> To achieve that we add flows that drop traffic to router owned IPs in
> >>>> table lr_in_arp_resolve.
> >>>>
> >>>> Reported-by: Tim Rozet <trozet@redhat.com>
> >>>> Reported-at: https://bugzilla.redhat.com/1876174
> >>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> >>>
> >>>
> >>>
> >>> Hi Dumitru,
> >>>
> >>> Thanks for the fix.
> >>>
> >>> I have few comments.
> >>>
> >>> Suppose we have the below lr and NAT entries
> >>>
> >>> router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
> >>>     port lr0-sw0
> >>>         mac: "00:00:00:00:ff:01"
> >>>         networks: ["10.0.0.1/24"]
> >>>     port lr0-sw1
> >>>         mac: "00:00:00:00:ff:02"
> >>>         networks: ["20.0.0.1/24"]
> >>>     port lr0-public
> >>>         mac: "00:00:20:20:12:13"
> >>>         networks: ["172.168.0.100/24"]
> >>>         gateway chassis: [chassis-1]
> >>>     nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
> >>>         external ip: "172.168.0.110"
> >>>         logical ip: "10.0.0.3"
> >>>         type: "dnat_and_snat"
> >>>     nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
> >>>         external ip: "172.168.0.120"
> >>>         logical ip: "20.0.0.3"
> >>>         type: "dnat_and_snat"
> >>>     nat da28129c-4e81-4e20-865a-768bd88e5a26
> >>>         external ip: "172.168.0.100"
> >>>         logical ip: "10.0.0.0/24"
> >>>         type: "snat"
> >>>
> >>> I see the below lflows added in lr_in_ip_input to drop the pkts for
> >>> router port IPs
> >>>
> >>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> >>> {10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
> >>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> >>> {20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)
> >>>
> >>> So I think there is no need to add the lflows in lr_in_arp_resolve to
> >>> drop for 10.0.0.1 and 20.0.0.1 again.
> >>>
> >>> I think it's sufficient to add lflows in lr_in_arp_resolve to drop the
> >>> packet only for packets destined to the router ips which have NAT
> >>> entries.
> >>>
> >>
> >> Hi Numan,
> >>
> >> Thanks for the review.
> >>
> >> You're right we could try to optimize a bit the number of flows.
> >>
> >> However, I didn't want to duplicate the code that builds snat_ips [0]
> >> and I thought it complicates the already long build_lrouter_flows()
> >> function if I add flows to stage LR_IN_ARP_RESOLVE in the same place
> >> where we add flows for LR_IN_IP_INPUT.
> >>
> >> Also, the number of IP addresses per router port is usually low so the
> >> number of redundant logical flows in lr_in_arp_resolve would be low too.
> >>
> >> If you really have a strong preference about this, I can try to change
> >> it though.
> >>
> >
> > Hi Dumitru,
> >
> > I was thinking more from the scale perspective. I think this issue is
> > to address ovn-k8s use case.
> > Although the gateway router on each node will be configured with NAT
> > entries and it will be
> > definitely having very few router ports, but I think on a large scale
> > setup, the cluster router will have
> > many router ports and we will see these flows on this router even
> > though there are no NAT entries
> > for the cluster router.
> >
> > For the openstack deployment too we will see these flows and a logical
> > router with gateway router port
> > may have many other router ports connecting to the tenant logical switches.
> >
> > It would be good if we address my comment. Or at the least add these
> > lflows only for routers configured
> > with NAT entries.
> >
>
> We could also just add the flows to lr_in_arp_resolve and remove the
> ones that drop traffic destined to router-owned IPs in lr_in_ip_input.
>
> It's a bit ugly because we're forcing this in a stage that should
> resolve ARP but lr_in_ip_input is too early for traffic that might get
> NAT-ed.
>
> What do you think?

I think we can keep the existing flows in lr_in_ip_input and add only
the required flows in lr_in_arp_resolve
to drop for the router ips which have NAT entry ?

I would do something like

HMAP_FOR_EACH (op, key_node, ports) {
        if (!op->nbrp) {
            continue;
        }

       for (int i = 0; i < od->nbr->n_nat; i++) {
          struct ovn_nat *nat_entry = &od->nat_entries[i];
          const struct nbrec_nat *nat = nat_entry->nb;

         if  (nat->externa_ip belongs to one of the logical router ip) {
                /* Drop traffic with IP.dest == router-ip. */
               ovn_lflow_add_with_hint(lflows, op->od,
S_ROUTER_IN_ARP_RESOLVE, 1,
                                    "ip4.dst == %s, "drop;",
                                    &op->nbrp->header_, nat_entry->external_ip);

         }
     }
}

Thanks
Numan

>
> > @Han Zhou  Do you have comments or concerns with this patch ?
> >
> > I am raising these points, because you and Han addressed lflow
> > explosion issues a while back.
> > Although in this case, the flows added in this patch do not result in
> > exponential increase.
> >
>
> Right, here there should be max 1 flow per IP address so it shouldn't
> create a scale issue.
>
> Thanks,
> Dumitru
>
> > Thanks
> > Numan
> >
> >
> >> Regards,
> >> Dumitru
> >>
> >> [0] https://github.com/ovn-org/ovn/blob/master/northd/ovn-northd.c#L9015
> >>
> >> _______________________________________________
> >> dev mailing list
> >> dev@openvswitch.org
> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Dumitru Ceara Sept. 8, 2020, 6:39 p.m. UTC | #6
On 9/8/20 3:42 PM, Numan Siddique wrote:
> On Tue, Sep 8, 2020 at 6:48 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>
>> On 9/8/20 2:06 PM, Numan Siddique wrote:
>>> On Tue, Sep 8, 2020 at 4:54 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>
>>>> On 9/8/20 12:58 PM, Numan Siddique wrote:
>>>>> On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>>>
>>>>>> OVN was dropping IP packets destined to IPs owned by logical routers but
>>>>>> only if those IPs are not used for SNAT rules. However, if a packet
>>>>>> doesn't match an existing NAT session and its destination is still a
>>>>>> router owned IP, it can be safely dropped. Otherwise it will trigger an
>>>>>> unnecessary packet-in in stage lr_in_arp_request.
>>>>>>
>>>>>> To achieve that we add flows that drop traffic to router owned IPs in
>>>>>> table lr_in_arp_resolve.
>>>>>>
>>>>>> Reported-by: Tim Rozet <trozet@redhat.com>
>>>>>> Reported-at: https://bugzilla.redhat.com/1876174
>>>>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
>>>>>
>>>>>
>>>>>
>>>>> Hi Dumitru,
>>>>>
>>>>> Thanks for the fix.
>>>>>
>>>>> I have few comments.
>>>>>
>>>>> Suppose we have the below lr and NAT entries
>>>>>
>>>>> router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
>>>>>     port lr0-sw0
>>>>>         mac: "00:00:00:00:ff:01"
>>>>>         networks: ["10.0.0.1/24"]
>>>>>     port lr0-sw1
>>>>>         mac: "00:00:00:00:ff:02"
>>>>>         networks: ["20.0.0.1/24"]
>>>>>     port lr0-public
>>>>>         mac: "00:00:20:20:12:13"
>>>>>         networks: ["172.168.0.100/24"]
>>>>>         gateway chassis: [chassis-1]
>>>>>     nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
>>>>>         external ip: "172.168.0.110"
>>>>>         logical ip: "10.0.0.3"
>>>>>         type: "dnat_and_snat"
>>>>>     nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
>>>>>         external ip: "172.168.0.120"
>>>>>         logical ip: "20.0.0.3"
>>>>>         type: "dnat_and_snat"
>>>>>     nat da28129c-4e81-4e20-865a-768bd88e5a26
>>>>>         external ip: "172.168.0.100"
>>>>>         logical ip: "10.0.0.0/24"
>>>>>         type: "snat"
>>>>>
>>>>> I see the below lflows added in lr_in_ip_input to drop the pkts for
>>>>> router port IPs
>>>>>
>>>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
>>>>> {10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
>>>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
>>>>> {20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)
>>>>>
>>>>> So I think there is no need to add the lflows in lr_in_arp_resolve to
>>>>> drop for 10.0.0.1 and 20.0.0.1 again.
>>>>>
>>>>> I think it's sufficient to add lflows in lr_in_arp_resolve to drop the
>>>>> packet only for packets destined to the router ips which have NAT
>>>>> entries.
>>>>>
>>>>
>>>> Hi Numan,
>>>>
>>>> Thanks for the review.
>>>>
>>>> You're right we could try to optimize a bit the number of flows.
>>>>
>>>> However, I didn't want to duplicate the code that builds snat_ips [0]
>>>> and I thought it complicates the already long build_lrouter_flows()
>>>> function if I add flows to stage LR_IN_ARP_RESOLVE in the same place
>>>> where we add flows for LR_IN_IP_INPUT.
>>>>
>>>> Also, the number of IP addresses per router port is usually low so the
>>>> number of redundant logical flows in lr_in_arp_resolve would be low too.
>>>>
>>>> If you really have a strong preference about this, I can try to change
>>>> it though.
>>>>
>>>
>>> Hi Dumitru,
>>>
>>> I was thinking more from the scale perspective. I think this issue is
>>> to address ovn-k8s use case.
>>> Although the gateway router on each node will be configured with NAT
>>> entries and it will be
>>> definitely having very few router ports, but I think on a large scale
>>> setup, the cluster router will have
>>> many router ports and we will see these flows on this router even
>>> though there are no NAT entries
>>> for the cluster router.
>>>
>>> For the openstack deployment too we will see these flows and a logical
>>> router with gateway router port
>>> may have many other router ports connecting to the tenant logical switches.
>>>
>>> It would be good if we address my comment. Or at the least add these
>>> lflows only for routers configured
>>> with NAT entries.
>>>
>>
>> We could also just add the flows to lr_in_arp_resolve and remove the
>> ones that drop traffic destined to router-owned IPs in lr_in_ip_input.
>>
>> It's a bit ugly because we're forcing this in a stage that should
>> resolve ARP but lr_in_ip_input is too early for traffic that might get
>> NAT-ed.
>>
>> What do you think?
> 
> I think we can keep the existing flows in lr_in_ip_input and add only
> the required flows in lr_in_arp_resolve
> to drop for the router ips which have NAT entry ?
> 

OK. I'll work on v3.

> I would do something like
> 
> HMAP_FOR_EACH (op, key_node, ports) {
>         if (!op->nbrp) {
>             continue;
>         }
> 
>        for (int i = 0; i < od->nbr->n_nat; i++) {
>           struct ovn_nat *nat_entry = &od->nat_entries[i];
>           const struct nbrec_nat *nat = nat_entry->nb;
> 
>          if  (nat->externa_ip belongs to one of the logical router ip) {
>                 /* Drop traffic with IP.dest == router-ip. */
>                ovn_lflow_add_with_hint(lflows, op->od,
> S_ROUTER_IN_ARP_RESOLVE, 1,
>                                     "ip4.dst == %s, "drop;",
>                                     &op->nbrp->header_, nat_entry->external_ip);
> 
>          }
>      }
> }

This is not really enough, we'd also need to check if
lb_force_snat_ip/dnat_force_snat_ip are also router IPs so the above
will basically mean rewriting almost the same code as here (lr_in_ip_input):
https://github.com/ovn-org/ovn/blob/fc79d690b9e5479ce0190a9808c21fdca76575c8/northd/ovn-northd.c#L9015

I'll find a way to refactor it a bit so we don't have so much code
duplication and at the same time not have any redundant flows.

Regards,
Dumitru
Dumitru Ceara Sept. 17, 2020, 12:54 p.m. UTC | #7
On 9/8/20 8:39 PM, Dumitru Ceara wrote:
> On 9/8/20 3:42 PM, Numan Siddique wrote:
>> On Tue, Sep 8, 2020 at 6:48 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>>
>>> On 9/8/20 2:06 PM, Numan Siddique wrote:
>>>> On Tue, Sep 8, 2020 at 4:54 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>>
>>>>> On 9/8/20 12:58 PM, Numan Siddique wrote:
>>>>>> On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>>>>
>>>>>>> OVN was dropping IP packets destined to IPs owned by logical routers but
>>>>>>> only if those IPs are not used for SNAT rules. However, if a packet
>>>>>>> doesn't match an existing NAT session and its destination is still a
>>>>>>> router owned IP, it can be safely dropped. Otherwise it will trigger an
>>>>>>> unnecessary packet-in in stage lr_in_arp_request.
>>>>>>>
>>>>>>> To achieve that we add flows that drop traffic to router owned IPs in
>>>>>>> table lr_in_arp_resolve.
>>>>>>>
>>>>>>> Reported-by: Tim Rozet <trozet@redhat.com>
>>>>>>> Reported-at: https://bugzilla.redhat.com/1876174
>>>>>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi Dumitru,
>>>>>>
>>>>>> Thanks for the fix.
>>>>>>
>>>>>> I have few comments.
>>>>>>
>>>>>> Suppose we have the below lr and NAT entries
>>>>>>
>>>>>> router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
>>>>>>     port lr0-sw0
>>>>>>         mac: "00:00:00:00:ff:01"
>>>>>>         networks: ["10.0.0.1/24"]
>>>>>>     port lr0-sw1
>>>>>>         mac: "00:00:00:00:ff:02"
>>>>>>         networks: ["20.0.0.1/24"]
>>>>>>     port lr0-public
>>>>>>         mac: "00:00:20:20:12:13"
>>>>>>         networks: ["172.168.0.100/24"]
>>>>>>         gateway chassis: [chassis-1]
>>>>>>     nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
>>>>>>         external ip: "172.168.0.110"
>>>>>>         logical ip: "10.0.0.3"
>>>>>>         type: "dnat_and_snat"
>>>>>>     nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
>>>>>>         external ip: "172.168.0.120"
>>>>>>         logical ip: "20.0.0.3"
>>>>>>         type: "dnat_and_snat"
>>>>>>     nat da28129c-4e81-4e20-865a-768bd88e5a26
>>>>>>         external ip: "172.168.0.100"
>>>>>>         logical ip: "10.0.0.0/24"
>>>>>>         type: "snat"
>>>>>>
>>>>>> I see the below lflows added in lr_in_ip_input to drop the pkts for
>>>>>> router port IPs
>>>>>>
>>>>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
>>>>>> {10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
>>>>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
>>>>>> {20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)
>>>>>>
>>>>>> So I think there is no need to add the lflows in lr_in_arp_resolve to
>>>>>> drop for 10.0.0.1 and 20.0.0.1 again.
>>>>>>
>>>>>> I think it's sufficient to add lflows in lr_in_arp_resolve to drop the
>>>>>> packet only for packets destined to the router ips which have NAT
>>>>>> entries.
>>>>>>
>>>>>
>>>>> Hi Numan,
>>>>>
>>>>> Thanks for the review.
>>>>>
>>>>> You're right we could try to optimize a bit the number of flows.
>>>>>
>>>>> However, I didn't want to duplicate the code that builds snat_ips [0]
>>>>> and I thought it complicates the already long build_lrouter_flows()
>>>>> function if I add flows to stage LR_IN_ARP_RESOLVE in the same place
>>>>> where we add flows for LR_IN_IP_INPUT.
>>>>>
>>>>> Also, the number of IP addresses per router port is usually low so the
>>>>> number of redundant logical flows in lr_in_arp_resolve would be low too.
>>>>>
>>>>> If you really have a strong preference about this, I can try to change
>>>>> it though.
>>>>>
>>>>
>>>> Hi Dumitru,
>>>>
>>>> I was thinking more from the scale perspective. I think this issue is
>>>> to address ovn-k8s use case.
>>>> Although the gateway router on each node will be configured with NAT
>>>> entries and it will be
>>>> definitely having very few router ports, but I think on a large scale
>>>> setup, the cluster router will have
>>>> many router ports and we will see these flows on this router even
>>>> though there are no NAT entries
>>>> for the cluster router.
>>>>
>>>> For the openstack deployment too we will see these flows and a logical
>>>> router with gateway router port
>>>> may have many other router ports connecting to the tenant logical switches.
>>>>
>>>> It would be good if we address my comment. Or at the least add these
>>>> lflows only for routers configured
>>>> with NAT entries.
>>>>
>>>
>>> We could also just add the flows to lr_in_arp_resolve and remove the
>>> ones that drop traffic destined to router-owned IPs in lr_in_ip_input.
>>>
>>> It's a bit ugly because we're forcing this in a stage that should
>>> resolve ARP but lr_in_ip_input is too early for traffic that might get
>>> NAT-ed.
>>>
>>> What do you think?
>>
>> I think we can keep the existing flows in lr_in_ip_input and add only
>> the required flows in lr_in_arp_resolve
>> to drop for the router ips which have NAT entry ?
>>
> 
> OK. I'll work on v3.
> 
>> I would do something like
>>
>> HMAP_FOR_EACH (op, key_node, ports) {
>>         if (!op->nbrp) {
>>             continue;
>>         }
>>
>>        for (int i = 0; i < od->nbr->n_nat; i++) {
>>           struct ovn_nat *nat_entry = &od->nat_entries[i];
>>           const struct nbrec_nat *nat = nat_entry->nb;
>>
>>          if  (nat->externa_ip belongs to one of the logical router ip) {
>>                 /* Drop traffic with IP.dest == router-ip. */
>>                ovn_lflow_add_with_hint(lflows, op->od,
>> S_ROUTER_IN_ARP_RESOLVE, 1,
>>                                     "ip4.dst == %s, "drop;",
>>                                     &op->nbrp->header_, nat_entry->external_ip);
>>
>>          }
>>      }
>> }
> 
> This is not really enough, we'd also need to check if
> lb_force_snat_ip/dnat_force_snat_ip are also router IPs so the above
> will basically mean rewriting almost the same code as here (lr_in_ip_input):
> https://github.com/ovn-org/ovn/blob/fc79d690b9e5479ce0190a9808c21fdca76575c8/northd/ovn-northd.c#L9015
> 
> I'll find a way to refactor it a bit so we don't have so much code
> duplication and at the same time not have any redundant flows.
> 
> Regards,
> Dumitru
> 

Hi Numan,

I sent v4 as series, the first patch being the bug fix and the following
refactoring work:

http://patchwork.ozlabs.org/project/ovn/list/?series=202406

Regards,
Dumitru
Numan Siddique Sept. 17, 2020, 1:19 p.m. UTC | #8
On Thu, Sep 17, 2020 at 6:25 PM Dumitru Ceara <dceara@redhat.com> wrote:

> On 9/8/20 8:39 PM, Dumitru Ceara wrote:
> > On 9/8/20 3:42 PM, Numan Siddique wrote:
> >> On Tue, Sep 8, 2020 at 6:48 PM Dumitru Ceara <dceara@redhat.com> wrote:
> >>>
> >>> On 9/8/20 2:06 PM, Numan Siddique wrote:
> >>>> On Tue, Sep 8, 2020 at 4:54 PM Dumitru Ceara <dceara@redhat.com>
> wrote:
> >>>>>
> >>>>> On 9/8/20 12:58 PM, Numan Siddique wrote:
> >>>>>> On Tue, Sep 8, 2020 at 2:13 PM Dumitru Ceara <dceara@redhat.com>
> wrote:
> >>>>>>>
> >>>>>>> OVN was dropping IP packets destined to IPs owned by logical
> routers but
> >>>>>>> only if those IPs are not used for SNAT rules. However, if a packet
> >>>>>>> doesn't match an existing NAT session and its destination is still
> a
> >>>>>>> router owned IP, it can be safely dropped. Otherwise it will
> trigger an
> >>>>>>> unnecessary packet-in in stage lr_in_arp_request.
> >>>>>>>
> >>>>>>> To achieve that we add flows that drop traffic to router owned IPs
> in
> >>>>>>> table lr_in_arp_resolve.
> >>>>>>>
> >>>>>>> Reported-by: Tim Rozet <trozet@redhat.com>
> >>>>>>> Reported-at: https://bugzilla.redhat.com/1876174
> >>>>>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Hi Dumitru,
> >>>>>>
> >>>>>> Thanks for the fix.
> >>>>>>
> >>>>>> I have few comments.
> >>>>>>
> >>>>>> Suppose we have the below lr and NAT entries
> >>>>>>
> >>>>>> router 392b19fe-adf9-4b3d-bf43-040d12240e54 (lr0)
> >>>>>>     port lr0-sw0
> >>>>>>         mac: "00:00:00:00:ff:01"
> >>>>>>         networks: ["10.0.0.1/24"]
> >>>>>>     port lr0-sw1
> >>>>>>         mac: "00:00:00:00:ff:02"
> >>>>>>         networks: ["20.0.0.1/24"]
> >>>>>>     port lr0-public
> >>>>>>         mac: "00:00:20:20:12:13"
> >>>>>>         networks: ["172.168.0.100/24"]
> >>>>>>         gateway chassis: [chassis-1]
> >>>>>>     nat 4c26f3f0-0ae0-4c18-9a1e-9d2751389270
> >>>>>>         external ip: "172.168.0.110"
> >>>>>>         logical ip: "10.0.0.3"
> >>>>>>         type: "dnat_and_snat"
> >>>>>>     nat b9296dc8-ac3a-427a-a97a-a94bf16214b2
> >>>>>>         external ip: "172.168.0.120"
> >>>>>>         logical ip: "20.0.0.3"
> >>>>>>         type: "dnat_and_snat"
> >>>>>>     nat da28129c-4e81-4e20-865a-768bd88e5a26
> >>>>>>         external ip: "172.168.0.100"
> >>>>>>         logical ip: "10.0.0.0/24"
> >>>>>>         type: "snat"
> >>>>>>
> >>>>>> I see the below lflows added in lr_in_ip_input to drop the pkts for
> >>>>>> router port IPs
> >>>>>>
> >>>>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> >>>>>> {10.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff01}), action=(drop;)
> >>>>>> table=3 (lr_in_ip_input     ), priority=60   , match=(ip4.dst ==
> >>>>>> {20.0.0.1} || ip6.dst == {fe80::200:ff:fe00:ff02}), action=(drop;)
> >>>>>>
> >>>>>> So I think there is no need to add the lflows in lr_in_arp_resolve
> to
> >>>>>> drop for 10.0.0.1 and 20.0.0.1 again.
> >>>>>>
> >>>>>> I think it's sufficient to add lflows in lr_in_arp_resolve to drop
> the
> >>>>>> packet only for packets destined to the router ips which have NAT
> >>>>>> entries.
> >>>>>>
> >>>>>
> >>>>> Hi Numan,
> >>>>>
> >>>>> Thanks for the review.
> >>>>>
> >>>>> You're right we could try to optimize a bit the number of flows.
> >>>>>
> >>>>> However, I didn't want to duplicate the code that builds snat_ips [0]
> >>>>> and I thought it complicates the already long build_lrouter_flows()
> >>>>> function if I add flows to stage LR_IN_ARP_RESOLVE in the same place
> >>>>> where we add flows for LR_IN_IP_INPUT.
> >>>>>
> >>>>> Also, the number of IP addresses per router port is usually low so
> the
> >>>>> number of redundant logical flows in lr_in_arp_resolve would be low
> too.
> >>>>>
> >>>>> If you really have a strong preference about this, I can try to
> change
> >>>>> it though.
> >>>>>
> >>>>
> >>>> Hi Dumitru,
> >>>>
> >>>> I was thinking more from the scale perspective. I think this issue is
> >>>> to address ovn-k8s use case.
> >>>> Although the gateway router on each node will be configured with NAT
> >>>> entries and it will be
> >>>> definitely having very few router ports, but I think on a large scale
> >>>> setup, the cluster router will have
> >>>> many router ports and we will see these flows on this router even
> >>>> though there are no NAT entries
> >>>> for the cluster router.
> >>>>
> >>>> For the openstack deployment too we will see these flows and a logical
> >>>> router with gateway router port
> >>>> may have many other router ports connecting to the tenant logical
> switches.
> >>>>
> >>>> It would be good if we address my comment. Or at the least add these
> >>>> lflows only for routers configured
> >>>> with NAT entries.
> >>>>
> >>>
> >>> We could also just add the flows to lr_in_arp_resolve and remove the
> >>> ones that drop traffic destined to router-owned IPs in lr_in_ip_input.
> >>>
> >>> It's a bit ugly because we're forcing this in a stage that should
> >>> resolve ARP but lr_in_ip_input is too early for traffic that might get
> >>> NAT-ed.
> >>>
> >>> What do you think?
> >>
> >> I think we can keep the existing flows in lr_in_ip_input and add only
> >> the required flows in lr_in_arp_resolve
> >> to drop for the router ips which have NAT entry ?
> >>
> >
> > OK. I'll work on v3.
> >
> >> I would do something like
> >>
> >> HMAP_FOR_EACH (op, key_node, ports) {
> >>         if (!op->nbrp) {
> >>             continue;
> >>         }
> >>
> >>        for (int i = 0; i < od->nbr->n_nat; i++) {
> >>           struct ovn_nat *nat_entry = &od->nat_entries[i];
> >>           const struct nbrec_nat *nat = nat_entry->nb;
> >>
> >>          if  (nat->externa_ip belongs to one of the logical router ip) {
> >>                 /* Drop traffic with IP.dest == router-ip. */
> >>                ovn_lflow_add_with_hint(lflows, op->od,
> >> S_ROUTER_IN_ARP_RESOLVE, 1,
> >>                                     "ip4.dst == %s, "drop;",
> >>                                     &op->nbrp->header_,
> nat_entry->external_ip);
> >>
> >>          }
> >>      }
> >> }
> >
> > This is not really enough, we'd also need to check if
> > lb_force_snat_ip/dnat_force_snat_ip are also router IPs so the above
> > will basically mean rewriting almost the same code as here
> (lr_in_ip_input):
> >
> https://github.com/ovn-org/ovn/blob/fc79d690b9e5479ce0190a9808c21fdca76575c8/northd/ovn-northd.c#L9015
> >
> > I'll find a way to refactor it a bit so we don't have so much code
> > duplication and at the same time not have any redundant flows.
> >
> > Regards,
> > Dumitru
> >
>
> Hi Numan,
>
> I sent v4 as series, the first patch being the bug fix and the following
> refactoring work:
>
> http://patchwork.ozlabs.org/project/ovn/list/?series=202406
>
>
Thanks. I'll take a look.

Numan


> Regards,
> Dumitru
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
diff mbox series

Patch

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 989e364..b9170fc 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2943,6 +2943,30 @@  outport = <var>P</var>;
 
       <li>
         <p>
+          Traffic with IP destination an address owned by the router should be
+          dropped. Such traffic is normally dropped in ingress table
+          <code>IP Input</code> except for IPs that are also shared with SNAT
+          rules. However, if there was no unSNAT operation that happened
+          successfully until this point in the pipeline and the destination IP
+          of the packet is still a router owned IP, the packets can be safely
+          dropped.
+        </p>
+
+        <p>
+          A priority-1 logical flow with match <code>ip4.dst = {..}</code>
+          matches on traffic destined to router owned IPv4 addresses and
+          has action <code>drop;</code>.
+        </p>
+
+        <p>
+          A priority-1 logical flow with match <code>ip6.dst = {..}</code>
+          matches on traffic destined to router owned IPv4 addresses and
+          has action <code>drop;</code>.
+        </p>
+      </li>
+
+      <li>
+        <p>
           Dynamic MAC bindings.  These flows resolve MAC-to-IP bindings
           that have become known dynamically through ARP or neighbor
           discovery.  (The ingress table <code>ARP Request</code> will
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 681c008..6e063ac 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -10588,6 +10588,54 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         }
     }
 
+    /* Drop IP traffic destined to router owned IPs. Part of it is dropped
+     * in stage "lr_in_ip_input" but traffic that could have been unSNATed
+     * but didn't match any existing session might still end up here.
+     */
+    HMAP_FOR_EACH (op, key_node, ports) {
+        if (!op->nbrp) {
+            continue;
+        }
+
+        if (op->lrp_networks.n_ipv4_addrs) {
+            ds_clear(&match);
+            ds_put_cstr(&match, "ip4.dst == {");
+
+            for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
+                ds_put_format(&match, "%s, ",
+                              op->lrp_networks.ipv4_addrs[i].addr_s);
+            }
+
+            ds_chomp(&match, ' ');
+            ds_chomp(&match, ',');
+            ds_put_char(&match, '}');
+
+            /* Drop traffic with IP.dest == router-ip. */
+            ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_ARP_RESOLVE, 1,
+                                    ds_cstr(&match), "drop;",
+                                    &op->nbrp->header_);
+        }
+
+        if (op->lrp_networks.n_ipv6_addrs) {
+            ds_clear(&match);
+            ds_put_cstr(&match, "ip6.dst == {");
+
+            for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
+                ds_put_format(&match, "%s, ",
+                              op->lrp_networks.ipv6_addrs[i].addr_s);
+            }
+
+            ds_chomp(&match, ' ');
+            ds_chomp(&match, ',');
+            ds_put_char(&match, '}');
+
+            /* Drop traffic with IP.dest == router-ip. */
+            ovn_lflow_add_with_hint(lflows, op->od, S_ROUTER_IN_ARP_RESOLVE, 1,
+                                    ds_cstr(&match), "drop;",
+                                    &op->nbrp->header_);
+        }
+    }
+
     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (!od->nbr) {
             continue;
diff --git a/tests/ovn.at b/tests/ovn.at
index 1fe34b7..0e14dfa 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -21377,8 +21377,12 @@  OVN_POPULATE_ARP
 
 ovn-nbctl --wait=hv sync
 
-AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.1], [1], [])
-AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.2], [1], [])
+AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.1], [0], [dnl
+  table=13(lr_in_arp_resolve  ), priority=1    , match=(ip4.dst == {10.0.0.1}), action=(drop;)
+])
+AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.0.2], [0], [dnl
+  table=13(lr_in_arp_resolve  ), priority=1    , match=(ip4.dst == {10.0.0.2}), action=(drop;)
+])
 
 ip_to_hex() {
     printf "%02x%02x%02x%02x" "$@"
@@ -21447,6 +21451,96 @@  OVS_WAIT_UNTIL([test x$(as hv1 ovn-appctl -t ovn-controller debug/status) = "xru
 OVN_CLEANUP([hv1])
 AT_CLEANUP
 
+# Test dropping traffic destined to router owned IPs.
+AT_SETUP([ovn -- gateway router drop traffic for own IPs])
+ovn_start
+
+ovn-nbctl lr-add r1 -- set logical_router r1 options:chassis=hv1
+ovn-nbctl ls-add s1
+
+# Connnect r1 to s1.
+ovn-nbctl lrp-add r1 lrp-r1-s1 00:00:00:00:01:01 10.0.1.1/24
+ovn-nbctl lsp-add s1 lsp-s1-r1 -- set Logical_Switch_Port lsp-s1-r1 type=router \
+    options:router-port=lrp-r1-s1 addresses=router
+
+# Create logical port p1 in s1
+ovn-nbctl lsp-add s1 p1 \
+-- lsp-set-addresses p1 "f0:00:00:00:01:02 10.0.1.2"
+
+# Create two hypervisor and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=p1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap \
+    ofport-request=1
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+OVN_POPULATE_ARP
+
+ovn-nbctl --wait=hv sync
+
+sw_key=$(ovn-sbctl --bare --columns tunnel_key list datapath_binding r1)
+
+AT_CHECK([ovn-sbctl lflow-list | grep lr_in_arp_resolve | grep 10.0.1.1], [0], [dnl
+  table=13(lr_in_arp_resolve  ), priority=1    , match=(ip4.dst == {10.0.1.1}), action=(drop;)
+])
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+
+# Send ip packets from p1 to lrp-r1-s1
+src_mac="f00000000102"
+dst_mac="000000000101"
+src_ip=`ip_to_hex 10 0 1 2`
+dst_ip=`ip_to_hex 10 0 1 1`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+
+# No packet-ins should reach ovn-controller.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep -v n_packets=0 -c], [1], [dnl
+0
+])
+
+# The packet should have been dropped in the lr_in_ip_input stage.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=11, n_packets=1,.* priority=60,ip,metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
+1
+])
+
+# Use the router IP as SNAT IP.
+ovn-nbctl set logical_router r1 options:lb_force_snat_ip=10.0.1.1
+ovn-nbctl --wait=hv sync
+
+# Send ip packets from p1 to lrp-r1-s1
+src_mac="f00000000102"
+dst_mac="000000000101"
+src_ip=`ip_to_hex 10 0 1 2`
+dst_ip=`ip_to_hex 10 0 1 1`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+
+# Even after configuring a router owned IP for SNAT, no packet-ins should
+# reach ovn-controller.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep "actions=controller" | grep -v n_packets=0 -c], [1], [dnl
+0
+])
+
+# The packet should've been dropped in the lr_in_arp_resolve stage.
+AT_CHECK([as hv1 ovs-ofctl dump-flows br-int | grep -E "table=21, n_packets=1,.* priority=1,ip,metadata=0x${sw_key},nw_dst=10.0.1.1 actions=drop" -c], [0], [dnl
+1
+])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+
 AT_SETUP([ovn -- nb_cfg timestamp])
 ovn_start