[ovs-dev] tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted
diff mbox series

Message ID 20190620030207.17383-1-vdasari@gmail.com
State New
Headers show
Series
  • [ovs-dev] tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted
Related show

Commit Message

Vasu Dasari June 20, 2019, 3:02 a.m. UTC
Say an ARP entry is learnt on a OVS port and when such a port is deleted,
learnt entry should be removed from the port. It would have be aged out after
ARP ageout time. This code will clean up immediately.

Added test case(tunnel - neighbor entry add and deletion) in tunnel.at, to
verify neighbors are added and removed on deletion of a ports and bridges.

Discussion for this addition is at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html

Signed-off-by: Vasu Dasari <vdasari@gmail.com>
---
 lib/tnl-neigh-cache.c        | 20 +++++++++++++++++
 lib/tnl-neigh-cache.h        |  1 +
 ofproto/ofproto-dpif-xlate.c |  3 +++
 tests/tunnel.at              | 43 ++++++++++++++++++++++++++++++++++++
 4 files changed, 67 insertions(+)

Comments

0-day Robot June 20, 2019, 3:58 a.m. UTC | #1
Bleep bloop.  Greetings Vasu Dasari, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Improper whitespace around control block
#39 FILE: lib/tnl-neigh-cache.c:230:
    CMAP_FOR_EACH(neigh, cmap_node, &table) {

ERROR: Improper whitespace around control block
#74 FILE: ofproto/ofproto-dpif-xlate.c:1484:
    if(xport) {

Lines checked: 133, Warnings: 0, Errors: 2


Please check this out.  If you feel there has been an error, please email aconole@bytheb.org

Thanks,
0-day Robot
Ginwala, Aliasgar via dev June 24, 2019, 7:58 p.m. UTC | #2
On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> Say an ARP entry is learnt on a OVS port and when such a port is deleted,
> learnt entry should be removed from the port. It would have be aged out after
> ARP ageout time. This code will clean up immediately.
> 
> Added test case(tunnel - neighbor entry add and deletion) in tunnel.at, to
> verify neighbors are added and removed on deletion of a ports and bridges.
> 
> Discussion for this addition is at:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html
> 
> Signed-off-by: Vasu Dasari <vdasari@gmail.com>
> ---
>  lib/tnl-neigh-cache.c        | 20 +++++++++++++++++
>  lib/tnl-neigh-cache.h        |  1 +
>  ofproto/ofproto-dpif-xlate.c |  3 +++
>  tests/tunnel.at              | 43 ++++++++++++++++++++++++++++++++++++
>  4 files changed, 67 insertions(+)
> 
> diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c
> index b28f9f1bb..8718405db 100644
> --- a/lib/tnl-neigh-cache.c
> +++ b/lib/tnl-neigh-cache.c
> @@ -220,6 +220,26 @@ tnl_neigh_cache_run(void)
>      }
>  }
>  
> +void
> +tnl_neigh_flush(const char br_name[])

It seems the file uses a convention declaring using IFNAMSIZ

> +{
> +    struct tnl_neigh_entry *neigh;
> +    bool changed = false;
> +
> +    ovs_mutex_lock(&mutex);
> +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> +            tnl_neigh_delete(neigh);
> +            changed = true;

Do you expect to match on additional entries? Otherwise it
could bail out here.


> +        }
> +    }
> +    ovs_mutex_unlock(&mutex);
> +
> +    if (changed) {
> +        seq_change(tnl_conf_seq);
> +    }
> +}
> +
>  static void
>  tnl_neigh_cache_flush(struct unixctl_conn *conn, int argc OVS_UNUSED,
>                      const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> diff --git a/lib/tnl-neigh-cache.h b/lib/tnl-neigh-cache.h
> index fee8e6a6f..ded9c2f86 100644
> --- a/lib/tnl-neigh-cache.h
> +++ b/lib/tnl-neigh-cache.h
> @@ -37,5 +37,6 @@ int tnl_neigh_lookup(const char dev_name[], const struct in6_addr *dst,
>                       struct eth_addr *mac);
>  void tnl_neigh_cache_init(void);
>  void tnl_neigh_cache_run(void);
> +void tnl_neigh_flush(const char dev_name[]);
>  
>  #endif
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 73966a4e8..e0cd8edab 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -1481,6 +1481,9 @@ xlate_ofport_remove(struct ofport_dpif *ofport)
>      ovs_assert(new_xcfg);
>  
>      xport = xport_lookup(new_xcfg, ofport);
> +    if(xport) {
         ^--- space needed here.

> +        tnl_neigh_flush(netdev_get_name(xport->netdev));
> +    }
>      xlate_xport_remove(new_xcfg, xport);


Shouldn't this be in xlate_xport_remove()?

However, as I mentioned in the discussion, the tnl IP address could
be on a device that doesn't belong to OVS at all. For example:

br-tun
   |
   +----- vxlan0 --- remote-ipaddr=192.168.1.10
   |
   +----- vxlan1 --- remote-ipaddr=192.168.2.10

And then you have:
   eth0 --- 192.168.1.1/24

   eth1 --- 192.168.2.1/24

In this case, if we took either eth0 or eth1 down, the cache is not
immediately flushed.

The tnl endpoint must have a valid route, so I suggest to hook the
tnl_neigh_cache_flush into route-table.c which receives a notification
when a route changes. If a route is deleted, we should flush the
device's tnl cache. Doing so, would cover both userspace and kernel
datapath for OVS and non-OVS devices.

What do you think?

Thanks,
fbl


> diff --git a/tests/tunnel.at b/tests/tunnel.at
> index 035c54f67..6d7550724 100644
> --- a/tests/tunnel.at
> +++ b/tests/tunnel.at
> @@ -920,3 +920,46 @@ dnl which is not correct
>  
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([tunnel - neighbor entry add and deletion])
> +OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \
> +                    options:remote_ip=1.1.1.1 options:local_ip=2.2.2.2 \
> +                    options:key=5 ofport_request=1 \
> +                    -- add-port br0 p2 -- set Interface p2 type=gre \
> +                    options:local_ip=3.3.3.3 options:remote_ip=4.4.4.4 \
> +                    ofport_request=2])
> +AT_CHECK([ovs-vsctl add-br br1 -- set bridge br1 datapath_type=dummy], [0])
> +
> +dnl Populate tunnel neighbor cache table
> +AT_CHECK([
> +    ovs-appctl tnl/arp/set p1 10.0.0.1 00:00:10:00:00:01
> +    ovs-appctl tnl/arp/set p2 10.0.1.1 00:00:10:00:01:01
> +    ovs-appctl tnl/arp/set br0 10.0.2.1 00:00:10:00:02:01
> +    ovs-appctl tnl/arp/set br1 20.0.0.1 00:00:20:00:00:01
> +], [0], [stdout])
> +
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> +10.0.0.1                                      00:00:10:00:00:01   p1
> +10.0.1.1                                      00:00:10:00:01:01   p2
> +10.0.2.1                                      00:00:10:00:02:01   br0
> +20.0.0.1                                      00:00:20:00:00:01   br1
> +])
> +
> +dnl neighbor table after deleting port p1
> +AT_CHECK([ovs-vsctl del-port br0 p1],[0], [stdout])
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | grep -w p1 | sort], [0], [dnl
> +])
> +
> +dnl neighbor table after deleting bridge br0
> +AT_CHECK([ovs-vsctl del-br br0],[0], [stdout])
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> +20.0.0.1                                      00:00:20:00:00:01   br1
> +])
> +
> +dnl neighbor table after deleting bridge br1
> +AT_CHECK([ovs-vsctl del-br br1],[0], [stdout])
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> -- 
> 2.17.2 (Apple Git-113)
>
Vasu Dasari June 24, 2019, 9:43 p.m. UTC | #3
Thanks Flavio for your inputs.

My comments inline:

*Vasu Dasari*


On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote:

> On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > Say an ARP entry is learnt on a OVS port and when such a port is deleted,
> > learnt entry should be removed from the port. It would have be aged out
> after
> > ARP ageout time. This code will clean up immediately.
> >
> > Added test case(tunnel - neighbor entry add and deletion) in tunnel.at,
> to
> > verify neighbors are added and removed on deletion of a ports and
> bridges.
> >
> > Discussion for this addition is at:
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html
> >
> > Signed-off-by: Vasu Dasari <vdasari@gmail.com>
> > ---
> >  lib/tnl-neigh-cache.c        | 20 +++++++++++++++++
> >  lib/tnl-neigh-cache.h        |  1 +
> >  ofproto/ofproto-dpif-xlate.c |  3 +++
> >  tests/tunnel.at              | 43 ++++++++++++++++++++++++++++++++++++
> >  4 files changed, 67 insertions(+)
> >
> > diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c
> > index b28f9f1bb..8718405db 100644
> > --- a/lib/tnl-neigh-cache.c
> > +++ b/lib/tnl-neigh-cache.c
> > @@ -220,6 +220,26 @@ tnl_neigh_cache_run(void)
> >      }
> >  }
> >
> > +void
> > +tnl_neigh_flush(const char br_name[])
>
> It seems the file uses a convention declaring using IFNAMSIZ
>

Sure. I did not notice that. Will fix this.

>
> > +{
> > +    struct tnl_neigh_entry *neigh;
> > +    bool changed = false;
> > +
> > +    ovs_mutex_lock(&mutex);
> > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > +            tnl_neigh_delete(neigh);
> > +            changed = true;
>
> Do you expect to match on additional entries? Otherwise it
> could bail out here.
>
>
Say there are two or more neighbors on the port or on bridge, then by
bailing out we would be missing others. So, will leave it there.


>
> > +        }
> > +    }
> > +    ovs_mutex_unlock(&mutex);
> > +
> > +    if (changed) {
> > +        seq_change(tnl_conf_seq);
> > +    }
> > +}
> > +
> >  static void
> >  tnl_neigh_cache_flush(struct unixctl_conn *conn, int argc OVS_UNUSED,
> >                      const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> > diff --git a/lib/tnl-neigh-cache.h b/lib/tnl-neigh-cache.h
> > index fee8e6a6f..ded9c2f86 100644
> > --- a/lib/tnl-neigh-cache.h
> > +++ b/lib/tnl-neigh-cache.h
> > @@ -37,5 +37,6 @@ int tnl_neigh_lookup(const char dev_name[], const
> struct in6_addr *dst,
> >                       struct eth_addr *mac);
> >  void tnl_neigh_cache_init(void);
> >  void tnl_neigh_cache_run(void);
> > +void tnl_neigh_flush(const char dev_name[]);
> >
> >  #endif
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index 73966a4e8..e0cd8edab 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -1481,6 +1481,9 @@ xlate_ofport_remove(struct ofport_dpif *ofport)
> >      ovs_assert(new_xcfg);
> >
> >      xport = xport_lookup(new_xcfg, ofport);
> > +    if(xport) {
>          ^--- space needed here.
>
> > +        tnl_neigh_flush(netdev_get_name(xport->netdev));
> > +    }
> >      xlate_xport_remove(new_xcfg, xport);
>
>
> Shouldn't this be in xlate_xport_remove()?
>
>
My first attempt was to put the hook in xlate_xport_remove(). But, this
function is also getting called from another code path as well
xlate_txn_commit() which is getting called as part of type_run() callback
for ofproto_class.

type_run()(from ofproto-dpif.c)
    xlate_txn_commit()
        xlate_xcfg_free()
            xlate_xbridge_remove()
                xlate_xport_remove()

From documentation this function can be called from periodically and hence
flushing ARP entries. So, chose to do the flush when a ofport is removed.

However, as I mentioned in the discussion, the tnl IP address could
> be on a device that doesn't belong to OVS at all. For example:
>
> br-tun
>    |
>    +----- vxlan0 --- remote-ipaddr=192.168.1.10
>    |
>    +----- vxlan1 --- remote-ipaddr=192.168.2.10
>
> And then you have:
>    eth0 --- 192.168.1.1/24
>
>    eth1 --- 192.168.2.1/24
>
> In this case, if we took either eth0 or eth1 down, the cache is not
> immediately flushed.
>
> The tnl endpoint must have a valid route, so I suggest to hook the
> tnl_neigh_cache_flush into route-table.c which receives a notification
> when a route changes. If a route is deleted, we should flush the
> device's tnl cache. Doing so, would cover both userspace and kernel
> datapath for OVS and non-OVS devices.
>
>
I see what you are saying. Let me play with code a bit and resubmit patch.

Thanks
-Vasu

> What do you think?
>
> Thanks,
> fbl
>
>
> > diff --git a/tests/tunnel.at b/tests/tunnel.at
> > index 035c54f67..6d7550724 100644
> > --- a/tests/tunnel.at
> > +++ b/tests/tunnel.at
> > @@ -920,3 +920,46 @@ dnl which is not correct
> >
> >  OVS_VSWITCHD_STOP
> >  AT_CLEANUP
> > +
> > +AT_SETUP([tunnel - neighbor entry add and deletion])
> > +OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \
> > +                    options:remote_ip=1.1.1.1 options:local_ip=2.2.2.2 \
> > +                    options:key=5 ofport_request=1 \
> > +                    -- add-port br0 p2 -- set Interface p2 type=gre \
> > +                    options:local_ip=3.3.3.3 options:remote_ip=4.4.4.4 \
> > +                    ofport_request=2])
> > +AT_CHECK([ovs-vsctl add-br br1 -- set bridge br1 datapath_type=dummy],
> [0])
> > +
> > +dnl Populate tunnel neighbor cache table
> > +AT_CHECK([
> > +    ovs-appctl tnl/arp/set p1 10.0.0.1 00:00:10:00:00:01
> > +    ovs-appctl tnl/arp/set p2 10.0.1.1 00:00:10:00:01:01
> > +    ovs-appctl tnl/arp/set br0 10.0.2.1 00:00:10:00:02:01
> > +    ovs-appctl tnl/arp/set br1 20.0.0.1 00:00:20:00:00:01
> > +], [0], [stdout])
> > +
> > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> > +10.0.0.1                                      00:00:10:00:00:01   p1
> > +10.0.1.1                                      00:00:10:00:01:01   p2
> > +10.0.2.1                                      00:00:10:00:02:01   br0
> > +20.0.0.1                                      00:00:20:00:00:01   br1
> > +])
> > +
> > +dnl neighbor table after deleting port p1
> > +AT_CHECK([ovs-vsctl del-port br0 p1],[0], [stdout])
> > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | grep -w p1 | sort],
> [0], [dnl
> > +])
> > +
> > +dnl neighbor table after deleting bridge br0
> > +AT_CHECK([ovs-vsctl del-br br0],[0], [stdout])
> > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> > +20.0.0.1                                      00:00:20:00:00:01   br1
> > +])
> > +
> > +dnl neighbor table after deleting bridge br1
> > +AT_CHECK([ovs-vsctl del-br br1],[0], [stdout])
> > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> > +])
> > +
> > +OVS_VSWITCHD_STOP
> > +AT_CLEANUP
> > --
> > 2.17.2 (Apple Git-113)
> >
>
Ginwala, Aliasgar via dev June 24, 2019, 10:15 p.m. UTC | #4
On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote:
> > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > +{
> > > +    struct tnl_neigh_entry *neigh;
> > > +    bool changed = false;
> > > +
> > > +    ovs_mutex_lock(&mutex);
> > > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > +            tnl_neigh_delete(neigh);
> > > +            changed = true;
> >
> > Do you expect to match on additional entries? Otherwise it
> > could bail out here.
> >
> >
> Say there are two or more neighbors on the port or on bridge, then by
> bailing out we would be missing others. So, will leave it there.

You're right.

[...]
> > However, as I mentioned in the discussion, the tnl IP address could
> > be on a device that doesn't belong to OVS at all. For example:
[...]
> > The tnl endpoint must have a valid route, so I suggest to hook the
> > tnl_neigh_cache_flush into route-table.c which receives a notification
> > when a route changes. If a route is deleted, we should flush the
> > device's tnl cache. Doing so, would cover both userspace and kernel
> > datapath for OVS and non-OVS devices.
> >
> >
> I see what you are saying. Let me play with code a bit and resubmit patch.

OK, looking forward to it!
Thanks Vasu, 
fbl
Vasu Dasari July 3, 2019, 12:50 a.m. UTC | #5
Hi Flavio,

I am trying to emulate the test case scenario you mentioned earlier, where
in we need to clean you neighbor cache when an external interface goes
down. To study that, I wrote a small script based off of test case
system-traffic.at, "datapath - ping over vxlan tunnel". And this experiment
gave me good experience around the understanding of netdev and kmod
implementation of VxLAN.

Various tests I have performed using the new script include,
1. ovs-vswitchd started normally, for performing kernel mode VxLAN tests.
And in this mode, tested following cases
   a. VxLAN interface being part of OVS.
   b. VxLAN interface external to OVS
   I see that in kernel mode of operation, native tunneling is off, and
hence no ARP entries are learnt in tnl-neigh-cache. Please refer to
ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we are
making here(as part of my commit) are not effected in kernel mode of
operation.

2. ovs-vswitchd started with "--disable-system" option for performing
usermode VxLAN tests. And in this mode, tested following cases
   a. VxLAN interface being part of OVS. This test case works. In this
case, entries are cleanly removed by user deleting the ports from the
bridge.
   b. VxLAN interface external to OVS. I could not get this case to work.

3. ovs-vswitchd started normally(without --disable-system option) for
performing usermode VxLAN tests. And in this mode, tested following cases
   a. VxLAN interface being part of OVS. This test case works. In this
case, entries are cleanly removed by user deleting the ports from the
bridge.
   b. VxLAN interface external to OVS. I could not get this case to work.

I could not 2b and 3b use cases to at all. Do you expect these to work
normally?  The reason I ask this is, as I understand from the code, even
though "ovs/route/show" shows the route to remote vtep, OVS does not know
which ODP port to take to send the packet out of. Please refer to
ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow() and
hence packet transmission fails with "native tunnel routing failed".

If you agree that 2b and 3b are not supported use cases, then I can submit
my code changes as per your review comments.

Please let me know what you think of.
-Vasu

*Vasu Dasari*


On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner <fbl@sysclose.org> wrote:

> On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote:
> > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > > +{
> > > > +    struct tnl_neigh_entry *neigh;
> > > > +    bool changed = false;
> > > > +
> > > > +    ovs_mutex_lock(&mutex);
> > > > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > > > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > > +            tnl_neigh_delete(neigh);
> > > > +            changed = true;
> > >
> > > Do you expect to match on additional entries? Otherwise it
> > > could bail out here.
> > >
> > >
> > Say there are two or more neighbors on the port or on bridge, then by
> > bailing out we would be missing others. So, will leave it there.
>
> You're right.
>
> [...]
> > > However, as I mentioned in the discussion, the tnl IP address could
> > > be on a device that doesn't belong to OVS at all. For example:
> [...]
> > > The tnl endpoint must have a valid route, so I suggest to hook the
> > > tnl_neigh_cache_flush into route-table.c which receives a notification
> > > when a route changes. If a route is deleted, we should flush the
> > > device's tnl cache. Doing so, would cover both userspace and kernel
> > > datapath for OVS and non-OVS devices.
> > >
> > >
> > I see what you are saying. Let me play with code a bit and resubmit
> patch.
>
> OK, looking forward to it!
> Thanks Vasu,
> fbl
>
Ginwala, Aliasgar via dev July 8, 2019, 1:41 p.m. UTC | #6
On Tue, Jul 02, 2019 at 08:50:16PM -0400, Vasu Dasari wrote:
> Hi Flavio,
> 
> I am trying to emulate the test case scenario you mentioned earlier, where
> in we need to clean you neighbor cache when an external interface goes
> down. To study that, I wrote a small script based off of test case
> system-traffic.at, "datapath - ping over vxlan tunnel". And this experiment
> gave me good experience around the understanding of netdev and kmod
> implementation of VxLAN.
> 
> Various tests I have performed using the new script include,
> 1. ovs-vswitchd started normally, for performing kernel mode VxLAN tests.
> And in this mode, tested following cases
>    a. VxLAN interface being part of OVS.
>    b. VxLAN interface external to OVS
>    I see that in kernel mode of operation, native tunneling is off, and
> hence no ARP entries are learnt in tnl-neigh-cache. Please refer to
> ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we are
> making here(as part of my commit) are not effected in kernel mode of
> operation.

In either case my understanding is that OvS doesn't care about the
VXLAN and just hands off the packet to the interface. In another
words, it's the interface's job to encapsulate the traffic and
that's why it doesn't impact on the tnl-neigh-cache.


> 2. ovs-vswitchd started with "--disable-system" option for performing
> usermode VxLAN tests. And in this mode, tested following cases
>    a. VxLAN interface being part of OVS. This test case works. In this
> case, entries are cleanly removed by user deleting the ports from the
> bridge.

Right, this is the scenario you tested first, if I recall correctly.

>    b. VxLAN interface external to OVS. I could not get this case to work.

I think OvS will inject the packet to the VXLAN interface, but I
never tried this as it seems unusual scenario to have.

> 3. ovs-vswitchd started normally(without --disable-system option) for
> performing usermode VxLAN tests. And in this mode, tested following cases
>    a. VxLAN interface being part of OVS. This test case works. In this
> case, entries are cleanly removed by user deleting the ports from the
> bridge.
>    b. VxLAN interface external to OVS. I could not get this case to work.

This looks like a perfect copy from item #2 which is okay, just checking
if there was no copy&paste error somewhere :-)

 
> I could not 2b and 3b use cases to at all. Do you expect these to work
> normally?  The reason I ask this is, as I understand from the code, even
> though "ovs/route/show" shows the route to remote vtep, OVS does not know
> which ODP port to take to send the packet out of. Please refer to
> ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow() and
> hence packet transmission fails with "native tunnel routing failed".
> 
> If you agree that 2b and 3b are not supported use cases, then I can submit
> my code changes as per your review comments.

Let me step back a bit because in the scenario I told initially had
the egress device (ethX) with the endpoint address outside of the
bridge while the VXLAN interface was always part of the bridge.
Therefore we had two scenarios to cover with the difference being
the endpoint address being part or not part of the OvS bridge. 

However, looking more closely to the code if the VXLAN is part of the
bridge in userspace, then native tunnel will need the tnl-neigh-cache
to build the headers.  Then it sends out the ARP Request packet to the
datapath only, so it doesn't seem to support endpoint addresses outside
of the OvS. I guess your initial patch covering only interfaces as part
of OvS was good enough then.

Do you agree with that?

Thanks!
fbl


> 
> Please let me know what you think of.
> -Vasu
> 
> *Vasu Dasari*
> 
> 
> On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner <fbl@sysclose.org> wrote:
> 
> > On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> > > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote:
> > > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > > > +{
> > > > > +    struct tnl_neigh_entry *neigh;
> > > > > +    bool changed = false;
> > > > > +
> > > > > +    ovs_mutex_lock(&mutex);
> > > > > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > > > > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > > > +            tnl_neigh_delete(neigh);
> > > > > +            changed = true;
> > > >
> > > > Do you expect to match on additional entries? Otherwise it
> > > > could bail out here.
> > > >
> > > >
> > > Say there are two or more neighbors on the port or on bridge, then by
> > > bailing out we would be missing others. So, will leave it there.
> >
> > You're right.
> >
> > [...]
> > > > However, as I mentioned in the discussion, the tnl IP address could
> > > > be on a device that doesn't belong to OVS at all. For example:
> > [...]
> > > > The tnl endpoint must have a valid route, so I suggest to hook the
> > > > tnl_neigh_cache_flush into route-table.c which receives a notification
> > > > when a route changes. If a route is deleted, we should flush the
> > > > device's tnl cache. Doing so, would cover both userspace and kernel
> > > > datapath for OVS and non-OVS devices.
> > > >
> > > >
> > > I see what you are saying. Let me play with code a bit and resubmit
> > patch.
> >
> > OK, looking forward to it!
> > Thanks Vasu,
> > fbl
> >
Vasu Dasari July 9, 2019, 10:51 p.m. UTC | #7
My comments inline:

On Mon, Jul 8, 2019 at 9:41 AM Flavio Leitner <fbl@sysclose.org> wrote:

> On Tue, Jul 02, 2019 at 08:50:16PM -0400, Vasu Dasari wrote:
> > Hi Flavio,
> >
> > I am trying to emulate the test case scenario you mentioned earlier,
> where
> > in we need to clean you neighbor cache when an external interface goes
> > down. To study that, I wrote a small script based off of test case
> > system-traffic.at, "datapath - ping over vxlan tunnel". And this
> experiment
> > gave me good experience around the understanding of netdev and kmod
> > implementation of VxLAN.
> >
> > Various tests I have performed using the new script include,
> > 1. ovs-vswitchd started normally, for performing kernel mode VxLAN tests.
> > And in this mode, tested following cases
> >    a. VxLAN interface being part of OVS.
> >    b. VxLAN interface external to OVS
> >    I see that in kernel mode of operation, native tunneling is off, and
> > hence no ARP entries are learnt in tnl-neigh-cache. Please refer to
> > ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we
> are
> > making here(as part of my commit) are not effected in kernel mode of
> > operation.
>
> In either case my understanding is that OvS doesn't care about the
> VXLAN and just hands off the packet to the interface. In another
> words, it's the interface's job to encapsulate the traffic and
> that's why it doesn't impact on the tnl-neigh-cache.
>
>
Agree.

>
> > 2. ovs-vswitchd started with "--disable-system" option for performing
> > usermode VxLAN tests. And in this mode, tested following cases
> >    a. VxLAN interface being part of OVS. This test case works. In this
> > case, entries are cleanly removed by user deleting the ports from the
> > bridge.
>
> Right, this is the scenario you tested first, if I recall correctly.
>
>
Yes.

> >    b. VxLAN interface external to OVS. I could not get this case to work.
>
> I think OvS will inject the packet to the VXLAN interface, but I
> never tried this as it seems unusual scenario to have.
>
> Agree.

> > 3. ovs-vswitchd started normally(without --disable-system option) for
> > performing usermode VxLAN tests. And in this mode, tested following cases
> >    a. VxLAN interface being part of OVS. This test case works. In this
> > case, entries are cleanly removed by user deleting the ports from the
> > bridge.
> >    b. VxLAN interface external to OVS. I could not get this case to work.
>
> This looks like a perfect copy from item #2 which is okay, just checking
> if there was no copy&paste error somewhere :-)
>
>
This case is different from case 2, where in ovs-vswitchd is started
without disable-system option. In this mode, ovs-router learns route from
kernel.



>
> > I could not 2b and 3b use cases to at all. Do you expect these to work
> > normally?  The reason I ask this is, as I understand from the code, even
> > though "ovs/route/show" shows the route to remote vtep, OVS does not know
> > which ODP port to take to send the packet out of. Please refer to
> > ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow()
> and
> > hence packet transmission fails with "native tunnel routing failed".
> >
> > If you agree that 2b and 3b are not supported use cases, then I can
> submit
> > my code changes as per your review comments.
>
> Let me step back a bit because in the scenario I told initially had
> the egress device (ethX) with the endpoint address outside of the
> bridge while the VXLAN interface was always part of the bridge.
> Therefore we had two scenarios to cover with the difference being
> the endpoint address being part or not part of the OvS bridge.
>
> However, looking more closely to the code if the VXLAN is part of the
> bridge in userspace, then native tunnel will need the tnl-neigh-cache
> to build the headers.  Then it sends out the ARP Request packet to the
> datapath only, so it doesn't seem to support endpoint addresses outside
> of the OvS. I guess your initial patch covering only interfaces as part
> of OvS was good enough then.
>
> Do you agree with that?
>

Agree. Will send a patch with yours and Ben's comments.

Thanks for the review.
-Vasu


> Thanks!
> fbl
>
>
> >
> > Please let me know what you think of.
> > -Vasu
> >
> > *Vasu Dasari*
> >
> >
> > On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner <fbl@sysclose.org> wrote:
> >
> > > On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> > > > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org>
> wrote:
> > > > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > > > > +{
> > > > > > +    struct tnl_neigh_entry *neigh;
> > > > > > +    bool changed = false;
> > > > > > +
> > > > > > +    ovs_mutex_lock(&mutex);
> > > > > > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > > > > > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > > > > +            tnl_neigh_delete(neigh);
> > > > > > +            changed = true;
> > > > >
> > > > > Do you expect to match on additional entries? Otherwise it
> > > > > could bail out here.
> > > > >
> > > > >
> > > > Say there are two or more neighbors on the port or on bridge, then by
> > > > bailing out we would be missing others. So, will leave it there.
> > >
> > > You're right.
> > >
> > > [...]
> > > > > However, as I mentioned in the discussion, the tnl IP address could
> > > > > be on a device that doesn't belong to OVS at all. For example:
> > > [...]
> > > > > The tnl endpoint must have a valid route, so I suggest to hook the
> > > > > tnl_neigh_cache_flush into route-table.c which receives a
> notification
> > > > > when a route changes. If a route is deleted, we should flush the
> > > > > device's tnl cache. Doing so, would cover both userspace and kernel
> > > > > datapath for OVS and non-OVS devices.
> > > > >
> > > > >
> > > > I see what you are saying. Let me play with code a bit and resubmit
> > > patch.
> > >
> > > OK, looking forward to it!
> > > Thanks Vasu,
> > > fbl
> > >
>

Patch
diff mbox series

diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c
index b28f9f1bb..8718405db 100644
--- a/lib/tnl-neigh-cache.c
+++ b/lib/tnl-neigh-cache.c
@@ -220,6 +220,26 @@  tnl_neigh_cache_run(void)
     }
 }
 
+void
+tnl_neigh_flush(const char br_name[])
+{
+    struct tnl_neigh_entry *neigh;
+    bool changed = false;
+
+    ovs_mutex_lock(&mutex);
+    CMAP_FOR_EACH(neigh, cmap_node, &table) {
+        if (nullable_string_is_equal(neigh->br_name, br_name)) {
+            tnl_neigh_delete(neigh);
+            changed = true;
+        }
+    }
+    ovs_mutex_unlock(&mutex);
+
+    if (changed) {
+        seq_change(tnl_conf_seq);
+    }
+}
+
 static void
 tnl_neigh_cache_flush(struct unixctl_conn *conn, int argc OVS_UNUSED,
                     const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
diff --git a/lib/tnl-neigh-cache.h b/lib/tnl-neigh-cache.h
index fee8e6a6f..ded9c2f86 100644
--- a/lib/tnl-neigh-cache.h
+++ b/lib/tnl-neigh-cache.h
@@ -37,5 +37,6 @@  int tnl_neigh_lookup(const char dev_name[], const struct in6_addr *dst,
                      struct eth_addr *mac);
 void tnl_neigh_cache_init(void);
 void tnl_neigh_cache_run(void);
+void tnl_neigh_flush(const char dev_name[]);
 
 #endif
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 73966a4e8..e0cd8edab 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -1481,6 +1481,9 @@  xlate_ofport_remove(struct ofport_dpif *ofport)
     ovs_assert(new_xcfg);
 
     xport = xport_lookup(new_xcfg, ofport);
+    if(xport) {
+        tnl_neigh_flush(netdev_get_name(xport->netdev));
+    }
     xlate_xport_remove(new_xcfg, xport);
 }
 
diff --git a/tests/tunnel.at b/tests/tunnel.at
index 035c54f67..6d7550724 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -920,3 +920,46 @@  dnl which is not correct
 
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([tunnel - neighbor entry add and deletion])
+OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \
+                    options:remote_ip=1.1.1.1 options:local_ip=2.2.2.2 \
+                    options:key=5 ofport_request=1 \
+                    -- add-port br0 p2 -- set Interface p2 type=gre \
+                    options:local_ip=3.3.3.3 options:remote_ip=4.4.4.4 \
+                    ofport_request=2])
+AT_CHECK([ovs-vsctl add-br br1 -- set bridge br1 datapath_type=dummy], [0])
+
+dnl Populate tunnel neighbor cache table
+AT_CHECK([
+    ovs-appctl tnl/arp/set p1 10.0.0.1 00:00:10:00:00:01
+    ovs-appctl tnl/arp/set p2 10.0.1.1 00:00:10:00:01:01
+    ovs-appctl tnl/arp/set br0 10.0.2.1 00:00:10:00:02:01
+    ovs-appctl tnl/arp/set br1 20.0.0.1 00:00:20:00:00:01
+], [0], [stdout])
+
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
+10.0.0.1                                      00:00:10:00:00:01   p1
+10.0.1.1                                      00:00:10:00:01:01   p2
+10.0.2.1                                      00:00:10:00:02:01   br0
+20.0.0.1                                      00:00:20:00:00:01   br1
+])
+
+dnl neighbor table after deleting port p1
+AT_CHECK([ovs-vsctl del-port br0 p1],[0], [stdout])
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | grep -w p1 | sort], [0], [dnl
+])
+
+dnl neighbor table after deleting bridge br0
+AT_CHECK([ovs-vsctl del-br br0],[0], [stdout])
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
+20.0.0.1                                      00:00:20:00:00:01   br1
+])
+
+dnl neighbor table after deleting bridge br1
+AT_CHECK([ovs-vsctl del-br br1],[0], [stdout])
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP