Message ID | 20190620030207.17383-1-vdasari@gmail.com
---|---
State | Superseded
Series | [ovs-dev] tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted
Bleep bloop.  Greetings Vasu Dasari, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.

checkpatch:
ERROR: Improper whitespace around control block
#39 FILE: lib/tnl-neigh-cache.c:230:
CMAP_FOR_EACH(neigh, cmap_node, &table) {

ERROR: Improper whitespace around control block
#74 FILE: ofproto/ofproto-dpif-xlate.c:1484:
if(xport) {

Lines checked: 133, Warnings: 0, Errors: 2

Please check this out.  If you feel there has been an error, please email aconole@bytheb.org

Thanks,
0-day Robot
On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> Say an ARP entry is learnt on a OVS port and when such a port is deleted,
> learnt entry should be removed from the port. It would have be aged out after
> ARP ageout time. This code will clean up immediately.
>
> Added test case(tunnel - neighbor entry add and deletion) in tunnel.at, to
> verify neighbors are added and removed on deletion of a ports and bridges.
>
> Discussion for this addition is at:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html
>
> Signed-off-by: Vasu Dasari <vdasari@gmail.com>
> ---
>  lib/tnl-neigh-cache.c        | 20 +++++++++++++++++
>  lib/tnl-neigh-cache.h        |  1 +
>  ofproto/ofproto-dpif-xlate.c |  3 +++
>  tests/tunnel.at              | 43 ++++++++++++++++++++++++++++++++++++
>  4 files changed, 67 insertions(+)
>
> diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c
> index b28f9f1bb..8718405db 100644
> --- a/lib/tnl-neigh-cache.c
> +++ b/lib/tnl-neigh-cache.c
> @@ -220,6 +220,26 @@ tnl_neigh_cache_run(void)
>      }
>  }
>
> +void
> +tnl_neigh_flush(const char br_name[])

It seems the file uses a convention declaring using IFNAMSIZ

> +{
> +    struct tnl_neigh_entry *neigh;
> +    bool changed = false;
> +
> +    ovs_mutex_lock(&mutex);
> +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> +            tnl_neigh_delete(neigh);
> +            changed = true;

Do you expect to match on additional entries? Otherwise it
could bail out here.

> +        }
> +    }
> +    ovs_mutex_unlock(&mutex);
> +
> +    if (changed) {
> +        seq_change(tnl_conf_seq);
> +    }
> +}
> +
>  static void
>  tnl_neigh_cache_flush(struct unixctl_conn *conn, int argc OVS_UNUSED,
>                        const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> diff --git a/lib/tnl-neigh-cache.h b/lib/tnl-neigh-cache.h
> index fee8e6a6f..ded9c2f86 100644
> --- a/lib/tnl-neigh-cache.h
> +++ b/lib/tnl-neigh-cache.h
> @@ -37,5 +37,6 @@ int tnl_neigh_lookup(const char dev_name[], const struct in6_addr *dst,
>                      struct eth_addr *mac);
>  void tnl_neigh_cache_init(void);
>  void tnl_neigh_cache_run(void);
> +void tnl_neigh_flush(const char dev_name[]);
>
>  #endif
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 73966a4e8..e0cd8edab 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -1481,6 +1481,9 @@ xlate_ofport_remove(struct ofport_dpif *ofport)
>      ovs_assert(new_xcfg);
>
>      xport = xport_lookup(new_xcfg, ofport);
> +    if(xport) {
       ^--- space needed here.
> +        tnl_neigh_flush(netdev_get_name(xport->netdev));
> +    }
>      xlate_xport_remove(new_xcfg, xport);

Shouldn't this be in xlate_xport_remove()?

However, as I mentioned in the discussion, the tnl IP address could
be on a device that doesn't belong to OVS at all. For example:

    br-tun
      |
      +----- vxlan0 --- remote-ipaddr=192.168.1.10
      |
      +----- vxlan1 --- remote-ipaddr=192.168.2.10

And then you have:
    eth0 --- 192.168.1.1/24
    eth1 --- 192.168.2.1/24

In this case, if we took either eth0 or eth1 down, the cache is not
immediately flushed.

The tnl endpoint must have a valid route, so I suggest to hook the
tnl_neigh_cache_flush into route-table.c which receives a notification
when a route changes. If a route is deleted, we should flush the
device's tnl cache. Doing so, would cover both userspace and kernel
datapath for OVS and non-OVS devices.

What do you think?

Thanks,
fbl

> diff --git a/tests/tunnel.at b/tests/tunnel.at
> index 035c54f67..6d7550724 100644
> --- a/tests/tunnel.at
> +++ b/tests/tunnel.at
> @@ -920,3 +920,46 @@ dnl which is not correct
>
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([tunnel - neighbor entry add and deletion])
> +OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \
> +                    options:remote_ip=1.1.1.1 options:local_ip=2.2.2.2 \
> +                    options:key=5 ofport_request=1 \
> +                    -- add-port br0 p2 -- set Interface p2 type=gre \
> +                    options:local_ip=3.3.3.3 options:remote_ip=4.4.4.4 \
> +                    ofport_request=2])
> +AT_CHECK([ovs-vsctl add-br br1 -- set bridge br1 datapath_type=dummy], [0])
> +
> +dnl Populate tunnel neighbor cache table
> +AT_CHECK([
> +    ovs-appctl tnl/arp/set p1 10.0.0.1 00:00:10:00:00:01
> +    ovs-appctl tnl/arp/set p2 10.0.1.1 00:00:10:00:01:01
> +    ovs-appctl tnl/arp/set br0 10.0.2.1 00:00:10:00:02:01
> +    ovs-appctl tnl/arp/set br1 20.0.0.1 00:00:20:00:00:01
> +], [0], [stdout])
> +
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> +10.0.0.1 00:00:10:00:00:01 p1
> +10.0.1.1 00:00:10:00:01:01 p2
> +10.0.2.1 00:00:10:00:02:01 br0
> +20.0.0.1 00:00:20:00:00:01 br1
> +])
> +
> +dnl neighbor table after deleting port p1
> +AT_CHECK([ovs-vsctl del-port br0 p1],[0], [stdout])
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | grep -w p1 | sort], [0], [dnl
> +])
> +
> +dnl neighbor table after deleting bridge br0
> +AT_CHECK([ovs-vsctl del-br br0],[0], [stdout])
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> +20.0.0.1 00:00:20:00:00:01 br1
> +])
> +
> +dnl neighbor table after deleting bridge br1
> +AT_CHECK([ovs-vsctl del-br br1],[0], [stdout])
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> +])
> +
> +OVS_VSWITCHD_STOP
> +AT_CLEANUP
> --
> 2.17.2 (Apple Git-113)
>
Thanks Flavio for your inputs. My comments inline: *Vasu Dasari* On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote: > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote: > > Say an ARP entry is learnt on a OVS port and when such a port is deleted, > > learnt entry should be removed from the port. It would have be aged out > after > > ARP ageout time. This code will clean up immediately. > > > > Added test case(tunnel - neighbor entry add and deletion) in tunnel.at, > to > > verify neighbors are added and removed on deletion of a ports and > bridges. > > > > Discussion for this addition is at: > > https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html > > > > Signed-off-by: Vasu Dasari <vdasari@gmail.com> > > --- > > lib/tnl-neigh-cache.c | 20 +++++++++++++++++ > > lib/tnl-neigh-cache.h | 1 + > > ofproto/ofproto-dpif-xlate.c | 3 +++ > > tests/tunnel.at | 43 ++++++++++++++++++++++++++++++++++++ > > 4 files changed, 67 insertions(+) > > > > diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c > > index b28f9f1bb..8718405db 100644 > > --- a/lib/tnl-neigh-cache.c > > +++ b/lib/tnl-neigh-cache.c > > @@ -220,6 +220,26 @@ tnl_neigh_cache_run(void) > > } > > } > > > > +void > > +tnl_neigh_flush(const char br_name[]) > > It seems the file uses a convention declaring using IFNAMSIZ > Sure. I did not notice that. Will fix this. > > > +{ > > + struct tnl_neigh_entry *neigh; > > + bool changed = false; > > + > > + ovs_mutex_lock(&mutex); > > + CMAP_FOR_EACH(neigh, cmap_node, &table) { > > + if (nullable_string_is_equal(neigh->br_name, br_name)) { > > + tnl_neigh_delete(neigh); > > + changed = true; > > Do you expect to match on additional entries? Otherwise it > could bail out here. > > Say there are two or more neighbors on the port or on bridge, then by bailing out we would be missing others. So, will leave it there. 
> > > + } > > + } > > + ovs_mutex_unlock(&mutex); > > + > > + if (changed) { > > + seq_change(tnl_conf_seq); > > + } > > +} > > + > > static void > > tnl_neigh_cache_flush(struct unixctl_conn *conn, int argc OVS_UNUSED, > > const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED) > > diff --git a/lib/tnl-neigh-cache.h b/lib/tnl-neigh-cache.h > > index fee8e6a6f..ded9c2f86 100644 > > --- a/lib/tnl-neigh-cache.h > > +++ b/lib/tnl-neigh-cache.h > > @@ -37,5 +37,6 @@ int tnl_neigh_lookup(const char dev_name[], const > struct in6_addr *dst, > > struct eth_addr *mac); > > void tnl_neigh_cache_init(void); > > void tnl_neigh_cache_run(void); > > +void tnl_neigh_flush(const char dev_name[]); > > > > #endif > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > > index 73966a4e8..e0cd8edab 100644 > > --- a/ofproto/ofproto-dpif-xlate.c > > +++ b/ofproto/ofproto-dpif-xlate.c > > @@ -1481,6 +1481,9 @@ xlate_ofport_remove(struct ofport_dpif *ofport) > > ovs_assert(new_xcfg); > > > > xport = xport_lookup(new_xcfg, ofport); > > + if(xport) { > ^--- space needed here. > > > + tnl_neigh_flush(netdev_get_name(xport->netdev)); > > + } > > xlate_xport_remove(new_xcfg, xport); > > > Shouldn't this be in xlate_xport_remove()? > > My first attempt was to put the hook in xlate_xport_remove(). But, this function is also getting called from another code path as well xlate_txn_commit() which is getting called as part of type_run() callback for ofproto_class. type_run()(from ofproto-dpif.c) xlate_txn_commit() xlate_xcfg_free() xlate_xbridge_remove() xlate_xport_remove() From documentation this function can be called from periodically and hence flushing ARP entries. So, chose to do the flush when a ofport is removed. However, as I mentioned in the discussion, the tnl IP address could > be on a device that doesn't belong to OVS at all. 
For example: > > br-tun > | > +----- vxlan0 --- remote-ipaddr=192.168.1.10 > | > +----- vxlan1 --- remote-ipaddr=192.168.2.10 > > And then you have: > eth0 --- 192.168.1.1/24 > > eth1 --- 192.168.2.1/24 > > In this case, if we took either eth0 or eth1 down, the cache is not > immediately flushed. > > The tnl endpoint must have a valid route, so I suggest to hook the > tnl_neigh_cache_flush into route-table.c which receives a notification > when a route changes. If a route is deleted, we should flush the > device's tnl cache. Doing so, would cover both userspace and kernel > datapath for OVS and non-OVS devices. > > I see what you are saying. Let me play with code a bit and resubmit patch. Thanks -Vasu > What do you think? > > Thanks, > fbl > > > > diff --git a/tests/tunnel.at b/tests/tunnel.at > > index 035c54f67..6d7550724 100644 > > --- a/tests/tunnel.at > > +++ b/tests/tunnel.at > > @@ -920,3 +920,46 @@ dnl which is not correct > > > > OVS_VSWITCHD_STOP > > AT_CLEANUP > > + > > +AT_SETUP([tunnel - neighbor entry add and deletion]) > > +OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \ > > + options:remote_ip=1.1.1.1 options:local_ip=2.2.2.2 \ > > + options:key=5 ofport_request=1 \ > > + -- add-port br0 p2 -- set Interface p2 type=gre \ > > + options:local_ip=3.3.3.3 options:remote_ip=4.4.4.4 \ > > + ofport_request=2]) > > +AT_CHECK([ovs-vsctl add-br br1 -- set bridge br1 datapath_type=dummy], > [0]) > > + > > +dnl Populate tunnel neighbor cache table > > +AT_CHECK([ > > + ovs-appctl tnl/arp/set p1 10.0.0.1 00:00:10:00:00:01 > > + ovs-appctl tnl/arp/set p2 10.0.1.1 00:00:10:00:01:01 > > + ovs-appctl tnl/arp/set br0 10.0.2.1 00:00:10:00:02:01 > > + ovs-appctl tnl/arp/set br1 20.0.0.1 00:00:20:00:00:01 > > +], [0], [stdout]) > > + > > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl > > +10.0.0.1 00:00:10:00:00:01 p1 > > +10.0.1.1 00:00:10:00:01:01 p2 > > +10.0.2.1 00:00:10:00:02:01 br0 > > +20.0.0.1 00:00:20:00:00:01 br1 > > +]) 
> > + > > +dnl neighbor table after deleting port p1 > > +AT_CHECK([ovs-vsctl del-port br0 p1],[0], [stdout]) > > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | grep -w p1 | sort], > [0], [dnl > > +]) > > + > > +dnl neighbor table after deleting bridge br0 > > +AT_CHECK([ovs-vsctl del-br br0],[0], [stdout]) > > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl > > +20.0.0.1 00:00:20:00:00:01 br1 > > +]) > > + > > +dnl neighbor table after deleting bridge br1 > > +AT_CHECK([ovs-vsctl del-br br1],[0], [stdout]) > > +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl > > +]) > > + > > +OVS_VSWITCHD_STOP > > +AT_CLEANUP > > -- > > 2.17.2 (Apple Git-113) > > >
On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote:
> > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > +{
> > > +    struct tnl_neigh_entry *neigh;
> > > +    bool changed = false;
> > > +
> > > +    ovs_mutex_lock(&mutex);
> > > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > +            tnl_neigh_delete(neigh);
> > > +            changed = true;
> >
> > Do you expect to match on additional entries? Otherwise it
> > could bail out here.
>
> Say there are two or more neighbors on the port or on bridge, then by
> bailing out we would be missing others. So, will leave it there.

You're right.

[...]
> > However, as I mentioned in the discussion, the tnl IP address could
> > be on a device that doesn't belong to OVS at all. For example:
[...]
> > The tnl endpoint must have a valid route, so I suggest to hook the
> > tnl_neigh_cache_flush into route-table.c which receives a notification
> > when a route changes. If a route is deleted, we should flush the
> > device's tnl cache. Doing so, would cover both userspace and kernel
> > datapath for OVS and non-OVS devices.
>
> I see what you are saying. Let me play with code a bit and resubmit patch.

OK, looking forward to it!
Thanks Vasu,
fbl
Hi Flavio,

I am trying to emulate the test case scenario you mentioned earlier, wherein we need to clean up the neighbor cache when an external interface goes down. To study that, I wrote a small script based off of the test case "datapath - ping over vxlan tunnel" in system-traffic.at. This experiment gave me good experience with the netdev and kmod implementations of VxLAN.

Various tests I have performed using the new script include:

1. ovs-vswitchd started normally, for performing kernel-mode VxLAN tests. In this mode, tested the following cases:
   a. VxLAN interface being part of OVS.
   b. VxLAN interface external to OVS.
   I see that in kernel mode of operation, native tunneling is off, and hence no ARP entries are learnt in tnl-neigh-cache. Please refer to ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we are making here (as part of my commit) have no effect in kernel mode of operation.

2. ovs-vswitchd started with the "--disable-system" option, for performing usermode VxLAN tests. In this mode, tested the following cases:
   a. VxLAN interface being part of OVS. This test case works; entries are cleanly removed when the user deletes the ports from the bridge.
   b. VxLAN interface external to OVS. I could not get this case to work.

3. ovs-vswitchd started normally (without the --disable-system option), for performing usermode VxLAN tests. In this mode, tested the following cases:
   a. VxLAN interface being part of OVS. This test case works; entries are cleanly removed when the user deletes the ports from the bridge.
   b. VxLAN interface external to OVS. I could not get this case to work.

I could not get use cases 2b and 3b to work at all. Do you expect these to work normally? The reason I ask is that, as I understand from the code, even though "ovs/route/show" shows the route to the remote vtep, OVS does not know which ODP port to take to send the packet out of.
Please refer to ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow(); packet transmission fails there with "native tunnel routing failed".

If you agree that 2b and 3b are not supported use cases, then I can submit my code changes as per your review comments.

Please let me know what you think.

-Vasu

*Vasu Dasari*

On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner <fbl@sysclose.org> wrote:
> On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote:
> > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > > +{
> > > > +    struct tnl_neigh_entry *neigh;
> > > > +    bool changed = false;
> > > > +
> > > > +    ovs_mutex_lock(&mutex);
> > > > +    CMAP_FOR_EACH(neigh, cmap_node, &table) {
> > > > +        if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > > +            tnl_neigh_delete(neigh);
> > > > +            changed = true;
> > >
> > > Do you expect to match on additional entries? Otherwise it
> > > could bail out here.
> >
> > Say there are two or more neighbors on the port or on bridge, then by
> > bailing out we would be missing others. So, will leave it there.
>
> You're right.
>
> [...]
> > > However, as I mentioned in the discussion, the tnl IP address could
> > > be on a device that doesn't belong to OVS at all. For example:
> [...]
> > > The tnl endpoint must have a valid route, so I suggest to hook the
> > > tnl_neigh_cache_flush into route-table.c which receives a notification
> > > when a route changes. If a route is deleted, we should flush the
> > > device's tnl cache. Doing so, would cover both userspace and kernel
> > > datapath for OVS and non-OVS devices.
> >
> > I see what you are saying. Let me play with code a bit and resubmit
> > patch.
>
> OK, looking forward to it!
> Thanks Vasu,
> fbl
On Tue, Jul 02, 2019 at 08:50:16PM -0400, Vasu Dasari wrote: > Hi Flavio, > > I am trying to emulate the test case scenario you mentioned earlier, where > in we need to clean you neighbor cache when an external interface goes > down. To study that, I wrote a small script based off of test case > system-traffic.at, "datapath - ping over vxlan tunnel". And this experiment > gave me good experience around the understanding of netdev and kmod > implementation of VxLAN. > > Various tests I have performed using the new script include, > 1. ovs-vswitchd started normally, for performing kernel mode VxLAN tests. > And in this mode, tested following cases > a. VxLAN interface being part of OVS. > b. VxLAN interface external to OVS > I see that in kernel mode of operation, native tunneling is off, and > hence no ARP entries are learnt in tnl-neigh-cache. Please refer to > ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we are > making here(as part of my commit) are not effected in kernel mode of > operation. In either case my understanding is that OvS doesn't care about the VXLAN and just hands off the packet to the interface. In another words, it's the interface's job to encapsulate the traffic and that's why it doesn't impact on the tnl-neigh-cache. > 2. ovs-vswitchd started with "--disable-system" option for performing > usermode VxLAN tests. And in this mode, tested following cases > a. VxLAN interface being part of OVS. This test case works. In this > case, entries are cleanly removed by user deleting the ports from the > bridge. Right, this is the scenario you tested first, if I recall correctly. > b. VxLAN interface external to OVS. I could not get this case to work. I think OvS will inject the packet to the VXLAN interface, but I never tried this as it seems unusual scenario to have. > 3. ovs-vswitchd started normally(without --disable-system option) for > performing usermode VxLAN tests. And in this mode, tested following cases > a. 
VxLAN interface being part of OVS. This test case works. In this > case, entries are cleanly removed by user deleting the ports from the > bridge. > b. VxLAN interface external to OVS. I could not get this case to work. This looks like a perfect copy from item #2 which is okay, just checking if there was no copy&paste error somewhere :-) > I could not 2b and 3b use cases to at all. Do you expect these to work > normally? The reason I ask this is, as I understand from the code, even > though "ovs/route/show" shows the route to remote vtep, OVS does not know > which ODP port to take to send the packet out of. Please refer to > ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow() and > hence packet transmission fails with "native tunnel routing failed". > > If you agree that 2b and 3b are not supported use cases, then I can submit > my code changes as per your review comments. Let me step back a bit because in the scenario I told initially had the egress device (ethX) with the endpoint address outside of the bridge while the VXLAN interface was always part of the bridge. Therefore we had two scenarios to cover with the difference being the endpoint address being part or not part of the OvS bridge. However, looking more closely to the code if the VXLAN is part of the bridge in userspace, then native tunnel will need the tnl-neigh-cache to build the headers. Then it sends out the ARP Request packet to the datapath only, so it doesn't seem to support endpoint addresses outside of the OvS. I guess your initial patch covering only interfaces as part of OvS was good enough then. Do you agree with that? Thanks! fbl > > Please let me know what you think of. 
> -Vasu > > *Vasu Dasari* > > > On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner <fbl@sysclose.org> wrote: > > > On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote: > > > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> wrote: > > > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote: > > > > > +{ > > > > > + struct tnl_neigh_entry *neigh; > > > > > + bool changed = false; > > > > > + > > > > > + ovs_mutex_lock(&mutex); > > > > > + CMAP_FOR_EACH(neigh, cmap_node, &table) { > > > > > + if (nullable_string_is_equal(neigh->br_name, br_name)) { > > > > > + tnl_neigh_delete(neigh); > > > > > + changed = true; > > > > > > > > Do you expect to match on additional entries? Otherwise it > > > > could bail out here. > > > > > > > > > > > Say there are two or more neighbors on the port or on bridge, then by > > > bailing out we would be missing others. So, will leave it there. > > > > You're right. > > > > [...] > > > > However, as I mentioned in the discussion, the tnl IP address could > > > > be on a device that doesn't belong to OVS at all. For example: > > [...] > > > > The tnl endpoint must have a valid route, so I suggest to hook the > > > > tnl_neigh_cache_flush into route-table.c which receives a notification > > > > when a route changes. If a route is deleted, we should flush the > > > > device's tnl cache. Doing so, would cover both userspace and kernel > > > > datapath for OVS and non-OVS devices. > > > > > > > > > > > I see what you are saying. Let me play with code a bit and resubmit > > patch. > > > > OK, looking forward to it! > > Thanks Vasu, > > fbl > >
My comments inline: On Mon, Jul 8, 2019 at 9:41 AM Flavio Leitner <fbl@sysclose.org> wrote: > On Tue, Jul 02, 2019 at 08:50:16PM -0400, Vasu Dasari wrote: > > Hi Flavio, > > > > I am trying to emulate the test case scenario you mentioned earlier, > where > > in we need to clean you neighbor cache when an external interface goes > > down. To study that, I wrote a small script based off of test case > > system-traffic.at, "datapath - ping over vxlan tunnel". And this > experiment > > gave me good experience around the understanding of netdev and kmod > > implementation of VxLAN. > > > > Various tests I have performed using the new script include, > > 1. ovs-vswitchd started normally, for performing kernel mode VxLAN tests. > > And in this mode, tested following cases > > a. VxLAN interface being part of OVS. > > b. VxLAN interface external to OVS > > I see that in kernel mode of operation, native tunneling is off, and > > hence no ARP entries are learnt in tnl-neigh-cache. Please refer to > > ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we > are > > making here(as part of my commit) are not effected in kernel mode of > > operation. > > In either case my understanding is that OvS doesn't care about the > VXLAN and just hands off the packet to the interface. In another > words, it's the interface's job to encapsulate the traffic and > that's why it doesn't impact on the tnl-neigh-cache. > > Agree. > > > 2. ovs-vswitchd started with "--disable-system" option for performing > > usermode VxLAN tests. And in this mode, tested following cases > > a. VxLAN interface being part of OVS. This test case works. In this > > case, entries are cleanly removed by user deleting the ports from the > > bridge. > > Right, this is the scenario you tested first, if I recall correctly. > > Yes. > > b. VxLAN interface external to OVS. I could not get this case to work. 
> > I think OvS will inject the packet to the VXLAN interface, but I > never tried this as it seems unusual scenario to have. > > Agree. > > 3. ovs-vswitchd started normally(without --disable-system option) for > > performing usermode VxLAN tests. And in this mode, tested following cases > > a. VxLAN interface being part of OVS. This test case works. In this > > case, entries are cleanly removed by user deleting the ports from the > > bridge. > > b. VxLAN interface external to OVS. I could not get this case to work. > > This looks like a perfect copy from item #2 which is okay, just checking > if there was no copy&paste error somewhere :-) > > This case is different from case 2, where in ovs-vswitchd is started without disable-system option. In this mode, ovs-router learns route from kernel. > > > I could not 2b and 3b use cases to at all. Do you expect these to work > > normally? The reason I ask this is, as I understand from the code, even > > though "ovs/route/show" shows the route to remote vtep, OVS does not know > > which ODP port to take to send the packet out of. Please refer to > > ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow() > and > > hence packet transmission fails with "native tunnel routing failed". > > > > If you agree that 2b and 3b are not supported use cases, then I can > submit > > my code changes as per your review comments. > > Let me step back a bit because in the scenario I told initially had > the egress device (ethX) with the endpoint address outside of the > bridge while the VXLAN interface was always part of the bridge. > Therefore we had two scenarios to cover with the difference being > the endpoint address being part or not part of the OvS bridge. > > However, looking more closely to the code if the VXLAN is part of the > bridge in userspace, then native tunnel will need the tnl-neigh-cache > to build the headers. 
Then it sends out the ARP Request packet to the > datapath only, so it doesn't seem to support endpoint addresses outside > of the OvS. I guess your initial patch covering only interfaces as part > of OvS was good enough then. > > Do you agree with that? > Agree. Will send a patch with yours and Ben's comments. Thanks for the review. -Vasu > Thanks! > fbl > > > > > > Please let me know what you think of. > > -Vasu > > > > *Vasu Dasari* > > > > > > On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner <fbl@sysclose.org> wrote: > > > > > On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote: > > > > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner <fbl@sysclose.org> > wrote: > > > > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote: > > > > > > +{ > > > > > > + struct tnl_neigh_entry *neigh; > > > > > > + bool changed = false; > > > > > > + > > > > > > + ovs_mutex_lock(&mutex); > > > > > > + CMAP_FOR_EACH(neigh, cmap_node, &table) { > > > > > > + if (nullable_string_is_equal(neigh->br_name, br_name)) { > > > > > > + tnl_neigh_delete(neigh); > > > > > > + changed = true; > > > > > > > > > > Do you expect to match on additional entries? Otherwise it > > > > > could bail out here. > > > > > > > > > > > > > > Say there are two or more neighbors on the port or on bridge, then by > > > > bailing out we would be missing others. So, will leave it there. > > > > > > You're right. > > > > > > [...] > > > > > However, as I mentioned in the discussion, the tnl IP address could > > > > > be on a device that doesn't belong to OVS at all. For example: > > > [...] > > > > > The tnl endpoint must have a valid route, so I suggest to hook the > > > > > tnl_neigh_cache_flush into route-table.c which receives a > notification > > > > > when a route changes. If a route is deleted, we should flush the > > > > > device's tnl cache. Doing so, would cover both userspace and kernel > > > > > datapath for OVS and non-OVS devices. > > > > > > > > > > > > > > I see what you are saying. 
Let me play with code a bit and resubmit > > > patch. > > > > > > OK, looking forward to it! > > > Thanks Vasu, > > > fbl > > > >
diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c
index b28f9f1bb..8718405db 100644
--- a/lib/tnl-neigh-cache.c
+++ b/lib/tnl-neigh-cache.c
@@ -220,6 +220,26 @@ tnl_neigh_cache_run(void)
     }
 }
 
+void
+tnl_neigh_flush(const char br_name[])
+{
+    struct tnl_neigh_entry *neigh;
+    bool changed = false;
+
+    ovs_mutex_lock(&mutex);
+    CMAP_FOR_EACH(neigh, cmap_node, &table) {
+        if (nullable_string_is_equal(neigh->br_name, br_name)) {
+            tnl_neigh_delete(neigh);
+            changed = true;
+        }
+    }
+    ovs_mutex_unlock(&mutex);
+
+    if (changed) {
+        seq_change(tnl_conf_seq);
+    }
+}
+
 static void
 tnl_neigh_cache_flush(struct unixctl_conn *conn, int argc OVS_UNUSED,
                       const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
diff --git a/lib/tnl-neigh-cache.h b/lib/tnl-neigh-cache.h
index fee8e6a6f..ded9c2f86 100644
--- a/lib/tnl-neigh-cache.h
+++ b/lib/tnl-neigh-cache.h
@@ -37,5 +37,6 @@ int tnl_neigh_lookup(const char dev_name[], const struct in6_addr *dst,
                     struct eth_addr *mac);
 void tnl_neigh_cache_init(void);
 void tnl_neigh_cache_run(void);
+void tnl_neigh_flush(const char dev_name[]);
 
 #endif
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 73966a4e8..e0cd8edab 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -1481,6 +1481,9 @@ xlate_ofport_remove(struct ofport_dpif *ofport)
     ovs_assert(new_xcfg);
 
     xport = xport_lookup(new_xcfg, ofport);
+    if(xport) {
+        tnl_neigh_flush(netdev_get_name(xport->netdev));
+    }
     xlate_xport_remove(new_xcfg, xport);
 }
 
diff --git a/tests/tunnel.at b/tests/tunnel.at
index 035c54f67..6d7550724 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -920,3 +920,46 @@ dnl which is not correct
 
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([tunnel - neighbor entry add and deletion])
+OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \
+                    options:remote_ip=1.1.1.1 options:local_ip=2.2.2.2 \
+                    options:key=5 ofport_request=1 \
+                    -- add-port br0 p2 -- set Interface p2 type=gre \
+                    options:local_ip=3.3.3.3 options:remote_ip=4.4.4.4 \
+                    ofport_request=2])
+AT_CHECK([ovs-vsctl add-br br1 -- set bridge br1 datapath_type=dummy], [0])
+
+dnl Populate tunnel neighbor cache table
+AT_CHECK([
+    ovs-appctl tnl/arp/set p1 10.0.0.1 00:00:10:00:00:01
+    ovs-appctl tnl/arp/set p2 10.0.1.1 00:00:10:00:01:01
+    ovs-appctl tnl/arp/set br0 10.0.2.1 00:00:10:00:02:01
+    ovs-appctl tnl/arp/set br1 20.0.0.1 00:00:20:00:00:01
+], [0], [stdout])
+
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
+10.0.0.1 00:00:10:00:00:01 p1
+10.0.1.1 00:00:10:00:01:01 p2
+10.0.2.1 00:00:10:00:02:01 br0
+20.0.0.1 00:00:20:00:00:01 br1
+])
+
+dnl neighbor table after deleting port p1
+AT_CHECK([ovs-vsctl del-port br0 p1],[0], [stdout])
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | grep -w p1 | sort], [0], [dnl
+])
+
+dnl neighbor table after deleting bridge br0
+AT_CHECK([ovs-vsctl del-br br0],[0], [stdout])
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
+20.0.0.1 00:00:20:00:00:01 br1
+])
+
+dnl neighbor table after deleting bridge br1
+AT_CHECK([ovs-vsctl del-br br1],[0], [stdout])
+AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
Say an ARP entry is learnt on an OVS port; when such a port is deleted, the learnt entry should be removed along with the port. It would otherwise only have been aged out after the ARP ageout time. This code cleans it up immediately.

Added a test case (tunnel - neighbor entry add and deletion) in tunnel.at to verify that neighbors are added and removed on deletion of ports and bridges.

Discussion for this addition is at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html

Signed-off-by: Vasu Dasari <vdasari@gmail.com>
---
 lib/tnl-neigh-cache.c        | 20 +++++++++++++++++
 lib/tnl-neigh-cache.h        |  1 +
 ofproto/ofproto-dpif-xlate.c |  3 +++
 tests/tunnel.at              | 43 ++++++++++++++++++++++++++++++++++++
 4 files changed, 67 insertions(+)