diff mbox series

[ovs-dev] Allow to create tunnel ports with the same config

Message ID 20201119033501.798597-1-ihrachys@redhat.com
State Superseded, archived
Headers show
Series [ovs-dev] Allow to create tunnel ports with the same config | expand

Commit Message

Ihar Hrachyshka Nov. 19, 2020, 3:35 a.m. UTC
It's a legal setup where tunnel ports with the same config are created
on different bridges served by Open vSwitch. Specifically, multiple
OVN controllers may emulate multiple chassis running on the same
physical host, in which case they may need to create separate tunnel
ports to connect to the same remote chassis on their respective
bridges.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
---
 ofproto/tunnel.c | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

Comments

Ben Pfaff Nov. 19, 2020, 5:12 a.m. UTC | #1
On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> It's a legal setup where tunnel ports with the same config are created
> on different bridges served by Open vSwitch. Specifically, multiple
> OVN controllers may emulate multiple chassis running on the same
> physical host, in which case they may need to create separate tunnel
> ports to connect to the same remote chassis on their respective
> bridges.
> 
> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>

That makes sense for sending packets, but which one is supposed to
receive a packet when one arrives for that tunnel?
Ihar Hrachyshka Nov. 19, 2020, 1:31 p.m. UTC | #2
On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
>
> On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > It's a legal setup where tunnel ports with the same config are created
> > on different bridges served by Open vSwitch. Specifically, multiple
> > OVN controllers may emulate multiple chassis running on the same
> > physical host, in which case they may need to create separate tunnel
> > ports to connect to the same remote chassis on their respective
> > bridges.
> >
> > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
>
> That makes sense for sending packets, but which one is supposed to
> receive a packet when one arrives for that tunnel?
>

If we are talking in OVN context, each virtual ovn-controller chassis
on the same host has to have a different IP / port to distinguish
between chassis. I don't think it's a useful configuration to have two
two incoming tunnel ports with the same config for the same IP that
are not served by separate DST IP addresses.

Ihar
Ben Pfaff Nov. 19, 2020, 5:19 p.m. UTC | #3
On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
> >
> > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > It's a legal setup where tunnel ports with the same config are created
> > > on different bridges served by Open vSwitch. Specifically, multiple
> > > OVN controllers may emulate multiple chassis running on the same
> > > physical host, in which case they may need to create separate tunnel
> > > ports to connect to the same remote chassis on their respective
> > > bridges.
> > >
> > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> >
> > That makes sense for sending packets, but which one is supposed to
> > receive a packet when one arrives for that tunnel?
> >
> 
> If we are talking in OVN context, each virtual ovn-controller chassis
> on the same host has to have a different IP / port to distinguish
> between chassis. I don't think it's a useful configuration to have two
> two incoming tunnel ports with the same config for the same IP that
> are not served by separate DST IP addresses.

I think that this code rejects tunnel ports with exactly the same
configuration, though.  If the two ports were configured with different
local IP addresses, then this code would not flag a conflict.  Do I
misunderstand?
Ihar Hrachyshka Nov. 20, 2020, 5:44 p.m. UTC | #4
On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <blp@ovn.org> wrote:
>
> On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
> > >
> > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > > It's a legal setup where tunnel ports with the same config are created
> > > > on different bridges served by Open vSwitch. Specifically, multiple
> > > > OVN controllers may emulate multiple chassis running on the same
> > > > physical host, in which case they may need to create separate tunnel
> > > > ports to connect to the same remote chassis on their respective
> > > > bridges.
> > > >
> > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > >
> > > That makes sense for sending packets, but which one is supposed to
> > > receive a packet when one arrives for that tunnel?
> > >
> >
> > If we are talking in OVN context, each virtual ovn-controller chassis
> > on the same host has to have a different IP / port to distinguish
> > between chassis. I don't think it's a useful configuration to have two
> > two incoming tunnel ports with the same config for the same IP that
> > are not served by separate DST IP addresses.
>
> I think that this code rejects tunnel ports with exactly the same
> configuration, though.  If the two ports were configured with different
> local IP addresses, then this code would not flag a conflict.  Do I
> misunderstand?
>

Local IP addresses are not part of tunnel interface options map. Only
remote_ip and dst_port are. Does it address your concern?

Ihar
Ben Pfaff Nov. 20, 2020, 6:16 p.m. UTC | #5
On Fri, Nov 20, 2020 at 12:44:30PM -0500, Ihar Hrachyshka wrote:
> On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <blp@ovn.org> wrote:
> >
> > On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> > > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
> > > >
> > > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > > > It's a legal setup where tunnel ports with the same config are created
> > > > > on different bridges served by Open vSwitch. Specifically, multiple
> > > > > OVN controllers may emulate multiple chassis running on the same
> > > > > physical host, in which case they may need to create separate tunnel
> > > > > ports to connect to the same remote chassis on their respective
> > > > > bridges.
> > > > >
> > > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > > >
> > > > That makes sense for sending packets, but which one is supposed to
> > > > receive a packet when one arrives for that tunnel?
> > > >
> > >
> > > If we are talking in OVN context, each virtual ovn-controller chassis
> > > on the same host has to have a different IP / port to distinguish
> > > between chassis. I don't think it's a useful configuration to have two
> > > two incoming tunnel ports with the same config for the same IP that
> > > are not served by separate DST IP addresses.
> >
> > I think that this code rejects tunnel ports with exactly the same
> > configuration, though.  If the two ports were configured with different
> > local IP addresses, then this code would not flag a conflict.  Do I
> > misunderstand?
> >
> 
> Local IP addresses are not part of tunnel interface options map. Only
> remote_ip and dst_port are. Does it address your concern?

I don't see where you're getting that.  Tunnel configuration allows both
local and remote IPs to be specified.  The match structure has slots for
both IP addresses, and both of them may be specified.

If the remote (or local) IP addresses are different, then this warning
won't come up.  And you said that the remote IP addresses are different:
"I don't think it's a useful configuration to have two two incoming
tunnel ports with the same config for the same IP that are not served by
separate DST IP addresses."

tnl_port_receive() looks up the tunnel that should receive a packet.  It
calls into tnl_find().  This calls tnl_find_exact() for each of several
lookup tables.  In the case where a warning would be given here, we know
that tnl_find_exact() would find two matching elements in the map,
because that's the function that we use to know to issue the warning.
If we disable the warning and insert the second match anyway, there will
be two elements in the map with exactly the same key, tnl_find_exact()
will only ever return one of them, thus the other will never receive any
packets.

What am I missing?
Ihar Hrachyshka Nov. 20, 2020, 7:23 p.m. UTC | #6
On Fri, Nov 20, 2020 at 1:16 PM Ben Pfaff <blp@ovn.org> wrote:
>
> On Fri, Nov 20, 2020 at 12:44:30PM -0500, Ihar Hrachyshka wrote:
> > On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <blp@ovn.org> wrote:
> > >
> > > On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> > > > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
> > > > >
> > > > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > > > > It's a legal setup where tunnel ports with the same config are created
> > > > > > on different bridges served by Open vSwitch. Specifically, multiple
> > > > > > OVN controllers may emulate multiple chassis running on the same
> > > > > > physical host, in which case they may need to create separate tunnel
> > > > > > ports to connect to the same remote chassis on their respective
> > > > > > bridges.
> > > > > >
> > > > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > > > >
> > > > > That makes sense for sending packets, but which one is supposed to
> > > > > receive a packet when one arrives for that tunnel?
> > > > >
> > > >
> > > > If we are talking in OVN context, each virtual ovn-controller chassis
> > > > on the same host has to have a different IP / port to distinguish
> > > > between chassis. I don't think it's a useful configuration to have two
> > > > two incoming tunnel ports with the same config for the same IP that
> > > > are not served by separate DST IP addresses.
> > >
> > > I think that this code rejects tunnel ports with exactly the same
> > > configuration, though.  If the two ports were configured with different
> > > local IP addresses, then this code would not flag a conflict.  Do I
> > > misunderstand?
> > >
> >
> > Local IP addresses are not part of tunnel interface options map. Only
> > remote_ip and dst_port are. Does it address your concern?
>
> I don't see where you're getting that.  Tunnel configuration allows both
> local and remote IPs to be specified.  The match structure has slots for
> both IP addresses, and both of them may be specified.
>
> If the remote (or local) IP addresses are different, then this warning
> won't come up.  And you said that the remote IP addresses are different:
> "I don't think it's a useful configuration to have two two incoming
> tunnel ports with the same config for the same IP that are not served by
> separate DST IP addresses."
>
> tnl_port_receive() looks up the tunnel that should receive a packet.  It
> calls into tnl_find().  This calls tnl_find_exact() for each of several
> lookup tables.  In the case where a warning would be given here, we know
> that tnl_find_exact() would find two matching elements in the map,
> because that's the function that we use to know to issue the warning.
> If we disable the warning and insert the second match anyway, there will
> be two elements in the map with exactly the same key, tnl_find_exact()
> will only ever return one of them, thus the other will never receive any
> packets.
>
> What am I missing?
>

Oh I missed that local_ip *may* be configured for a tunnel endpoint,
it's just not configured by OVN controller *right now*. We can deal
with it in OVN controller instead of OVS then. Sorry for the noise,
this patch can be dropped.

Ihar
Ben Pfaff Nov. 20, 2020, 7:32 p.m. UTC | #7
On Fri, Nov 20, 2020 at 02:23:04PM -0500, Ihar Hrachyshka wrote:
> On Fri, Nov 20, 2020 at 1:16 PM Ben Pfaff <blp@ovn.org> wrote:
> >
> > On Fri, Nov 20, 2020 at 12:44:30PM -0500, Ihar Hrachyshka wrote:
> > > On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <blp@ovn.org> wrote:
> > > >
> > > > On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> > > > > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
> > > > > >
> > > > > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > > > > > It's a legal setup where tunnel ports with the same config are created
> > > > > > > on different bridges served by Open vSwitch. Specifically, multiple
> > > > > > > OVN controllers may emulate multiple chassis running on the same
> > > > > > > physical host, in which case they may need to create separate tunnel
> > > > > > > ports to connect to the same remote chassis on their respective
> > > > > > > bridges.
> > > > > > >
> > > > > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > > > > >
> > > > > > That makes sense for sending packets, but which one is supposed to
> > > > > > receive a packet when one arrives for that tunnel?
> > > > > >
> > > > >
> > > > > If we are talking in OVN context, each virtual ovn-controller chassis
> > > > > on the same host has to have a different IP / port to distinguish
> > > > > between chassis. I don't think it's a useful configuration to have two
> > > > > two incoming tunnel ports with the same config for the same IP that
> > > > > are not served by separate DST IP addresses.
> > > >
> > > > I think that this code rejects tunnel ports with exactly the same
> > > > configuration, though.  If the two ports were configured with different
> > > > local IP addresses, then this code would not flag a conflict.  Do I
> > > > misunderstand?
> > > >
> > >
> > > Local IP addresses are not part of tunnel interface options map. Only
> > > remote_ip and dst_port are. Does it address your concern?
> >
> > I don't see where you're getting that.  Tunnel configuration allows both
> > local and remote IPs to be specified.  The match structure has slots for
> > both IP addresses, and both of them may be specified.
> >
> > If the remote (or local) IP addresses are different, then this warning
> > won't come up.  And you said that the remote IP addresses are different:
> > "I don't think it's a useful configuration to have two two incoming
> > tunnel ports with the same config for the same IP that are not served by
> > separate DST IP addresses."
> >
> > tnl_port_receive() looks up the tunnel that should receive a packet.  It
> > calls into tnl_find().  This calls tnl_find_exact() for each of several
> > lookup tables.  In the case where a warning would be given here, we know
> > that tnl_find_exact() would find two matching elements in the map,
> > because that's the function that we use to know to issue the warning.
> > If we disable the warning and insert the second match anyway, there will
> > be two elements in the map with exactly the same key, tnl_find_exact()
> > will only ever return one of them, thus the other will never receive any
> > packets.
> >
> > What am I missing?
> >
> 
> Oh I missed that local_ip *may* be configured for a tunnel endpoint,
> it's just not configured by OVN controller *right now*. We can deal
> with it in OVN controller instead of OVS then. Sorry for the noise,
> this patch can be dropped.

Oh, wonderful, we understand each other now, and there's nothing
mysterious going on in the code that I didn't know about.  Cool!
Ihar Hrachyshka Dec. 2, 2020, 1:48 a.m. UTC | #8
Hi Ben,

I was trying to make local_ip work for my needs and hit some
unexpected connectivity loss between tunnel endpoints when source IPs
enforced in port config. Perhaps there's something I am missing about
using local_ip with ovs userspace routing. Perhaps you'll be able to
validate my thinking below.

So, for OVN purposes, I would like to have two tunnel ports that
differ in local_ip only (same remote_ip). In OVN test suite, the
following configuration is set up:

(br-int<->br-phys) - <n1> - (br-phys<->br-int)

Each (...) block is a separate (emulated) host with its own OVS/OVN
stack, interconnected through a n1 global bridge that emulates fabric
(everything runs on the same physical host using three sets of OVS
software). Each br-int has a geneve port that has both remote_ip and
local_ip set (point to each other). When br-phys bridges are created,
the following commands are executed to set up userspace IP/routing
stack inside OVS:

ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1

where $ip on two sides belong to the same /$masklen subnet,
$bridge==br-phys. By doing this, OVS route entries are created for the
subnet.

Without local_ip set for tunnel ports, traffic gets tagged on br-int,
and, as per created route entries, sent over br-phys->n1->br-phys
chain. When local_ip are set on tunnel ports, then packets are lost
(they don't reach br-phys).

After some debugging, I realized this happens because
ovs_router_lookup returns false when looking for a proper routing
entry.

This happens because a code path added by
8e4e45887ec3eb5f5833fd7b415b63ff47fc9642 ("ofproto-dpif-xlate: makes
OVS native tunneling honor tunnel-specified source addresses") is
triggered, where an attempt is made to check that the source IP indeed
belongs to the host. For that, the entry added by the ovs/route/add
command above is extracted and then its corresponding route entry
'local' property is checked. Since all route entries added by the
command are set to local==false, the code assumes it's not a local IP
address and it denies to steer the packet forward.

There's probably a reason why ovs/route/add doesn't support setting
local==true. My understanding is that in real life, local routing
table entries are created *implicitly* when a new IP address is added
to an interface. If one were to try to emulate the same behavior in
OVS routing userspace, one would probably have to add corresponding
routing entry points when adding a IP address to the bridge via
netdev-dummy/ip[46]addr commands.

When I modify netdev_dummy_ip4addr to do the following after setting
the IP address for the bridge, the test in question now works:

diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index 71df29184..ee01381d2 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -39,6 +39,7 @@
 #include "pcap-file.h"
 #include "openvswitch/poll-loop.h"
 #include "openvswitch/shash.h"
+#include "ovs-router.h"
 #include "sset.h"
 #include "stream.h"
 #include "unaligned.h"
@@ -1941,6 +1942,12 @@ netdev_dummy_ip4addr(struct unixctl_conn *conn,
int argc OVS_UNUSED,
         error = ip_parse_masked(argv[2], &ip.s_addr, &mask.s_addr);
         if (!error) {
             netdev_dummy_set_in4(netdev, ip, mask);
+
+            // insert local route entry for the new address
+            struct in6_addr ip6;
+            in6_addr_set_mapped_ipv4(&ip6, ip.s_addr);
+            ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any);
+
             unixctl_command_reply(conn, "OK");
         } else {
             unixctl_command_reply_error(conn, error);
@@ -1970,6 +1977,10 @@ netdev_dummy_ip6addr(struct unixctl_conn *conn,
int argc OVS_UNUSED,

             mask = ipv6_create_mask(plen);
             netdev_dummy_set_in6(netdev, &ip6, &mask);
+
+            // insert local route entry for the new address
+            ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any);
+
             unixctl_command_reply(conn, "OK");
         } else {
             unixctl_command_reply_error(conn, error);

(I believe the code should also remove the local entries on IP removed.)

Is there a good reason why netdev-dummy/ip4addr doesn't set local
routing entries for the new IP? Or is it just an oversight in
userspace routing implementation?

Thanks for any suggestions,
Ihar

On Fri, Nov 20, 2020 at 2:32 PM Ben Pfaff <blp@ovn.org> wrote:
>
> On Fri, Nov 20, 2020 at 02:23:04PM -0500, Ihar Hrachyshka wrote:
> > On Fri, Nov 20, 2020 at 1:16 PM Ben Pfaff <blp@ovn.org> wrote:
> > >
> > > On Fri, Nov 20, 2020 at 12:44:30PM -0500, Ihar Hrachyshka wrote:
> > > > On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <blp@ovn.org> wrote:
> > > > >
> > > > > On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> > > > > > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
> > > > > > >
> > > > > > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > > > > > > It's a legal setup where tunnel ports with the same config are created
> > > > > > > > on different bridges served by Open vSwitch. Specifically, multiple
> > > > > > > > OVN controllers may emulate multiple chassis running on the same
> > > > > > > > physical host, in which case they may need to create separate tunnel
> > > > > > > > ports to connect to the same remote chassis on their respective
> > > > > > > > bridges.
> > > > > > > >
> > > > > > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > > > > > >
> > > > > > > That makes sense for sending packets, but which one is supposed to
> > > > > > > receive a packet when one arrives for that tunnel?
> > > > > > >
> > > > > >
> > > > > > If we are talking in OVN context, each virtual ovn-controller chassis
> > > > > > on the same host has to have a different IP / port to distinguish
> > > > > > between chassis. I don't think it's a useful configuration to have two
> > > > > > two incoming tunnel ports with the same config for the same IP that
> > > > > > are not served by separate DST IP addresses.
> > > > >
> > > > > I think that this code rejects tunnel ports with exactly the same
> > > > > configuration, though.  If the two ports were configured with different
> > > > > local IP addresses, then this code would not flag a conflict.  Do I
> > > > > misunderstand?
> > > > >
> > > >
> > > > Local IP addresses are not part of tunnel interface options map. Only
> > > > remote_ip and dst_port are. Does it address your concern?
> > >
> > > I don't see where you're getting that.  Tunnel configuration allows both
> > > local and remote IPs to be specified.  The match structure has slots for
> > > both IP addresses, and both of them may be specified.
> > >
> > > If the remote (or local) IP addresses are different, then this warning
> > > won't come up.  And you said that the remote IP addresses are different:
> > > "I don't think it's a useful configuration to have two two incoming
> > > tunnel ports with the same config for the same IP that are not served by
> > > separate DST IP addresses."
> > >
> > > tnl_port_receive() looks up the tunnel that should receive a packet.  It
> > > calls into tnl_find().  This calls tnl_find_exact() for each of several
> > > lookup tables.  In the case where a warning would be given here, we know
> > > that tnl_find_exact() would find two matching elements in the map,
> > > because that's the function that we use to know to issue the warning.
> > > If we disable the warning and insert the second match anyway, there will
> > > be two elements in the map with exactly the same key, tnl_find_exact()
> > > will only ever return one of them, thus the other will never receive any
> > > packets.
> > >
> > > What am I missing?
> > >
> >
> > Oh I missed that local_ip *may* be configured for a tunnel endpoint,
> > it's just not configured by OVN controller *right now*. We can deal
> > with it in OVN controller instead of OVS then. Sorry for the noise,
> > this patch can be dropped.
>
> Oh, wonderful, we understand each other now, and there's nothing
> mysterious going on in the code that I didn't know about.  Cool!
>
Ben Pfaff Dec. 2, 2020, 6:59 a.m. UTC | #9
On Tue, Dec 01, 2020 at 08:48:56PM -0500, Ihar Hrachyshka wrote:
> Is there a good reason why netdev-dummy/ip4addr doesn't set local
> routing entries for the new IP? Or is it just an oversight in
> userspace routing implementation?

Oh my.  If I ever knew this, I don't now.  I thought it was Pravin who
originally wrote this code, although the history suggests that Ethan was
also involved.

Pravin, Ethan, do you have any thoughts here?  Lots of context below.

> Hi Ben,
> 
> I was trying to make local_ip work for my needs and hit some
> unexpected connectivity loss between tunnel endpoints when source IPs
> enforced in port config. Perhaps there's something I am missing about
> using local_ip with ovs userspace routing. Perhaps you'll be able to
> validate my thinking below.
> 
> So, for OVN purposes, I would like to have two tunnel ports that
> differ in local_ip only (same remote_ip). In OVN test suite, the
> following configuration is set up:
> 
> (br-int<->br-phys) - <n1> - (br-phys<->br-int)
> 
> Each (...) block is a separate (emulated) host with its own OVS/OVN
> stack, interconnected through a n1 global bridge that emulates fabric
> (everything runs on the same physical host using three sets of OVS
> software). Each br-int has a geneve port that has both remote_ip and
> local_ip set (point to each other). When br-phys bridges are created,
> the following commands are executed to set up userspace IP/routing
> stack inside OVS:
> 
> ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
> ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1
> 
> where $ip on two sides belong to the same /$masklen subnet,
> $bridge==br-phys. By doing this, OVS route entries are created for the
> subnet.
> 
> Without local_ip set for tunnel ports, traffic gets tagged on br-int,
> and, as per created route entries, sent over br-phys->n1->br-phys
> chain. When local_ip are set on tunnel ports, then packets are lost
> (they don't reach br-phys).
> 
> After some debugging, I realized this happens because
> ovs_router_lookup returns false when looking for a proper routing
> entry.
> 
> This happens because a code path added by
> 8e4e45887ec3eb5f5833fd7b415b63ff47fc9642 ("ofproto-dpif-xlate: makes
> OVS native tunneling honor tunnel-specified source addresses") is
> triggered, where an attempt is made to check that the source IP indeed
> belongs to the host. For that, the entry added by the ovs/route/add
> command above is extracted and then its corresponding route entry
> 'local' property is checked. Since all route entries added by the
> command are set to local==false, the code assumes it's not a local IP
> address and it denies to steer the packet forward.
> 
> There's probably a reason why ovs/route/add doesn't support setting
> local==true. My understanding is that in real life, local routing
> table entries are created *implicitly* when a new IP address is added
> to an interface. If one were to try to emulate the same behavior in
> OVS routing userspace, one would probably have to add corresponding
> routing entry points when adding a IP address to the bridge via
> netdev-dummy/ip[46]addr commands.
> 
> When I modify netdev_dummy_ip4addr to do the following after setting
> the IP address for the bridge, the test in question now works:
> 
> diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
> index 71df29184..ee01381d2 100644
> --- a/lib/netdev-dummy.c
> +++ b/lib/netdev-dummy.c
> @@ -39,6 +39,7 @@
>  #include "pcap-file.h"
>  #include "openvswitch/poll-loop.h"
>  #include "openvswitch/shash.h"
> +#include "ovs-router.h"
>  #include "sset.h"
>  #include "stream.h"
>  #include "unaligned.h"
> @@ -1941,6 +1942,12 @@ netdev_dummy_ip4addr(struct unixctl_conn *conn,
> int argc OVS_UNUSED,
>          error = ip_parse_masked(argv[2], &ip.s_addr, &mask.s_addr);
>          if (!error) {
>              netdev_dummy_set_in4(netdev, ip, mask);
> +
> +            // insert local route entry for the new address
> +            struct in6_addr ip6;
> +            in6_addr_set_mapped_ipv4(&ip6, ip.s_addr);
> +            ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any);
> +
>              unixctl_command_reply(conn, "OK");
>          } else {
>              unixctl_command_reply_error(conn, error);
> @@ -1970,6 +1977,10 @@ netdev_dummy_ip6addr(struct unixctl_conn *conn,
> int argc OVS_UNUSED,
> 
>              mask = ipv6_create_mask(plen);
>              netdev_dummy_set_in6(netdev, &ip6, &mask);
> +
> +            // insert local route entry for the new address
> +            ovs_router_insert(0, &ip6, 128, true, argv[1], &in6addr_any);
> +
>              unixctl_command_reply(conn, "OK");
>          } else {
>              unixctl_command_reply_error(conn, error);
> 
> (I believe the code should also remove the local entries on IP removed.)
> 
> Is there a good reason why netdev-dummy/ip4addr doesn't set local
> routing entries for the new IP? Or is it just an oversight in
> userspace routing implementation?
> 
> Thanks for any suggestions,
> Ihar
> 
> On Fri, Nov 20, 2020 at 2:32 PM Ben Pfaff <blp@ovn.org> wrote:
> >
> > On Fri, Nov 20, 2020 at 02:23:04PM -0500, Ihar Hrachyshka wrote:
> > > On Fri, Nov 20, 2020 at 1:16 PM Ben Pfaff <blp@ovn.org> wrote:
> > > >
> > > > On Fri, Nov 20, 2020 at 12:44:30PM -0500, Ihar Hrachyshka wrote:
> > > > > On Thu, Nov 19, 2020 at 12:19 PM Ben Pfaff <blp@ovn.org> wrote:
> > > > > >
> > > > > > On Thu, Nov 19, 2020 at 08:31:44AM -0500, Ihar Hrachyshka wrote:
> > > > > > > On Thu, Nov 19, 2020 at 12:51 AM Ben Pfaff <blp@ovn.org> wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 18, 2020 at 10:35:01PM -0500, Ihar Hrachyshka wrote:
> > > > > > > > > It's a legal setup where tunnel ports with the same config are created
> > > > > > > > > on different bridges served by Open vSwitch. Specifically, multiple
> > > > > > > > > OVN controllers may emulate multiple chassis running on the same
> > > > > > > > > physical host, in which case they may need to create separate tunnel
> > > > > > > > > ports to connect to the same remote chassis on their respective
> > > > > > > > > bridges.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > > > > > > >
> > > > > > > > That makes sense for sending packets, but which one is supposed to
> > > > > > > > receive a packet when one arrives for that tunnel?
> > > > > > > >
> > > > > > >
> > > > > > > If we are talking in OVN context, each virtual ovn-controller chassis
> > > > > > > on the same host has to have a different IP / port to distinguish
> > > > > > > between chassis. I don't think it's a useful configuration to have two
> > > > > > > two incoming tunnel ports with the same config for the same IP that
> > > > > > > are not served by separate DST IP addresses.
> > > > > >
> > > > > > I think that this code rejects tunnel ports with exactly the same
> > > > > > configuration, though.  If the two ports were configured with different
> > > > > > local IP addresses, then this code would not flag a conflict.  Do I
> > > > > > misunderstand?
> > > > > >
> > > > >
> > > > > Local IP addresses are not part of tunnel interface options map. Only
> > > > > remote_ip and dst_port are. Does it address your concern?
> > > >
> > > > I don't see where you're getting that.  Tunnel configuration allows both
> > > > local and remote IPs to be specified.  The match structure has slots for
> > > > both IP addresses, and both of them may be specified.
> > > >
> > > > If the remote (or local) IP addresses are different, then this warning
> > > > won't come up.  And you said that the remote IP addresses are different:
> > > > "I don't think it's a useful configuration to have two two incoming
> > > > tunnel ports with the same config for the same IP that are not served by
> > > > separate DST IP addresses."
> > > >
> > > > tnl_port_receive() looks up the tunnel that should receive a packet.  It
> > > > calls into tnl_find().  This calls tnl_find_exact() for each of several
> > > > lookup tables.  In the case where a warning would be given here, we know
> > > > that tnl_find_exact() would find two matching elements in the map,
> > > > because that's the function that we use to know to issue the warning.
> > > > If we disable the warning and insert the second match anyway, there will
> > > > be two elements in the map with exactly the same key, tnl_find_exact()
> > > > will only ever return one of them, thus the other will never receive any
> > > > packets.
> > > >
> > > > What am I missing?
> > > >
> > >
> > > Oh I missed that local_ip *may* be configured for a tunnel endpoint,
> > > it's just not configured by OVN controller *right now*. We can deal
> > > with it in OVN controller instead of OVS then. Sorry for the noise,
> > > this patch can be dropped.
> >
> > Oh, wonderful, we understand each other now, and there's nothing
> > mysterious going on in the code that I didn't know about.  Cool!
> >
>
diff mbox series

Patch

diff --git a/ofproto/tunnel.c b/ofproto/tunnel.c
index 3455ed233..fc45d40f9 100644
--- a/ofproto/tunnel.c
+++ b/ofproto/tunnel.c
@@ -148,11 +148,11 @@  ofproto_tunnel_init(void)
 
 static bool
 tnl_port_add__(const struct ofport_dpif *ofport, const struct netdev *netdev,
-               odp_port_t odp_port, bool warn, bool native_tnl, const char name[])
+               odp_port_t odp_port, bool warn OVS_UNUSED, bool native_tnl,
+               const char name[])
     OVS_REQ_WRLOCK(rwlock)
 {
     const struct netdev_tunnel_config *cfg;
-    struct tnl_port *existing_port;
     struct tnl_port *tnl_port;
     struct hmap **map;
 
@@ -174,21 +174,6 @@  tnl_port_add__(const struct ofport_dpif *ofport, const struct netdev *netdev,
     tnl_port->match.pt_mode = netdev_get_pt_mode(netdev);
 
     map = tnl_match_map(&tnl_port->match);
-    existing_port = tnl_find_exact(&tnl_port->match, *map);
-    if (existing_port) {
-        if (warn) {
-            struct ds ds = DS_EMPTY_INITIALIZER;
-            tnl_match_fmt(&tnl_port->match, &ds);
-            VLOG_WARN("%s: attempting to add tunnel port with same config as "
-                      "port '%s' (%s)", tnl_port_get_name(tnl_port),
-                      tnl_port_get_name(existing_port), ds_cstr(&ds));
-            ds_destroy(&ds);
-        }
-        netdev_close(tnl_port->netdev);
-        free(tnl_port);
-        return false;
-    }
-
     hmap_insert(ofport_map, &tnl_port->ofport_node, hash_pointer(ofport, 0));
 
     if (!*map) {