
[net-next,0/6] Support PMTU discovery with bridged UDP tunnels

Message ID cover.1596487323.git.sbrivio@redhat.com

Message

Stefano Brivio Aug. 3, 2020, 8:52 p.m. UTC
Currently, PMTU discovery for UDP tunnels only works if packets are
routed to the encapsulating interfaces, not bridged.

This results from the fact that we generally don't have valid routes
to the senders we can use to relay ICMP and ICMPv6 errors, and makes
PMTU discovery completely non-functional for VXLAN and GENEVE ports of
both regular bridges and Open vSwitch instances.

If the sender is local, and packets are forwarded to the port by a
regular bridge, all it takes is to generate a corresponding route
exception on the encapsulating device. The bridge then finds the route
exception carrying the PMTU value estimate as it forwards frames, and
relays ICMP messages back to the socket of the local sender. Patch 1/6
fixes this case.
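
As a rough illustration (not part of the series), the PMTU estimate stored in such a route exception has to leave room for the encapsulation headers. The sketch below only shows the arithmetic, assuming standard header sizes, no IP options and no VLAN tag on the inner frame. The function names are hypothetical and do not correspond to kernel symbols.

```python
# Header sizes in bytes. Assumptions: no IPv4 options, no IPv6 extension
# headers, untagged inner Ethernet frame.
OUTER_IP_HDR = {4: 20, 6: 40}
UDP_HDR = 8
TUNNEL_HDR = {"vxlan": 8, "geneve": 8}  # GENEVE: base header only
INNER_ETH_HDR = 14


def tunnel_overhead(tunnel: str, outer_ip_version: int,
                    geneve_opts_len: int = 0) -> int:
    """Bytes added in front of the inner IP packet by the encapsulation."""
    # GENEVE may carry variable-length options; VXLAN has none.
    opts = geneve_opts_len if tunnel == "geneve" else 0
    return (OUTER_IP_HDR[outer_ip_version] + UDP_HDR
            + TUNNEL_HDR[tunnel] + opts + INNER_ETH_HDR)


def inner_pmtu(underlay_pmtu: int, tunnel: str, outer_ip_version: int,
               geneve_opts_len: int = 0) -> int:
    """Largest inner IP packet that fits in the underlay path MTU."""
    return underlay_pmtu - tunnel_overhead(tunnel, outer_ip_version,
                                           geneve_opts_len)
```

With a 1500-byte underlay path, this gives 1450 bytes for VXLAN over IPv4 and 1430 bytes over IPv6.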

If the sender resides on another node, we actually need to reply to
IP and IPv6 packets ourselves and send these ICMP or ICMPv6 errors
back, using the same encapsulating device. Patch 2/6, based on an
original idea by Florian Westphal, adds the needed functionality,
while patches 3/6 and 4/6 add matching support for VXLAN and GENEVE.
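
For reference, the IPv4 error in question is ICMP Destination Unreachable with code "Fragmentation Needed" (type 3, code 4), which carries the next-hop MTU; the IPv6 counterpart is ICMPv6 Packet Too Big (type 2, code 0). The sketch below builds only the ICMPv4 message as it appears on the wire, quoting the original IP header plus 8 bytes of payload. It is a userspace illustration with hypothetical function names, not the code added by this series.

```python
import struct

ICMP_DEST_UNREACH = 3   # ICMP type
ICMP_FRAG_NEEDED = 4    # ICMP code, "fragmentation needed and DF set"


def inet_checksum(data: bytes) -> int:
    """16-bit one's complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frag_needed(orig_packet: bytes, next_hop_mtu: int) -> bytes:
    """Build an ICMPv4 'Fragmentation Needed' message for orig_packet.

    orig_packet is the offending inner IPv4 packet, starting at its
    IP header. Only the header plus 8 payload bytes are quoted here.
    """
    ihl = (orig_packet[0] & 0x0F) * 4
    quoted = orig_packet[:ihl + 8]
    # Layout: type, code, checksum, unused (16 bits), next-hop MTU (16 bits).
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                         0, 0, next_hop_mtu)
    csum = inet_checksum(header + quoted)
    header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
                         csum, 0, next_hop_mtu)
    return header + quoted
```

A receiver can validate the result by running the checksum over the whole message, which yields zero for an intact message.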

Finally, 5/6 and 6/6 introduce selftests for all combinations of
inner and outer IP versions, covering both VXLAN and GENEVE, with
both regular bridges and Open vSwitch instances.
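
To give an idea of the size of that matrix, the combinations can be enumerated as below. This assumes every combination gets its own test case, which is an assumption made for illustration; the tuple layout is hypothetical and does not reflect the test names used in pmtu.sh.

```python
from itertools import product

INNER_IP_VERSIONS = (4, 6)
OUTER_IP_VERSIONS = (4, 6)
TUNNEL_TYPES = ("vxlan", "geneve")
SWITCH_TYPES = ("bridge", "ovs")  # regular bridge, Open vSwitch


def test_matrix():
    """Every (inner IP, switch, tunnel, outer IP) combination."""
    return list(product(INNER_IP_VERSIONS, SWITCH_TYPES,
                        TUNNEL_TYPES, OUTER_IP_VERSIONS))


if __name__ == "__main__":
    for inner, switch, tunnel, outer in test_matrix():
        print(f"inner IPv{inner}, {switch}, {tunnel} over IPv{outer}")
```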

Stefano Brivio (6):
  ipv4: route: Ignore output interface in FIB lookup for PMTU route
  tunnels: PMTU discovery support for directly bridged IP packets
  vxlan: Support for PMTU discovery on directly bridged links
  geneve: Support for PMTU discovery on directly bridged links
  selftests: pmtu.sh: Add tests for bridged UDP tunnels
  selftests: pmtu.sh: Add tests for UDP tunnels handled by Open vSwitch

 drivers/net/bareudp.c               |   5 +-
 drivers/net/geneve.c                |  57 ++++-
 drivers/net/vxlan.c                 |  49 +++-
 include/net/dst.h                   |  10 -
 include/net/ip_tunnels.h            |  88 +++++++
 net/ipv4/ip_tunnel_core.c           | 122 ++++++++++
 net/ipv4/route.c                    |   1 +
 tools/testing/selftests/net/pmtu.sh | 347 +++++++++++++++++++++++++++-
 8 files changed, 650 insertions(+), 29 deletions(-)

Comments

Florian Westphal Aug. 3, 2020, 11:28 p.m. UTC | #1
Stefano Brivio <sbrivio@redhat.com> wrote:
> Currently, PMTU discovery for UDP tunnels only works if packets are
> routed to the encapsulating interfaces, not bridged.
> 
> This results from the fact that we generally don't have valid routes
> to the senders we can use to relay ICMP and ICMPv6 errors, and makes
> PMTU discovery completely non-functional for VXLAN and GENEVE ports of
> both regular bridges and Open vSwitch instances.
> 
> If the sender is local, and packets are forwarded to the port by a
> regular bridge, all it takes is to generate a corresponding route
> exception on the encapsulating device. The bridge then finds the route
> exception carrying the PMTU value estimate as it forwards frames, and
> relays ICMP messages back to the socket of the local sender. Patch 1/6
> fixes this case.
> 
> If the sender resides on another node, we actually need to reply to
> IP and IPv6 packets ourselves and send these ICMP or ICMPv6 errors
> back, using the same encapsulating device. Patch 2/6, based on an
> original idea by Florian Westphal, adds the needed functionality,
> while patches 3/6 and 4/6 add matching support for VXLAN and GENEVE.
> 
> Finally, 5/6 and 6/6 introduce selftests for all combinations of
> inner and outer IP versions, covering both VXLAN and GENEVE, with
> both regular bridges and Open vSwitch instances.

Thanks for taking over and bringing this into shape, this looks good to
me.

Given that such setups will easily get stuck on the first PMTU update,
it would be good to get this applied now, even though the merge window
is already open.
David Ahern Aug. 3, 2020, 11:46 p.m. UTC | #2
On 8/3/20 5:28 PM, Florian Westphal wrote:
> Stefano Brivio <sbrivio@redhat.com> wrote:
>> Currently, PMTU discovery for UDP tunnels only works if packets are
>> routed to the encapsulating interfaces, not bridged.
>>
>> This results from the fact that we generally don't have valid routes
>> to the senders we can use to relay ICMP and ICMPv6 errors, and makes
>> PMTU discovery completely non-functional for VXLAN and GENEVE ports of
>> both regular bridges and Open vSwitch instances.
>>
>> If the sender is local, and packets are forwarded to the port by a
>> regular bridge, all it takes is to generate a corresponding route
>> exception on the encapsulating device. The bridge then finds the route
>> exception carrying the PMTU value estimate as it forwards frames, and
>> relays ICMP messages back to the socket of the local sender. Patch 1/6
>> fixes this case.
>>
>> If the sender resides on another node, we actually need to reply to
>> IP and IPv6 packets ourselves and send these ICMP or ICMPv6 errors
>> back, using the same encapsulating device. Patch 2/6, based on an
>> original idea by Florian Westphal, adds the needed functionality,
>> while patches 3/6 and 4/6 add matching support for VXLAN and GENEVE.
>>
>> Finally, 5/6 and 6/6 introduce selftests for all combinations of
>> inner and outer IP versions, covering both VXLAN and GENEVE, with
>> both regular bridges and Open vSwitch instances.
> 
> Thanks for taking over and bringing this into shape, this looks good to
> me.

+1. I'm sure this took quite a bit of your time. Thanks for doing that.
I like this version much better.
David Miller Aug. 4, 2020, 1:25 a.m. UTC | #3
From: Stefano Brivio <sbrivio@redhat.com>
Date: Mon,  3 Aug 2020 22:52:08 +0200

> Currently, PMTU discovery for UDP tunnels only works if packets are
> routed to the encapsulating interfaces, not bridged.
> 
> This results from the fact that we generally don't have valid routes
> to the senders we can use to relay ICMP and ICMPv6 errors, and makes
> PMTU discovery completely non-functional for VXLAN and GENEVE ports of
> both regular bridges and Open vSwitch instances.
> 
> If the sender is local, and packets are forwarded to the port by a
> regular bridge, all it takes is to generate a corresponding route
> exception on the encapsulating device. The bridge then finds the route
> exception carrying the PMTU value estimate as it forwards frames, and
> relays ICMP messages back to the socket of the local sender. Patch 1/6
> fixes this case.
> 
> If the sender resides on another node, we actually need to reply to
> IP and IPv6 packets ourselves and send these ICMP or ICMPv6 errors
> back, using the same encapsulating device. Patch 2/6, based on an
> original idea by Florian Westphal, adds the needed functionality,
> while patches 3/6 and 4/6 add matching support for VXLAN and GENEVE.
> 
> Finally, 5/6 and 6/6 introduce selftests for all combinations of
> inner and outer IP versions, covering both VXLAN and GENEVE, with
> both regular bridges and Open vSwitch instances.

Please address the feedback you've received and I will apply this
series, thank you.