[net-next,v2,0/6] Support PMTU discovery with bridged UDP tunnels

Message ID cover.1596520062.git.sbrivio@redhat.com

Message

Stefano Brivio Aug. 4, 2020, 5:53 a.m. UTC
Currently, PMTU discovery for UDP tunnels only works if packets are
routed to the encapsulating interfaces, not bridged.

This results from the fact that we generally don't have valid routes
to the senders we can use to relay ICMP and ICMPv6 errors, and makes
PMTU discovery completely non-functional for VXLAN and GENEVE ports of
both regular bridges and Open vSwitch instances.

If the sender is local, and packets are forwarded to the port by a
regular bridge, all it takes is to generate a corresponding route
exception on the encapsulating device. The bridge then finds the route
exception carrying the PMTU value estimate as it forwards frames, and
relays ICMP messages back to the socket of the local sender. Patch 1/6
fixes this case.

If the sender resides on another node, we actually need to reply to
IP and IPv6 packets ourselves and send these ICMP or ICMPv6 errors
back, using the same encapsulating device. Patch 2/6, based on an
original idea by Florian Westphal, adds the needed functionality,
while patches 3/6 and 4/6 add matching support for VXLAN and GENEVE.
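
As a rough illustration of the remote-sender case (a sketch only, not
the code from this series): the MTU to advertise back is the outer path
MTU minus the encapsulation overhead, and for an inner IPv4 packet the
reply is an ICMP "Fragmentation Needed" message carrying that value.
The function names below are hypothetical, and the overhead table
assumes no IP options, no IPv6 extension headers, no VLAN tag and no
GENEVE options. The ICMPv6 "Packet Too Big" equivalent is left out,
since its checksum also covers a pseudo-header.

```python
import struct

# Encapsulation overhead in bytes. Assumptions: plain outer IP header,
# UDP, 8-byte VXLAN or GENEVE base header, untagged inner Ethernet header.
OUTER_IP_HDR = {4: 20, 6: 40}
UDP_HDR = 8
TUNNEL_HDR = {"vxlan": 8, "geneve": 8}
INNER_ETH_HDR = 14


def inner_mtu(outer_pmtu, outer_version, tunnel):
    """Largest inner IP packet that still fits in one outer packet."""
    overhead = (OUTER_IP_HDR[outer_version] + UDP_HDR
                + TUNNEL_HDR[tunnel] + INNER_ETH_HDR)
    return outer_pmtu - overhead


def checksum16(data):
    """Internet checksum (RFC 1071) over a byte string."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def icmp4_frag_needed(mtu, offending_packet):
    """Build an ICMPv4 type 3, code 4 message (RFC 1191 layout).

    The message quotes the offending IPv4 header plus the first 8 bytes
    of its payload, and carries the next-hop MTU in the last 16 bits of
    the ICMP header.
    """
    ihl = (offending_packet[0] & 0x0F) * 4
    quoted = offending_packet[:ihl + 8]
    csum = checksum16(struct.pack("!BBHHH", 3, 4, 0, 0, mtu) + quoted)
    return struct.pack("!BBHHH", 3, 4, csum, 0, mtu) + quoted
```

With these assumptions, a 1500-byte outer path MTU gives an inner MTU
of 1450 bytes for an IPv4 underlay and 1430 bytes for an IPv6 one.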

Finally, 5/6 and 6/6 introduce selftests for all combinations of
inner and outer IP versions, covering both VXLAN and GENEVE, with
both regular bridges and Open vSwitch instances.
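
For a sense of the coverage this implies, the combinations can be
enumerated as below. This is only a sketch: the labels it prints are
made up for illustration and are not the test names used in pmtu.sh.

```python
from itertools import product

INNER_IP = (4, 6)
OUTER_IP = (4, 6)
TUNNELS = ("vxlan", "geneve")
SWITCHES = ("bridge", "ovs")


def test_matrix():
    """Every inner/outer IP version, tunnel type and switch type combination."""
    return [
        {"inner": i, "outer": o, "tunnel": t, "switch": s}
        for s, t, o, i in product(SWITCHES, TUNNELS, OUTER_IP, INNER_IP)
    ]


def label(case):
    """Hypothetical label, e.g. inner IPv4 over a bridged VXLAN/IPv4 port."""
    return "ipv{inner}_{switch}_{tunnel}{outer}".format(**case)


if __name__ == "__main__":
    for case in test_matrix():
        print(label(case))
```

Two inner versions, two outer versions, two tunnel types and two switch
types give 16 cases in total.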

v2: Add helper to check for any bridge port, skip oif check for PMTU
    routes for bridge ports only, split IPv4 and IPv6 helpers and
    functions (all suggested by David Ahern)

Stefano Brivio (6):
  ipv4: route: Ignore output interface in FIB lookup for PMTU route
  tunnels: PMTU discovery support for directly bridged IP packets
  vxlan: Support for PMTU discovery on directly bridged links
  geneve: Support for PMTU discovery on directly bridged links
  selftests: pmtu.sh: Add tests for bridged UDP tunnels
  selftests: pmtu.sh: Add tests for UDP tunnels handled by Open vSwitch

 drivers/net/bareudp.c               |   5 +-
 drivers/net/geneve.c                |  55 ++++-
 drivers/net/vxlan.c                 |  47 +++-
 include/linux/netdevice.h           |   5 +
 include/net/dst.h                   |  10 -
 include/net/ip_tunnels.h            |   2 +
 net/ipv4/ip_tunnel_core.c           | 244 +++++++++++++++++++
 net/ipv4/route.c                    |   5 +
 tools/testing/selftests/net/pmtu.sh | 347 +++++++++++++++++++++++++++-
 9 files changed, 691 insertions(+), 29 deletions(-)

Comments

David Miller Aug. 4, 2020, 8:02 p.m. UTC | #1
From: Stefano Brivio <sbrivio@redhat.com>
Date: Tue,  4 Aug 2020 07:53:41 +0200

> Currently, PMTU discovery for UDP tunnels only works if packets are
> routed to the encapsulating interfaces, not bridged.
> 
> This results from the fact that we generally don't have valid routes
> to the senders we can use to relay ICMP and ICMPv6 errors, and makes
> PMTU discovery completely non-functional for VXLAN and GENEVE ports of
> both regular bridges and Open vSwitch instances.
> 
> If the sender is local, and packets are forwarded to the port by a
> regular bridge, all it takes is to generate a corresponding route
> exception on the encapsulating device. The bridge then finds the route
> exception carrying the PMTU value estimate as it forwards frames, and
> relays ICMP messages back to the socket of the local sender. Patch 1/6
> fixes this case.
> 
> If the sender resides on another node, we actually need to reply to
> IP and IPv6 packets ourselves and send these ICMP or ICMPv6 errors
> back, using the same encapsulating device. Patch 2/6, based on an
> original idea by Florian Westphal, adds the needed functionality,
> while patches 3/6 and 4/6 add matching support for VXLAN and GENEVE.
> 
> Finally, 5/6 and 6/6 introduce selftests for all combinations of
> inner and outer IP versions, covering both VXLAN and GENEVE, with
> both regular bridges and Open vSwitch instances.
> 
> v2: Add helper to check for any bridge port, skip oif check for PMTU
>     routes for bridge ports only, split IPv4 and IPv6 helpers and
>     functions (all suggested by David Ahern)

Series applied with the extraneous newline in the selftest changes of
patch #5 removed.

Thank you.