Message ID | 20170908070142.4440-1-daniel.axtens@canonical.com |
---|---|
Headers | show |
Series | Fixes for LP#1715812 | expand |
On 08/09/17 08:01, Daniel Axtens wrote: > (This is the Xenial patchset. Some patches required minor backporting.) > > [SRU Justification] > > [Impact] > A host can lose access to another host whose MAC address changes if > they have active connections to other hosts that share a route. The > ARP cache does not time out as expected - instead the old MAC address > is continuously reconfirmed. > > [Fix] > Apply series [1], which changes the algorithm for neighbour confirmation. > That is, from upstream: > 51ce8bd4d17a net: pending_confirm is not used anymore > 0dec879f636f net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP > 63fca65d0863 net: add confirm_neigh method to dst_ops > c3a2e8370534 tcp: replace dst_confirm with sk_dst_confirm > c86a773c7802 sctp: add dst_pending_confirm flag > 4ff0620354f2 net: add dst_pending_confirm flag to skbuff > 9b8805a32559 sock: add sk_dst_pending_confirm flag > > [Test case] > Create 3 real or virtual systems, all hooked up to a switch. > One system needs an active-backup bond with fail_over_mac=1 num_grat_arp=0. > > Put all the systems in the same subnet, e.g. 192.168.200.0/24 > > Call the system with the bond A, and the other two systems B and C. > > On B, run in 3 shells: > - netperf -t TCP_RR to C > - ping -f A > - watch 'ip -s neigh show 192.168.200.0/24' > > On A, cause the bond to fail over. > > Observe that: > > - without the patches, B intermittently fails to notice the change in > A's MAC address. This presents as the ping failing and not > recovering, and the arp table showing the old mac address never > timing out and never being replace with a new mac address. > > - with the patches, the arp cache times out and B sends another mac > probe and detects A's new address. > > It helps to use taskset to put ping and netperf on the same CPU, or use single-CPU vms. > > See [2] for more details. > > [References] > [2] Original report: https://www.mail-archive.com/netdev@vger.kernel.org/msg138762.html > [1]: https://www.spinics.net/lists/linux-rdma/msg45907.html > > Julian Anastasov (7): > sock: add sk_dst_pending_confirm flag > net: add dst_pending_confirm flag to skbuff > sctp: add dst_pending_confirm flag > tcp: replace dst_confirm with sk_dst_confirm > net: add confirm_neigh method to dst_ops > net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP > net: pending_confirm is not used anymore > > drivers/net/vrf.c | 5 ++++- > include/linux/skbuff.h | 12 ++++++++++++ > include/net/arp.h | 16 ++++++++++++++++ > include/net/dst.h | 21 +++++++++------------ > include/net/dst_ops.h | 2 ++ > include/net/ndisc.h | 17 +++++++++++++++++ > include/net/sctp/sctp.h | 6 ++---- > include/net/sctp/structs.h | 4 ++++ > include/net/sock.h | 25 +++++++++++++++++++++++++ > net/core/dst.c | 1 - > net/core/sock.c | 2 ++ > net/ipv4/ip_output.c | 11 ++++++++++- > net/ipv4/ping.c | 3 ++- > net/ipv4/raw.c | 6 +++++- > net/ipv4/route.c | 19 +++++++++++++++++++ > net/ipv4/tcp_input.c | 12 +++--------- > net/ipv4/tcp_metrics.c | 7 ++----- > net/ipv4/tcp_output.c | 2 ++ > net/ipv4/udp.c | 3 ++- > net/ipv6/ip6_output.c | 7 +++++++ > net/ipv6/raw.c | 6 +++++- > net/ipv6/route.c | 43 ++++++++++++++++++++++++++++++------------- > net/ipv6/udp.c | 3 ++- > net/l2tp/l2tp_ip6.c | 3 ++- > net/sctp/associola.c | 3 +-- > net/sctp/output.c | 10 +++++++++- > net/sctp/outqueue.c | 2 +- > net/sctp/sm_make_chunk.c | 6 ++---- > net/sctp/sm_sideeffect.c | 2 +- > net/sctp/socket.c | 4 ++-- > net/sctp/transport.c | 16 +++++++++++++++- > net/xfrm/xfrm_policy.c | 19 +++++++++++++++++++ > 32 files changed, 234 insertions(+), 64 deletions(-) > Firstly, this patchset does touch a fair amount of code across the network stack, so I was originally reluctant to say this is OK for a SRU. However, these are correct looking upstream packports and clean cherry picks that legitimately address bug, so it seems reasonable to apply these. I noticed that [PATCH 3/7] sctp: add dst_pending_confirm flag, commit c86a773c78025f5b825bacd7b846f4fa60dc0317 has a trivial "cosmetic" fix to it from upstream: commit 486a43db2e26b87125b5629e1ade516f90833934 Author: Xin Long <lucien.xin@gmail.com> Date: Sat Mar 18 19:12:22 2017 +0800 sctp: remove temporary variable confirm from sctp_packet_transmit however, this is a trivial fix and probably can be ignored. I've given these patches a workout with a network stress-test and I see no regressions. After reading some notes on this patch set from Jay also increases my confidence in Ack'ing this patch set... so... Acked-by: Colin Ian King <colin.king@canonical.com>
On 08.09.2017 09:01, Daniel Axtens wrote: > (This is the Xenial patchset. Some patches required minor backporting.) > > [SRU Justification] > > [Impact] > A host can lose access to another host whose MAC address changes if > they have active connections to other hosts that share a route. The > ARP cache does not time out as expected - instead the old MAC address > is continuously reconfirmed. > > [Fix] > Apply series [1], which changes the algorithm for neighbour confirmation. > That is, from upstream: > 51ce8bd4d17a net: pending_confirm is not used anymore > 0dec879f636f net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP > 63fca65d0863 net: add confirm_neigh method to dst_ops > c3a2e8370534 tcp: replace dst_confirm with sk_dst_confirm > c86a773c7802 sctp: add dst_pending_confirm flag > 4ff0620354f2 net: add dst_pending_confirm flag to skbuff > 9b8805a32559 sock: add sk_dst_pending_confirm flag > > [Test case] > Create 3 real or virtual systems, all hooked up to a switch. > One system needs an active-backup bond with fail_over_mac=1 num_grat_arp=0. > > Put all the systems in the same subnet, e.g. 192.168.200.0/24 > > Call the system with the bond A, and the other two systems B and C. > > On B, run in 3 shells: > - netperf -t TCP_RR to C > - ping -f A > - watch 'ip -s neigh show 192.168.200.0/24' > > On A, cause the bond to fail over. > > Observe that: > > - without the patches, B intermittently fails to notice the change in > A's MAC address. This presents as the ping failing and not > recovering, and the arp table showing the old mac address never > timing out and never being replace with a new mac address. > > - with the patches, the arp cache times out and B sends another mac > probe and detects A's new address. > > It helps to use taskset to put ping and netperf on the same CPU, or use single-CPU vms. > > See [2] for more details. > > [References] > [2] Original report: https://www.mail-archive.com/netdev@vger.kernel.org/msg138762.html > [1]: https://www.spinics.net/lists/linux-rdma/msg45907.html > > Julian Anastasov (7): > sock: add sk_dst_pending_confirm flag > net: add dst_pending_confirm flag to skbuff > sctp: add dst_pending_confirm flag > tcp: replace dst_confirm with sk_dst_confirm > net: add confirm_neigh method to dst_ops > net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP > net: pending_confirm is not used anymore > > drivers/net/vrf.c | 5 ++++- > include/linux/skbuff.h | 12 ++++++++++++ > include/net/arp.h | 16 ++++++++++++++++ > include/net/dst.h | 21 +++++++++------------ > include/net/dst_ops.h | 2 ++ > include/net/ndisc.h | 17 +++++++++++++++++ > include/net/sctp/sctp.h | 6 ++---- > include/net/sctp/structs.h | 4 ++++ > include/net/sock.h | 25 +++++++++++++++++++++++++ > net/core/dst.c | 1 - > net/core/sock.c | 2 ++ > net/ipv4/ip_output.c | 11 ++++++++++- > net/ipv4/ping.c | 3 ++- > net/ipv4/raw.c | 6 +++++- > net/ipv4/route.c | 19 +++++++++++++++++++ > net/ipv4/tcp_input.c | 12 +++--------- > net/ipv4/tcp_metrics.c | 7 ++----- > net/ipv4/tcp_output.c | 2 ++ > net/ipv4/udp.c | 3 ++- > net/ipv6/ip6_output.c | 7 +++++++ > net/ipv6/raw.c | 6 +++++- > net/ipv6/route.c | 43 ++++++++++++++++++++++++++++++------------- > net/ipv6/udp.c | 3 ++- > net/l2tp/l2tp_ip6.c | 3 ++- > net/sctp/associola.c | 3 +-- > net/sctp/output.c | 10 +++++++++- > net/sctp/outqueue.c | 2 +- > net/sctp/sm_make_chunk.c | 6 ++---- > net/sctp/sm_sideeffect.c | 2 +- > net/sctp/socket.c | 4 ++-- > net/sctp/transport.c | 16 +++++++++++++++- > net/xfrm/xfrm_policy.c | 19 +++++++++++++++++++ > 32 files changed, 234 insertions(+), 64 deletions(-) > Acked-by: Stefan Bader <stefan.bader@canonical.com> Trusting Colin did test the backports individually (because the related bug report does not explicitly state whether the test case has ben run with test kernels for Zesty and/or Xenial). One thing I forgot to mention in the Zesty ACK: I also only found that one fixup which Colin mentioned. The only noticeable difference would be that the stack usage is minimally lower due to the dropped local variable. I would not thing this warrants its inclusion but would leave it to Daniel or Jay to decide. -Stefan
On 08.09.2017 09:01, Daniel Axtens wrote: > (This is the Xenial patchset. Some patches required minor backporting.) > > [SRU Justification] > > [Impact] > A host can lose access to another host whose MAC address changes if > they have active connections to other hosts that share a route. The > ARP cache does not time out as expected - instead the old MAC address > is continuously reconfirmed. > > [Fix] > Apply series [1], which changes the algorithm for neighbour confirmation. > That is, from upstream: > 51ce8bd4d17a net: pending_confirm is not used anymore > 0dec879f636f net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP > 63fca65d0863 net: add confirm_neigh method to dst_ops > c3a2e8370534 tcp: replace dst_confirm with sk_dst_confirm > c86a773c7802 sctp: add dst_pending_confirm flag > 4ff0620354f2 net: add dst_pending_confirm flag to skbuff > 9b8805a32559 sock: add sk_dst_pending_confirm flag > > [Test case] > Create 3 real or virtual systems, all hooked up to a switch. > One system needs an active-backup bond with fail_over_mac=1 num_grat_arp=0. > > Put all the systems in the same subnet, e.g. 192.168.200.0/24 > > Call the system with the bond A, and the other two systems B and C. > > On B, run in 3 shells: > - netperf -t TCP_RR to C > - ping -f A > - watch 'ip -s neigh show 192.168.200.0/24' > > On A, cause the bond to fail over. > > Observe that: > > - without the patches, B intermittently fails to notice the change in > A's MAC address. This presents as the ping failing and not > recovering, and the arp table showing the old mac address never > timing out and never being replace with a new mac address. > > - with the patches, the arp cache times out and B sends another mac > probe and detects A's new address. > > It helps to use taskset to put ping and netperf on the same CPU, or use single-CPU vms. > > See [2] for more details. > > [References] > [2] Original report: https://www.mail-archive.com/netdev@vger.kernel.org/msg138762.html > [1]: https://www.spinics.net/lists/linux-rdma/msg45907.html > > Julian Anastasov (7): > sock: add sk_dst_pending_confirm flag > net: add dst_pending_confirm flag to skbuff > sctp: add dst_pending_confirm flag > tcp: replace dst_confirm with sk_dst_confirm > net: add confirm_neigh method to dst_ops > net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP > net: pending_confirm is not used anymore > > drivers/net/vrf.c | 5 ++++- > include/linux/skbuff.h | 12 ++++++++++++ > include/net/arp.h | 16 ++++++++++++++++ > include/net/dst.h | 21 +++++++++------------ > include/net/dst_ops.h | 2 ++ > include/net/ndisc.h | 17 +++++++++++++++++ > include/net/sctp/sctp.h | 6 ++---- > include/net/sctp/structs.h | 4 ++++ > include/net/sock.h | 25 +++++++++++++++++++++++++ > net/core/dst.c | 1 - > net/core/sock.c | 2 ++ > net/ipv4/ip_output.c | 11 ++++++++++- > net/ipv4/ping.c | 3 ++- > net/ipv4/raw.c | 6 +++++- > net/ipv4/route.c | 19 +++++++++++++++++++ > net/ipv4/tcp_input.c | 12 +++--------- > net/ipv4/tcp_metrics.c | 7 ++----- > net/ipv4/tcp_output.c | 2 ++ > net/ipv4/udp.c | 3 ++- > net/ipv6/ip6_output.c | 7 +++++++ > net/ipv6/raw.c | 6 +++++- > net/ipv6/route.c | 43 ++++++++++++++++++++++++++++++------------- > net/ipv6/udp.c | 3 ++- > net/l2tp/l2tp_ip6.c | 3 ++- > net/sctp/associola.c | 3 +-- > net/sctp/output.c | 10 +++++++++- > net/sctp/outqueue.c | 2 +- > net/sctp/sm_make_chunk.c | 6 ++---- > net/sctp/sm_sideeffect.c | 2 +- > net/sctp/socket.c | 4 ++-- > net/sctp/transport.c | 16 +++++++++++++++- > net/xfrm/xfrm_policy.c | 19 +++++++++++++++++++ > 32 files changed, 234 insertions(+), 64 deletions(-) > Applied to Xenial master-next