Message ID | 1519224183.55655.40.camel@gmail.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net] tcp_bbr: better deal with suboptimal GSO | expand |
Hi, On Wed, 2018-02-21 at 06:43 -0800, Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> > > BBR uses tcp_tso_autosize() in an attempt to probe what would be the > burst sizes and to adjust cwnd in bbr_target_cwnd() with following > gold formula : > > /* Allow enough full-sized skbs in flight to utilize end systems. */ > cwnd += 3 * bbr->tso_segs_goal; > > But GSO can be lacking or be constrained to very small > units (ip link set dev ... gso_max_segs 2) > > What we really want is to have enough packets in flight so that both > GSO and GRO are efficient. > > So in the case GSO is off or downgraded, we still want to have the same > number of packets in flight as if GSO/TSO was fully operational, so > that GRO can hopefully be working efficiently. > > To fix this issue, we make tcp_tso_autosize() unaware of > sk->sk_gso_max_segs > > Only tcp_tso_segs() has to enforce the gso_max_segs limit. > > Tested: > > ethtool -K eth0 tso off gso off > tc qd replace dev eth0 root pfifo_fast > > Before patch: > for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done > 691 (ss -temoi shows cwnd is stuck around 6 ) > 667 > 651 > 631 > 517 > > After patch : > # for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done > 1733 (ss -temoi shows cwnd is around 386 ) > 1778 > 1746 > 1781 > 1718 > > Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> > --- > net/ipv4/tcp_output.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index b2bca373f8bee35267df49b5947a6793fed71a12..6818042cd8a9a1778f54637861647091afd9a769 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -1730,7 +1730,7 @@ u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now, > */ > segs = max_t(u32, bytes / mss_now, min_tso_segs); > > - return min_t(u32, segs, sk->sk_gso_max_segs); > + return segs; > } > EXPORT_SYMBOL(tcp_tso_autosize); > Very minor nit, why don't: return max_t(u32, bytes / mss_now, min_tso_segs); and drop the 'segs' local variable? Thanks, Paolo
On Wed, Feb 21, 2018 at 7:01 AM, Paolo Abeni <pabeni@redhat.com> wrote: > > Very minor nit, why don't: > > return max_t(u32, bytes / mss_now, min_tso_segs); > > and drop the 'segs' local variable? Simply to ease backports. We had some constant changes in this function in the past.
On Wed, Feb 21, 2018 at 9:43 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > From: Eric Dumazet <edumazet@google.com> > > BBR uses tcp_tso_autosize() in an attempt to probe what would be the > burst sizes and to adjust cwnd in bbr_target_cwnd() with following > gold formula : > > /* Allow enough full-sized skbs in flight to utilize end systems. */ > cwnd += 3 * bbr->tso_segs_goal; > > But GSO can be lacking or be constrained to very small > units (ip link set dev ... gso_max_segs 2) > > What we really want is to have enough packets in flight so that both > GSO and GRO are efficient. > > So in the case GSO is off or downgraded, we still want to have the same > number of packets in flight as if GSO/TSO was fully operational, so > that GRO can hopefully be working efficiently. > > To fix this issue, we make tcp_tso_autosize() unaware of > sk->sk_gso_max_segs > > Only tcp_tso_segs() has to enforce the gso_max_segs limit. > > Tested: > > ethtool -K eth0 tso off gso off > tc qd replace dev eth0 root pfifo_fast > > Before patch: > for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done > 691 (ss -temoi shows cwnd is stuck around 6 ) > 667 > 651 > 631 > 517 > > After patch : > # for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done > 1733 (ss -temoi shows cwnd is around 386 ) > 1778 > 1746 > 1781 > 1718 > > Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> > --- > net/ipv4/tcp_output.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) Acked-by: Neal Cardwell <ncardwell@google.com> Thanks, Eric! neal
On Wed, Feb 21, 2018 at 10:14 AM Neal Cardwell <ncardwell@google.com> wrote: > On Wed, Feb 21, 2018 at 9:43 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > From: Eric Dumazet <edumazet@google.com> > > > > BBR uses tcp_tso_autosize() in an attempt to probe what would be the > > burst sizes and to adjust cwnd in bbr_target_cwnd() with following > > gold formula : > > > > /* Allow enough full-sized skbs in flight to utilize end systems. */ > > cwnd += 3 * bbr->tso_segs_goal; > > > > But GSO can be lacking or be constrained to very small > > units (ip link set dev ... gso_max_segs 2) > > > > What we really want is to have enough packets in flight so that both > > GSO and GRO are efficient. > > > > So in the case GSO is off or downgraded, we still want to have the same > > number of packets in flight as if GSO/TSO was fully operational, so > > that GRO can hopefully be working efficiently. > > > > To fix this issue, we make tcp_tso_autosize() unaware of > > sk->sk_gso_max_segs > > > > Only tcp_tso_segs() has to enforce the gso_max_segs limit. > > > > Tested: > > > > ethtool -K eth0 tso off gso off > > tc qd replace dev eth0 root pfifo_fast > > > > Before patch: > > for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done > > 691 (ss -temoi shows cwnd is stuck around 6 ) > > 667 > > 651 > > 631 > > 517 > > > > After patch : > > # for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done > > 1733 (ss -temoi shows cwnd is around 386 ) > > 1778 > > 1746 > > 1781 > > 1718 > > > > Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> > > --- > > net/ipv4/tcp_output.c | 9 +++++---- > > 1 file changed, 5 insertions(+), 4 deletions(-) > Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Thank you Eric for the nice patch!
On Wed, 2018-02-21 at 07:09 -0800, Eric Dumazet wrote: > On Wed, Feb 21, 2018 at 7:01 AM, Paolo Abeni <pabeni@redhat.com> wrote: > > > > Very minor nit, why don't: > > > > return max_t(u32, bytes / mss_now, min_tso_segs); > > > > and drop the 'segs' local variable? > > Simply to ease backports. > > We had some constant changes in this function in the past. Ok, thank you for the explanation. No objections on my side. Cheers, Paolo
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Wed, 21 Feb 2018 06:43:03 -0800 > From: Eric Dumazet <edumazet@google.com> > > BBR uses tcp_tso_autosize() in an attempt to probe what would be the > burst sizes and to adjust cwnd in bbr_target_cwnd() with following > gold formula : > > /* Allow enough full-sized skbs in flight to utilize end systems. */ > cwnd += 3 * bbr->tso_segs_goal; > > But GSO can be lacking or be constrained to very small > units (ip link set dev ... gso_max_segs 2) > > What we really want is to have enough packets in flight so that both > GSO and GRO are efficient. > > So in the case GSO is off or downgraded, we still want to have the same > number of packets in flight as if GSO/TSO was fully operational, so > that GRO can hopefully be working efficiently. > > To fix this issue, we make tcp_tso_autosize() unaware of > sk->sk_gso_max_segs > > Only tcp_tso_segs() has to enforce the gso_max_segs limit. . .. > Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Applied and queued up for -stable, thanks Eric.
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index b2bca373f8bee35267df49b5947a6793fed71a12..6818042cd8a9a1778f54637861647091afd9a769 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1730,7 +1730,7 @@ u32 tcp_tso_autosize(const struct sock *sk, unsigned int mss_now, */ segs = max_t(u32, bytes / mss_now, min_tso_segs); - return min_t(u32, segs, sk->sk_gso_max_segs); + return segs; } EXPORT_SYMBOL(tcp_tso_autosize); @@ -1742,9 +1742,10 @@ static u32 tcp_tso_segs(struct sock *sk, unsigned int mss_now) const struct tcp_congestion_ops *ca_ops = inet_csk(sk)->icsk_ca_ops; u32 tso_segs = ca_ops->tso_segs_goal ? ca_ops->tso_segs_goal(sk) : 0; - return tso_segs ? : - tcp_tso_autosize(sk, mss_now, - sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs); + if (!tso_segs) + tso_segs = tcp_tso_autosize(sk, mss_now, + sock_net(sk)->ipv4.sysctl_tcp_min_tso_segs); + return min_t(u32, tso_segs, sk->sk_gso_max_segs); } /* Returns the portion of skb which can be sent right away */