Message ID | CADVnQykPOVf7PoK8-6Mvbmbcpe1eYY=cfKHgiNGDNSFfCDPb-A@mail.gmail.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
Haven't you missed "return ssthresh;" statement? On неділя, 10 січня 2016 р. 12:29:17 EET Neal Cardwell wrote: > On Sun, Jan 10, 2016 at 9:57 AM, Oleksandr Natalenko > > <oleksandr@natalenko.name> wrote: > > I use YeAH. But YeAH code wasn't touched between 4.2 and 4.3. > > Oh, interesting. Looks like tcp_yeah_ssthresh() has a bug where its > intended reduction can be bigger than tp->snd_cwnd, leading to it > return a zero ssthresh (or even an ssthresh that underflows to ~4 > billion). If tcp_yeah_ssthresh() returns an ssthresh of 0 then PRR > will try to pull the cwnd down to 0. > > Can you please leave ECN and Yeah enabled and run something like the > following patch, to verify this conjecture? If the conjecture is > right, then the tcp_yeah warning should fire but not the new > tcp_cwnd_reduction() warning: > > ----------- > diff --git a/net/ipv4/tcp_yeah.c b/net/ipv4/tcp_yeah.c > index 17d3566..ef60cba 100644 > --- a/net/ipv4/tcp_yeah.c > +++ b/net/ipv4/tcp_yeah.c > @@ -206,6 +206,7 @@ static u32 tcp_yeah_ssthresh(struct sock *sk) > const struct tcp_sock *tp = tcp_sk(sk); > struct yeah *yeah = inet_csk_ca(sk); > u32 reduction; > + s32 ssthresh; > > if (yeah->doing_reno_now < TCP_YEAH_RHO) { > reduction = yeah->lastQ; > @@ -219,7 +220,9 @@ static u32 tcp_yeah_ssthresh(struct sock *sk) > yeah->fast_count = 0; > yeah->reno_count = max(yeah->reno_count>>1, 2U); > > - return tp->snd_cwnd - reduction; > + ssthresh = tp->snd_cwnd - reduction; > + if (WARN_ON_ONCE(ssthresh <= 0)) > + ssthresh = 1; > } > > static struct tcp_congestion_ops tcp_yeah __read_mostly = { > ----------- > > If that works, then we may just want a version of this patch without > the warning. > > Thanks! > neal
On Sun, Jan 10, 2016 at 12:50 PM, Oleksandr Natalenko
<oleksandr@natalenko.name> wrote:
> Haven't you missed "return ssthresh;" statement?
Yes, sorry. You'd clearly need to return ssthresh as well. :-)
neal
OK, it seems the assumption about YeAH is correct. Here is stacktrace fired by WARN_ON_ONCE(): https://gist.github.com/851cedcfca60d6120035 Is there sufficient info for you to prepare upstream patch? On неділя, 10 січня 2016 р. 12:29:17 EET Neal Cardwell wrote: > On Sun, Jan 10, 2016 at 9:57 AM, Oleksandr Natalenko > > <oleksandr@natalenko.name> wrote: > > I use YeAH. But YeAH code wasn't touched between 4.2 and 4.3. > > Oh, interesting. Looks like tcp_yeah_ssthresh() has a bug where its > intended reduction can be bigger than tp->snd_cwnd, leading to it > return a zero ssthresh (or even an ssthresh that underflows to ~4 > billion). If tcp_yeah_ssthresh() returns an ssthresh of 0 then PRR > will try to pull the cwnd down to 0. > > Can you please leave ECN and Yeah enabled and run something like the > following patch, to verify this conjecture? If the conjecture is > right, then the tcp_yeah warning should fire but not the new > tcp_cwnd_reduction() warning: > > ----------- > diff --git a/net/ipv4/tcp_yeah.c b/net/ipv4/tcp_yeah.c > index 17d3566..ef60cba 100644 > --- a/net/ipv4/tcp_yeah.c > +++ b/net/ipv4/tcp_yeah.c > @@ -206,6 +206,7 @@ static u32 tcp_yeah_ssthresh(struct sock *sk) > const struct tcp_sock *tp = tcp_sk(sk); > struct yeah *yeah = inet_csk_ca(sk); > u32 reduction; > + s32 ssthresh; > > if (yeah->doing_reno_now < TCP_YEAH_RHO) { > reduction = yeah->lastQ; > @@ -219,7 +220,9 @@ static u32 tcp_yeah_ssthresh(struct sock *sk) > yeah->fast_count = 0; > yeah->reno_count = max(yeah->reno_count>>1, 2U); > > - return tp->snd_cwnd - reduction; > + ssthresh = tp->snd_cwnd - reduction; > + if (WARN_ON_ONCE(ssthresh <= 0)) > + ssthresh = 1; > } > > static struct tcp_congestion_ops tcp_yeah __read_mostly = { > ----------- > > If that works, then we may just want a version of this patch without > the warning. > > Thanks! > neal
On Sun, Jan 10, 2016 at 4:56 PM, Oleksandr Natalenko <oleksandr@natalenko.name> wrote: > OK, it seems the assumption about YeAH is correct. Here is stacktrace fired by > WARN_ON_ONCE(): > > https://gist.github.com/851cedcfca60d6120035 > > Is there sufficient info for you to prepare upstream patch? Great. Thanks for running that test! Yes, I think that's enough info. We sent a proposed upstream patch here: http://patchwork.ozlabs.org/patch/566072/ tcp_yeah: don't set ssthresh below 2 Feel free to add your "Tested-By" to the thread if you have a chance to test that patch. Thanks, neal
Compiled and booted OK, no issues so far, but I'd prefer to give it a several days test. Thanks! On понеділок, 11 січня 2016 р. 13:47:08 EET Neal Cardwell wrote: > Great. Thanks for running that test! Yes, I think that's enough info. > We sent a proposed upstream patch here: > > http://patchwork.ozlabs.org/patch/566072/ > tcp_yeah: don't set ssthresh below 2 > > Feel free to add your "Tested-By" to the thread if you have a chance > to test that patch.
diff --git a/net/ipv4/tcp_yeah.c b/net/ipv4/tcp_yeah.c index 17d3566..ef60cba 100644 --- a/net/ipv4/tcp_yeah.c +++ b/net/ipv4/tcp_yeah.c @@ -206,6 +206,7 @@ static u32 tcp_yeah_ssthresh(struct sock *sk) const struct tcp_sock *tp = tcp_sk(sk); struct yeah *yeah = inet_csk_ca(sk); u32 reduction; + s32 ssthresh; if (yeah->doing_reno_now < TCP_YEAH_RHO) { reduction = yeah->lastQ; @@ -219,7 +220,9 @@ static u32 tcp_yeah_ssthresh(struct sock *sk) yeah->fast_count = 0; yeah->reno_count = max(yeah->reno_count>>1, 2U); - return tp->snd_cwnd - reduction; + ssthresh = tp->snd_cwnd - reduction; + if (WARN_ON_ONCE(ssthresh <= 0)) + ssthresh = 1; } static struct tcp_congestion_ops tcp_yeah __read_mostly = {
On Sun, Jan 10, 2016 at 9:57 AM, Oleksandr Natalenko <oleksandr@natalenko.name> wrote: > I use YeAH. But YeAH code wasn't touched between 4.2 and 4.3. Oh, interesting. Looks like tcp_yeah_ssthresh() has a bug where its intended reduction can be bigger than tp->snd_cwnd, leading to it return a zero ssthresh (or even an ssthresh that underflows to ~4 billion). If tcp_yeah_ssthresh() returns an ssthresh of 0 then PRR will try to pull the cwnd down to 0. Can you please leave ECN and Yeah enabled and run something like the following patch, to verify this conjecture? If the conjecture is right, then the tcp_yeah warning should fire but not the new tcp_cwnd_reduction() warning: ----------- ----------- If that works, then we may just want a version of this patch without the warning. Thanks! neal