
tcp_sack problem Re: [Bug 11721] after upgrade to 2.6.27 i cannot navigate

Message ID Pine.LNX.4.64.0810201106200.7072@wrl-59.cs.helsinki.fi
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Ilpo Järvinen Oct. 20, 2008, 9:38 a.m. UTC
On Sat, 18 Oct 2008, Jarek Poplawski wrote:

> [RESEND] I forgot to add Aldo's email before - sorry!
> 
> On Sat, Oct 18, 2008 at 11:02:52PM +0200, Jarek Poplawski wrote:
> > Nice job Aldo!
> > 
> > I forward this message to netdev and Cc our best tcp expert.
> > Any new replies should be rather "Reply All" (not bugzilla only).
> > 
> > Thanks,
> > Jarek P.
> > 
> > On Sat, Oct 18, 2008 at 01:20:56PM -0700, bugme-daemon@bugzilla.kernel.org wrote:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=11721
> > > 
> > ...
> > > ------- Comment #30 from sentiniate@tiscali.it  2008-10-18 13:20 -------
> > > absolutely i'm not fed up! and actually i thank you for your patience.
> > > 
> > > i set paperino as you wrote and added ifconfig eth0 mtu 1400 as well
> > > i set the same value of mtu in topolino as well, i hope i was not wrong doing
> > > so!
> > > 
> > > i attach the files you ask for, but now i can navigate with 2.6.27-rc1-git1 as
> > > well!
> > > 
> > > afterwards i've run some more tests and have found out that the culprit is
> > > tcp_sack: with kernel 2.6.27-rc1-git1, if it is set to "1" i cannot navigate, if
> > > to "0" i can.
> > > so i enclose, should it be of any help, also the tcpdump on eth0 of topolino
> > > when on paperino tcp_sack is set to "1".

So this ended up in the tcp domain after all (I took a brief look earlier
anyway and found out that there are not that many changes 2.6.26..2.6.27
-- net/ipv4/tcp*.c include/net/tcp.h)...

I compared your packet against a good one from elsewhere. I couldn't
compare your latest dumps fully because attachments 18366 and 18367 were
taken with different TCP options (did you leave the sysctls at zero in
them?). Anyway, the only things that differed from that good packet were
the extra bytes at the beginning (some protocol below ip that gets
captured by tcpdump?), which are equal in both your working and broken
cases, and the different ordering of the tcp options as noted by Jarek
earlier. I went through the fields one by one but nothing seemed to be
wrong...

...Might be something crazy on the path that is too picky about tcp option
ordering, which wouldn't surprise me that much... :-) Please try whether
the patch below makes any difference (on paperino is enough, the gw seems
innocent here).

If that doesn't help, can you please restore the sysctls to 1 and redo
the 2.6.26.6 dump (like in attachment 18366) so that I get a fully
comparable sample.

Comments

Jarek Poplawski Oct. 20, 2008, 9:51 a.m. UTC | #1
On Mon, Oct 20, 2008 at 12:38:51PM +0300, Ilpo Järvinen wrote:
...
> ...Might be something crazy on the path that is too picky about tcp option
> ordering, which wouldn't surprise me that much... :-) Please try whether
> the patch below makes any difference (on paperino is enough, the gw seems
> innocent here).

I've just thought about this too(!), and it seems that turning off
tcp_timestamps should do a similar reorder. But of course, I'll try
not to disturb anymore...

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilpo Järvinen Oct. 20, 2008, 9:58 a.m. UTC | #2
On Mon, 20 Oct 2008, Jarek Poplawski wrote:

> On Mon, Oct 20, 2008 at 12:38:51PM +0300, Ilpo Järvinen wrote:
> ...
> > ...Might be something crazy on the path that is too picky about tcp option
> > ordering, which wouldn't surprise me that much... :-) Please try whether
> > the patch below makes any difference (on paperino is enough, the gw seems
> > innocent here).
> 
> I've just thought about this too(!), and it seems that turning off
> tcp_timestamps should do a similar reorder. But of course, I'll try
> not to disturb anymore...

Yeah, you should be right. No need to hassle him with patch+compile+boot,
turning off timestamps should be "enough" to move mss into the first
slot... :-)

If that works we probably need to play around a bit to figure out what
the actual cause is (since it could involve timestamps or mss, which the
sysctl alone cannot prove).
Jarek Poplawski Oct. 20, 2008, 5:15 p.m. UTC | #3
On Mon, Oct 20, 2008 at 08:47:37AM -0700, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11721
...
> ------- Comment #40 from sentiniate@tiscali.it  2008-10-20 08:47 -------
> ok setting echo 0 > /proc/sys/net/ipv4/tcp_timestamps solves the problem as
> well.
> 
> anyway i'm willing to do whatever test you deem necessary if this can help you
> solve the issue, so as soon as i can i'll apply the patch provided by  Ilpo
> Järvinen and compile (do not worry i've been a linux user for the last 10
> years and up to 4 or 5 yrs ago i used to compile even three times a week!)
> 
> thanks
> aldo

Aldo, I forward this message to Ilpo and netdev, because "they" aren't
on the Cc list of this report. (With next replies try to use the mail
with their addresses.)

I think, you could test Ilpo's patch as well: this would make slightly
different test/header. (Of course, all these /proc tcp values and mtus
should be back to normal.)

Thanks,
Jarek P.
Aldo Maggi Oct. 20, 2008, 7:48 p.m. UTC | #4
Jarek,
sorry i did not read your last msg before, so i have used the bug page
for posting.

i repeat here what i wrote there:

Ilpo,
i'm sending herewith attached the output of
tcpdump -i eth0 -nXX -c3 'dst port 80 and tcp-syn != 0' on topolino when:
1) kernel 2.6.26.6 is running on paperino and tcp_window_scaling,
tcp_timestamps, tcp_sack are set to "1"
2) kernel 2.6.26.6 is running on paperino and tcp_window_scaling,
tcp_timestamps, tcp_sack are set to "0"
3) kernel 2.6.27-rc1-git1 is running on paperino and tcp_window_scaling,
tcp_timestamps, tcp_sack are set to "1"
4) kernel 2.6.27-rc1-git1 is running on paperino and tcp_window_scaling,
tcp_timestamps, tcp_sack are set to "0"

moreover, i read more carefully what you and Jarek wrote above, please let me
know if you deem still necessary that i apply the patch you provided and
compile or if you like i compile the latest kernel or whatever.

thanks
aldo
Jarek Poplawski Oct. 20, 2008, 8:51 p.m. UTC | #5
On Mon, Oct 20, 2008 at 09:48:18PM +0200, Aldo Maggi wrote:
...
> moreover, i read more carefully what you and Jarek wrote above, please let me
> know if you deem still necessary that i apply the patch you provided and
> compile or if you like i compile the latest kernel or whatever.

Aldo, my advice, in case Ilpo is busy now:

Since it looks like the problem/bug isn't in the kernel, but rather
somewhere in between (router, modem etc.) which is misled by the changed
(compared to 2.6.26) order of tcp options, I doubt Ilpo needs these
dumps so much now: it should be enough to say works or doesn't work.

You have checked it works in 2 cases (without tcp_sack, and without
tcp_timestamps) when mss is the first option again. With Ilpo's patch
these two options could still be present, with mss at the beginning.
It should work because that's how 2.6.26 did it, but I think it
would be nice to check this anyway, if it's not a problem. Then
trying the latest kernel (at least 2.6.27) with this patch should be
most useful.

Thanks,
Jarek P.
Ilpo Järvinen Oct. 20, 2008, 8:55 p.m. UTC | #6
On Mon, 20 Oct 2008, Aldo Maggi wrote:

> i repeat here what i wrote there:
> 
> Ilpo,
> i'm sending herewith attached the output of
> tcpdump -i eth0 -nXX -c3 'dst port 80 and tcp-syn != 0' on topolino when:
> 1) kernel 2.6.26.6 is running on paperino and tcp_window_scaling,
> tcp_timestamps, tcp_sack are set to "1"
> 2) kernel 2.6.26.6 is running on paperino and tcp_window_scaling,
> tcp_timestamps, tcp_sack are set to "0"
> 3) kernel 2.6.27-rc1-git1 is running on paperino and tcp_window_scaling,
> tcp_timestamps, tcp_sack are set to "1"
> 4) kernel 2.6.27-rc1-git1 is running on paperino and tcp_window_scaling,
> tcp_timestamps, tcp_sack are set to "0"

Thanks, being told so unambiguously what each case is helps a lot... :-)

Difference is: ip checksum, ipid, lsb of dstaddr, sport, seqno, tcp 
checksum, timestamp. Plus the option reordering as discussed before. Here 
are the options:

    28  0204  <
    29  05b4  <
    30  0402    0402
    31  080a    080a
    32  0019  | ffff
    33  db30  | 8169
    34  0000    0000
    35  0000    0000
    36        > 0204
    37        > 05b4
    38  0103    0103
    39  0306    0306

Could you try if tcp_window_scaling=1, tcp_sack=1, tcp_timestamps=0 also 
works, if you didn't already (it's not that obvious from the earlier 
acknowledgement you gave that these were the exact options you used or 
not)...

> moreover, i read more carefully what you and Jarek wrote above, please let me
> know if you deem still necessary that i apply the patch you provided and
> compile or if you like i compile the latest kernel or whatever.

If that ws=1,s=1,ts=0 test succeeds, then testing my patch would be useful 
too. Anyway it seems that we're dealing with some violation of spec here
(by either the peer or some middle node, 2.6.27-whatever is not doing 
something outside of spec), just trying to narrow down which of the 
options is the actual cause. We might need to try later some other 
ordering of options too to rule out possibilities by trial-and-error
(ie., unless you have a better knowledge about the devices on the path, 
information which I sort of expect to be out of (y)our reach :-)).
Ilpo Järvinen Oct. 20, 2008, 9:04 p.m. UTC | #7
In general, please use reply to all when dealing with linux developers (so 
that related people & lists end up being cc'ed), we just work that way :-).
I did restore CCs for you this time.

On Mon, 20 Oct 2008, Aldo Maggi wrote:

> bear with me but i do not wish to create problems to you all, therefore
> i'd like to be sure to be doing the correct procedure before going on
> with the compilation:
> 
> i've used your patch with the sources of kernel 2.6.27-rc1 which i had
> already patched with git1 because of problems with the compilation
> (problems which had been solved with git1 patch).
> 
> what follows is the output of patch, is it correct?
> 
> paperino:/usr/src/linux-2.6.27-rc1-git1# patch -p1 < ../patch_iarno
> patching file net/ipv4/tcp_output.c
> Hunk #1 succeeded at 371 (offset -5 lines).
> Hunk #2 succeeded at 393 (offset -5 lines).

A -5 offset can well happen, and patch was able to apply it correctly
regardless (I might have had some unrelated modifications in my local
dev tree which cause the line numbers to shift a bit).

> i'll anyway go on with the compilation, but i don't want that you
> expect i do something and i do something else! :-D

I'm fine with the kernel and your procedure... :-) And also 2.6.27 won't 
make much difference to 2.6.27-rc-something I think, so you can ignore 
Jarek's suggestion to use 2.6.27+patch if you feel -rc is easier for you 
now that you've started with it. I'm fine with either kernel 
version+patch.
Ilpo Järvinen Oct. 20, 2008, 9:51 p.m. UTC | #8
Reply to all please... ...I restored them again for you... :-)

On Mon, 20 Oct 2008, Aldo Maggi wrote:

> Il giorno Mon, 20 Oct 2008 23:55:48 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> ha scritto:
> 
> [...]
> 
> > Could you try if tcp_window_scaling=1, tcp_sack=1, tcp_timestamps=0
> > also works, if you didn't already (it's not that obvious from the
> > earlier acknowledgement you gave that these were the exact options
> > you used or not)...
> 
> sorry! i was not precise!
> those are exactly the options i tried in my acknowledgment no. 40,
> i.e. tcp_window_scaling and tcp_sack equal to "1" and tcp_timestamps=0

Np, thanks for the confirmation. I sort of assumed this already since 
the alternative interpretation of Jarek's mails didn't make too much 
sense after all (and I jumped into this thread in between, so I just 
scanned them through with haste on the first time concentrating more
on your responses and which of the logs I should download :-)).

> > > moreover, i read more carefully what you and Jarek wrote above,
> > > please let me know if you deem still necessary that i apply the
> > > patch you provided and compile or if you like i compile the latest
> > > kernel or whatever.
> > 
> > If that ws=1,s=1,ts=0 test succeeds, then testing my patch would be
> > useful too. 
> 
> i'm right now compiling the -27-rc1 kernel, on my pc it will take at
> least 3 hrs (that is, it will finish 02.00 rome time :-D )

Ok, I'll check that tomorrow then, as my trimmed-down 2.6.28-rc (ah,
it's still -g something, no rc1 just yet) build completed already :-). I'm
just going to try to bombard google.it from here too and then it's bedtime
for me...

> should it work i'll apply your patch to 2.6.27.2 and compile again.

Ok, it's quite likely that both will work with the patch. Anyway, it's
enough that you do the future tests with 2.6.27.y if we decide to try some
other ordering (I'll try to think a bit about that tomorrow to see what
would be the most sensible ordering).
Aldo Maggi Oct. 20, 2008, 10:13 p.m. UTC | #9
Il giorno Tue, 21 Oct 2008 00:51:26 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> ha scritto:

> Reply to all please... ...I restored them again for you... :-)

sorry! i'll be more careful in the future!


[...]

thanks :-)
aldo
Aldo Maggi Oct. 21, 2008, 5:26 a.m. UTC | #10
[...]

> > 
> > i'm right now compiling the -27-rc1 kernel, on my pc it will take at
> > least 3 hrs (that is, it will finish 02.00 rome time :-D )
> 

ok! the new kernel with ilpo's patch works!!
please find herewith attached the tcpdump output on eth0 on topolino 
while on paperino kernel 2.6.27-rc1-git1 with ilpo patch was running ,
the three tcp files all set to 1

[...]

as soon as i come back home i will start compilation of 2.6.27.2 with
ilpo patch

thanks 
aldo

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 990a584..850a4e9 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -376,6 +376,12 @@  static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
 		*md5_hash = NULL;
 	}
 
+	if (unlikely(opts->mss)) {
+		*ptr++ = htonl((TCPOPT_MSS << 24) |
+			       (TCPOLEN_MSS << 16) |
+			       opts->mss);
+	}
+
 	if (likely(OPTION_TS & opts->options)) {
 		if (unlikely(OPTION_SACK_ADVERTISE & opts->options)) {
 			*ptr++ = htonl((TCPOPT_SACK_PERM << 24) |
@@ -392,12 +398,6 @@  static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
 		*ptr++ = htonl(opts->tsecr);
 	}
 
-	if (unlikely(opts->mss)) {
-		*ptr++ = htonl((TCPOPT_MSS << 24) |
-			       (TCPOLEN_MSS << 16) |
-			       opts->mss);
-	}
-
 	if (unlikely(OPTION_SACK_ADVERTISE & opts->options &&
 		     !(OPTION_TS & opts->options))) {
 		*ptr++ = htonl((TCPOPT_NOP << 24) |