Patchwork tcp FRTO: in-order-only "TCP proxy" fragility workaround (fwd)

login
register
mail settings
Submitter Thomas Gleixner
Date Oct. 2, 2008, 10:42 a.m.
Message ID <alpine.LFD.1.10.0810021142370.5549@apollo.tec.linutronix.de>
Download mbox | patch
Permalink /patch/2375/
State Not Applicable
Headers show

Comments

Dâniel Fraga - Oct. 2, 2008, 11:17 a.m.
On Thu, 2 Oct 2008 12:42:44 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

>   echo sym-verbose > /debug/tracing/iter_ctrl

	Ok, this one I changed to:

echo verbose > /debug/tracing/iter_ctrl

	;) since there's no sym-verbose.

> Find below a patch which disables the tracer when the listen overflow
> is detected. It could occassionally be hit in a legimate case as well,
> but we did not come up with any better idea.

	Ok, I applied the patch, but it returns this error:

  CC      kernel/trace/ftrace.o
kernel/trace/ftrace.c: In function 'stop_event_tracer_now':
kernel/trace/ftrace.c:73: error: 'tracer_enabled' undeclared (first use in this function)
kernel/trace/ftrace.c:73: error: (Each undeclared identifier is reported only once
kernel/trace/ftrace.c:73: error: for each function it appears in.)
make[2]: *** [kernel/trace/ftrace.o] Error 1
make[1]: *** [kernel/trace] Error 2
make: *** [kernel] Error 2

> If it triggers, please cat /debug/tracing/trace to a file. Can you
> bzip2 and upload it to some place. The trace does not contain any
> sensible private information.

	Ok. Thanks.
Dâniel Fraga - Oct. 2, 2008, 2:09 p.m.
On Thu, 2 Oct 2008 14:08:14 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> grmbl. tracer_enabled should have been ftrace_enabled of course.

	Ok Thomas. I got a stall on my desktop.

	It happens when I try to use sudo and it just stay stalled (I
think trying to access /dev/log or something like that).

	I got the trace here:

http://www.abusar.org/ftrace/trace.txt.bz2

	after the stall.

	Could you please check if there's something wrong?

	I'd prefer if the stall happened in the server, but it still
didn't happen. Anyway, it's the same problem.

	Ps: this time I couldn't solve it with nmap (sometimes nmap
solves, sometimes it doesn't).

	Ps2: in this stall, the tracing wasn't disabled automatically
by your patch, because it happened in a different situation (when I
tried to use sudo). But I know it's the same problem, because the
"solution" I use is always the same (nmap) and until it's not stalled
anymore, I can't connect to any port on my machine.

	Ps3: if this trace is useless, wait until I get something on
the server.
Dâniel Fraga - Oct. 2, 2008, 8:56 p.m.
On Thu, 2 Oct 2008 21:16:32 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> Hmm, not sure. sudo is a different beast. All I can see in the trace
> is that reiserfs is turning itself in circles (which is nothing new),
> but there is nothing which might tell us about a stall. At the point
> where you stopped the trace the relevant info was already replaced by
> new stuff. It's a circular buffer which gets overwritten.

	Ok, no problem.

	I'll keep waiting for a stall on the server-side..
Dâniel Fraga - Oct. 5, 2008, 7:31 p.m.
On Sun, 5 Oct 2008 20:09:22 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> did you get the content of /debug/tracing/trace nevertheless ?

	Yes! I got several... maybe one is valid....

> The cat /debug/tracing/tracing_enabled == 1 might be just a user space
> interface thingy.

	Ok, here they are (I hope one of them is correct):

http://www.abusar.org/ftrace/

	These ones I get after I saw the kernel message "Tracer
stopped".

http://www.abusar.org/ftrace/trace.txt.bz2

http://www.abusar.org/ftrace/trace2.txt.bz2

http://www.abusar.org/ftrace/trace3.txt.bz2

http://www.abusar.org/ftrace/trace4.txt.bz2

http://www.abusar.org/ftrace/trace5.txt.bz2

	These ones were during a stall and after a nmap:

http://www.abusar.org/ftrace/trace-stop.txt.bz2

http://www.abusar.org/ftrace/trace-stop2.txt.bz2

	If none of these is valid, I will use monit to
check for Tracer stopped and force the tracing to be disabled.
Dâniel Fraga - Oct. 6, 2008, 1:27 a.m.
On Sun, 5 Oct 2008 20:09:22 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> did you get the content of /debug/tracing/trace nevertheless ?
> 
> The cat /debug/tracing/tracing_enabled == 1 might be just a user space
> interface thingy.

	Hi Thomas, I added this one too (from my desktop):

http://www.abusar.org/ftrace/trace-desktop.txt.bz2

	I don't know if it will help, but if you could look the entries
generated by sudo and fetchnews, they were stalled. sudo has a lot of:

            sudo-1081  [01] 34460.250986: _spin_lock+0x4/0x20 <ffffffff80453754> <-sys_close+0x3b/0x100 <ffffffff8029e06b>
            sudo-1081  [01] 34460.250986: sys_close+0x14/0x100 <ffffffff8029e044> <-system_call_fastpath+0x16/0x1b <ffffffff8020bd1b>
            sudo-1081  [01] 34460.250987: _spin_lock+0x4/0x20 <ffffffff80453754> <-sys_close+0x3b/0x100 <ffffffff8029e06b>
            sudo-1081  [01] 34460.250987: sys_close+0x14/0x100 <ffffffff8029e044> <-system_call_fastpath+0x16/0x1b <ffffffff8020bd1b>
            sudo-1081  [01] 34460.250987: _spin_lock+0x4/0x20 <ffffffff80453754> <-sys_close+0x3b/0x100 <ffffffff8029e06b>
            sudo-1081  [01] 34460.250987: sys_close+0x14/0x100 <ffffffff8029e044> <-system_call_fastpath+0x16/0x1b <ffffffff8020bd1b>
            sudo-1081  [01] 34460.250987: _spin_lock+0x4/0x20 <ffffffff80453754> <-sys_close+0x3b/0x100 <ffffffff8029e06b>
            sudo-1081  [01] 34460.250988: sys_close+0x14/0x100 <ffffffff8029e044> <-system_call_fastpath+0x16/0x1b <ffffffff8020bd1b>
            sudo-1081  [01] 34460.250988: _spin_lock+0x4/0x20 <ffffffff80453754> <-sys_close+0x3b/0x100 <ffffffff8029e06b>

	I don't know if it's normal...

	Thank you.
Dâniel Fraga - Oct. 8, 2008, 7:34 p.m.
On Sun, 5 Oct 2008 20:09:22 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> The cat /debug/tracing/tracing_enabled == 1 might be just a user space
> interface thingy.
> 
> Thanks,

	Thomas, Ilpo etc, I got a trace exactly after the patch
provided by Thomas detected the stall:

http://www.abusar.org/ftrace/trace-final.txt.bz2

	Thomas, since you didn't reply to the previous messages, I
assume the previous traces were invalid. But could you please check
this one?

	It would be wise to search for "monit" in the trace, since
monit activated the script "stop-tracing.sh" which stops the trace and
dump the trace output to the file. So what probably interest you is
what comes *before* "monit".

	"monit" appears for the first time in line 16921.

	To make it more clear, I configured monit with the following:

check file syslog with path /var/adm/messages
	if match "Tracer stopped" then exec "/home/fraga/stop-tracing.sh"

	And the stop-tracing.sh script is just that:

#!/bin/sh
/usr/local/bin/echo 0 > /debug/tracing/tracing_enabled
/usr/local/bin/cat /debug/tracing/trace > /home/fraga/trace.txt

	So I expect that this trace is right now.

	If monit was too slow in detecting the "Tracer stopped" log entry, I'll 
use a Perl script which is faster.

	Thanks, I wait for your reply.

	Ps: flood_mail.pl is just a perl script I use to prevent people from flooding
my smtp server with mail.

	Ps2: I just don't know if it would be better to stop the tracing *before*
dump the trace or it would be better to dump the trace without stopping it... I noticed
that even the "echo" I use to stop the tracing, generated a lot of useless tracing information.
Dâniel Fraga - Oct. 13, 2008, 4:08 a.m.
On Wed, 8 Oct 2008 21:56:00 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> Will do. Sorry was distracted. Steven Rostedt posted a patch to stop
> the tracer really from the kernel, I need to verify that it works on
> .27. Will do tomorrow when my brain is more awake than now.

	Hi Thomas, did you have time to verify that patch? If so, I'd
like to test it. Thanks.
Dâniel Fraga - Oct. 17, 2008, 11:51 p.m.
On Mon, 13 Oct 2008 16:35:08 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

	Just reporting another stall event... Who knows it helps in some way...

> ...might turn out related. It has somewhat similar problem though 
> different syscall. Not enough info on what kernel version etc. 
> there though.

	Well, I can be wrong, but I think those stalls are related to USB (or USB helped to 
"solve" the stall temporarily)... I explain below (I have a monitor with usb hub connected):

Oct 17 19:35:55 tux sudo: pam_unix(sudo:session): session opened for user root by fraga(uid=0)
Oct 17 19:35:55 tux sudo: pam_unix(sudo:session): session closed for user root

	^---------- sudo opened the session at 19:35

Oct 17 19:38:16 tux sudo:    fraga : TTY=pts/2 ; PWD=/usr/local/lib ; USER=root ; COMMAND=/usr/local/bin/rm libdvdread.a libdvdread.la libdvdread.so libdvdread.so.3 libdvdread.so.3.2.1 libdvdread.so.4 libdvdread
.so.4.1.2

	^---------- I tried to run this command with sudo. And it stalled. Notice that it was logged only on 19:38.

Oct 17 19:38:16 tux sudo: pam_unix(sudo:session): session opened for user root by fraga(uid=0)
Oct 17 19:38:16 tux sudo: pam_unix(sudo:session): session closed for user root
Oct 17 19:40:20 tux vmunix: ci000/000:21ub/-/-./-../-..:./nu/nu7

	^----------- Here is the stall. You can see that the log is corrupted, every other line.

Oct 17 19:40:20 tux vmunix: 6ipt S I 11 os Lgtc pia S os]o s-000:21112

	^----------- Every other line. Here is when I turned off my monitor with USB... Unless it was an extreme coincidence, I'm sure this event
of turning off and...

Oct 17 19:40:25 tux vmunix: sd 9:0:0:0: [sdb] Attached SCSI removable disk
Oct 17 19:40:25 tux scsi.agent[5498]: disk at /devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host9/target9:0:0/9:0:0:0

	^---------- ...turning on again, helped to "solve" the stall. The stall ended when I turned off the monitor.

Oct 17 19:40:28 tux sudo:    fraga : TTY=pts/2 ; PWD=/usr/local/lib ; USER=root ; COMMAND=/usr/local/bin/ls
Oct 17 19:40:28 tux sudo: pam_unix(sudo:session): session opened for user root by fraga(uid=0)
Oct 17 19:40:28 tux sudo: pam_unix(sudo:session): session closed for user root
Oct 17 19:40:30 tux scsi.agent[5310]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host9/type does not exist
Oct 17 19:40:35 tux scsi.agent[5493]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host9/target9:0:0/type does not exist
Oct 17 20:12:37 tux vmunix: usb-storage: device found at 14
Oct 17 20:12:37 tux vmunix: usb-storage: waiting for device to settle before scanning
Oct 17 20:12:42 tux vmunix: scsi 10:0:0:0: Direct-Access     Generic  ls SC   53 Q  NI 
Oct 17 20:12:42 tux vmunix: k
Oct 17 20:12:42 tux scsi.agent[7439]: disk at /devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host10/target10:0:0/10:0:0:0
Oct 17 20:12:47 tux vmunix: DROP INPUT: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:15:e9:e2:00:db:08:00 SRC=192.168.0.1 DST=255.255.255.255 LEN=334 TOS=0x00 PREC=0x00 TTL=64 ID=6668 PROTO=UDP SPT=67 DPT=68 LEN=314 
Oct 17 20:12:47 tux scsi.agent[7252]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host10/type does not exist
Oct 17 20:12:53 tux scsi.agent[7435]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host10/target10:0:0/type does not exist

	***
	
	At the server I have an usb device connected too...

	OR at eh end it's just a symptom of the problem... I don't
know. The interesting thing is that as soon as I turned off the monitor 
(at the exact moment), the speaker beeped, because I used TAB to auto
complete my command, and I knew that sudo wasn't stalled
anymore.

	Maybe what the kernel does when we remove a usb hub could give us a hint about the stalling issue...
Ilpo Järvinen - Oct. 20, 2008, 9:48 p.m.
On Fri, 17 Oct 2008, Dâniel Fraga wrote:

> On Mon, 13 Oct 2008 16:35:08 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> 	Just reporting another stall event... Who knows it helps in some way...
> 
> > ...might turn out related. It has somewhat similar problem though 
> > different syscall. Not enough info on what kernel version etc. 
> > there though.
> 
> 	Well, I can be wrong, but I think those stalls are related to USB (or USB helped to 
> "solve" the stall temporarily)... I explain below (I have a monitor with usb hub connected):
> 
> Oct 17 19:35:55 tux sudo: pam_unix(sudo:session): session opened for user root by fraga(uid=0)
> Oct 17 19:35:55 tux sudo: pam_unix(sudo:session): session closed for user root
> 
> 	^---------- sudo opened the session at 19:35
> 
> Oct 17 19:38:16 tux sudo:    fraga : TTY=pts/2 ; PWD=/usr/local/lib ; USER=root ; COMMAND=/usr/local/bin/rm libdvdread.a libdvdread.la libdvdread.so libdvdread.so.3 libdvdread.so.3.2.1 libdvdread.so.4 libdvdread
> .so.4.1.2
> 
> 	^---------- I tried to run this command with sudo. And it stalled. Notice that it was logged only on 19:38.
> 
> Oct 17 19:38:16 tux sudo: pam_unix(sudo:session): session opened for user root by fraga(uid=0)
> Oct 17 19:38:16 tux sudo: pam_unix(sudo:session): session closed for user root
> Oct 17 19:40:20 tux vmunix: ci000/000:21ub/-/-./-../-..:./nu/nu7
> 
> 	^----------- Here is the stall. You can see that the log is corrupted, every other line.
> 
> Oct 17 19:40:20 tux vmunix: 6ipt S I 11 os Lgtc pia S os]o s-000:21112
> 
> 	^----------- Every other line. Here is when I turned off my monitor with USB... Unless it was an extreme coincidence, I'm sure this event
> of turning off and...

You probably meant every other character... :-) The same as with the 
netfilter logs... I've two open question related to it, where did the 
every other letter go...? and why not a larger block but only a single 
char is missing (at a time)? I've no idea about printk's internals, maybe 
I'll try to ping somebody tomorrow about it.

> Oct 17 19:40:25 tux vmunix: sd 9:0:0:0: [sdb] Attached SCSI removable disk
> Oct 17 19:40:25 tux scsi.agent[5498]: disk at /devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host9/target9:0:0/9:0:0:0
> 
> 	^---------- ...turning on again, helped to "solve" the stall. The stall ended when I turned off the monitor.
> 
> Oct 17 19:40:28 tux sudo:    fraga : TTY=pts/2 ; PWD=/usr/local/lib ; USER=root ; COMMAND=/usr/local/bin/ls
> Oct 17 19:40:28 tux sudo: pam_unix(sudo:session): session opened for user root by fraga(uid=0)
> Oct 17 19:40:28 tux sudo: pam_unix(sudo:session): session closed for user root
> Oct 17 19:40:30 tux scsi.agent[5310]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host9/type does not exist
> Oct 17 19:40:35 tux scsi.agent[5493]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host9/target9:0:0/type does not exist
> Oct 17 20:12:37 tux vmunix: usb-storage: device found at 14
> Oct 17 20:12:37 tux vmunix: usb-storage: waiting for device to settle before scanning
> Oct 17 20:12:42 tux vmunix: scsi 10:0:0:0: Direct-Access     Generic  ls SC   53 Q  NI 
> Oct 17 20:12:42 tux vmunix: k
> Oct 17 20:12:42 tux scsi.agent[7439]: disk at /devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host10/target10:0:0/10:0:0:0
> Oct 17 20:12:47 tux vmunix: DROP INPUT: IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:00:15:e9:e2:00:db:08:00 SRC=192.168.0.1 DST=255.255.255.255 LEN=334 TOS=0x00 PREC=0x00 TTL=64 ID=6668 PROTO=UDP SPT=67 DPT=68 LEN=314 
> Oct 17 20:12:47 tux scsi.agent[7252]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host10/type does not exist
> Oct 17 20:12:53 tux scsi.agent[7435]: Attribute /sys/devices/pci0000:00/0000:00:02.1/usb2/2-1/2-1.1/2-1.1.1/2-1.1.1:1.0/host10/target10:0:0/type does not exist
> 
> 	***
> 	
> 	At the server I have an usb device connected too...
> 
> 	OR at eh end it's just a symptom of the problem... I don't
> know. The interesting thing is that as soon as I turned off the monitor 
> (at the exact moment), the speaker beeped, because I used TAB to auto
> complete my command, and I knew that sudo wasn't stalled
> anymore.
> 
> 	Maybe what the kernel does when we remove a usb hub could give us a hint about the stalling issue...

I doubt we'll find the cause from networking...

Trying something a bit more generic:

Try running vmstat 1 > somelog.file to see if there's something 
particularly interesting behavior during the stall.
Ilpo Järvinen - Oct. 20, 2008, 9:49 p.m.
On Tue, 21 Oct 2008, Ilpo Järvinen wrote:

> Trying something a bit more generic:
> 
> Try running vmstat 1 > somelog.file to see if there's something 
> particularly interesting behavior during the stall.

Won't hurt doing this on both the server and the client... :-)
Dâniel Fraga - Oct. 22, 2008, 2:12 a.m.
On Tue, 21 Oct 2008 00:48:16 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Try running vmstat 1 > somelog.file to see if there's something 
> particularly interesting behavior during the stall.

	Here it is. I used nmap more or less in the middle of this log.
The stall happened just on my client machine (stalled sudo) so there's only one 
log ok? Thanks.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0  52540  69928  63832 1001580    0    1    17    48   30  119 18  1 80  1
 0  0  52540  69844  63832 1001580    0    0     0     0  633  921  0  0 100  0
 0  0  52540  69844  63832 1001580    0    0     0   100  621  916  0  0 99  0
 1  0  52540  69844  63832 1001580    0    0     0     0  611  893  0  0 100  0
 0  0  52540  69716  63832 1001580    0    0     0     0  671  961  0  0 100  0
 1  0  52540  69552  63832 1001580    0    0     0     0  813  975  4  0 95  0
 0  0  52540  69596  63832 1001580    0    0     0     0  716  947  2  0 98  0
 0  0  52540  69596  63832 1001580    0    0     0    36  866 2047  2  2 96  0
 0  0  52540  69596  63832 1001580    0    0     0     0  688  964  0  0 100  0
 0  0  52540  69596  63832 1001580    0    0     0     0  721 1005  0  0 99  0
 0  0  52540  69720  63832 1001580    0    0     0     0  679  958  0  0 99  0
 0  0  52540  69720  63832 1001580    0    0     4     0  651  935  0  0 99  1
 0  0  52540  69760  63832 1001584    0    0     0    36  990 2105  2  2 96  0
 0  0  52540  69720  63832 1001584    0    0     0     0  869 1187  0  0 99  0
 0  0  52540  69720  63832 1001584    0    0     0     0  937 1272  0  0 100  0
 0  0  52540  69472  63832 1001584    0    0     0     0 1486 1702 17  3 79  0
 1  0  52540  68092  63832 1001604    0    0     0     0 2263 2553 36 13 50  0
 0  0  52540  68092  63832 1001604    0    0     0   336 1214 1223  3  4 92  0
 3  0  52540  67752  63832 1001604    0    0     0     0 1031 1192  1  0 99  0
 0  0  52540  67712  63832 1001604    0    0     0     0  988 1149  1  0 99  0
 0  0  52540  67712  63832 1001608    0    0     0     0  989 1127  0  0 100  0
 1  0  52540  67712  63832 1001608    0    0     0     0  985 1124  0  0 100  0
 0  0  52540  67752  63832 1001608    0    0     0   512 1149 1106  0  0 99  0
 2  0  52540  67544  63832 1001608    0    0     0     0 1463 1329  7  7 86  0
 2  0  52540  66496  63832 1001612    0    0     0     0 1755 2439 21  5 74  0
 0  0  52540  66332  63832 1001620    0    0     0     0 1597 1534 23 12 65  0
 0  0  52540  65992  63836 1001620    0    0    72     0  861 1155  0  0 97  2
 0  0  52540  65952  63836 1001688    0    0     0   304  819 1111  1  0 98  0
 0  0  52540  65704  63840 1001696    0    0    12     0  847 1194  1  0 97  1
 0  0  52540  65704  63840 1001696    0    0     0     0  797 1123  0  0 99  0
 2  0  52540  65620  63840 1001696    0    0     0     0  827 1155  1  0 99  0
 0  0  52540  65580  63840 1001696    0    0     0     0  747 1033  0  0 100  0
 0  0  52540  65496  63840 1001696    0    0     0    84  969 2128  2  3 95  0
 0  0  52540  64368  63924 1002608  480    0  1444   148 1593 1471 10  5 57 28
 0  0  52540  64480  63924 1002600    0    0     0     0  854 1141  1  1 98  0
 1  0  52540  64480  63924 1002604    0    0    12     0 1517 1409 18  1 81  0
 0  0  52540  64476  63924 1002624    0    0     0     0 1717 1331 27  1 72  0
 0  0  52540  64268  63980 1002600    0    0    56   300 1816 1543 17 14 66  3
 0  0  52540  64240  64004 1002616    0    0    24     0  805 1063  2  0 93  4
 0  0  52540  64240  64020 1002628    0    0    16     0  877 1033  1  0 96  3
 1  0  52540  64272  64048 1002632    0    0    28     0 1202 2408  3  3 90  3
 0  0  52540  64232  64068 1002640    0    0    20     0  961 1121  2  0 93  4
 0  0  52540  64232  64076 1002664    0    0     8   288  975 1197  1  0 97  2
 1  0  52540  64232  64100 1002668    0    0    24     0  974 1247  1  0 93  6
 0  0  52540  64232  64100 1002676    0    0     0     0  961 1177  1  0 98  0
 0  0  52540  64108  64300 1002684    0    0   200     0 1157 1337  4  0 74 21
 1  0  52540  64108  64328 1002668    0    0    28     0  909 1156  1  1 92  6
 0  0  52540  64108  64352 1002688    0    0    24   244  879 1117  1  0 94  4
 0  1  52540  64024  64376 1002712    0    0    24     0  932 1171  1  0 94  4
 0  0  52540  63984  64396 1002728    0    0    20     0  974 1190  3  1 91  4
 1  0  52540  63900  64412 1002732    0    0    16     0  908 1143  1  0 96  3
 0  0  52540  63860  64436 1002760    0    0    24     0  857 1105  0  0 92  7
 0  0  52540  63608  64444 1002796    0    0     4   380  961 1129  1  1 97  1
 0  0  52540  63612  64456 1002824    0    0    12     0  881 1088  2  0 95  3
Ilpo Järvinen - Oct. 30, 2008, 10:43 a.m.
On Wed, 22 Oct 2008, Dâniel Fraga wrote:

> On Tue, 21 Oct 2008 00:48:16 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Try running vmstat 1 > somelog.file to see if there's something 
> > particularly interesting behavior during the stall.
> 
> 	Here it is. I used nmap more or less in the middle of this log.
> The stall happened just on my client machine (stalled sudo) so there's only one 
> log ok? Thanks.

Nothing particularly interesting in it (sorry for the delay, there were 
some other bugs in the meantime to fix and partially because I was out 
of ideas). Minor increase in cpu/wa after nmap, before that it was 0-2, 
mostly zero.

Another thing that occurred to me today is that all these events I know of 
are related to syslog/klogd (I don't know too well how they relate to each 
other and what functionality is provided by each, thus I refer them here 
as one :-)).

- sudo sendmsg'ed to /dev/log and stalled
- every-2nd-char missing in printk relates to it
- usb does cause printks
- server processes likely do some logging (and can then hang on waiting
  until that finishes and therefore eventually no worker is available to 
  process requests)

...Some other type of case I'm missing here?

Perhaps we could try to solve it though stracing syslogd...

Did you ever actually try btw with something else than NO_TICK, we 
discussed it at some point of time but I'm not sure what was the end 
result of what got tested and what not?
Dâniel Fraga - Oct. 30, 2008, 6:16 p.m.
On Thu, 30 Oct 2008 12:43:05 +0200 (EET)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Nothing particularly interesting in it (sorry for the delay, there were 
> some other bugs in the meantime to fix and partially because I was out 
> of ideas). Minor increase in cpu/wa after nmap, before that it was 0-2, 
> mostly zero.

	Hi Ilpo. No problem, it's better late than never ;)

> Another thing that occurred to me today is that all these events I know of 
> are related to syslog/klogd (I don't know too well how they relate to each 
> other and what functionality is provided by each, thus I refer them here 
> as one :-)).
> 
> - sudo sendmsg'ed to /dev/log and stalled
> - every-2nd-char missing in printk relates to it
> - usb does cause printks
> - server processes likely do some logging (and can then hang on waiting
>   until that finishes and therefore eventually no worker is available to 
>   process requests)
> 
> ...Some other type of case I'm missing here?

	Yes, it is that. Anyway I'm talking directly to netfilter
people and they suggested that I use ulogd to capture the exact packet
when the stall happens using pcap. I'll try that. If they can figure it
out what happens to make the log prints every other char, maybe we have
a hint.

> Perhaps we could try to solve it though stracing syslogd...

	Yes I can trace it.

> Did you ever actually try btw with something else than NO_TICK, we 
> discussed it at some point of time but I'm not sure what was the end 
> result of what got tested and what not?

	I tried with many options, with TICK enabled and disabled, high
res timer enabled and disabled and other options in the kernel, but it
didn't make any difference.

	I'll try with netfilter people and if i get something, I
will reply ok?
Dâniel Fraga - Nov. 2, 2008, 5:56 a.m.
On Thu, 30 Oct 2008 12:43:05 +0200 (EET)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Perhaps we could try to solve it though stracing syslogd...

	Well Ilpo, you're right, what I'm about to write here will make
me very ashamed, but the truth must be told! The culprit was syslogd!
Almost unbeliavable, but I had been using and old syslogd version for
about 5 years!

	How can I'm sure that it's syslogd's fault? Simply, because I
had a stall today and when I killed syslogd everything was back to
normal.

	Well, I reinstalled GNU inetutils 1.5 (which I had already
installed before), but I don't know why it put syslogd
in /usr/local/libexec directory.

	But no problem. I'll just wait a few more days to test if
syslogd is the only responsible for this, but I'm 90% sure it is.

	So, just posting this, so if someone, who knows, some day, have
a similar problem, can read this message and avoid all the problems I
had.

	I apologize for thinking that it was a kernel fault. Anyway,
one more lesson I learned: do not keep old binaries lying around... ;)

	Thanks everyone, mainly Ilpo for giving me all tools to reach 
to this point.

	Ps: just for curiosity, I was using a syslogd binary from Mar,
3, 2003! Extremely old! This is so old, it was compiled for Linux
2.2.5. Or maybe I was too lazy and copied it from another machine...

syslogd: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for
GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped

	Ps2: I'll close the bug I opened on bugzilla.

	Ps3: anyway, it's interesting how a small piece of the system
(syslogd) can generate those kinds of problems... I mean, a simple
error on syslogd could lead to a complete stall on connections, just
because everything is waiting for it to log through /dev/log. Of course
the problem was the binary, but it could have a time out, so even if it
was in fact a buggy syslogd, it won't cause such a stall on the
system. I really don't know what changed from 2.6.24 to 2.6.25, but
maybe 2.6.24 had such a timeout? Maybe I'm just silly writing that...
you guys know much more than me.

	Ps4: maybe now we can understand why nmap solved the issue...
Ilpo Järvinen - Nov. 3, 2008, 3:37 p.m.
On Sun, 2 Nov 2008, Dâniel Fraga wrote:

> On Thu, 30 Oct 2008 12:43:05 +0200 (EET)
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > Perhaps we could try to solve it though stracing syslogd...
> 
> 	Well Ilpo, you're right, what I'm about to write here will make
> me very ashamed, but the truth must be told! The culprit was syslogd!
> Almost unbeliavable, but I had been using and old syslogd version for
> about 5 years!
> 
> 	How can I'm sure that it's syslogd's fault? Simply, because I
> had a stall today and when I killed syslogd everything was back to
> normal.

Once there's any kind of flow control, anything jamming downstream will 
eventually make upstream to stall as well (or to appear as not working 
as expected. Sadly, it's exactly opposite from correctness point of view 
as flow control is a feature in TCP, not a bug :-)). Thus I occassionally 
run to these tcp with flow control not working reports which turn to be 
totally unrelated.

This still doesn't explain everything though afaik... E.g., why did the 
sendto() to SOCK_DGRAM socket hung.

> 	But no problem. I'll just wait a few more days to test if
> syslogd is the only responsible for this, but I'm 90% sure it is.

And you had the same old syslogd on both hosts?

In any case the loss of every other character deterministically sounds 
like a real bug in the syslogd since it doesn't make too much sense to 
happen in kernel->syslogd communication (where I'd expect it to not show 
up in such consistent pattern but would cause more randomness).

> 	I apologize for thinking that it was a kernel fault.

It's not clear what caused this to happen _now_, nor the exact mechanism.

> 	Ps: just for curiosity, I was using a syslogd binary from Mar,
> 3, 2003! Extremely old! This is so old, it was compiled for Linux
> 2.2.5. Or maybe I was too lazy and copied it from another machine...

In theory this shouldn't be too big problem, but I'm hardly an expert of 
those things and syslogd is anyway more thightly coupled to kernel than 
some random app.

> 	Ps3: anyway, it's interesting how a small piece of the system
> (syslogd) can generate those kinds of problems... I mean, a simple
> error on syslogd could lead to a complete stall on connections, just
> because everything is waiting for it to log through /dev/log.

This is more of a philosophical question than something else... it's 
always balancing between data loss (=possibly losing a logline of an 
important event) or possibility of a stall. But this shouldn't be a 
concern in the case where SOCK_DGRAM was used by the sudo (like in the 
strace you sent to sudo people), in general UDP doesn't guarantee 
reliability so not delivering wouldn't be a problem but I don't know if 
PF_FILE domain does something otherwise in there.

> Of course
> the problem was the binary, but it could have a time out, so even if it
> was in fact a buggy syslogd, it won't cause such a stall on the
> system. I really don't know what changed from 2.6.24 to 2.6.25, but
> maybe 2.6.24 had such a timeout? Maybe I'm just silly writing that...
> you guys know much more than me.

Until we know more details than that killing syslogd helped it's hard to 
tell what is the actual cause. And I have no clue about semantics of 
/dev/log anyway.

> 	Ps4: maybe now we can understand why nmap solved the issue...

Not very clear but at least sudo does some writing there too.
Dâniel Fraga - Nov. 3, 2008, 5:03 p.m.
On Mon, 3 Nov 2008 17:37:09 +0200 (EET)
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:

> Once there's any kind of flow control, anything jamming downstream will 
> eventually make upstream to stall as well (or to appear as not working 
> as expected. Sadly, it's exactly opposite from correctness point of view 
> as flow control is a feature in TCP, not a bug :-)). Thus I occassionally 
> run to these tcp with flow control not working reports which turn to be 
> totally unrelated.
> 
> This still doesn't explain everything though afaik... E.g., why did the 
> sendto() to SOCK_DGRAM socket hung.

	Well, the fact that the problem happened since 2.6.25 kernel
make me believe that it could exist a possible kernel issue too, but I
think that most part was caused by syslogd.

> And you had the same old syslogd on both hosts?

	Yes. My desktop and server have the same installation.

> In any case the loss of every other character deterministically sounds 
> like a real bug in the syslogd since it doesn't make too much sense to 
> happen in kernel->syslogd communication (where I'd expect it to not show 
> up in such consistent pattern but would cause more randomness).

	Yes. With the new compiled syslogd it doesn't happen anymore.
And I don't have stall too.

> It's not clear what caused this to happen _now_, nor the exact mechanism.

	Ok.

> This is more of a philosophical question than something else... it's 
> always balancing between data loss (=possibly losing a logline of an 
> important event) or possibility of a stall. But this shouldn't be a 
> concern in the case where SOCK_DGRAM was used by the sudo (like in the 
> strace you sent to sudo people), in general UDP doesn't guarantee 
> reliability so not delivering wouldn't be a problem but I don't know if 
> PF_FILE domain does something otherwise in there.

	I see.

> Until we know more details than that killing syslogd helped it's hard to 
> tell what is the actual cause. And I have no clue about semantics of 
> /dev/log anyway.

	Ok. Anyway, at least the problem was registered and if in the
future we have something related, maybe this can help someone.

Patch

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index f6e3af3..a1cecb0 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -68,6 +68,12 @@  static void ftrace_list_func(unsigned long ip, unsigned long parent_ip)
 	};
 }
 
+void stop_event_tracer_now(void)
+{
+	tracer_enabled = 0;
+	printk(KERN_INFO "Tracer stopped\n");
+}
+
 /**
  * clear_ftrace_function - reset the ftrace function
  *
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index f976fc5..527dd45 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -485,6 +485,8 @@  struct sock *tcp_create_openreq_child(struct sock *sk, struct request_sock *req,
 	return newsk;
 }
 
+extern void stop_event_tracer_now(void);
+
 /*
  *	Process an incoming packet for SYN_RECV sockets represented
  *	as a request_sock.
@@ -687,6 +689,7 @@  struct sock *tcp_check_req(struct sock *sk,struct sk_buff *skb,
 	return child;
 
 listen_overflow:
+	stop_event_tracer_now();
 	if (!sysctl_tcp_abort_on_overflow) {
 		inet_rsk(req)->acked = 1;
 		return NULL;