
TCP not retransmitting

Message ID alpine.DEB.2.00.0912101518050.7024@wel-95.cs.helsinki.fi
State RFC, archived
Delegated to: David Miller

Commit Message

Ilpo Järvinen Dec. 10, 2009, 1:24 p.m. UTC
Changed subject and dropped all but netdev from Cc.

On Mon, 7 Dec 2009, Frederic Leroy wrote:

> On Mon, 7 Dec 2009 20:50:11 +0200 (EET),
> "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> wrote:
> 
> > On Mon, 7 Dec 2009, Damian Lukowski wrote:
> > 
> > > This patch fixes a problem in the TCP connection timeout
> > > calculation. Currently, timeout decisions are made on the basis of
> > > 6fa12c85031485dff38ce550c24f10da23b0adaa.
> > [...]
> >> >  static inline struct sk_buff *tcp_send_head(struct sock *sk) 
> > 
> > Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> > 
> > Also, this should be added:
> > 
> > Reported-by: Frederic Leroy <fredo@starox.org>
> > 
> > ...and I think he has already tested this fix a number of times in 
> > different forms, so I'd already include his Tested-by: too.
> 
> If you need a bad connection to run some tests, don't hesitate!
> Thank you guys ;)

Hi,

Now that the more important issues are out of the way, could you try the 
following debug patch? It should give us some information on why the 
retransmissions are not happening when they're supposed to (the EAGAIN 
problem we noticed earlier). Besides trying it with 2.6.32, it might also be 
worth the effort to try it with 2.6.31, to see whether that also gives those 
EAGAIN results or whether it's a new issue. It matters little whether the 
previous fix patch is included, though it is perhaps easier to know that 
the problem really happened if a stall can be experienced, unlike with the 
fixed kernel.
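
For reading the first line of debug output, here is a rough userspace sketch of the send-buffer check that the first printk sits in front of. It assumes the check compares sk_wmem_alloc against the smaller of 1.25 * sk_wmem_queued and sk_sndbuf and returns -EAGAIN when the limit is exceeded, which is how I read the "1/4 is reserved" comment. The helper names retransmit_allowed() and min_u() are made up for illustration; the parameters are named after the fields in the printk format.

```c
#include <assert.h>
#include <errno.h>

/* Userspace sketch only, not kernel code. wm_a, wm_q and sbuf stand for
 * sk_wmem_alloc, sk_wmem_queued and sk_sndbuf as printed by the debug patch.
 */
static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Returns 0 if a retransmission would pass the check, -EAGAIN otherwise
 * (assumed form of the check, see the text above).
 */
static int retransmit_allowed(unsigned int wm_a, unsigned int wm_q,
			      unsigned int sbuf)
{
	/* Queued bytes plus 1/4 for copying overhead, capped by sndbuf. */
	unsigned int limit = min_u(wm_q + (wm_q >> 2), sbuf);

	if (wm_a > limit)
		return -EAGAIN;
	return 0;
}

/* Example values, invented for illustration (not taken from any log). */
static void example(void)
{
	assert(retransmit_allowed(1000, 1000, 4000) == 0);       /* limit 1250 */
	assert(retransmit_allowed(1300, 1000, 4000) == -EAGAIN); /* limit 1250 */
	assert(retransmit_allowed(1100, 1000, 1024) == -EAGAIN); /* limit 1024 */
}
```

So a log line where wm_a is larger than both 1.25 * wm_q and sbuf would be consistent with the EAGAIN return, if the assumption about the check holds.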

Comments

Frederic Leroy Dec. 11, 2009, 12:57 p.m. UTC | #1
Hi Ilpo,

On Thu, Dec 10, 2009 at 03:24:20PM +0200, Ilpo Järvinen wrote:
> Hi,
> 
> Now that the more important issues are out of the way, could you try the 
> following debug patch? It should give us some information on why the 
> retransmissions are not happening when they're supposed to (the EAGAIN 
> problem we noticed earlier). Besides trying it with 2.6.32, it might also be 
> worth the effort to try it with 2.6.31, to see whether that also gives those 
> EAGAIN results or whether it's a new issue. It matters little whether the 
> previous fix patch is included, though it is perhaps easier to know that 
> the problem really happened if a stall can be experienced, unlike with the 
> fixed kernel.

I ran 3 tests this time. Unlike the others, I put them apart in a new directory:
http://www.starox.org/pub/scp_stall/eagain/

The first was made with a 2.6.31 kernel.
I disconnected the ethernet cable for about 30s.
There were some EAGAIN errors.

The other two were made with 2.6.32, without the fix.
I didn't disconnect anything, as I know it will fail quickly :)
The first try has an EAGAIN error.

> 
> -- 
>  i.
> 
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index fcd278a..c29aed0 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1894,6 +1894,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
>  		icsk->icsk_mtup.probe_size = 0;
>  	}
>  
> +	printk("sk: %p wm_a: %u wm_q: %u sbuf: %u\n", sk,
> +		atomic_read(&sk->sk_wmem_alloc), sk->sk_wmem_queued, sk->sk_sndbuf);
>  	/* Do not sent more than we queued. 1/4 is reserved for possible
>  	 * copying overhead: fragmentation, tunneling, mangling etc.
>  	 */
> @@ -1986,6 +1988,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
>  		 */
>  		TCP_SKB_CB(skb)->ack_seq = tp->snd_nxt;
>  	}
> +	printk("sk: %p E %u", sk, err);
                           ^
                      missing \n ;-)
>  	return err;
>  }
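
A note for reading the "E" lines: the quoted printk formats err with %u although err is a signed int, so a negative return such as -EAGAIN would show up as a large positive number (4294967285 where EAGAIN is 11 and unsigned int is 32 bits). A small userspace sketch for converting such a value back follows; decode_logged_err() is a hypothetical helper name, not something from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Convert a value that was logged through "%u" back to the signed int
 * it came from. Userspace sketch for reading the logs, not kernel code.
 */
static int decode_logged_err(unsigned int logged)
{
	/* Anything above INT_MAX was a negative int printed as unsigned. */
	if (logged > (unsigned int)INT_MAX)
		return -(int)(UINT_MAX - logged) - 1;
	return (int)logged;
}

/* Example: a -EAGAIN return survives the round trip through "%u". */
static void example_decode(void)
{
	assert(decode_logged_err((unsigned int)-EAGAIN) == -EAGAIN);
	assert(decode_logged_err(0u) == 0);
}
```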

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index fcd278a..c29aed0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1894,6 +1894,8 @@  int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 		icsk->icsk_mtup.probe_size = 0;
 	}
 
+	printk("sk: %p wm_a: %u wm_q: %u sbuf: %u\n", sk,
+		atomic_read(&sk->sk_wmem_alloc), sk->sk_wmem_queued, sk->sk_sndbuf);
 	/* Do not sent more than we queued. 1/4 is reserved for possible
 	 * copying overhead: fragmentation, tunneling, mangling etc.
 	 */
@@ -1986,6 +1988,7 @@  int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 		 */
 		TCP_SKB_CB(skb)->ack_seq = tp->snd_nxt;
 	}
+	printk("sk: %p E %d\n", sk, err);
 	return err;
 }