Patchwork [net-next-2.6] net: rfs: enable RFS before first data packet is received

Submitter Eric Dumazet
Date June 15, 2011, 2:15 a.m.
Message ID <1308104128.4578.10.camel@edumazet-laptop>
Permalink /patch/100468/
State Changes Requested
Delegated to: David Miller

Comments

Eric Dumazet - June 15, 2011, 2:15 a.m.
The first packet received on a passive TCP flow is not correctly steered
by RFS.

One sock_rps_record_flow() call is missing in inet_accept().

But before that, we must also record rxhash when the child socket is set up.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Jamal Hadi Salim <hadi@cyberus.ca>
---
Netconf2011 workshop ;)

 net/ipv4/af_inet.c  |    1 +
 net/ipv4/tcp_ipv4.c |    1 +
 2 files changed, 2 insertions(+)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings - June 16, 2011, 11:50 p.m.
On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
> The first packet received on a passive TCP flow is not correctly steered
> by RFS.
> 
> One sock_rps_record_flow() call is missing in inet_accept().
> 
> But before that, we must also record rxhash when the child socket is set up.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Tom Herbert <therbert@google.com>
> CC: Ben Hutchings <bhutchings@solarflare.com>
> CC: Jamal Hadi Salim <hadi@cyberus.ca>
> ---
> Netconf2011 workshop ;)
> 
>  net/ipv4/af_inet.c  |    1 +
>  net/ipv4/tcp_ipv4.c |    1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 83673d2..0600f0f 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -676,6 +676,7 @@ int inet_accept(struct socket *sock, struct socket *newsock, int flags)
>  
>  	lock_sock(sk2);
>  
> +	sock_rps_record_flow(sk2);
>  	WARN_ON(!((1 << sk2->sk_state) &
>  		  (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT | TCPF_CLOSE)));
>  
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 617dee3..955b8e6 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
>  			goto discard;
>  
>  		if (nsk != sk) {
> +			sock_rps_save_rxhash(nsk, skb->rxhash);
>  			if (tcp_child_process(sk, nsk, skb)) {
>  				rsk = nsk;
>  				goto reset;
> 

I haven't tried this, but it looks reasonable to me.

What about IPv6?  The logic in tcp_v6_do_rcv() looks very similar.

Ben.
David Miller - June 17, 2011, 3:38 a.m.
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Fri, 17 Jun 2011 00:50:46 +0100

> On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
>> @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
>>  			goto discard;
>>  
>>  		if (nsk != sk) {
>> +			sock_rps_save_rxhash(nsk, skb->rxhash);
>>  			if (tcp_child_process(sk, nsk, skb)) {
>>  				rsk = nsk;
>>  				goto reset;
>> 
> 
> I haven't tried this, but it looks reasonable to me.
> 
> What about IPv6?  The logic in tcp_v6_do_rcv() looks very similar.

Indeed, the IPv6 side needs the same fix.

Eric, please add that part and resubmit.  In fact, I might put
this into net-2.6 instead of net-next-2.6.

Thanks!

Patch

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 83673d2..0600f0f 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -676,6 +676,7 @@  int inet_accept(struct socket *sock, struct socket *newsock, int flags)
 
 	lock_sock(sk2);
 
+	sock_rps_record_flow(sk2);
 	WARN_ON(!((1 << sk2->sk_state) &
 		  (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT | TCPF_CLOSE)));
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 617dee3..955b8e6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1594,6 +1594,7 @@  int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 			goto discard;
 
 		if (nsk != sk) {
+			sock_rps_save_rxhash(nsk, skb->rxhash);
 			if (tcp_child_process(sk, nsk, skb)) {
 				rsk = nsk;
 				goto reset;
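
Illustration (not part of the patch)

The interaction between the two hunks above can be modeled in user space. The following is only a rough sketch under stated assumptions: struct toy_sock, struct flow_table, NO_CPU and all toy_* functions are hypothetical stand-ins invented for this example, not kernel APIs. The model assumes that recording a flow is a no-op while the socket has no saved hash, which is why the hash must be saved on the child socket before inet_accept() records the flow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLOW_TABLE_SIZE 256u   /* hypothetical size; must be a power of two */
#define NO_CPU 0xffffu         /* hypothetical "no CPU recorded yet" marker */

/* Toy stand-in for the global socket flow table: hash -> CPU of the consumer. */
struct flow_table {
	uint16_t cpu[FLOW_TABLE_SIZE];
};

/* Toy stand-in for a socket: only the saved receive hash is modeled. */
struct toy_sock {
	uint32_t rxhash;       /* 0 means "no hash saved yet" */
};

static void flow_table_init(struct flow_table *t)
{
	assert((FLOW_TABLE_SIZE & (FLOW_TABLE_SIZE - 1)) == 0);
	for (size_t i = 0; i < FLOW_TABLE_SIZE; i++)
		t->cpu[i] = NO_CPU;
}

/* Models the role of sock_rps_save_rxhash(): remember the flow hash. */
static void toy_save_rxhash(struct toy_sock *sk, uint32_t rxhash)
{
	sk->rxhash = rxhash;
}

/* Models the role of sock_rps_record_flow(): publish the consumer CPU.
 * Does nothing when the socket has no saved hash. */
static void toy_record_flow(struct flow_table *t, const struct toy_sock *sk,
			    uint16_t cpu)
{
	if (sk->rxhash != 0)
		t->cpu[sk->rxhash & (FLOW_TABLE_SIZE - 1)] = cpu;
}

/* CPU the receive path would pick for a packet with this hash, or NO_CPU. */
static uint16_t toy_steer(const struct flow_table *t, uint32_t rxhash)
{
	return t->cpu[rxhash & (FLOW_TABLE_SIZE - 1)];
}

/* Simulates handshake completion followed by accept(), then returns the CPU
 * that the first data packet of the flow would be steered to. */
uint16_t first_data_packet_cpu(int save_hash_on_child, int record_in_accept,
			       uint32_t rxhash, uint16_t app_cpu)
{
	struct flow_table table;
	struct toy_sock child = { 0 };

	flow_table_init(&table);
	if (save_hash_on_child)
		toy_save_rxhash(&child, rxhash);          /* tcp_v4_do_rcv() hunk */
	if (record_in_accept)
		toy_record_flow(&table, &child, app_cpu); /* inet_accept() hunk */
	return toy_steer(&table, rxhash);
}
```

In this model, first_data_packet_cpu() returns the application CPU only when both flags are set. With either step missing, the table has no entry for the flow and the result is NO_CPU, which mirrors the unsteered first packet described in the commit message.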