diff mbox

[net-next,1/2] net: SOCKWQ_ASYNC_NOSPACE optimizations

Message ID 1462220607.5535.273.camel@edumazet-glaptop3.roam.corp.google.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet May 2, 2016, 8:23 p.m. UTC
On Mon, 2016-05-02 at 21:12 +0200, Jiri Pirko wrote:
> Mon, May 02, 2016 at 06:22:18PM CEST, eric.dumazet@gmail.com wrote:
> >On Mon, 2016-05-02 at 18:16 +0200, Jiri Pirko wrote:
> >> Mon, Apr 25, 2016 at 07:39:32PM CEST, edumazet@google.com wrote:
> >> >SOCKWQ_ASYNC_NOSPACE is tested in sock_wake_async()
> >> >so that a SIGIO signal is sent when needed.
> >> >
> >> >tcp_sendmsg() clears the bit.
> >> >tcp_poll() sets the bit when stream is not writeable.
> >> >
> >> >We can avoid two atomic operations by first checking if socket
> >> >is actually interested in the FASYNC business (most sockets in
> >> >real applications do not use AIO, but select()/poll()/epoll())
> >> >
> >> >This also removes one cache line miss to access sk->sk_wq->flags
> >> >in tcp_sendmsg()
> >> >
> >> >Signed-off-by: Eric Dumazet <edumazet@google.com>
> >> 
> >> I just bisected down to this. This is causing a regression for me when
> >> my nfs mount becomes stuck. I can easily reproduce this if you need to
> >> test the fix.
> >
> >What do you mean by 'when nfs mount becomes stuck' ?
> >
> >Is this patch making nfs not functional , or does it make recovery from
> >some nfs error bad ?
> 
> I can mount nfs on the host. But when I do something (compile a kernel
> module in my case), it gets stuck. Then I cannot even ssh to the machine.
> No messages in dmesg. I didn't debug it any further. I just bisected and
> verified that this patch caused this behaviour.

Interesting.

It looks like net/sunrpc/xprtsock.c should set SOCK_FASYNC
even if it is not actually using fasync_list

Could you try this quick hack to check if this is the right way ?

Thanks !

Comments

David Miller May 2, 2016, 8:31 p.m. UTC | #1
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 May 2016 13:23:27 -0700

> It looks like net/sunrpc/xprtsock.c should set SOCK_FASYNC
> even if it is not actually using fasync_list
> 
> Could you try this quick hack to check if this is the right way ?

Indeed, it tests the ASYNC bit without enabling FASYNC.

There are three other places that do this: macvtap, tun, dlm lowcomms.
Jiri Pirko May 2, 2016, 8:45 p.m. UTC | #2
Mon, May 02, 2016 at 10:23:27PM CEST, eric.dumazet@gmail.com wrote:
>On Mon, 2016-05-02 at 21:12 +0200, Jiri Pirko wrote:
>> Mon, May 02, 2016 at 06:22:18PM CEST, eric.dumazet@gmail.com wrote:
>> >On Mon, 2016-05-02 at 18:16 +0200, Jiri Pirko wrote:
>> >> Mon, Apr 25, 2016 at 07:39:32PM CEST, edumazet@google.com wrote:
>> >> >SOCKWQ_ASYNC_NOSPACE is tested in sock_wake_async()
>> >> >so that a SIGIO signal is sent when needed.
>> >> >
>> >> >tcp_sendmsg() clears the bit.
>> >> >tcp_poll() sets the bit when stream is not writeable.
>> >> >
>> >> >We can avoid two atomic operations by first checking if socket
>> >> >is actually interested in the FASYNC business (most sockets in
>> >> >real applications do not use AIO, but select()/poll()/epoll())
>> >> >
>> >> >This also removes one cache line miss to access sk->sk_wq->flags
>> >> >in tcp_sendmsg()
>> >> >
>> >> >Signed-off-by: Eric Dumazet <edumazet@google.com>
>> >> 
>> >> I just bisected down to this. This is causing a regression for me when
>> >> my nfs mount becomes stuck. I can easily reproduce this if you need to
>> >> test the fix.
>> >
>> >What do you mean by 'when nfs mount becomes stuck' ?
>> >
>> >Is this patch making nfs not functional , or does it make recovery from
>> >some nfs error bad ?
>> 
>> I can mount nfs on the host. But when I do something (compile a kernel
>> module in my case), it gets stuck. Then I cannot even ssh to the machine.
>> No messages in dmesg. I didn't debug it any further. I just bisected and
>> verified that this patch caused this behaviour.
>
>Interesting.
>
>It looks like net/sunrpc/xprtsock.c should set SOCK_FASYNC
>even if it is not actually using fasync_list
>
>Could you try this quick hack to check if this is the right way ?

Yep, works, I do not see the issue with this patch anymore. Thanks.


>
>Thanks !
>
>diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>index a6c68dc086af83233ee315642638f4a1990ee622..b90c5397b5e137c6cc8accad6eebe2b876363d4e 100644
>--- a/net/sunrpc/xprtsock.c
>+++ b/net/sunrpc/xprtsock.c
>@@ -1950,6 +1950,7 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
> 		sk->sk_user_data = xprt;
> 		sk->sk_data_ready = xs_data_ready;
> 		sk->sk_write_space = xs_udp_write_space;
>+		sock_set_flag(sk, SOCK_FASYNC);
> 		sk->sk_error_report = xs_error_report;
> 		sk->sk_allocation = GFP_NOIO;
> 
>@@ -2136,6 +2137,7 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
> 		sk->sk_user_data = xprt;
> 		sk->sk_data_ready = xs_data_ready;
> 		sk->sk_write_space = xs_udp_write_space;
>+		sock_set_flag(sk, SOCK_FASYNC);
> 		sk->sk_allocation = GFP_NOIO;
> 
> 		xprt_set_connected(xprt);
>@@ -2237,6 +2239,7 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
> 		sk->sk_data_ready = xs_tcp_data_ready;
> 		sk->sk_state_change = xs_tcp_state_change;
> 		sk->sk_write_space = xs_tcp_write_space;
>+		sock_set_flag(sk, SOCK_FASYNC);
> 		sk->sk_error_report = xs_error_report;
> 		sk->sk_allocation = GFP_NOIO;
> 
>
>
Eric Dumazet May 2, 2016, 8:55 p.m. UTC | #3
On Mon, 2016-05-02 at 16:31 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 02 May 2016 13:23:27 -0700
> 
> > It looks like net/sunrpc/xprtsock.c should set SOCK_FASYNC
> > even if it is not actually using fasync_list
> > 
> > Could you try this quick hack to check if this is the right way ?
> 
> Indeed, it tests the ASYNC bit without enabling FASYNC.
> 
> There are three other places that do this: macvtap, tun, dlm lowcomms.

Yes, although macvtap and tun have a private usage of this bit.

When the flag was moved (commit ceb5d58b217098a657f3850b7a2640f995032e62
"net: fix sock_wake_async() rcu protection"), I did not change the code
in these drivers. And apparently nobody complained (linux-4.4)

drivers/net/macvtap.c:501:          !test_and_clear_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags))
drivers/net/macvtap.c:588:          (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &q->sock.flags) &&
drivers/net/tun.c:1111:     (!test_and_set_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags) &&
drivers/net/tun.c:1576: if (!test_and_clear_bit(SOCKWQ_ASYNC_NOSPACE, &sk->sk_socket->flags))

fs/dlm/lowcomms.c probably needs a fix.
Jiri Pirko May 9, 2016, 6:24 a.m. UTC | #4
Mon, May 02, 2016 at 10:45:44PM CEST, jiri@resnulli.us wrote:
>Mon, May 02, 2016 at 10:23:27PM CEST, eric.dumazet@gmail.com wrote:
>>On Mon, 2016-05-02 at 21:12 +0200, Jiri Pirko wrote:
>>> Mon, May 02, 2016 at 06:22:18PM CEST, eric.dumazet@gmail.com wrote:
>>> >On Mon, 2016-05-02 at 18:16 +0200, Jiri Pirko wrote:
>>> >> Mon, Apr 25, 2016 at 07:39:32PM CEST, edumazet@google.com wrote:
>>> >> >SOCKWQ_ASYNC_NOSPACE is tested in sock_wake_async()
>>> >> >so that a SIGIO signal is sent when needed.
>>> >> >
>>> >> >tcp_sendmsg() clears the bit.
>>> >> >tcp_poll() sets the bit when stream is not writeable.
>>> >> >
>>> >> >We can avoid two atomic operations by first checking if socket
>>> >> >is actually interested in the FASYNC business (most sockets in
>>> >> >real applications do not use AIO, but select()/poll()/epoll())
>>> >> >
>>> >> >This also removes one cache line miss to access sk->sk_wq->flags
>>> >> >in tcp_sendmsg()
>>> >> >
>>> >> >Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> >> 
>>> >> I just bisected down to this. This is causing a regression for me when
>>> >> my nfs mount becomes stuck. I can easily reproduce this if you need to
>>> >> test the fix.
>>> >
>>> >What do you mean by 'when nfs mount becomes stuck' ?
>>> >
>>> >Is this patch making nfs not functional , or does it make recovery from
>>> >some nfs error bad ?
>>> 
>>> I can mount nfs on the host. But when I do something (compile a kernel
>>> module in my case), it gets stuck. Then I cannot even ssh to the machine.
>>> No messages in dmesg. I didn't debug it any further. I just bisected and
>>> verified that this patch caused this behaviour.
>>
>>Interesting.
>>
>>It looks like net/sunrpc/xprtsock.c should set SOCK_FASYNC
>>even if it is not actually using fasync_list
>>
>>Could you try this quick hack to check if this is the right way ?
>
>Yep, works, I do not see the issue with this patch anymore. Thanks.

Eric, any news with this issue?

Thanks.
diff mbox

Patch

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index a6c68dc086af83233ee315642638f4a1990ee622..b90c5397b5e137c6cc8accad6eebe2b876363d4e 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1950,6 +1950,7 @@  static int xs_local_finish_connecting(struct rpc_xprt *xprt,
 		sk->sk_user_data = xprt;
 		sk->sk_data_ready = xs_data_ready;
 		sk->sk_write_space = xs_udp_write_space;
+		sock_set_flag(sk, SOCK_FASYNC);
 		sk->sk_error_report = xs_error_report;
 		sk->sk_allocation = GFP_NOIO;
 
@@ -2136,6 +2137,7 @@  static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 		sk->sk_user_data = xprt;
 		sk->sk_data_ready = xs_data_ready;
 		sk->sk_write_space = xs_udp_write_space;
+		sock_set_flag(sk, SOCK_FASYNC);
 		sk->sk_allocation = GFP_NOIO;
 
 		xprt_set_connected(xprt);
@@ -2237,6 +2239,7 @@  static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 		sk->sk_data_ready = xs_tcp_data_ready;
 		sk->sk_state_change = xs_tcp_state_change;
 		sk->sk_write_space = xs_tcp_write_space;
+		sock_set_flag(sk, SOCK_FASYNC);
 		sk->sk_error_report = xs_error_report;
 		sk->sk_allocation = GFP_NOIO;