diff mbox

sock: honor GFP_ATOMIC when allocating send skbs

Message ID 1435583216-19877-1-git-send-email-Jason@zx2c4.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Jason A. Donenfeld June 29, 2015, 1:06 p.m. UTC
Some sockets set sock->sk->sk_allocation = GFP_ATOMIC. In spite of this,
functions that call sock_alloc_send_skb will then call
sock_alloc_send_pskb, which very often results in sleeping. Since the
intention of callers setting sk_allocation = GFP_ATOMIC might be to be
able to send from atomic context, we need to honor this and not sleep.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 net/core/sock.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Eric Dumazet June 29, 2015, 1:36 p.m. UTC | #1
On Mon, 2015-06-29 at 15:06 +0200, Jason A. Donenfeld wrote:
> Some sockets set sock->sk->sk_allocation = GFP_ATOMIC. In spite of this,
> functions that call sock_alloc_send_skb will then call
> sock_alloc_send_pskb, which very often results in sleeping. Since the
> intention of callers setting sk_allocation = GFP_ATOMIC might be to be
> able to send from atomic context, we need to honor this and not sleep.


What exact problem have you noticed ? We need details please.

GFP_ATOMIC in these path does not mean 'do not wait' but instead
'allocate from emergency pools'.

We already have many ways to state ' do not wait', maybe you should use
them.

> 
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
>  net/core/sock.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 1e1fe9a..f00e691 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1804,34 +1804,37 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
>  				     unsigned long data_len, int noblock,
>  				     int *errcode, int max_page_order)
>  {
>  	struct sk_buff *skb;
>  	long timeo;
>  	int err;
>  
>  	timeo = sock_sndtimeo(sk, noblock);
>  	for (;;) {
>  		err = sock_error(sk);
>  		if (err != 0)
>  			goto failure;
>  
>  		err = -EPIPE;
>  		if (sk->sk_shutdown & SEND_SHUTDOWN)
>  			goto failure;
>  
> +		if (sk->sk_allocation & GFP_ATOMIC)
> +			break;
> +

This is the wrong place to put this test, as following one is probably
the one that is hit most of the times (fast path)


>  		if (sk_wmem_alloc_get(sk) < sk->sk_sndbuf)
>  			break;


Anyway, testing for GFP_ATOMIC 'flag' is wrong.

You probably meant to test __GFP_WAIT instead, but you need to give more
details.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason A. Donenfeld June 29, 2015, 2 p.m. UTC | #2
Hi Eric,

Thanks for your feedback on this. The precise problem is a BUG in the
scheduler code when sleep is called from atomic context, due to having
to wait on that allocation. This is triggered by a module I'm writing.

I'm probably doing something I shouldn't... I'm calling udp_sendmsg
from atomic context: bad news bears, right? Really what I'd like to do
is create an skb with headroom, fill it with data, and then hand it
off to the udp/ip stack to prepend the udp and ip headers and
calculate the flowi4 and correct dev and checksums for the skb. But I
can't seem to find a correct API for this. So instead I'm using a
socket and calling sendmsg on it, which unfortunately winds up in an
extra allocation and a memcpy when sending. Might there be a better
way to do what I want? That is: pass a buffer off to the networking
stack, a) with zero copies [without having to use sendpage], and b)
from atomic context. I'd like to avoid using workqueues/kthreads/etc
if possible.

Thanks,
Jason
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/core/sock.c b/net/core/sock.c
index 1e1fe9a..f00e691 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1804,34 +1804,37 @@  struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 				     unsigned long data_len, int noblock,
 				     int *errcode, int max_page_order)
 {
 	struct sk_buff *skb;
 	long timeo;
 	int err;
 
 	timeo = sock_sndtimeo(sk, noblock);
 	for (;;) {
 		err = sock_error(sk);
 		if (err != 0)
 			goto failure;
 
 		err = -EPIPE;
 		if (sk->sk_shutdown & SEND_SHUTDOWN)
 			goto failure;
 
+		if (sk->sk_allocation & GFP_ATOMIC)
+			break;
+
 		if (sk_wmem_alloc_get(sk) < sk->sk_sndbuf)
 			break;
 
 		set_bit(SOCK_ASYNC_NOSPACE, &sk->sk_socket->flags);
 		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 		err = -EAGAIN;
 		if (!timeo)
 			goto failure;
 		if (signal_pending(current))
 			goto interrupted;
 		timeo = sock_wait_for_wmem(sk, timeo);
 	}
 	skb = alloc_skb_with_frags(header_len, data_len, max_page_order,
 				   errcode, sk->sk_allocation);
 	if (skb)
 		skb_set_owner_w(skb, sk);
 	return skb;