From patchwork Mon Sep 1 01:30:27 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Willem de Bruijn X-Patchwork-Id: 384607 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3C1171401AB for ; Mon, 1 Sep 2014 11:30:37 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752020AbaIABad (ORCPT ); Sun, 31 Aug 2014 21:30:33 -0400 Received: from mail-pa0-f73.google.com ([209.85.220.73]:57154 "EHLO mail-pa0-f73.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751679AbaIABac (ORCPT ); Sun, 31 Aug 2014 21:30:32 -0400 Received: by mail-pa0-f73.google.com with SMTP id kx10so1867115pab.4 for ; Sun, 31 Aug 2014 18:30:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:cc:subject:date:message-id; bh=+ey/z8W8MVyb8f4f6GDYB/x9B6GGu7Oc46UwIXnprLI=; b=TedHy5UDRHTGx6pT6iNv8lTtU2vWJshs1E+c1ADgz7yzVbeRjVoVfE88DnAYu2ahEE cW+FXRomeiYBUgSHKEoHCh7ibaz3Hpty8hIWxImP0AxmbpgHNh3Qy0sLE34I+F37UdGP tGr61tQUUuscBNoq0S5Zx+dQPWb7Cy/iQ92v8DT2oHgO1C6W1TTKJymnKQqqjTNS0pfO jF4d1p5kKg4V5vHEfCLNP8tnYeRS8BqRI65QzrBE1yim8I5vAMKNT3aOMLjBuCL21/YF MqixalMptY26GTu3C5zQkGmPafFbRfvmoIXx+3zqOdgknSFuR5Oa82nWhjG+NnFGwvBr jlvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=+ey/z8W8MVyb8f4f6GDYB/x9B6GGu7Oc46UwIXnprLI=; b=fLu/OE0TjAc+X56/q7eEt4Ed0AnAO9EjNnEnLz2+DKullbW5KMz1G633U0i0OHPJZK oTsdAcT7tedS5/L1+ZpO6PX6ORVrkZwMp2y1t6YV5en1b9UNT0/DSm6Z6lJvHPki6F6m yNlaSTvEnwF+SFtZYiO6+mdqtMR2cY1cfM/jOClt1ZXL7t3FfMi3GRh2YetD4N1mV/YN nrilnNvYGzFcVHOVwpKv/2zp+bn0+tZVdb3zIcZtR8gDS/ZUC0C8VIoHXnVfF709k0Tm LjsCRlNjWZ1XDZ9JpXHhlOraY5bnqdJZiGFi/7PNXw7jrrADsxsHEaqY5/Pc5wUNXOat bfRg== X-Gm-Message-State: ALoCoQk+XBeDVF5rd6T0edLPOcK2X660/K5Au+Xz5Nv4njH5e1t4D0VK2d4N0WjwipoZL1FNq46T X-Received: by 10.66.248.232 with SMTP id yp8mr13883030pac.22.1409535031421; Sun, 31 Aug 2014 18:30:31 -0700 (PDT) Received: from corp2gmr1-1.hot.corp.google.com (corp2gmr1-1.hot.corp.google.com [172.24.189.92]) by gmr-mx.google.com with ESMTPS id t28si334372yhb.4.2014.08.31.18.30.31 for (version=TLSv1.1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sun, 31 Aug 2014 18:30:31 -0700 (PDT) Received: from gopher.nyc.corp.google.com (gopher.nyc.corp.google.com [172.26.106.37]) by corp2gmr1-1.hot.corp.google.com (Postfix) with ESMTP id 1FCCA31C1B4; Sun, 31 Aug 2014 18:30:31 -0700 (PDT) Received: by gopher.nyc.corp.google.com (Postfix, from userid 29878) id B3EE5C08CD; Sun, 31 Aug 2014 21:30:30 -0400 (EDT) From: Willem de Bruijn To: netdev@vger.kernel.org Cc: davem@davemloft.net, Willem de Bruijn Subject: [PATCH net-next] sock: deduplicate errqueue dequeue Date: Sun, 31 Aug 2014 21:30:27 -0400 Message-Id: <1409535027-16963-1-git-send-email-willemb@google.com> X-Mailer: git-send-email 2.1.0.rc2.206.gedb03e5 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org sk->sk_error_queue is dequeued in four locations. All share the exact same logic. Deduplicate. Also collapse the two critical sections for dequeue (at the top of the recv handler) and signal (at the bottom). This moves signal generation for the next packet forward, which should be harmless. It also changes the behavior if the recv handler exits early with an error. Previously, a signal for follow-up packets on the errqueue would then not be scheduled. The new behavior, to always signal, is arguably a bug fix. For rxrpc, the change causes the same function to be called repeatedly for each queued packet (because the recv handler == sk_error_report). It is likely that all packets will fail for the same reason (e.g., memory exhaustion). This code runs without sk_lock held, so it is not safe to trust that sk->sk_err is immutable inbetween releasing q->lock and the subsequent test. Introduce int err just to avoid this potential race. Signed-off-by: Willem de Bruijn --- include/net/sock.h | 1 + net/core/skbuff.c | 20 ++++++++++++++++++++ net/core/sock.c | 14 ++------------ net/ipv4/ip_sockglue.c | 15 ++------------- net/ipv6/datagram.c | 15 ++------------- net/rxrpc/ar-error.c | 14 +------------- 6 files changed, 28 insertions(+), 51 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 7f2ab72..3fde613 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2041,6 +2041,7 @@ void sk_stop_timer(struct sock *sk, struct timer_list *timer); int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb); int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb); +struct sk_buff *sock_dequeue_err_skb(struct sock *sk); /* * Recover an error report and clear atomically diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 163b673..53ce536 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3491,6 +3491,26 @@ int sock_queue_err_skb(struct sock *sk, struct sk_buff *skb) } EXPORT_SYMBOL(sock_queue_err_skb); +struct sk_buff *sock_dequeue_err_skb(struct sock *sk) +{ + struct sk_buff_head *q = &sk->sk_error_queue; + struct sk_buff *skb, *skb_next; + int err = 0; + + spin_lock_bh(&q->lock); + skb = __skb_dequeue(q); + if (skb && (skb_next = skb_peek(q))) + err = SKB_EXT_ERR(skb_next)->ee.ee_errno; + spin_unlock_bh(&q->lock); + + sk->sk_err = err; + if (err) + sk->sk_error_report(sk); + + return skb; +} +EXPORT_SYMBOL(sock_dequeue_err_skb); + void __skb_tstamp_tx(struct sk_buff *orig_skb, struct skb_shared_hwtstamps *hwtstamps, struct sock *sk, int tstype) diff --git a/net/core/sock.c b/net/core/sock.c index f7f2352..f1a638e 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2488,11 +2488,11 @@ int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len, int level, int type) { struct sock_exterr_skb *serr; - struct sk_buff *skb, *skb2; + struct sk_buff *skb; int copied, err; err = -EAGAIN; - skb = skb_dequeue(&sk->sk_error_queue); + skb = sock_dequeue_err_skb(sk); if (skb == NULL) goto out; @@ -2513,16 +2513,6 @@ int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len, msg->msg_flags |= MSG_ERRQUEUE; err = copied; - /* Reset and regenerate socket error */ - spin_lock_bh(&sk->sk_error_queue.lock); - sk->sk_err = 0; - if ((skb2 = skb_peek(&sk->sk_error_queue)) != NULL) { - sk->sk_err = SKB_EXT_ERR(skb2)->ee.ee_errno; - spin_unlock_bh(&sk->sk_error_queue.lock); - sk->sk_error_report(sk); - } else - spin_unlock_bh(&sk->sk_error_queue.lock); - out_free_skb: kfree_skb(skb); out: diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index 5cb830c..455e75b 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -405,7 +405,7 @@ void ip_local_error(struct sock *sk, int err, __be32 daddr, __be16 port, u32 inf int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) { struct sock_exterr_skb *serr; - struct sk_buff *skb, *skb2; + struct sk_buff *skb; DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name); struct { struct sock_extended_err ee; @@ -415,7 +415,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) int copied; err = -EAGAIN; - skb = skb_dequeue(&sk->sk_error_queue); + skb = sock_dequeue_err_skb(sk); if (skb == NULL) goto out; @@ -462,17 +462,6 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) msg->msg_flags |= MSG_ERRQUEUE; err = copied; - /* Reset and regenerate socket error */ - spin_lock_bh(&sk->sk_error_queue.lock); - sk->sk_err = 0; - skb2 = skb_peek(&sk->sk_error_queue); - if (skb2 != NULL) { - sk->sk_err = SKB_EXT_ERR(skb2)->ee.ee_errno; - spin_unlock_bh(&sk->sk_error_queue.lock); - sk->sk_error_report(sk); - } else - spin_unlock_bh(&sk->sk_error_queue.lock); - out_free_skb: kfree_skb(skb); out: diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c index 1844e87..2cdc383 100644 --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -332,7 +332,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) { struct ipv6_pinfo *np = inet6_sk(sk); struct sock_exterr_skb *serr; - struct sk_buff *skb, *skb2; + struct sk_buff *skb; DECLARE_SOCKADDR(struct sockaddr_in6 *, sin, msg->msg_name); struct { struct sock_extended_err ee; @@ -342,7 +342,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) int copied; err = -EAGAIN; - skb = skb_dequeue(&sk->sk_error_queue); + skb = sock_dequeue_err_skb(sk); if (skb == NULL) goto out; @@ -415,17 +415,6 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len) msg->msg_flags |= MSG_ERRQUEUE; err = copied; - /* Reset and regenerate socket error */ - spin_lock_bh(&sk->sk_error_queue.lock); - sk->sk_err = 0; - if ((skb2 = skb_peek(&sk->sk_error_queue)) != NULL) { - sk->sk_err = SKB_EXT_ERR(skb2)->ee.ee_errno; - spin_unlock_bh(&sk->sk_error_queue.lock); - sk->sk_error_report(sk); - } else { - spin_unlock_bh(&sk->sk_error_queue.lock); - } - out_free_skb: kfree_skb(skb); out: diff --git a/net/rxrpc/ar-error.c b/net/rxrpc/ar-error.c index db57458..74c0fcd 100644 --- a/net/rxrpc/ar-error.c +++ b/net/rxrpc/ar-error.c @@ -37,7 +37,7 @@ void rxrpc_UDP_error_report(struct sock *sk) _enter("%p{%d}", sk, local->debug_id); - skb = skb_dequeue(&sk->sk_error_queue); + skb = sock_dequeue_err_skb(sk); if (!skb) { _leave("UDP socket errqueue empty"); return; @@ -111,18 +111,6 @@ void rxrpc_UDP_error_report(struct sock *sk) skb_queue_tail(&trans->error_queue, skb); rxrpc_queue_work(&trans->error_handler); - /* reset and regenerate socket error */ - spin_lock_bh(&sk->sk_error_queue.lock); - sk->sk_err = 0; - skb = skb_peek(&sk->sk_error_queue); - if (skb) { - sk->sk_err = SKB_EXT_ERR(skb)->ee.ee_errno; - spin_unlock_bh(&sk->sk_error_queue.lock); - sk->sk_error_report(sk); - } else { - spin_unlock_bh(&sk->sk_error_queue.lock); - } - _leave(""); }