From patchwork Sun Dec 6 21:11:38 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rainer Weikusat X-Patchwork-Id: 553203 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 67DEC140291 for ; Mon, 7 Dec 2015 08:12:51 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754671AbbLFVM3 (ORCPT ); Sun, 6 Dec 2015 16:12:29 -0500 Received: from tiger.mobileactivedefense.com ([217.174.251.109]:51846 "EHLO tiger.mobileactivedefense.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754299AbbLFVLs (ORCPT ); Sun, 6 Dec 2015 16:11:48 -0500 Received: from doppelsaurus.mobileactivedefense.com (host-92-24-32-229.ppp.as43234.net [92.24.32.229]) (authenticated bits=0) by tiger.mobileactivedefense.com (8.13.5.20060308/8.13.5/Debian-3ubuntu1.1) with ESMTP id tB6LBhVe011254 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 6 Dec 2015 21:11:44 GMT Received: from doppelsaurus.mobileactivedefense.com (localhost [127.0.0.1]) by doppelsaurus.mobileactivedefense.com (8.14.4/8.14.4/Debian-4) with ESMTP id tB6LBcgc004070; Sun, 6 Dec 2015 21:11:38 GMT Received: (from rw@localhost) by doppelsaurus.mobileactivedefense.com (8.14.4/8.14.3/Submit) id tB6LBcK0004068; Sun, 6 Dec 2015 21:11:38 GMT From: Rainer Weikusat To: David Miller Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATH 02/02] af_unix: fix unix_dgram_recvmsg entry locking User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.4 (gnu/linux) Date: Sun, 06 Dec 2015 21:11:38 +0000 Message-ID: <8737vf7051.fsf@doppelsaurus.mobileactivedefense.com> MIME-Version: 1.0 X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (tiger.mobileactivedefense.com [217.174.251.109]); Sun, 06 Dec 2015 21:11:45 +0000 (GMT) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The current unix_dgram_recvsmg code acquires the u->readlock mutex in order to protect access to the peek offset prior to calling __skb_recv_datagram for actually receiving data. This implies that a blocking reader will go to sleep with this mutex held if there's presently no data to return to userspace. Two non-desirable side effects of this are that a later non-blocking read call on the same socket will block on the ->readlock mutex until the earlier blocking call releases it (or the readers is interrupted) and that later blocking read calls will wait longer than the effective socket read timeout says they should: The timeout will only start 'ticking' once such a reader hits the schedule_timeout in wait_for_more_packets (core.c) while the time it already had to wait until it could acquire the mutex is unaccounted for. The patch avoids both by using the __skb_try_recv_datagram and __skb_wait_for_more packets functions created by the first patch to implement a unix_dgram_recvmsg read loop which releases the readlock mutex prior to going to sleep and reacquires it as needed afterwards. Non-blocking readers will thus immediately return with -EAGAIN if there's no data available regardless of any concurrent blocking readers and all blocking readers will end up sleeping via schedule_timeout, thus honouring the configured socket receive timeout. Signed-Off-By: Rainer Weikusat --- -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 45aebd9..47dfa97 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2078,8 +2078,8 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, struct scm_cookie scm; struct sock *sk = sock->sk; struct unix_sock *u = unix_sk(sk); - int noblock = flags & MSG_DONTWAIT; - struct sk_buff *skb; + struct sk_buff *skb, *last; + long timeo; int err; int peeked, skip; @@ -2087,26 +2087,32 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, if (flags&MSG_OOB) goto out; - err = mutex_lock_interruptible(&u->readlock); - if (unlikely(err)) { - /* recvmsg() in non blocking mode is supposed to return -EAGAIN - * sk_rcvtimeo is not honored by mutex_lock_interruptible() - */ - err = noblock ? -EAGAIN : -ERESTARTSYS; - goto out; - } + timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT); - skip = sk_peek_offset(sk, flags); + do { + mutex_lock(&u->readlock); - skb = __skb_recv_datagram(sk, flags, &peeked, &skip, &err); - if (!skb) { + skip = sk_peek_offset(sk, flags); + skb = __skb_try_recv_datagram(sk, flags, &peeked, &skip, &err, + &last); + if (skb) + break; + + mutex_unlock(&u->readlock); + + if (err != -EAGAIN) + break; + } while (timeo && + !__skb_wait_for_more_packets(sk, &err, &timeo, last)); + + if (!skb) { /* implies readlock unlocked */ unix_state_lock(sk); /* Signal EOF on disconnected non-blocking SEQPACKET socket. */ if (sk->sk_type == SOCK_SEQPACKET && err == -EAGAIN && (sk->sk_shutdown & RCV_SHUTDOWN)) err = 0; unix_state_unlock(sk); - goto out_unlock; + goto out; } wake_up_interruptible_sync_poll(&u->peer_wait, @@ -2162,7 +2168,6 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, out_free: skb_free_datagram(sk, skb); -out_unlock: mutex_unlock(&u->readlock); out: return err;