From patchwork Fri Nov 11 15:50:21 2016
X-Patchwork-Submitter: Tim Gardner
X-Patchwork-Id: 693781
From: Tim Gardner
To: kernel-team@lists.ubuntu.com
Subject: [PATCH Trusty regression SRU] aio: fix reqs_available handling
Date: Fri, 11 Nov 2016 08:50:21 -0700
Message-Id: <1478879421-7864-1-git-send-email-tim.gardner@canonical.com>
X-Mailer: git-send-email 2.7.4

From: Benjamin LaHaise

BugLink: http://bugs.launchpad.net/bugs/1641129

As reported by Dan Aloni, commit f8567a3845ac ("aio: fix aio request
leak when events are reaped by userspace") introduces a regression when
user code attempts to perform io_submit() with more events than are
available in the ring buffer.  Reverting that commit would reintroduce a
regression when user space event reaping is used.

Fixing this bug is a bit more involved than the previous attempts to fix
this regression.  Since we do not have a single point at which we can
count events as being reaped by user space and io_getevents() is merely
a special case of user space event reaping, we have to track event
completion by looking at the number of events left in the event ring.
So long as there are as many events in the ring buffer as there have
been completion events generated, we cannot call put_reqs_available().
The code to check for this is now placed in refill_reqs_available().

A test program from Dan, modified by me, for verifying this bug is
available at http://www.kvack.org/~bcrl/20140824-aio_bug.c .

Reported-by: Dan Aloni
Signed-off-by: Benjamin LaHaise
Acked-by: Dan Aloni
Cc: Kent Overstreet
Cc: Mateusz Guzik
Cc: Petr Matousek
Cc: stable@vger.kernel.org      # v3.16 and anything that f8567a3845ac was backported to
Signed-off-by: Linus Torvalds
(cherry picked from commit d856f32a86b2b015ab180ab7a55e455ed8d3ccc5)
Signed-off-by: Tim Gardner
Acked-by: Colin Ian King
---
 fs/aio.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 73 insertions(+), 4 deletions(-)
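[ Reviewer note: the sketch below is NOT the referenced
  20140824-aio_bug.c; it is a minimal stand-in in the same spirit,
  assuming libaio (link with -laio).  The scratch file path, ring
  size, batch size and pass count are illustrative values, not taken
  from the original test.  It repeatedly pushes batches larger than
  the requested ring size through a small completion ring; on an
  affected kernel this pattern can leave reqs_available depleted so
  that io_submit() keeps failing with EAGAIN even after prior events
  have been reaped, while a fixed kernel completes every pass. ]

#include <errno.h>
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define RING_SIZE 32               /* nr_events passed to io_setup() */
#define BATCH     (RING_SIZE * 2)  /* more events than the ring holds */
#define PASSES    1000

int main(void)
{
	io_context_t ctx = 0;
	struct iocb cbs[BATCH], *cbps[BATCH];
	struct io_event evs[BATCH];
	static char buf[512];
	int fd, pass, i, ret;

	fd = open("/tmp/aio-scratch", O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	ret = io_setup(RING_SIZE, &ctx);
	if (ret < 0) {
		fprintf(stderr, "io_setup: %s\n", strerror(-ret));
		return 1;
	}

	for (pass = 0; pass < PASSES; pass++) {
		int submitted = 0, reaped = 0;

		for (i = 0; i < BATCH; i++) {
			io_prep_pwrite(&cbs[i], fd, buf, sizeof(buf), 0);
			cbps[i] = &cbs[i];
		}
		/* io_submit() may legitimately accept fewer than BATCH
		 * iocbs at once; what must not happen is a permanent
		 * -EAGAIN once earlier completions have been reaped.
		 */
		while (submitted < BATCH) {
			ret = io_submit(ctx, BATCH - submitted,
					&cbps[submitted]);
			if (ret > 0) {
				submitted += ret;
			} else if (ret == -EAGAIN && reaped < submitted) {
				/* ring full: reap before retrying */
				ret = io_getevents(ctx, 1, BATCH, evs, NULL);
				if (ret < 1) {
					fprintf(stderr, "io_getevents: %d\n",
						ret);
					return 1;
				}
				reaped += ret;
			} else {
				fprintf(stderr, "pass %d: io_submit: %s\n",
					pass, strerror(-ret));
				return 1;
			}
		}
		while (reaped < submitted) {
			ret = io_getevents(ctx, 1, BATCH, evs, NULL);
			if (ret < 1) {
				fprintf(stderr, "pass %d: io_getevents: %d\n",
					pass, ret);
				return 1;
			}
			reaped += ret;
		}
	}

	printf("%d passes of %d-event batches through a %d-slot ring\n",
	       PASSES, BATCH, RING_SIZE);
	io_destroy(ctx);
	close(fd);
	unlink("/tmp/aio-scratch");
	return 0;
}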
diff --git a/fs/aio.c b/fs/aio.c
index 775476b..db7adac 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -136,6 +136,7 @@ struct kioctx {
 
 	struct {
 		unsigned	tail;
+		unsigned	completed_events;
 		spinlock_t	completion_lock;
 	} ____cacheline_aligned_in_smp;
 
@@ -876,6 +877,68 @@ out:
 	return ret;
 }
 
+/* refill_reqs_available
+ *	Updates the reqs_available reference counts used for tracking the
+ *	number of free slots in the completion ring.  This can be called
+ *	from aio_complete() (to optimistically update reqs_available) or
+ *	from aio_get_req() (the we're out of events case).  It must be
+ *	called holding ctx->completion_lock.
+ */
+static void refill_reqs_available(struct kioctx *ctx, unsigned head,
+				  unsigned tail)
+{
+	unsigned events_in_ring, completed;
+
+	/* Clamp head since userland can write to it. */
+	head %= ctx->nr_events;
+	if (head <= tail)
+		events_in_ring = tail - head;
+	else
+		events_in_ring = ctx->nr_events - (head - tail);
+
+	completed = ctx->completed_events;
+	if (events_in_ring < completed)
+		completed -= events_in_ring;
+	else
+		completed = 0;
+
+	if (!completed)
+		return;
+
+	ctx->completed_events -= completed;
+	put_reqs_available(ctx, completed);
+}
+
+/* user_refill_reqs_available
+ *	Called to refill reqs_available when aio_get_req() encounters an
+ *	out of space in the completion ring.
+ */
+static void user_refill_reqs_available(struct kioctx *ctx)
+{
+	spin_lock_irq(&ctx->completion_lock);
+	if (ctx->completed_events) {
+		struct aio_ring *ring;
+		unsigned head;
+
+		/* Access of ring->head may race with aio_read_events_ring()
+		 * here, but that's okay since whether we read the old version
+		 * or the new version, and either will be valid.  The important
+		 * part is that head cannot pass tail since we prevent
+		 * aio_complete() from updating tail by holding
+		 * ctx->completion_lock.  Even if head is invalid, the check
+		 * against ctx->completed_events below will make sure we do the
+		 * safe/right thing.
+		 */
+		ring = kmap_atomic(ctx->ring_pages[0]);
+		head = ring->head;
+		kunmap_atomic(ring);
+
+		refill_reqs_available(ctx, head, ctx->tail);
+	}
+
+	spin_unlock_irq(&ctx->completion_lock);
+}
+
 /* aio_get_req
  *	Allocate a slot for an aio request.
  *	Returns NULL if no requests are free.
@@ -884,8 +947,11 @@ static inline struct kiocb *aio_get_req(struct kioctx *ctx)
 {
 	struct kiocb *req;
 
-	if (!get_reqs_available(ctx))
-		return NULL;
+	if (!get_reqs_available(ctx)) {
+		user_refill_reqs_available(ctx);
+		if (!get_reqs_available(ctx))
+			return NULL;
+	}
 
 	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
 	if (unlikely(!req))
@@ -944,8 +1010,8 @@ void aio_complete(struct kiocb *iocb, long res, long res2)
 	struct kioctx	*ctx = iocb->ki_ctx;
 	struct aio_ring	*ring;
 	struct io_event	*ev_page, *event;
+	unsigned tail, pos, head;
 	unsigned long	flags;
-	unsigned tail, pos;
 
 	/*
 	 * Special case handling for sync iocbs:
@@ -1006,10 +1072,14 @@ void aio_complete(struct kiocb *iocb, long res, long res2)
 	ctx->tail = tail;
 
 	ring = kmap_atomic(ctx->ring_pages[0]);
+	head = ring->head;
 	ring->tail = tail;
 	kunmap_atomic(ring);
 	flush_dcache_page(ctx->ring_pages[0]);
 
+	ctx->completed_events++;
+	if (ctx->completed_events > 1)
+		refill_reqs_available(ctx, head, tail);
 	spin_unlock_irqrestore(&ctx->completion_lock, flags);
 
 	pr_debug("added to ring %p at [%u]\n", iocb, tail);
@@ -1024,7 +1094,6 @@ void aio_complete(struct kiocb *iocb, long res, long res2)
 
 	/* everything turned out well, dispose of the aiocb. */
 	kiocb_free(iocb);
-	put_reqs_available(ctx, 1);
 
 	/*
 	 * We have to order our ring_info tail store above and test
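[ Reviewer note: a standalone restatement of the accounting that
  refill_reqs_available() performs, for anyone checking the wraparound
  case by hand.  The helper name and the numbers below are made up for
  illustration; only the arithmetic mirrors the patch. ]

#include <assert.h>
#include <stdio.h>

static unsigned events_in_ring(unsigned head, unsigned tail,
			       unsigned nr_events)
{
	head %= nr_events;	/* clamp: userland can scribble on head */
	return (head <= tail) ? tail - head
			      : nr_events - (head - tail);
}

int main(void)
{
	/* unwrapped: head 10, tail 50 -> 40 events waiting to be reaped */
	assert(events_in_ring(10, 50, 128) == 40);
	/* wrapped: tail lapped the end; head 120, tail 4 -> 12 events */
	assert(events_in_ring(120, 4, 128) == 12);

	/* Only completions beyond those still sitting in the ring may
	 * be returned to reqs_available, exactly as the patch computes:
	 * completed_events minus events_in_ring, floored at zero.
	 */
	unsigned completed = 60, in_ring = events_in_ring(10, 50, 128);
	unsigned refill = completed > in_ring ? completed - in_ring : 0;
	printf("refill %u of %u completions\n", refill, completed);
	return 0;
}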