From patchwork Mon Jun 4 06:42:16 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 924829
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Cc: carmark.dlut@gmail.com
Subject: [SRU T/X/A/B][C][PATCH 1/1] UBUNTU: SAUCE: CacheFiles: fix a read_waiter/read_copier race
Date: Mon, 4 Jun 2018 16:42:16 +1000
Message-Id: <20180604064216.32075-2-daniel.axtens@canonical.com>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180604064216.32075-1-daniel.axtens@canonical.com>
References: <20180604064216.32075-1-daniel.axtens@canonical.com>
List-Id: Kernel team discussions

From: Lei Xue

BugLink: https://bugs.launchpad.net/bugs/1774336

There is a potential race in fscache operation enqueuing for reading and
copying multiple pages from cachefiles to netfs.

If this race occurs, an oops similar to the following is seen:

[585042.202316] FS-Cache:
[585042.202343] FS-Cache: Assertion failed
[585042.202367] FS-Cache: 6 == 5 is false
[585042.202452] ------------[ cut here ]------------
[585042.202480] kernel BUG at fs/fscache/operation.c:494!
...
[585042.209600] Call Trace:
[585042.211233]  [] fscache_op_work_func+0x2a/0x50 [fscache]
[585042.212677]  [] process_one_work+0x150/0x3f0
[585042.213550]  [] worker_thread+0x11a/0x470
...

The race occurs in the following situation:

One thread is in cachefiles_read_waiter:

 1) object->work_lock is taken.
 2) the operation is added to the to_do list.
 3) object->work_lock is dropped.
 4) fscache_enqueue_retrieval is called, which takes a reference.

Another thread is in cachefiles_read_copier:

 1) object->work_lock is taken.
 2) an item is popped off the to_do list.
 3) object->work_lock is dropped.
 4) some processing is done on the item, and fscache_put_retrieval()
    is called, dropping a reference.

Now if this process in cachefiles_read_copier takes place *between*
steps 3 and 4 in cachefiles_read_waiter, a reference will be dropped
before it is taken. The object's reference count then hits zero, so
lifecycle events for the object happen too soon, leading to the
assertion failure later on.

Move fscache_enqueue_retrieval under the lock in
cachefiles_read_waiter. This means that the object cannot be popped
off the to_do list until it is in a fully consistent state with the
reference taken.

Signed-off-by: Lei Xue
Reviewed-by: Daniel Axtens
[dja: rewrite and expand commit message]

(From https://www.redhat.com/archives/linux-cachefs/2018-February/msg00000.html

This patch has been sitting on the mailing list for months with no
response from the maintainer.
A similar patch fixing the same issue was posted as far back as May
2017, and likewise had no response:
https://www.redhat.com/archives/linux-cachefs/2017-May/msg00002.html

I poked the list recently and also got nothing:
https://www.redhat.com/archives/linux-cachefs/2018-May/msg00000.html
and the problem was again reported and this patch validated by another
user:
https://www.redhat.com/archives/linux-cachefs/2018-May/msg00001.html

Hence the submission as a sauce patch.)
Signed-off-by: Daniel Axtens
Acked-by: Stefan Bader
---
 fs/cachefiles/rdwr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index c0f3da3926a0..d9bb47d1e16d 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -58,9 +58,9 @@ static int cachefiles_read_waiter(wait_queue_t *wait, unsigned mode,
 	spin_lock(&object->work_lock);
 	list_add_tail(&monitor->op_link, &monitor->op->to_do);
+	fscache_enqueue_retrieval(monitor->op);
 	spin_unlock(&object->work_lock);
-	fscache_enqueue_retrieval(monitor->op);
 	return 0;
 }
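
For readers who want to see the ordering problem from the commit message in isolation, here is a minimal user-space sketch. It is not kernel code. It replays, in a single thread, the one interleaving described above (the copier running between steps 3 and 4 of the waiter) and compares it with the ordering after the patch. The struct op_model type, its fields, the helper names and the starting reference count of 1 are all assumptions made for illustration; the real fscache_retrieval structure and its reference handling are more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a retrieval operation. Only the state the
 * race depends on is modelled; none of these names exist in the kernel. */
struct op_model {
	int usage;    /* reference count (assumed to start at 1) */
	bool queued;  /* true while the item sits on the to_do list */
	bool freed;   /* set once usage has dropped to zero */
};

static void get_ref(struct op_model *op)
{
	op->usage++;
}

static void put_ref(struct op_model *op)
{
	if (--op->usage == 0)
		op->freed = true;
}

/* Models the copier: pop the item if it is queued, then drop a reference. */
static void copier_run(struct op_model *op)
{
	if (!op->queued)
		return;
	op->queued = false;
	put_ref(op);
}

/* Ordering before the patch: the item is queued, the lock is dropped,
 * the copier runs in the window, and only then is the reference taken.
 * Returns true if the count hit zero before the reference was taken. */
bool replay_old_order(void)
{
	struct op_model op = { .usage = 1, .queued = false, .freed = false };
	bool freed_early;

	op.queued = true;        /* waiter steps 1-3 */
	copier_run(&op);         /* copier slips into the window */
	freed_early = op.freed;
	get_ref(&op);            /* waiter step 4, too late */
	return freed_early;
}

/* Ordering after the patch: queueing and taking the reference both
 * happen before the lock is dropped, so the copier can only run after. */
bool replay_new_order(void)
{
	struct op_model op = { .usage = 1, .queued = false, .freed = false };

	op.queued = true;        /* still under the lock */
	get_ref(&op);            /* reference taken under the lock */
	copier_run(&op);         /* copier runs after the unlock */
	return op.freed;
}
```

Under these assumptions replay_old_order() reports that the count reached zero early, while replay_new_order() does not. The sketch only demonstrates the ordering argument; it does not model the spinlock itself or any other interleaving.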