From patchwork Thu Aug 2 04:18:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Axtens X-Patchwork-Id: 952571 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41gxl652Bfz9s4s; Thu, 2 Aug 2018 14:19:14 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1fl54u-0007UG-GX; Thu, 02 Aug 2018 04:19:08 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1fl54q-0007RX-Rj for kernel-team@lists.canonical.com; Thu, 02 Aug 2018 04:19:04 +0000 Received: from mail-pf1-f199.google.com ([209.85.210.199]) by youngberry.canonical.com with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1fl54q-0001xG-Fu for kernel-team@lists.canonical.com; Thu, 02 Aug 2018 04:19:04 +0000 Received: by mail-pf1-f199.google.com with SMTP id v9-v6so663396pff.4 for ; Wed, 01 Aug 2018 21:19:04 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=FgQRH5DQyUhr63MqVzh1kOlSYwhvgg6twWAG/10rBWk=; b=DWLu5nQVJ25OyXV7recqGOCkira2+mGEklLCHqHGP/lVAaymSI0xvKlazEN+Vgda8i DBECxRt7/vU1MrlATeO4pIqz7R8CRE1CykaV+oDs/rh2MF8LKG5/yxobYLqG/xTbDz+4 EvxefK3mZVVPlh1cQ/koLmK4WCAdR5uyvd+JoJFCZPpFLSyIPVRbxspZuZPzEcWLdroR BabjeL/WdDLV/knuQcerTP3rLbYdGZAMcTsnmqBHnJbSFvVMo//4/qMyzmunnglUowC3 0qJYNYIRjmnpS+a4+OqmC/yKBDoC0G16/i1RQle5OdEFXNpGiYbyf8KU19w1buO6rwOf JOkQ== X-Gm-Message-State: AOUpUlGs9SwNDbaXEzQXQoty1qjoDQ5Z26JX5QFH4gz+2YiTOOq0PtP7 wmo4rViODTiFaUQzV6y/P+kC5s5xoJjG/XQUoa8P5N32yzI1ML7YTVradvU3SCGxUwBcjbx+uH6 8l4vuB4PiD1K7fpjqY7tDaZv9BEKsz71G08NB0pzGrBhF1Kze X-Received: by 2002:a65:6109:: with SMTP id z9-v6mr1074307pgu.243.1533183543169; Wed, 01 Aug 2018 21:19:03 -0700 (PDT) X-Google-Smtp-Source: AAOMgpeMQ2+TUPhKTJC+RaWmH3Si7+3UjbgcVQ3gcUoXxwzgmnIIG1PvxTtRBlkH655RJJkgbJU8eQ== X-Received: by 2002:a65:6109:: with SMTP id z9-v6mr1074297pgu.243.1533183542970; Wed, 01 Aug 2018 21:19:02 -0700 (PDT) Received: from localhost.localdomain (124-171-193-200.dyn.iinet.net.au. [124.171.193.200]) by smtp.gmail.com with ESMTPSA id p73-v6sm952509pfk.186.2018.08.01.21.19.01 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 01 Aug 2018 21:19:02 -0700 (PDT) From: Daniel Axtens To: kernel-team@lists.canonical.com Subject: [SRU T][PATCH 4/6] fscache: Fix reference overput in fscache_attach_object() error handling Date: Thu, 2 Aug 2018 14:18:48 +1000 Message-Id: <20180802041850.22961-5-daniel.axtens@canonical.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180802041850.22961-1-daniel.axtens@canonical.com> References: <20180802041850.22961-1-daniel.axtens@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Kiran Kumar Modukuri BugLink: https://bugs.launchpad.net/bugs/1776277 When a cookie is allocated that causes fscache_object structs to be allocated, those objects are initialised with the cookie pointer, but aren't blessed with a ref on that cookie unless the attachment is successfully completed in fscache_attach_object(). If attachment fails because the parent object was dying or there was a collision, fscache_attach_object() returns without incrementing the cookie counter - but upon failure of this function, the object is released which then puts the cookie, whether or not a ref was taken on the cookie. Fix this by taking a ref on the cookie when it is assigned in fscache_object_init(), even when we're creating a root object. Analysis from Kiran Kumar: This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277 fscache cookie ref count updated incorrectly during fscache object allocation resulting in following Oops. kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321! kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639! [Cause] Two threads are trying to do operate on a cookie and two objects. (1) One thread tries to unmount the filesystem and in process goes over a huge list of objects marking them dead and deleting the objects. cookie->usage is also decremented in following path: nfs_fscache_release_super_cookie -> __fscache_relinquish_cookie ->__fscache_cookie_put ->BUG_ON(atomic_read(&cookie->usage) <= 0); (2) A second thread tries to lookup an object for reading data in following path: fscache_alloc_object 1) cachefiles_alloc_object -> fscache_object_init -> assign cookie, but usage not bumped. 2) fscache_attach_object -> fails in cant_attach_object because the cookie's backing object or cookie's->parent object are going away 3) fscache_put_object -> cachefiles_put_object ->fscache_object_destroy ->fscache_cookie_put ->BUG_ON(atomic_read(&cookie->usage) <= 0); [NOTE from dhowells] It's unclear as to the circumstances in which (2) can take place, given that thread (1) is in nfs_kill_super(), however a conflicting NFS mount with slightly different parameters that creates a different superblock would do it. A backtrace from Kiran seems to show that this is a possibility: kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639! ... RIP: __fscache_cookie_put+0x3a/0x40 [fscache] Call Trace: __fscache_relinquish_cookie+0x87/0x120 [fscache] nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs] nfs_kill_super+0x29/0x40 [nfs] deactivate_locked_super+0x48/0x80 deactivate_super+0x5c/0x60 cleanup_mnt+0x3f/0x90 __cleanup_mnt+0x12/0x20 task_work_run+0x86/0xb0 exit_to_usermode_loop+0xc2/0xd0 syscall_return_slowpath+0x4e/0x60 int_ret_from_sys_call+0x25/0x9f [Fix] Bump up the cookie usage in fscache_object_init, when it is first being assigned a cookie atomically such that the cookie is added and bumped up if its refcount is not zero. Remove the assignment in fscache_attach_object(). [Testcase] I have run ~100 hours of NFS stress tests and not seen this bug recur. [Regression Potential] - Limited to fscache/cachefiles. Fixes: ccc4fc3d11e9 ("FS-Cache: Implement the cookie management part of the netfs API") Signed-off-by: Kiran Kumar Modukuri Signed-off-by: David Howells (backported from commit f29507ce66701084c39aeb1b0ae71690cbff3554) Signed-off-by: Daniel Axtens --- fs/cachefiles/bind.c | 3 ++- fs/fscache/cache.c | 2 +- fs/fscache/cookie.c | 7 ++++--- fs/fscache/object.c | 1 + 4 files changed, 8 insertions(+), 5 deletions(-) diff --git a/fs/cachefiles/bind.c b/fs/cachefiles/bind.c index 622f4696e484..2a5b374aff9c 100644 --- a/fs/cachefiles/bind.c +++ b/fs/cachefiles/bind.c @@ -219,7 +219,8 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache) "%s", fsdef->dentry->d_sb->s_id); - fscache_object_init(&fsdef->fscache, NULL, &cache->cache); + fscache_object_init(&fsdef->fscache, &fscache_fsdef_index, + &cache->cache); ret = fscache_add_cache(&cache->cache, &fsdef->fscache, cache->tag); if (ret < 0) diff --git a/fs/fscache/cache.c b/fs/fscache/cache.c index f7cff367db7f..5d5c1987b841 100644 --- a/fs/fscache/cache.c +++ b/fs/fscache/cache.c @@ -220,6 +220,7 @@ int fscache_add_cache(struct fscache_cache *cache, { struct fscache_cache_tag *tag; + ASSERTCMP(ifsdef->cookie, ==, &fscache_fsdef_index); BUG_ON(!cache->ops); BUG_ON(!ifsdef); @@ -248,7 +249,6 @@ int fscache_add_cache(struct fscache_cache *cache, if (!cache->kobj) goto error; - ifsdef->cookie = &fscache_fsdef_index; ifsdef->cache = cache; cache->fsdef = ifsdef; diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c index 29d7feb62cf7..269773d7e698 100644 --- a/fs/fscache/cookie.c +++ b/fs/fscache/cookie.c @@ -302,6 +302,7 @@ static int fscache_alloc_object(struct fscache_cache *cache, goto error; } + ASSERTCMP(object->cookie, ==, cookie); fscache_stat(&fscache_n_object_alloc); object->debug_id = atomic_inc_return(&fscache_object_debug_id); @@ -356,6 +357,8 @@ static int fscache_attach_object(struct fscache_cookie *cookie, _enter("{%s},{OBJ%x}", cookie->def->name, object->debug_id); + ASSERTCMP(object->cookie, ==, cookie); + spin_lock(&cookie->lock); /* there may be multiple initial creations of this object, but we only @@ -395,9 +398,7 @@ static int fscache_attach_object(struct fscache_cookie *cookie, spin_unlock(&cache->object_list_lock); } - /* attach to the cookie */ - object->cookie = cookie; - atomic_inc(&cookie->usage); + /* Attach to the cookie. The object already has a ref on it. */ hlist_add_head(&object->cookie_link, &cookie->backing_objects); fscache_objlist_add(object); diff --git a/fs/fscache/object.c b/fs/fscache/object.c index 53d35c504240..0de8ad801acb 100644 --- a/fs/fscache/object.c +++ b/fs/fscache/object.c @@ -313,6 +313,7 @@ void fscache_object_init(struct fscache_object *object, object->store_limit_l = 0; object->cache = cache; object->cookie = cookie; + atomic_inc(&cookie->usage); object->parent = NULL; object->oob_event_mask = 0;