From patchwork Thu Sep 20 05:51:33 2018
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 972144
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Cc: kiran.modukuri@gmail.com
Subject: [SRU X][PATCH 1/1] UBUNTU: SAUCE: cachefiles: Page leaking in
 cachefiles_read_backing_file while vmscan is active
Date: Thu, 20 Sep 2018 15:51:33 +1000
Message-Id: <20180920055133.3402-2-daniel.axtens@canonical.com>
In-Reply-To: <20180920055133.3402-1-daniel.axtens@canonical.com>
References: <20180920055133.3402-1-daniel.axtens@canonical.com>

From: Kiran Kumar Modukuri

BugLink: https://bugs.launchpad.net/bugs/1793430

[Description]

In a heavily loaded system where the system pagecache is nearing memory
limits and fscache is enabled, pages can be leaked by fscache while trying
to read pages from the cachefiles backend. This can happen because two
applications can be reading the same page from a single mount: two threads
can try to read the backing page at the same time, and one of them finds
that a page for the backing file or netfs file is already in the radix
tree. During the error handling, cachefiles does not clean up its
reference on the backing page, leading to a page leak.

[Fix]

The fix is straightforward: decrement the reference when the error is
encountered.

[Testing]

I have tested the fix using the following method for 12+ hours.

1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc :/export /mnt/nfs

2) Create 10000 files of 2.8 MB each in the NFS mount.

3) Start a thread to simulate heavy VM pressure:

   (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done) &

4) Start multiple parallel readers for the data set at the same time:

   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   ..
   ..
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
   find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &

5) Finally, check that all pages are freed using:

   cat /proc/fs/fscache/stats | grep -i pages ; free -h
   cat /proc/meminfo
   page-types -r -b lru

Reviewed-by: Daniel Axtens
Signed-off-by: Shantanu Goel
Signed-off-by: Kiran Kumar Modukuri
[dja: forward ported to current upstream]
Signed-off-by: Daniel Axtens
[backported from
 https://www.redhat.com/archives/linux-cachefs/2018-August/msg00007.html
 This is v2 of the patch. It has sat on the list for weeks without any
 response or forward progress. v1 was first posted in 2014 and reposted
 this August.]
Signed-off-by: Daniel Axtens
---
 fs/cachefiles/rdwr.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 5b68cf526887..95f3d227bbca 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -275,6 +275,8 @@ static int cachefiles_read_backing_file_one(struct cachefiles_object *object,
 			goto installed_new_backing_page;
 		if (ret != -EEXIST)
 			goto nomem_page;
+		page_cache_release(newpage);
+		newpage = NULL;
 	}

 	/* we've installed a new backing page, so now we need to start
@@ -513,6 +515,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 				goto installed_new_backing_page;
 			if (ret != -EEXIST)
 				goto nomem;
+			page_cache_release(newpage);
+			newpage = NULL;
 		}

 		/* we've installed a new backing page, so now we need
@@ -537,7 +541,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 					    netpage->index, cachefiles_gfp);
 		if (ret < 0) {
 			if (ret == -EEXIST) {
+				page_cache_release(backpage);
+				backpage = NULL;
 				page_cache_release(netpage);
+				netpage = NULL;
 				fscache_retrieval_complete(op, 1);
 				continue;
 			}
@@ -610,6 +617,8 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
 					    netpage->index, cachefiles_gfp);
 		if (ret < 0) {
 			if (ret == -EEXIST) {
+				page_cache_release(backpage);
+				backpage = NULL;
 				page_cache_release(netpage);
 				fscache_retrieval_complete(op, 1);
 				continue;