From patchwork Mon Feb 25 22:26:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1047995
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [X/B][PATCH] mm: do not stall register_shrinker()
Date: Mon, 25 Feb 2019 19:26:11 -0300
Message-Id: <20190225222611.29564-1-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
List-Id: Kernel team discussions

From: Minchan Kim

BugLink: https://bugs.launchpad.net/bugs/1817628

Shakeel Butt reported he has observed in production systems that the job
loader gets stuck for 10s of seconds while doing a mount operation.  It
turns out that it was stuck in register_shrinker() because some
unrelated job was under memory pressure and was spending time in
shrink_slab().
Machines have a lot of shrinkers registered, and jobs under memory
pressure have to traverse all of those memcg-aware shrinkers, which
delays unrelated jobs that want to register their own shrinkers.

To solve the issue, this patch simply bails out of slab shrinking if it
is found that someone wants to register a shrinker in parallel.  A
downside is that it could cause unfair shrinking between shrinkers.
However, it should be rare and we can add more complicated logic if we
find it's not enough.

[akpm@linux-foundation.org: tweak code comment]
Link: http://lkml.kernel.org/r/20171115005602.GB23810@bbox
Link: http://lkml.kernel.org/r/1511481899-20335-1-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim
Signed-off-by: Shakeel Butt
Reported-by: Shakeel Butt
Tested-by: Shakeel Butt
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: Tetsuo Handa
Cc: Anshuman Khandual
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
(backported from commit e496612c5130567fc9d5f1969ca4b86665aa3cbb)
[mfo: refresh one context line for do_shrink_slab() arguments]
Signed-off-by: Mauricio Faria de Oliveira
Acked-by: Khalid Elmously
Acked-by: Stefan Bader
---
 mm/vmscan.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f53dcdccdb83..decc5f99b85c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -453,6 +453,15 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 			sc.nid = 0;
 
 		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);
+		/*
+		 * Bail out if someone wants to register a new shrinker to
+		 * prevent the registration from being stalled for long periods
+		 * by parallel ongoing shrinking.
+		 */
+		if (rwsem_is_contended(&shrinker_rwsem)) {
+			freed = freed ? : 1;
+			break;
+		}
 	}
 
 	up_read(&shrinker_rwsem);
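
For readers who want to see the bail-out behaviour in isolation, here is
a minimal user-space sketch of the loop above. It is not kernel code:
toy_shrink_slab(), toy_is_contended() and struct toy_result are
hypothetical names made up for this illustration, and contention is
simulated with an index rather than a real rw_semaphore. The "report at
least 1" step mirrors the patch's `freed = freed ? : 1`; reading it as a
way to keep an early exit from looking like "nothing was reclaimable" is
an assumption, not something the changelog states.

```c
#include <assert.h>
#include <stddef.h>

/* Result of one toy reclaim pass. */
struct toy_result {
	unsigned long freed;	/* objects reported as freed */
	size_t visited;		/* shrinkers actually scanned */
};

/*
 * Hypothetical stand-in for rwsem_is_contended(&shrinker_rwsem): report
 * contention once the shrinker at index contend_at has been scanned.
 * A negative contend_at means nobody ever tries to register.
 */
static int toy_is_contended(size_t scanned_idx, long contend_at)
{
	return contend_at >= 0 && scanned_idx >= (size_t)contend_at;
}

/*
 * Toy model of the shrink_slab() loop: scan n shrinkers that each free
 * per_shrinker objects, but stop early when a registration is waiting.
 */
static struct toy_result toy_shrink_slab(size_t n, unsigned long per_shrinker,
					 long contend_at)
{
	struct toy_result r = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		r.freed += per_shrinker;	/* stands in for do_shrink_slab() */
		r.visited++;

		if (toy_is_contended(i, contend_at)) {
			/* Like the patch: never report 0 after bailing out. */
			if (r.freed == 0)
				r.freed = 1;
			break;
		}
	}
	return r;
}
```

With these definitions, toy_shrink_slab(5, 0, 1) scans two of the five
shrinkers and reports 1, while toy_shrink_slab(5, 3, -1) sees no
contention, scans all five and reports 15. The remaining shrinkers in
the first case are simply skipped for that pass, which is the unfairness
the changelog mentions.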