From patchwork Tue Dec 3 10:58:59 2019
X-Patchwork-Submitter: Andrea Righi
X-Patchwork-Id: 1203612
Date: Tue, 3 Dec 2019 11:58:59 +0100
From: Andrea Righi
To: kernel-team@lists.ubuntu.com
Subject: [PATCH][B][aws] UBUNTU SAUCE: mm: swap: improve swap readahead heuristic
Message-ID: <20191203105859.GA13534@xps-13>

BugLink: https://bugs.launchpad.net/bugs/1831940

Apply a more aggressive swapin readahead policy to improve swapoff
performance.

The idea is to start with no readahead (read only one page) and linearly
increment the amount of readahead pages each time swapin_readahead() is
called, up to the maximum cluster size (defined by vm.page-cluster), then
go back to one page to give the disk enough time to prefetch the requested
pages and avoid re-requesting them multiple times.

Also increase the default vm.page-cluster size to 8, which seems to work
better with this new heuristic.

Signed-off-by: Andrea Righi
Acked-by: Connor Kuehl
Acked-by: Kamal Mostafa
Nacked-by: Sultan Alsawaf
---
 mm/swap.c       |  2 +-
 mm/swap_state.c | 60 ++++++++-----------------------------------------
 2 files changed, 10 insertions(+), 52 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index abc82e6c14d1..5603bc987ef0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -1022,7 +1022,7 @@ void __init swap_setup(void)
 	if (megs < 16)
 		page_cluster = 2;
 	else
-		page_cluster = 3;
+		page_cluster = 8;
 	/*
 	 * Right now other parts of the system means that we
 	 * _really_ don't want to cluster much more
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 6dac8c6ee6d9..a2246bcebc77 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -472,62 +472,21 @@ struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	return retpage;
 }
 
-static unsigned int __swapin_nr_pages(unsigned long prev_offset,
-				      unsigned long offset,
-				      int hits,
-				      int max_pages,
-				      int prev_win)
-{
-	unsigned int pages, last_ra;
-
-	/*
-	 * This heuristic has been found to work well on both sequential and
-	 * random loads, swapping to hard disk or to SSD: please don't ask
-	 * what the "+ 2" means, it just happens to work well, that's all.
-	 */
-	pages = hits + 2;
-	if (pages == 2) {
-		/*
-		 * We can have no readahead hits to judge by: but must not get
-		 * stuck here forever, so check for an adjacent offset instead
-		 * (and don't even bother to check whether swap type is same).
-		 */
-		if (offset != prev_offset + 1 && offset != prev_offset - 1)
-			pages = 1;
-	} else {
-		unsigned int roundup = 4;
-		while (roundup < pages)
-			roundup <<= 1;
-		pages = roundup;
-	}
-
-	if (pages > max_pages)
-		pages = max_pages;
-
-	/* Don't shrink readahead too fast */
-	last_ra = prev_win / 2;
-	if (pages < last_ra)
-		pages = last_ra;
-
-	return pages;
-}
-
 static unsigned long swapin_nr_pages(unsigned long offset)
 {
-	static unsigned long prev_offset;
-	unsigned int hits, pages, max_pages;
-	static atomic_t last_readahead_pages;
+	static unsigned int prev_pages;
+	unsigned long pages, max_pages;
 
 	max_pages = 1 << READ_ONCE(page_cluster);
 	if (max_pages <= 1)
 		return 1;
 
-	hits = atomic_xchg(&swapin_readahead_hits, 0);
-	pages = __swapin_nr_pages(prev_offset, offset, hits, max_pages,
-				  atomic_read(&last_readahead_pages));
-	if (!hits)
-		prev_offset = offset;
-	atomic_set(&last_readahead_pages, pages);
+	pages = READ_ONCE(prev_pages) + 1;
+	if (pages > max_pages) {
+		WRITE_ONCE(prev_pages, 0);
+		pages = max_pages;
+	} else
+		WRITE_ONCE(prev_pages, pages);
 
 	return pages;
 }
@@ -684,8 +643,7 @@ struct page *swap_readahead_detect(struct vm_fault *vmf,
 	pfn = PFN_DOWN(SWAP_RA_ADDR(swap_ra_info));
 	prev_win = SWAP_RA_WIN(swap_ra_info);
 	hits = SWAP_RA_HITS(swap_ra_info);
-	swap_ra->win = win = __swapin_nr_pages(pfn, fpfn, hits,
-					       max_win, prev_win);
+	swap_ra->win = win = swapin_nr_pages(fpfn);
 
 	atomic_long_set(&vma->swap_readahead_info,
 			SWAP_RA_VAL(faddr, win, 0));
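For illustration only (not part of the patch): a minimal user-space sketch of
the window progression that the new swapin_nr_pages() produces, assuming the
new vm.page-cluster default of 8 (a 256-page maximum window). The helper name
swapin_window() and the constants below are made up for this sketch; only the
ramp-up/reset logic mirrors the hunk above.

/*
 * Illustration only: simulate the readahead window chosen by the new
 * swapin_nr_pages() heuristic. Names here are invented for the sketch.
 */
#include <stdio.h>

#define PAGE_CLUSTER	8			/* new vm.page-cluster default */
#define MAX_PAGES	(1UL << PAGE_CLUSTER)	/* 256-page maximum window */

static unsigned long swapin_window(void)
{
	static unsigned int prev_pages;
	unsigned long pages = prev_pages + 1;

	if (pages > MAX_PAGES) {
		prev_pages = 0;		/* reset: let the disk catch up */
		pages = MAX_PAGES;
	} else {
		prev_pages = pages;	/* linear ramp-up, one page per call */
	}
	return pages;
}

int main(void)
{
	/* Prints 1, 2, ..., 256, 256, 1, 2, ... */
	for (int i = 0; i < 260; i++)
		printf("%lu\n", swapin_window());
	return 0;
}

The resulting sequence (1, 2, ..., 256, 256, 1, 2, ...) matches the linear
ramp-up and reset to a single page described in the changelog above.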