Patch Detail

GET:   Show a patch.
PATCH: Update a patch.
PUT:   Update a patch.
GET /api/patches/806491/?format=api
{ "id": 806491, "url": "http://patchwork.ozlabs.org/api/patches/806491/?format=api", "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/patch/1503914913-28893-5-git-send-email-wei.w.wang@intel.com/", "project": { "id": 14, "url": "http://patchwork.ozlabs.org/api/projects/14/?format=api", "name": "QEMU Development", "link_name": "qemu-devel", "list_id": "qemu-devel.nongnu.org", "list_email": "qemu-devel@nongnu.org", "web_url": "", "scm_url": "", "webscm_url": "", "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<1503914913-28893-5-git-send-email-wei.w.wang@intel.com>", "list_archive_url": null, "date": "2017-08-28T10:08:32", "name": "[v15,4/5] mm: support reporting free page blocks", "commit_ref": null, "pull_url": null, "state": "new", "archived": false, "hash": "08567e1a8463cc0a6a276dbd2f96b03330cc5c55", "submitter": { "id": 69100, "url": "http://patchwork.ozlabs.org/api/people/69100/?format=api", "name": "Wang, Wei W", "email": "wei.w.wang@intel.com" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/qemu-devel/patch/1503914913-28893-5-git-send-email-wei.w.wang@intel.com/mbox/", "series": [ { "id": 126, "url": "http://patchwork.ozlabs.org/api/series/126/?format=api", "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/list/?series=126", "date": "2017-08-28T10:08:28", "name": "Virtio-balloon Enhancement", "version": 15, "mbox": "http://patchwork.ozlabs.org/series/126/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/patches/806491/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/806491/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>", "X-Original-To": "incoming@patchwork.ozlabs.org", "Delivered-To": "patchwork-incoming@bilbo.ozlabs.org", "Authentication-Results": "ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=nongnu.org\n\t(client-ip=2001:4830:134:3::11; helo=lists.gnu.org;\n\tenvelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;\n\treceiver=<UNKNOWN>)", "Received": [ "from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11])\n\t(using TLSv1 with cipher AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xgnv91DFtz9s8P\n\tfor <incoming@patchwork.ozlabs.org>;\n\tMon, 28 Aug 2017 20:24:37 +1000 (AEST)", "from localhost ([::1]:37862 helo=lists.gnu.org)\n\tby lists.gnu.org with esmtp (Exim 4.71) (envelope-from\n\t<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>)\n\tid 1dmHDf-00047H-0r\n\tfor incoming@patchwork.ozlabs.org; Mon, 28 Aug 2017 06:24:35 -0400", "from eggs.gnu.org ([2001:4830:134:3::10]:57873)\n\tby lists.gnu.org with esmtp (Exim 4.71)\n\t(envelope-from <wei.w.wang@intel.com>) id 1dmH9u-00023c-O7\n\tfor qemu-devel@nongnu.org; Mon, 28 Aug 2017 06:20:44 -0400", "from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)\n\t(envelope-from <wei.w.wang@intel.com>) id 1dmH9q-00066R-MR\n\tfor qemu-devel@nongnu.org; Mon, 28 Aug 2017 06:20:42 -0400", "from mga04.intel.com ([192.55.52.120]:17236)\n\tby eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)\n\t(Exim 4.71) (envelope-from <wei.w.wang@intel.com>)\n\tid 1dmH9q-0005ys-01\n\tfor qemu-devel@nongnu.org; Mon, 28 Aug 2017 06:20:38 -0400", "from orsmga003.jf.intel.com ([10.7.209.27])\n\tby fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;\n\t28 Aug 2017 03:20:37 -0700", "from devel-ww.sh.intel.com ([10.239.48.92])\n\tby 
orsmga003.jf.intel.com with ESMTP; 28 Aug 2017 03:20:34 -0700" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos; i=\"5.41,441,1498546800\"; d=\"scan'208\";\n\ta=\"1008318975\"", "From": "Wei Wang <wei.w.wang@intel.com>", "To": "virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,\n\tqemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,\n\tkvm@vger.kernel.org, linux-mm@kvack.org, mst@redhat.com,\n\tmhocko@kernel.org, akpm@linux-foundation.org, mawilcox@microsoft.com", "Date": "Mon, 28 Aug 2017 18:08:32 +0800", "Message-Id": "<1503914913-28893-5-git-send-email-wei.w.wang@intel.com>", "X-Mailer": "git-send-email 2.7.4", "In-Reply-To": "<1503914913-28893-1-git-send-email-wei.w.wang@intel.com>", "References": "<1503914913-28893-1-git-send-email-wei.w.wang@intel.com>", "X-detected-operating-system": "by eggs.gnu.org: Genre and OS details not\n\trecognized.", "X-Received-From": "192.55.52.120", "Subject": "[Qemu-devel] [PATCH v15 4/5] mm: support reporting free page blocks", "X-BeenThere": "qemu-devel@nongnu.org", "X-Mailman-Version": "2.1.21", "Precedence": "list", "List-Id": "<qemu-devel.nongnu.org>", "List-Unsubscribe": "<https://lists.nongnu.org/mailman/options/qemu-devel>,\n\t<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>", "List-Archive": "<http://lists.nongnu.org/archive/html/qemu-devel/>", "List-Post": "<mailto:qemu-devel@nongnu.org>", "List-Help": "<mailto:qemu-devel-request@nongnu.org?subject=help>", "List-Subscribe": "<https://lists.nongnu.org/mailman/listinfo/qemu-devel>,\n\t<mailto:qemu-devel-request@nongnu.org?subject=subscribe>", "Cc": "aarcange@redhat.com, yang.zhang.wz@gmail.com, david@redhat.com,\n\tliliang.opensource@gmail.com, willy@infradead.org,\n\tamit.shah@redhat.com, wei.w.wang@intel.com, quan.xu@aliyun.com,\n\tcornelia.huck@de.ibm.com, pbonzini@redhat.com,\n\tmgorman@techsingularity.net", "Errors-To": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org", "Sender": "\"Qemu-devel\"\n\t<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>" }, "content": "This patch adds support to walk through the free page blocks in the\nsystem and report them via a callback function. Some page blocks may\nleave the free list after zone->lock is released, so it is the caller's\nresponsibility to either detect or prevent the use of such pages.\n\nOne use example of this patch is to accelerate live migration by skipping\nthe transfer of free pages reported from the guest. A popular method used\nby the hypervisor to track which part of memory is written during live\nmigration is to write-protect all the guest memory. So, those pages that\nare reported as free pages but are written after the report function\nreturns will be captured by the hypervisor, and they will be added to the\nnext round of memory transfer.\n\nSigned-off-by: Wei Wang <wei.w.wang@intel.com>\nSigned-off-by: Liang Li <liang.z.li@intel.com>\nCc: Michal Hocko <mhocko@kernel.org>\nCc: Michael S. 
Tsirkin <mst@redhat.com>\n---\n include/linux/mm.h | 5 +++++\n mm/page_alloc.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++++\n 2 files changed, 70 insertions(+)", "diff": "diff --git a/include/linux/mm.h b/include/linux/mm.h\nindex 46b9ac5..3c4267d 100644\n--- a/include/linux/mm.h\n+++ b/include/linux/mm.h\n@@ -1835,6 +1835,11 @@ extern void free_area_init_node(int nid, unsigned long * zones_size,\n \t\tunsigned long zone_start_pfn, unsigned long *zholes_size);\n extern void free_initmem(void);\n \n+extern void walk_free_mem_block(void *opaque,\n+\t\t\t\tint min_order,\n+\t\t\t\tbool (*report_page_block)(void *, unsigned long,\n+\t\t\t\t\t\t\t unsigned long));\n+\n /*\n * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)\n * into the buddy system. The freed pages will be poisoned with pattern\ndiff --git a/mm/page_alloc.c b/mm/page_alloc.c\nindex 6d00f74..81eedc7 100644\n--- a/mm/page_alloc.c\n+++ b/mm/page_alloc.c\n@@ -4762,6 +4762,71 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)\n \tshow_swap_cache_info();\n }\n \n+/**\n+ * walk_free_mem_block - Walk through the free page blocks in the system\n+ * @opaque: the context passed from the caller\n+ * @min_order: the minimum order of free lists to check\n+ * @report_page_block: the callback function to report free page blocks\n+ *\n+ * If the callback returns 1, stop iterating the list of free page blocks.\n+ * Otherwise, continue to report.\n+ *\n+ * Please note that there are no locking guarantees for the callback and\n+ * that the reported pfn range might be freed or disappear after the\n+ * callback returns so the caller has to be very careful how it is used.\n+ *\n+ * The callback itself must not sleep or perform any operations which would\n+ * require any memory allocations directly (not even GFP_NOWAIT/GFP_ATOMIC)\n+ * or via any lock dependency. It is generally advisable to implement\n+ * the callback as simple as possible and defer any heavy lifting to a\n+ * different context.\n+ *\n+ * There is no guarantee that each free range will be reported only once\n+ * during one walk_free_mem_block invocation.\n+ *\n+ * pfn_to_page on the given range is strongly discouraged and if there is\n+ * an absolute need for that make sure to contact MM people to discuss\n+ * potential problems.\n+ *\n+ * The function itself might sleep so it cannot be called from atomic\n+ * contexts.\n+ *\n+ * In general low orders tend to be very volatile and so it makes more\n+ * sense to query larger ones first for various optimizations which like\n+ * ballooning etc... 
This will reduce the overhead as well.\n+ */\n+void walk_free_mem_block(void *opaque,\n+\t\t\t int min_order,\n+\t\t\t bool (*report_page_block)(void *, unsigned long,\n+\t\t\t\t\t\t unsigned long))\n+{\n+\tstruct zone *zone;\n+\tstruct page *page;\n+\tstruct list_head *list;\n+\tint order;\n+\tenum migratetype mt;\n+\tunsigned long pfn, flags;\n+\tbool stop = 0;\n+\n+\tfor_each_populated_zone(zone) {\n+\t\tfor (order = MAX_ORDER - 1; order >= min_order; order--) {\n+\t\t\tfor (mt = 0; !stop && mt < MIGRATE_TYPES; mt++) {\n+\t\t\t\tspin_lock_irqsave(&zone->lock, flags);\n+\t\t\t\tlist = &zone->free_area[order].free_list[mt];\n+\t\t\t\tlist_for_each_entry(page, list, lru) {\n+\t\t\t\t\tpfn = page_to_pfn(page);\n+\t\t\t\t\tstop = report_page_block(opaque, pfn,\n+\t\t\t\t\t\t\t\t 1 << order);\n+\t\t\t\t\tif (stop)\n+\t\t\t\t\t\tbreak;\n+\t\t\t\t}\n+\t\t\t\tspin_unlock_irqrestore(&zone->lock, flags);\n+\t\t\t}\n+\t\t}\n+\t}\n+}\n+EXPORT_SYMBOL_GPL(walk_free_mem_block);\n+\n static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)\n {\n \tzoneref->zone = zone;\n", "prefixes": [ "v15", "4/5" ] }
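The patch above exports walk_free_mem_block(), which takes an opaque context, a minimum order, and a report callback invoked as report_page_block(opaque, pfn, nr_pages) for each free block found; returning true from the callback stops the walk. The following sketch of a hypothetical in-kernel caller (the names free_block_log and log_free_block are illustrative, not part of the patch) shows the constraints the kerneldoc spells out: the callback neither sleeps nor allocates, and the reported ranges are treated as hints that may become stale as soon as zone->lock is released.

#include <linux/kernel.h>
#include <linux/mm.h>

/* Fixed-size record of reported blocks; preallocated so the callback
 * never needs to allocate memory. */
struct free_block_log {
	unsigned long pfn[128];
	unsigned long nr_pages[128];
	unsigned int count;
};

/* Runs under zone->lock: no sleeping, no allocation, minimal work. */
static bool log_free_block(void *opaque, unsigned long pfn,
			   unsigned long nr_pages)
{
	struct free_block_log *log = opaque;

	if (log->count >= ARRAY_SIZE(log->pfn))
		return true;		/* log full: stop the walk */

	log->pfn[log->count] = pfn;
	log->nr_pages[log->count] = nr_pages;
	log->count++;
	return false;			/* keep reporting */
}

static void collect_free_blocks(struct free_block_log *log)
{
	log->count = 0;
	/* Per the kerneldoc, prefer large orders: only the largest
	 * blocks are queried here. The reported ranges are hints only
	 * and must be revalidated (or covered by write protection) by
	 * the consumer, e.g. a live-migration path. */
	walk_free_mem_block(log, MAX_ORDER - 1, log_free_block);
}

A real consumer, such as the virtio-balloon changes elsewhere in this series, would hand the collected ranges to a different context for transmission rather than doing any heavy lifting inside the callback, as the kerneldoc recommends.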