Patch Detail
get:
Show a patch.
patch:
Partially update a patch.
put:
Update a patch.
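For example, the patch shown below can be read, and its state changed, with a short Python script. This is a minimal sketch using the `requests` package; the API token and the target state value are placeholders, and updating requires maintainer rights on the project.

```python
# Minimal sketch: read and update a patch through the Patchwork REST API.
# API_TOKEN and the new state value are placeholders, not real credentials.
import requests

BASE = "http://patchwork.ozlabs.org/api"
PATCH_ID = 806490
API_TOKEN = "0123456789abcdef"  # placeholder; a maintainer token is needed to update

# get: Show a patch (no authentication needed for public projects).
patch = requests.get(f"{BASE}/patches/{PATCH_ID}/").json()
print(patch["name"], patch["state"], patch["mbox"])

# patch: Partially update a patch; send only the fields to change.
resp = requests.patch(
    f"{BASE}/patches/{PATCH_ID}/",
    headers={"Authorization": f"Token {API_TOKEN}"},
    json={"state": "under-review"},  # state slug; valid values depend on the instance
)
resp.raise_for_status()
```

GET returns the JSON document shown below; the `mbox` field it prints can be fed straight to `git am`.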
GET /api/patches/806490/?format=api
{ "id": 806490, "url": "http://patchwork.ozlabs.org/api/patches/806490/?format=api", "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/patch/1503914913-28893-4-git-send-email-wei.w.wang@intel.com/", "project": { "id": 14, "url": "http://patchwork.ozlabs.org/api/projects/14/?format=api", "name": "QEMU Development", "link_name": "qemu-devel", "list_id": "qemu-devel.nongnu.org", "list_email": "qemu-devel@nongnu.org", "web_url": "", "scm_url": "", "webscm_url": "", "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<1503914913-28893-4-git-send-email-wei.w.wang@intel.com>", "list_archive_url": null, "date": "2017-08-28T10:08:31", "name": "[v15,3/5] virtio-balloon: VIRTIO_BALLOON_F_SG", "commit_ref": null, "pull_url": null, "state": "new", "archived": false, "hash": "7641e082d3af6e440d230c42a1486ea4c928b29a", "submitter": { "id": 69100, "url": "http://patchwork.ozlabs.org/api/people/69100/?format=api", "name": "Wang, Wei W", "email": "wei.w.wang@intel.com" }, "delegate": null, "mbox": "http://patchwork.ozlabs.org/project/qemu-devel/patch/1503914913-28893-4-git-send-email-wei.w.wang@intel.com/mbox/", "series": [ { "id": 126, "url": "http://patchwork.ozlabs.org/api/series/126/?format=api", "web_url": "http://patchwork.ozlabs.org/project/qemu-devel/list/?series=126", "date": "2017-08-28T10:08:28", "name": "Virtio-balloon Enhancement", "version": 15, "mbox": "http://patchwork.ozlabs.org/series/126/mbox/" } ], "comments": "http://patchwork.ozlabs.org/api/patches/806490/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/806490/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>", "X-Original-To": "incoming@patchwork.ozlabs.org", "Delivered-To": "patchwork-incoming@bilbo.ozlabs.org", "Authentication-Results": "ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=nongnu.org\n\t(client-ip=2001:4830:134:3::11; helo=lists.gnu.org;\n\tenvelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org;\n\treceiver=<UNKNOWN>)", "Received": [ "from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11])\n\t(using TLSv1 with cipher AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xgnsM49r4z9sN5\n\tfor <incoming@patchwork.ozlabs.org>;\n\tMon, 28 Aug 2017 20:23:03 +1000 (AEST)", "from localhost ([::1]:37858 helo=lists.gnu.org)\n\tby lists.gnu.org with esmtp (Exim 4.71) (envelope-from\n\t<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>)\n\tid 1dmHC9-00036w-EQ\n\tfor incoming@patchwork.ozlabs.org; Mon, 28 Aug 2017 06:23:01 -0400", "from eggs.gnu.org ([2001:4830:134:3::10]:57844)\n\tby lists.gnu.org with esmtp (Exim 4.71)\n\t(envelope-from <wei.w.wang@intel.com>) id 1dmH9o-0001z6-ID\n\tfor qemu-devel@nongnu.org; Mon, 28 Aug 2017 06:20:38 -0400", "from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)\n\t(envelope-from <wei.w.wang@intel.com>) id 1dmH9m-00063H-MZ\n\tfor qemu-devel@nongnu.org; Mon, 28 Aug 2017 06:20:36 -0400", "from mga04.intel.com ([192.55.52.120]:17236)\n\tby eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)\n\t(Exim 4.71) (envelope-from <wei.w.wang@intel.com>)\n\tid 1dmH9m-0005ys-9q\n\tfor qemu-devel@nongnu.org; Mon, 28 Aug 2017 06:20:34 -0400", "from orsmga003.jf.intel.com ([10.7.209.27])\n\tby fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;\n\t28 Aug 2017 03:20:33 -0700", "from devel-ww.sh.intel.com ([10.239.48.92])\n\tby 
orsmga003.jf.intel.com with ESMTP; 28 Aug 2017 03:20:30 -0700" ], "X-ExtLoop1": "1", "X-IronPort-AV": "E=Sophos; i=\"5.41,441,1498546800\"; d=\"scan'208\";\n\ta=\"1008318963\"", "From": "Wei Wang <wei.w.wang@intel.com>", "To": "virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,\n\tqemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,\n\tkvm@vger.kernel.org, linux-mm@kvack.org, mst@redhat.com,\n\tmhocko@kernel.org, akpm@linux-foundation.org, mawilcox@microsoft.com", "Date": "Mon, 28 Aug 2017 18:08:31 +0800", "Message-Id": "<1503914913-28893-4-git-send-email-wei.w.wang@intel.com>", "X-Mailer": "git-send-email 2.7.4", "In-Reply-To": "<1503914913-28893-1-git-send-email-wei.w.wang@intel.com>", "References": "<1503914913-28893-1-git-send-email-wei.w.wang@intel.com>", "X-detected-operating-system": "by eggs.gnu.org: Genre and OS details not\n\trecognized.", "X-Received-From": "192.55.52.120", "Subject": "[Qemu-devel] [PATCH v15 3/5] virtio-balloon: VIRTIO_BALLOON_F_SG", "X-BeenThere": "qemu-devel@nongnu.org", "X-Mailman-Version": "2.1.21", "Precedence": "list", "List-Id": "<qemu-devel.nongnu.org>", "List-Unsubscribe": "<https://lists.nongnu.org/mailman/options/qemu-devel>,\n\t<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>", "List-Archive": "<http://lists.nongnu.org/archive/html/qemu-devel/>", "List-Post": "<mailto:qemu-devel@nongnu.org>", "List-Help": "<mailto:qemu-devel-request@nongnu.org?subject=help>", "List-Subscribe": "<https://lists.nongnu.org/mailman/listinfo/qemu-devel>,\n\t<mailto:qemu-devel-request@nongnu.org?subject=subscribe>", "Cc": "aarcange@redhat.com, yang.zhang.wz@gmail.com, david@redhat.com,\n\tliliang.opensource@gmail.com, willy@infradead.org,\n\tamit.shah@redhat.com, wei.w.wang@intel.com, quan.xu@aliyun.com,\n\tcornelia.huck@de.ibm.com, pbonzini@redhat.com,\n\tmgorman@techsingularity.net", "Errors-To": "qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org", "Sender": "\"Qemu-devel\"\n\t<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>" }, "content": "Add a new feature, VIRTIO_BALLOON_F_SG, which enables the transfer\nof balloon (i.e. inflated/deflated) pages using scatter-gather lists\nto the host.\n\nThe implementation of the previous virtio-balloon is not very\nefficient, because the balloon pages are transferred to the\nhost one by one. Here is the breakdown of the time in percentage\nspent on each step of the balloon inflating process (inflating\n7GB of an 8GB idle guest).\n\n1) allocating pages (6.5%)\n2) sending PFNs to host (68.3%)\n3) address translation (6.1%)\n4) madvise (19%)\n\nIt takes about 4126ms for the inflating process to complete.\nThe above profiling shows that the bottlenecks are stage 2)\nand stage 4).\n\nThis patch optimizes step 2) by transferring pages to the host in\nsgs. An sg describes a chunk of guest physically continuous pages.\nWith this mechanism, step 4) can also be optimized by doing address\ntranslation and madvise() in chunks rather than page by page.\n\nWith this new feature, the above ballooning process takes ~597ms\nresulting in an improvement of ~86%.\n\nTODO: optimize stage 1) by allocating/freeing a chunk of pages\ninstead of a single page each time.\n\nSigned-off-by: Wei Wang <wei.w.wang@intel.com>\nSigned-off-by: Liang Li <liang.z.li@intel.com>\nSuggested-by: Michael S. 
Tsirkin <mst@redhat.com>\n---\n drivers/virtio/virtio_balloon.c | 171 ++++++++++++++++++++++++++++++++----\n include/uapi/linux/virtio_balloon.h | 1 +\n 2 files changed, 155 insertions(+), 17 deletions(-)", "diff": "diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c\nindex f0b3a0b..8ecc1d4 100644\n--- a/drivers/virtio/virtio_balloon.c\n+++ b/drivers/virtio/virtio_balloon.c\n@@ -32,6 +32,8 @@\n #include <linux/mm.h>\n #include <linux/mount.h>\n #include <linux/magic.h>\n+#include <linux/xbitmap.h>\n+#include <asm/page.h>\n \n /*\n * Balloon device works in 4K page units. So each page is pointed to by\n@@ -79,6 +81,9 @@ struct virtio_balloon {\n \t/* Synchronize access/update to this struct virtio_balloon elements */\n \tstruct mutex balloon_lock;\n \n+\t/* The xbitmap used to record balloon pages */\n+\tstruct xb page_xb;\n+\n \t/* The array of pfns we tell the Host about. */\n \tunsigned int num_pfns;\n \t__virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];\n@@ -141,13 +146,111 @@ static void set_page_pfns(struct virtio_balloon *vb,\n \t\t\t\t\t page_to_balloon_pfn(page) + i);\n }\n \n+static int add_one_sg(struct virtqueue *vq, void *addr, uint32_t size)\n+{\n+\tstruct scatterlist sg;\n+\n+\tsg_init_one(&sg, addr, size);\n+\treturn virtqueue_add_inbuf(vq, &sg, 1, vq, GFP_KERNEL);\n+}\n+\n+static void send_balloon_page_sg(struct virtio_balloon *vb,\n+\t\t\t\t struct virtqueue *vq,\n+\t\t\t\t void *addr,\n+\t\t\t\t uint32_t size,\n+\t\t\t\t bool batch)\n+{\n+\tunsigned int len;\n+\tint err;\n+\n+\terr = add_one_sg(vq, addr, size);\n+\t/* Sanity check: this can't really happen */\n+\tWARN_ON(err);\n+\n+\t/* If batching is in use, we batch the sgs till the vq is full. */\n+\tif (!batch || !vq->num_free) {\n+\t\tvirtqueue_kick(vq);\n+\t\twait_event(vb->acked, virtqueue_get_buf(vq, &len));\n+\t\t/* Release all the entries if there are */\n+\t\twhile (virtqueue_get_buf(vq, &len))\n+\t\t\t;\n+\t}\n+}\n+\n+/*\n+ * Send balloon pages in sgs to host. The balloon pages are recorded in the\n+ * page xbitmap. 
Each bit in the bitmap corresponds to a page of PAGE_SIZE.\n+ * The page xbitmap is searched for continuous \"1\" bits, which correspond\n+ * to continuous pages, to chunk into sgs.\n+ *\n+ * @page_xb_start and @page_xb_end form the range of bits in the xbitmap that\n+ * need to be searched.\n+ */\n+static void tell_host_sgs(struct virtio_balloon *vb,\n+\t\t\t struct virtqueue *vq,\n+\t\t\t unsigned long page_xb_start,\n+\t\t\t unsigned long page_xb_end)\n+{\n+\tunsigned long sg_pfn_start, sg_pfn_end;\n+\tvoid *sg_addr;\n+\tuint32_t sg_len, sg_max_len = round_down(UINT_MAX, PAGE_SIZE);\n+\n+\tsg_pfn_start = page_xb_start;\n+\twhile (sg_pfn_start < page_xb_end) {\n+\t\tsg_pfn_start = xb_find_next_bit(&vb->page_xb, sg_pfn_start,\n+\t\t\t\t\t\tpage_xb_end, 1);\n+\t\tif (sg_pfn_start == page_xb_end + 1)\n+\t\t\tbreak;\n+\t\tsg_pfn_end = xb_find_next_bit(&vb->page_xb, sg_pfn_start + 1,\n+\t\t\t\t\t page_xb_end, 0);\n+\t\tsg_addr = (void *)pfn_to_kaddr(sg_pfn_start);\n+\t\tsg_len = (sg_pfn_end - sg_pfn_start) << PAGE_SHIFT;\n+\t\twhile (sg_len > sg_max_len) {\n+\t\t\tsend_balloon_page_sg(vb, vq, sg_addr, sg_max_len, 1);\n+\t\t\tsg_addr += sg_max_len;\n+\t\t\tsg_len -= sg_max_len;\n+\t\t}\n+\t\tsend_balloon_page_sg(vb, vq, sg_addr, sg_len, 1);\n+\t\txb_zero(&vb->page_xb, sg_pfn_start, sg_pfn_end);\n+\t\tsg_pfn_start = sg_pfn_end + 1;\n+\t}\n+\n+\t/*\n+\t * The last few sgs may not reach the batch size, but need a kick to\n+\t * notify the device to handle them.\n+\t */\n+\tif (vq->num_free != virtqueue_get_vring_size(vq)) {\n+\t\tvirtqueue_kick(vq);\n+\t\twait_event(vb->acked, virtqueue_get_buf(vq, &sg_len));\n+\t\twhile (virtqueue_get_buf(vq, &sg_len))\n+\t\t\t;\n+\t}\n+}\n+\n+static inline void xb_set_page(struct virtio_balloon *vb,\n+\t\t\t struct page *page,\n+\t\t\t unsigned long *pfn_min,\n+\t\t\t unsigned long *pfn_max)\n+{\n+\tunsigned long pfn = page_to_pfn(page);\n+\n+\t*pfn_min = min(pfn, *pfn_min);\n+\t*pfn_max = max(pfn, *pfn_max);\n+\txb_preload(GFP_KERNEL);\n+\txb_set_bit(&vb->page_xb, pfn);\n+\txb_preload_end();\n+}\n+\n static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)\n {\n \tstruct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;\n \tunsigned num_allocated_pages;\n+\tbool use_sg = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_SG);\n+\tunsigned long pfn_max = 0, pfn_min = ULONG_MAX;\n \n \t/* We can only do one array worth at a time. */\n-\tnum = min(num, ARRAY_SIZE(vb->pfns));\n+\tif (!use_sg)\n+\t\tnum = min(num, ARRAY_SIZE(vb->pfns));\n \n \tmutex_lock(&vb->balloon_lock);\n \tfor (vb->num_pfns = 0; vb->num_pfns < num;\n@@ -162,7 +265,12 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)\n \t\t\tmsleep(200);\n \t\t\tbreak;\n \t\t}\n-\t\tset_page_pfns(vb, vb->pfns + vb->num_pfns, page);\n+\n+\t\tif (use_sg)\n+\t\t\txb_set_page(vb, page, &pfn_min, &pfn_max);\n+\t\telse\n+\t\t\tset_page_pfns(vb, vb->pfns + vb->num_pfns, page);\n+\n \t\tvb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;\n \t\tif (!virtio_has_feature(vb->vdev,\n \t\t\t\t\tVIRTIO_BALLOON_F_DEFLATE_ON_OOM))\n@@ -171,8 +279,12 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)\n \n \tnum_allocated_pages = vb->num_pfns;\n \t/* Did we get any? 
*/\n-\tif (vb->num_pfns != 0)\n-\t\ttell_host(vb, vb->inflate_vq);\n+\tif (vb->num_pfns) {\n+\t\tif (use_sg)\n+\t\t\ttell_host_sgs(vb, vb->inflate_vq, pfn_min, pfn_max);\n+\t\telse\n+\t\t\ttell_host(vb, vb->inflate_vq);\n+\t}\n \tmutex_unlock(&vb->balloon_lock);\n \n \treturn num_allocated_pages;\n@@ -198,9 +310,12 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)\n \tstruct page *page;\n \tstruct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;\n \tLIST_HEAD(pages);\n+\tbool use_sg = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_SG);\n+\tunsigned long pfn_max = 0, pfn_min = ULONG_MAX;\n \n-\t/* We can only do one array worth at a time. */\n-\tnum = min(num, ARRAY_SIZE(vb->pfns));\n+\t/* Traditionally, we can only do one array worth at a time. */\n+\tif (!use_sg)\n+\t\tnum = min(num, ARRAY_SIZE(vb->pfns));\n \n \tmutex_lock(&vb->balloon_lock);\n \t/* We can't release more pages than taken */\n@@ -210,7 +325,11 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)\n \t\tpage = balloon_page_dequeue(vb_dev_info);\n \t\tif (!page)\n \t\t\tbreak;\n-\t\tset_page_pfns(vb, vb->pfns + vb->num_pfns, page);\n+\t\tif (use_sg)\n+\t\t\txb_set_page(vb, page, &pfn_min, &pfn_max);\n+\t\telse\n+\t\t\tset_page_pfns(vb, vb->pfns + vb->num_pfns, page);\n+\n \t\tlist_add(&page->lru, &pages);\n \t\tvb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;\n \t}\n@@ -221,8 +340,12 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)\n \t * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);\n \t * is true, we *have* to do it in this order\n \t */\n-\tif (vb->num_pfns != 0)\n-\t\ttell_host(vb, vb->deflate_vq);\n+\tif (vb->num_pfns) {\n+\t\tif (use_sg)\n+\t\t\ttell_host_sgs(vb, vb->deflate_vq, pfn_min, pfn_max);\n+\t\telse\n+\t\t\ttell_host(vb, vb->deflate_vq);\n+\t}\n \trelease_pages_balloon(vb, &pages);\n \tmutex_unlock(&vb->balloon_lock);\n \treturn num_freed_pages;\n@@ -441,6 +564,7 @@ static int init_vqs(struct virtio_balloon *vb)\n }\n \n #ifdef CONFIG_BALLOON_COMPACTION\n+\n /*\n * virtballoon_migratepage - perform the balloon page migration on behalf of\n *\t\t\t a compation thread. 
(called under page lock)\n@@ -464,6 +588,7 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,\n {\n \tstruct virtio_balloon *vb = container_of(vb_dev_info,\n \t\t\tstruct virtio_balloon, vb_dev_info);\n+\tbool use_sg = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_SG);\n \tunsigned long flags;\n \n \t/*\n@@ -485,16 +610,24 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,\n \tvb_dev_info->isolated_pages--;\n \t__count_vm_event(BALLOON_MIGRATE);\n \tspin_unlock_irqrestore(&vb_dev_info->pages_lock, flags);\n-\tvb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;\n-\tset_page_pfns(vb, vb->pfns, newpage);\n-\ttell_host(vb, vb->inflate_vq);\n-\n+\tif (use_sg) {\n+\t\tsend_balloon_page_sg(vb, vb->inflate_vq, page_address(newpage),\n+\t\t\t\t PAGE_SIZE, 0);\n+\t} else {\n+\t\tvb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;\n+\t\tset_page_pfns(vb, vb->pfns, newpage);\n+\t\ttell_host(vb, vb->inflate_vq);\n+\t}\n \t/* balloon's page migration 2nd step -- deflate \"page\" */\n \tballoon_page_delete(page);\n-\tvb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;\n-\tset_page_pfns(vb, vb->pfns, page);\n-\ttell_host(vb, vb->deflate_vq);\n-\n+\tif (use_sg) {\n+\t\tsend_balloon_page_sg(vb, vb->deflate_vq, page_address(page),\n+\t\t\t\t PAGE_SIZE, 0);\n+\t} else {\n+\t\tvb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;\n+\t\tset_page_pfns(vb, vb->pfns, page);\n+\t\ttell_host(vb, vb->deflate_vq);\n+\t}\n \tmutex_unlock(&vb->balloon_lock);\n \n \tput_page(page); /* balloon reference */\n@@ -553,6 +686,9 @@ static int virtballoon_probe(struct virtio_device *vdev)\n \tif (err)\n \t\tgoto out_free_vb;\n \n+\tif (virtio_has_feature(vdev, VIRTIO_BALLOON_F_SG))\n+\t\txb_init(&vb->page_xb);\n+\n \tvb->nb.notifier_call = virtballoon_oom_notify;\n \tvb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY;\n \terr = register_oom_notifier(&vb->nb);\n@@ -669,6 +805,7 @@ static unsigned int features[] = {\n \tVIRTIO_BALLOON_F_MUST_TELL_HOST,\n \tVIRTIO_BALLOON_F_STATS_VQ,\n \tVIRTIO_BALLOON_F_DEFLATE_ON_OOM,\n+\tVIRTIO_BALLOON_F_SG,\n };\n \n static struct virtio_driver virtio_balloon_driver = {\ndiff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h\nindex 343d7dd..37780a7 100644\n--- a/include/uapi/linux/virtio_balloon.h\n+++ b/include/uapi/linux/virtio_balloon.h\n@@ -34,6 +34,7 @@\n #define VIRTIO_BALLOON_F_MUST_TELL_HOST\t0 /* Tell before reclaiming pages */\n #define VIRTIO_BALLOON_F_STATS_VQ\t1 /* Memory Stats virtqueue */\n #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM\t2 /* Deflate balloon on OOM */\n+#define VIRTIO_BALLOON_F_SG\t\t3 /* Use sg instead of PFN lists */\n \n /* Size of a PFN in the balloon interface. */\n #define VIRTIO_BALLOON_PFN_SHIFT 12\n", "prefixes": [ "v15", "3/5" ] }
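The commit message carried in the response above replaces per-page PFN transfers with scatter-gather entries, each covering a run of physically contiguous pages. Below is a rough userspace Python sketch of that coalescing step only; it is not the kernel implementation (that is `tell_host_sgs()` in the diff). The page size, PFN values, and the `SG_MAX_LEN` cap mirror the patch but are illustrative.

```python
# Illustrative sketch (not kernel code): coalesce ballooned PFNs into
# (address, length) scatter-gather entries, as tell_host_sgs() does with
# the page xbitmap in the patch above. PAGE_SIZE and the PFNs are made up.
PAGE_SIZE = 4096
SG_MAX_LEN = (2**32 - 1) // PAGE_SIZE * PAGE_SIZE  # round_down(UINT_MAX, PAGE_SIZE)

def split_run(first_pfn, last_pfn):
    """One physically contiguous run, split so no sg exceeds SG_MAX_LEN bytes."""
    addr = first_pfn * PAGE_SIZE
    length = (last_pfn - first_pfn + 1) * PAGE_SIZE
    out = []
    while length > SG_MAX_LEN:
        out.append((addr, SG_MAX_LEN))
        addr += SG_MAX_LEN
        length -= SG_MAX_LEN
    out.append((addr, length))
    return out

def pfns_to_sgs(pfns):
    """Turn an iterable of ballooned PFNs into (start_addr, length) sg entries."""
    sgs = []
    run_start = prev = None
    for pfn in sorted(pfns):
        if run_start is None:
            run_start = prev = pfn
        elif pfn == prev + 1:
            prev = pfn
        else:
            sgs.extend(split_run(run_start, prev))
            run_start = prev = pfn
    if run_start is not None:
        sgs.extend(split_run(run_start, prev))
    return sgs

# 7 pages in two contiguous runs become two sg entries instead of seven PFN slots.
print(pfns_to_sgs([100, 101, 102, 103, 200, 201, 202]))
```

This is why the commit message reports the "sending PFNs to host" step shrinking: contiguous runs collapse into single descriptors, and the host can then translate and madvise() each chunk at once instead of page by page.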