From patchwork Fri Oct 12 03:24:30 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Gibson
X-Patchwork-Id: 982834
From: David Gibson
To: dhildenb@redhat.com, imammedo@redhat.com, ehabkost@redhat.com
Cc: pbonzini@redhat.com, David Gibson, qemu-ppc@nongnu.org,
 qemu-devel@nongnu.org, mst@redhat.com
Date: Fri, 12 Oct 2018 14:24:30 +1100
Message-Id: <20181012032431.32693-5-david@gibson.dropbear.id.au>
In-Reply-To: <20181012032431.32693-1-david@gibson.dropbear.id.au>
References: <20181012032431.32693-1-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.17.1
Subject: [Qemu-devel] [RFC 4/5] virtio-balloon: Use
 ram_block_discard_range() instead of raw madvise()

Currently, virtio-balloon uses madvise() with MADV_DONTNEED to actually
discard RAM pages inserted into the balloon.  This is basically a
Linux-only interface (MADV_DONTNEED exists on some other platforms, but
doesn't always have the same semantics).  It also doesn't work on
hugepages and has some other limitations.

It turns out that postcopy also needs to discard chunks of memory, and
uses a better interface for it: ram_block_discard_range().  It doesn't
cover every case, but it covers more than going directly to madvise()
and this gives us a single place to update for more possibilities in
future.

There are some subtleties here to maintain the current balloon
behaviour:

* For now, we just ignore requests to balloon in a hugepage-backed
  region.  That matches current behaviour, because MADV_DONTNEED on
  a hugepage would simply fail, and we ignore the error.

* If host page size is > BALLOON_PAGE_SIZE we can frequently call this
  on non-host-page-aligned addresses.  These would also fail in
  madvise(), which we then ignored.  ram_block_discard_range()
  error_report()s calls on unaligned addresses, so we explicitly
  check that case to avoid spamming the logs.

* We now call ram_block_discard_range() with the *host* page size,
  whereas we previously called madvise() with BALLOON_PAGE_SIZE.
  Surprisingly, this also matches existing behaviour.  Although the
  kernel fails madvise() on unaligned addresses, it will round
  unaligned sizes *up* to the host page size.  Yes, this means that if
  BALLOON_PAGE_SIZE < host page size we can incorrectly discard more
  memory than the guest asked us to.  I'm planning to address that
  soon.

Errors other than the ones discussed above will now be reported by
ram_block_discard_range(), rather than silently ignored, which means we
have a much better chance of seeing when something is going wrong.

Signed-off-by: David Gibson
---
 hw/virtio/virtio-balloon.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 7229afad6e..4435905c87 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -37,7 +37,29 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + offset;
-    qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
+    RAMBlock *rb;
+    size_t rb_page_size;
+    ram_addr_t ram_offset;
+
+    /* XXX is there a better way to get to the RAMBlock than via a
+     * host address? */
+    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb_page_size = qemu_ram_pagesize(rb);
+
+    /* Silently ignore hugepage RAM blocks */
+    if (rb_page_size != getpagesize()) {
+        return;
+    }
+
+    /* Silently ignore unaligned requests */
+    if (ram_offset & (rb_page_size - 1)) {
+        return;
+    }
+
+    ram_block_discard_range(rb, ram_offset, rb_page_size);
+    /* We ignore errors from ram_block_discard_range(), because it has
+     * already reported them, and failing to discard a balloon page is
+     * not fatal */
 }
 
 static const char *balloon_stat_names[] = {
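
For anyone who wants to check the madvise() behaviour the commit message
relies on, here is a small standalone sketch. It is not part of the patch
and uses no QEMU code. It assumes a Linux host and a private anonymous
mapping, and the helper names (is_page_aligned, madvise_unaligned_errno,
short_length_discards_whole_page) are made up for illustration. The first
helper is the same mask test the patch applies to ram_offset. The other
two probe the two kernel behaviours described above: an unaligned start
address is rejected, while a short length is rounded up to a whole host
page.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: the same mask test the patch uses on ram_offset.
 * Only valid when page_size is a power of two. */
static int is_page_aligned(uintptr_t offset, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (offset & (page_size - 1)) == 0;
}

/* Hypothetical probe: call madvise(MADV_DONTNEED) on an address that is
 * one byte past a page boundary.  Returns the errno of the failed call,
 * 0 if the call unexpectedly succeeded, or -1 if mmap() failed. */
static int madvise_unaligned_errno(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = mmap(NULL, page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int err = 0;

    if (base == MAP_FAILED) {
        return -1;
    }
    if (madvise(base + 1, page - 1, MADV_DONTNEED) != 0) {
        err = errno;
    }
    munmap(base, page);
    return err;
}

/* Hypothetical probe: dirty a whole page, then ask the kernel to drop
 * just its first byte.  Returns 1 if both the first and the last byte
 * of the page read back as zero afterwards, 0 if not, -1 if mmap()
 * failed. */
static int short_length_discards_whole_page(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = mmap(NULL, page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int dropped;

    if (base == MAP_FAILED) {
        return -1;
    }
    memset(base, 0xAA, page);
    dropped = madvise(base, 1, MADV_DONTNEED) == 0 &&
              base[0] == 0 && base[page - 1] == 0;
    munmap(base, page);
    return dropped;
}
```

On a Linux host I would expect madvise_unaligned_errno() to return EINVAL
and short_length_discards_whole_page() to return 1. That is the
over-discard case the third bullet describes when BALLOON_PAGE_SIZE is
smaller than the host page size. Other platforms may behave differently,
as the commit message notes.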