From patchwork Mon Jul 23 20:16:34 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Roth
X-Patchwork-Id: 947998
From: Michael Roth
To: qemu-devel@nongnu.org
Cc: qemu-stable@nongnu.org, Peter Xu, "Michael S . Tsirkin"
Date: Mon, 23 Jul 2018 15:16:34 -0500
Message-Id: <20180723201748.25573-26-mdroth@linux.vnet.ibm.com>
In-Reply-To: <20180723201748.25573-1-mdroth@linux.vnet.ibm.com>
References: <20180723201748.25573-1-mdroth@linux.vnet.ibm.com>
Subject: [Qemu-devel] [PATCH 25/99] intel-iommu: send PSI always even if
 across PDEs

From: Peter Xu

SECURITY IMPLICATION: without this patch, any guest with both an
assigned device and a vIOMMU might encounter stale IO page mappings even
after the guest has unmapped the page, which may lead to guest memory
corruption.  The stale mappings are limited to the guest's own memory
range, so this should not affect host memory or other guests on the
host.

During IOVA page table walking, there is a special case when the PSI
covers one whole PDE (Page Directory Entry, which contains 512 Page
Table Entries) or more.  Previously, we skipped that entry and did not
notify the IOMMU notifiers.  This is not correct: we should send an
UNMAP notification to the registered UNMAP notifiers in this case.

For UNMAP-only notifiers, this might leave IOTLB entries cached in the
devices even though they are already invalid.  For MAP/UNMAP notifiers
like vfio-pci, this will cause stale page mappings.

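The special case described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not QEMU code: `UnmapEvent`, `EventLog`, `toy_walk` and the `fixed` flag are hypothetical names invented for illustration, and the walker looks at a single directory level with 4 KiB pages. It shows that a non-present PDE should produce one UNMAP event covering the whole 512-entry (2 MiB) range instead of being skipped silently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12   /* assumption: 4 KiB pages */
#define PTES_PER_PDE 512  /* one PDE covers 512 PTEs, as in the text */
#define PDE_SPAN     ((uint64_t)PTES_PER_PDE << PAGE_SHIFT) /* 2 MiB */
#define MAX_EVENTS   16

/* Hypothetical stand-in for an UNMAP notification. */
typedef struct {
    uint64_t iova;      /* start of the unmapped range */
    uint64_t addr_mask; /* size of the range minus one */
} UnmapEvent;

/* Hypothetical recorder playing the role of a registered notifier. */
typedef struct {
    UnmapEvent events[MAX_EVENTS];
    size_t count;
} EventLog;

/*
 * Walk npde directory slots starting at IOVA 0 and return the number of
 * recorded events.  pde_present[i] says whether slot i points to a page
 * table.  With fixed == false, non-present slots are skipped without any
 * notification (the behaviour the text calls incorrect); with
 * fixed == true, each non-present slot yields one UNMAP event for its
 * whole 2 MiB range.
 */
static size_t toy_walk(const bool *pde_present, size_t npde,
                       bool notify_unmap, bool fixed, EventLog *log)
{
    for (size_t i = 0; i < npde; i++) {
        if (pde_present[i]) {
            continue; /* a real walker would descend into the table here */
        }
        if (notify_unmap && fixed && log->count < MAX_EVENTS) {
            log->events[log->count].iova = (uint64_t)i * PDE_SPAN;
            log->events[log->count].addr_mask = PDE_SPAN - 1;
            log->count++;
        }
    }
    return log->count;
}
```

Calling `toy_walk()` on a directory whose middle slot is not present records nothing with `fixed == false`, and records a single event for that slot's 2 MiB range with `fixed == true`.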
This special case does not trigger often, but it is very easy to
trigger with nested device assignment, since in that case we may map
the whole L2 guest RAM region into the device's IOVA address space
(several GBs at least), which is far bigger than what a normal kernel
driver maps for the device (normally tens of MBs).

Without this patch applied to L1 QEMU, nested device assignment to L2
guests will dump errors like:

qemu-system-x86_64: VFIO_MAP_DMA: -17
qemu-system-x86_64: vfio_dma_map(0x557305420c30, 0xad000, 0x1000,
                    0x7f89a920d000) = -17 (File exists)

CC: QEMU Stable
Acked-by: Jason Wang
[peterx: rewrite the commit message]
Signed-off-by: Peter Xu
Reviewed-by: Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin
(cherry picked from commit 36d2d52bdb45f5b753a61fdaf0fe7891f1f5b61d)
Signed-off-by: Michael Roth
---
 hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++------------
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index fb31de9416..b359efd6f9 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -722,6 +722,15 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
 
 typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
 
+static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level,
+                             vtd_page_walk_hook hook_fn, void *private)
+{
+    assert(hook_fn);
+    trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr,
+                            entry->addr_mask, entry->perm);
+    return hook_fn(entry, private);
+}
+
 /**
  * vtd_page_walk_level - walk over specific level for IOVA range
  *
@@ -781,28 +790,37 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
          */
         entry_valid = read_cur | write_cur;
 
+        entry.target_as = &address_space_memory;
+        entry.iova = iova & subpage_mask;
+        entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
+        entry.addr_mask = ~subpage_mask;
+
         if (vtd_is_last_slpte(slpte, level)) {
-            entry.target_as = &address_space_memory;
-            entry.iova = iova & subpage_mask;
             /* NOTE: this is only meaningful if entry_valid == true */
             entry.translated_addr = vtd_get_slpte_addr(slpte, aw);
-            entry.addr_mask = ~subpage_mask;
-            entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur);
             if (!entry_valid && !notify_unmap) {
                 trace_vtd_page_walk_skip_perm(iova, iova_next);
                 goto next;
             }
-            trace_vtd_page_walk_one(level, entry.iova, entry.translated_addr,
-                                    entry.addr_mask, entry.perm);
-            if (hook_fn) {
-                ret = hook_fn(&entry, private);
-                if (ret < 0) {
-                    return ret;
-                }
+            ret = vtd_page_walk_one(&entry, level, hook_fn, private);
+            if (ret < 0) {
+                return ret;
             }
         } else {
             if (!entry_valid) {
-                trace_vtd_page_walk_skip_perm(iova, iova_next);
+                if (notify_unmap) {
+                    /*
+                     * The whole entry is invalid; unmap it all.
+                     * Translated address is meaningless, zero it.
+                     */
+                    entry.translated_addr = 0x0;
+                    ret = vtd_page_walk_one(&entry, level, hook_fn, private);
+                    if (ret < 0) {
+                        return ret;
+                    }
+                } else {
+                    trace_vtd_page_walk_skip_perm(iova, iova_next);
+                }
                 goto next;
             }
             ret = vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iova,