diff mbox series

[PULL,1/2] amd_iommu: Fix pte_override_page_mask()

Message ID 20210422222429.183108-2-mst@redhat.com
State New
Headers show
Series [PULL,1/2] amd_iommu: Fix pte_override_page_mask() | expand

Commit Message

Michael S. Tsirkin April 22, 2021, 10:24 p.m. UTC
From: Jean-Philippe Brucker <jean-philippe@linaro.org>

AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page
size. Quoting the AMD IOMMU specification: "When the Next Level bits [of
a pte] are 7h, the size of the page is determined by the first zero bit
in the page address, starting from bit 12."

So if the lowest bits of the page address is 0, the page is 8kB. If the
lowest bits are 011, the page is 32kB. Currently pte_override_page_mask()
doesn't compute the right value for this page size and amdvi_translate()
can return the wrong guest-physical address. With a Linux guest, DMA
from SATA devices accesses the wrong memory and causes probe failure:

qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \
		-device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0
[    6.613093] ata1.00: qc timeout (cmd 0xec)
[    6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)

Fix the page mask.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/i386/amd_iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Peter Maydell April 23, 2021, 1:01 p.m. UTC | #1
On Thu, 22 Apr 2021 at 23:24, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>
> AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page
> size. Quoting the AMD IOMMU specification: "When the Next Level bits [of
> a pte] are 7h, the size of the page is determined by the first zero bit
> in the page address, starting from bit 12."
>
> So if the lowest bits of the page address is 0, the page is 8kB. If the
> lowest bits are 011, the page is 32kB. Currently pte_override_page_mask()
> doesn't compute the right value for this page size and amdvi_translate()
> can return the wrong guest-physical address. With a Linux guest, DMA
> from SATA devices accesses the wrong memory and causes probe failure:
>
> qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \
>                 -device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0
> [    6.613093] ata1.00: qc timeout (cmd 0xec)
> [    6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>
> Fix the page mask.
>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Jean-Philippe, do you know if this is a regression since 5.2?
I'm guessing not given that the function in question has been that
way since the amd_iommu was introduced in 2016.

thanks
-- PMM
Jean-Philippe Brucker April 23, 2021, 1:35 p.m. UTC | #2
On Fri, Apr 23, 2021 at 02:01:19PM +0100, Peter Maydell wrote:
> On Thu, 22 Apr 2021 at 23:24, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >
> > AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page
> > size. Quoting the AMD IOMMU specification: "When the Next Level bits [of
> > a pte] are 7h, the size of the page is determined by the first zero bit
> > in the page address, starting from bit 12."
> >
> > So if the lowest bits of the page address is 0, the page is 8kB. If the
> > lowest bits are 011, the page is 32kB. Currently pte_override_page_mask()
> > doesn't compute the right value for this page size and amdvi_translate()
> > can return the wrong guest-physical address. With a Linux guest, DMA
> > from SATA devices accesses the wrong memory and causes probe failure:
> >
> > qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \
> >                 -device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0
> > [    6.613093] ata1.00: qc timeout (cmd 0xec)
> > [    6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> >
> > Fix the page mask.
> >
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org>
> > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Jean-Philippe, do you know if this is a regression since 5.2?

I don't think so, I can reproduce it with v5.2.0.

> I'm guessing not given that the function in question has been that
> way since the amd_iommu was introduced in 2016.

There has been a lot of work on the AMD IOMMU driver in Linux recently.
Maybe that exacerbated the problem but I can't find a relevant change.
It's also possible that this path hasn't been exercised before - I just
happened to run a SATA device under AMD IOMMU this week to debug an
unrelated Linux issue. The other devices in the VM don't seem to have a
problem doing DMA.

Thanks,
Jean
Peter Maydell April 23, 2021, 4:11 p.m. UTC | #3
On Fri, 23 Apr 2021 at 14:35, Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> On Fri, Apr 23, 2021 at 02:01:19PM +0100, Peter Maydell wrote:
> > On Thu, 22 Apr 2021 at 23:24, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > >
> > > AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page
> > > size. Quoting the AMD IOMMU specification: "When the Next Level bits [of
> > > a pte] are 7h, the size of the page is determined by the first zero bit
> > > in the page address, starting from bit 12."
> > >
> > > So if the lowest bits of the page address is 0, the page is 8kB. If the
> > > lowest bits are 011, the page is 32kB. Currently pte_override_page_mask()
> > > doesn't compute the right value for this page size and amdvi_translate()
> > > can return the wrong guest-physical address. With a Linux guest, DMA
> > > from SATA devices accesses the wrong memory and causes probe failure:
> > >
> > > qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \
> > >                 -device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0
> > > [    6.613093] ata1.00: qc timeout (cmd 0xec)
> > > [    6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> > >
> > > Fix the page mask.
> > >
> > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > > Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org>
> > > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > Jean-Philippe, do you know if this is a regression since 5.2?
>
> I don't think so, I can reproduce it with v5.2.0.

OK, thanks; I think I favour not putting this into rc5, then.

-- PMM
Jean-Philippe Brucker April 26, 2021, 7:49 a.m. UTC | #4
On Fri, Apr 23, 2021 at 05:11:33PM +0100, Peter Maydell wrote:
> > > Jean-Philippe, do you know if this is a regression since 5.2?
> >
> > I don't think so, I can reproduce it with v5.2.0.
> 
> OK, thanks; I think I favour not putting this into rc5, then.

No problem, please let me know if I should resend after the next release

Thanks,
Jean
diff mbox series

Patch

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 74a93a5d93..43b6e9bf51 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -860,8 +860,8 @@  static inline uint8_t get_pte_translation_mode(uint64_t pte)
 
 static inline uint64_t pte_override_page_mask(uint64_t pte)
 {
-    uint8_t page_mask = 12;
-    uint64_t addr = (pte & AMDVI_DEV_PT_ROOT_MASK) ^ AMDVI_DEV_PT_ROOT_MASK;
+    uint8_t page_mask = 13;
+    uint64_t addr = (pte & AMDVI_DEV_PT_ROOT_MASK) >> 12;
     /* find the first zero bit */
     while (addr & 1) {
         page_mask++;