diff mbox

[RFC,v3,02/14] intel_iommu: simplify irq region translation

Message ID 1484276800-26814-3-git-send-email-peterx@redhat.com
State New
Headers show

Commit Message

Peter Xu Jan. 13, 2017, 3:06 a.m. UTC
Before we have int-remap, we need to bypass interrupt write requests.
That's not necessary now - we have supported int-remap, and all the irq
region requests should be redirected there. Cleaning up the block with
an assertion instead.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hw/i386/intel_iommu.c | 28 ++++++----------------------
 1 file changed, 6 insertions(+), 22 deletions(-)

Comments

Tian, Kevin Jan. 20, 2017, 8:22 a.m. UTC | #1
> From: Peter Xu [mailto:peterx@redhat.com]
> Sent: Friday, January 13, 2017 11:06 AM
> 
> Before we have int-remap, we need to bypass interrupt write requests.
> That's not necessary now - we have supported int-remap, and all the irq
> region requests should be redirected there. Cleaning up the block with
> an assertion instead.

This comment is not accurate. According to code, the reason why you
can do such simplification is because we have standalone memory
region now for interrupt addresses. There should be nothing to do 
with int-remap, which can be disabled by guest... Maybe the standalone
region was added when developing int-remap, but functionally they
are not related. :-)

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  hw/i386/intel_iommu.c | 28 ++++++----------------------
>  1 file changed, 6 insertions(+), 22 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 2868e37..77d467a 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -818,28 +818,12 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as,
> PCIBus *bus,
>      bool writes = true;
>      VTDIOTLBEntry *iotlb_entry;
> 
> -    /* Check if the request is in interrupt address range */
> -    if (vtd_is_interrupt_addr(addr)) {
> -        if (is_write) {
> -            /* FIXME: since we don't know the length of the access here, we
> -             * treat Non-DWORD length write requests without PASID as
> -             * interrupt requests, too. Withoud interrupt remapping support,
> -             * we just use 1:1 mapping.
> -             */
> -            VTD_DPRINTF(MMU, "write request to interrupt address "
> -                        "gpa 0x%"PRIx64, addr);
> -            entry->iova = addr & VTD_PAGE_MASK_4K;
> -            entry->translated_addr = addr & VTD_PAGE_MASK_4K;
> -            entry->addr_mask = ~VTD_PAGE_MASK_4K;
> -            entry->perm = IOMMU_WO;
> -            return;
> -        } else {
> -            VTD_DPRINTF(GENERAL, "error: read request from interrupt address "
> -                        "gpa 0x%"PRIx64, addr);
> -            vtd_report_dmar_fault(s, source_id, addr, VTD_FR_READ, is_write);
> -            return;
> -        }
> -    }
> +    /*
> +     * We have standalone memory region for interrupt addresses, we
> +     * should never receive translation requests in this region.
> +     */
> +    assert(!vtd_is_interrupt_addr(addr));
> +
>      /* Try to fetch slpte form IOTLB */
>      iotlb_entry = vtd_lookup_iotlb(s, source_id, addr);
>      if (iotlb_entry) {
> --
> 2.7.4
Peter Xu Jan. 20, 2017, 9:05 a.m. UTC | #2
On Fri, Jan 20, 2017 at 08:22:14AM +0000, Tian, Kevin wrote:
> > From: Peter Xu [mailto:peterx@redhat.com]
> > Sent: Friday, January 13, 2017 11:06 AM
> > 
> > Before we have int-remap, we need to bypass interrupt write requests.
> > That's not necessary now - we have supported int-remap, and all the irq
> > region requests should be redirected there. Cleaning up the block with
> > an assertion instead.
> 
> This comment is not accurate. According to code, the reason why you
> can do such simplification is because we have standalone memory
> region now for interrupt addresses. There should be nothing to do 
> with int-remap, which can be disabled by guest... Maybe the standalone
> region was added when developing int-remap, but functionally they
> are not related. :-)

IMHO the above commit message is fairly clear. :-)

But sure I can add some more emphasise like:

  "Before we have int-remap memory region, ..."

Do you think it's okay? Or any better suggestion?

(Just to mention that even guest disables IR, the MSI region will
 still be there.)

Thanks,

-- peterx
Tian, Kevin Jan. 20, 2017, 9:15 a.m. UTC | #3
> From: Peter Xu [mailto:peterx@redhat.com]

> Sent: Friday, January 20, 2017 5:05 PM

> 

> On Fri, Jan 20, 2017 at 08:22:14AM +0000, Tian, Kevin wrote:

> > > From: Peter Xu [mailto:peterx@redhat.com]

> > > Sent: Friday, January 13, 2017 11:06 AM

> > >

> > > Before we have int-remap, we need to bypass interrupt write requests.

> > > That's not necessary now - we have supported int-remap, and all the irq

> > > region requests should be redirected there. Cleaning up the block with

> > > an assertion instead.

> >

> > This comment is not accurate. According to code, the reason why you

> > can do such simplification is because we have standalone memory

> > region now for interrupt addresses. There should be nothing to do

> > with int-remap, which can be disabled by guest... Maybe the standalone

> > region was added when developing int-remap, but functionally they

> > are not related. :-)

> 

> IMHO the above commit message is fairly clear. :-)

> 

> But sure I can add some more emphasise like:

> 

>   "Before we have int-remap memory region, ..."

> 

> Do you think it's okay? Or any better suggestion?

> 

> (Just to mention that even guest disables IR, the MSI region will

>  still be there.)

> 


My option is simple - this patch has nothing to do with int-remap.
It's not necessary, not because we supported int-remap. It's because
we have a standalone memory region for interrupt addresses, as you
described in the code. :-)

Thanks
Kevin
Peter Xu Jan. 20, 2017, 9:27 a.m. UTC | #4
On Fri, Jan 20, 2017 at 09:15:27AM +0000, Tian, Kevin wrote:
> > From: Peter Xu [mailto:peterx@redhat.com]
> > Sent: Friday, January 20, 2017 5:05 PM
> > 
> > On Fri, Jan 20, 2017 at 08:22:14AM +0000, Tian, Kevin wrote:
> > > > From: Peter Xu [mailto:peterx@redhat.com]
> > > > Sent: Friday, January 13, 2017 11:06 AM
> > > >
> > > > Before we have int-remap, we need to bypass interrupt write requests.
> > > > That's not necessary now - we have supported int-remap, and all the irq
> > > > region requests should be redirected there. Cleaning up the block with
> > > > an assertion instead.
> > >
> > > This comment is not accurate. According to code, the reason why you
> > > can do such simplification is because we have standalone memory
> > > region now for interrupt addresses. There should be nothing to do
> > > with int-remap, which can be disabled by guest... Maybe the standalone
> > > region was added when developing int-remap, but functionally they
> > > are not related. :-)
> > 
> > IMHO the above commit message is fairly clear. :-)
> > 
> > But sure I can add some more emphasise like:
> > 
> >   "Before we have int-remap memory region, ..."
> > 
> > Do you think it's okay? Or any better suggestion?
> > 
> > (Just to mention that even guest disables IR, the MSI region will
> >  still be there.)
> > 
> 
> My option is simple - this patch has nothing to do with int-remap.
> It's not necessary, not because we supported int-remap. It's because
> we have a standalone memory region for interrupt addresses, as you
> described in the code. :-)

I really think they are the same thing...

How about this:

    Now we have a standalone memory region for MSI, all the irq region
    requests should be redirected there. Cleaning up the block with an
    assertion instead.

-- peterx
Tian, Kevin Jan. 20, 2017, 9:52 a.m. UTC | #5
> From: Peter Xu [mailto:peterx@redhat.com]

> Sent: Friday, January 20, 2017 5:28 PM

> 

> On Fri, Jan 20, 2017 at 09:15:27AM +0000, Tian, Kevin wrote:

> > > From: Peter Xu [mailto:peterx@redhat.com]

> > > Sent: Friday, January 20, 2017 5:05 PM

> > >

> > > On Fri, Jan 20, 2017 at 08:22:14AM +0000, Tian, Kevin wrote:

> > > > > From: Peter Xu [mailto:peterx@redhat.com]

> > > > > Sent: Friday, January 13, 2017 11:06 AM

> > > > >

> > > > > Before we have int-remap, we need to bypass interrupt write requests.

> > > > > That's not necessary now - we have supported int-remap, and all the irq

> > > > > region requests should be redirected there. Cleaning up the block with

> > > > > an assertion instead.

> > > >

> > > > This comment is not accurate. According to code, the reason why you

> > > > can do such simplification is because we have standalone memory

> > > > region now for interrupt addresses. There should be nothing to do

> > > > with int-remap, which can be disabled by guest... Maybe the standalone

> > > > region was added when developing int-remap, but functionally they

> > > > are not related. :-)

> > >

> > > IMHO the above commit message is fairly clear. :-)

> > >

> > > But sure I can add some more emphasise like:

> > >

> > >   "Before we have int-remap memory region, ..."

> > >

> > > Do you think it's okay? Or any better suggestion?

> > >

> > > (Just to mention that even guest disables IR, the MSI region will

> > >  still be there.)

> > >

> >

> > My option is simple - this patch has nothing to do with int-remap.

> > It's not necessary, not because we supported int-remap. It's because

> > we have a standalone memory region for interrupt addresses, as you

> > described in the code. :-)

> 

> I really think they are the same thing...

> 

> How about this:

> 

>     Now we have a standalone memory region for MSI, all the irq region

>     requests should be redirected there. Cleaning up the block with an

>     assertion instead.

> 


btw what about guest setups a valid mapping at 0xFEEx_xxxx in
its remapping structure, which is then programmed to virtual
device as DMA destination? Then when emulating that virtual DMA,
vtd_do_iommu_translate should simply return (maybe throw out
a warning for diagnostic purpose) instead of assert here. 

VT-d spec defines as below:

	Software must ensure the second-level paging-structure entries 
	are programmed not to remap input addresses to the interrupt 
	address range. Hardware behavior is undefined for memory 
	requests remapped to the interrupt address range.

I don't think "hardware behavior is undefined" is equal to "assert
thus kill VM"...

Thanks
Kevin
Peter Xu Jan. 20, 2017, 10:04 a.m. UTC | #6
On Fri, Jan 20, 2017 at 09:52:01AM +0000, Tian, Kevin wrote:

[...]

> btw what about guest setups a valid mapping at 0xFEEx_xxxx in
> its remapping structure, which is then programmed to virtual
> device as DMA destination? Then when emulating that virtual DMA,
> vtd_do_iommu_translate should simply return (maybe throw out
> a warning for diagnostic purpose) instead of assert here. 
> 
> VT-d spec defines as below:
> 
> 	Software must ensure the second-level paging-structure entries 
> 	are programmed not to remap input addresses to the interrupt 
> 	address range. Hardware behavior is undefined for memory 
> 	requests remapped to the interrupt address range.

Thanks for this reference. That's something I was curious about.

> 
> I don't think "hardware behavior is undefined" is equal to "assert
> thus kill VM"...

I don't think it will kill the VM. After we have the MSI region, it
should just use that IR region for everything (read/write/translate).
So iiuc when anyone setups IOVA mapping within range 0xfeexxxxx, then
a DMA will trigger an interrupt (rather than memory moves), but in
most cases the interrupt will be illegal since either the data is
invalid (e.g., non-zero reserved bits, or SID verification failure),
further it should trigger a vIOMMU fault (though IR fault reporting is
still incomplete, that's my next thing to do after this series).

Thanks,

-- peterx
Tian, Kevin Jan. 22, 2017, 4:42 a.m. UTC | #7
> From: Peter Xu [mailto:peterx@redhat.com]

> Sent: Friday, January 20, 2017 6:04 PM

> 

> On Fri, Jan 20, 2017 at 09:52:01AM +0000, Tian, Kevin wrote:

> 

> [...]

> 

> > btw what about guest setups a valid mapping at 0xFEEx_xxxx in

> > its remapping structure, which is then programmed to virtual

> > device as DMA destination? Then when emulating that virtual DMA,

> > vtd_do_iommu_translate should simply return (maybe throw out

> > a warning for diagnostic purpose) instead of assert here.

> >

> > VT-d spec defines as below:

> >

> > 	Software must ensure the second-level paging-structure entries

> > 	are programmed not to remap input addresses to the interrupt

> > 	address range. Hardware behavior is undefined for memory

> > 	requests remapped to the interrupt address range.

> 

> Thanks for this reference. That's something I was curious about.

> 

> >

> > I don't think "hardware behavior is undefined" is equal to "assert

> > thus kill VM"...

> 

> I don't think it will kill the VM. After we have the MSI region, it

> should just use that IR region for everything (read/write/translate).

> So iiuc when anyone setups IOVA mapping within range 0xfeexxxxx, then

> a DMA will trigger an interrupt (rather than memory moves), but in

> most cases the interrupt will be illegal since either the data is

> invalid (e.g., non-zero reserved bits, or SID verification failure),

> further it should trigger a vIOMMU fault (though IR fault reporting is

> still incomplete, that's my next thing to do after this series).

> 


Yes, you're right here. Sorry for bothering with my wrong understanding. :-)

Thanks
Kevin
Peter Xu Jan. 22, 2017, 4:50 a.m. UTC | #8
On Sun, Jan 22, 2017 at 04:42:13AM +0000, Tian, Kevin wrote:
> > From: Peter Xu [mailto:peterx@redhat.com]
> > Sent: Friday, January 20, 2017 6:04 PM
> > 
> > On Fri, Jan 20, 2017 at 09:52:01AM +0000, Tian, Kevin wrote:
> > 
> > [...]
> > 
> > > btw what about guest setups a valid mapping at 0xFEEx_xxxx in
> > > its remapping structure, which is then programmed to virtual
> > > device as DMA destination? Then when emulating that virtual DMA,
> > > vtd_do_iommu_translate should simply return (maybe throw out
> > > a warning for diagnostic purpose) instead of assert here.
> > >
> > > VT-d spec defines as below:
> > >
> > > 	Software must ensure the second-level paging-structure entries
> > > 	are programmed not to remap input addresses to the interrupt
> > > 	address range. Hardware behavior is undefined for memory
> > > 	requests remapped to the interrupt address range.
> > 
> > Thanks for this reference. That's something I was curious about.
> > 
> > >
> > > I don't think "hardware behavior is undefined" is equal to "assert
> > > thus kill VM"...
> > 
> > I don't think it will kill the VM. After we have the MSI region, it
> > should just use that IR region for everything (read/write/translate).
> > So iiuc when anyone setups IOVA mapping within range 0xfeexxxxx, then
> > a DMA will trigger an interrupt (rather than memory moves), but in
> > most cases the interrupt will be illegal since either the data is
> > invalid (e.g., non-zero reserved bits, or SID verification failure),
> > further it should trigger a vIOMMU fault (though IR fault reporting is
> > still incomplete, that's my next thing to do after this series).
> > 
> 
> Yes, you're right here. Sorry for bothering with my wrong understanding. :-)

No problem at all.

Looking forward to any of your further comments on v4. :-)

-- peterx
diff mbox

Patch

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 2868e37..77d467a 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -818,28 +818,12 @@  static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
     bool writes = true;
     VTDIOTLBEntry *iotlb_entry;
 
-    /* Check if the request is in interrupt address range */
-    if (vtd_is_interrupt_addr(addr)) {
-        if (is_write) {
-            /* FIXME: since we don't know the length of the access here, we
-             * treat Non-DWORD length write requests without PASID as
-             * interrupt requests, too. Withoud interrupt remapping support,
-             * we just use 1:1 mapping.
-             */
-            VTD_DPRINTF(MMU, "write request to interrupt address "
-                        "gpa 0x%"PRIx64, addr);
-            entry->iova = addr & VTD_PAGE_MASK_4K;
-            entry->translated_addr = addr & VTD_PAGE_MASK_4K;
-            entry->addr_mask = ~VTD_PAGE_MASK_4K;
-            entry->perm = IOMMU_WO;
-            return;
-        } else {
-            VTD_DPRINTF(GENERAL, "error: read request from interrupt address "
-                        "gpa 0x%"PRIx64, addr);
-            vtd_report_dmar_fault(s, source_id, addr, VTD_FR_READ, is_write);
-            return;
-        }
-    }
+    /*
+     * We have standalone memory region for interrupt addresses, we
+     * should never receive translation requests in this region.
+     */
+    assert(!vtd_is_interrupt_addr(addr));
+
     /* Try to fetch slpte form IOTLB */
     iotlb_entry = vtd_lookup_iotlb(s, source_id, addr);
     if (iotlb_entry) {