diff mbox

[1/2] i386: Prepare for interrupt remapping

Message ID 20160314032503.GA27208@pxdev.xzpeter.org
State New
Headers show

Commit Message

Peter Xu March 14, 2016, 3:25 a.m. UTC
On Fri, Mar 11, 2016 at 08:27:57AM +0100, Jan Kiszka wrote:
[...]
> >> @@ -282,7 +288,7 @@ static void vtd_generate_interrupt(IntelIOMMUState *s, hwaddr mesg_addr_reg,
> >>      data = vtd_get_long_raw(s, mesg_data_reg);
> >>  
> >>      VTD_DPRINTF(FLOG, "msi: addr 0x%"PRIx64 " data 0x%"PRIx32, addr, data);
> >> -    address_space_stl_le(&address_space_memory, addr, data,
> >> +    address_space_stl_le(get_dma_address_space(), addr, data,
> >>                           MEMTXATTRS_UNSPECIFIED, NULL);
> >>  }
> > 
> > Would this work? AFAIU, IOMMU generated fault interrupts does not
> > need any translation at all.
> 
> get_dma_address_space() returns the native one, untranslated. If you
> look at the succeeding patch, we replace the address spaces of those
> devices that are under IOMMU control. And the IOMMU continues to use
> this one.

I did misunderstood. Thanks to point out.

> 
> > 
> > One more question about the design itself: I see that one new AS is
> > created for DMA address space named dma_address_space. Could you
> > help explain why we need this? I am still naive on QEMU memory, what
> > I feel is that, current memory framework can work nicely without
> > this extra address space, using existing address translation
> > mechanisms, like the implementation in the following patch:
> > 
> > https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg04393.html
> > 
> > With the new address space, we will need more loops when doing
> > memory address translation for IR (in address_space_translate()). 
> 
> At the time of designing this (about 1.5 years ago), there were no
> memory region attributes yet. So the per device address spaces also
> helped with identifying MSI request sources. Of course, they also helped
> with modelling which devices get remapped and which not. We need to
> rethink this now, in the light of memory region attributes.

Yes. IMHO we should be able to get source information (in this case,
BDF) even without both MR attributes and above changes? E.g., still
in the patch:

https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg04393.html

Currently in vtd_mem_ir_write(), "void *opaque" is not used yet. If
we pass "VTDAddressSpace" instead of "IntelIOMMUState" when creating
the memory region, then we will be able to get BDF in
vtd_mem_ir_write(), like:


Then, we can do further source validation checks as usual in
vtd_mem_ir_write().

This is not considering IOAPIC, etc.. Sure we may need more changes
(e.g., creating IOMMU address spaces for IOAPIC devices as well,
just like in current patch) to fully enable source validation of
IR.

Anyway, we should be able to avoid the extra "int_remap_as" and more
loops of translations between different address spaces, right?
Please correct me if I am wrong.

[...]

> >>  static void kvm_apic_reset(APICCommonState *s)
> >> @@ -182,8 +186,10 @@ static void kvm_apic_realize(DeviceState *dev, Error **errp)
> >>  {
> >>      APICCommonState *s = APIC_COMMON(dev);
> >>  
> >> -    memory_region_init_io(&s->io_memory, NULL, &kvm_apic_io_ops, s, "kvm-apic-msi",
> >> -                          APIC_SPACE_SIZE);
> >> +    memory_region_init(&s->io_memory, NULL, "kvm-apic", APIC_SPACE_SIZE);
> >> +
> >> +    memory_region_init_io(&s->msi_region, NULL, &kvm_msi_region_ops, NULL,
> >> +                          "kvm-msi", MSI_REGION_SIZE);
> > 
> > I do not quite understand why we need to have two MRs. Could you
> > help explain too?
> 
> MSI requests from the devices have nothing to do with APIC access from
> the CPUs - two different sources, two different target (a CPU can't
> trigger MSIs, and devices can't access the APICs). This is currently
> mangled due to past limitations of QEMU, and that should be cleaned up
> eventually. E.g. by introducing a DMA address spaces.

Seems make sense. I just started to curious about what would happen
if we write to MSI from CPU... If we can do that, it would be cool
too, when we want to inject some manual interrupts. Never know the
truth.

Thanks.
Peter
diff mbox

Patch

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 34aa1fa..34946c8 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2057,7 +2057,7 @@  VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
         memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
                                  &s->iommu_ops, "intel_iommu", UINT64_MAX);
         memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
-                              &vtd_mem_ir_ops, s, "intel_iommu_ir",
+                              &vtd_mem_ir_ops, vtd_dev_as, "intel_iommu_ir",
                               VTD_INTERRUPT_ADDR_SIZE);
         memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR_FIRST,
                                     &vtd_dev_as->iommu_ir);