Message ID | 5204C0D8.5010205@ozlabs.ru |
---|---|
State | New |
Headers | show |
Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: > On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>> this patch introduces a helper/wrapper function. >>>>> >>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>>>> [aik: added inheritance from parent if iommu is not set for the current bus] >>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>>>> >>>>> --- >>>>> Changes: >>>>> v2: >>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>> --- >>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>> include/hw/pci/pci.h | 1 + >>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>> >>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>> index 4c004f5..0072b54 100644 >>>>> --- a/hw/pci/pci.c >>>>> +++ b/hw/pci/pci.c >>>>> @@ -812,12 +812,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>> } >>>>> >>>>> pci_dev->bus = bus; >>>>> - if (bus->iommu_fn) { >>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>> - } else { >>>>> - /* FIXME: inherit memory region from bus creator */ >>>>> - dma_as = &address_space_memory; >>>>> - } >>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>> >>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>> OBJECT(pci_dev), "bus master", >>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass *klass, void *data) >>>>> k->props = pci_props; >>>>> } >>>>> >>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>> +{ >>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>> + >>>>> + if (bus->iommu_fn) { >>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>> + } >>>>> + >>>>> + if (bus->parent_dev) { >>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>> + } >>>> >>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>> either. >>> >>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/ >> >> Doh, I misread the code, I thought it was the "parent" field in >> BusState. Why do we have parent_dev at all? > > The code is too old? Don't know. > > >>>> You can use object_dynamic_cast to convert the parent_dev to >>>> PCIDevice, and if the cast succeeds you call the new function. >>>> >>>> Perhaps you could make the new function take a PCIBus instead. >>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>> bus-master address space) doesn't make much sense, VFIO is really a >>>> special case. Putting the new API on the bus side instead looks better. >>>> >>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for >>>> devices sitting on the secondary bus?) >>> >>> It happens naturally I guess when linux enables devices. >> >> Yes, but then using the IOMMU address space would be wrong; you would >> have to use the bus-master address space as a base for the child's >> bus-master address space. > > > Like this? Works too. > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index 8bdcedc..a4c70e6 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -2247,23 +2247,23 @@ AddressSpace > *pci_device_iommu_address_space(PCIDevice *dev) > { > PCIBus *bus = PCI_BUS(dev->bus); > > if (bus->iommu_fn) { > return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); > } > > if (bus->parent_dev) { > return pci_device_iommu_address_space(bus->parent_dev); > } > > - return &address_space_memory; > + return &dev->bus_master_as; > } I was thinking more like this: if (bus->parent_dev) { - return pci_device_iommu_address_space(bus->parent_dev); + /* Take parent device's bus-master enable bit into account. */ + return pci_get_address_space(bus->parent_dev); } + /* Not a secondary bus and no IOMMU. Use system memory. */ return &address_space_memory; Paolo > > >> Also, we would have to fix the x86 firmware. >> >> Paolo >> > >
On 08/09/2013 08:19 PM, Paolo Bonzini wrote: > Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: >> On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>>> this patch introduces a helper/wrapper function. >>>>>> >>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>>>>> [aik: added inheritance from parent if iommu is not set for the current bus] >>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>>>>> >>>>>> --- >>>>>> Changes: >>>>>> v2: >>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>>> --- >>>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>>> include/hw/pci/pci.h | 1 + >>>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>>> >>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>> index 4c004f5..0072b54 100644 >>>>>> --- a/hw/pci/pci.c >>>>>> +++ b/hw/pci/pci.c >>>>>> @@ -812,12 +812,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>>> } >>>>>> >>>>>> pci_dev->bus = bus; >>>>>> - if (bus->iommu_fn) { >>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>>> - } else { >>>>>> - /* FIXME: inherit memory region from bus creator */ >>>>>> - dma_as = &address_space_memory; >>>>>> - } >>>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>>> >>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>>> OBJECT(pci_dev), "bus master", >>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass *klass, void *data) >>>>>> k->props = pci_props; >>>>>> } >>>>>> >>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>>> +{ >>>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>>> + >>>>>> + if (bus->iommu_fn) { >>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>>> + } >>>>>> + >>>>>> + if (bus->parent_dev) { >>>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>>> + } >>>>> >>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>>> either. >>>> >>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/ >>> >>> Doh, I misread the code, I thought it was the "parent" field in >>> BusState. Why do we have parent_dev at all? >> >> The code is too old? Don't know. >> >> >>>>> You can use object_dynamic_cast to convert the parent_dev to >>>>> PCIDevice, and if the cast succeeds you call the new function. >>>>> >>>>> Perhaps you could make the new function take a PCIBus instead. >>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>>> bus-master address space) doesn't make much sense, VFIO is really a >>>>> special case. Putting the new API on the bus side instead looks better. >>>>> >>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for >>>>> devices sitting on the secondary bus?) >>>> >>>> It happens naturally I guess when linux enables devices. >>> >>> Yes, but then using the IOMMU address space would be wrong; you would >>> have to use the bus-master address space as a base for the child's >>> bus-master address space. >> >> >> Like this? Works too. >> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> index 8bdcedc..a4c70e6 100644 >> --- a/hw/pci/pci.c >> +++ b/hw/pci/pci.c >> @@ -2247,23 +2247,23 @@ AddressSpace >> *pci_device_iommu_address_space(PCIDevice *dev) >> { >> PCIBus *bus = PCI_BUS(dev->bus); >> >> if (bus->iommu_fn) { >> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >> } >> >> if (bus->parent_dev) { >> return pci_device_iommu_address_space(bus->parent_dev); >> } >> >> - return &address_space_memory; >> + return &dev->bus_master_as; >> } > > I was thinking more like this: > > if (bus->parent_dev) { > - return pci_device_iommu_address_space(bus->parent_dev); > + /* Take parent device's bus-master enable bit into account. */ > + return pci_get_address_space(bus->parent_dev); > } > > + /* Not a secondary bus and no IOMMU. Use system memory. */ > return &address_space_memory; Ok. Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices being in the same address space as every single device has its own "bus master" AddressSpace. > > Paolo > >> >> >>> Also, we would have to fix the x86 firmware. btw what would it mean? :) >>> >>> Paolo >>> >> >> >
Il 09/08/2013 12:58, Alexey Kardashevskiy ha scritto: > On 08/09/2013 08:19 PM, Paolo Bonzini wrote: >> Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: >>> On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >>>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>>>> this patch introduces a helper/wrapper function. >>>>>>> >>>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>>>>>> [aik: added inheritance from parent if iommu is not set for the current bus] >>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>>>>>> >>>>>>> --- >>>>>>> Changes: >>>>>>> v2: >>>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>>>> --- >>>>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>>>> include/hw/pci/pci.h | 1 + >>>>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>>>> >>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>>> index 4c004f5..0072b54 100644 >>>>>>> --- a/hw/pci/pci.c >>>>>>> +++ b/hw/pci/pci.c >>>>>>> @@ -812,12 +812,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>>>> } >>>>>>> >>>>>>> pci_dev->bus = bus; >>>>>>> - if (bus->iommu_fn) { >>>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>>>> - } else { >>>>>>> - /* FIXME: inherit memory region from bus creator */ >>>>>>> - dma_as = &address_space_memory; >>>>>>> - } >>>>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>>>> >>>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>>>> OBJECT(pci_dev), "bus master", >>>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass *klass, void *data) >>>>>>> k->props = pci_props; >>>>>>> } >>>>>>> >>>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>>>> +{ >>>>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>>>> + >>>>>>> + if (bus->iommu_fn) { >>>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>>>> + } >>>>>>> + >>>>>>> + if (bus->parent_dev) { >>>>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>>>> + } >>>>>> >>>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>>>> either. >>>>> >>>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/ >>>> >>>> Doh, I misread the code, I thought it was the "parent" field in >>>> BusState. Why do we have parent_dev at all? >>> >>> The code is too old? Don't know. >>> >>> >>>>>> You can use object_dynamic_cast to convert the parent_dev to >>>>>> PCIDevice, and if the cast succeeds you call the new function. >>>>>> >>>>>> Perhaps you could make the new function take a PCIBus instead. >>>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>>>> bus-master address space) doesn't make much sense, VFIO is really a >>>>>> special case. Putting the new API on the bus side instead looks better. >>>>>> >>>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for >>>>>> devices sitting on the secondary bus?) >>>>> >>>>> It happens naturally I guess when linux enables devices. >>>> >>>> Yes, but then using the IOMMU address space would be wrong; you would >>>> have to use the bus-master address space as a base for the child's >>>> bus-master address space. >>> >>> >>> Like this? Works too. >>> >>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>> index 8bdcedc..a4c70e6 100644 >>> --- a/hw/pci/pci.c >>> +++ b/hw/pci/pci.c >>> @@ -2247,23 +2247,23 @@ AddressSpace >>> *pci_device_iommu_address_space(PCIDevice *dev) >>> { >>> PCIBus *bus = PCI_BUS(dev->bus); >>> >>> if (bus->iommu_fn) { >>> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>> } >>> >>> if (bus->parent_dev) { >>> return pci_device_iommu_address_space(bus->parent_dev); >>> } >>> >>> - return &address_space_memory; >>> + return &dev->bus_master_as; >>> } >> >> I was thinking more like this: >> >> if (bus->parent_dev) { >> - return pci_device_iommu_address_space(bus->parent_dev); >> + /* Take parent device's bus-master enable bit into account. */ >> + return pci_get_address_space(bus->parent_dev); >> } >> >> + /* Not a secondary bus and no IOMMU. Use system memory. */ >> return &address_space_memory; > > Ok. > > Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices > being in the same address space as every single device has its own "bus > master" AddressSpace. Yes, fixing that check is another good use of the new API you introduced. But after Ben's answer, I guess the above change is not really needed. It would add complication for VFIO, too. Proper emulation would require QEMU to trap writes to the device's bus-master bit. QEMU would have to take of the value that the guest writes, AND it with the bus-master bit(s) of all bridges between the host bridge and the VFIO device, and write the result to the passed-through device. This is because the bridges are emulated, and do not exist in real hardware. >>>> Also, we would have to fix the x86 firmware. > > btw what would it mean? :) Firmware would have to look for bridges and enable bus master DMA on them. Otherwise, for example a SCSI controller below a bridge (not virtio-scsi) would fail to boot. Paolo
On 08/09/2013 09:07 PM, Paolo Bonzini wrote: > Il 09/08/2013 12:58, Alexey Kardashevskiy ha scritto: >> On 08/09/2013 08:19 PM, Paolo Bonzini wrote: >>> Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: >>>> On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >>>>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>>>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>>>>> this patch introduces a helper/wrapper function. >>>>>>>> >>>>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>>>>>>> [aik: added inheritance from parent if iommu is not set for the current bus] >>>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>>>>>>> >>>>>>>> --- >>>>>>>> Changes: >>>>>>>> v2: >>>>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>>>>> --- >>>>>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>>>>> include/hw/pci/pci.h | 1 + >>>>>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>>>>> >>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>>>> index 4c004f5..0072b54 100644 >>>>>>>> --- a/hw/pci/pci.c >>>>>>>> +++ b/hw/pci/pci.c >>>>>>>> @@ -812,12 +812,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>>>>> } >>>>>>>> >>>>>>>> pci_dev->bus = bus; >>>>>>>> - if (bus->iommu_fn) { >>>>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>>>>> - } else { >>>>>>>> - /* FIXME: inherit memory region from bus creator */ >>>>>>>> - dma_as = &address_space_memory; >>>>>>>> - } >>>>>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>>>>> >>>>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>>>>> OBJECT(pci_dev), "bus master", >>>>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass *klass, void *data) >>>>>>>> k->props = pci_props; >>>>>>>> } >>>>>>>> >>>>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>>>>> +{ >>>>>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>>>>> + >>>>>>>> + if (bus->iommu_fn) { >>>>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>>>>> + } >>>>>>>> + >>>>>>>> + if (bus->parent_dev) { >>>>>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>>>>> + } >>>>>>> >>>>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>>>>> either. >>>>>> >>>>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/ >>>>> >>>>> Doh, I misread the code, I thought it was the "parent" field in >>>>> BusState. Why do we have parent_dev at all? >>>> >>>> The code is too old? Don't know. >>>> >>>> >>>>>>> You can use object_dynamic_cast to convert the parent_dev to >>>>>>> PCIDevice, and if the cast succeeds you call the new function. >>>>>>> >>>>>>> Perhaps you could make the new function take a PCIBus instead. >>>>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>>>>> bus-master address space) doesn't make much sense, VFIO is really a >>>>>>> special case. Putting the new API on the bus side instead looks better. >>>>>>> >>>>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for >>>>>>> devices sitting on the secondary bus?) >>>>>> >>>>>> It happens naturally I guess when linux enables devices. >>>>> >>>>> Yes, but then using the IOMMU address space would be wrong; you would >>>>> have to use the bus-master address space as a base for the child's >>>>> bus-master address space. >>>> >>>> >>>> Like this? Works too. >>>> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>> index 8bdcedc..a4c70e6 100644 >>>> --- a/hw/pci/pci.c >>>> +++ b/hw/pci/pci.c >>>> @@ -2247,23 +2247,23 @@ AddressSpace >>>> *pci_device_iommu_address_space(PCIDevice *dev) >>>> { >>>> PCIBus *bus = PCI_BUS(dev->bus); >>>> >>>> if (bus->iommu_fn) { >>>> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>> } >>>> >>>> if (bus->parent_dev) { >>>> return pci_device_iommu_address_space(bus->parent_dev); >>>> } >>>> >>>> - return &address_space_memory; >>>> + return &dev->bus_master_as; >>>> } >>> >>> I was thinking more like this: >>> >>> if (bus->parent_dev) { >>> - return pci_device_iommu_address_space(bus->parent_dev); >>> + /* Take parent device's bus-master enable bit into account. */ >>> + return pci_get_address_space(bus->parent_dev); >>> } >>> >>> + /* Not a secondary bus and no IOMMU. Use system memory. */ >>> return &address_space_memory; >> >> Ok. >> >> Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices >> being in the same address space as every single device has its own "bus >> master" AddressSpace. > > Yes, fixing that check is another good use of the new API you introduced. > > But after Ben's answer, I guess the above change is not really needed. > It would add complication for VFIO, too. Proper emulation would require > QEMU to trap writes to the device's bus-master bit. QEMU would have to > take of the value that the guest writes, AND it with the bus-master > bit(s) of all bridges between the host bridge and the VFIO device, and > write the result to the passed-through device. This is because the > bridges are emulated, and do not exist in real hardware. So does this mean that we go with the original patch and ignore bus master address space here? I guess I could overcome that VFIO check by comparing not just AddressSpace but AddressSpace->root if AddressSpace is different but it does not make a lot of sense. > >>>>> Also, we would have to fix the x86 firmware. >> >> btw what would it mean? :) > > Firmware would have to look for bridges and enable bus master DMA on > them. Otherwise, for example a SCSI controller below a bridge (not > virtio-scsi) would fail to boot. > > Paolo >
Il 09/08/2013 13:21, Alexey Kardashevskiy ha scritto: > On 08/09/2013 09:07 PM, Paolo Bonzini wrote: >> Il 09/08/2013 12:58, Alexey Kardashevskiy ha scritto: >>> On 08/09/2013 08:19 PM, Paolo Bonzini wrote: >>>> Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: >>>>> On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >>>>>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>>>>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>>>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>>>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>>>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>>>>>> this patch introduces a helper/wrapper function. >>>>>>>>> >>>>>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>>>>>>>> [aik: added inheritance from parent if iommu is not set for the current bus] >>>>>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> Changes: >>>>>>>>> v2: >>>>>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>>>>>> --- >>>>>>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>>>>>> include/hw/pci/pci.h | 1 + >>>>>>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>>>>> index 4c004f5..0072b54 100644 >>>>>>>>> --- a/hw/pci/pci.c >>>>>>>>> +++ b/hw/pci/pci.c >>>>>>>>> @@ -812,12 +812,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>>>>>> } >>>>>>>>> >>>>>>>>> pci_dev->bus = bus; >>>>>>>>> - if (bus->iommu_fn) { >>>>>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>>>>>> - } else { >>>>>>>>> - /* FIXME: inherit memory region from bus creator */ >>>>>>>>> - dma_as = &address_space_memory; >>>>>>>>> - } >>>>>>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>>>>>> >>>>>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>>>>>> OBJECT(pci_dev), "bus master", >>>>>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass *klass, void *data) >>>>>>>>> k->props = pci_props; >>>>>>>>> } >>>>>>>>> >>>>>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>>>>>> +{ >>>>>>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>>>>>> + >>>>>>>>> + if (bus->iommu_fn) { >>>>>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + if (bus->parent_dev) { >>>>>>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>>>>>> + } >>>>>>>> >>>>>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>>>>>> either. >>>>>>> >>>>>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/ >>>>>> >>>>>> Doh, I misread the code, I thought it was the "parent" field in >>>>>> BusState. Why do we have parent_dev at all? >>>>> >>>>> The code is too old? Don't know. >>>>> >>>>> >>>>>>>> You can use object_dynamic_cast to convert the parent_dev to >>>>>>>> PCIDevice, and if the cast succeeds you call the new function. >>>>>>>> >>>>>>>> Perhaps you could make the new function take a PCIBus instead. >>>>>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>>>>>> bus-master address space) doesn't make much sense, VFIO is really a >>>>>>>> special case. Putting the new API on the bus side instead looks better. >>>>>>>> >>>>>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for >>>>>>>> devices sitting on the secondary bus?) >>>>>>> >>>>>>> It happens naturally I guess when linux enables devices. >>>>>> >>>>>> Yes, but then using the IOMMU address space would be wrong; you would >>>>>> have to use the bus-master address space as a base for the child's >>>>>> bus-master address space. >>>>> >>>>> >>>>> Like this? Works too. >>>>> >>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>> index 8bdcedc..a4c70e6 100644 >>>>> --- a/hw/pci/pci.c >>>>> +++ b/hw/pci/pci.c >>>>> @@ -2247,23 +2247,23 @@ AddressSpace >>>>> *pci_device_iommu_address_space(PCIDevice *dev) >>>>> { >>>>> PCIBus *bus = PCI_BUS(dev->bus); >>>>> >>>>> if (bus->iommu_fn) { >>>>> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>> } >>>>> >>>>> if (bus->parent_dev) { >>>>> return pci_device_iommu_address_space(bus->parent_dev); >>>>> } >>>>> >>>>> - return &address_space_memory; >>>>> + return &dev->bus_master_as; >>>>> } >>>> >>>> I was thinking more like this: >>>> >>>> if (bus->parent_dev) { >>>> - return pci_device_iommu_address_space(bus->parent_dev); >>>> + /* Take parent device's bus-master enable bit into account. */ >>>> + return pci_get_address_space(bus->parent_dev); >>>> } >>>> >>>> + /* Not a secondary bus and no IOMMU. Use system memory. */ >>>> return &address_space_memory; >>> >>> Ok. >>> >>> Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices >>> being in the same address space as every single device has its own "bus >>> master" AddressSpace. >> >> Yes, fixing that check is another good use of the new API you introduced. >> >> But after Ben's answer, I guess the above change is not really needed. >> It would add complication for VFIO, too. Proper emulation would require >> QEMU to trap writes to the device's bus-master bit. QEMU would have to >> take of the value that the guest writes, AND it with the bus-master >> bit(s) of all bridges between the host bridge and the VFIO device, and >> write the result to the passed-through device. This is because the >> bridges are emulated, and do not exist in real hardware. > > So does this mean that we go with the original patch and ignore bus master > address space here? Yes. Just add a comment that we are ignoring the bus master DMA bit of the bridge. > I guess I could overcome that VFIO check by comparing not just AddressSpace > but AddressSpace->root if AddressSpace is different but it does not make a > lot of sense. Paolo
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 8bdcedc..a4c70e6 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2247,23 +2247,23 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) { PCIBus *bus = PCI_BUS(dev->bus); if (bus->iommu_fn) { return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); } if (bus->parent_dev) { return pci_device_iommu_address_space(bus->parent_dev); } - return &address_space_memory; + return &dev->bus_master_as; } > Also, we would have to fix the x86 firmware. > > Paolo >