[v4,2/5] xen, gfx passthrough: create intel isa bridge

Message ID	1401440369-29929-3-git-send-email-tiejun.chen@intel.com
State	New
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> From: Tiejun Chen <tiejun.chen@intel.com> To: anthony.perard@citrix.com, stefano.stabellini@eu.citrix.com, mst@redhat.com, Kelly.Zytaruk@amd.com Date: Fri, 30 May 2014 16:59:26 +0800 Message-Id: <1401440369-29929-3-git-send-email-tiejun.chen@intel.com> In-Reply-To: <1401440369-29929-1-git-send-email-tiejun.chen@intel.com> References: <1401440369-29929-1-git-send-email-tiejun.chen@intel.com> Cc: peter.maydell@linaro.org, xen-devel@lists.xensource.com, allen.m.kay@intel.com, qemu-devel@nongnu.org, anthony@codemonkey.ws, yang.z.zhang@intel.com Subject: [Qemu-devel] [v4][PATCH 2/5] xen, gfx passthrough: create intel isa bridge Precedence: list Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Tiejun Chen May 30, 2014, 8:59 a.m. UTC

ISA bridge is needed since Intel gfx drive will probe it instead
of Dev31:Fun0 to make graphics device passthrough work easy for VMM, that
only need to expose ISA bridge to let driver know the real hardware underneath.

The original patch is from Allen Kay [allen.m.kay@intel.com]

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Cc: Allen Kay <allen.m.kay@intel.com>
---
v4:

* Remove some unnecessary "return" in void foo().

v3:

* Fix some typos.
* Improve some return paths.

v2:

* Nothing is changed.

 hw/xen/xen_pt_graphics.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

Konrad Rzeszutek Wilk May 30, 2014, 4:13 p.m. UTC | #1

On Fri, May 30, 2014 at 04:59:26PM +0800, Tiejun Chen wrote:
> ISA bridge is needed since Intel gfx drive will probe it instead
> of Dev31:Fun0 to make graphics device passthrough work easy for VMM, that
> only need to expose ISA bridge to let driver know the real hardware underneath.
> 
> The original patch is from Allen Kay [allen.m.kay@intel.com]
> 
> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> Cc: Allen Kay <allen.m.kay@intel.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> v4:
> 
> * Remove some unnecessary "return" in void foo().
> 
> v3:
> 
> * Fix some typos.
> * Improve some return paths.
> 
> v2:
> 
> * Nothing is changed.
> 
>  hw/xen/xen_pt_graphics.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 59 insertions(+)
> 
> diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
> index 461e526..236b13a 100644
> --- a/hw/xen/xen_pt_graphics.c
> +++ b/hw/xen/xen_pt_graphics.c
> @@ -230,3 +230,62 @@ out:
>      g_free(bios);
>      return rc;
>  }
> +
> +static uint32_t isa_bridge_read_config(PCIDevice *d, uint32_t addr, int len)
> +{
> +    return pci_default_read_config(d, addr, len);
> +}
> +
> +static void isa_bridge_write_config(PCIDevice *d, uint32_t addr, uint32_t v,
> +                                    int len)
> +{
> +    pci_default_write_config(d, addr, v, len);
> +}
> +
> +static void isa_bridge_class_init(ObjectClass *klass, void *data)
> +{
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +    k->config_read = isa_bridge_read_config;
> +    k->config_write = isa_bridge_write_config;
> +};
> +
> +typedef struct {
> +    PCIDevice dev;
> +} ISABridgeState;
> +
> +static TypeInfo isa_bridge_info = {
> +    .name          = "intel-pch-isa-bridge",
> +    .parent        = TYPE_PCI_DEVICE,
> +    .instance_size = sizeof(ISABridgeState),
> +    .class_init = isa_bridge_class_init,
> +};
> +
> +static void xen_pt_graphics_register_types(void)
> +{
> +    type_register_static(&isa_bridge_info);
> +}
> +
> +type_init(xen_pt_graphics_register_types)
> +
> +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
> +{
> +    struct PCIDevice *dev;
> +
> +    char rid;
> +
> +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");
> +
> +    qdev_init_nofail(&dev->qdev);
> +
> +    pci_config_set_vendor_id(dev->config, hdev->vendor_id);
> +    pci_config_set_device_id(dev->config, hdev->device_id);
> +
> +    xen_host_pci_get_block(hdev, PCI_REVISION_ID, (uint8_t *)&rid, 1);
> +
> +    pci_config_set_revision(dev->config, rid);
> +    pci_config_set_class(dev->config, PCI_CLASS_BRIDGE_ISA);
> +
> +    XEN_PT_LOG(dev, "Intel PCH ISA bridge created.\n");
> +    return 0;
> +}
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

Stefano Stabellini June 2, 2014, 2:52 p.m. UTC | #2

On Fri, 30 May 2014, Tiejun Chen wrote:
> ISA bridge is needed since Intel gfx drive will probe it instead
> of Dev31:Fun0 to make graphics device passthrough work easy for VMM, that
> only need to expose ISA bridge to let driver know the real hardware underneath.
> 
> The original patch is from Allen Kay [allen.m.kay@intel.com]
> 
> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
> Cc: Allen Kay <allen.m.kay@intel.com>

Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


> v4:
> 
> * Remove some unnecessary "return" in void foo().
> 
> v3:
> 
> * Fix some typos.
> * Improve some return paths.
> 
> v2:
> 
> * Nothing is changed.
> 
>  hw/xen/xen_pt_graphics.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 59 insertions(+)
> 
> diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
> index 461e526..236b13a 100644
> --- a/hw/xen/xen_pt_graphics.c
> +++ b/hw/xen/xen_pt_graphics.c
> @@ -230,3 +230,62 @@ out:
>      g_free(bios);
>      return rc;
>  }
> +
> +static uint32_t isa_bridge_read_config(PCIDevice *d, uint32_t addr, int len)
> +{
> +    return pci_default_read_config(d, addr, len);
> +}
> +
> +static void isa_bridge_write_config(PCIDevice *d, uint32_t addr, uint32_t v,
> +                                    int len)
> +{
> +    pci_default_write_config(d, addr, v, len);
> +}
> +
> +static void isa_bridge_class_init(ObjectClass *klass, void *data)
> +{
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +    k->config_read = isa_bridge_read_config;
> +    k->config_write = isa_bridge_write_config;
> +};
> +
> +typedef struct {
> +    PCIDevice dev;
> +} ISABridgeState;
> +
> +static TypeInfo isa_bridge_info = {
> +    .name          = "intel-pch-isa-bridge",
> +    .parent        = TYPE_PCI_DEVICE,
> +    .instance_size = sizeof(ISABridgeState),
> +    .class_init = isa_bridge_class_init,
> +};
> +
> +static void xen_pt_graphics_register_types(void)
> +{
> +    type_register_static(&isa_bridge_info);
> +}
> +
> +type_init(xen_pt_graphics_register_types)
> +
> +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
> +{
> +    struct PCIDevice *dev;
> +
> +    char rid;
> +
> +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");
> +
> +    qdev_init_nofail(&dev->qdev);
> +
> +    pci_config_set_vendor_id(dev->config, hdev->vendor_id);
> +    pci_config_set_device_id(dev->config, hdev->device_id);
> +
> +    xen_host_pci_get_block(hdev, PCI_REVISION_ID, (uint8_t *)&rid, 1);
> +
> +    pci_config_set_revision(dev->config, rid);
> +    pci_config_set_class(dev->config, PCI_CLASS_BRIDGE_ISA);
> +
> +    XEN_PT_LOG(dev, "Intel PCH ISA bridge created.\n");
> +    return 0;
> +}
> -- 
> 1.9.1
>

Paolo Bonzini June 3, 2014, 8:46 a.m. UTC | #3

Il 30/05/2014 10:59, Tiejun Chen ha scritto:
> +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
> +{
> +    struct PCIDevice *dev;
> +
> +    char rid;
> +
> +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");

This is really a huge hack.  You're going to have two ISA bridge devices 
in the machine, with the BIOS imagining that the "good" one is at 1f.0 
and the ACPI tables actually describing the other one.  But the PCI 
device at 1f.0 remains there and a driver in the OS could catch it---not 
just intel_detect_pch---and if you want to add such a hack it should be 
done in the Xen management layers.

If possible, the host bridge patches are even worse.  If you change the 
vendor and device ID while keeping the registers of the i440FX you're 
going to get conflicts or break firmware badly; TianoCore and SeaBIOS 
both expect the normal i440FX vendor and device IDs, for example.

The hardcoded list of offsets is also not acceptable.  It is also not 
clear who is accessing the registers, whether the BIOS or the driver. 
For Linux, a cursory look at the driver shows that it only accesses 
0x50/0x52 of the listed offsets, but also 0x44/0x48 ("MCH BAR"), what 
happens if that code path is encountered?

The main problem with IGD passthrough is the incestuous (and that's a 
euphemism) relationship between the MCH, PCH and graphics driver.  It 
may make sense at the hardware level, but for virtualization it doesn't. 
  A virt-specific driver for GPU command passthrough (with aid from the 
kernel driver, but abstracting all the MCH/PCH-dependent details) would 
make much more sense.

It's really not your fault, there's not much you can do given the 
hardware architecture.  But I don't think this code can be accepted 
upstream, sorry.

Paolo

Stefano Stabellini June 3, 2014, 11:29 a.m. UTC | #4

On Tue, 3 Jun 2014, Paolo Bonzini wrote:
> Il 30/05/2014 10:59, Tiejun Chen ha scritto:
> > +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
> > +{
> > +    struct PCIDevice *dev;
> > +
> > +    char rid;
> > +
> > +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");
> 
> This is really a huge hack.  You're going to have two ISA bridge devices in
> the machine, with the BIOS imagining that the "good" one is at 1f.0 and the
> ACPI tables actually describing the other one.  But the PCI device at 1f.0
> remains there and a driver in the OS could catch it---not just
> intel_detect_pch---and if you want to add such a hack it should be done in the
> Xen management layers.
> 
> If possible, the host bridge patches are even worse.  If you change the vendor
> and device ID while keeping the registers of the i440FX you're going to get
> conflicts or break firmware badly; TianoCore and SeaBIOS both expect the
> normal i440FX vendor and device IDs, for example.
> 
> The hardcoded list of offsets is also not acceptable.  It is also not clear
> who is accessing the registers, whether the BIOS or the driver. For Linux, a
> cursory look at the driver shows that it only accesses 0x50/0x52 of the listed
> offsets, but also 0x44/0x48 ("MCH BAR"), what happens if that code path is
> encountered?
> 
> The main problem with IGD passthrough is the incestuous (and that's a
> euphemism) relationship between the MCH, PCH and graphics driver.  It may make
> sense at the hardware level, but for virtualization it doesn't.  A
> virt-specific driver for GPU command passthrough (with aid from the kernel
> driver, but abstracting all the MCH/PCH-dependent details) would make much
> more sense.
> 
> It's really not your fault, there's not much you can do given the hardware
> architecture.  But I don't think this code can be accepted upstream, sorry.

Yeah, the code is not nice and it is not Tiejun's fault.

Is there any way at all he could change the patch series to make it more
appealing to you? Or maybe we could having more clearly separated from
the rest of the codebase?


Otherwise I hate to have to diverge again from upstream QEMU but given
that we were already carrying these changes in the old
qemu-xen-traditional tree without issues, I feel that it would be unfair
for me not to merge them in the upstream based qemu-xen tree.
Unfortunately I imagine that the lack of this feature could be
considered a regression for us.

Do the other Xen maintainters have any opinions on this? I would
appreciate your opinions.

Paolo Bonzini June 3, 2014, 11:39 a.m. UTC | #5

Il 03/06/2014 13:29, Stefano Stabellini ha scritto:
>> It's really not your fault, there's not much you can do given the hardware
>> architecture.  But I don't think this code can be accepted upstream, sorry.
>
> Yeah, the code is not nice and it is not Tiejun's fault.
>
> Is there any way at all he could change the patch series to make it more
> appealing to you? Or maybe we could having more clearly separated from
> the rest of the codebase?

I don't know, perhaps a new "-M pc-xen-igd-passthrough" machine type 
based on what you're using now (pc-1.6?) but with all the required hacks?

Paolo

George Dunlap June 3, 2014, 11:42 a.m. UTC | #6

On 06/03/2014 12:29 PM, Stefano Stabellini wrote:
> On Tue, 3 Jun 2014, Paolo Bonzini wrote:
>> Il 30/05/2014 10:59, Tiejun Chen ha scritto:
>>> +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
>>> +{
>>> +    struct PCIDevice *dev;
>>> +
>>> +    char rid;
>>> +
>>> +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");
>> This is really a huge hack.  You're going to have two ISA bridge devices in
>> the machine, with the BIOS imagining that the "good" one is at 1f.0 and the
>> ACPI tables actually describing the other one.  But the PCI device at 1f.0
>> remains there and a driver in the OS could catch it---not just
>> intel_detect_pch---and if you want to add such a hack it should be done in the
>> Xen management layers.
>>
>> If possible, the host bridge patches are even worse.  If you change the vendor
>> and device ID while keeping the registers of the i440FX you're going to get
>> conflicts or break firmware badly; TianoCore and SeaBIOS both expect the
>> normal i440FX vendor and device IDs, for example.
>>
>> The hardcoded list of offsets is also not acceptable.  It is also not clear
>> who is accessing the registers, whether the BIOS or the driver. For Linux, a
>> cursory look at the driver shows that it only accesses 0x50/0x52 of the listed
>> offsets, but also 0x44/0x48 ("MCH BAR"), what happens if that code path is
>> encountered?
>>
>> The main problem with IGD passthrough is the incestuous (and that's a
>> euphemism) relationship between the MCH, PCH and graphics driver.  It may make
>> sense at the hardware level, but for virtualization it doesn't.  A
>> virt-specific driver for GPU command passthrough (with aid from the kernel
>> driver, but abstracting all the MCH/PCH-dependent details) would make much
>> more sense.
>>
>> It's really not your fault, there's not much you can do given the hardware
>> architecture.  But I don't think this code can be accepted upstream, sorry.
> Yeah, the code is not nice and it is not Tiejun's fault.
>
> Is there any way at all he could change the patch series to make it more
> appealing to you? Or maybe we could having more clearly separated from
> the rest of the codebase?
>
>
> Otherwise I hate to have to diverge again from upstream QEMU but given
> that we were already carrying these changes in the old
> qemu-xen-traditional tree without issues, I feel that it would be unfair
> for me not to merge them in the upstream based qemu-xen tree.
> Unfortunately I imagine that the lack of this feature could be
> considered a regression for us.
>
> Do the other Xen maintainters have any opinions on this? I would
> appreciate your opinions.

Well my very initial take is to say that it was a mistake to accept the 
IGD stuff into qemu-xen-traditional before making sure that it would be 
suitable for qemu-upstream.  Avoiding having a fork again (or 
maintaining a set of out-of-tree patches) is more important than this 
one feature, IMHO.

When Intel submitted this for qemu-xen-traditional, we should have 
recommended to them at that time to start with qemu-upstream; or, we 
should have made it clear that if they chose to submit it to 
qemu-xen-traditional first, they would be taking the risk of having it 
only be there.  If we didn't warn them of that, that was a mistake on 
our part; but I don't think we can do anything other than apologize.  
(And of course see if there *is* a way to actually get it upstream.)

  -George

Stefano Stabellini June 3, 2014, 11:43 a.m. UTC | #7

On Tue, 3 Jun 2014, Paolo Bonzini wrote:
> Il 03/06/2014 13:29, Stefano Stabellini ha scritto:
> > > It's really not your fault, there's not much you can do given the hardware
> > > architecture.  But I don't think this code can be accepted upstream,
> > > sorry.
> > 
> > Yeah, the code is not nice and it is not Tiejun's fault.
> > 
> > Is there any way at all he could change the patch series to make it more
> > appealing to you? Or maybe we could having more clearly separated from
> > the rest of the codebase?
> 
> I don't know, perhaps a new "-M pc-xen-igd-passthrough" machine type based on
> what you're using now (pc-1.6?) but with all the required hacks?

That might be a good compromise as it would make it easier to identify
regressions caused by these changes. I would be happy with that.

Sander Eikelenboom June 3, 2014, 12:21 p.m. UTC | #8

Tuesday, June 3, 2014, 1:42:44 PM, you wrote:

> On 06/03/2014 12:29 PM, Stefano Stabellini wrote:
>> On Tue, 3 Jun 2014, Paolo Bonzini wrote:
>>> Il 30/05/2014 10:59, Tiejun Chen ha scritto:
>>>> +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
>>>> +{
>>>> +    struct PCIDevice *dev;
>>>> +
>>>> +    char rid;
>>>> +
>>>> +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");
>>> This is really a huge hack.  You're going to have two ISA bridge devices in
>>> the machine, with the BIOS imagining that the "good" one is at 1f.0 and the
>>> ACPI tables actually describing the other one.  But the PCI device at 1f.0
>>> remains there and a driver in the OS could catch it---not just
>>> intel_detect_pch---and if you want to add such a hack it should be done in the
>>> Xen management layers.
>>>
>>> If possible, the host bridge patches are even worse.  If you change the vendor
>>> and device ID while keeping the registers of the i440FX you're going to get
>>> conflicts or break firmware badly; TianoCore and SeaBIOS both expect the
>>> normal i440FX vendor and device IDs, for example.
>>>
>>> The hardcoded list of offsets is also not acceptable.  It is also not clear
>>> who is accessing the registers, whether the BIOS or the driver. For Linux, a
>>> cursory look at the driver shows that it only accesses 0x50/0x52 of the listed
>>> offsets, but also 0x44/0x48 ("MCH BAR"), what happens if that code path is
>>> encountered?
>>>
>>> The main problem with IGD passthrough is the incestuous (and that's a
>>> euphemism) relationship between the MCH, PCH and graphics driver.  It may make
>>> sense at the hardware level, but for virtualization it doesn't.  A
>>> virt-specific driver for GPU command passthrough (with aid from the kernel
>>> driver, but abstracting all the MCH/PCH-dependent details) would make much
>>> more sense.
>>>
>>> It's really not your fault, there's not much you can do given the hardware
>>> architecture.  But I don't think this code can be accepted upstream, sorry.
>> Yeah, the code is not nice and it is not Tiejun's fault.
>>
>> Is there any way at all he could change the patch series to make it more
>> appealing to you? Or maybe we could having more clearly separated from
>> the rest of the codebase?
>>
>>
>> Otherwise I hate to have to diverge again from upstream QEMU but given
>> that we were already carrying these changes in the old
>> qemu-xen-traditional tree without issues, I feel that it would be unfair
>> for me not to merge them in the upstream based qemu-xen tree.
>> Unfortunately I imagine that the lack of this feature could be
>> considered a regression for us.
>>
>> Do the other Xen maintainters have any opinions on this? I would
>> appreciate your opinions.

> Well my very initial take is to say that it was a mistake to accept the 
> IGD stuff into qemu-xen-traditional before making sure that it would be 
> suitable for qemu-upstream.  Avoiding having a fork again (or 
> maintaining a set of out-of-tree patches) is more important than this 
> one feature, IMHO.

> When Intel submitted this for qemu-xen-traditional, we should have 
> recommended to them at that time to start with qemu-upstream; or, we 
> should have made it clear that if they chose to submit it to 
> qemu-xen-traditional first, they would be taking the risk of having it 
> only be there.  If we didn't warn them of that, that was a mistake on 
> our part; but I don't think we can do anything other than apologize.  
> (And of course see if there *is* a way to actually get it upstream.)

>   -George

Which make me wonder how much of the hackery is due to fundamental passthrough
issues of vga devices and how much is due to driver/firmware assumptions made because the 
driver/firmware was made with an integrated device in mind ?

The first part (probably legacy vga stuff) would probably be acceptable since 
it's probably shared for all vga devices (vendor independent) and would also be 
required for KVM.

The last part should perhaps be fixed in the driver/firmware (for instance 
the assumption the device is always device 00:02.0, that's valid for real intel 
hardware, but not necessary for passthrough or emulation). A requirement for 
using the latest firmware/drivers for passthrough to work could be acceptable i 
guess ?

--
Sander

Paolo Bonzini June 3, 2014, 12:24 p.m. UTC | #9

Il 03/06/2014 14:21, Sander Eikelenboom ha scritto:
>
> The last part should perhaps be fixed in the driver/firmware (for instance
> the assumption the device is always device 00:02.0, that's valid for real intel
> hardware, but not necessary for passthrough or emulation). A requirement for
> using the latest firmware/drivers for passthrough to work could be acceptable i
> guess ?

The problem is worse than that.  The assumptions are that the driver can 
make choices based on the device/vendor IDs of 00:1f.0, and that the 
driver can read/write to the config space of 00:00.0.

Paolo

Sander Eikelenboom June 3, 2014, 12:38 p.m. UTC | #10

Tuesday, June 3, 2014, 2:24:40 PM, you wrote:

> Il 03/06/2014 14:21, Sander Eikelenboom ha scritto:
>>
>> The last part should perhaps be fixed in the driver/firmware (for instance
>> the assumption the device is always device 00:02.0, that's valid for real intel
>> hardware, but not necessary for passthrough or emulation). A requirement for
>> using the latest firmware/drivers for passthrough to work could be acceptable i
>> guess ?

> The problem is worse than that.  The assumptions are that the driver can 
> make choices based on the device/vendor IDs of 00:1f.0, and that the 
> driver can read/write to the config space of 00:00.0.

> Paolo

It's quite tempting to start making assumptions and shortcuts when writing drivers for 
hardware fixed in a platfrom (IGD's), they probably wouldn't have done that when they 
were also making seperate PCI(-e) graphic cards with these chips because it 
would blow up on all sorts of machines (which is what happens now when trying to 
pass it through (or emulate it) on another platform).

--
Sander

Tian, Kevin June 3, 2014, 11:24 p.m. UTC | #11

> From: Paolo Bonzini
> Sent: Tuesday, June 03, 2014 4:40 AM
> 
> Il 03/06/2014 13:29, Stefano Stabellini ha scritto:
> >> It's really not your fault, there's not much you can do given the hardware
> >> architecture.  But I don't think this code can be accepted upstream, sorry.
> >
> > Yeah, the code is not nice and it is not Tiejun's fault.
> >
> > Is there any way at all he could change the patch series to make it more
> > appealing to you? Or maybe we could having more clearly separated from
> > the rest of the codebase?
> 
> I don't know, perhaps a new "-M pc-xen-igd-passthrough" machine type
> based on what you're using now (pc-1.6?) but with all the required hacks?
> 

Note the tricks here are not Xen specific, because it's a driver assumption.
If KVM wants to support igd passthrough, same tricks are required too.

Thanks
Kevin

Zhang, Yang Z June 6, 2014, 3:06 a.m. UTC | #12

Paolo Bonzini wrote on 2014-06-03:
> Il 30/05/2014 10:59, Tiejun Chen ha scritto:
>> +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
>> +{
>> +    struct PCIDevice *dev;
>> +
>> +    char rid;
>> +
>> +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");
> 
> This is really a huge hack.  You're going to have two ISA bridge devices
> in the machine, with the BIOS imagining that the "good" one is at 1f.0

Definitely. So how about expose a fake device at 00:1f.0 but not Isa Bridge? I have discussion with gfx driver developer, it is ok for them to check the device on 00:1f.0 no matter what device it is.

> and the ACPI tables actually describing the other one.  But the PCI
> device at 1f.0 remains there and a driver in the OS could catch it---not
> just intel_detect_pch---and if you want to add such a hack it should be
> done in the Xen management layers.
> 
> If possible, the host bridge patches are even worse.  If you change the
> vendor and device ID while keeping the registers of the i440FX you're
> going to get conflicts or break firmware badly; TianoCore and SeaBIOS
> both expect the normal i440FX vendor and device IDs, for example.

I only see the class id is changed but not vendor and device id. 

> 
> The hardcoded list of offsets is also not acceptable.  It is also not
> clear who is accessing the registers, whether the BIOS or the driver.
> For Linux, a cursory look at the driver shows that it only accesses
> 0x50/0x52 of the listed offsets, but also 0x44/0x48 ("MCH BAR"), what
> happens if that code path is encountered?

Will have a double check.

> 
> The main problem with IGD passthrough is the incestuous (and that's a
> euphemism) relationship between the MCH, PCH and graphics driver.  It
> may make sense at the hardware level, but for virtualization it doesn't.
>   A virt-specific driver for GPU command passthrough (with aid from the
> kernel driver, but abstracting all the MCH/PCH-dependent details) would
> make much more sense.

Agree. But it is too hard.

> 
> It's really not your fault, there's not much you can do given the
> hardware architecture.  But I don't think this code can be accepted
> upstream, sorry.
> 
> Paolo


Best regards,
Yang

Paolo Bonzini June 6, 2014, 6:44 a.m. UTC | #13

Il 06/06/2014 05:06, Zhang, Yang Z ha scritto:
> Paolo Bonzini wrote on 2014-06-03:
>> Il 30/05/2014 10:59, Tiejun Chen ha scritto:
>>> +static int create_pch_isa_bridge(PCIBus *bus, XenHostPCIDevice *hdev)
>>> +{
>>> +    struct PCIDevice *dev;
>>> +
>>> +    char rid;
>>> +
>>> +    dev = pci_create(bus, PCI_DEVFN(0x1f, 0), "intel-pch-isa-bridge");
>>
>> This is really a huge hack.  You're going to have two ISA bridge devices
>> in the machine, with the BIOS imagining that the "good" one is at 1f.0
>
> Definitely. So how about expose a fake device at 00:1f.0 but not Isa Bridge? I have discussion with gfx driver developer, it is ok for them to check the device on 00:1f.0 no matter what device it is.

That would be slightly better.

>> and the ACPI tables actually describing the other one.  But the PCI
>> device at 1f.0 remains there and a driver in the OS could catch it---not
>> just intel_detect_pch---and if you want to add such a hack it should be
>> done in the Xen management layers.
>>
>> If possible, the host bridge patches are even worse.  If you change the
>> vendor and device ID while keeping the registers of the i440FX you're
>> going to get conflicts or break firmware badly; TianoCore and SeaBIOS
>> both expect the normal i440FX vendor and device IDs, for example.
>
> I only see the class id is changed but not vendor and device id.

Yes, and the class ID is a typo probably.

But when the guest reads the vendor and device ID, igd_pci_read passes 
it through.  So effectively it changes, if I read the code correctly.

Paolo

>>
>> The hardcoded list of offsets is also not acceptable.  It is also not
>> clear who is accessing the registers, whether the BIOS or the driver.
>> For Linux, a cursory look at the driver shows that it only accesses
>> 0x50/0x52 of the listed offsets, but also 0x44/0x48 ("MCH BAR"), what
>> happens if that code path is encountered?
>
> Will have a double check.
>
>>
>> The main problem with IGD passthrough is the incestuous (and that's a
>> euphemism) relationship between the MCH, PCH and graphics driver.  It
>> may make sense at the hardware level, but for virtualization it doesn't.
>>   A virt-specific driver for GPU command passthrough (with aid from the
>> kernel driver, but abstracting all the MCH/PCH-dependent details) would
>> make much more sense.
>
> Agree. But it is too hard.
>
>>
>> It's really not your fault, there's not much you can do given the
>> hardware architecture.  But I don't think this code can be accepted
>> upstream, sorry.
>>
>> Paolo
>
>
> Best regards,
> Yang
>
>
>
>

[v4,2/5] xen, gfx passthrough: create intel isa bridge

Commit Message

Comments

Patch