[v2,4/4] hw/pci-bridge: push down PXB in qtree in order to format PXB bus number

Message ID 1433956033-11584-5-git-send-email-lersek@redhat.com
State New

Commit Message

Laszlo Ersek June 10, 2015, 5:07 p.m. UTC
The PXB implementation doesn't allow firmware (SeaBIOS or OVMF) to boot
off devices behind the PXB. This happens because the
sysbus_get_fw_dev_path() function in "hw/core/sysbus.c" doesn't have
enough information to format a unique identifier for the PXB in question,
and consequently the OpenFirmware device path passed down to the guest
firmware in the "bootorder" fw_cfg file is unusable for identifying the
boot device.

For example, the command line fragment

  -device pxb,id=bridge1,bus_nr=4 \
  \
  -netdev user,id=netdev0 \
  -device e1000,netdev=netdev0,bus=bridge1,addr=2,bootindex=0

results in the following "bootorder" entry:

  /pci/pci-bridge@0/ethernet@2/ethernet-phy@0

The initial "pci" node is formatted by sysbus_get_fw_dev_path(), and the
resultant OpenFirmware device path is independent of bus_nr=4 -- and
therefore it is useless for identifying the device.

In this patch we insert a dummy bus between the main sysbus and the
TYPE_PXB_HOST device. Formatting child addresses is a bus class level
responsibility, which is exactly what we'll use here.

After the patch, the same command line fragment results in the following
OpenFirmware device path in the "bootorder" fw_cfg file:

  /extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2/ethernet-phy@0

The original, initial "/pci" fragment has been replaced with
"/extra-pci-roots/pxbhost@4", which (a) looks better, (b) provides all the
necessary information. sysbus_get_fw_dev_path() formats the first node
("extra-pci-roots") as always, and the new function
extra_pci_roots_bus_get_fw_dev_path() formats the second one
("pxbhost@4").
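The way the two leading nodes are put together can be illustrated with a small standalone sketch. It is not QEMU code: format_node() and append_node() are hypothetical helpers, and the only detail taken from the patch is the "%s@%x" format, which prints the bus number in hexadecimal after the firmware name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a QEMU function): format one path node as
 * "<fw_name>@<bus number in hex>", using the same "%s@%x" format as the patch.
 * Returns the number of characters snprintf() would have written.
 */
static int format_node(char *buf, size_t len, const char *fw_name,
                       unsigned bus_nr)
{
    assert(buf != NULL && fw_name != NULL);
    return snprintf(buf, len, "%s@%x", fw_name, bus_nr);
}

/*
 * Hypothetical helper: append "/<node>" to the NUL-terminated string in path.
 * Returns 0 on success, -1 if the result would not fit in len bytes.
 */
static int append_node(char *path, size_t len, const char *node)
{
    size_t used = strlen(path);
    int n;

    assert(used < len);
    n = snprintf(path + used, len - used, "/%s", node);
    return (n < 0 || (size_t)n >= len - used) ? -1 : 0;
}
```

Starting from an empty string and appending "extra-pci-roots", the node formatted for ("pxbhost", 4), and then the unchanged PCI nodes gives the path shown above. Because of "%x", a bus number such as 26 would appear as "pxbhost@1a".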

Here's the comparison ("diff -u -b -U28") between the "info qtree"
outputs, before and after (the hpet device is the first common line):

> --- qtree.before        2015-06-10 18:16:44.903359633 +0200
> +++ qtree.after 2015-06-10 18:19:01.904139538 +0200
> @@ -1,30 +1,33 @@
>  bus: main-system-bus
>    type System
> +  dev: extra-pci-roots, id ""
> +    bus: extra-pci-roots-bus.0
> +      type extra-pci-roots-bus
>    dev: pxb-host, id ""
>      bus: pxb-internal
>        type pxb-bus
>        dev: pci-basic-bridge, id "bridge1"
>          chassis_nr = 4 (0x4)
>          addr = 00.0
>          romfile = ""
>          rombar = 1 (0x1)
>          multifunction = false
>          command_serr_enable = true
>          class PCI bridge, addr 04:00.0, pci id 1b36:000a (sub 0000:0000)
>          bus: bridge1
>            type PCI
>            dev: e1000, id ""
>              mac = "52:54:00:12:34:56"
>              vlan = <null>
>              netdev = "netdev0"
>              autonegotiation = true
>              mitigation = true
>              addr = 02.0
>              romfile = ""
>              rombar = 1 (0x1)
>              multifunction = false
>              command_serr_enable = true
>              class Ethernet controller, addr 05:02.0, pci id 8086:100e (sub 1af4:1100)
>              bar 0: mem at 0x88100000 [0x8811ffff]
>              bar 1: i/o at 0xd000 [0xd03f]
>    dev: hpet, id ""

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---

Notes:
    v2:
    - new in v2, addressing the "bootorder" fw_cfg issue discussed earlier
    - Cc'd Markus, because I probably butchered a handful of QOM rules in
      this patch :)

 hw/pci-bridge/pci_expander_bridge.c | 45 +++++++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

Comments

Marcel Apfelbaum June 10, 2015, 7:26 p.m. UTC | #1
On 06/10/2015 08:07 PM, Laszlo Ersek wrote:
> The PXB implementation doesn't allow firmware (SeaBIOS or OVMF) to boot
> off devices behind the PXB. This happens because the
> sysbus_get_fw_dev_path() function in "hw/core/sysbus.c" doesn't have
> enough information to format a unique identifier for the PXB in question,
> and consequently the OpenFirmware device path passed down to the guest
> firmware in the "bootorder" fw_cfg file is unusable for identifying the
> boot device.
>
> For example, the command line fragment
>
>    -device pxb,id=bridge1,bus_nr=4 \
>    \
>    -netdev user,id=netdev0 \
>    -device e1000,netdev=netdev0,bus=bridge1,addr=2,bootindex=0
>
> results in the following "bootorder" entry:
>
>    /pci/pci-bridge@0/ethernet@2/ethernet-phy@0
>
> The initial "pci" node is formatted by sysbus_get_fw_dev_path(), and the
> resultant OpenFirmware device path is independent of bus_nr=4 -- and
> therefore it is useless for identifying the device.
>
> In this patch we insert a dummy bus between the main sysbus and the
> TYPE_PXB_HOST device. Formatting child addresses is a bus class level
> responsibility, which is exactly what we'll use here.
>
> After the patch, the same command line fragment results in the following
> OpenFirmware device path in the "bootorder" fw_cfg file:
>
>    /extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2/ethernet-phy@0
Here I have a little comment:
Thinking of it again, using pxbhost here is too specific. *Any kind* of PCI root
bridge will benefit from this notation, not only PXB.
I hope you are OK with it.
Maybe /extra-pci-roots/pci-root@4/ is more generic.

>
> The original, initial "/pci" fragment has been replaced with
> "/extra-pci-roots/pxbhost@4", which (a) looks better, (b) provides all the
> necessary information. sysbus_get_fw_dev_path() formats the first node
> ("extra-pci-roots") as always, and the new function
> extra_pci_roots_bus_get_fw_dev_path() formats the second one
> ("pxbhost@4").
The "extra-pci-roots" is a nice touch.

>
> Here's the comparison ("diff -u -b -U28") between the "info qtree"
> outputs, before and after (the hpet device is the first common line):
>
>> --- qtree.before        2015-06-10 18:16:44.903359633 +0200
>> +++ qtree.after 2015-06-10 18:19:01.904139538 +0200
>> @@ -1,30 +1,33 @@
>>   bus: main-system-bus
>>     type System
>> +  dev: extra-pci-roots, id ""
>> +    bus: extra-pci-roots-bus.0
>> +      type extra-pci-roots-bus
>>     dev: pxb-host, id ""
>>       bus: pxb-internal
>>         type pxb-bus
>>         dev: pci-basic-bridge, id "bridge1"
>>           chassis_nr = 4 (0x4)
>>           addr = 00.0
>>           romfile = ""
>>           rombar = 1 (0x1)
>>           multifunction = false
>>           command_serr_enable = true
>>           class PCI bridge, addr 04:00.0, pci id 1b36:000a (sub 0000:0000)
>>           bus: bridge1
>>             type PCI
>>             dev: e1000, id ""
>>               mac = "52:54:00:12:34:56"
>>               vlan = <null>
>>               netdev = "netdev0"
>>               autonegotiation = true
>>               mitigation = true
>>               addr = 02.0
>>               romfile = ""
>>               rombar = 1 (0x1)
>>               multifunction = false
>>               command_serr_enable = true
>>               class Ethernet controller, addr 05:02.0, pci id 8086:100e (sub 1af4:1100)
>>               bar 0: mem at 0x88100000 [0x8811ffff]
>>               bar 1: i/o at 0xd000 [0xd03f]
>>     dev: hpet, id ""
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Marcel Apfelbaum <marcel@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>
> Notes:
>      v2:
>      - new in v2, addressing the "bootorder" fw_cfg issue discussed earlier
>      - Cc'd Markus, because I probably butchered a handful of QOM rules in
>        this patch :)
>
>   hw/pci-bridge/pci_expander_bridge.c | 45 +++++++++++++++++++++++++++++++++++--
>   1 file changed, 43 insertions(+), 2 deletions(-)
>
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index c7a085d..30e93d6 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -19,6 +19,15 @@
>   #include "qemu/error-report.h"
>   #include "sysemu/numa.h"
>
> +/*
> + * We'll insert a dummy bus between the main sysbus and TYPE_PXB_HOST, because
> + * the main sysbus doesn't know how to format the bus number of the extra root
> + * bus in sysbus_get_fw_dev_path(). We'll use this dummy bus just for
> + * formatting that.
> + */
> +#define TYPE_EXTRA_PCI_ROOTS_DEV "extra-pci-roots"
> +#define TYPE_EXTRA_PCI_ROOTS_BUS "extra-pci-roots-bus"
> +
>   #define TYPE_PXB_BUS "pxb-bus"
>   #define PXB_BUS(obj) OBJECT_CHECK(PXBBus, (obj), TYPE_PXB_BUS)
>
> @@ -93,7 +102,8 @@ static void pxb_host_class_init(ObjectClass *class, void *data)
>       DeviceClass *dc = DEVICE_CLASS(class);
>       PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
>
> -    dc->fw_name = "pci";
> +    dc->fw_name = "pxbhost";
> +    dc->bus_type = TYPE_EXTRA_PCI_ROOTS_BUS;
>       hc->root_bus_path = pxb_host_root_bus_path;
>   }
>
> @@ -151,6 +161,8 @@ static int pxb_map_irq_fn(PCIDevice *pci_dev, int pin)
>   static int pxb_dev_initfn(PCIDevice *dev)
>   {
>       PXBDev *pxb = PXB_DEV(dev);
> +    DeviceState *extra_pci_roots_dev;
> +    BusState *extra_pci_roots_bus;
>       DeviceState *ds, *bds;
>       PCIBus *bus;
>       const char *dev_name = NULL;
> @@ -165,7 +177,10 @@ static int pxb_dev_initfn(PCIDevice *dev)
>           dev_name = dev->qdev.id;
>       }
>
> -    ds = qdev_create(NULL, TYPE_PXB_HOST);
> +    extra_pci_roots_dev = qdev_create(NULL, TYPE_EXTRA_PCI_ROOTS_DEV);
> +    extra_pci_roots_bus = qbus_create(TYPE_EXTRA_PCI_ROOTS_BUS,
> +                                      extra_pci_roots_dev, NULL);
Yep, this is the answer.

> +    ds = qdev_create(extra_pci_roots_bus, TYPE_PXB_HOST);
>       bus = pci_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
>
>       bus->parent_dev = dev;
> @@ -221,11 +236,37 @@ static const TypeInfo pxb_dev_info = {
>       .class_init    = pxb_dev_class_init,
>   };
>
> +static char *extra_pci_roots_bus_get_fw_dev_path(DeviceState *dev)
> +{
> +    return g_strdup_printf("%s@%x", qdev_fw_name(dev),
> +                           pxb_bus_num(PCI_HOST_BRIDGE(dev)->bus));
> +}
> +
> +static void extra_pci_roots_bus_class_init(ObjectClass *klass, void *data)
> +{
> +    BusClass *k = BUS_CLASS(klass);
> +
> +    k->get_fw_dev_path = extra_pci_roots_bus_get_fw_dev_path;
I tried to add this to the existing PXB Bus, but it didn't work of course.
The PXB Bus is *behind* the PXB device.

> +}
> +
> +static const TypeInfo extra_pci_roots_bus_info = {
> +    .name = TYPE_EXTRA_PCI_ROOTS_BUS,
> +    .parent = TYPE_BUS,
> +    .class_init = extra_pci_roots_bus_class_init,
> +};
> +
> +static const TypeInfo extra_pci_roots_bus_dev_info = {
> +    .name = TYPE_EXTRA_PCI_ROOTS_DEV,
> +    .parent = TYPE_SYS_BUS_DEVICE,
> +};
> +
>   static void pxb_register_types(void)
>   {
>       type_register_static(&pxb_bus_info);
>       type_register_static(&pxb_host_info);
>       type_register_static(&pxb_dev_info);
> +    type_register_static(&extra_pci_roots_bus_info);
> +    type_register_static(&extra_pci_roots_bus_dev_info);
>   }
>
>   type_init(pxb_register_types)
>

Thanks,
Marcel
Marcel Apfelbaum June 10, 2015, 7:26 p.m. UTC | #2
On 06/10/2015 08:07 PM, Laszlo Ersek wrote:
> The PXB implementation doesn't allow firmware (SeaBIOS or OVMF) to boot
> off devices behind the PXB. This happens because the
> sysbus_get_fw_dev_path() function in "hw/core/sysbus.c" doesn't have
> enough information to format a unique identifier for the PXB in question,
> and consequently the OpenFirmware device path passed down to the guest
> firmware in the "bootorder" fw_cfg file is unusable for identifying the
> boot device.
>
> For example, the command line fragment
>
>    -device pxb,id=bridge1,bus_nr=4 \
>    \
>    -netdev user,id=netdev0 \
>    -device e1000,netdev=netdev0,bus=bridge1,addr=2,bootindex=0
>
> results in the following "bootorder" entry:
>
>    /pci/pci-bridge@0/ethernet@2/ethernet-phy@0
>
> The initial "pci" node is formatted by sysbus_get_fw_dev_path(), and the
> resultant OpenFirmware device path is independent of bus_nr=4 -- and
> therefore it is useless for identifying the device.
>
> In this patch we insert a dummy bus between the main sysbus and the
> TYPE_PXB_HOST device. Formatting child addresses is a bus class level
> responsibility, which is exactly what we'll use here.
>
> After the patch, the same command line fragment results in the following
> OpenFirmware device path in the "bootorder" fw_cfg file:
>
>    /extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2/ethernet-phy@0
>
> The original, initial "/pci" fragment has been replaced with
> "/extra-pci-roots/pxbhost@4", which (a) looks better, (b) provides all the
> necessary information. sysbus_get_fw_dev_path() formats the first node
> ("extra-pci-roots") as always, and the new function
> extra_pci_roots_bus_get_fw_dev_path() formats the second one
> ("pxbhost@4").
>
> Here's the comparison ("diff -u -b -U28") between the "info qtree"
> outputs, before and after (the hpet device is the first common line):

BTW, did you try to boot from it :) ?

Laszlo Ersek June 10, 2015, 7:34 p.m. UTC | #3
On 06/10/15 21:26, Marcel Apfelbaum wrote:
> On 06/10/2015 08:07 PM, Laszlo Ersek wrote:
>> The PXB implementation doesn't allow firmware (SeaBIOS or OVMF) to boot
>> off devices behind the PXB. This happens because the
>> sysbus_get_fw_dev_path() function in "hw/core/sysbus.c" doesn't have
>> enough information to format a unique identifier for the PXB in question,
>> and consequently the OpenFirmware device path passed down to the guest
>> firmware in the "bootorder" fw_cfg file is unusable for identifying the
>> boot device.
>>
>> For example, the command line fragment
>>
>>    -device pxb,id=bridge1,bus_nr=4 \
>>    \
>>    -netdev user,id=netdev0 \
>>    -device e1000,netdev=netdev0,bus=bridge1,addr=2,bootindex=0
>>
>> results in the following "bootorder" entry:
>>
>>    /pci/pci-bridge@0/ethernet@2/ethernet-phy@0
>>
>> The initial "pci" node is formatted by sysbus_get_fw_dev_path(), and the
>> resultant OpenFirmware device path is independent of bus_nr=4 -- and
>> therefore it is useless for identifying the device.
>>
>> In this patch we insert a dummy bus between the main sysbus and the
>> TYPE_PXB_HOST device. Formatting child addresses is a bus class level
>> responsibility, which is exactly what we'll use here.
>>
>> After the patch, the same command line fragment results in the following
>> OpenFirmware device path in the "bootorder" fw_cfg file:
>>
>>    /extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2/ethernet-phy@0
>>
>> The original, initial "/pci" fragment has been replaced with
>> "/extra-pci-roots/pxbhost@4", which (a) looks better, (b) provides all
>> the
>> necessary information. sysbus_get_fw_dev_path() formats the first node
>> ("extra-pci-roots") as always, and the new function
>> extra_pci_roots_bus_get_fw_dev_path() formats the second one
>> ("pxbhost@4").
>>
>> Here's the comparison ("diff -u -b -U28") between the "info qtree"
>> outputs, before and after (the hpet device is the first common line):
> 
> BTW, did you try to boot from it :) ?

No, not yet; I'll have to write additional OVMF code for that; but the
syntax is OK, and the information is there, so I'm fairly sure I can
write that code. :)

The SeaBIOS side I'll probably leave to you... /me ducks :)

Thanks
Laszlo
Laszlo Ersek June 10, 2015, 7:44 p.m. UTC | #4
On 06/10/15 21:26, Marcel Apfelbaum wrote:
> On 06/10/2015 08:07 PM, Laszlo Ersek wrote:
>> The PXB implementation doesn't allow firmware (SeaBIOS or OVMF) to boot
>> off devices behind the PXB. This happens because the
>> sysbus_get_fw_dev_path() function in "hw/core/sysbus.c" doesn't have
>> enough information to format a unique identifier for the PXB in question,
>> and consequently the OpenFirmware device path passed down to the guest
>> firmware in the "bootorder" fw_cfg file is unusable for identifying the
>> boot device.
>>
>> For example, the command line fragment
>>
>>    -device pxb,id=bridge1,bus_nr=4 \
>>    \
>>    -netdev user,id=netdev0 \
>>    -device e1000,netdev=netdev0,bus=bridge1,addr=2,bootindex=0
>>
>> results in the following "bootorder" entry:
>>
>>    /pci/pci-bridge@0/ethernet@2/ethernet-phy@0
>>
>> The initial "pci" node is formatted by sysbus_get_fw_dev_path(), and the
>> resultant OpenFirmware device path is independent of bus_nr=4 -- and
>> therefore it is useless for identifying the device.
>>
>> In this patch we insert a dummy bus between the main sysbus and the
>> TYPE_PXB_HOST device. Formatting child addresses is a bus class level
>> responsibility, which is exactly what we'll use here.
>>
>> After the patch, the same command line fragment results in the following
>> OpenFirmware device path in the "bootorder" fw_cfg file:
>>
>>    /extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2/ethernet-phy@0
> Here I have a little comment:
> Thinking of it again, using pxbhost here is too specific. *Any kind* of
> PCI root
> bridge will benefit from this notation, not only PXB.
> I hope you are OK with it.
> /extra-pci-roots/pci-root@4/ maybe is more generic.

Sure, absolutely.

> 
>>
>> The original, initial "/pci" fragment has been replaced with
>> "/extra-pci-roots/pxbhost@4", which (a) looks better, (b) provides all
>> the
>> necessary information. sysbus_get_fw_dev_path() formats the first node
>> ("extra-pci-roots") as always, and the new function
>> extra_pci_roots_bus_get_fw_dev_path() formats the second one
>> ("pxbhost@4").
> The "extra-pci-roots" is a nice touch.
> 
>>
>> Here's the comparison ("diff -u -b -U28") between the "info qtree"
>> outputs, before and after (the hpet device is the first common line):
>>
>>> --- qtree.before        2015-06-10 18:16:44.903359633 +0200
>>> +++ qtree.after 2015-06-10 18:19:01.904139538 +0200
>>> @@ -1,30 +1,33 @@
>>>   bus: main-system-bus
>>>     type System
>>> +  dev: extra-pci-roots, id ""
>>> +    bus: extra-pci-roots-bus.0
>>> +      type extra-pci-roots-bus
>>>     dev: pxb-host, id ""
>>>       bus: pxb-internal
>>>         type pxb-bus
>>>         dev: pci-basic-bridge, id "bridge1"
>>>           chassis_nr = 4 (0x4)
>>>           addr = 00.0
>>>           romfile = ""
>>>           rombar = 1 (0x1)
>>>           multifunction = false
>>>           command_serr_enable = true
>>>           class PCI bridge, addr 04:00.0, pci id 1b36:000a (sub
>>> 0000:0000)
>>>           bus: bridge1
>>>             type PCI
>>>             dev: e1000, id ""
>>>               mac = "52:54:00:12:34:56"
>>>               vlan = <null>
>>>               netdev = "netdev0"
>>>               autonegotiation = true
>>>               mitigation = true
>>>               addr = 02.0
>>>               romfile = ""
>>>               rombar = 1 (0x1)
>>>               multifunction = false
>>>               command_serr_enable = true
>>>               class Ethernet controller, addr 05:02.0, pci id
>>> 8086:100e (sub 1af4:1100)
>>>               bar 0: mem at 0x88100000 [0x8811ffff]
>>>               bar 1: i/o at 0xd000 [0xd03f]
>>>     dev: hpet, id ""
>>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Marcel Apfelbaum <marcel@redhat.com>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>
>> Notes:
>>      v2:
>>      - new in v2, addressing the "bootorder" fw_cfg issue discussed
>> earlier
>>      - Cc'd Markus, because I probably butchered a handful of QOM
>> rules in
>>        this patch :)
>>
>>   hw/pci-bridge/pci_expander_bridge.c | 45
>> +++++++++++++++++++++++++++++++++++--
>>   1 file changed, 43 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/pci-bridge/pci_expander_bridge.c
>> b/hw/pci-bridge/pci_expander_bridge.c
>> index c7a085d..30e93d6 100644
>> --- a/hw/pci-bridge/pci_expander_bridge.c
>> +++ b/hw/pci-bridge/pci_expander_bridge.c
>> @@ -19,6 +19,15 @@
>>   #include "qemu/error-report.h"
>>   #include "sysemu/numa.h"
>>
>> +/*
>> + * We'll insert a dummy bus between the main sysbus and
>> TYPE_PXB_HOST, because
>> + * the main sysbus doesn't know how to format the bus number of the
>> extra root
>> + * bus in sysbus_get_fw_dev_path(). We'll use this dummy bus just for
>> + * formatting that.
>> + */
>> +#define TYPE_EXTRA_PCI_ROOTS_DEV "extra-pci-roots"
>> +#define TYPE_EXTRA_PCI_ROOTS_BUS "extra-pci-roots-bus"
>> +
>>   #define TYPE_PXB_BUS "pxb-bus"
>>   #define PXB_BUS(obj) OBJECT_CHECK(PXBBus, (obj), TYPE_PXB_BUS)
>>
>> @@ -93,7 +102,8 @@ static void pxb_host_class_init(ObjectClass *class,
>> void *data)
>>       DeviceClass *dc = DEVICE_CLASS(class);
>>       PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
>>
>> -    dc->fw_name = "pci";
>> +    dc->fw_name = "pxbhost";
>> +    dc->bus_type = TYPE_EXTRA_PCI_ROOTS_BUS;
>>       hc->root_bus_path = pxb_host_root_bus_path;
>>   }
>>
>> @@ -151,6 +161,8 @@ static int pxb_map_irq_fn(PCIDevice *pci_dev, int
>> pin)
>>   static int pxb_dev_initfn(PCIDevice *dev)
>>   {
>>       PXBDev *pxb = PXB_DEV(dev);
>> +    DeviceState *extra_pci_roots_dev;
>> +    BusState *extra_pci_roots_bus;
>>       DeviceState *ds, *bds;
>>       PCIBus *bus;
>>       const char *dev_name = NULL;
>> @@ -165,7 +177,10 @@ static int pxb_dev_initfn(PCIDevice *dev)
>>           dev_name = dev->qdev.id;
>>       }
>>
>> -    ds = qdev_create(NULL, TYPE_PXB_HOST);
>> +    extra_pci_roots_dev = qdev_create(NULL, TYPE_EXTRA_PCI_ROOTS_DEV);
>> +    extra_pci_roots_bus = qbus_create(TYPE_EXTRA_PCI_ROOTS_BUS,
>> +                                      extra_pci_roots_dev, NULL);
> Yep, this is the answer.
> 
>> +    ds = qdev_create(extra_pci_roots_bus, TYPE_PXB_HOST);
>>       bus = pci_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
>>
>>       bus->parent_dev = dev;
>> @@ -221,11 +236,37 @@ static const TypeInfo pxb_dev_info = {
>>       .class_init    = pxb_dev_class_init,
>>   };
>>
>> +static char *extra_pci_roots_bus_get_fw_dev_path(DeviceState *dev)
>> +{
>> +    return g_strdup_printf("%s@%x", qdev_fw_name(dev),
>> +                           pxb_bus_num(PCI_HOST_BRIDGE(dev)->bus));
>> +}
>> +
>> +static void extra_pci_roots_bus_class_init(ObjectClass *klass, void
>> *data)
>> +{
>> +    BusClass *k = BUS_CLASS(klass);
>> +
>> +    k->get_fw_dev_path = extra_pci_roots_bus_get_fw_dev_path;
> I tried to add this to the existing PXB Bus, but it didn't work of course.
> The PXB Bus is *behind* the PXB device.

Right.

The PXB bus class (TYPE_PXB_BUS) is derived from TYPE_PCI_BUS, which has
its own "child-formatter" callback: pcibus_get_fw_dev_path().

One can override that member in a derived class, but that's not useful,
for two reasons:
- it's already too far from the qtree root (i.e. past the node where the
  needed info is available), like you said; and
- pcibus_get_fw_dev_path() is *correct* in the way it formats its
  children's addresses.

Namely, pcibus_get_fw_dev_path() takes credit for formatting the

  pci-bridge@0

node, which is just right, because the bridge is in slot 0 of the parent
TYPE_PXB_HOST. We should not change that; the current method mirrors the
counterpart UEFI device paths precisely.
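To make the division of labor concrete, here is a toy model of the rule that each node is formatted by the bus the device sits on. This is only a sketch under simplifying assumptions: struct bus, struct dev, fmt_plain(), fmt_unit() and build_path() are made-up names, not the QOM types or QEMU functions, and the trailing "ethernet-phy@0" suffix is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct dev;

/* Per-bus callback that formats the node of a device attached to that bus. */
typedef int (*fmt_fn)(const struct dev *d, char *buf, size_t len);

struct bus {
    fmt_fn get_fw_dev_path;       /* "child formatter" of this bus */
    const struct dev *parent;     /* device owning the bus; NULL for the root bus */
};

struct dev {
    const char *fw_name;
    unsigned addr;                /* slot or bus number, whatever the bus formats */
    const struct bus *parent_bus; /* bus this device is attached to */
};

/* Root-bus style formatter in this toy model: name only, no unit address. */
static int fmt_plain(const struct dev *d, char *buf, size_t len)
{
    return snprintf(buf, len, "%s", d->fw_name);
}

/* Formatter that appends "@<addr in hex>" to the name. */
static int fmt_unit(const struct dev *d, char *buf, size_t len)
{
    return snprintf(buf, len, "%s@%x", d->fw_name, d->addr);
}

/* Build the path root-first; every node comes from the parent bus's callback. */
static int build_path(const struct dev *d, char *out, size_t len)
{
    char node[64];
    size_t used;
    int n;

    assert(d != NULL && d->parent_bus != NULL && len > 0);
    if (d->parent_bus->parent != NULL) {
        if (build_path(d->parent_bus->parent, out, len) < 0) {
            return -1;
        }
    } else {
        out[0] = '\0';
    }
    n = d->parent_bus->get_fw_dev_path(d, node, sizeof node);
    if (n < 0 || (size_t)n >= sizeof node) {
        return -1;
    }
    used = strlen(out);
    n = snprintf(out + used, len - used, "/%s", node);
    return (n < 0 || (size_t)n >= len - used) ? -1 : 0;
}
```

In this model, a host device named "pci" attached directly to the root bus yields "/pci/pci-bridge@0/ethernet@2" regardless of its bus number, while putting a host named "pxbhost" with addr 4 on an intermediate bus that uses fmt_unit() yields "/extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2". The formatter of the bus behind the host is the same in both cases, which is the point made above: it only ever sees the bridge, never the host.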

Thanks for the feedback; if Markus and Michael are otherwise okay with
this patch, I'll replace "pxbhost" with "pci-root".

Thanks!
Laszlo

> 
>> +}
>> +
>> +static const TypeInfo extra_pci_roots_bus_info = {
>> +    .name = TYPE_EXTRA_PCI_ROOTS_BUS,
>> +    .parent = TYPE_BUS,
>> +    .class_init = extra_pci_roots_bus_class_init,
>> +};
>> +
>> +static const TypeInfo extra_pci_roots_bus_dev_info = {
>> +    .name = TYPE_EXTRA_PCI_ROOTS_DEV,
>> +    .parent = TYPE_SYS_BUS_DEVICE,
>> +};
>> +
>>   static void pxb_register_types(void)
>>   {
>>       type_register_static(&pxb_bus_info);
>>       type_register_static(&pxb_host_info);
>>       type_register_static(&pxb_dev_info);
>> +    type_register_static(&extra_pci_roots_bus_info);
>> +    type_register_static(&extra_pci_roots_bus_dev_info);
>>   }
>>
>>   type_init(pxb_register_types)
>>
> 
> Thanks,
> Marcel
>
Laszlo Ersek June 10, 2015, 8:11 p.m. UTC | #5
On 06/10/15 21:34, Laszlo Ersek wrote:
> On 06/10/15 21:26, Marcel Apfelbaum wrote:
>> On 06/10/2015 08:07 PM, Laszlo Ersek wrote:
>>> The PXB implementation doesn't allow firmware (SeaBIOS or OVMF) to boot
>>> off devices behind the PXB. This happens because the
>>> sysbus_get_fw_dev_path() function in "hw/core/sysbus.c" doesn't have
>>> enough information to format a unique identifier for the PXB in question,
>>> and consequently the OpenFirmware device path passed down to the guest
>>> firmware in the "bootorder" fw_cfg file is unusable for identifying the
>>> boot device.
>>>
>>> For example, the command line fragment
>>>
>>>    -device pxb,id=bridge1,bus_nr=4 \
>>>    \
>>>    -netdev user,id=netdev0 \
>>>    -device e1000,netdev=netdev0,bus=bridge1,addr=2,bootindex=0
>>>
>>> results in the following "bootorder" entry:
>>>
>>>    /pci/pci-bridge@0/ethernet@2/ethernet-phy@0
>>>
>>> The initial "pci" node is formatted by sysbus_get_fw_dev_path(), and the
>>> resultant OpenFirmware device path is independent of bus_nr=4 -- and
>>> therefore it is useless for identifying the device.
>>>
>>> In this patch we insert a dummy bus between the main sysbus and the
>>> TYPE_PXB_HOST device. Formatting child addresses is a bus class level
>>> responsibility, which is exactly what we'll use here.
>>>
>>> After the patch, the same command line fragment results in the following
>>> OpenFirmware device path in the "bootorder" fw_cfg file:
>>>
>>>    /extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2/ethernet-phy@0
>>>
>>> The original, initial "/pci" fragment has been replaced with
>>> "/extra-pci-roots/pxbhost@4", which (a) looks better, (b) provides all
>>> the
>>> necessary information. sysbus_get_fw_dev_path() formats the first node
>>> ("extra-pci-roots") as always, and the new function
>>> extra_pci_roots_bus_get_fw_dev_path() formats the second one
>>> ("pxbhost@4").
>>>
>>> Here's the comparison ("diff -u -b -U28") between the "info qtree"
>>> outputs, before and after (the hpet device is the first common line):
>>
>> BTW, did you try to boot from it :) ?
> 
> No, not yet; I'll have to write additional OVMF code for that; but the
> syntax is OK, and the information is there, so I'm fairly sure I can
> write that code. :)

On second thought -- I *will* write the OVMF code for this now, or
tomorrow, because the syntax is *still* a little bit off. I think we'll
need:

  /extra-pci-roots@0/pci-root@4/pci-bridge@0/ethernet@2/ethernet-phy@0
                  ^^

The part I marked (i.e. the @ and the UnitAddress after it) is
mandatory. I'll just hardcode the "@0" string there in QEMU, as written
above.
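As a rough illustration of that requirement, the following sketch checks that every node in a path has the form name@unit. It is not OVMF's actual parser, and all_nodes_have_unit_address() is a hypothetical name; the only assumption carried over from the text is that a consumer rejects any node lacking the "@" and a unit address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical checker: returns 1 if path starts with '/' and every
 * '/'-separated node looks like "name@unit" with a non-empty name and a
 * non-empty unit address, 0 otherwise.
 */
static int all_nodes_have_unit_address(const char *path)
{
    assert(path != NULL);
    if (*path != '/') {
        return 0;
    }
    while (*path == '/') {
        const char *start = path + 1;
        const char *end = strchr(start, '/');
        size_t node_len = end ? (size_t)(end - start) : strlen(start);
        const char *at = memchr(start, '@', node_len);

        /* Reject a missing '@', an empty name, or an empty unit address. */
        if (at == NULL || at == start || at == start + node_len - 1) {
            return 0;
        }
        path = start + node_len;
    }
    return *path == '\0';
}
```

Under this check the earlier form, whose first node is a bare "extra-pci-roots", is rejected, while the form with "extra-pci-roots@0" passes.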

Thanks
Laszlo

> The SeaBIOS side I'll probably leave to you... /me ducks :)
> 
> Thanks
> Laszlo
>
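
The node syntax discussed above ("name@unit-address", with the address printed in hex) can be sketched in plain C. This is only an illustration under stated assumptions, not QEMU code: format_ofw_node() and format_pxb_root_prefix() are hypothetical helpers, and the "pci-root" node name and the hardcoded "@0" are taken from the proposal in this message rather than from the posted patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustration only: both helpers below are hypothetical and do not
 * exist in QEMU.
 */

/*
 * Format one OpenFirmware node as "name@unit-address", using the same
 * "%s@%x" format string as the patch. Returns the snprintf() result.
 */
static int format_ofw_node(char *buf, size_t len, const char *fw_name,
                           unsigned unit_address)
{
    assert(buf != NULL && fw_name != NULL);
    return snprintf(buf, len, "%s@%x", fw_name, unit_address);
}

/*
 * Build the leading part of the device path for an extra root bus:
 * "/extra-pci-roots@0/pci-root@<bus_nr>". The "@0" is hardcoded, as
 * proposed in the message above.
 */
static int format_pxb_root_prefix(char *buf, size_t len, unsigned bus_nr)
{
    char node[32];

    format_ofw_node(node, sizeof node, "pci-root", bus_nr);
    return snprintf(buf, len, "/extra-pci-roots@0/%s", node);
}
```

With bus_nr=4 from the example command line, the second helper yields the "/extra-pci-roots@0/pci-root@4" prefix shown above.
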
Laszlo Ersek June 10, 2015, 9:29 p.m. UTC | #6
On 06/10/15 22:11, Laszlo Ersek wrote:
> On 06/10/15 21:34, Laszlo Ersek wrote:
>> On 06/10/15 21:26, Marcel Apfelbaum wrote:

>>> BTW, did you try to boot from it :) ?
>>
>> No, not yet; I'll have to write additional OVMF code for that; but the
>> syntax is OK, and the information is there, so I'm fairly sure I can
>> write that code. :)
> 
> On second thought -- I *will* write the OVMF code for this now, or
> tomorrow, because the syntax is *still* a little bit off. I think we'll
> need:
> 
>   /extra-pci-roots@0/pci-root@4/pci-bridge@0/ethernet@2/ethernet-phy@0
>                   ^^
> 
> The part I marked (i.e. the @ and the UnitAddress after it) is
> mandatory. I'll just hardcode the "@0" string there in QEMU, as written
> above.

Yes, that works; I can PXE-boot with OVMF from a virtio-net-pci NIC that
is located at the above OFW devpath.

Thanks
Laszlo
Marcel Apfelbaum June 11, 2015, 4:43 a.m. UTC | #7
On 06/10/2015 10:34 PM, Laszlo Ersek wrote:
> On 06/10/15 21:26, Marcel Apfelbaum wrote:
>> On 06/10/2015 08:07 PM, Laszlo Ersek wrote:
>>> The PXB implementation doesn't allow firmware (SeaBIOS or OVMF) to boot
>>> off devices behind the PXB. This happens because the
>>> sysbus_get_fw_dev_path() function in "hw/core/sysbus.c" doesn't have
>>> enough information to format a unique identifier for the PXB in question,
>>> and consequently the OpenFirmware device path passed down to the guest
>>> firmware in the "bootorder" fw_cfg file is unusable for identifying the
>>> boot device.
>>>
>>> For example, the command line fragment
>>>
>>>     -device pxb,id=bridge1,bus_nr=4 \
>>>     \
>>>     -netdev user,id=netdev0 \
>>>     -device e1000,netdev=netdev0,bus=bridge1,addr=2,bootindex=0
>>>
>>> results in the following "bootorder" entry:
>>>
>>>     /pci/pci-bridge@0/ethernet@2/ethernet-phy@0
>>>
>>> The initial "pci" node is formatted by sysbus_get_fw_dev_path(), and the
>>> resultant OpenFirmware device path is independent of bus_nr=4 -- and
>>> therefore it is useless for identifying the device.
>>>
>>> In this patch we insert a dummy bus between the main sysbus and the
>>> TYPE_PXB_HOST device. Formatting child addresses is a bus class level
>>> responsibility, which is exactly what we'll use here.
>>>
>>> After the patch, the same command line fragment results in the following
>>> OpenFirmware device path in the "bootorder" fw_cfg file:
>>>
>>>     /extra-pci-roots/pxbhost@4/pci-bridge@0/ethernet@2/ethernet-phy@0
>>>
>>> The original, initial "/pci" fragment has been replaced with
>>> "/extra-pci-roots/pxbhost@4", which (a) looks better, (b) provides all the
>>> necessary information. sysbus_get_fw_dev_path() formats the first node
>>> ("extra-pci-roots") as always, and the new function
>>> extra_pci_roots_bus_get_fw_dev_path() formats the second one
>>> ("pxbhost@4").
>>>
>>> Here's the comparison ("diff -u -b -U28") between the "info qtree"
>>> outputs, before and after (the hpet device is the first common line):
>>
>> BTW, did you try to boot from it :) ?
>
> No, not yet; I'll have to write additional OVMF code for that; but the
> syntax is OK, and the information is there, so I'm fairly sure I can
> write that code. :)
>
> The SeaBIOS side I'll probably leave to you... /me ducks :)
I'll take it, sure, you are doing all the work anyway :)

Thanks,
Marcel
>
> Thanks
> Laszlo
>
Marcel Apfelbaum June 11, 2015, 4:44 a.m. UTC | #8
On 06/11/2015 12:29 AM, Laszlo Ersek wrote:
> On 06/10/15 22:11, Laszlo Ersek wrote:
>> On 06/10/15 21:34, Laszlo Ersek wrote:
>>> On 06/10/15 21:26, Marcel Apfelbaum wrote:
>
>>>> BTW, did you try to boot from it :) ?
>>>
>>> No, not yet; I'll have to write additional OVMF code for that; but the
>>> syntax is OK, and the information is there, so I'm fairly sure I can
>>> write that code. :)
>>
>> On second thought -- I *will* write the OVMF code for this now, or
>> tomorrow, because the syntax is *still* a little bit off. I think we'll
>> need:
>>
>>    /extra-pci-roots@0/pci-root@4/pci-bridge@0/ethernet@2/ethernet-phy@0
>>                    ^^
>>
>> The part I marked (i.e. the @ and the UnitAddress after it) is
>> mandatory. I'll just hardcode the "@0" string there in QEMU, as written
>> above.
>
> Yes, that works; I can PXE-boot with OVMF from a virtio-net-pci NIC that
> is located at the above OFW devpath.
Good!
I'll work on SeaBIOS now.

Thanks,
Marcel
>
> Thanks
> Laszlo
>
Patch

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index c7a085d..30e93d6 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -19,6 +19,15 @@ 
 #include "qemu/error-report.h"
 #include "sysemu/numa.h"
 
+/*
+ * We'll insert a dummy bus between the main sysbus and TYPE_PXB_HOST, because
+ * the main sysbus doesn't know how to format the bus number of the extra root
+ * bus in sysbus_get_fw_dev_path(). We'll use this dummy bus just for
+ * formatting that.
+ */
+#define TYPE_EXTRA_PCI_ROOTS_DEV "extra-pci-roots"
+#define TYPE_EXTRA_PCI_ROOTS_BUS "extra-pci-roots-bus"
+
 #define TYPE_PXB_BUS "pxb-bus"
 #define PXB_BUS(obj) OBJECT_CHECK(PXBBus, (obj), TYPE_PXB_BUS)
 
@@ -93,7 +102,8 @@  static void pxb_host_class_init(ObjectClass *class, void *data)
     DeviceClass *dc = DEVICE_CLASS(class);
     PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
 
-    dc->fw_name = "pci";
+    dc->fw_name = "pxbhost";
+    dc->bus_type = TYPE_EXTRA_PCI_ROOTS_BUS;
     hc->root_bus_path = pxb_host_root_bus_path;
 }
 
@@ -151,6 +161,8 @@  static int pxb_map_irq_fn(PCIDevice *pci_dev, int pin)
 static int pxb_dev_initfn(PCIDevice *dev)
 {
     PXBDev *pxb = PXB_DEV(dev);
+    DeviceState *extra_pci_roots_dev;
+    BusState *extra_pci_roots_bus;
     DeviceState *ds, *bds;
     PCIBus *bus;
     const char *dev_name = NULL;
@@ -165,7 +177,10 @@  static int pxb_dev_initfn(PCIDevice *dev)
         dev_name = dev->qdev.id;
     }
 
-    ds = qdev_create(NULL, TYPE_PXB_HOST);
+    extra_pci_roots_dev = qdev_create(NULL, TYPE_EXTRA_PCI_ROOTS_DEV);
+    extra_pci_roots_bus = qbus_create(TYPE_EXTRA_PCI_ROOTS_BUS,
+                                      extra_pci_roots_dev, NULL);
+    ds = qdev_create(extra_pci_roots_bus, TYPE_PXB_HOST);
     bus = pci_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
 
     bus->parent_dev = dev;
@@ -221,11 +236,37 @@  static const TypeInfo pxb_dev_info = {
     .class_init    = pxb_dev_class_init,
 };
 
+static char *extra_pci_roots_bus_get_fw_dev_path(DeviceState *dev)
+{
+    return g_strdup_printf("%s@%x", qdev_fw_name(dev),
+                           pxb_bus_num(PCI_HOST_BRIDGE(dev)->bus));
+}
+
+static void extra_pci_roots_bus_class_init(ObjectClass *klass, void *data)
+{
+    BusClass *k = BUS_CLASS(klass);
+
+    k->get_fw_dev_path = extra_pci_roots_bus_get_fw_dev_path;
+}
+
+static const TypeInfo extra_pci_roots_bus_info = {
+    .name = TYPE_EXTRA_PCI_ROOTS_BUS,
+    .parent = TYPE_BUS,
+    .class_init = extra_pci_roots_bus_class_init,
+};
+
+static const TypeInfo extra_pci_roots_bus_dev_info = {
+    .name = TYPE_EXTRA_PCI_ROOTS_DEV,
+    .parent = TYPE_SYS_BUS_DEVICE,
+};
+
 static void pxb_register_types(void)
 {
     type_register_static(&pxb_bus_info);
     type_register_static(&pxb_host_info);
     type_register_static(&pxb_dev_info);
+    type_register_static(&extra_pci_roots_bus_info);
+    type_register_static(&extra_pci_roots_bus_dev_info);
 }
 
 type_init(pxb_register_types)
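
The commit message relies on the fact that formatting a child's address is the responsibility of the bus the child sits on. The following is a minimal, self-contained sketch of that idea, assuming heavily simplified stand-ins for the qdev types: struct bus, struct dev, sysbus_fmt(), extra_roots_fmt() and build_fw_path() are all hypothetical names for illustration, not QEMU's actual API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for QEMU's bus and device objects. */
struct dev;

struct bus {
    /* Formats the path node of a child device that sits on this bus. */
    int (*format_child)(const struct dev *child, char *buf, size_t len);
    const struct dev *parent;   /* device owning this bus, NULL at the top */
};

struct dev {
    const char *fw_name;
    unsigned bus_nr;            /* only meaningful for the PXB host */
    const struct bus *parent_bus;
};

/* Top-level bus: it only knows the child's name, not a bus number. */
static int sysbus_fmt(const struct dev *child, char *buf, size_t len)
{
    return snprintf(buf, len, "%s", child->fw_name);
}

/* Dummy bus: its children are PXB hosts, so it can append the bus number. */
static int extra_roots_fmt(const struct dev *child, char *buf, size_t len)
{
    return snprintf(buf, len, "%s@%x", child->fw_name, child->bus_nr);
}

/* Emit the ancestors first, then "/" plus this device's own node. */
static size_t build_fw_path(const struct dev *d, char *buf, size_t len)
{
    size_t used = 0;
    char node[64];

    assert(d != NULL && d->parent_bus != NULL && buf != NULL);
    if (d->parent_bus->parent != NULL) {
        used = build_fw_path(d->parent_bus->parent, buf, len);
    }
    /* The parent bus, not the device itself, decides the node format. */
    d->parent_bus->format_child(d, node, sizeof node);
    if (used < len) {
        int n = snprintf(buf + used, len - used, "/%s", node);
        if (n > 0) {
            used += (size_t)n;
        }
    }
    return used;
}
```

Chaining a top-level bus, an "extra-pci-roots" device, the dummy bus and a "pxbhost" device with bus_nr 4 produces "/extra-pci-roots/pxbhost@4", matching the first two nodes of the path quoted in the commit message.
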