diff mbox

pci hotplug: rescan bridge after device hotplug

Message ID 20120523024313.GA2888@redhat.com
State Superseded
Headers show

Commit Message

Jason Baron May 23, 2012, 2:43 a.m. UTC
On Tue, May 22, 2012 at 02:21:13PM -0700, Yinghai Lu wrote:
> On Tue, May 22, 2012 at 1:11 PM, Jason Baron <jbaron@redhat.com> wrote:
> > I'm tyring to support bridge hotplug and devices below it in qemu via acpi
> > hotplug. Currently only 1 level or 32 slots are supported. By allowing for a
> > second level, we will be able to support 32^2 devices.
> >
> > If I first hotplug the bridge with no devices intially below it, the hotplug
> > code sets the bridge memory window to 0 and does not increase it when
> > subsequent devices are added below it.
> >
> > Fix this, by calling pci_rescan_bus_bridge_resize(), on the bridge directly
> > below the root to re-size all the birdge windows that may have changed.
> >
> > Signed-off-by: Jason Baron <jbaron@redhat.com>
> > ---
> >  drivers/pci/hotplug/acpiphp_glue.c |    8 ++++++++
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 806c44f..8960c1e 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -792,6 +792,7 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >  {
> >        struct pci_dev *dev;
> >        struct pci_bus *bus = slot->bridge->pci_bus;
> > +       struct pci_bus *rescan_bus;
> >        struct acpiphp_func *func;
> >        int retval = 0;
> >        int num, max, pass;
> > @@ -821,6 +822,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >                }
> >        }
> >
> > +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
> > +       rescan_bus = bus;
> > +       while (rescan_bus->parent && rescan_bus->parent->self)
> > +               rescan_bus = rescan_bus->parent;
> > +       if (rescan_bus->self)
> > +               pci_rescan_bus_bridge_resize(rescan_bus->self);
> > +
> 
> No, you can not do that.  some parents bus could have other devices
> and driver could be loaded for those devices.
> so can not release resources that are used by those devices to resize
> parent bridges.
> 

hmmm...this patch also does what I want:



There appears to be a precedant for something very similar in:
drivers/pci/hotplug/pciehp_pci.c:pciehp_configure_device(), where there
is a call to 'pci_assign_unassigned_bridge_resources(), when a new
device is added...

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Yinghai Lu May 23, 2012, 3:31 a.m. UTC | #1
On Tue, May 22, 2012 at 7:43 PM, Jason Baron <jbaron@redhat.com> wrote:
> On Tue, May 22, 2012 at 02:21:13PM -0700, Yinghai Lu wrote:
>> On Tue, May 22, 2012 at 1:11 PM, Jason Baron <jbaron@redhat.com> wrote:
>> > I'm tyring to support bridge hotplug and devices below it in qemu via acpi
>> > hotplug. Currently only 1 level or 32 slots are supported. By allowing for a
>> > second level, we will be able to support 32^2 devices.
>> >
>> > If I first hotplug the bridge with no devices intially below it, the hotplug
>> > code sets the bridge memory window to 0 and does not increase it when
>> > subsequent devices are added below it.
>> >
>> > Fix this, by calling pci_rescan_bus_bridge_resize(), on the bridge directly
>> > below the root to re-size all the birdge windows that may have changed.
>> >
>> > Signed-off-by: Jason Baron <jbaron@redhat.com>
>> > ---
>> >  drivers/pci/hotplug/acpiphp_glue.c |    8 ++++++++
>> >  1 files changed, 8 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>> > index 806c44f..8960c1e 100644
>> > --- a/drivers/pci/hotplug/acpiphp_glue.c
>> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
>> > @@ -792,6 +792,7 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>> >  {
>> >        struct pci_dev *dev;
>> >        struct pci_bus *bus = slot->bridge->pci_bus;
>> > +       struct pci_bus *rescan_bus;
>> >        struct acpiphp_func *func;
>> >        int retval = 0;
>> >        int num, max, pass;
>> > @@ -821,6 +822,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>> >                }
>> >        }
>> >
>> > +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
>> > +       rescan_bus = bus;
>> > +       while (rescan_bus->parent && rescan_bus->parent->self)
>> > +               rescan_bus = rescan_bus->parent;
>> > +       if (rescan_bus->self)
>> > +               pci_rescan_bus_bridge_resize(rescan_bus->self);
>> > +
>>
>> No, you can not do that.  some parents bus could have other devices
>> and driver could be loaded for those devices.
>> so can not release resources that are used by those devices to resize
>> parent bridges.
>>
>
> hmmm...this patch also does what I want:
>
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index 806c44f..be63c72 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -821,6 +821,10 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>                }
>        }
>
> +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
> +       if (bus->self)
> +               pci_assign_unassigned_bridge_resources(bus->self);
> +
>        list_for_each_entry(func, &slot->funcs, sibling)
>                acpiphp_bus_add(func);
>
>
> There appears to be a precedant for something very similar in:
> drivers/pci/hotplug/pciehp_pci.c:pciehp_configure_device(), where there
> is a call to 'pci_assign_unassigned_bridge_resources(), when a new
> device is added...

the bus could have other devices under that bridge already.

so pci_assign_unassigned_bridge_resources() will still try to release
the bridge resource and
those devices resources.

but pciehp we could do that, because it will only have one device
under that pcie root port or downstream port.

so you can make qemu to support pcie and pciehp.

for hot add one hotplug bridge support, we have
     bus->self->is_hotplug_bridge to set, so it will have allocate
minimum resource.

so you also could try to use quirk etc to get that bit set.
please check
drivers/pci/quirks.c::
/* Allow manual resource allocation for PCI hotplug bridges
 * via pci=hpmemsize=nnM and pci=hpiosize=nnM parameters. For
 * some PCI-PCI hotplug bridges, like PLX 6254 (former HINT HB6),
 * kernel fails to allocate resources when hotplug device is
 * inserted and PCI bus is rescanned.
 */
static void __devinit quirk_hotplug_bridge(struct pci_dev *dev)
{
        dev->is_hotplug_bridge = 1;
}

DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);

but still would prefer you to make qemu to support pciehp.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu May 23, 2012, 4:07 a.m. UTC | #2
On Tue, May 22, 2012 at 8:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Tue, May 22, 2012 at 7:43 PM, Jason Baron <jbaron@redhat.com> wrote:
>> On Tue, May 22, 2012 at 02:21:13PM -0700, Yinghai Lu wrote:
>>> On Tue, May 22, 2012 at 1:11 PM, Jason Baron <jbaron@redhat.com> wrote:
>>> > I'm tyring to support bridge hotplug and devices below it in qemu via acpi
>>> > hotplug. Currently only 1 level or 32 slots are supported. By allowing for a
>>> > second level, we will be able to support 32^2 devices.
>>> >
>>> > If I first hotplug the bridge with no devices intially below it, the hotplug
>>> > code sets the bridge memory window to 0 and does not increase it when
>>> > subsequent devices are added below it.
>>> >
>>> > Fix this, by calling pci_rescan_bus_bridge_resize(), on the bridge directly
>>> > below the root to re-size all the birdge windows that may have changed.
>>> >
>>> > Signed-off-by: Jason Baron <jbaron@redhat.com>
>>> > ---
>>> >  drivers/pci/hotplug/acpiphp_glue.c |    8 ++++++++
>>> >  1 files changed, 8 insertions(+), 0 deletions(-)
>>> >
>>> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>>> > index 806c44f..8960c1e 100644
>>> > --- a/drivers/pci/hotplug/acpiphp_glue.c
>>> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
>>> > @@ -792,6 +792,7 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>>> >  {
>>> >        struct pci_dev *dev;
>>> >        struct pci_bus *bus = slot->bridge->pci_bus;
>>> > +       struct pci_bus *rescan_bus;
>>> >        struct acpiphp_func *func;
>>> >        int retval = 0;
>>> >        int num, max, pass;
>>> > @@ -821,6 +822,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>>> >                }
>>> >        }
>>> >
>>> > +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
>>> > +       rescan_bus = bus;
>>> > +       while (rescan_bus->parent && rescan_bus->parent->self)
>>> > +               rescan_bus = rescan_bus->parent;
>>> > +       if (rescan_bus->self)
>>> > +               pci_rescan_bus_bridge_resize(rescan_bus->self);
>>> > +
>>>
>>> No, you can not do that.  some parents bus could have other devices
>>> and driver could be loaded for those devices.
>>> so can not release resources that are used by those devices to resize
>>> parent bridges.
>>>
>>
>> hmmm...this patch also does what I want:
>>
>>
>> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>> index 806c44f..be63c72 100644
>> --- a/drivers/pci/hotplug/acpiphp_glue.c
>> +++ b/drivers/pci/hotplug/acpiphp_glue.c
>> @@ -821,6 +821,10 @@ static int __ref enable_device(struct acpiphp_slot *slot)
>>                }
>>        }
>>
>> +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
>> +       if (bus->self)
>> +               pci_assign_unassigned_bridge_resources(bus->self);
>> +
>>        list_for_each_entry(func, &slot->funcs, sibling)
>>                acpiphp_bus_add(func);
>>
>>
>> There appears to be a precedant for something very similar in:
>> drivers/pci/hotplug/pciehp_pci.c:pciehp_configure_device(), where there
>> is a call to 'pci_assign_unassigned_bridge_resources(), when a new
>> device is added...
>
> the bus could have other devices under that bridge already.
>
> so pci_assign_unassigned_bridge_resources() will still try to release
> the bridge resource and
> those devices resources.
>
> but pciehp we could do that, because it will only have one device
> under that pcie root port or downstream port.
>
> so you can make qemu to support pcie and pciehp.
>
> for hot add one hotplug bridge support, we have
>     bus->self->is_hotplug_bridge to set, so it will have allocate
> minimum resource.
>
> so you also could try to use quirk etc to get that bit set.
> please check
> drivers/pci/quirks.c::
> /* Allow manual resource allocation for PCI hotplug bridges
>  * via pci=hpmemsize=nnM and pci=hpiosize=nnM parameters. For
>  * some PCI-PCI hotplug bridges, like PLX 6254 (former HINT HB6),
>  * kernel fails to allocate resources when hotplug device is
>  * inserted and PCI bus is rescanned.
>  */
> static void __devinit quirk_hotplug_bridge(struct pci_dev *dev)
> {
>        dev->is_hotplug_bridge = 1;
> }
>
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
>
> but still would prefer you to make qemu to support pciehp.

another solution could be:

in qemu acpi dsdt, you could set bridge size for new added bridge.

current pbus_size_mem() will not shrink the old bridge resource size.

Thanks

Yinghai Lu
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Baron May 23, 2012, 3:53 p.m. UTC | #3
On Tue, May 22, 2012 at 09:07:54PM -0700, Yinghai Lu wrote:
> On Tue, May 22, 2012 at 8:31 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> > On Tue, May 22, 2012 at 7:43 PM, Jason Baron <jbaron@redhat.com> wrote:
> >> On Tue, May 22, 2012 at 02:21:13PM -0700, Yinghai Lu wrote:
> >>> On Tue, May 22, 2012 at 1:11 PM, Jason Baron <jbaron@redhat.com> wrote:
> >>> > I'm tyring to support bridge hotplug and devices below it in qemu via acpi
> >>> > hotplug. Currently only 1 level or 32 slots are supported. By allowing for a
> >>> > second level, we will be able to support 32^2 devices.
> >>> >
> >>> > If I first hotplug the bridge with no devices intially below it, the hotplug
> >>> > code sets the bridge memory window to 0 and does not increase it when
> >>> > subsequent devices are added below it.
> >>> >
> >>> > Fix this, by calling pci_rescan_bus_bridge_resize(), on the bridge directly
> >>> > below the root to re-size all the birdge windows that may have changed.
> >>> >
> >>> > Signed-off-by: Jason Baron <jbaron@redhat.com>
> >>> > ---
> >>> >  drivers/pci/hotplug/acpiphp_glue.c |    8 ++++++++
> >>> >  1 files changed, 8 insertions(+), 0 deletions(-)
> >>> >
> >>> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> >>> > index 806c44f..8960c1e 100644
> >>> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> >>> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> >>> > @@ -792,6 +792,7 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >>> >  {
> >>> >        struct pci_dev *dev;
> >>> >        struct pci_bus *bus = slot->bridge->pci_bus;
> >>> > +       struct pci_bus *rescan_bus;
> >>> >        struct acpiphp_func *func;
> >>> >        int retval = 0;
> >>> >        int num, max, pass;
> >>> > @@ -821,6 +822,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >>> >                }
> >>> >        }
> >>> >
> >>> > +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
> >>> > +       rescan_bus = bus;
> >>> > +       while (rescan_bus->parent && rescan_bus->parent->self)
> >>> > +               rescan_bus = rescan_bus->parent;
> >>> > +       if (rescan_bus->self)
> >>> > +               pci_rescan_bus_bridge_resize(rescan_bus->self);
> >>> > +
> >>>
> >>> No, you can not do that.  some parents bus could have other devices
> >>> and driver could be loaded for those devices.
> >>> so can not release resources that are used by those devices to resize
> >>> parent bridges.
> >>>
> >>
> >> hmmm...this patch also does what I want:
> >>
> >>
> >> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> >> index 806c44f..be63c72 100644
> >> --- a/drivers/pci/hotplug/acpiphp_glue.c
> >> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> >> @@ -821,6 +821,10 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >>                }
> >>        }
> >>
> >> +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
> >> +       if (bus->self)
> >> +               pci_assign_unassigned_bridge_resources(bus->self);
> >> +
> >>        list_for_each_entry(func, &slot->funcs, sibling)
> >>                acpiphp_bus_add(func);
> >>
> >>
> >> There appears to be a precedant for something very similar in:
> >> drivers/pci/hotplug/pciehp_pci.c:pciehp_configure_device(), where there
> >> is a call to 'pci_assign_unassigned_bridge_resources(), when a new
> >> device is added...
> >
> > the bus could have other devices under that bridge already.
> >
> > so pci_assign_unassigned_bridge_resources() will still try to release
> > the bridge resource and
> > those devices resources.
> >
> > but pciehp we could do that, because it will only have one device
> > under that pcie root port or downstream port.
> >
> > so you can make qemu to support pcie and pciehp.
> >
> > for hot add one hotplug bridge support, we have
> >     bus->self->is_hotplug_bridge to set, so it will have allocate
> > minimum resource.
> >
> > so you also could try to use quirk etc to get that bit set.
> > please check
> > drivers/pci/quirks.c::
> > /* Allow manual resource allocation for PCI hotplug bridges
> >  * via pci=hpmemsize=nnM and pci=hpiosize=nnM parameters. For
> >  * some PCI-PCI hotplug bridges, like PLX 6254 (former HINT HB6),
> >  * kernel fails to allocate resources when hotplug device is
> >  * inserted and PCI bus is rescanned.
> >  */
> > static void __devinit quirk_hotplug_bridge(struct pci_dev *dev)
> > {
> >        dev->is_hotplug_bridge = 1;
> > }
> >
> > DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
> >
> > but still would prefer you to make qemu to support pciehp.
> 
> another solution could be:
> 
> in qemu acpi dsdt, you could set bridge size for new added bridge.
> 
> current pbus_size_mem() will not shrink the old bridge resource size.
> 
> Thanks
> 
> Yinghai Lu

ok, I also tried hard-wiring the bridge io/mem base and limit registers on the
qemu side. That seems to work without any guest-side hotplug code changes. And
would seem to be more flexible than putting the limits in acpi.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu May 23, 2012, 5:16 p.m. UTC | #4
On Wed, May 23, 2012 at 8:53 AM, Jason Baron <jbaron@redhat.com> wrote:
>> > but still would prefer you to make qemu to support pciehp.
>>
>> another solution could be:
>>
>> in qemu acpi dsdt, you could set bridge size for new added bridge.
>>
>> current pbus_size_mem() will not shrink the old bridge resource size.
>
> ok, I also tried hard-wiring the bridge io/mem base and limit registers on the
> qemu side. That seems to work without any guest-side hotplug code changes. And
> would seem to be more flexible than putting the limits in acpi.

that should be acpi asl code or SMI work. and should make sure that
range is not overlapped with resources that are used by other bridges
and pci devices.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Baron May 23, 2012, 6:44 p.m. UTC | #5
On Wed, May 23, 2012 at 10:16:06AM -0700, Yinghai Lu wrote:
> On Wed, May 23, 2012 at 8:53 AM, Jason Baron <jbaron@redhat.com> wrote:
> >> > but still would prefer you to make qemu to support pciehp.
> >>
> >> another solution could be:
> >>
> >> in qemu acpi dsdt, you could set bridge size for new added bridge.
> >>
> >> current pbus_size_mem() will not shrink the old bridge resource size.
> >
> > ok, I also tried hard-wiring the bridge io/mem base and limit registers on the
> > qemu side. That seems to work without any guest-side hotplug code changes. And
> > would seem to be more flexible than putting the limits in acpi.
> 
> that should be acpi asl code or SMI work. and should make sure that
> range is not overlapped with resources that are used by other bridges
> and pci devices.
> 

Ok. So you are saying to define a 'Name (_CRS, ResourceTemplate() {
Memory() IO() })' block for each bridge? The acpi code I currently have,
has one 'Device()' definition for each top-level hotplug slot. Would it
be ok to have the '_CRS' apply to either one?

Also, I'm wondering where in the acpiphp code it picks up the memory/io
ranges to configure them as bridge ranges?

I also see in ACPIspec40a.pdf, section "9.11 Module Device":

"
If no _CRS object is present, OSPM will assume that the module device is
a simple container object that does not produce the resources consumed by its
child devices. In this case, OSPM will assign resources to the child devices as
if they were direct children of the module device's parent object.
"

So not sure if that applies to hotplug, but that is not what acpiphp is
doing atm.

Finally, I don't see why putting the bridge window range logic into qemu is a
bad solution. Qemu knows memory ranges consumed by various resoures,
looking at 'info mtree', so why can't qemu hand out the bridge ranges
dynamically? In that way, it can re-size the windows more optimally than
something hard-coded into acpi. So to me, it seems like a better
approach.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin May 23, 2012, 7:08 p.m. UTC | #6
On Wed, May 23, 2012 at 02:44:49PM -0400, Jason Baron wrote:
> On Wed, May 23, 2012 at 10:16:06AM -0700, Yinghai Lu wrote:
> > On Wed, May 23, 2012 at 8:53 AM, Jason Baron <jbaron@redhat.com> wrote:
> > >> > but still would prefer you to make qemu to support pciehp.
> > >>
> > >> another solution could be:
> > >>
> > >> in qemu acpi dsdt, you could set bridge size for new added bridge.
> > >>
> > >> current pbus_size_mem() will not shrink the old bridge resource size.
> > >
> > > ok, I also tried hard-wiring the bridge io/mem base and limit registers on the
> > > qemu side. That seems to work without any guest-side hotplug code changes. And
> > > would seem to be more flexible than putting the limits in acpi.
> > 
> > that should be acpi asl code or SMI work. and should make sure that
> > range is not overlapped with resources that are used by other bridges
> > and pci devices.
> > 
> 
> Ok. So you are saying to define a 'Name (_CRS, ResourceTemplate() {
> Memory() IO() })' block for each bridge?

We are talking about hotplugging bridges.
This sounds strange.

> The acpi code I currently have,
> has one 'Device()' definition for each top-level hotplug slot. Would it
> be ok to have the '_CRS' apply to either one?
> 
> Also, I'm wondering where in the acpiphp code it picks up the memory/io
> ranges to configure them as bridge ranges?
> 
> I also see in ACPIspec40a.pdf, section "9.11 Module Device":
> 
> "
> If no _CRS object is present, OSPM will assume that the module device is
> a simple container object that does not produce the resources consumed by its
> child devices. In this case, OSPM will assign resources to the child devices as
> if they were direct children of the module device's parent object.
> "
> 
> So not sure if that applies to hotplug, but that is not what acpiphp is
> doing atm.
> 
> Finally, I don't see why putting the bridge window range logic into qemu is a
> bad solution. Qemu knows memory ranges consumed by various resoures,
> looking at 'info mtree', so why can't qemu hand out the bridge ranges
> dynamically? In that way, it can re-size the windows more optimally than
> something hard-coded into acpi. So to me, it seems like a better
> approach.
> 
> Thanks,
> 
> -Jason

Because this is not how real hardware behaves.

Fundamentally BARs are under guest control.
So while you are handing out memory for the bridges you would have to
tiptoe around guest assigned BARs.

Also what happens after reboot? bios currently
assigns memory to bridges that are present on boot.
So this makes it even messier as who assigns memory
depends on whether device is hotplugged.

Cleaner to have guest do it.

I think Windows even can even rebalance BARs
as needed. So this is just a linux bug that needs
to be fixed.
Michael S. Tsirkin May 23, 2012, 7:13 p.m. UTC | #7
On Tue, May 22, 2012 at 08:31:30PM -0700, Yinghai Lu wrote:
> but pciehp we could do that, because it will only have one device
> under that pcie root port or downstream port.
> 
> so you can make qemu to support pcie and pciehp.

This does not solve the problem I think because
there are still multiple devices behind an upstream port.

Also, swiching to pcie creates a host of other complexities
so we want to support pci bridges too.
Michael S. Tsirkin May 23, 2012, 7:20 p.m. UTC | #8
On Wed, May 23, 2012 at 10:08:41PM +0300, Michael S. Tsirkin wrote:
> On Wed, May 23, 2012 at 02:44:49PM -0400, Jason Baron wrote:
> > On Wed, May 23, 2012 at 10:16:06AM -0700, Yinghai Lu wrote:
> > > On Wed, May 23, 2012 at 8:53 AM, Jason Baron <jbaron@redhat.com> wrote:
> > > >> > but still would prefer you to make qemu to support pciehp.
> > > >>
> > > >> another solution could be:
> > > >>
> > > >> in qemu acpi dsdt, you could set bridge size for new added bridge.
> > > >>
> > > >> current pbus_size_mem() will not shrink the old bridge resource size.
> > > >
> > > > ok, I also tried hard-wiring the bridge io/mem base and limit registers on the
> > > > qemu side. That seems to work without any guest-side hotplug code changes. And
> > > > would seem to be more flexible than putting the limits in acpi.
> > > 
> > > that should be acpi asl code or SMI work. and should make sure that
> > > range is not overlapped with resources that are used by other bridges
> > > and pci devices.
> > > 
> > 
> > Ok. So you are saying to define a 'Name (_CRS, ResourceTemplate() {
> > Memory() IO() })' block for each bridge?
> 
> We are talking about hotplugging bridges.
> This sounds strange.

However if it works it is certainly simple.

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu May 23, 2012, 8:31 p.m. UTC | #9
On Wed, May 23, 2012 at 11:44 AM, Jason Baron <jbaron@redhat.com> wrote:
> On Wed, May 23, 2012 at 10:16:06AM -0700, Yinghai Lu wrote:
>> On Wed, May 23, 2012 at 8:53 AM, Jason Baron <jbaron@redhat.com> wrote:
>> >> > but still would prefer you to make qemu to support pciehp.
>> >>
>> >> another solution could be:
>> >>
>> >> in qemu acpi dsdt, you could set bridge size for new added bridge.
>> >>
>> >> current pbus_size_mem() will not shrink the old bridge resource size.
>> >
>> > ok, I also tried hard-wiring the bridge io/mem base and limit registers on the
>> > qemu side. That seems to work without any guest-side hotplug code changes. And
>> > would seem to be more flexible than putting the limits in acpi.
>>
>> that should be acpi asl code or SMI work. and should make sure that
>> range is not overlapped with resources that are used by other bridges
>> and pci devices.
>>
>
> Ok. So you are saying to define a 'Name (_CRS, ResourceTemplate() {
> Memory() IO() })' block for each bridge? The acpi code I currently have,
> has one 'Device()' definition for each top-level hotplug slot. Would it
> be ok to have the '_CRS' apply to either one?
>
> Also, I'm wondering where in the acpiphp code it picks up the memory/io
> ranges to configure them as bridge ranges?
>
> I also see in ACPIspec40a.pdf, section "9.11 Module Device":
>
> "
> If no _CRS object is present, OSPM will assume that the module device is
> a simple container object that does not produce the resources consumed by its
> child devices. In this case, OSPM will assign resources to the child devices as
> if they were direct children of the module device's parent object.
> "
>
> So not sure if that applies to hotplug, but that is not what acpiphp is
> doing atm.

_CRS is only used for pci root bus.

I mean in you _PS0 in your socket.

>
> Finally, I don't see why putting the bridge window range logic into qemu is a
> bad solution. Qemu knows memory ranges consumed by various resoures,
> looking at 'info mtree', so why can't qemu hand out the bridge ranges
> dynamically? In that way, it can re-size the windows more optimally than
> something hard-coded into acpi. So to me, it seems like a better
> approach.

acpi could be dynamically too.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin May 23, 2012, 8:49 p.m. UTC | #10
On Wed, May 23, 2012 at 01:31:23PM -0700, Yinghai Lu wrote:
> On Wed, May 23, 2012 at 11:44 AM, Jason Baron <jbaron@redhat.com> wrote:
> > On Wed, May 23, 2012 at 10:16:06AM -0700, Yinghai Lu wrote:
> >> On Wed, May 23, 2012 at 8:53 AM, Jason Baron <jbaron@redhat.com> wrote:
> >> >> > but still would prefer you to make qemu to support pciehp.
> >> >>
> >> >> another solution could be:
> >> >>
> >> >> in qemu acpi dsdt, you could set bridge size for new added bridge.
> >> >>
> >> >> current pbus_size_mem() will not shrink the old bridge resource size.
> >> >
> >> > ok, I also tried hard-wiring the bridge io/mem base and limit registers on the
> >> > qemu side. That seems to work without any guest-side hotplug code changes. And
> >> > would seem to be more flexible than putting the limits in acpi.
> >>
> >> that should be acpi asl code or SMI work. and should make sure that
> >> range is not overlapped with resources that are used by other bridges
> >> and pci devices.
> >>
> >
> > Ok. So you are saying to define a 'Name (_CRS, ResourceTemplate() {
> > Memory() IO() })' block for each bridge? The acpi code I currently have,
> > has one 'Device()' definition for each top-level hotplug slot. Would it
> > be ok to have the '_CRS' apply to either one?
> >
> > Also, I'm wondering where in the acpiphp code it picks up the memory/io
> > ranges to configure them as bridge ranges?
> >
> > I also see in ACPIspec40a.pdf, section "9.11 Module Device":
> >
> > "
> > If no _CRS object is present, OSPM will assume that the module device is
> > a simple container object that does not produce the resources consumed by its
> > child devices. In this case, OSPM will assign resources to the child devices as
> > if they were direct children of the module device's parent object.
> > "
> >
> > So not sure if that applies to hotplug, but that is not what acpiphp is
> > doing atm.
> 
> _CRS is only used for pci root bus.
> 
> I mean in you _PS0 in your socket.
> 
> >
> > Finally, I don't see why putting the bridge window range logic into qemu is a
> > bad solution. Qemu knows memory ranges consumed by various resoures,
> > looking at 'info mtree', so why can't qemu hand out the bridge ranges
> > dynamically? In that way, it can re-size the windows more optimally than
> > something hard-coded into acpi. So to me, it seems like a better
> > approach.
> 
> acpi could be dynamically too.
> 
> Yinghai

There isn't much you can do besides loading a new table, is there?
Jason Baron May 23, 2012, 8:52 p.m. UTC | #11
On Tue, May 22, 2012 at 08:31:30PM -0700, Yinghai Lu wrote:
> On Tue, May 22, 2012 at 7:43 PM, Jason Baron <jbaron@redhat.com> wrote:
> > On Tue, May 22, 2012 at 02:21:13PM -0700, Yinghai Lu wrote:
> >> On Tue, May 22, 2012 at 1:11 PM, Jason Baron <jbaron@redhat.com> wrote:
> >> > I'm tyring to support bridge hotplug and devices below it in qemu via acpi
> >> > hotplug. Currently only 1 level or 32 slots are supported. By allowing for a
> >> > second level, we will be able to support 32^2 devices.
> >> >
> >> > If I first hotplug the bridge with no devices intially below it, the hotplug
> >> > code sets the bridge memory window to 0 and does not increase it when
> >> > subsequent devices are added below it.
> >> >
> >> > Fix this, by calling pci_rescan_bus_bridge_resize(), on the bridge directly
> >> > below the root to re-size all the birdge windows that may have changed.
> >> >
> >> > Signed-off-by: Jason Baron <jbaron@redhat.com>
> >> > ---
> >> >  drivers/pci/hotplug/acpiphp_glue.c |    8 ++++++++
> >> >  1 files changed, 8 insertions(+), 0 deletions(-)
> >> >
> >> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> >> > index 806c44f..8960c1e 100644
> >> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> >> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> >> > @@ -792,6 +792,7 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >> >  {
> >> >        struct pci_dev *dev;
> >> >        struct pci_bus *bus = slot->bridge->pci_bus;
> >> > +       struct pci_bus *rescan_bus;
> >> >        struct acpiphp_func *func;
> >> >        int retval = 0;
> >> >        int num, max, pass;
> >> > @@ -821,6 +822,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >> >                }
> >> >        }
> >> >
> >> > +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
> >> > +       rescan_bus = bus;
> >> > +       while (rescan_bus->parent && rescan_bus->parent->self)
> >> > +               rescan_bus = rescan_bus->parent;
> >> > +       if (rescan_bus->self)
> >> > +               pci_rescan_bus_bridge_resize(rescan_bus->self);
> >> > +
> >>
> >> No, you can not do that.  some parents bus could have other devices
> >> and driver could be loaded for those devices.
> >> so can not release resources that are used by those devices to resize
> >> parent bridges.
> >>
> >
> > hmmm...this patch also does what I want:
> >
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 806c44f..be63c72 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -821,6 +821,10 @@ static int __ref enable_device(struct acpiphp_slot *slot)
> >                }
> >        }
> >
> > +       /* Ensure we rescan/setup a bridge for new devs hanging off of it */
> > +       if (bus->self)
> > +               pci_assign_unassigned_bridge_resources(bus->self);
> > +
> >        list_for_each_entry(func, &slot->funcs, sibling)
> >                acpiphp_bus_add(func);
> >
> >
> > There appears to be a precedant for something very similar in:
> > drivers/pci/hotplug/pciehp_pci.c:pciehp_configure_device(), where there
> > is a call to 'pci_assign_unassigned_bridge_resources(), when a new
> > device is added...
> 
> the bus could have other devices under that bridge already.
> 
> so pci_assign_unassigned_bridge_resources() will still try to release
> the bridge resource and
> those devices resources.
> 
> but pciehp we could do that, because it will only have one device
> under that pcie root port or downstream port.
> 
> so you can make qemu to support pcie and pciehp.
> 
> for hot add one hotplug bridge support, we have
>      bus->self->is_hotplug_bridge to set, so it will have allocate
> minimum resource.
> 
> so you also could try to use quirk etc to get that bit set.
> please check
> drivers/pci/quirks.c::
> /* Allow manual resource allocation for PCI hotplug bridges
>  * via pci=hpmemsize=nnM and pci=hpiosize=nnM parameters. For
>  * some PCI-PCI hotplug bridges, like PLX 6254 (former HINT HB6),
>  * kernel fails to allocate resources when hotplug device is
>  * inserted and PCI bus is rescanned.
>  */
> static void __devinit quirk_hotplug_bridge(struct pci_dev *dev)
> {
>         dev->is_hotplug_bridge = 1;
> }
> 
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
> 
> but still would prefer you to make qemu to support pciehp.
> 

Hi,

Ok, I've also verified that quirk approach resolves the issue for me as
well. It might not be a bad solution, especially if we anticipate moving
to pciehp. We can also always change bridge vendor id to not hit the
quirk, if we choose to implement more complex guest, or acpi fixes. The
defaults for i/o and mem seem pretty sane too:


#define DEFAULT_HOTPLUG_IO_SIZE         (256)
#define DEFAULT_HOTPLUG_MEM_SIZE        (2*1024*1024)

And further can be set on the guest command line via:

/* pci=hpmemsize=nnM,hpiosize=nn can override this */

So I don't see any down-side really to the quirk approach (which is a
quite simple 1-line patch to the guest), even if it is not a long-term
solution.

Thanks,

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu May 23, 2012, 8:53 p.m. UTC | #12
On Wed, May 23, 2012 at 12:08 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> Fundamentally BARs are under guest control.
> So while you are handing out memory for the bridges you would have to
> tiptoe around guest assigned BARs.
>
> Also what happens after reboot? bios currently
> assigns memory to bridges that are present on boot.
> So this makes it even messier as who assigns memory
> depends on whether device is hotplugged.
>
> Cleaner to have guest do it.

assume here  "guest' mean: BIOS + kernel + (acpi dsl code from BIOS).

>
> I think Windows even can even rebalance BARs
> as needed. So this is just a linux bug that needs
> to be fixed.

Now pciehp in Linux will resize the bridge.
but not for acpiphp. reason: acpiphp could handle more complicated case.
does not mean new add card have to been under one bridge existing.

slot just on root bus, you could just add device or bridge in it.
and the bridge could have other hotlpug slots.

in that case, if one of children slot already have card
installed/driver loaded, later
after another card is put in another child slot, we can not simple
resize the bridge in parent slot.
because other child devices is operating...

solution will be
1. add quirks to set is_hotplug_bridge for apciphp....
2. or probe_resource in busn_alloc patchset, that maybe still need to
extend to support alignment.

BTW, I was thinking Linux is the ONLY one that will scratch bridge BAR
if that is big enough for children devices.
Do you have evidence that other os does that?

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin May 23, 2012, 9:18 p.m. UTC | #13
On Wed, May 23, 2012 at 01:53:10PM -0700, Yinghai Lu wrote:
> On Wed, May 23, 2012 at 12:08 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > Fundamentally BARs are under guest control.
> > So while you are handing out memory for the bridges you would have to
> > tiptoe around guest assigned BARs.
> >
> > Also what happens after reboot? bios currently
> > assigns memory to bridges that are present on boot.
> > So this makes it even messier as who assigns memory
> > depends on whether device is hotplugged.
> >
> > Cleaner to have guest do it.
> 
> assume here  "guest' mean: BIOS + kernel + (acpi dsl code from BIOS).
> 
> >
> > I think Windows even can even rebalance BARs
> > as needed. So this is just a linux bug that needs
> > to be fixed.
> 
> Now pciehp in Linux will resize the bridge.
> but not for acpiphp. reason: acpiphp could handle more complicated case.
> does not mean new add card have to been under one bridge existing.

expess has this too. It just needs a bigger hierarchy:
upsteam ports can have many downstream ports behind them.


> slot just on root bus, you could just add device or bridge in it.
> and the bridge could have other hotlpug slots.
> 
> in that case, if one of children slot already have card
> installed/driver loaded, later
> after another card is put in another child slot, we can not simple
> resize the bridge in parent slot.
> because other child devices is operating...
> 
> solution will be
> 1. add quirks to set is_hotplug_bridge for apciphp....
> 2. or probe_resource in busn_alloc patchset, that maybe still need to
> extend to support alignment.
> 
> BTW, I was thinking Linux is the ONLY one that will scratch bridge BAR
> if that is big enough for children devices.
> Do you have evidence that other os does that?
> 
> Yinghai

I saw some docs on the net but do not have them handy. Google it.

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu May 24, 2012, midnight UTC | #14
On Wed, May 23, 2012 at 1:52 PM, Jason Baron <jbaron@redhat.com> wrote:
> Ok, I've also verified that quirk approach resolves the issue for me as
> well. It might not be a bad solution, especially if we anticipate moving
> to pciehp. We can also always change bridge vendor id to not hit the
> quirk, if we choose to implement more complex guest, or acpi fixes. The
> defaults for i/o and mem seem pretty sane too:
>
>
> #define DEFAULT_HOTPLUG_IO_SIZE         (256)
> #define DEFAULT_HOTPLUG_MEM_SIZE        (2*1024*1024)
>
> And further can be set on the guest command line via:
>
> /* pci=hpmemsize=nnM,hpiosize=nn can override this */
>
> So I don't see any down-side really to the quirk approach (which is a
> quite simple 1-line patch to the guest), even if it is not a long-term
> solution.

Good. please check attached patch,
it should auto set that bit for apciphp when complex slot are
supported by platform.

next, we still need to extend probe_resource() to support
mmio/pref_mmio extension.
that is for when you have a lot slots on hotplug bridge.

Thanks

Yinghai
Jason Baron May 24, 2012, 1:43 p.m. UTC | #15
On Wed, May 23, 2012 at 05:00:05PM -0700, Yinghai Lu wrote:
> On Wed, May 23, 2012 at 1:52 PM, Jason Baron <jbaron@redhat.com> wrote:
> > Ok, I've also verified that quirk approach resolves the issue for me as
> > well. It might not be a bad solution, especially if we anticipate moving
> > to pciehp. We can also always change bridge vendor id to not hit the
> > quirk, if we choose to implement more complex guest, or acpi fixes. The
> > defaults for i/o and mem seem pretty sane too:
> >
> >
> > #define DEFAULT_HOTPLUG_IO_SIZE         (256)
> > #define DEFAULT_HOTPLUG_MEM_SIZE        (2*1024*1024)
> >
> > And further can be set on the guest command line via:
> >
> > /* pci=hpmemsize=nnM,hpiosize=nn can override this */
> >
> > So I don't see any down-side really to the quirk approach (which is a
> > quite simple 1-line patch to the guest), even if it is not a long-term
> > solution.
> 
> Good. please check attached patch,
> it should auto set that bit for apciphp when complex slot are
> supported by platform.
> 
> next, we still need to extend probe_resource() to support
> mmio/pref_mmio extension.
> that is for when you have a lot slots on hotplug bridge.
> 
> Thanks
> 
> Yinghai

Yes, I can comfirm that the patch works with my qemu hotplug bridge
code. Thanks!

Acked-by: Jason Baron <jbaron@redhat.com>

-Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 806c44f..be63c72 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -821,6 +821,10 @@  static int __ref enable_device(struct acpiphp_slot *slot)
 		}
 	}
 
+	/* Ensure we rescan/setup a bridge for new devs hanging off of it */
+	if (bus->self)
+		pci_assign_unassigned_bridge_resources(bus->self);
+
 	list_for_each_entry(func, &slot->funcs, sibling)
 		acpiphp_bus_add(func);