diff mbox

[12/12] pseries: Generate unique LIOBNs for PCI host bridges

Message ID 1352774820-22804-13-git-send-email-david@gibson.dropbear.id.au
State New
Headers show

Commit Message

David Gibson Nov. 13, 2012, 2:47 a.m. UTC
From: Alexey Kardashevskiy <aik@ozlabs.ru>

In future (with VFIO) we will have multiple PCI host bridges on
pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
derive these from the pci domain number, but the whole notion of
domain numbers on the qemu side is bogus and in any case they're not
actually uniquely allocated at this point.

This patch, therefore uses a simple sequence counter to generate
unique LIOBNs for PCI host bridges.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/spapr_pci.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Alexander Graf Nov. 19, 2012, 4:34 p.m. UTC | #1
On 13.11.2012, at 03:47, David Gibson wrote:

> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> In future (with VFIO) we will have multiple PCI host bridges on
> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> derive these from the pci domain number, but the whole notion of
> domain numbers on the qemu side is bogus and in any case they're not
> actually uniquely allocated at this point.
> 
> This patch, therefore uses a simple sequence counter to generate
> unique LIOBNs for PCI host bridges.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

I don't really like the idea of having a global variable just because our domain ID generation seems to not work as expected. Michael, any comments here?


Alex

> ---
> hw/spapr_pci.c |    3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/spapr_pci.c b/hw/spapr_pci.c
> index 3c5b855..f6544d7 100644
> --- a/hw/spapr_pci.c
> +++ b/hw/spapr_pci.c
> @@ -521,6 +521,7 @@ static int spapr_phb_init(SysBusDevice *s)
>     char *namebuf;
>     int i;
>     PCIBus *bus;
> +    static int phbnum;
> 
>     sphb->dtbusname = g_strdup_printf("pci@%" PRIx64, sphb->buid);
>     namebuf = alloca(strlen(sphb->dtbusname) + 32);
> @@ -572,7 +573,7 @@ static int spapr_phb_init(SysBusDevice *s)
>                            PCI_DEVFN(0, 0), PCI_NUM_PINS);
>     phb->bus = bus;
> 
> -    sphb->dma_liobn = SPAPR_PCI_BASE_LIOBN | (pci_find_domain(bus) << 16);
> +    sphb->dma_liobn = SPAPR_PCI_BASE_LIOBN | (++phbnum << 16);
>     sphb->dma_window_start = 0;
>     sphb->dma_window_size = 0x40000000;
>     sphb->dma = spapr_tce_new_dma_context(sphb->dma_liobn, sphb->dma_window_size);
> -- 
> 1.7.10.4
>
David Gibson Nov. 19, 2012, 10:51 p.m. UTC | #2
On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> 
> On 13.11.2012, at 03:47, David Gibson wrote:
> 
> > From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > 
> > In future (with VFIO) we will have multiple PCI host bridges on
> > pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > derive these from the pci domain number, but the whole notion of
> > domain numbers on the qemu side is bogus and in any case they're not
> > actually uniquely allocated at this point.
> > 
> > This patch, therefore uses a simple sequence counter to generate
> > unique LIOBNs for PCI host bridges.
> > 
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> I don't really like the idea of having a global variable just
> because our domain ID generation seems to not work as
> expected. Michael, any comments here?

Well, the patch I sent which changed domain id generation was
ignored.  In any case, as I said, the whole concept of domain numbers
makes no sense on the qemu side, so I don't think increasing reliance
on them by using them here is a good idea.

It would be conceptually nicer to derive the liobn from the buid, but
that would rely on the buid's being unique in the low 32-bits, which
is true in practice, but seems risky to rely on.
Alexander Graf Nov. 20, 2012, 9:27 a.m. UTC | #3
On 19.11.2012, at 23:51, David Gibson wrote:

> On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
>> 
>> On 13.11.2012, at 03:47, David Gibson wrote:
>> 
>>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> 
>>> In future (with VFIO) we will have multiple PCI host bridges on
>>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
>>> derive these from the pci domain number, but the whole notion of
>>> domain numbers on the qemu side is bogus and in any case they're not
>>> actually uniquely allocated at this point.
>>> 
>>> This patch, therefore uses a simple sequence counter to generate
>>> unique LIOBNs for PCI host bridges.
>>> 
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>> 
>> I don't really like the idea of having a global variable just
>> because our domain ID generation seems to not work as
>> expected. Michael, any comments here?
> 
> Well, the patch I sent which changed domain id generation was
> ignored.  In any case, as I said, the whole concept of domain numbers

Michael?

> makes no sense on the qemu side, so I don't think increasing reliance
> on them by using them here is a good idea.
> 
> It would be conceptually nicer to derive the liobn from the buid, but
> that would rely on the buid's being unique in the low 32-bits, which
> is true in practice, but seems risky to rely on.

Well, there has to be some uniqueness from the guest's POV already, no?


Alex
Michael S. Tsirkin Nov. 20, 2012, 12:26 p.m. UTC | #4
On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> 
> On 19.11.2012, at 23:51, David Gibson wrote:
> 
> > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> >> 
> >> On 13.11.2012, at 03:47, David Gibson wrote:
> >> 
> >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>> 
> >>> In future (with VFIO) we will have multiple PCI host bridges on
> >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> >>> derive these from the pci domain number, but the whole notion of
> >>> domain numbers on the qemu side is bogus and in any case they're not
> >>> actually uniquely allocated at this point.
> >>> 
> >>> This patch, therefore uses a simple sequence counter to generate
> >>> unique LIOBNs for PCI host bridges.
> >>> 
> >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >> 
> >> I don't really like the idea of having a global variable just
> >> because our domain ID generation seems to not work as
> >> expected. Michael, any comments here?
> > 
> > Well, the patch I sent which changed domain id generation was
> > ignored.  In any case, as I said, the whole concept of domain numbers
> 
> Michael?

This is user visible, right?
So IMHO we should have the user specify LIOBN through a property,
rather than assign what's essentially a random value.

For ACPI, domain number can go into _SEG method -
this is what linux seems to use to assign domain numbers
so if we do this things match.


> > makes no sense on the qemu side, so I don't think increasing reliance
> > on them by using them here is a good idea.
> > 
> > It would be conceptually nicer to derive the liobn from the buid, but
> > that would rely on the buid's being unique in the low 32-bits, which
> > is true in practice, but seems risky to rely on.
> 
> Well, there has to be some uniqueness from the guest's POV already, no?
> 
> 
> Alex
David Gibson Nov. 21, 2012, 12:57 a.m. UTC | #5
On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > 
> > On 19.11.2012, at 23:51, David Gibson wrote:
> > 
> > > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > >> 
> > >> On 13.11.2012, at 03:47, David Gibson wrote:
> > >> 
> > >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > >>> 
> > >>> In future (with VFIO) we will have multiple PCI host bridges on
> > >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > >>> derive these from the pci domain number, but the whole notion of
> > >>> domain numbers on the qemu side is bogus and in any case they're not
> > >>> actually uniquely allocated at this point.
> > >>> 
> > >>> This patch, therefore uses a simple sequence counter to generate
> > >>> unique LIOBNs for PCI host bridges.
> > >>> 
> > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > >> 
> > >> I don't really like the idea of having a global variable just
> > >> because our domain ID generation seems to not work as
> > >> expected. Michael, any comments here?
> > > 
> > > Well, the patch I sent which changed domain id generation was
> > > ignored.  In any case, as I said, the whole concept of domain numbers
> > 
> > Michael?
> 
> This is user visible, right?
> So IMHO we should have the user specify LIOBN through a property,
> rather than assign what's essentially a random value.

Well, I can implement an override through a property, which could be
useful in some circumstances.  But we still need to have qemu generate
unique defaults, rather than forcing it to be specified in every case.

> For ACPI, domain number can go into _SEG method -
> this is what linux seems to use to assign domain numbers
> so if we do this things match.

I don't follow.
David Gibson Nov. 21, 2012, 5 a.m. UTC | #6
On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> 
> On 19.11.2012, at 23:51, David Gibson wrote:
> 
> > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> >> 
> >> On 13.11.2012, at 03:47, David Gibson wrote:
> >> 
> >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>> 
> >>> In future (with VFIO) we will have multiple PCI host bridges on
> >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> >>> derive these from the pci domain number, but the whole notion of
> >>> domain numbers on the qemu side is bogus and in any case they're not
> >>> actually uniquely allocated at this point.
> >>> 
> >>> This patch, therefore uses a simple sequence counter to generate
> >>> unique LIOBNs for PCI host bridges.
> >>> 
> >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >> 
> >> I don't really like the idea of having a global variable just
> >> because our domain ID generation seems to not work as
> >> expected. Michael, any comments here?
> > 
> > Well, the patch I sent which changed domain id generation was
> > ignored.  In any case, as I said, the whole concept of domain numbers
> 
> Michael?
> 
> > makes no sense on the qemu side, so I don't think increasing reliance
> > on them by using them here is a good idea.
> > 
> > It would be conceptually nicer to derive the liobn from the buid, but
> > that would rely on the buid's being unique in the low 32-bits, which
> > is true in practice, but seems risky to rely on.
> 
> Well, there has to be some uniqueness from the guest's POV already,
> no?

Yes, the BUIDs are unique, but they are 64-bit, whereas the LIOBN is
only 32-bit.
Alexander Graf Nov. 21, 2012, 10:09 a.m. UTC | #7
On 21.11.2012, at 06:00, David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
>> 
>> On 19.11.2012, at 23:51, David Gibson wrote:
>> 
>>> On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
>>>> 
>>>> On 13.11.2012, at 03:47, David Gibson wrote:
>>>> 
>>>>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>> 
>>>>> In future (with VFIO) we will have multiple PCI host bridges on
>>>>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
>>>>> derive these from the pci domain number, but the whole notion of
>>>>> domain numbers on the qemu side is bogus and in any case they're not
>>>>> actually uniquely allocated at this point.
>>>>> 
>>>>> This patch, therefore uses a simple sequence counter to generate
>>>>> unique LIOBNs for PCI host bridges.
>>>>> 
>>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>>> 
>>>> I don't really like the idea of having a global variable just
>>>> because our domain ID generation seems to not work as
>>>> expected. Michael, any comments here?
>>> 
>>> Well, the patch I sent which changed domain id generation was
>>> ignored.  In any case, as I said, the whole concept of domain numbers
>> 
>> Michael?
>> 
>>> makes no sense on the qemu side, so I don't think increasing reliance
>>> on them by using them here is a good idea.
>>> 
>>> It would be conceptually nicer to derive the liobn from the buid, but
>>> that would rely on the buid's being unique in the low 32-bits, which
>>> is true in practice, but seems risky to rely on.
>> 
>> Well, there has to be some uniqueness from the guest's POV already,
>> no?
> 
> Yes, the BUIDs are unique, but they are 64-bit, whereas the LIOBN is
> only 32-bit.

Tricky. Michael, any ideas?

Alex

> 
> -- 
> David Gibson            | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au    | minimalist, thank you.  NOT _the_ _other_
>                | _way_ _around_!
> http://www.ozlabs.org/~dgibson
Michael S. Tsirkin Nov. 21, 2012, 11:34 a.m. UTC | #8
On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > 
> > > On 19.11.2012, at 23:51, David Gibson wrote:
> > > 
> > > > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > >> 
> > > >> On 13.11.2012, at 03:47, David Gibson wrote:
> > > >> 
> > > >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > >>> 
> > > >>> In future (with VFIO) we will have multiple PCI host bridges on
> > > >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > >>> derive these from the pci domain number, but the whole notion of
> > > >>> domain numbers on the qemu side is bogus and in any case they're not
> > > >>> actually uniquely allocated at this point.
> > > >>> 
> > > >>> This patch, therefore uses a simple sequence counter to generate
> > > >>> unique LIOBNs for PCI host bridges.
> > > >>> 
> > > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > >> 
> > > >> I don't really like the idea of having a global variable just
> > > >> because our domain ID generation seems to not work as
> > > >> expected. Michael, any comments here?
> > > > 
> > > > Well, the patch I sent which changed domain id generation was
> > > > ignored.  In any case, as I said, the whole concept of domain numbers
> > > 
> > > Michael?
> > 
> > This is user visible, right?
> > So IMHO we should have the user specify LIOBN through a property,
> > rather than assign what's essentially a random value.
> 
> Well, I can implement an override through a property, which could be
> useful in some circumstances.  But we still need to have qemu generate
> unique defaults, rather than forcing it to be specified in every case.

I don't see why.
And if you want automatic defaults then they need to be generated in a
way that does not depend on implementation detail such as order of
device initialization.

> > For ACPI, domain number can go into _SEG method -
> > this is what linux seems to use to assign domain numbers
> > so if we do this things match.
> 
> I don't follow.

I'm saying this is not a PPC specific issue.

> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
Michael S. Tsirkin Nov. 21, 2012, 11:40 a.m. UTC | #9
On Wed, Nov 21, 2012 at 11:09:50AM +0100, Alexander Graf wrote:
> 
> 
> On 21.11.2012, at 06:00, David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> >> 
> >> On 19.11.2012, at 23:51, David Gibson wrote:
> >> 
> >>> On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> >>>> 
> >>>> On 13.11.2012, at 03:47, David Gibson wrote:
> >>>> 
> >>>>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>>>> 
> >>>>> In future (with VFIO) we will have multiple PCI host bridges on
> >>>>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> >>>>> derive these from the pci domain number, but the whole notion of
> >>>>> domain numbers on the qemu side is bogus and in any case they're not
> >>>>> actually uniquely allocated at this point.
> >>>>> 
> >>>>> This patch, therefore uses a simple sequence counter to generate
> >>>>> unique LIOBNs for PCI host bridges.
> >>>>> 
> >>>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>>> 
> >>>> I don't really like the idea of having a global variable just
> >>>> because our domain ID generation seems to not work as
> >>>> expected. Michael, any comments here?
> >>> 
> >>> Well, the patch I sent which changed domain id generation was
> >>> ignored.  In any case, as I said, the whole concept of domain numbers
> >> 
> >> Michael?
> >> 
> >>> makes no sense on the qemu side, so I don't think increasing reliance
> >>> on them by using them here is a good idea.
> >>> 
> >>> It would be conceptually nicer to derive the liobn from the buid, but
> >>> that would rely on the buid's being unique in the low 32-bits, which
> >>> is true in practice, but seems risky to rely on.
> >> 
> >> Well, there has to be some uniqueness from the guest's POV already,
> >> no?
> > 
> > Yes, the BUIDs are unique, but they are 64-bit, whereas the LIOBN is
> > only 32-bit.
> 
> Tricky. Michael, any ideas?
> 
> Alex

Whatever scheme we come up with, becomes part of guest ABI
that we have to maintain.
So rather than maintain it, I think it's easiest to require full
specification from users, and verify uniqueness.

> > 
> > -- 
> > David Gibson            | I'll have my music baroque, and my code
> > david AT gibson.dropbear.id.au    | minimalist, thank you.  NOT _the_ _other_
> >                | _way_ _around_!
> > http://www.ozlabs.org/~dgibson
David Gibson Nov. 21, 2012, 12:36 p.m. UTC | #10
On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > > 
> > > > On 19.11.2012, at 23:51, David Gibson wrote:
> > > > 
> > > > > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > > >> 
> > > > >> On 13.11.2012, at 03:47, David Gibson wrote:
> > > > >> 
> > > > >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > >>> 
> > > > >>> In future (with VFIO) we will have multiple PCI host bridges on
> > > > >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > > >>> derive these from the pci domain number, but the whole notion of
> > > > >>> domain numbers on the qemu side is bogus and in any case they're not
> > > > >>> actually uniquely allocated at this point.
> > > > >>> 
> > > > >>> This patch, therefore uses a simple sequence counter to generate
> > > > >>> unique LIOBNs for PCI host bridges.
> > > > >>> 
> > > > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > >> 
> > > > >> I don't really like the idea of having a global variable just
> > > > >> because our domain ID generation seems to not work as
> > > > >> expected. Michael, any comments here?
> > > > > 
> > > > > Well, the patch I sent which changed domain id generation was
> > > > > ignored.  In any case, as I said, the whole concept of domain numbers
> > > > 
> > > > Michael?
> > > 
> > > This is user visible, right?
> > > So IMHO we should have the user specify LIOBN through a property,
> > > rather than assign what's essentially a random value.
> > 
> > Well, I can implement an override through a property, which could be
> > useful in some circumstances.  But we still need to have qemu generate
> > unique defaults, rather than forcing it to be specified in every case.
> 
> I don't see why.
> And if you want automatic defaults then they need to be generated in a
> way that does not depend on implementation detail such as order of
> device initialization.

Because requiring explicit unique liobns to be supplied whenever there
is more than one PHB is horrible for usability.
Michael S. Tsirkin Nov. 21, 2012, 1:13 p.m. UTC | #11
On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > > On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > > On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > > > 
> > > > > On 19.11.2012, at 23:51, David Gibson wrote:
> > > > > 
> > > > > > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > > > >> 
> > > > > >> On 13.11.2012, at 03:47, David Gibson wrote:
> > > > > >> 
> > > > > >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > > >>> 
> > > > > >>> In future (with VFIO) we will have multiple PCI host bridges on
> > > > > >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > > > >>> derive these from the pci domain number, but the whole notion of
> > > > > >>> domain numbers on the qemu side is bogus and in any case they're not
> > > > > >>> actually uniquely allocated at this point.
> > > > > >>> 
> > > > > >>> This patch, therefore uses a simple sequence counter to generate
> > > > > >>> unique LIOBNs for PCI host bridges.
> > > > > >>> 
> > > > > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > >> 
> > > > > >> I don't really like the idea of having a global variable just
> > > > > >> because our domain ID generation seems to not work as
> > > > > >> expected. Michael, any comments here?
> > > > > > 
> > > > > > Well, the patch I sent which changed domain id generation was
> > > > > > ignored.  In any case, as I said, the whole concept of domain numbers
> > > > > 
> > > > > Michael?
> > > > 
> > > > This is user visible, right?
> > > > So IMHO we should have the user specify LIOBN through a property,
> > > > rather than assign what's essentially a random value.
> > > 
> > > Well, I can implement an override through a property, which could be
> > > useful in some circumstances.  But we still need to have qemu generate
> > > unique defaults, rather than forcing it to be specified in every case.
> > 
> > I don't see why.
> > And if you want automatic defaults then they need to be generated in a
> > way that does not depend on implementation detail such as order of
> > device initialization.
> 
> Because requiring explicit unique liobns to be supplied whenever there
> is more than one PHB is horrible for usability.

We should make simple things simple and complex things possible.
More than one PHB seems like an advanced feature
so I don't see why it needs to be made very easy to use.
With an appropriate error message it should not be
too hard for users to figure it out.

> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
David Gibson Nov. 21, 2012, 1:21 p.m. UTC | #12
On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> > On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > > > On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > > > On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > > > > 
> > > > > > On 19.11.2012, at 23:51, David Gibson wrote:
> > > > > > 
> > > > > > > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > > > > >> 
> > > > > > >> On 13.11.2012, at 03:47, David Gibson wrote:
> > > > > > >> 
> > > > > > >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > > > >>> 
> > > > > > >>> In future (with VFIO) we will have multiple PCI host bridges on
> > > > > > >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > > > > >>> derive these from the pci domain number, but the whole notion of
> > > > > > >>> domain numbers on the qemu side is bogus and in any case they're not
> > > > > > >>> actually uniquely allocated at this point.
> > > > > > >>> 
> > > > > > >>> This patch, therefore uses a simple sequence counter to generate
> > > > > > >>> unique LIOBNs for PCI host bridges.
> > > > > > >>> 
> > > > > > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > > > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > >> 
> > > > > > >> I don't really like the idea of having a global variable just
> > > > > > >> because our domain ID generation seems to not work as
> > > > > > >> expected. Michael, any comments here?
> > > > > > > 
> > > > > > > Well, the patch I sent which changed domain id generation was
> > > > > > > ignored.  In any case, as I said, the whole concept of domain numbers
> > > > > > 
> > > > > > Michael?
> > > > > 
> > > > > This is user visible, right?
> > > > > So IMHO we should have the user specify LIOBN through a property,
> > > > > rather than assign what's essentially a random value.
> > > > 
> > > > Well, I can implement an override through a property, which could be
> > > > useful in some circumstances.  But we still need to have qemu generate
> > > > unique defaults, rather than forcing it to be specified in every case.
> > > 
> > > I don't see why.
> > > And if you want automatic defaults then they need to be generated in a
> > > way that does not depend on implementation detail such as order of
> > > device initialization.
> > 
> > Because requiring explicit unique liobns to be supplied whenever there
> > is more than one PHB is horrible for usability.
> 
> We should make simple things simple and complex things possible.
> More than one PHB seems like an advanced feature

Not for pseries.  On real hardware of this type, dozens of PHBs is
routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
for emulated devices and one for passthrough devices.

> so I don't see why it needs to be made very easy to use.
> With an appropriate error message it should not be
> too hard for users to figure it out.
Alexander Graf Nov. 21, 2012, 1:27 p.m. UTC | #13
On 11/21/2012 02:21 PM, David Gibson wrote:
> On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
>> On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
>>> On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
>>>> On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
>>>>> On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
>>>>>> On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
>>>>>>> On 19.11.2012, at 23:51, David Gibson wrote:
>>>>>>>
>>>>>>>> On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
>>>>>>>>> On 13.11.2012, at 03:47, David Gibson wrote:
>>>>>>>>>
>>>>>>>>>> From: Alexey Kardashevskiy<aik@ozlabs.ru>
>>>>>>>>>>
>>>>>>>>>> In future (with VFIO) we will have multiple PCI host bridges on
>>>>>>>>>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
>>>>>>>>>> derive these from the pci domain number, but the whole notion of
>>>>>>>>>> domain numbers on the qemu side is bogus and in any case they're not
>>>>>>>>>> actually uniquely allocated at this point.
>>>>>>>>>>
>>>>>>>>>> This patch, therefore uses a simple sequence counter to generate
>>>>>>>>>> unique LIOBNs for PCI host bridges.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
>>>>>>>>>> Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
>>>>>>>>> I don't really like the idea of having a global variable just
>>>>>>>>> because our domain ID generation seems to not work as
>>>>>>>>> expected. Michael, any comments here?
>>>>>>>> Well, the patch I sent which changed domain id generation was
>>>>>>>> ignored.  In any case, as I said, the whole concept of domain numbers
>>>>>>> Michael?
>>>>>> This is user visible, right?
>>>>>> So IMHO we should have the user specify LIOBN through a property,
>>>>>> rather than assign what's essentially a random value.
>>>>> Well, I can implement an override through a property, which could be
>>>>> useful in some circumstances.  But we still need to have qemu generate
>>>>> unique defaults, rather than forcing it to be specified in every case.
>>>> I don't see why.
>>>> And if you want automatic defaults then they need to be generated in a
>>>> way that does not depend on implementation detail such as order of
>>>> device initialization.
>>> Because requiring explicit unique liobns to be supplied whenever there
>>> is more than one PHB is horrible for usability.
>> We should make simple things simple and complex things possible.
>> More than one PHB seems like an advanced feature
> Not for pseries.  On real hardware of this type, dozens of PHBs is
> routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> for emulated devices and one for passthrough devices.

Yeah, I second Davids opinion here. We need to make this easy for users.

How do we assign PCI slot IDs today when all you do is a -device? This 
should probably follow the same scheme.


Alex
Michael S. Tsirkin Nov. 21, 2012, 3:20 p.m. UTC | #14
On Thu, Nov 22, 2012 at 12:21:57AM +1100, David Gibson wrote:
> On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> > > On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > > > > On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > > > > On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > > > > > 
> > > > > > > On 19.11.2012, at 23:51, David Gibson wrote:
> > > > > > > 
> > > > > > > > On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > > > > > >> 
> > > > > > > >> On 13.11.2012, at 03:47, David Gibson wrote:
> > > > > > > >> 
> > > > > > > >>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > > > > >>> 
> > > > > > > >>> In future (with VFIO) we will have multiple PCI host bridges on
> > > > > > > >>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > > > > > >>> derive these from the pci domain number, but the whole notion of
> > > > > > > >>> domain numbers on the qemu side is bogus and in any case they're not
> > > > > > > >>> actually uniquely allocated at this point.
> > > > > > > >>> 
> > > > > > > >>> This patch, therefore uses a simple sequence counter to generate
> > > > > > > >>> unique LIOBNs for PCI host bridges.
> > > > > > > >>> 
> > > > > > > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > > > > > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > > >> 
> > > > > > > >> I don't really like the idea of having a global variable just
> > > > > > > >> because our domain ID generation seems to not work as
> > > > > > > >> expected. Michael, any comments here?
> > > > > > > > 
> > > > > > > > Well, the patch I sent which changed domain id generation was
> > > > > > > > ignored.  In any case, as I said, the whole concept of domain numbers
> > > > > > > 
> > > > > > > Michael?
> > > > > > 
> > > > > > This is user visible, right?
> > > > > > So IMHO we should have the user specify LIOBN through a property,
> > > > > > rather than assign what's essentially a random value.
> > > > > 
> > > > > Well, I can implement an override through a property, which could be
> > > > > useful in some circumstances.  But we still need to have qemu generate
> > > > > unique defaults, rather than forcing it to be specified in every case.
> > > > 
> > > > I don't see why.
> > > > And if you want automatic defaults then they need to be generated in a
> > > > way that does not depend on implementation detail such as order of
> > > > device initialization.
> > > 
> > > Because requiring explicit unique liobns to be supplied whenever there
> > > is more than one PHB is horrible for usability.
> > 
> > We should make simple things simple and complex things possible.
> > More than one PHB seems like an advanced feature
> 
> Not for pseries.  On real hardware of this type, dozens of PHBs is
> routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> for emulated devices and one for passthrough devices.

Then by default you probably want to both create the PHB and assign
the device to it automatically. Such an automatically created PHB
can have address assigned by default by setting a property.

> > so I don't see why it needs to be made very easy to use.
> > With an appropriate error message it should not be
> > too hard for users to figure it out.
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
Michael S. Tsirkin Nov. 21, 2012, 3:27 p.m. UTC | #15
On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
> On 11/21/2012 02:21 PM, David Gibson wrote:
> >On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> >>On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> >>>On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> >>>>On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> >>>>>On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> >>>>>>On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> >>>>>>>On 19.11.2012, at 23:51, David Gibson wrote:
> >>>>>>>
> >>>>>>>>On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> >>>>>>>>>On 13.11.2012, at 03:47, David Gibson wrote:
> >>>>>>>>>
> >>>>>>>>>>From: Alexey Kardashevskiy<aik@ozlabs.ru>
> >>>>>>>>>>
> >>>>>>>>>>In future (with VFIO) we will have multiple PCI host bridges on
> >>>>>>>>>>pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> >>>>>>>>>>derive these from the pci domain number, but the whole notion of
> >>>>>>>>>>domain numbers on the qemu side is bogus and in any case they're not
> >>>>>>>>>>actually uniquely allocated at this point.
> >>>>>>>>>>
> >>>>>>>>>>This patch, therefore uses a simple sequence counter to generate
> >>>>>>>>>>unique LIOBNs for PCI host bridges.
> >>>>>>>>>>
> >>>>>>>>>>Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
> >>>>>>>>>>Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> >>>>>>>>>I don't really like the idea of having a global variable just
> >>>>>>>>>because our domain ID generation seems to not work as
> >>>>>>>>>expected. Michael, any comments here?
> >>>>>>>>Well, the patch I sent which changed domain id generation was
> >>>>>>>>ignored.  In any case, as I said, the whole concept of domain numbers
> >>>>>>>Michael?
> >>>>>>This is user visible, right?
> >>>>>>So IMHO we should have the user specify LIOBN through a property,
> >>>>>>rather than assign what's essentially a random value.
> >>>>>Well, I can implement an override through a property, which could be
> >>>>>useful in some circumstances.  But we still need to have qemu generate
> >>>>>unique defaults, rather than forcing it to be specified in every case.
> >>>>I don't see why.
> >>>>And if you want automatic defaults then they need to be generated in a
> >>>>way that does not depend on implementation detail such as order of
> >>>>device initialization.
> >>>Because requiring explicit unique liobns to be supplied whenever there
> >>>is more than one PHB is horrible for usability.
> >>We should make simple things simple and complex things possible.
> >>More than one PHB seems like an advanced feature
> >Not for pseries.  On real hardware of this type, dozens of PHBs is
> >routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> >for emulated devices and one for passthrough devices.
> 
> Yeah, I second Davids opinion here. We need to make this easy for users.

I think users don't normally create PHBs. They request a disk, a network
device, a pass-through device ... In this case if this requires
more PHBs create them internally but I imagine this doesn't
require any allocation scheme - simply set some address
for virtual devices, some other one for assigned devices ... no?

If user wants to play with low level detail such as PHBs I don't see why
it's not reasonable to require full specification.

> How do we assign PCI slot IDs today when all you do is a -device?
> This should probably follow the same scheme.
> 
> 
> Alex

What we do is find a first free slow. But it's exactly why I worry:
changing pci addresses between qemu releases has been a source of pain
and compatibility hassles in the past.
The problem would be more manageable if you simply reserve some fixed
addresses for internal use, like we reserve slots for VGA and IDE,
though even that becomes a problem as we switch to q35.
David Gibson Nov. 22, 2012, 2:27 a.m. UTC | #16
On Wed, Nov 21, 2012 at 05:27:37PM +0200, Michael S. Tsirkin wrote:
> On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
> > On 11/21/2012 02:21 PM, David Gibson wrote:
> > >On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> > >>On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> > >>>On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > >>>>On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > >>>>>On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > >>>>>>On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > >>>>>>>On 19.11.2012, at 23:51, David Gibson wrote:
> > >>>>>>>
> > >>>>>>>>On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > >>>>>>>>>On 13.11.2012, at 03:47, David Gibson wrote:
> > >>>>>>>>>
> > >>>>>>>>>>From: Alexey Kardashevskiy<aik@ozlabs.ru>
> > >>>>>>>>>>
> > >>>>>>>>>>In future (with VFIO) we will have multiple PCI host bridges on
> > >>>>>>>>>>pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > >>>>>>>>>>derive these from the pci domain number, but the whole notion of
> > >>>>>>>>>>domain numbers on the qemu side is bogus and in any case they're not
> > >>>>>>>>>>actually uniquely allocated at this point.
> > >>>>>>>>>>
> > >>>>>>>>>>This patch, therefore uses a simple sequence counter to generate
> > >>>>>>>>>>unique LIOBNs for PCI host bridges.
> > >>>>>>>>>>
> > >>>>>>>>>>Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
> > >>>>>>>>>>Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> > >>>>>>>>>I don't really like the idea of having a global variable just
> > >>>>>>>>>because our domain ID generation seems to not work as
> > >>>>>>>>>expected. Michael, any comments here?
> > >>>>>>>>Well, the patch I sent which changed domain id generation was
> > >>>>>>>>ignored.  In any case, as I said, the whole concept of domain numbers
> > >>>>>>>Michael?
> > >>>>>>This is user visible, right?
> > >>>>>>So IMHO we should have the user specify LIOBN through a property,
> > >>>>>>rather than assign what's essentially a random value.
> > >>>>>Well, I can implement an override through a property, which could be
> > >>>>>useful in some circumstances.  But we still need to have qemu generate
> > >>>>>unique defaults, rather than forcing it to be specified in every case.
> > >>>>I don't see why.
> > >>>>And if you want automatic defaults then they need to be generated in a
> > >>>>way that does not depend on implementation detail such as order of
> > >>>>device initialization.
> > >>>Because requiring explicit unique liobns to be supplied whenever there
> > >>>is more than one PHB is horrible for usability.
> > >>We should make simple things simple and complex things possible.
> > >>More than one PHB seems like an advanced feature
> > >Not for pseries.  On real hardware of this type, dozens of PHBs is
> > >routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> > >for emulated devices and one for passthrough devices.
> > 
> > Yeah, I second Davids opinion here. We need to make this easy for users.
> 
> I think users don't normally create PHBs. They request a disk, a network
> device, a pass-through device ... In this case if this requires
> more PHBs create them internally but I imagine this doesn't
> require any allocation scheme - simply set some address
> for virtual devices, some other one for assigned devices ... no?

No.  One PHB for passthrough and one for emulated is the minimum.
Since we don't emulated p2p bridges, each PHB can only support a small
number of PCI devices, so if enough PCI devices are requested, we will
still need to create - and assign numbers to - additional PHBs.

> If user wants to play with low level detail such as PHBs I don't see why
> it's not reasonable to require full specification.
> 
> > How do we assign PCI slot IDs today when all you do is a -device?
> > This should probably follow the same scheme.
> > 
> > 
> > Alex
> 
> What we do is find a first free slow. But it's exactly why I worry:
> changing pci addresses between qemu releases has been a source of pain
> and compatibility hassles in the past.
> The problem would be more manageable if you simply reserve some fixed
> addresses for internal use, like we reserve slots for VGA and IDE,
> though even that becomes a problem as we switch to q35.
>
Michael S. Tsirkin Nov. 22, 2012, 7:23 a.m. UTC | #17
On Thu, Nov 22, 2012 at 01:27:18PM +1100, David Gibson wrote:
> On Wed, Nov 21, 2012 at 05:27:37PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
> > > On 11/21/2012 02:21 PM, David Gibson wrote:
> > > >On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> > > >>On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> > > >>>On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > > >>>>On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > > >>>>>On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > >>>>>>On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > >>>>>>>On 19.11.2012, at 23:51, David Gibson wrote:
> > > >>>>>>>
> > > >>>>>>>>On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > >>>>>>>>>On 13.11.2012, at 03:47, David Gibson wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>>From: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > >>>>>>>>>>
> > > >>>>>>>>>>In future (with VFIO) we will have multiple PCI host bridges on
> > > >>>>>>>>>>pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > >>>>>>>>>>derive these from the pci domain number, but the whole notion of
> > > >>>>>>>>>>domain numbers on the qemu side is bogus and in any case they're not
> > > >>>>>>>>>>actually uniquely allocated at this point.
> > > >>>>>>>>>>
> > > >>>>>>>>>>This patch, therefore uses a simple sequence counter to generate
> > > >>>>>>>>>>unique LIOBNs for PCI host bridges.
> > > >>>>>>>>>>
> > > >>>>>>>>>>Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > >>>>>>>>>>Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> > > >>>>>>>>>I don't really like the idea of having a global variable just
> > > >>>>>>>>>because our domain ID generation seems to not work as
> > > >>>>>>>>>expected. Michael, any comments here?
> > > >>>>>>>>Well, the patch I sent which changed domain id generation was
> > > >>>>>>>>ignored.  In any case, as I said, the whole concept of domain numbers
> > > >>>>>>>Michael?
> > > >>>>>>This is user visible, right?
> > > >>>>>>So IMHO we should have the user specify LIOBN through a property,
> > > >>>>>>rather than assign what's essentially a random value.
> > > >>>>>Well, I can implement an override through a property, which could be
> > > >>>>>useful in some circumstances.  But we still need to have qemu generate
> > > >>>>>unique defaults, rather than forcing it to be specified in every case.
> > > >>>>I don't see why.
> > > >>>>And if you want automatic defaults then they need to be generated in a
> > > >>>>way that does not depend on implementation detail such as order of
> > > >>>>device initialization.
> > > >>>Because requiring explicit unique liobns to be supplied whenever there
> > > >>>is more than one PHB is horrible for usability.
> > > >>We should make simple things simple and complex things possible.
> > > >>More than one PHB seems like an advanced feature
> > > >Not for pseries.  On real hardware of this type, dozens of PHBs is
> > > >routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> > > >for emulated devices and one for passthrough devices.
> > > 
> > > Yeah, I second Davids opinion here. We need to make this easy for users.
> > 
> > I think users don't normally create PHBs. They request a disk, a network
> > device, a pass-through device ... In this case if this requires
> > more PHBs create them internally but I imagine this doesn't
> > require any allocation scheme - simply set some address
> > for virtual devices, some other one for assigned devices ... no?
> 
> No.  One PHB for passthrough and one for emulated is the minimum.
> Since we don't emulated p2p bridges,

Actually qemu does emulate p2p bridges.

> each PHB can only support a small
> number of PCI devices, so if enough PCI devices are requested, we will
> still need to create - and assign numbers to - additional PHBs.

Each PHB can support up to 32 slots right? This seems ample for
a typical use. If you want many tens of devices you need to
supply addresses manually, this looks reasonable to me.

Allocating PHBs on the fly seems unencessarily tricky.

> > If user wants to play with low level detail such as PHBs I don't see why
> > it's not reasonable to require full specification.
> > 
> > > How do we assign PCI slot IDs today when all you do is a -device?
> > > This should probably follow the same scheme.
> > > 
> > > 
> > > Alex
> > 
> > What we do is find a first free slow. But it's exactly why I worry:
> > changing pci addresses between qemu releases has been a source of pain
> > and compatibility hassles in the past.
> > The problem would be more manageable if you simply reserve some fixed
> > addresses for internal use, like we reserve slots for VGA and IDE,
> > though even that becomes a problem as we switch to q35.
> > 
> 
> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
Alexander Graf Nov. 22, 2012, 11:27 a.m. UTC | #18
On 22.11.2012, at 08:23, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Nov 22, 2012 at 01:27:18PM +1100, David Gibson wrote:
>> On Wed, Nov 21, 2012 at 05:27:37PM +0200, Michael S. Tsirkin wrote:
>>> On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
>>>> On 11/21/2012 02:21 PM, David Gibson wrote:
>>>>> On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
>>>>>> On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
>>>>>>> On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
>>>>>>>> On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
>>>>>>>>> On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
>>>>>>>>>> On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
>>>>>>>>>>> On 19.11.2012, at 23:51, David Gibson wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
>>>>>>>>>>>>> On 13.11.2012, at 03:47, David Gibson wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> From: Alexey Kardashevskiy<aik@ozlabs.ru>
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In future (with VFIO) we will have multiple PCI host bridges on
>>>>>>>>>>>>>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
>>>>>>>>>>>>>> derive these from the pci domain number, but the whole notion of
>>>>>>>>>>>>>> domain numbers on the qemu side is bogus and in any case they're not
>>>>>>>>>>>>>> actually uniquely allocated at this point.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> This patch, therefore uses a simple sequence counter to generate
>>>>>>>>>>>>>> unique LIOBNs for PCI host bridges.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
>>>>>>>>>>>>>> Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
>>>>>>>>>>>>> I don't really like the idea of having a global variable just
>>>>>>>>>>>>> because our domain ID generation seems to not work as
>>>>>>>>>>>>> expected. Michael, any comments here?
>>>>>>>>>>>> Well, the patch I sent which changed domain id generation was
>>>>>>>>>>>> ignored.  In any case, as I said, the whole concept of domain numbers
>>>>>>>>>>> Michael?
>>>>>>>>>> This is user visible, right?
>>>>>>>>>> So IMHO we should have the user specify LIOBN through a property,
>>>>>>>>>> rather than assign what's essentially a random value.
>>>>>>>>> Well, I can implement an override through a property, which could be
>>>>>>>>> useful in some circumstances.  But we still need to have qemu generate
>>>>>>>>> unique defaults, rather than forcing it to be specified in every case.
>>>>>>>> I don't see why.
>>>>>>>> And if you want automatic defaults then they need to be generated in a
>>>>>>>> way that does not depend on implementation detail such as order of
>>>>>>>> device initialization.
>>>>>>> Because requiring explicit unique liobns to be supplied whenever there
>>>>>>> is more than one PHB is horrible for usability.
>>>>>> We should make simple things simple and complex things possible.
>>>>>> More than one PHB seems like an advanced feature
>>>>> Not for pseries.  On real hardware of this type, dozens of PHBs is
>>>>> routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
>>>>> for emulated devices and one for passthrough devices.
>>>> 
>>>> Yeah, I second Davids opinion here. We need to make this easy for users.
>>> 
>>> I think users don't normally create PHBs. They request a disk, a network
>>> device, a pass-through device ... In this case if this requires
>>> more PHBs create them internally but I imagine this doesn't
>>> require any allocation scheme - simply set some address
>>> for virtual devices, some other one for assigned devices ... no?
>> 
>> No.  One PHB for passthrough and one for emulated is the minimum.
>> Since we don't emulated p2p bridges,
> 
> Actually qemu does emulate p2p bridges.
> 
>> each PHB can only support a small
>> number of PCI devices, so if enough PCI devices are requested, we will
>> still need to create - and assign numbers to - additional PHBs.
> 
> Each PHB can support up to 32 slots right? This seems ample for
> a typical use. If you want many tens of devices you need to
> supply addresses manually, this looks reasonable to me.
> 
> Allocating PHBs on the fly seems unencessarily tricky.

IIRC it's required for fault isolation.

Alex

> 
>>> If user wants to play with low level detail such as PHBs I don't see why
>>> it's not reasonable to require full specification.
>>> 
>>>> How do we assign PCI slot IDs today when all you do is a -device?
>>>> This should probably follow the same scheme.
>>>> 
>>>> 
>>>> Alex
>>> 
>>> What we do is find a first free slow. But it's exactly why I worry:
>>> changing pci addresses between qemu releases has been a source of pain
>>> and compatibility hassles in the past.
>>> The problem would be more manageable if you simply reserve some fixed
>>> addresses for internal use, like we reserve slots for VGA and IDE,
>>> though even that becomes a problem as we switch to q35.
>>> 
>> 
>> -- 
>> David Gibson            | I'll have my music baroque, and my code
>> david AT gibson.dropbear.id.au    | minimalist, thank you.  NOT _the_ _other_
>>                | _way_ _around_!
>> http://www.ozlabs.org/~dgibson
Michael S. Tsirkin Nov. 22, 2012, 11:39 a.m. UTC | #19
On Thu, Nov 22, 2012 at 12:27:49PM +0100, Alexander Graf wrote:
> 
> 
> On 22.11.2012, at 08:23, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Thu, Nov 22, 2012 at 01:27:18PM +1100, David Gibson wrote:
> >> On Wed, Nov 21, 2012 at 05:27:37PM +0200, Michael S. Tsirkin wrote:
> >>> On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
> >>>> On 11/21/2012 02:21 PM, David Gibson wrote:
> >>>>> On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> >>>>>> On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> >>>>>>> On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> >>>>>>>> On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> >>>>>>>>> On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> >>>>>>>>>> On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> >>>>>>>>>>> On 19.11.2012, at 23:51, David Gibson wrote:
> >>>>>>>>>>> 
> >>>>>>>>>>>> On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> >>>>>>>>>>>>> On 13.11.2012, at 03:47, David Gibson wrote:
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>>> From: Alexey Kardashevskiy<aik@ozlabs.ru>
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> In future (with VFIO) we will have multiple PCI host bridges on
> >>>>>>>>>>>>>> pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> >>>>>>>>>>>>>> derive these from the pci domain number, but the whole notion of
> >>>>>>>>>>>>>> domain numbers on the qemu side is bogus and in any case they're not
> >>>>>>>>>>>>>> actually uniquely allocated at this point.
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> This patch, therefore uses a simple sequence counter to generate
> >>>>>>>>>>>>>> unique LIOBNs for PCI host bridges.
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
> >>>>>>>>>>>>>> Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> >>>>>>>>>>>>> I don't really like the idea of having a global variable just
> >>>>>>>>>>>>> because our domain ID generation seems to not work as
> >>>>>>>>>>>>> expected. Michael, any comments here?
> >>>>>>>>>>>> Well, the patch I sent which changed domain id generation was
> >>>>>>>>>>>> ignored.  In any case, as I said, the whole concept of domain numbers
> >>>>>>>>>>> Michael?
> >>>>>>>>>> This is user visible, right?
> >>>>>>>>>> So IMHO we should have the user specify LIOBN through a property,
> >>>>>>>>>> rather than assign what's essentially a random value.
> >>>>>>>>> Well, I can implement an override through a property, which could be
> >>>>>>>>> useful in some circumstances.  But we still need to have qemu generate
> >>>>>>>>> unique defaults, rather than forcing it to be specified in every case.
> >>>>>>>> I don't see why.
> >>>>>>>> And if you want automatic defaults then they need to be generated in a
> >>>>>>>> way that does not depend on implementation detail such as order of
> >>>>>>>> device initialization.
> >>>>>>> Because requiring explicit unique liobns to be supplied whenever there
> >>>>>>> is more than one PHB is horrible for usability.
> >>>>>> We should make simple things simple and complex things possible.
> >>>>>> More than one PHB seems like an advanced feature
> >>>>> Not for pseries.  On real hardware of this type, dozens of PHBs is
> >>>>> routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> >>>>> for emulated devices and one for passthrough devices.
> >>>> 
> >>>> Yeah, I second Davids opinion here. We need to make this easy for users.
> >>> 
> >>> I think users don't normally create PHBs. They request a disk, a network
> >>> device, a pass-through device ... In this case if this requires
> >>> more PHBs create them internally but I imagine this doesn't
> >>> require any allocation scheme - simply set some address
> >>> for virtual devices, some other one for assigned devices ... no?
> >> 
> >> No.  One PHB for passthrough and one for emulated is the minimum.
> >> Since we don't emulated p2p bridges,
> > 
> > Actually qemu does emulate p2p bridges.
> > 
> >> each PHB can only support a small
> >> number of PCI devices, so if enough PCI devices are requested, we will
> >> still need to create - and assign numbers to - additional PHBs.
> > 
> > Each PHB can support up to 32 slots right? This seems ample for
> > a typical use. If you want many tens of devices you need to
> > supply addresses manually, this looks reasonable to me.
> > 
> > Allocating PHBs on the fly seems unencessarily tricky.
> 
> IIRC it's required for fault isolation.
> 
> Alex

IIUC separate physical PHBs might help fault isolation
of assigned devices.
OTOH I don't see why we need to emulate PHBs for this.

> > 
> >>> If user wants to play with low level detail such as PHBs I don't see why
> >>> it's not reasonable to require full specification.
> >>> 
> >>>> How do we assign PCI slot IDs today when all you do is a -device?
> >>>> This should probably follow the same scheme.
> >>>> 
> >>>> 
> >>>> Alex
> >>> 
> >>> What we do is find a first free slow. But it's exactly why I worry:
> >>> changing pci addresses between qemu releases has been a source of pain
> >>> and compatibility hassles in the past.
> >>> The problem would be more manageable if you simply reserve some fixed
> >>> addresses for internal use, like we reserve slots for VGA and IDE,
> >>> though even that becomes a problem as we switch to q35.
> >>> 
> >> 
> >> -- 
> >> David Gibson            | I'll have my music baroque, and my code
> >> david AT gibson.dropbear.id.au    | minimalist, thank you.  NOT _the_ _other_
> >>                | _way_ _around_!
> >> http://www.ozlabs.org/~dgibson
David Gibson Nov. 23, 2012, 4:13 a.m. UTC | #20
On Thu, Nov 22, 2012 at 09:23:03AM +0200, Michael S. Tsirkin wrote:
> On Thu, Nov 22, 2012 at 01:27:18PM +1100, David Gibson wrote:
> > On Wed, Nov 21, 2012 at 05:27:37PM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
> > > > On 11/21/2012 02:21 PM, David Gibson wrote:
> > > > >On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> > > > >>On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> > > > >>>On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > > > >>>>On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > > > >>>>>On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > > >>>>>>On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > > >>>>>>>On 19.11.2012, at 23:51, David Gibson wrote:
> > > > >>>>>>>
> > > > >>>>>>>>On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > > >>>>>>>>>On 13.11.2012, at 03:47, David Gibson wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>>From: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>In future (with VFIO) we will have multiple PCI host bridges on
> > > > >>>>>>>>>>pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > > >>>>>>>>>>derive these from the pci domain number, but the whole notion of
> > > > >>>>>>>>>>domain numbers on the qemu side is bogus and in any case they're not
> > > > >>>>>>>>>>actually uniquely allocated at this point.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>This patch, therefore uses a simple sequence counter to generate
> > > > >>>>>>>>>>unique LIOBNs for PCI host bridges.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > > >>>>>>>>>>Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> > > > >>>>>>>>>I don't really like the idea of having a global variable just
> > > > >>>>>>>>>because our domain ID generation seems to not work as
> > > > >>>>>>>>>expected. Michael, any comments here?
> > > > >>>>>>>>Well, the patch I sent which changed domain id generation was
> > > > >>>>>>>>ignored.  In any case, as I said, the whole concept of domain numbers
> > > > >>>>>>>Michael?
> > > > >>>>>>This is user visible, right?
> > > > >>>>>>So IMHO we should have the user specify LIOBN through a property,
> > > > >>>>>>rather than assign what's essentially a random value.
> > > > >>>>>Well, I can implement an override through a property, which could be
> > > > >>>>>useful in some circumstances.  But we still need to have qemu generate
> > > > >>>>>unique defaults, rather than forcing it to be specified in every case.
> > > > >>>>I don't see why.
> > > > >>>>And if you want automatic defaults then they need to be generated in a
> > > > >>>>way that does not depend on implementation detail such as order of
> > > > >>>>device initialization.
> > > > >>>Because requiring explicit unique liobns to be supplied whenever there
> > > > >>>is more than one PHB is horrible for usability.
> > > > >>We should make simple things simple and complex things possible.
> > > > >>More than one PHB seems like an advanced feature
> > > > >Not for pseries.  On real hardware of this type, dozens of PHBs is
> > > > >routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> > > > >for emulated devices and one for passthrough devices.
> > > > 
> > > > Yeah, I second Davids opinion here. We need to make this easy for users.
> > > 
> > > I think users don't normally create PHBs. They request a disk, a network
> > > device, a pass-through device ... In this case if this requires
> > > more PHBs create them internally but I imagine this doesn't
> > > require any allocation scheme - simply set some address
> > > for virtual devices, some other one for assigned devices ... no?
> > 
> > No.  One PHB for passthrough and one for emulated is the minimum.
> > Since we don't emulated p2p bridges,
> 
> Actually qemu does emulate p2p bridges.
> 
> > each PHB can only support a small
> > number of PCI devices, so if enough PCI devices are requested, we will
> > still need to create - and assign numbers to - additional PHBs.
> 
> Each PHB can support up to 32 slots right? This seems ample for
> a typical use. If you want many tens of devices you need to
> supply addresses manually, this looks reasonable to me.
> 
> Allocating PHBs on the fly seems unencessarily tricky.

I still think you're thinking far too much in terms of x86-like
guests.  But, I guess, emulated PCI devices will be pretty rare, since
they're so slow anyway.

But, what Alex was getting at is another factor I forgot.  Because of
the way the PAPR PCI paravirtualized interfaces work, host devices
which are in different "partitionable endpoints" (roughly the minimum
granularity for safe isolation) must appear in different virtual PHBs
in the guest.  So if we pass through a lot of PCI devices, we will
almost certainly need a lot of guest PHBs.
Michael S. Tsirkin Nov. 23, 2012, 10:53 a.m. UTC | #21
On Fri, Nov 23, 2012 at 03:13:07PM +1100, David Gibson wrote:
> On Thu, Nov 22, 2012 at 09:23:03AM +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 22, 2012 at 01:27:18PM +1100, David Gibson wrote:
> > > On Wed, Nov 21, 2012 at 05:27:37PM +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
> > > > > On 11/21/2012 02:21 PM, David Gibson wrote:
> > > > > >On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> > > > > >>On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> > > > > >>>On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > > > > >>>>On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > > > > >>>>>On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > > > >>>>>>On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > > > >>>>>>>On 19.11.2012, at 23:51, David Gibson wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>>On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > > > >>>>>>>>>On 13.11.2012, at 03:47, David Gibson wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>>From: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>In future (with VFIO) we will have multiple PCI host bridges on
> > > > > >>>>>>>>>>pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > > > >>>>>>>>>>derive these from the pci domain number, but the whole notion of
> > > > > >>>>>>>>>>domain numbers on the qemu side is bogus and in any case they're not
> > > > > >>>>>>>>>>actually uniquely allocated at this point.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>This patch, therefore uses a simple sequence counter to generate
> > > > > >>>>>>>>>>unique LIOBNs for PCI host bridges.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > > > >>>>>>>>>>Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> > > > > >>>>>>>>>I don't really like the idea of having a global variable just
> > > > > >>>>>>>>>because our domain ID generation seems to not work as
> > > > > >>>>>>>>>expected. Michael, any comments here?
> > > > > >>>>>>>>Well, the patch I sent which changed domain id generation was
> > > > > >>>>>>>>ignored.  In any case, as I said, the whole concept of domain numbers
> > > > > >>>>>>>Michael?
> > > > > >>>>>>This is user visible, right?
> > > > > >>>>>>So IMHO we should have the user specify LIOBN through a property,
> > > > > >>>>>>rather than assign what's essentially a random value.
> > > > > >>>>>Well, I can implement an override through a property, which could be
> > > > > >>>>>useful in some circumstances.  But we still need to have qemu generate
> > > > > >>>>>unique defaults, rather than forcing it to be specified in every case.
> > > > > >>>>I don't see why.
> > > > > >>>>And if you want automatic defaults then they need to be generated in a
> > > > > >>>>way that does not depend on implementation detail such as order of
> > > > > >>>>device initialization.
> > > > > >>>Because requiring explicit unique liobns to be supplied whenever there
> > > > > >>>is more than one PHB is horrible for usability.
> > > > > >>We should make simple things simple and complex things possible.
> > > > > >>More than one PHB seems like an advanced feature
> > > > > >Not for pseries.  On real hardware of this type, dozens of PHBs is
> > > > > >routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> > > > > >for emulated devices and one for passthrough devices.
> > > > > 
> > > > > Yeah, I second Davids opinion here. We need to make this easy for users.
> > > > 
> > > > I think users don't normally create PHBs. They request a disk, a network
> > > > device, a pass-through device ... In this case if this requires
> > > > more PHBs create them internally but I imagine this doesn't
> > > > require any allocation scheme - simply set some address
> > > > for virtual devices, some other one for assigned devices ... no?
> > > 
> > > No.  One PHB for passthrough and one for emulated is the minimum.
> > > Since we don't emulated p2p bridges,
> > 
> > Actually qemu does emulate p2p bridges.
> > 
> > > each PHB can only support a small
> > > number of PCI devices, so if enough PCI devices are requested, we will
> > > still need to create - and assign numbers to - additional PHBs.
> > 
> > Each PHB can support up to 32 slots right? This seems ample for
> > a typical use. If you want many tens of devices you need to
> > supply addresses manually, this looks reasonable to me.
> > 
> > Allocating PHBs on the fly seems unencessarily tricky.
> 
> I still think you're thinking far too much in terms of x86-like
> guests.  But, I guess, emulated PCI devices will be pretty rare, since
> they're so slow anyway.

They are not very slow.  My point is the reverse. Device assignment is
pretty rare, assigning multiple devices even more rare. It should work
but I don't see why we should increase maintainance burden to save the user
ruynning qemu from CLI the trouble of specifying a single extra property
in this advanced case.

> But, what Alex was getting at is another factor I forgot.  Because of
> the way the PAPR PCI paravirtualized interfaces work, host devices
> which are in different "partitionable endpoints" (roughly the minimum
> granularity for safe isolation) must appear in different virtual PHBs
> in the guest.  So if we pass through a lot of PCI devices, we will
> almost certainly need a lot of guest PHBs.

One further point: physical systems normally don't let you hotplug PHBs.
So if for non hotplug you auto-add PHBs, you end up treating hotplug
and non hotplug devices differently from UI point of view this is nasty.

Look, even if solution using a required property is less elegant for CLI
use, it will work, won't it?
So how about we merge it so that things work, and then we can discuss a
patch on top that auto-generates this property?

> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
David Gibson Nov. 23, 2012, 12:59 p.m. UTC | #22
On Fri, Nov 23, 2012 at 12:53:23PM +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 23, 2012 at 03:13:07PM +1100, David Gibson wrote:
> > On Thu, Nov 22, 2012 at 09:23:03AM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Nov 22, 2012 at 01:27:18PM +1100, David Gibson wrote:
> > > > On Wed, Nov 21, 2012 at 05:27:37PM +0200, Michael S. Tsirkin wrote:
> > > > > On Wed, Nov 21, 2012 at 02:27:08PM +0100, Alexander Graf wrote:
> > > > > > On 11/21/2012 02:21 PM, David Gibson wrote:
> > > > > > >On Wed, Nov 21, 2012 at 03:13:39PM +0200, Michael S. Tsirkin wrote:
> > > > > > >>On Wed, Nov 21, 2012 at 11:36:00PM +1100, David Gibson wrote:
> > > > > > >>>On Wed, Nov 21, 2012 at 01:34:48PM +0200, Michael S. Tsirkin wrote:
> > > > > > >>>>On Wed, Nov 21, 2012 at 11:57:05AM +1100, David Gibson wrote:
> > > > > > >>>>>On Tue, Nov 20, 2012 at 02:26:09PM +0200, Michael S. Tsirkin wrote:
> > > > > > >>>>>>On Tue, Nov 20, 2012 at 10:27:11AM +0100, Alexander Graf wrote:
> > > > > > >>>>>>>On 19.11.2012, at 23:51, David Gibson wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>>On Mon, Nov 19, 2012 at 05:34:12PM +0100, Alexander Graf wrote:
> > > > > > >>>>>>>>>On 13.11.2012, at 03:47, David Gibson wrote:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>>From: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>In future (with VFIO) we will have multiple PCI host bridges on
> > > > > > >>>>>>>>>>pseries.  Each one needs a unique LIOBN (IOMMU id).  At the moment we
> > > > > > >>>>>>>>>>derive these from the pci domain number, but the whole notion of
> > > > > > >>>>>>>>>>domain numbers on the qemu side is bogus and in any case they're not
> > > > > > >>>>>>>>>>actually uniquely allocated at this point.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>This patch, therefore uses a simple sequence counter to generate
> > > > > > >>>>>>>>>>unique LIOBNs for PCI host bridges.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>Signed-off-by: Alexey Kardashevskiy<aik@ozlabs.ru>
> > > > > > >>>>>>>>>>Signed-off-by: David Gibson<david@gibson.dropbear.id.au>
> > > > > > >>>>>>>>>I don't really like the idea of having a global variable just
> > > > > > >>>>>>>>>because our domain ID generation seems to not work as
> > > > > > >>>>>>>>>expected. Michael, any comments here?
> > > > > > >>>>>>>>Well, the patch I sent which changed domain id generation was
> > > > > > >>>>>>>>ignored.  In any case, as I said, the whole concept of domain numbers
> > > > > > >>>>>>>Michael?
> > > > > > >>>>>>This is user visible, right?
> > > > > > >>>>>>So IMHO we should have the user specify LIOBN through a property,
> > > > > > >>>>>>rather than assign what's essentially a random value.
> > > > > > >>>>>Well, I can implement an override through a property, which could be
> > > > > > >>>>>useful in some circumstances.  But we still need to have qemu generate
> > > > > > >>>>>unique defaults, rather than forcing it to be specified in every case.
> > > > > > >>>>I don't see why.
> > > > > > >>>>And if you want automatic defaults then they need to be generated in a
> > > > > > >>>>way that does not depend on implementation detail such as order of
> > > > > > >>>>device initialization.
> > > > > > >>>Because requiring explicit unique liobns to be supplied whenever there
> > > > > > >>>is more than one PHB is horrible for usability.
> > > > > > >>We should make simple things simple and complex things possible.
> > > > > > >>More than one PHB seems like an advanced feature
> > > > > > >Not for pseries.  On real hardware of this type, dozens of PHBs is
> > > > > > >routine.  Plus, vfio passthrough is coming, we need at minimum one PHB
> > > > > > >for emulated devices and one for passthrough devices.
> > > > > > 
> > > > > > Yeah, I second Davids opinion here. We need to make this easy for users.
> > > > > 
> > > > > I think users don't normally create PHBs. They request a disk, a network
> > > > > device, a pass-through device ... In this case if this requires
> > > > > more PHBs create them internally but I imagine this doesn't
> > > > > require any allocation scheme - simply set some address
> > > > > for virtual devices, some other one for assigned devices ... no?
> > > > 
> > > > No.  One PHB for passthrough and one for emulated is the minimum.
> > > > Since we don't emulated p2p bridges,
> > > 
> > > Actually qemu does emulate p2p bridges.
> > > 
> > > > each PHB can only support a small
> > > > number of PCI devices, so if enough PCI devices are requested, we will
> > > > still need to create - and assign numbers to - additional PHBs.
> > > 
> > > Each PHB can support up to 32 slots right? This seems ample for
> > > a typical use. If you want many tens of devices you need to
> > > supply addresses manually, this looks reasonable to me.
> > > 
> > > Allocating PHBs on the fly seems unencessarily tricky.
> > 
> > I still think you're thinking far too much in terms of x86-like
> > guests.  But, I guess, emulated PCI devices will be pretty rare, since
> > they're so slow anyway.
> 
> They are not very slow.  My point is the reverse. Device assignment is
> pretty rare, assigning multiple devices even more rare. It should work
> but I don't see why we should increase maintainance burden to save the user
> ruynning qemu from CLI the trouble of specifying a single extra property
> in this advanced case.

You're still in x86-land.  It's rare there, but PCI assignment is
absolutely routine on partitioned Power systems.

> > But, what Alex was getting at is another factor I forgot.  Because of
> > the way the PAPR PCI paravirtualized interfaces work, host devices
> > which are in different "partitionable endpoints" (roughly the minimum
> > granularity for safe isolation) must appear in different virtual PHBs
> > in the guest.  So if we pass through a lot of PCI devices, we will
> > almost certainly need a lot of guest PHBs.
> 
> One further point: physical systems normally don't let you hotplug PHBs.

No, but the PAPR interface is already para-virtualized, so that's not
really a limitation.  I haven't looked at the hotplug interfaces
recently (implementing hotplug is a project for next year) - but I
think it actually typically operates by hotplugging (virtual) PHBs.

> So if for non hotplug you auto-add PHBs, you end up treating hotplug
> and non hotplug devices differently from UI point of view this is nasty.

> Look, even if solution using a required property is less elegant for CLI
> use, it will work, won't it?
> So how about we merge it so that things work, and then we can discuss a
> patch on top that auto-generates this property?

Well, there you have a point.  And actually I've realised there are
other things we need to assign uniquely for each PHB and don't yet (IO
window addresses).  So I need to look at a wider rework of this, which
I'll start on next week.
Michael S. Tsirkin Nov. 23, 2012, 1:44 p.m. UTC | #23
On Fri, Nov 23, 2012 at 11:59:51PM +1100, David Gibson wrote:
> > Look, even if solution using a required property is less elegant for CLI
> > use, it will work, won't it?
> > So how about we merge it so that things work, and then we can discuss a
> > patch on top that auto-generates this property?
> 
> Well, there you have a point.  And actually I've realised there are
> other things we need to assign uniquely for each PHB and don't yet (IO
> window addresses).  So I need to look at a wider rework of this, which
> I'll start on next week.

Fine. Basically my point is it's typically a mistake to
make some userspace visible parameter depend on order
of initialization of devices in qemu. I don't insist
on making users fully specify such parameters but it
is one way to do this.


> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
Alexander Graf Nov. 23, 2012, 1:44 p.m. UTC | #24
On 23.11.2012, at 14:44, Michael S. Tsirkin wrote:

> On Fri, Nov 23, 2012 at 11:59:51PM +1100, David Gibson wrote:
>>> Look, even if solution using a required property is less elegant for CLI
>>> use, it will work, won't it?
>>> So how about we merge it so that things work, and then we can discuss a
>>> patch on top that auto-generates this property?
>> 
>> Well, there you have a point.  And actually I've realised there are
>> other things we need to assign uniquely for each PHB and don't yet (IO
>> window addresses).  So I need to look at a wider rework of this, which
>> I'll start on next week.
> 
> Fine. Basically my point is it's typically a mistake to
> make some userspace visible parameter depend on order
> of initialization of devices in qemu. I don't insist
> on making users fully specify such parameters but it
> is one way to do this.

I think it's reasonable to require to be able to specify it. If you don't, it's fine to base on device order IMHO. But you have to have the ability to specify it by hand if your management tool of choice actually wants reproducible results.


Alex
Michael S. Tsirkin Nov. 23, 2012, 2:01 p.m. UTC | #25
On Fri, Nov 23, 2012 at 02:44:15PM +0100, Alexander Graf wrote:
> 
> On 23.11.2012, at 14:44, Michael S. Tsirkin wrote:
> 
> > On Fri, Nov 23, 2012 at 11:59:51PM +1100, David Gibson wrote:
> >>> Look, even if solution using a required property is less elegant for CLI
> >>> use, it will work, won't it?
> >>> So how about we merge it so that things work, and then we can discuss a
> >>> patch on top that auto-generates this property?
> >> 
> >> Well, there you have a point.  And actually I've realised there are
> >> other things we need to assign uniquely for each PHB and don't yet (IO
> >> window addresses).  So I need to look at a wider rework of this, which
> >> I'll start on next week.
> > 
> > Fine. Basically my point is it's typically a mistake to
> > make some userspace visible parameter depend on order
> > of initialization of devices in qemu. I don't insist
> > on making users fully specify such parameters but it
> > is one way to do this.
> 
> I think it's reasonable to require to be able to specify it. If you
> don't, it's fine to base on device order IMHO.

Let me clarify why it's not fine.  My understanding is these addresses
do not change across reboots on real hardware.  On the other hand order
of initialization can change because of internal changes in qemu;
thus shut down and start guest under another qemu version
changes addresses.

So it's a bug. No?

> But you have to have the ability to specify it by hand if your
> management tool of choice actually wants reproducible results.
> 
> 
> Alex

How do you know whether your guest relies on reproducible results?
You typically don't. Imagine a guest which does rely on this. Then:

What I propose: user runs qemu with many PHBs but with no iobns, gets error
'initialization failed. please add iobn=XXXX where XXXX is a unique
 number 1 to 64K.'

What you propose: user runs qemu with many PHBs but with no iobns,
guest fails to boot or behaves incorrectly.

I think as a user I prefer the first type of failure. No?
Alexander Graf Nov. 23, 2012, 2:03 p.m. UTC | #26
On 23.11.2012, at 15:01, Michael S. Tsirkin wrote:

> On Fri, Nov 23, 2012 at 02:44:15PM +0100, Alexander Graf wrote:
>> 
>> On 23.11.2012, at 14:44, Michael S. Tsirkin wrote:
>> 
>>> On Fri, Nov 23, 2012 at 11:59:51PM +1100, David Gibson wrote:
>>>>> Look, even if solution using a required property is less elegant for CLI
>>>>> use, it will work, won't it?
>>>>> So how about we merge it so that things work, and then we can discuss a
>>>>> patch on top that auto-generates this property?
>>>> 
>>>> Well, there you have a point.  And actually I've realised there are
>>>> other things we need to assign uniquely for each PHB and don't yet (IO
>>>> window addresses).  So I need to look at a wider rework of this, which
>>>> I'll start on next week.
>>> 
>>> Fine. Basically my point is it's typically a mistake to
>>> make some userspace visible parameter depend on order
>>> of initialization of devices in qemu. I don't insist
>>> on making users fully specify such parameters but it
>>> is one way to do this.
>> 
>> I think it's reasonable to require to be able to specify it. If you
>> don't, it's fine to base on device order IMHO.
> 
> Let me clarify why it's not fine.  My understanding is these addresses
> do not change across reboots on real hardware.  On the other hand order
> of initialization can change because of internal changes in qemu;
> thus shut down and start guest under another qemu version
> changes addresses.
> 
> So it's a bug. No?
> 
>> But you have to have the ability to specify it by hand if your
>> management tool of choice actually wants reproducible results.
>> 
>> 
>> Alex
> 
> How do you know whether your guest relies on reproducible results?
> You typically don't. Imagine a guest which does rely on this. Then:
> 
> What I propose: user runs qemu with many PHBs but with no iobns, gets error
> 'initialization failed. please add iobn=XXXX where XXXX is a unique
> number 1 to 64K.'
> 
> What you propose: user runs qemu with many PHBs but with no iobns,
> guest fails to boot or behaves incorrectly.
> 
> I think as a user I prefer the first type of failure. No?

I tend to disagree. The device creation logic should stay identical with the same versioned machine. So if I use -M pc-0.12, I am guaranteed that QEMU's internal semantics on which device goes where does not change.

For unstable machine types (which -M pseries is, same as -M pc), we don't guarantee that the guest view stays stable across version updates FWIW. So if we want to give the user a stable view of the world, we would need to create a -M pseries-1.3 which then would always keep the same device creation order.

It's the same for x86, no?


Alex
Michael S. Tsirkin Nov. 23, 2012, 2:18 p.m. UTC | #27
On Fri, Nov 23, 2012 at 03:03:24PM +0100, Alexander Graf wrote:
> 
> On 23.11.2012, at 15:01, Michael S. Tsirkin wrote:
> 
> > On Fri, Nov 23, 2012 at 02:44:15PM +0100, Alexander Graf wrote:
> >> 
> >> On 23.11.2012, at 14:44, Michael S. Tsirkin wrote:
> >> 
> >>> On Fri, Nov 23, 2012 at 11:59:51PM +1100, David Gibson wrote:
> >>>>> Look, even if solution using a required property is less elegant for CLI
> >>>>> use, it will work, won't it?
> >>>>> So how about we merge it so that things work, and then we can discuss a
> >>>>> patch on top that auto-generates this property?
> >>>> 
> >>>> Well, there you have a point.  And actually I've realised there are
> >>>> other things we need to assign uniquely for each PHB and don't yet (IO
> >>>> window addresses).  So I need to look at a wider rework of this, which
> >>>> I'll start on next week.
> >>> 
> >>> Fine. Basically my point is it's typically a mistake to
> >>> make some userspace visible parameter depend on order
> >>> of initialization of devices in qemu. I don't insist
> >>> on making users fully specify such parameters but it
> >>> is one way to do this.
> >> 
> >> I think it's reasonable to require to be able to specify it. If you
> >> don't, it's fine to base on device order IMHO.
> > 
> > Let me clarify why it's not fine.  My understanding is these addresses
> > do not change across reboots on real hardware.  On the other hand order
> > of initialization can change because of internal changes in qemu;
> > thus shut down and start guest under another qemu version
> > changes addresses.
> > 
> > So it's a bug. No?
> > 
> >> But you have to have the ability to specify it by hand if your
> >> management tool of choice actually wants reproducible results.
> >> 
> >> 
> >> Alex
> > 
> > How do you know whether your guest relies on reproducible results?
> > You typically don't. Imagine a guest which does rely on this. Then:
> > 
> > What I propose: user runs qemu with many PHBs but with no iobns, gets error
> > 'initialization failed. please add iobn=XXXX where XXXX is a unique
> > number 1 to 64K.'
> > 
> > What you propose: user runs qemu with many PHBs but with no iobns,
> > guest fails to boot or behaves incorrectly.
> > 
> > I think as a user I prefer the first type of failure. No?
> 
> I tend to disagree. The device creation logic should stay identical
> with the same versioned machine. So if I use -M pc-0.12, I am
> guaranteed that QEMU's internal semantics on which device goes where
> does not change.

This is exactly why you should not rely on device initialization
order for your address allocator - it is not guaranteed to
be consistent.

> For unstable machine types (which -M pseries is, same as -M pc), we
> don't guarantee that the guest view stays stable across version
> updates FWIW. So if we want to give the user a stable view of the
> world, we would need to create a -M pseries-1.3 which then would
> always keep the same device creation order.
> 
> It's the same for x86, no?
> 
> 
> Alex

Same for x86 in that we broke guests in the past by changing order,
and the lesson is to always require full specification if possible.
Only reason we keep pci slot allocation around is for
backward compatibility.
Alexander Graf Nov. 23, 2012, 2:27 p.m. UTC | #28
On 23.11.2012, at 15:18, Michael S. Tsirkin wrote:

> On Fri, Nov 23, 2012 at 03:03:24PM +0100, Alexander Graf wrote:
>> 
>> On 23.11.2012, at 15:01, Michael S. Tsirkin wrote:
>> 
>>> On Fri, Nov 23, 2012 at 02:44:15PM +0100, Alexander Graf wrote:
>>>> 
>>>> On 23.11.2012, at 14:44, Michael S. Tsirkin wrote:
>>>> 
>>>>> On Fri, Nov 23, 2012 at 11:59:51PM +1100, David Gibson wrote:
>>>>>>> Look, even if solution using a required property is less elegant for CLI
>>>>>>> use, it will work, won't it?
>>>>>>> So how about we merge it so that things work, and then we can discuss a
>>>>>>> patch on top that auto-generates this property?
>>>>>> 
>>>>>> Well, there you have a point.  And actually I've realised there are
>>>>>> other things we need to assign uniquely for each PHB and don't yet (IO
>>>>>> window addresses).  So I need to look at a wider rework of this, which
>>>>>> I'll start on next week.
>>>>> 
>>>>> Fine. Basically my point is it's typically a mistake to
>>>>> make some userspace visible parameter depend on order
>>>>> of initialization of devices in qemu. I don't insist
>>>>> on making users fully specify such parameters but it
>>>>> is one way to do this.
>>>> 
>>>> I think it's reasonable to require to be able to specify it. If you
>>>> don't, it's fine to base on device order IMHO.
>>> 
>>> Let me clarify why it's not fine.  My understanding is these addresses
>>> do not change across reboots on real hardware.  On the other hand order
>>> of initialization can change because of internal changes in qemu;
>>> thus shut down and start guest under another qemu version
>>> changes addresses.
>>> 
>>> So it's a bug. No?
>>> 
>>>> But you have to have the ability to specify it by hand if your
>>>> management tool of choice actually wants reproducible results.
>>>> 
>>>> 
>>>> Alex
>>> 
>>> How do you know whether your guest relies on reproducible results?
>>> You typically don't. Imagine a guest which does rely on this. Then:
>>> 
>>> What I propose: user runs qemu with many PHBs but with no iobns, gets error
>>> 'initialization failed. please add iobn=XXXX where XXXX is a unique
>>> number 1 to 64K.'
>>> 
>>> What you propose: user runs qemu with many PHBs but with no iobns,
>>> guest fails to boot or behaves incorrectly.
>>> 
>>> I think as a user I prefer the first type of failure. No?
>> 
>> I tend to disagree. The device creation logic should stay identical
>> with the same versioned machine. So if I use -M pc-0.12, I am
>> guaranteed that QEMU's internal semantics on which device goes where
>> does not change.
> 
> This is exactly why you should not rely on device initialization
> order for your address allocator - it is not guaranteed to
> be consistent.
> 
>> For unstable machine types (which -M pseries is, same as -M pc), we
>> don't guarantee that the guest view stays stable across version
>> updates FWIW. So if we want to give the user a stable view of the
>> world, we would need to create a -M pseries-1.3 which then would
>> always keep the same device creation order.
>> 
>> It's the same for x86, no?
>> 
>> 
>> Alex
> 
> Same for x86 in that we broke guests in the past by changing order,
> and the lesson is to always require full specification if possible.
> Only reason we keep pci slot allocation around is for
> backward compatibility.

Yeah, that's why I would expect libvirt for example to always pass in pci slot ids manually for example. But if you want a convenience QEMU command line, that is not guaranteed to be identical across different versions.

So IMHO it's fine to have a fuzzy non-consistent fallback as long as it's possible to specify the consistent variant. I guess that's a matter of taste really :).


Alex
David Gibson Nov. 25, 2012, 11:24 p.m. UTC | #29
On Fri, Nov 23, 2012 at 04:01:58PM +0200, Michael S. Tsirkin wrote:
> On Fri, Nov 23, 2012 at 02:44:15PM +0100, Alexander Graf wrote:
> > 
> > On 23.11.2012, at 14:44, Michael S. Tsirkin wrote:
> > 
> > > On Fri, Nov 23, 2012 at 11:59:51PM +1100, David Gibson wrote:
> > >>> Look, even if solution using a required property is less elegant for CLI
> > >>> use, it will work, won't it?
> > >>> So how about we merge it so that things work, and then we can discuss a
> > >>> patch on top that auto-generates this property?
> > >> 
> > >> Well, there you have a point.  And actually I've realised there are
> > >> other things we need to assign uniquely for each PHB and don't yet (IO
> > >> window addresses).  So I need to look at a wider rework of this, which
> > >> I'll start on next week.
> > > 
> > > Fine. Basically my point is it's typically a mistake to
> > > make some userspace visible parameter depend on order
> > > of initialization of devices in qemu. I don't insist
> > > on making users fully specify such parameters but it
> > > is one way to do this.
> > 
> > I think it's reasonable to require to be able to specify it. If you
> > don't, it's fine to base on device order IMHO.
> 
> Let me clarify why it's not fine.  My understanding is these addresses
> do not change across reboots on real hardware.

Well, the BUID would be expected to remain the same.  The others
probably remain stable across reboots in practice, but I don't think
there's any reason they need to - the kernel will get the LIOBN and
window addresses from the device tree afresh on every boot.
diff mbox

Patch

diff --git a/hw/spapr_pci.c b/hw/spapr_pci.c
index 3c5b855..f6544d7 100644
--- a/hw/spapr_pci.c
+++ b/hw/spapr_pci.c
@@ -521,6 +521,7 @@  static int spapr_phb_init(SysBusDevice *s)
     char *namebuf;
     int i;
     PCIBus *bus;
+    static int phbnum;
 
     sphb->dtbusname = g_strdup_printf("pci@%" PRIx64, sphb->buid);
     namebuf = alloca(strlen(sphb->dtbusname) + 32);
@@ -572,7 +573,7 @@  static int spapr_phb_init(SysBusDevice *s)
                            PCI_DEVFN(0, 0), PCI_NUM_PINS);
     phb->bus = bus;
 
-    sphb->dma_liobn = SPAPR_PCI_BASE_LIOBN | (pci_find_domain(bus) << 16);
+    sphb->dma_liobn = SPAPR_PCI_BASE_LIOBN | (++phbnum << 16);
     sphb->dma_window_start = 0;
     sphb->dma_window_size = 0x40000000;
     sphb->dma = spapr_tce_new_dma_context(sphb->dma_liobn, sphb->dma_window_size);