Patchwork Correctly assign PCI domain numbers

Submitter David Gibson
Date Aug. 1, 2011, 6:51 a.m.
Message ID <1312181462-29889-1-git-send-email-david@gibson.dropbear.id.au>
Permalink /patch/107681/
State New

Comments

David Gibson - Aug. 1, 2011, 6:51 a.m.
qemu already almost supports PCI domains; that is, several entirely
independent PCI host bridges on the same machine.  However, a bug in
pci_bus_new_inplace() means that every host bridge gets assigned domain
number zero and so can't be properly distinguished.  This patch fixes the
bug, giving each new host bridge a new domain number.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/pci.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)
Isaku Yamahata - Aug. 1, 2011, 8:31 a.m.
[Added mst to Cc.]

In order to use multiple PCI domains, several areas need to be addressed
in addition to this patch, for example the BIOS and the ACPI DSDT.
Do you have any plan for addressing those areas?
What's your motivation for multiple PCI domains?
NOTE: I'm not opposed to this patch. Just curious about your motivation/plan.

On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> qemu already almost supports PCI domains; that is, several entirely
> independent PCI host bridges on the same machine.  However, a bug in
> pci_bus_new_inplace() means that every host bridge gets assigned domain
> number zero and so can't be properly distinguished.  This patch fixes the
> bug, giving each new host bridge a new domain number.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/pci.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index 36db58b..2b4aecb 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -262,6 +262,8 @@ int pci_find_domain(const PCIBus *bus)
>      return -1;
>  }
>  
> +static int pci_next_domain; /* = 0 */
> +
>  void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
>                           const char *name,
>                           MemoryRegion *address_space,
> @@ -274,7 +276,8 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
>  
>      /* host bridge */
>      QLIST_INIT(&bus->child);
> -    pci_host_bus_register(0, bus); /* for now only pci domain 0 is supported */
> +
> +    pci_host_bus_register(pci_next_domain++, bus);
>  
>      vmstate_register(NULL, -1, &vmstate_pcibus, bus);
>  }
> -- 
> 1.7.5.4
> 
>
Michael S. Tsirkin - Aug. 1, 2011, 10:10 a.m.
On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> qemu already almost supports PCI domains; that is, several entirely
> independent PCI host bridges on the same machine.  However, a bug in
> pci_bus_new_inplace() means that every host bridge gets assigned domain
> number zero and so can't be properly distinguished.  This patch fixes the
> bug, giving each new host bridge a new domain number.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

OK, but I'd like to see the whole picture.
How does the guest detect multiple domains,
and how does it access them?

> ---
>  hw/pci.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index 36db58b..2b4aecb 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -262,6 +262,8 @@ int pci_find_domain(const PCIBus *bus)
>      return -1;
>  }
>  
> +static int pci_next_domain; /* = 0 */
> +
>  void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
>                           const char *name,
>                           MemoryRegion *address_space,
> @@ -274,7 +276,8 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
>  
>      /* host bridge */
>      QLIST_INIT(&bus->child);
> -    pci_host_bus_register(0, bus); /* for now only pci domain 0 is supported */
> +
> +    pci_host_bus_register(pci_next_domain++, bus);

What happens when that overflows?

>  
>      vmstate_register(NULL, -1, &vmstate_pcibus, bus);
>  }
> -- 
> 1.7.5.4
>
David Gibson - Aug. 1, 2011, 1:32 p.m.
On Mon, Aug 01, 2011 at 05:31:06PM +0900, Isaku Yamahata wrote:
> [Added mst to Cc.]
> 
> In order to use multiple PCI domains, several areas need to be addressed
> in addition to this patch, for example the BIOS and the ACPI DSDT.

For x86, yes.  For powerpc, which is what I'm working on, no.

> Do you have any plan for addressing those areas?

No.  AFAICT this won't break anything that works now, and
is sufficient to be useful for the pseries machine.

> What's your motivation for multiple PCI domains?

Multiple PCI host bridges are typical on IBM pSeries (powerpc)
machines.
David Gibson - Aug. 1, 2011, 1:33 p.m.
On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > qemu already almost supports PCI domains; that is, several entirely
> > independent PCI host bridges on the same machine.  However, a bug in
> > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > number zero and so can't be properly distinguished.  This patch fixes the
> > bug, giving each new host bridge a new domain number.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> OK, but I'd like to see the whole picture.
> How does the guest detect multiple domains,
> and how does it access them?

For the pseries machine, which is what I'm concerned with, each host
bridge is advertised through the device tree passed to the guest.
That gives the necessary handles and addresses for accessing config
space and memory and IO windows for each host bridge.
Michael S. Tsirkin - Aug. 1, 2011, 2:03 p.m.
On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > qemu already almost supports PCI domains; that is, several entirely
> > > independent PCI host bridges on the same machine.  However, a bug in
> > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > number zero and so can't be properly distinguished.  This patch fixes the
> > > bug, giving each new host bridge a new domain number.
> > > 
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > 
> > OK, but I'd like to see the whole picture.
> > How does the guest detect multiple domains,
> > and how does it access them?
> 
> For the pseries machine, which is what I'm concerned with, each host
> bridge is advertised through the device tree passed to the guest.

Could you explain please?
What generates the device tree and passes it to the guest?

> That gives the necessary handles and addresses for accessing config
> space and memory and IO windows for each host bridge.

I see. I think maybe a global counter in the common code
is not exactly the best solution in the general case.


> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
David Gibson - Aug. 1, 2011, 2:15 p.m.
On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> > On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > > qemu already almost supports PCI domains; that is, several entirely
> > > > independent PCI host bridges on the same machine.  However, a bug in
> > > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > > number zero and so can't be properly distinguished.  This patch fixes the
> > > > bug, giving each new host bridge a new domain number.
> > > > 
> > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > 
> > > OK, but I'd like to see the whole picture.
> > > How does the guest detect multiple domains,
> > > and how does it access them?
> > 
> > For the pseries machine, which is what I'm concerned with, each host
> > bridge is advertised through the device tree passed to the guest.
> 
> Could you explain please?
> What generates the device tree and passes it to the guest?

In the case of the pseries machine, it is generated from hw/spapr.c
and loaded into memory for use by the firmware and/or the kernel.

> > That gives the necessary handles and addresses for accessing config
> > space and memory and IO windows for each host bridge.
> 
> I see. I think maybe a global counter in the common code
> is not exactly the best solution in the general case.

Well, which general case do you have in mind?  Since by definition,
PCI domains are entirely independent from each other, domain numbers
are essentially arbitrary as long as they're unique - simply a
convention which makes it easier to describe which host bridge devices
belong to.  I don't see an obvious approach which is better than a
global counter, or at least not one that doesn't involve a significant
rewrite of the PCI subsystem.
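
As an illustration of the global-counter approach argued for above, here is a
minimal standalone sketch in C. It is a model only, not QEMU code: `struct
host_bus`, `host_bus_register()` and `find_domain()` are hypothetical stand-ins
for the PCIBus type and the pci_host_bus_register() / pci_find_domain()
functions touched by the patch, and they assume nothing about QEMU's real data
structures beyond "each host bridge is kept on a list with a domain number".

```c
#include <stddef.h>

/* Hypothetical stand-in for a host bridge's root bus; not QEMU's PCIBus. */
struct host_bus {
    int domain;
    struct host_bus *next;
};

static struct host_bus *host_buses; /* list of registered host bridges */
static int next_domain;             /* zero-initialised, like the patch's counter */

/* Register a host bridge and give it the next unused domain number. */
static int host_bus_register(struct host_bus *bus)
{
    bus->domain = next_domain++;
    bus->next = host_buses;
    host_buses = bus;
    return bus->domain;
}

/* Look up a bus's domain; returns -1 if the bus was never registered. */
static int find_domain(const struct host_bus *bus)
{
    const struct host_bus *b;

    for (b = host_buses; b != NULL; b = b->next) {
        if (b == bus) {
            return b->domain;
        }
    }
    return -1;
}
```

In this model the numbers are unique simply because the counter never repeats,
and they follow creation order, which is the property the rest of the thread
goes on to question.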
Stefan Hajnoczi - Aug. 3, 2011, 10:13 a.m.
On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > qemu already almost supports PCI domains; that is, several entirely
> > independent PCI host bridges on the same machine.  However, a bug in
> > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > number zero and so can't be properly distinguished.  This patch fixes the
> > bug, giving each new host bridge a new domain number.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> OK, but I'd like to see the whole picture.
> How does the guest detect multiple domains,
> and how does it access them?
> 
> > ---
> >  hw/pci.c |    5 ++++-
> >  1 files changed, 4 insertions(+), 1 deletions(-)
> > 
> > diff --git a/hw/pci.c b/hw/pci.c
> > index 36db58b..2b4aecb 100644
> > --- a/hw/pci.c
> > +++ b/hw/pci.c
> > @@ -262,6 +262,8 @@ int pci_find_domain(const PCIBus *bus)
> >      return -1;
> >  }
> >  
> > +static int pci_next_domain; /* = 0 */
> > +
> >  void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
> >                           const char *name,
> >                           MemoryRegion *address_space,
> > @@ -274,7 +276,8 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
> >  
> >      /* host bridge */
> >      QLIST_INIT(&bus->child);
> > -    pci_host_bus_register(0, bus); /* for now only pci domain 0 is supported */
> > +
> > +    pci_host_bus_register(pci_next_domain++, bus);
> 
> What happens when that overflows?

In what scenario do we reach such a high number?  (I'm not saying this
is harmless, just trying to understand if it should be an assert or
error return.)

Stefan
Michael S. Tsirkin - Aug. 3, 2011, 10:21 a.m.
On Wed, Aug 03, 2011 at 11:13:15AM +0100, Stefan Hajnoczi wrote:
> On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > qemu already almost supports PCI domains; that is, several entirely
> > > independent PCI host bridges on the same machine.  However, a bug in
> > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > number zero and so can't be properly distinguished.  This patch fixes the
> > > bug, giving each new host bridge a new domain number.
> > > 
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > 
> > OK, but I'd like to see the whole picture.
> > How does the guest detect multiple domains,
> > and how does it access them?
> > 
> > > ---
> > >  hw/pci.c |    5 ++++-
> > >  1 files changed, 4 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/hw/pci.c b/hw/pci.c
> > > index 36db58b..2b4aecb 100644
> > > --- a/hw/pci.c
> > > +++ b/hw/pci.c
> > > @@ -262,6 +262,8 @@ int pci_find_domain(const PCIBus *bus)
> > >      return -1;
> > >  }
> > >  
> > > +static int pci_next_domain; /* = 0 */
> > > +
> > >  void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
> > >                           const char *name,
> > >                           MemoryRegion *address_space,
> > > @@ -274,7 +276,8 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
> > >  
> > >      /* host bridge */
> > >      QLIST_INIT(&bus->child);
> > > -    pci_host_bus_register(0, bus); /* for now only pci domain 0 is supported */
> > > +
> > > +    pci_host_bus_register(pci_next_domain++, bus);
> > 
> > What happens when that overflows?
> 
> In what scenario do we reach such a high number?  (I'm not saying this
> is harmless, just trying to understand if it should be an assert or
> error return.)
> 
> Stefan

Can a bus ever get hot-plugged?
Michael S. Tsirkin - Aug. 3, 2011, 1:28 p.m.
On Tue, Aug 02, 2011 at 12:15:22AM +1000, David Gibson wrote:
> On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> > On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> > > On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > > > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > > > qemu already almost supports PCI domains; that is, several entirely
> > > > > independent PCI host bridges on the same machine.  However, a bug in
> > > > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > > > number zero and so can't be properly distinguished.  This patch fixes the
> > > > > bug, giving each new host bridge a new domain number.
> > > > > 
> > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > 
> > > > OK, but I'd like to see the whole picture.
> > > > How does the guest detect multiple domains,
> > > > and how does it access them?
> > > 
> > > For the pseries machine, which is what I'm concerned with, each host
> > > bridge is advertised through the device tree passed to the guest.
> > 
> > Could you explain please?
> > What generates the device tree and passes it to the guest?
> 
> In the case of the pseries machine, it is generated from hw/spapr.c
> and loaded into memory for use by the firmware and/or the kernel.
> 
> > > That gives the necessary handles and addresses for accessing config
> > > space and memory and IO windows for each host bridge.
> > 
> > I see. I think maybe a global counter in the common code
> > is not exactly the best solution in the general case.
> 
> Well, which general case do you have in mind. Since by definition,
> PCI domains are entirely independent from each other, domain numbers
> are essentially arbitrary as long as they're unique - simply a
> convention which makes it easier to describe which host bridge devices
> belong on.  I don't see an obvious approach which is better than a
> global counter, or at least not one that doesn't involve a significant
> rewrite of the PCI subsystem.


OK, let's make sure I understand. On your system
'domain numbers' are completely invisible to the
guest, right? You only need them to address
devices on qemu monitor ...

For that, I'm trying to move away from using
a domain number.
Would it be possible to simply give the bus an id,
and use bus=<id> instead?


BTW, how does a linux guest number domains?
Would it make sense to match that?
David Gibson - Aug. 4, 2011, 9 a.m.
On Wed, Aug 03, 2011 at 04:28:33PM +0300, Michael S. Tsirkin wrote:
> On Tue, Aug 02, 2011 at 12:15:22AM +1000, David Gibson wrote:
> > On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> > > On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> > > > On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > > > > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > > > > qemu already almost supports PCI domains; that is, several entirely
> > > > > > independent PCI host bridges on the same machine.  However, a bug in
> > > > > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > > > > number zero and so can't be properly distinguished.  This patch fixes the
> > > > > > bug, giving each new host bridge a new domain number.
> > > > > > 
> > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > 
> > > > > OK, but I'd like to see the whole picture.
> > > > > How does the guest detect multiple domains,
> > > > > and how does it access them?
> > > > 
> > > > For the pseries machine, which is what I'm concerned with, each host
> > > > bridge is advertised through the device tree passed to the guest.
> > > 
> > > Could you explain please?
> > > What generates the device tree and passes it to the guest?
> > 
> > In the case of the pseries machine, it is generated from hw/spapr.c
> > and loaded into memory for use by the firmware and/or the kernel.
> > 
> > > > That gives the necessary handles and addresses for accessing config
> > > > space and memory and IO windows for each host bridge.
> > > 
> > > I see. I think maybe a global counter in the common code
> > > is not exactly the best solution in the general case.
> > 
> > Well, which general case do you have in mind. Since by definition,
> > PCI domains are entirely independent from each other, domain numbers
> > are essentially arbitrary as long as they're unique - simply a
> > convention which makes it easier to describe which host bridge devices
> > belong on.  I don't see an obvious approach which is better than a
> > global counter, or at least not one that doesn't involve a significant
> > rewrite of the PCI subsystem.
> 
> OK, let's make sure I understand. On your system 'domain numbers'
> are completely invisible to the guest, right? You only need them to
> address devices on qemu monitor ...

Well.. the qemu domain number is not officially visible to the guest.
However the handles that are visible to the guest will need to be
derived from some sort of unique domain number.

> For that, I'm trying to move away from using a domain number.  Would
> it be possible to simply give bus an id, and use bus=<id> instead?

It might be.  In this case we should remove the domain numbers (as
used by pci_find_domain()) from qemu entirely, since they are broken
as they stand without this patch.

> BTW, how does a linux guest number domains?
> Would it make sense to match that?

I'll look into it.  It would be nice to have them match, obviously but
I'm not sure if there will be a way to do this that's both reasonable
and robust.  I suspect they will match already though not in a
terribly robust way, at least for the pseries machine, because qemu
will create the host bridge nodes in the same order as domain number,
and I suspect Linux will just allocate domain numbers sequentially in
that same order.
Michael S. Tsirkin - Aug. 4, 2011, 7:14 p.m.
On Thu, Aug 04, 2011 at 07:00:38PM +1000, David Gibson wrote:
> On Wed, Aug 03, 2011 at 04:28:33PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Aug 02, 2011 at 12:15:22AM +1000, David Gibson wrote:
> > > On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> > > > On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> > > > > On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > > > > > qemu already almost supports PCI domains; that is, several entirely
> > > > > > > independent PCI host bridges on the same machine.  However, a bug in
> > > > > > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > > > > > number zero and so can't be properly distinguished.  This patch fixes the
> > > > > > > bug, giving each new host bridge a new domain number.
> > > > > > > 
> > > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > 
> > > > > > OK, but I'd like to see the whole picture.
> > > > > > How does the guest detect multiple domains,
> > > > > > and how does it access them?
> > > > > 
> > > > > For the pseries machine, which is what I'm concerned with, each host
> > > > > bridge is advertised through the device tree passed to the guest.
> > > > 
> > > > Could you explain please?
> > > > What generates the device tree and passes it to the guest?
> > > 
> > > In the case of the pseries machine, it is generated from hw/spapr.c
> > > and loaded into memory for use by the firmware and/or the kernel.
> > > 
> > > > > That gives the necessary handles and addresses for accessing config
> > > > > space and memory and IO windows for each host bridge.
> > > > 
> > > > I see. I think maybe a global counter in the common code
> > > > is not exactly the best solution in the general case.
> > > 
> > > Well, which general case do you have in mind. Since by definition,
> > > PCI domains are entirely independent from each other, domain numbers
> > > are essentially arbitrary as long as they're unique - simply a
> > > convention which makes it easier to describe which host bridge devices
> > > belong on.  I don't see an obvious approach which is better than a
> > > global counter, or at least not one that doesn't involve a significant
> > > rewrite of the PCI subsystem.
> > 
> > OK, let's make sure I understand. On your system 'domain numbers'
> > are completely invisible to the guest, right? You only need them to
> > address devices on qemu monitor ...
> 
> Well.. the qemu domain number is not officially visible to the guest.
> However the handles that are visible to the guest will need to be
> derived from some sort of unique domain number.

Interesting. How does it work with your patch?

> > For that, I'm trying to move away from using a domain number.  Would
> > it be possible to simply give bus an id, and use bus=<id> instead?
> 
> It might be.  In this case we should remove the domain numbers (as
> used by pci_find_domain()) from qemu entirely,

Or at least, move to acpi-specific code.

> since they are broken

I agree, they are broken.

> as they stand without this patch.
> 
> > BTW, how does a linux guest number domains?
> > Would it make sense to match that?
> 
> I'll look into it.  It would be nice to have them match, obviously but
> I'm not sure if there will be a way to do this that's both reasonable
> and robust.  I suspect they will match already though not in a
> terribly robust way, at least for the pseries machine, because qemu
> will create the host bridge nodes in the same order as domain number,
> and I suspect Linux will just allocate domain numbers sequentially in
> that same order.

If the order of things in the tree matters for some guests, we should
give users a way to control that order, or at least make
the order robust.

> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
David Gibson - Aug. 10, 2011, 3:05 a.m.
On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > qemu already almost supports PCI domains; that is, several entirely
> > independent PCI host bridges on the same machine.  However, a bug in
> > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > number zero and so can't be properly distinguished.  This patch fixes the
> > bug, giving each new host bridge a new domain number.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> OK, but I'd like to see the whole picture.
> How does the guest detect multiple domains,
> and how does it access them?
> 
> > ---
> >  hw/pci.c |    5 ++++-
> >  1 files changed, 4 insertions(+), 1 deletions(-)
> > 
> > diff --git a/hw/pci.c b/hw/pci.c
> > index 36db58b..2b4aecb 100644
> > --- a/hw/pci.c
> > +++ b/hw/pci.c
> > @@ -262,6 +262,8 @@ int pci_find_domain(const PCIBus *bus)
> >      return -1;
> >  }
> >  
> > +static int pci_next_domain; /* = 0 */
> > +
> >  void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
> >                           const char *name,
> >                           MemoryRegion *address_space,
> > @@ -274,7 +276,8 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
> >  
> >      /* host bridge */
> >      QLIST_INIT(&bus->child);
> > -    pci_host_bus_register(0, bus); /* for now only pci domain 0 is supported */
> > +
> > +    pci_host_bus_register(pci_next_domain++, bus);
> 
> What happens when that overflows?

Well, I guess we get an overlap, and therefore multiple domains with
the same number.

So, exactly what happens now, only four billion times less often.
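
For the assert-or-error-return question raised earlier in the thread, a guarded
allocator might look like the sketch below. It is illustrative only:
`domain_alloc()` and the `DOMAIN_MAX` limit are assumptions made for this
example, not names or values from hw/pci.c. Note also that, strictly speaking,
incrementing a signed int past INT_MAX is undefined behaviour in C rather than
a guaranteed wrap, so an explicit bound avoids relying on it.

```c
#include <stdbool.h>

/* Assumed upper bound; a real limit would depend on the machine model. */
#define DOMAIN_MAX 0xffff

static int next_domain; /* zero-initialised */

/* Store the next free domain in *out; return false once the range is used up. */
static bool domain_alloc(int *out)
{
    if (next_domain > DOMAIN_MAX) {
        return false; /* exhausted: *out is left untouched */
    }
    *out = next_domain++;
    return true;
}
```

A caller could turn a false return into either an assertion failure or a
propagated error, depending on whether host bridges can ever be created after
machine initialisation.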
Michael S. Tsirkin - Aug. 10, 2011, 8:34 a.m.
On Thu, Aug 04, 2011 at 07:00:38PM +1000, David Gibson wrote:
> On Wed, Aug 03, 2011 at 04:28:33PM +0300, Michael S. Tsirkin wrote:
> > On Tue, Aug 02, 2011 at 12:15:22AM +1000, David Gibson wrote:
> > > On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> > > > On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> > > > > On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > > > > > qemu already almost supports PCI domains; that is, several entirely
> > > > > > > independent PCI host bridges on the same machine.  However, a bug in
> > > > > > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > > > > > number zero and so can't be properly distinguished.  This patch fixes the
> > > > > > > bug, giving each new host bridge a new domain number.
> > > > > > > 
> > > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > 
> > > > > > OK, but I'd like to see the whole picture.
> > > > > > How does the guest detect multiple domains,
> > > > > > and how does it access them?
> > > > > 
> > > > > For the pseries machine, which is what I'm concerned with, each host
> > > > > bridge is advertised through the device tree passed to the guest.
> > > > 
> > > > Could you explain please?
> > > > What generates the device tree and passes it to the guest?
> > > 
> > > In the case of the pseries machine, it is generated from hw/spapr.c
> > > and loaded into memory for use by the firmware and/or the kernel.
> > > 
> > > > > That gives the necessary handles and addresses for accessing config
> > > > > space and memory and IO windows for each host bridge.
> > > > 
> > > > I see. I think maybe a global counter in the common code
> > > > is not exactly the best solution in the general case.
> > > 
> > > Well, which general case do you have in mind. Since by definition,
> > > PCI domains are entirely independent from each other, domain numbers
> > > are essentially arbitrary as long as they're unique - simply a
> > > convention which makes it easier to describe which host bridge devices
> > > belong on.  I don't see an obvious approach which is better than a
> > > global counter, or at least not one that doesn't involve a significant
> > > rewrite of the PCI subsystem.
> > 
> > OK, let's make sure I understand. On your system 'domain numbers'
> > are completely invisible to the guest, right? You only need them to
> > address devices on qemu monitor ...
> 
> Well.. the qemu domain number is not officially visible to the guest.
> However the handles that are visible to the guest will need to be
> derived from some sort of unique domain number.
> 
> > For that, I'm trying to move away from using a domain number.  Would
> > it be possible to simply give bus an id, and use bus=<id> instead?
> 
> It might be.  In this case we should remove the domain numbers (as
> used by pci_find_domain()) from qemu entirely, since they are broken
> as they stand without this patch.
> 
> > BTW, how does a linux guest number domains?
> > Would it make sense to match that?
> 
> I'll look into it.  It would be nice to have them match, obviously but
> I'm not sure if there will be a way to do this that's both reasonable
> and robust.  I suspect they will match already though not in a
> terribly robust way, at least for the pseries machine, because qemu
> will create the host bridge nodes in the same order as domain number,
> and I suspect Linux will just allocate domain numbers sequentially in
> that same order.

OK, so what's the plan at the moment?
How about we pass domain number from callers,
and make sure buses are enumerated in this order?
This will make sure linux enumerates them in
the same order.
David Gibson - Aug. 11, 2011, 6:38 a.m.
On Wed, Aug 10, 2011 at 11:34:23AM +0300, Michael S. Tsirkin wrote:
> On Thu, Aug 04, 2011 at 07:00:38PM +1000, David Gibson wrote:
> > On Wed, Aug 03, 2011 at 04:28:33PM +0300, Michael S. Tsirkin wrote:
> > > On Tue, Aug 02, 2011 at 12:15:22AM +1000, David Gibson wrote:
> > > > On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> > > > > On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> > > > > > On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > > > > > > qemu already almost supports PCI domains; that is, several entirely
> > > > > > > > independent PCI host bridges on the same machine.  However, a bug in
> > > > > > > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > > > > > > number zero and so can't be properly distinguished.  This patch fixes the
> > > > > > > > bug, giving each new host bridge a new domain number.
> > > > > > > > 
> > > > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > > 
> > > > > > > OK, but I'd like to see the whole picture.
> > > > > > > How does the guest detect multiple domains,
> > > > > > > and how does it access them?
> > > > > > 
> > > > > > For the pseries machine, which is what I'm concerned with, each host
> > > > > > bridge is advertised through the device tree passed to the guest.
> > > > > 
> > > > > Could you explain please?
> > > > > What generates the device tree and passes it to the guest?
> > > > 
> > > > In the case of the pseries machine, it is generated from hw/spapr.c
> > > > and loaded into memory for use by the firmware and/or the kernel.
> > > > 
> > > > > > That gives the necessary handles and addresses for accessing config
> > > > > > space and memory and IO windows for each host bridge.
> > > > > 
> > > > > I see. I think maybe a global counter in the common code
> > > > > is not exactly the best solution in the general case.
> > > > 
> > > > Well, which general case do you have in mind. Since by definition,
> > > > PCI domains are entirely independent from each other, domain numbers
> > > > are essentially arbitrary as long as they're unique - simply a
> > > > convention which makes it easier to describe which host bridge devices
> > > > belong on.  I don't see an obvious approach which is better than a
> > > > global counter, or at least not one that doesn't involve a significant
> > > > rewrite of the PCI subsystem.
> > > 
> > > OK, let's make sure I understand. On your system 'domain numbers'
> > > are completely invisible to the guest, right? You only need them to
> > > address devices on qemu monitor ...
> > 
> > Well.. the qemu domain number is not officially visible to the guest.
> > However the handles that are visible to the guest will need to be
> > derived from some sort of unique domain number.
> > 
> > > For that, I'm trying to move away from using a domain number.  Would
> > > it be possible to simply give bus an id, and use bus=<id> instead?
> > 
> > It might be.  In this case we should remove the domain numbers (as
> > used by pci_find_domain()) from qemu entirely, since they are broken
> > as they stand without this patch.
> > 
> > > BTW, how does a linux guest number domains?
> > > Would it make sense to match that?
> > 
> > I'll look into it.  It would be nice to have them match, obviously but
> > I'm not sure if there will be a way to do this that's both reasonable
> > and robust.  I suspect they will match already though not in a
> > terribly robust way, at least for the pseries machine, because qemu
> > will create the host bridge nodes in the same order as domain number,
> > and I suspect Linux will just allocate domain numbers sequentially in
> > that same order.
> 
> OK, so what's the plan at the moment?

Well, you tell me...

> How about we pass domain number from callers,

From callers of what exactly?

> and make sure buses are enumerated in this order?
> This will make sure linux enumerates them in
> the same order.

I don't think we can do that in general.  After all enumeration order
of domains is essentially a guest internal matter, which we can only
guess at.
Michael S. Tsirkin - Oct. 2, 2011, 10:35 a.m.
On Thu, Aug 11, 2011 at 04:38:34PM +1000, David Gibson wrote:
> On Wed, Aug 10, 2011 at 11:34:23AM +0300, Michael S. Tsirkin wrote:
> > On Thu, Aug 04, 2011 at 07:00:38PM +1000, David Gibson wrote:
> > > On Wed, Aug 03, 2011 at 04:28:33PM +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Aug 02, 2011 at 12:15:22AM +1000, David Gibson wrote:
> > > > > On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> > > > > > > On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> > > > > > > > On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > > > > > > > > qemu already almost supports PCI domains; that is, several entirely
> > > > > > > > > independent PCI host bridges on the same machine.  However, a bug in
> > > > > > > > > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > > > > > > > > number zero and so can't be properly distinguished.  This patch fixes the
> > > > > > > > > bug, giving each new host bridge a new domain number.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > > > 
> > > > > > > > OK, but I'd like to see the whole picture.
> > > > > > > > How does the guest detect multiple domains,
> > > > > > > > and how does it access them?
> > > > > > > 
> > > > > > > For the pseries machine, which is what I'm concerned with, each host
> > > > > > > bridge is advertised through the device tree passed to the guest.
> > > > > > 
> > > > > > Could you explain please?
> > > > > > What generates the device tree and passes it to the guest?
> > > > > 
> > > > > In the case of the pseries machine, it is generated from hw/spapr.c
> > > > > and loaded into memory for use by the firmware and/or the kernel.
> > > > > 
> > > > > > > > That gives the necessary handles and addresses for accessing config
> > > > > > > space and memory and IO windows for each host bridge.
> > > > > > 
> > > > > > I see. I think maybe a global counter in the common code
> > > > > > is not exactly the best solution in the general case.
> > > > > 
> > > > > > Well, which general case do you have in mind?  Since by definition,
> > > > > PCI domains are entirely independent from each other, domain numbers
> > > > > are essentially arbitrary as long as they're unique - simply a
> > > > > convention which makes it easier to describe which host bridge devices
> > > > > belong on.  I don't see an obvious approach which is better than a
> > > > > > global counter, or at least not one that doesn't involve a significant
> > > > > rewrite of the PCI subsystem.
> > > > 
> > > > OK, let's make sure I understand. On your system 'domain numbers'
> > > > are completely invisible to the guest, right? You only need them to
> > > > address devices on qemu monitor ...
> > > 
> > > Well.. the qemu domain number is not officially visible to the guest.
> > > However the handles that are visible to the guest will need to be
> > > derived from some sort of unique domain number.
> > > 
> > > > For that, I'm trying to move away from using a domain number.  Would
> > > > it be possible to simply give bus an id, and use bus=<id> instead?
> > > 
> > > It might be.  In this case we should remove the domain numbers (as
> > > used by pci_find_domain()) from qemu entirely, since they are broken
> > > as they stand without this patch.
> > > 
> > > > BTW, how does a linux guest number domains?
> > > > Would it make sense to match that?
> > > 
> > > I'll look into it.  It would be nice to have them match, obviously but
> > > I'm not sure if there will be a way to do this that's both reasonable
> > > and robust.  I suspect they will match already though not in a
> > > > terribly robust way, at least for the pseries machine, because qemu
> > > will create the host bridge nodes in the same order as domain number,
> > > and I suspect Linux will just allocate domain numbers sequentially in
> > > that same order.
> > 
> > OK, so what's the plan at the moment?
> 
> Well, you tell me...

You wanted to look at how Linux enumerates domains, no?
Any success?

> > How about we pass domain number from callers,
> 
> From callers of what exactly?

pci_bus_new_inplace I guess.

> > and make sure buses are enumerated in this order?
> > This will make sure linux enumerates them in
> > the same order.
> 
> I don't think we can do that in general.  After all enumeration order
> of domains is essentially a guest internal matter, which we can only
> guess at.

It seems clear that using a domain number in qemu was a mistake.
We can already pass a bus= argument to hotplug to specify the bus,
so only the address on that bus (slot # / function #) is needed.

Luckily, ATM the only supported domain # is 0, so we can just
ignore it.

My concern is that if we expose a domain number > 0 to the monitor,
users will come to depend on this number, which is
an implementation detail.
So how about we just have, in qemu, a default domain that is used
when no bus is specified, with any other domain having to be
specified explicitly?

> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson
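
The suggestion above (a default domain when no bus is specified, and an explicit bus= for anything else) could be modelled roughly as below. This is only a sketch: the ToyHostBus type, the ids "pci" and "pci2", and toy_find_bus() are hypothetical names invented for illustration, not qemu's monitor or qdev code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model only; these ids and this lookup are not qemu's API. */
typedef struct ToyHostBus {
    const char *id;
} ToyHostBus;

static ToyHostBus toy_buses[] = {
    { "pci" },   /* default host bus, used when no id is given */
    { "pci2" },  /* any other host bus must be named explicitly */
};

/* Resolve a bus by id.  A NULL id selects the default host bus; an
 * unknown id yields NULL.  No domain number is involved anywhere. */
static ToyHostBus *toy_find_bus(const char *id)
{
    size_t i;

    if (id == NULL) {
        return &toy_buses[0];
    }
    for (i = 0; i < sizeof(toy_buses) / sizeof(toy_buses[0]); i++) {
        if (strcmp(toy_buses[i].id, id) == 0) {
            return &toy_buses[i];
        }
    }
    return NULL;
}
```

With this shape, a device address only needs the bus id plus slot and function, so users never see or come to depend on an internal domain number.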

Patch

diff --git a/hw/pci.c b/hw/pci.c
index 36db58b..2b4aecb 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -262,6 +262,8 @@  int pci_find_domain(const PCIBus *bus)
     return -1;
 }
 
+static int pci_next_domain; /* = 0 */
+
 void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
                          const char *name,
                          MemoryRegion *address_space,
@@ -274,7 +276,8 @@  void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
 
     /* host bridge */
     QLIST_INIT(&bus->child);
-    pci_host_bus_register(0, bus); /* for now only pci domain 0 is supported */
+
+    pci_host_bus_register(pci_next_domain++, bus);
 
     vmstate_register(NULL, -1, &vmstate_pcibus, bus);
 }
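
For readers following the thread, here is a minimal stand-alone model of the two approaches discussed: the global counter this patch adds, and a domain number passed in by the caller. The Toy* types and toy_* functions are hypothetical stand-ins made up for this sketch; they are not the real PCIBus, pci_host_bus_register() or pci_find_domain() from hw/pci.c, and the uniqueness check is an assumption about what a caller-supplied variant would probably want.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the real qemu PCIBus or
 * the host-bus list in hw/pci.c. */
typedef struct ToyBus {
    const char *name;
} ToyBus;

typedef struct ToyHost {
    int domain;
    const ToyBus *bus;
} ToyHost;

#define TOY_MAX_HOSTS 8

static ToyHost toy_hosts[TOY_MAX_HOSTS];
static int toy_nr_hosts;
static int toy_next_domain; /* = 0, plays the role of pci_next_domain */

/* Caller-supplied variant: record a host bus under the given domain.
 * Returns the domain, or -1 if that number is already in use. */
static int toy_host_bus_register(int domain, const ToyBus *bus)
{
    int i;

    assert(toy_nr_hosts < TOY_MAX_HOSTS);
    for (i = 0; i < toy_nr_hosts; i++) {
        if (toy_hosts[i].domain == domain) {
            return -1;
        }
    }
    toy_hosts[toy_nr_hosts].domain = domain;
    toy_hosts[toy_nr_hosts].bus = bus;
    toy_nr_hosts++;
    return domain;
}

/* Global-counter variant, as in the patch: each new host bus takes the
 * next counter value. */
static int toy_bus_new(ToyBus *bus, const char *name)
{
    bus->name = name;
    return toy_host_bus_register(toy_next_domain++, bus);
}

/* Look up the domain a host bus was registered under; -1 if unknown. */
static int toy_find_domain(const ToyBus *bus)
{
    int i;

    for (i = 0; i < toy_nr_hosts; i++) {
        if (toy_hosts[i].bus == bus) {
            return toy_hosts[i].domain;
        }
    }
    return -1;
}
```

In this model, two calls to toy_bus_new() yield domains 0 and 1 in creation order, which is the only ordering guarantee the counter gives; whether a guest numbers its domains the same way is outside what the model (or the patch) can enforce.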