Patchwork PCI: Bus number from the bridge, not the device

login
register
mail settings
Submitter Alex Williamson
Date Oct. 4, 2010, 9:53 p.m.
Message ID <20101004215311.17070.54862.stgit@s20.home>
Download mbox | patch
Permalink /patch/66746/
State New
Headers show

Comments

Alex Williamson - Oct. 4, 2010, 9:53 p.m.
pcibus_dev_print() was erroneously retrieving the device bus
number from the secondary bus number offset of the device
instead of the bridge above the device.  This ends of landing
in the 2nd byte of the 3rd BAR for devices, which thankfully
is usually zero.  pcibus_get_dev_path() copied this code,
inheriting the same bug.  pcibus_get_dev_path() is used for
ramblock naming, so changing it can effect migration.  However,
I've only seen this byte be non-zero for an assigned device,
which can't migrate anyway, so hopefully we won't run into
any issues.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---

 hw/pci.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)
Michael S. Tsirkin - Nov. 8, 2010, 11:22 a.m.
On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
> pcibus_dev_print() was erroneously retrieving the device bus
> number from the secondary bus number offset of the device
> instead of the bridge above the device.  This ends of landing
> in the 2nd byte of the 3rd BAR for devices, which thankfully
> is usually zero.  pcibus_get_dev_path() copied this code,
> inheriting the same bug.  pcibus_get_dev_path() is used for
> ramblock naming, so changing it can effect migration.  However,
> I've only seen this byte be non-zero for an assigned device,
> which can't migrate anyway, so hopefully we won't run into
> any issues.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Good catch. Applied.
I don't really see why do we put the dev path
in the bus object: why not let device supply its name?

And I think this will affect nested bridges. However they are currently
broken anyway: we really must convert to topological names as bus number
is guest-assigned - they don't have to be unique, even.

What does fixing this involve? Just changing pcibus_get_dev_path?

> ---
> 
>  hw/pci.c |    5 ++---
>  1 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/pci.c b/hw/pci.c
> index 6d0934d..15416dd 100644
> --- a/hw/pci.c
> +++ b/hw/pci.c
> @@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
>  
>      monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
>                     "pci id %04x:%04x (sub %04x:%04x)\n",
> -                   indent, "", ctxt,
> -                   d->config[PCI_SECONDARY_BUS],
> +                   indent, "", ctxt, pci_bus_num(d->bus),
>                     PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
>                     pci_get_word(d->config + PCI_VENDOR_ID),
>                     pci_get_word(d->config + PCI_DEVICE_ID),
> @@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
>      char path[16];
>  
>      snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
> -             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
> +             pci_find_domain(d->bus), pci_bus_num(d->bus),
>               PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
>  
>      return strdup(path);
>
Alex Williamson - Nov. 8, 2010, 2:52 p.m.
On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
> > pcibus_dev_print() was erroneously retrieving the device bus
> > number from the secondary bus number offset of the device
> > instead of the bridge above the device.  This ends of landing
> > in the 2nd byte of the 3rd BAR for devices, which thankfully
> > is usually zero.  pcibus_get_dev_path() copied this code,
> > inheriting the same bug.  pcibus_get_dev_path() is used for
> > ramblock naming, so changing it can effect migration.  However,
> > I've only seen this byte be non-zero for an assigned device,
> > which can't migrate anyway, so hopefully we won't run into
> > any issues.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> 
> Good catch. Applied.
> I don't really see why do we put the dev path
> in the bus object: why not let device supply its name?

Because the device name is not unique.  This came about from the
discussion about how to create a canonical device path that Gleb and
Markus are again trying to hash out.  If we go up to the bus and get the
bus address, we have a VM unique name.  Unfortunately, it's difficult to
define what the bus should print in all cases (ISA), but since they
don't do hotplug and typically don't allocate ramblocks, we can mostly
ignore it for this use case.

> And I think this will affect nested bridges. However they are currently
> broken anyway: we really must convert to topological names as bus number
> is guest-assigned - they don't have to be unique, even.

Yes, nested bridges are a problem.  How can the seg/bus/devfn not be
unique?

> What does fixing this involve? Just changing pcibus_get_dev_path?

How do you plan to fix it?  Don't forget that migration depends on these
names, so some kind of compatibility layer would be required.  Thanks,

Alex

> > ---
> > 
> >  hw/pci.c |    5 ++---
> >  1 files changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/pci.c b/hw/pci.c
> > index 6d0934d..15416dd 100644
> > --- a/hw/pci.c
> > +++ b/hw/pci.c
> > @@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
> >  
> >      monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
> >                     "pci id %04x:%04x (sub %04x:%04x)\n",
> > -                   indent, "", ctxt,
> > -                   d->config[PCI_SECONDARY_BUS],
> > +                   indent, "", ctxt, pci_bus_num(d->bus),
> >                     PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
> >                     pci_get_word(d->config + PCI_VENDOR_ID),
> >                     pci_get_word(d->config + PCI_DEVICE_ID),
> > @@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
> >      char path[16];
> >  
> >      snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
> > -             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
> > +             pci_find_domain(d->bus), pci_bus_num(d->bus),
> >               PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
> >  
> >      return strdup(path);
> >
Alex Williamson - Dec. 13, 2010, 8:04 p.m.
On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
> > pcibus_dev_print() was erroneously retrieving the device bus
> > number from the secondary bus number offset of the device
> > instead of the bridge above the device.  This ends of landing
> > in the 2nd byte of the 3rd BAR for devices, which thankfully
> > is usually zero.  pcibus_get_dev_path() copied this code,
> > inheriting the same bug.  pcibus_get_dev_path() is used for
> > ramblock naming, so changing it can effect migration.  However,
> > I've only seen this byte be non-zero for an assigned device,
> > which can't migrate anyway, so hopefully we won't run into
> > any issues.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> 
> Good catch. Applied.

Um... submitted vs applied:

     PCI: Bus number from the bridge, not the device
 
@@ -6,20 +8,28 @@
     number from the secondary bus number offset of the device
     instead of the bridge above the device.  This ends of landing
     in the 2nd byte of the 3rd BAR for devices, which thankfully
-    is usually zero.  pcibus_get_dev_path() copied this code,
+    is usually zero.
+    
+    Note: pcibus_get_dev_path() copied this code,
     inheriting the same bug.  pcibus_get_dev_path() is used for
     ramblock naming, so changing it can effect migration.  However,
     I've only seen this byte be non-zero for an assigned device,
     which can't migrate anyway, so hopefully we won't run into
     any issues.
 
+    This patch does not touch pcibus_get_dev_path, as
+    bus number is guest assigned for nested buses,
+    so using it for migration is broken anyway.
+    Fix it properly later.
+    
     Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
+    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 
 diff --git a/hw/pci.c b/hw/pci.c
-index 6d0934d..15416dd 100644
+index 962886e..8f6fcf8 100644
 --- a/hw/pci.c
 +++ b/hw/pci.c
-@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
+@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
  
      monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
                     "pci id %04x:%04x (sub %04x:%04x)\n",
@@ -29,14 +39,3 @@
                     PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
                     pci_get_word(d->config + PCI_VENDOR_ID),
                     pci_get_word(d->config + PCI_DEVICE_ID),
-@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
-     char path[16];
- 
-     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
--             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
-+             pci_find_domain(d->bus), pci_bus_num(d->bus),
-              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
- 
-     return strdup(path);
-
-

So the chunk that fixed the part that I was actually interested in got
dropped even though the existing code is clearly wrong.  Yes, we still
have issues with nested bridges (not that we have many of those), but
until the "Fix it properly later" part comes along, can we please
include the obvious bug fix?  Thanks,

Alex
Michael S. Tsirkin - Dec. 14, 2010, 4:46 a.m.
On Mon, Dec 13, 2010 at 01:04:23PM -0700, Alex Williamson wrote:
> On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
> > On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
> > > pcibus_dev_print() was erroneously retrieving the device bus
> > > number from the secondary bus number offset of the device
> > > instead of the bridge above the device.  This ends of landing
> > > in the 2nd byte of the 3rd BAR for devices, which thankfully
> > > is usually zero.  pcibus_get_dev_path() copied this code,
> > > inheriting the same bug.  pcibus_get_dev_path() is used for
> > > ramblock naming, so changing it can effect migration.  However,
> > > I've only seen this byte be non-zero for an assigned device,
> > > which can't migrate anyway, so hopefully we won't run into
> > > any issues.
> > > 
> > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > 
> > Good catch. Applied.
> 
> Um... submitted vs applied:
> 
>      PCI: Bus number from the bridge, not the device
>  
> @@ -6,20 +8,28 @@
>      number from the secondary bus number offset of the device
>      instead of the bridge above the device.  This ends of landing
>      in the 2nd byte of the 3rd BAR for devices, which thankfully
> -    is usually zero.  pcibus_get_dev_path() copied this code,
> +    is usually zero.
> +    
> +    Note: pcibus_get_dev_path() copied this code,
>      inheriting the same bug.  pcibus_get_dev_path() is used for
>      ramblock naming, so changing it can effect migration.  However,
>      I've only seen this byte be non-zero for an assigned device,
>      which can't migrate anyway, so hopefully we won't run into
>      any issues.
>  
> +    This patch does not touch pcibus_get_dev_path, as
> +    bus number is guest assigned for nested buses,
> +    so using it for migration is broken anyway.
> +    Fix it properly later.
> +    
>      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> +    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>  
>  diff --git a/hw/pci.c b/hw/pci.c
> -index 6d0934d..15416dd 100644
> +index 962886e..8f6fcf8 100644
>  --- a/hw/pci.c
>  +++ b/hw/pci.c
> -@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
> +@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
>   
>       monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
>                      "pci id %04x:%04x (sub %04x:%04x)\n",
> @@ -29,14 +39,3 @@
>                      PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
>                      pci_get_word(d->config + PCI_VENDOR_ID),
>                      pci_get_word(d->config + PCI_DEVICE_ID),
> -@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
> -     char path[16];
> - 
> -     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
> --             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
> -+             pci_find_domain(d->bus), pci_bus_num(d->bus),
> -              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
> - 
> -     return strdup(path);
> -
> -
> 
> So the chunk that fixed the part that I was actually interested in got
> dropped even though the existing code is clearly wrong.  Yes, we still
> have issues with nested bridges (not that we have many of those), but
> until the "Fix it properly later" part comes along, can we please
> include the obvious bug fix?  Thanks,
> 
> Alex

We can stick 0 in there - would that help?  I would much rather not
create a version where we put the bus number there.
Alex Williamson - Dec. 14, 2010, 4:49 a.m.
On Tue, 2010-12-14 at 06:46 +0200, Michael S. Tsirkin wrote:
> On Mon, Dec 13, 2010 at 01:04:23PM -0700, Alex Williamson wrote:
> > On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote:
> > > On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote:
> > > > pcibus_dev_print() was erroneously retrieving the device bus
> > > > number from the secondary bus number offset of the device
> > > > instead of the bridge above the device.  This ends of landing
> > > > in the 2nd byte of the 3rd BAR for devices, which thankfully
> > > > is usually zero.  pcibus_get_dev_path() copied this code,
> > > > inheriting the same bug.  pcibus_get_dev_path() is used for
> > > > ramblock naming, so changing it can effect migration.  However,
> > > > I've only seen this byte be non-zero for an assigned device,
> > > > which can't migrate anyway, so hopefully we won't run into
> > > > any issues.
> > > > 
> > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > > 
> > > Good catch. Applied.
> > 
> > Um... submitted vs applied:
> > 
> >      PCI: Bus number from the bridge, not the device
> >  
> > @@ -6,20 +8,28 @@
> >      number from the secondary bus number offset of the device
> >      instead of the bridge above the device.  This ends of landing
> >      in the 2nd byte of the 3rd BAR for devices, which thankfully
> > -    is usually zero.  pcibus_get_dev_path() copied this code,
> > +    is usually zero.
> > +    
> > +    Note: pcibus_get_dev_path() copied this code,
> >      inheriting the same bug.  pcibus_get_dev_path() is used for
> >      ramblock naming, so changing it can effect migration.  However,
> >      I've only seen this byte be non-zero for an assigned device,
> >      which can't migrate anyway, so hopefully we won't run into
> >      any issues.
> >  
> > +    This patch does not touch pcibus_get_dev_path, as
> > +    bus number is guest assigned for nested buses,
> > +    so using it for migration is broken anyway.
> > +    Fix it properly later.
> > +    
> >      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > +    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >  
> >  diff --git a/hw/pci.c b/hw/pci.c
> > -index 6d0934d..15416dd 100644
> > +index 962886e..8f6fcf8 100644
> >  --- a/hw/pci.c
> >  +++ b/hw/pci.c
> > -@@ -1940,8 +1940,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
> > +@@ -1806,8 +1806,7 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
> >   
> >       monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
> >                      "pci id %04x:%04x (sub %04x:%04x)\n",
> > @@ -29,14 +39,3 @@
> >                      PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
> >                      pci_get_word(d->config + PCI_VENDOR_ID),
> >                      pci_get_word(d->config + PCI_DEVICE_ID),
> > -@@ -1965,7 +1964,7 @@ static char *pcibus_get_dev_path(DeviceState *dev)
> > -     char path[16];
> > - 
> > -     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
> > --             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
> > -+             pci_find_domain(d->bus), pci_bus_num(d->bus),
> > -              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
> > - 
> > -     return strdup(path);
> > -
> > -
> > 
> > So the chunk that fixed the part that I was actually interested in got
> > dropped even though the existing code is clearly wrong.  Yes, we still
> > have issues with nested bridges (not that we have many of those), but
> > until the "Fix it properly later" part comes along, can we please
> > include the obvious bug fix?  Thanks,
> > 
> > Alex
> 
> We can stick 0 in there - would that help?  I would much rather not
> create a version where we put the bus number there.

Yep, 0 is good enough until we solve the nested bridge problem.  Thanks,

Alex

Patch

diff --git a/hw/pci.c b/hw/pci.c
index 6d0934d..15416dd 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -1940,8 +1940,7 @@  static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
 
     monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, "
                    "pci id %04x:%04x (sub %04x:%04x)\n",
-                   indent, "", ctxt,
-                   d->config[PCI_SECONDARY_BUS],
+                   indent, "", ctxt, pci_bus_num(d->bus),
                    PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
                    pci_get_word(d->config + PCI_VENDOR_ID),
                    pci_get_word(d->config + PCI_DEVICE_ID),
@@ -1965,7 +1964,7 @@  static char *pcibus_get_dev_path(DeviceState *dev)
     char path[16];
 
     snprintf(path, sizeof(path), "%04x:%02x:%02x.%x",
-             pci_find_domain(d->bus), d->config[PCI_SECONDARY_BUS],
+             pci_find_domain(d->bus), pci_bus_num(d->bus),
              PCI_SLOT(d->devfn), PCI_FUNC(d->devfn));
 
     return strdup(path);