Message ID | 20180419044119.1970-1-oohall@gmail.com |
---|---|
State | Rejected |
Series | npu2: Fix NPU<->GPU binding error message |
On Thu, Apr 19, 2018 at 02:41:19PM +1000, Oliver O'Halloran wrote:
>--- a/hw/npu2.c
>+++ b/hw/npu2.c
>@@ -434,8 +434,9 @@ static void npu2_dev_bind_pci_dev(struct npu2_dev *dev)
> 		}
> 	}
>
>-	prlog(PR_INFO, "%s: No PCI device for NPU2 device %04x:00:%02x.0 to bind to. If you expect a GPU to be there, this is a problem.\n",
>-	      __func__, dev->npu->phb_nvlink.opal_id, dev->index);
>+	prlog(PR_INFO, "%s: No PCI device for NPU2 device %04x:00:%02x.%x to bind to. If you expect a GPU to be there, this is a problem.\n",
>+	      __func__, dev->npu->phb_nvlink.opal_id, (dev->bdfn >> 3) & 0x1f,
>+	      dev->bdfn & 0x3);
> }
>
> static struct lock pci_npu_phandle_lock = LOCK_UNLOCKED;

Oddly enough, I just sent a patch about the same line:

http://patchwork.ozlabs.org/patch/901343/

It uses NPU2DEVINF() to print the location, including slot label.

On Fri, Apr 20, 2018 at 7:15 AM, Reza Arbab <arbab@linux.ibm.com> wrote:
> On Thu, Apr 19, 2018 at 02:41:19PM +1000, Oliver O'Halloran wrote:
>>
>> --- a/hw/npu2.c
>> +++ b/hw/npu2.c
>> @@ -434,8 +434,9 @@ static void npu2_dev_bind_pci_dev(struct npu2_dev
>> *dev)
>>                 }
>>         }
>>
>> -       prlog(PR_INFO, "%s: No PCI device for NPU2 device %04x:00:%02x.0
>> to bind to. If you expect a GPU to be there, this is a problem.\n",
>> -             __func__, dev->npu->phb_nvlink.opal_id, dev->index);
>> +       prlog(PR_INFO, "%s: No PCI device for NPU2 device %04x:00:%02x.%x
>> to bind to. If you expect a GPU to be there, this is a problem.\n",
>> +             __func__, dev->npu->phb_nvlink.opal_id, (dev->bdfn >> 3) &
>> 0x1f,
>> +             dev->bdfn & 0x3);
>>  }
>>
>>  static struct lock pci_npu_phandle_lock = LOCK_UNLOCKED;
>
>
> Oddly enough, I just sent a patch about the same line:
> http://patchwork.ozlabs.org/patch/901343/
>
> It uses NPU2DEVINF() to print the location, including slot label.

Oh cool, that's probably a better idea than open coding the BDFN stuff here.

> --
> Reza Arbab
>

diff --git a/hw/npu2.c b/hw/npu2.c
index 06e06d4ff2aa..99bc67b30af9 100644
--- a/hw/npu2.c
+++ b/hw/npu2.c
@@ -434,8 +434,9 @@ static void npu2_dev_bind_pci_dev(struct npu2_dev *dev)
 		}
 	}
 
-	prlog(PR_INFO, "%s: No PCI device for NPU2 device %04x:00:%02x.0 to bind to. If you expect a GPU to be there, this is a problem.\n",
-	      __func__, dev->npu->phb_nvlink.opal_id, dev->index);
+	prlog(PR_INFO, "%s: No PCI device for NPU2 device %04x:00:%02x.%x to bind to. If you expect a GPU to be there, this is a problem.\n",
+	      __func__, dev->npu->phb_nvlink.opal_id, (dev->bdfn >> 3) & 0x1f,
+	      dev->bdfn & 0x3);
 }
 
 static struct lock pci_npu_phandle_lock = LOCK_UNLOCKED;

NVLinks with the same target appear as a multi-function PCI device rather
than individual devices. When printing out the error message we assume that
the link index is the same as the device number, which is no longer true.
Fix this by using the BDFN and setting the appropriate fields. e.g.

Old and broken:

No PCI device for NPU2 device 0006:00:03.0 to bind to.
No PCI device for NPU2 device 0006:00:04.0 to bind to.
No PCI device for NPU2 device 0006:00:05.0 to bind to.

New and fixed:

No PCI device for NPU2 device 0006:00:01.00 to bind to.
No PCI device for NPU2 device 0006:00:01.01 to bind to.
No PCI device for NPU2 device 0006:00:01.02 to bind to.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 hw/npu2.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
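
For readers unfamiliar with the BDFN encoding the patch relies on, here is a minimal standalone sketch of the decode, not code from skiboot. It assumes the standard PCI layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0). The helper names bdfn_bus(), bdfn_dev(), bdfn_fn() and format_location() are hypothetical and exist only for this illustration. Note that the patch masks the function with 0x3, while the standard function field is three bits wide, so the sketch uses 0x7. Whether 0x3 is sufficient for NPU2 depends on how many links can share a target, which this thread does not state.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Standard PCI BDFN layout (assumed here):
 *   bits 15:8  bus number
 *   bits  7:3  device number
 *   bits  2:0  function number
 */
static unsigned int bdfn_bus(uint16_t bdfn)
{
	return (bdfn >> 8) & 0xff;
}

static unsigned int bdfn_dev(uint16_t bdfn)
{
	return (bdfn >> 3) & 0x1f;
}

static unsigned int bdfn_fn(uint16_t bdfn)
{
	/* Three-bit function field; the patch above uses 0x3 instead. */
	return bdfn & 0x7;
}

/*
 * Hypothetical helper: format "domain:bus:dev.fn" in the same style as
 * the log message in the patch. Returns what snprintf() returns.
 */
static int format_location(char *buf, size_t len, unsigned int domain,
			   uint16_t bdfn)
{
	return snprintf(buf, len, "%04x:%02x:%02x.%x", domain,
			bdfn_bus(bdfn), bdfn_dev(bdfn), bdfn_fn(bdfn));
}
```

With this layout, three links bound to the same target as functions 0 to 2 of device 1 on bus 0 have BDFN values 0x08, 0x09 and 0x0a. Calling format_location() with domain 0x0006 on each of them yields the same device number with a different function number, which is the behaviour the commit message describes as "new and fixed".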