Message ID | 1485321910-12648-1-git-send-email-gregory.fong@virgingalactic.com |
---|---|
State | Deferred |
Delegated to: | Tom Rini |
Headers | show |
Hi Gregory, On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong <Gregory.Fong@virgingalactic.com> wrote: > Unlike 0xffff, 0 is not an invalid vendor ID. > > Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com> > --- > Based on question initially asked here: > http://lists.denx.de/pipermail/u-boot/2016-December/276172.html > > I've been looking through the book I have on PCI and through various online > resources, and haven't been able to find evidence that a vendor ID of 0 is > invalid, even if it's unusual. There are still issues in how this will be > handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still > assumes that a vendor and device ID of 0 mean that the last pci_device_id is > reached, but this change at least allows the device's PCI BARs to get > programmed if the vendor ID is 0. > > drivers/pci/pci.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 6b36c18..3ee7180 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -366,7 +366,7 @@ int pci_hose_scan_bus(struct pci_controller *hose, int bus) > > pci_hose_read_config_word(hose, dev, PCI_VENDOR_ID, &vendor); > > - if (vendor == 0xffff || vendor == 0x0000) > + if (vendor == 0xffff) > continue; > > if (!PCI_FUNC(dev)) Do you have such a PCI device that has the vendor ID as zero? Or have you ever seen such? If not, I would say let's leave this as it is. Regards, Bin
Hi Bin, On Wed, Jan 25, 2017 at 10:14:56PM +0800, Bin Meng wrote: > Hi Gregory, > > On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong > <Gregory.Fong@virgingalactic.com> wrote: > > Unlike 0xffff, 0 is not an invalid vendor ID. > > > > Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com> > > --- > > Based on question initially asked here: > > http://lists.denx.de/pipermail/u-boot/2016-December/276172.html > > > > I've been looking through the book I have on PCI and through various online > > resources, and haven't been able to find evidence that a vendor ID of 0 is > > invalid, even if it's unusual. There are still issues in how this will be > > handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still > > assumes that a vendor and device ID of 0 mean that the last pci_device_id is > > reached, but this change at least allows the device's PCI BARs to get > > programmed if the vendor ID is 0. > > > > [snip] > > Do you have such a PCI device that has the vendor ID as zero? Or have > you ever seen such? If not, I would say let's leave this as it is. Yes, this hardware does exist; that's the reason I stumbled across this issue at all. Without this change, the device's PCI BARs don't get programmed, so it isn't usable after booting the OS. I would have greatly preferred to be able to say that a vendor ID of 0 is invalid, but can't find any documentation to back up that assertion. Hence the patch. Best regards, Gregory
Hi Gregory, On Thu, Jan 26, 2017 at 5:23 AM, Gregory Fong <Gregory.Fong@virgingalactic.com> wrote: > Hi Bin, > > On Wed, Jan 25, 2017 at 10:14:56PM +0800, Bin Meng wrote: >> Hi Gregory, >> >> On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong >> <Gregory.Fong@virgingalactic.com> wrote: >> > Unlike 0xffff, 0 is not an invalid vendor ID. >> > >> > Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com> >> > --- >> > Based on question initially asked here: >> > http://lists.denx.de/pipermail/u-boot/2016-December/276172.html >> > >> > I've been looking through the book I have on PCI and through various online >> > resources, and haven't been able to find evidence that a vendor ID of 0 is >> > invalid, even if it's unusual. There are still issues in how this will be >> > handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still >> > assumes that a vendor and device ID of 0 mean that the last pci_device_id is >> > reached, but this change at least allows the device's PCI BARs to get >> > programmed if the vendor ID is 0. >> > >> > [snip] >> >> Do you have such a PCI device that has the vendor ID as zero? Or have >> you ever seen such? If not, I would say let's leave this as it is. > > Yes, this hardware does exist; that's the reason I stumbled across this > issue at all. Without this change, the device's PCI BARs don't get > programmed, so it isn't usable after booting the OS. > Which vendor is your PCI device manufactured by? > I would have greatly preferred to be able to say that a vendor ID of 0 > is invalid, but can't find any documentation to back up that assertion. > Hence the patch. > Is it possible that it's zero due to the hardware is buggy? For example, on some Intel cards, PCI vendor ID and device ID can be loaded from an EEPROM and if EEPROM content is wrong, it may expose wrong IDs. Anyway zero does not look like a valid one though. Regards, Bin
Hello all, my two cents on this one: On Thu, Jan 26, 2017 at 11:08 AM, Bin Meng <bmeng.cn@gmail.com> wrote: > >> On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong > >> <Gregory.Fong@virgingalactic.com> wrote: > >> > I've been looking through the book I have on PCI and through various > online > >> > resources, and haven't been able to find evidence that a vendor ID of > 0 is > >> > invalid, even if it's unusual. > > Is it possible that it's zero due to the hardware is buggy? For > example, on some Intel cards, PCI vendor ID and device ID can be > loaded from an EEPROM and if EEPROM content is wrong, it may expose > wrong IDs. Anyway zero does not look like a valid one though. > > Regards, > Bin > From the spec. point of view, 0x0000 seems as valid as any in the range [0x0001-0xFFFE]. From the registry point of view, only registered numbers are valid. I.e. if I create a programmable PCI(e) card, I can pick any number in the range [0x0000-0xFFFE] during development and it should work. I will try not to clash with already registered numbers (to prevent the wrong drivers probing the endpoint), and preferably I would try to re-use the (FPGA) silicon vendor's ID, although I am well aware that it should change going into production. If only 0xFFFF is reserved, then [0-0xFFFE] should be accepted in the code, right? If we disallow 0x0000, on what arguments do we do that? Middle solution is to accept the ID with a warning. Regards, Leon.
Hi Leon, On Thu, Jan 26, 2017 at 4:26 AM, Leon Woestenberg <leon@sidebranch.com> wrote: > On Thu, Jan 26, 2017 at 11:08 AM, Bin Meng <bmeng.cn@gmail.com> wrote: > >>>> On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong <Gregory.Fong@virgingalactic.com> wrote: >>>>> I've been looking through the book I have on PCI and through various online >>>>> resources, and haven't been able to find evidence that a vendor ID of 0 is >>>>> invalid, even if it's unusual. >> >> Is it possible that it's zero due to the hardware is buggy? For >> example, on some Intel cards, PCI vendor ID and device ID can be >> loaded from an EEPROM and if EEPROM content is wrong, it may expose >> wrong IDs. Anyway zero does not look like a valid one though. >> > > From the spec. point of view, 0x0000 seems as valid as any in the range > [0x0001-0xFFFE]. > From the registry point of view, only registered numbers are valid. > > I.e. if I create a programmable PCI(e) card, I can pick any number in the > range [0x0000-0xFFFE] during development and it should work. I will try not > to clash with already registered numbers (to prevent the wrong drivers > probing the endpoint), and preferably I would try to re-use the (FPGA) > silicon vendor's ID, although I am well aware that it should change going > into production. Indeed, it turns out 0 doesn't tend to clash. :-) > > If only 0xFFFF is reserved, then [0-0xFFFE] should be accepted in the code, > right? > If we disallow 0x0000, on what arguments do we do that? Thanks for weighing in, this is exactly what I'm trying to find out. FWIW, the linux kernel works just fine with a vendor ID of 0. > > Middle solution is to accept the ID with a warning. Would the reason for the warning be that this change doesn't entirely fix the problem? If so, it seems like the place to put such a warning would be on the code that doesn't properly handle a vendor ID of 0 yet. But not sure it's necessary. I did take a quick look at what it would require to better handle the vendor ID 0 case everywhere. It looks like the main problem is that struct pci_device_id with vendor and device both 0 is being used to indicate the end of an array. Not that complicated to change, but it isn't trivial either, so figure it's best to leave for a future changeset. Thanks, Gregory
On Thu, Jan 26, 2017 at 2:19 PM, Gregory Fong <gregory.0xf0@gmail.com> wrote: > It looks like the main problem is that > struct pci_device_id with vendor and device both 0 is being used to > indicate the end of an array. Not that complicated to change, but it > isn't trivial either, so figure it's best to leave for a future > changeset. > > Hmm, then maybe 0xFFFF, 0xFFFF should be used to indicate the end of the array. How does Linux handle the end of the array, while it also supports 0x0000, 0x0000? Thanks, Leon.
On Thu, Jan 26, 2017 at 5:57 AM, Leon Woestenberg <leon@sidebranch.com> wrote: > > On Thu, Jan 26, 2017 at 2:19 PM, Gregory Fong <gregory.0xf0@gmail.com> > wrote: >> >> It looks like the main problem is that >> struct pci_device_id with vendor and device both 0 is being used to >> indicate the end of an array. Not that complicated to change, but it >> isn't trivial either, so figure it's best to leave for a future >> changeset. >> > Hmm, then maybe 0xFFFF, 0xFFFF should be used to indicate the end of the > array. > > How does Linux handle the end of the array, while it also supports 0x0000, > 0x0000? Well, looking at http://lxr.free-electrons.com/source/drivers/pci/pci-driver.c , in pci_match_id(), it stops if !(vendor || subvendor || class_mask), so I'd hazard a guess that the device I have is working is because it has a nonzero class ID. Strictly speaking that is probably still invalid, but it's much less likely to happen. It also supports dynamic IDs for which it iterates over a linked list. Unfortunately there seems to be a fair bit of code relying on all 0s having a special meaning, and it looks like all PCI_ANY_ID has its own meaning as well based off of the fact that there's a static pci_device_id_any struct in there that's used if driver_override is set in the input struct pci_dev. I can't say more than that without digging into the log for that change or finding the users of the driver_override flag. Best regards, Gregory
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 6b36c18..3ee7180 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -366,7 +366,7 @@ int pci_hose_scan_bus(struct pci_controller *hose, int bus) pci_hose_read_config_word(hose, dev, PCI_VENDOR_ID, &vendor); - if (vendor == 0xffff || vendor == 0x0000) + if (vendor == 0xffff) continue; if (!PCI_FUNC(dev))
Unlike 0xffff, 0 is not an invalid vendor ID. Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com> --- Based on question initially asked here: http://lists.denx.de/pipermail/u-boot/2016-December/276172.html I've been looking through the book I have on PCI and through various online resources, and haven't been able to find evidence that a vendor ID of 0 is invalid, even if it's unusual. There are still issues in how this will be handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still assumes that a vendor and device ID of 0 mean that the last pci_device_id is reached, but this change at least allows the device's PCI BARs to get programmed if the vendor ID is 0. drivers/pci/pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)