diff mbox

[U-Boot] pci: don't skip vendor ID 0

Message ID 1485321910-12648-1-git-send-email-gregory.fong@virgingalactic.com
State Deferred
Delegated to: Tom Rini
Headers show

Commit Message

Gregory Fong Jan. 25, 2017, 5:25 a.m. UTC
Unlike 0xffff, 0 is not an invalid vendor ID.

Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com>
---
Based on question initially asked here:
http://lists.denx.de/pipermail/u-boot/2016-December/276172.html

I've been looking through the book I have on PCI and through various online
resources, and haven't been able to find evidence that a vendor ID of 0 is
invalid, even if it's unusual.  There are still issues in how this will be
handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still
assumes that a vendor and device ID of 0 mean that the last pci_device_id is
reached, but this change at least allows the device's PCI BARs to get
programmed if the vendor ID is 0.

 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Bin Meng Jan. 25, 2017, 2:14 p.m. UTC | #1
Hi Gregory,

On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong
<Gregory.Fong@virgingalactic.com> wrote:
> Unlike 0xffff, 0 is not an invalid vendor ID.
>
> Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com>
> ---
> Based on question initially asked here:
> http://lists.denx.de/pipermail/u-boot/2016-December/276172.html
>
> I've been looking through the book I have on PCI and through various online
> resources, and haven't been able to find evidence that a vendor ID of 0 is
> invalid, even if it's unusual.  There are still issues in how this will be
> handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still
> assumes that a vendor and device ID of 0 mean that the last pci_device_id is
> reached, but this change at least allows the device's PCI BARs to get
> programmed if the vendor ID is 0.
>
>  drivers/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 6b36c18..3ee7180 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -366,7 +366,7 @@ int pci_hose_scan_bus(struct pci_controller *hose, int bus)
>
>                 pci_hose_read_config_word(hose, dev, PCI_VENDOR_ID, &vendor);
>
> -               if (vendor == 0xffff || vendor == 0x0000)
> +               if (vendor == 0xffff)
>                         continue;
>
>                 if (!PCI_FUNC(dev))

Do you have such a PCI device that has the vendor ID as zero? Or have
you ever seen such? If not, I would say let's leave this as it is.

Regards,
Bin
Gregory Fong Jan. 25, 2017, 9:23 p.m. UTC | #2
Hi Bin,

On Wed, Jan 25, 2017 at 10:14:56PM +0800, Bin Meng wrote:
> Hi Gregory,
> 
> On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong
> <Gregory.Fong@virgingalactic.com> wrote:
> > Unlike 0xffff, 0 is not an invalid vendor ID.
> >
> > Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com>
> > ---
> > Based on question initially asked here:
> > http://lists.denx.de/pipermail/u-boot/2016-December/276172.html
> >
> > I've been looking through the book I have on PCI and through various online
> > resources, and haven't been able to find evidence that a vendor ID of 0 is
> > invalid, even if it's unusual.  There are still issues in how this will be
> > handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still
> > assumes that a vendor and device ID of 0 mean that the last pci_device_id is
> > reached, but this change at least allows the device's PCI BARs to get
> > programmed if the vendor ID is 0.
> >
> > [snip]
> 
> Do you have such a PCI device that has the vendor ID as zero? Or have
> you ever seen such? If not, I would say let's leave this as it is.

Yes, this hardware does exist; that's the reason I stumbled across this
issue at all.  Without this change, the device's PCI BARs don't get
programmed, so it isn't usable after booting the OS.

I would have greatly preferred to be able to say that a vendor ID of 0
is invalid, but can't find any documentation to back up that assertion.
Hence the patch.

Best regards,
Gregory
Bin Meng Jan. 26, 2017, 10:08 a.m. UTC | #3
Hi Gregory,

On Thu, Jan 26, 2017 at 5:23 AM, Gregory Fong
<Gregory.Fong@virgingalactic.com> wrote:
> Hi Bin,
>
> On Wed, Jan 25, 2017 at 10:14:56PM +0800, Bin Meng wrote:
>> Hi Gregory,
>>
>> On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong
>> <Gregory.Fong@virgingalactic.com> wrote:
>> > Unlike 0xffff, 0 is not an invalid vendor ID.
>> >
>> > Signed-off-by: Gregory Fong <gregory.fong@virgingalactic.com>
>> > ---
>> > Based on question initially asked here:
>> > http://lists.denx.de/pipermail/u-boot/2016-December/276172.html
>> >
>> > I've been looking through the book I have on PCI and through various online
>> > resources, and haven't been able to find evidence that a vendor ID of 0 is
>> > invalid, even if it's unusual.  There are still issues in how this will be
>> > handled in e.g. drivers/pci/pci_common.c because pci_hose_find_devices() still
>> > assumes that a vendor and device ID of 0 mean that the last pci_device_id is
>> > reached, but this change at least allows the device's PCI BARs to get
>> > programmed if the vendor ID is 0.
>> >
>> > [snip]
>>
>> Do you have such a PCI device that has the vendor ID as zero? Or have
>> you ever seen such? If not, I would say let's leave this as it is.
>
> Yes, this hardware does exist; that's the reason I stumbled across this
> issue at all.  Without this change, the device's PCI BARs don't get
> programmed, so it isn't usable after booting the OS.
>

Which vendor is your PCI device manufactured by?

> I would have greatly preferred to be able to say that a vendor ID of 0
> is invalid, but can't find any documentation to back up that assertion.
> Hence the patch.
>

Is it possible that it's zero due to the hardware is buggy? For
example, on some Intel cards, PCI vendor ID and device ID can be
loaded from an EEPROM and if EEPROM content is wrong, it may expose
wrong IDs. Anyway zero does not look like a valid one though.

Regards,
Bin
Leon Woestenberg Jan. 26, 2017, 12:26 p.m. UTC | #4
Hello all,

my two cents on this one:

On Thu, Jan 26, 2017 at 11:08 AM, Bin Meng <bmeng.cn@gmail.com> wrote:

> >> On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong
> >> <Gregory.Fong@virgingalactic.com> wrote:
> >> > I've been looking through the book I have on PCI and through various
> online
> >> > resources, and haven't been able to find evidence that a vendor ID of
> 0 is
> >> > invalid, even if it's unusual.
>
> Is it possible that it's zero due to the hardware is buggy? For
> example, on some Intel cards, PCI vendor ID and device ID can be
> loaded from an EEPROM and if EEPROM content is wrong, it may expose
> wrong IDs. Anyway zero does not look like a valid one though.
>
> Regards,
> Bin
>

From the spec. point of view, 0x0000 seems as valid as any in the range
[0x0001-0xFFFE].
From the registry point of view, only registered numbers are valid.

I.e. if I create a programmable PCI(e) card, I can pick any number in the
range [0x0000-0xFFFE] during development and it should work. I will try not
to clash with already registered numbers (to prevent the wrong drivers
probing the endpoint), and preferably I would try to re-use the (FPGA)
silicon vendor's ID, although I am well aware that it should change going
into production.

If only 0xFFFF is reserved, then [0-0xFFFE] should be accepted in the code,
right?
If we disallow 0x0000, on what arguments do we do that?

Middle solution is to accept the ID with a warning.

Regards,

Leon.
Gregory Fong Jan. 26, 2017, 1:19 p.m. UTC | #5
Hi Leon,

On Thu, Jan 26, 2017 at 4:26 AM, Leon Woestenberg <leon@sidebranch.com> wrote:
> On Thu, Jan 26, 2017 at 11:08 AM, Bin Meng <bmeng.cn@gmail.com> wrote:
>
>>>> On Wed, Jan 25, 2017 at 1:25 PM, Gregory Fong <Gregory.Fong@virgingalactic.com> wrote:
>>>>> I've been looking through the book I have on PCI and through various online
>>>>> resources, and haven't been able to find evidence that a vendor ID of 0 is
>>>>> invalid, even if it's unusual.
>>
>> Is it possible that it's zero due to the hardware is buggy? For
>> example, on some Intel cards, PCI vendor ID and device ID can be
>> loaded from an EEPROM and if EEPROM content is wrong, it may expose
>> wrong IDs. Anyway zero does not look like a valid one though.
>>
>
> From the spec. point of view, 0x0000 seems as valid as any in the range
> [0x0001-0xFFFE].
> From the registry point of view, only registered numbers are valid.
>
> I.e. if I create a programmable PCI(e) card, I can pick any number in the
> range [0x0000-0xFFFE] during development and it should work. I will try not
> to clash with already registered numbers (to prevent the wrong drivers
> probing the endpoint), and preferably I would try to re-use the (FPGA)
> silicon vendor's ID, although I am well aware that it should change going
> into production.

Indeed, it turns out 0 doesn't tend to clash. :-)

>
> If only 0xFFFF is reserved, then [0-0xFFFE] should be accepted in the code,
> right?
> If we disallow 0x0000, on what arguments do we do that?

Thanks for weighing in, this is exactly what I'm trying to find out.
FWIW, the linux kernel works just fine with a vendor ID of 0.

>
> Middle solution is to accept the ID with a warning.

Would the reason for the warning be that this change doesn't entirely
fix the problem?  If so, it seems like the place to put such a warning
would be on the code that doesn't properly handle a vendor ID of 0
yet.  But not sure it's necessary.

I did take a quick look at what it would require to better handle the
vendor ID 0 case everywhere.  It looks like the main problem is that
struct pci_device_id with vendor and device both 0 is being used to
indicate the end of an array.  Not that complicated to change, but it
isn't trivial either, so figure it's best to leave for a future
changeset.

Thanks,
Gregory
Leon Woestenberg Jan. 26, 2017, 1:57 p.m. UTC | #6
On Thu, Jan 26, 2017 at 2:19 PM, Gregory Fong <gregory.0xf0@gmail.com>
wrote:

>  It looks like the main problem is that
> struct pci_device_id with vendor and device both 0 is being used to
> indicate the end of an array.  Not that complicated to change, but it
> isn't trivial either, so figure it's best to leave for a future
> changeset.
>
> Hmm, then maybe 0xFFFF, 0xFFFF should be used to indicate the end of the
array.

How does Linux handle the end of the array, while it also supports 0x0000,
0x0000?

Thanks,

Leon.
Gregory Fong Jan. 27, 2017, 3:11 a.m. UTC | #7
On Thu, Jan 26, 2017 at 5:57 AM, Leon Woestenberg <leon@sidebranch.com> wrote:
>
> On Thu, Jan 26, 2017 at 2:19 PM, Gregory Fong <gregory.0xf0@gmail.com>
> wrote:
>>
>>  It looks like the main problem is that
>> struct pci_device_id with vendor and device both 0 is being used to
>> indicate the end of an array.  Not that complicated to change, but it
>> isn't trivial either, so figure it's best to leave for a future
>> changeset.
>>
> Hmm, then maybe 0xFFFF, 0xFFFF should be used to indicate the end of the
> array.
>
> How does Linux handle the end of the array, while it also supports 0x0000,
> 0x0000?

Well, looking at
http://lxr.free-electrons.com/source/drivers/pci/pci-driver.c , in
pci_match_id(), it stops if !(vendor || subvendor || class_mask), so
I'd hazard a guess that the device I have is working is because it has
a nonzero class ID.  Strictly speaking that is probably still invalid,
but it's much less likely to happen.  It also supports dynamic IDs for
which it iterates over a linked list.

Unfortunately there seems to be a fair bit of code relying on all 0s
having a special meaning, and it looks like all PCI_ANY_ID has its own
meaning as well based off of the fact that there's a static
pci_device_id_any struct in there that's used if driver_override is
set in the input struct pci_dev.  I can't say more than that without
digging into the log for that change or finding the users of the
driver_override flag.

Best regards,
Gregory
diff mbox

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 6b36c18..3ee7180 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -366,7 +366,7 @@  int pci_hose_scan_bus(struct pci_controller *hose, int bus)
 
 		pci_hose_read_config_word(hose, dev, PCI_VENDOR_ID, &vendor);
 
-		if (vendor == 0xffff || vendor == 0x0000)
+		if (vendor == 0xffff)
 			continue;
 
 		if (!PCI_FUNC(dev))