diff mbox

PCI: rockchip: check link status when validating device

Message ID 1495177107-203736-1-git-send-email-shawn.lin@rock-chips.com
State Changes Requested
Headers show

Commit Message

Shawn Lin May 19, 2017, 6:58 a.m. UTC
This patch checks the link status before reading and
writing configure space of devices attached to the RC.
If the link status is down, we shouldn't try to access
the devices.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
---

 drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Brian Norris May 23, 2017, 6 p.m. UTC | #1
On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> This patch checks the link status before reading and
> writing configure space of devices attached to the RC.
> If the link status is down, we shouldn't try to access
> the devices.

I'm curious, in what situations are you seeing the link down? In all the
cases where I can manage to screw up my endpoint and see system aborts
due to config accesses, this check still says the link is up. Presumably
you have some test cases that benefit from this though.

> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 0e020b6..1e64227 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>  }
>  
> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
> +{
> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
> +					PCIE_CLIENT_BASIC_STATUS1));
> +}
> +
>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>  				      struct pci_bus *bus, int dev)
>  {
> +	/* do not access the devices if the link isn't completed */
> +	if (bus->number != rockchip->root_bus_nr) {
> +		if (!rockchip_pcie_link_up(rockchip))
> +			return 0;
> +	}

Seems like this should be the last check in the function, after you
check that the bus and dev numbers are reasonable?

Brian

> +
>  	/* access only one slot on each root port */
>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>  		return 0;
> -- 
> 1.9.1
> 
>
Brian Norris May 23, 2017, 6:44 p.m. UTC | #2
I forgot, I had one more comment:

On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> This patch checks the link status before reading and
> writing configure space of devices attached to the RC.
> If the link status is down, we shouldn't try to access
> the devices.
> 
> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 0e020b6..1e64227 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>  }
>  
> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
> +{
> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
> +					PCIE_CLIENT_BASIC_STATUS1));
> +}
> +
>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>  				      struct pci_bus *bus, int dev)
>  {
> +	/* do not access the devices if the link isn't completed */
> +	if (bus->number != rockchip->root_bus_nr) {
> +		if (!rockchip_pcie_link_up(rockchip))
> +			return 0;

Not exactly a criticism of this patch directly, but the error handling
sequence that this triggers is strange, and I think it's inconsistent.

A 0 result here becomes PCIBIOS_DEVICE_NOT_FOUND for either the read or
write conf helpers. But the high level code doesn't handle this
consistently. See, e.g., pci_read_config_byte() which can return regular
Linux error codes (like -ENODEV), except it also passes up the return
code of pci_read_config_byte() (a PCIBIOS_* code) directly.

So callers don't really know whether to treat the value from
pci_read_config_<foo>() as a PCIBIOS_* code (which should be translated
with pcibios_err_to_errno()) or as a standard Linux errno.

But then, there are relatively few callers (less than 10% of
pci_read_config_<foo>(); even fewer for writes) that actually check the
error codes...

Maybe the "fix" is to replace -ENODEV with PCIBIOS_DEVICE_NOT_FOUND for
the inconsistent cases (pci_{read,write}_config_{byte,word,dword}()).

Brian

> +	}
> +
>  	/* access only one slot on each root port */
>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>  		return 0;
> -- 
> 1.9.1
> 
>
Bjorn Helgaas May 23, 2017, 7:44 p.m. UTC | #3
On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> This patch checks the link status before reading and
> writing configure space of devices attached to the RC.
> If the link status is down, we shouldn't try to access
> the devices.

What bad things happen without this patch?

It's racy to check the link status, then do the config access.  The
link might go down after we check but before we can perform the
access.

> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
> ---
> 
>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
> index 0e020b6..1e64227 100644
> --- a/drivers/pci/host/pcie-rockchip.c
> +++ b/drivers/pci/host/pcie-rockchip.c
> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>  }
>  
> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
> +{
> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
> +					PCIE_CLIENT_BASIC_STATUS1));
> +}
> +
>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>  				      struct pci_bus *bus, int dev)
>  {
> +	/* do not access the devices if the link isn't completed */
> +	if (bus->number != rockchip->root_bus_nr) {
> +		if (!rockchip_pcie_link_up(rockchip))
> +			return 0;
> +	}
> +
>  	/* access only one slot on each root port */
>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>  		return 0;
> -- 
> 1.9.1
> 
>
Shawn Lin May 24, 2017, 12:54 a.m. UTC | #4
Hi Brian,

在 2017/5/24 2:00, Brian Norris 写道:
> On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
>> This patch checks the link status before reading and
>> writing configure space of devices attached to the RC.
>> If the link status is down, we shouldn't try to access
>> the devices.
>
> I'm curious, in what situations are you seeing the link down? In all the
> cases where I can manage to screw up my endpoint and see system aborts
> due to config accesses, this check still says the link is up. Presumably
> you have some test cases that benefit from this though.

Of course. This patch doesn't prevent all these cases, for instance,
you do a memory read/write in the EP function driver, since it doesn't
call these two APIs at all.

The reason for me to added this check is that I saw a external abort
down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
the link was re-init or total broken at that time.


>
>> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
>> ---
>>
>>  drivers/pci/host/pcie-rockchip.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
>> index 0e020b6..1e64227 100644
>> --- a/drivers/pci/host/pcie-rockchip.c
>> +++ b/drivers/pci/host/pcie-rockchip.c
>> @@ -275,9 +275,21 @@ static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
>>  	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
>>  }
>>
>> +static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
>> +{
>> +	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
>> +					PCIE_CLIENT_BASIC_STATUS1));
>> +}
>> +
>>  static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>>  				      struct pci_bus *bus, int dev)
>>  {
>> +	/* do not access the devices if the link isn't completed */
>> +	if (bus->number != rockchip->root_bus_nr) {
>> +		if (!rockchip_pcie_link_up(rockchip))
>> +			return 0;
>> +	}
>
> Seems like this should be the last check in the function, after you
> check that the bus and dev numbers are reasonable?
>

I am fine with either one.


> Brian
>
>> +
>>  	/* access only one slot on each root port */
>>  	if (bus->number == rockchip->root_bus_nr && dev > 0)
>>  		return 0;
>> --
>> 1.9.1
>>
>>
>
>
>
Brian Norris May 24, 2017, 1 a.m. UTC | #5
On Wed, May 24, 2017 at 08:54:14AM +0800, Shawn Lin wrote:
> 在 2017/5/24 2:00, Brian Norris 写道:
> >On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> >>This patch checks the link status before reading and
> >>writing configure space of devices attached to the RC.
> >>If the link status is down, we shouldn't try to access
> >>the devices.
> >
> >I'm curious, in what situations are you seeing the link down? In all the
> >cases where I can manage to screw up my endpoint and see system aborts
> >due to config accesses, this check still says the link is up. Presumably
> >you have some test cases that benefit from this though.

NB: Bjorn asked a similar question in a different form. The underlying
concern though, is that this is racy.

> Of course. This patch doesn't prevent all these cases, for instance,
> you do a memory read/write in the EP function driver, since it doesn't
> call these two APIs at all.

Of course. I'm only talking about config accesses.

> The reason for me to added this check is that I saw a external abort
> down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
> the link was re-init or total broken at that time.

I've seen plenty of aborts in this function as well, but I've verified
that the link was still reported "up" in all the cases I could reproduce.

So, do you "suspect" or did you "prove"? e.g., log cases where this
check actually helps?

And to Bjorn's point: do you know *why* such cases were hit? That would
help to understand if the cases you're worrying about are hopelessly
racy, or if there's some way to ensure synchronization.

Brian
Shawn Lin May 24, 2017, 1:14 a.m. UTC | #6
Hi Brian,

在 2017/5/24 9:00, Brian Norris 写道:
> On Wed, May 24, 2017 at 08:54:14AM +0800, Shawn Lin wrote:
>> 在 2017/5/24 2:00, Brian Norris 写道:
>>> On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
>>>> This patch checks the link status before reading and
>>>> writing configure space of devices attached to the RC.
>>>> If the link status is down, we shouldn't try to access
>>>> the devices.
>>>
>>> I'm curious, in what situations are you seeing the link down? In all the
>>> cases where I can manage to screw up my endpoint and see system aborts
>>> due to config accesses, this check still says the link is up. Presumably
>>> you have some test cases that benefit from this though.
>
> NB: Bjorn asked a similar question in a different form. The underlying
> concern though, is that this is racy.

yes, I saw that.

>
>> Of course. This patch doesn't prevent all these cases, for instance,
>> you do a memory read/write in the EP function driver, since it doesn't
>> call these two APIs at all.
>
> Of course. I'm only talking about config accesses.

okay.

>
>> The reason for me to added this check is that I saw a external abort
>> down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
>> the link was re-init or total broken at that time.
>
> I've seen plenty of aborts in this function as well, but I've verified
> that the link was still reported "up" in all the cases I could reproduce.
>

I think it's reasonable as the link could be retrained automatically if
it's not totaly broken at all. Did you poweroff the endpoint and could
still pass this check?

> So, do you "suspect" or did you "prove"? e.g., log cases where this
> check actually helps?

I was powering off the devices and did a lspci, and saw the log cases
there. I will check this again.

>
> And to Bjorn's point: do you know *why* such cases were hit? That would
> help to understand if the cases you're worrying about are hopelessly
> racy, or if there's some way to ensure synchronization.
>
> Brian
>
>
>
Brian Norris May 24, 2017, 1:15 a.m. UTC | #7
(Since Shawn didn't quite answer this piece)

On Wed, May 24, 2017 at 09:04:33AM +0800, Shawn Lin wrote:
> 在 2017/5/24 3:44, Bjorn Helgaas 写道:
> >On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> >>This patch checks the link status before reading and
> >>writing configure space of devices attached to the RC.
> >>If the link status is down, we shouldn't try to access
> >>the devices.
> >
> >What bad things happen without this patch?

On this SoC, I've seen this sort of behavior (reading the config space
when the device isn't responding) yield aborts, which panic the system. 

I can't speak exactly for which scenarios Shawn is addressing though. As
I mentioned, this patch doesn't do anything for me so far.

Brian
Brian Norris May 24, 2017, 1:25 a.m. UTC | #8
On Wed, May 24, 2017 at 09:14:52AM +0800, Shawn Lin wrote:
> 在 2017/5/24 9:00, Brian Norris 写道:
> >On Wed, May 24, 2017 at 08:54:14AM +0800, Shawn Lin wrote:
> >>The reason for me to added this check is that I saw a external abort
> >>down to rockchip_pcie_rd_own_conf, of which I highly suspected was that
> >>the link was re-init or total broken at that time.
> >
> >I've seen plenty of aborts in this function as well, but I've verified
> >that the link was still reported "up" in all the cases I could reproduce.
> >
> 
> I think it's reasonable as the link could be retrained automatically if
> it's not totaly broken at all. Did you poweroff the endpoint and could
> still pass this check?

I don't think I powered it off entirely, but I did try asserting its PD#
pin, which powers of most of the functionality -- enough that it
apparently causes aborts, but doesn't bring the link down.

> >So, do you "suspect" or did you "prove"? e.g., log cases where this
> >check actually helps?
> 
> I was powering off the devices and did a lspci, and saw the log cases
> there. I will check this again.
> 
> >
> >And to Bjorn's point: do you know *why* such cases were hit? That would
> >help to understand if the cases you're worrying about are hopelessly
> >racy, or if there's some way to ensure synchronization.

OK, so you've answered this question: losing power is hopelessly racy.

I guess it's up to Bjorn as to whether this racy check is useful at all
then.

Brian
Bjorn Helgaas May 24, 2017, 9:33 p.m. UTC | #9
On Tue, May 23, 2017 at 06:15:07PM -0700, Brian Norris wrote:
> (Since Shawn didn't quite answer this piece)
> 
> On Wed, May 24, 2017 at 09:04:33AM +0800, Shawn Lin wrote:
> > 在 2017/5/24 3:44, Bjorn Helgaas 写道:
> > >On Fri, May 19, 2017 at 02:58:27PM +0800, Shawn Lin wrote:
> > >>This patch checks the link status before reading and
> > >>writing configure space of devices attached to the RC.
> > >>If the link status is down, we shouldn't try to access
> > >>the devices.
> > >
> > >What bad things happen without this patch?
> 
> On this SoC, I've seen this sort of behavior (reading the config space
> when the device isn't responding) yield aborts, which panic the system. 

Trying to read config space of a device that doesn't exist is an
essential part of enumeration, and we expect whatever error occurs at
the hardware level to get turned into 0xffffffff data at the CPU (as
in pci_bus_read_dev_vendor_id()).

Shawn mentioned some issue with memory read/write as well.  I think we
need to sort out how to handle both the config space issue and the
memory issue.  This patch seems like it papers over part of it and
reduces the urgency of finding a real solution.

I'm going to drop this for now, pending a more detailed explanation.

Bjorn
Brian Norris May 24, 2017, 9:43 p.m. UTC | #10
On Wed, May 24, 2017 at 04:33:53PM -0500, Bjorn Helgaas wrote:
> On Tue, May 23, 2017 at 06:15:07PM -0700, Brian Norris wrote:
> > (Since Shawn didn't quite answer this piece)
> > 
> > On Wed, May 24, 2017 at 09:04:33AM +0800, Shawn Lin wrote:
> > > 在 2017/5/24 3:44, Bjorn Helgaas 写道:
> > > >What bad things happen without this patch?
> > 
> > On this SoC, I've seen this sort of behavior (reading the config space
> > when the device isn't responding) yield aborts, which panic the system. 
> 
> Trying to read config space of a device that doesn't exist is an
> essential part of enumeration, and we expect whatever error occurs at
> the hardware level to get turned into 0xffffffff data at the CPU (as
> in pci_bus_read_dev_vendor_id()).

Understood.

> Shawn mentioned some issue with memory read/write as well.  I think we
> need to sort out how to handle both the config space issue and the
> memory issue.

Is the memory space issue actually a problem? I don't see anything in
the spec that says how these should behave when the device is off.

(I mean, it's always nice if there are fewer ways to crash the system.
But I thought mem 0xffffffff was only a convention, not a standard.)

> This patch seems like it papers over part of it and
> reduces the urgency of finding a real solution.

I've been bugging Shawn about this for a while already. It's not clear
there's really a good solution so far, apart from hacking the arch
exception handlers, like you're doing in the imx6 driver:

commit 415b6185c541dc0a21457ff307cdb61950a6eb9f
Author: Lucas Stach <l.stach@pengutronix.de>
Date:   Mon May 22 17:06:30 2017 -0500

    PCI: imx6: Fix config read timeout handling

I don't think the ARM64 maintainers have been fond of adding similar
"hook" code...

> I'm going to drop this for now, pending a more detailed explanation.

Fine with me.

Brian
diff mbox

Patch

diff --git a/drivers/pci/host/pcie-rockchip.c b/drivers/pci/host/pcie-rockchip.c
index 0e020b6..1e64227 100644
--- a/drivers/pci/host/pcie-rockchip.c
+++ b/drivers/pci/host/pcie-rockchip.c
@@ -275,9 +275,21 @@  static void rockchip_pcie_update_txcredit_mui(struct rockchip_pcie *rockchip)
 	rockchip_pcie_write(rockchip, val, PCIE_CORE_TXCREDIT_CFG1);
 }
 
+static inline bool rockchip_pcie_link_up(struct rockchip_pcie *rockchip)
+{
+	return PCIE_LINK_UP(rockchip_pcie_read(rockchip,
+					PCIE_CLIENT_BASIC_STATUS1));
+}
+
 static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
 				      struct pci_bus *bus, int dev)
 {
+	/* do not access the devices if the link isn't completed */
+	if (bus->number != rockchip->root_bus_nr) {
+		if (!rockchip_pcie_link_up(rockchip))
+			return 0;
+	}
+
 	/* access only one slot on each root port */
 	if (bus->number == rockchip->root_bus_nr && dev > 0)
 		return 0;