diff mbox series

[v2,2/3] PCI: qcom: Read back PARF_LTSSM register

Message ID 20240210-topic-8280_pcie-v2-2-1cef4b606883@linaro.org
State New
Headers show
Series Qualcomm PCIe RC shutdown & reinit | expand

Commit Message

Konrad Dybcio Feb. 10, 2024, 5:10 p.m. UTC
To ensure write completion, read the PARF_LTSSM register after setting
the LTSSM enable bit before polling for "link up".

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
---
 drivers/pci/controller/dwc/pcie-qcom.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Bjorn Helgaas Feb. 12, 2024, 9:17 p.m. UTC | #1
Maybe include the reason in the subject?  "Read back" is literally
what the diff says.

On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
> To ensure write completion, read the PARF_LTSSM register after setting
> the LTSSM enable bit before polling for "link up".

The write will obviously complete *some* time; I assume the point is
that it's important for it to complete before some other event, and it
would be nice to know why that's important.

> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> ---
>  drivers/pci/controller/dwc/pcie-qcom.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index cbde9effa352..6a469ed213ce 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -539,6 +539,7 @@ static void qcom_pcie_2_3_2_ltssm_enable(struct qcom_pcie *pcie)
>  	val = readl(pcie->parf + PARF_LTSSM);
>  	val |= LTSSM_EN;
>  	writel(val, pcie->parf + PARF_LTSSM);
> +	readl(pcie->parf + PARF_LTSSM);
>  }
>  
>  static int qcom_pcie_get_resources_2_3_2(struct qcom_pcie *pcie)
> 
> -- 
> 2.40.1
>
Konrad Dybcio Feb. 14, 2024, 9:35 p.m. UTC | #2
On 12.02.2024 22:17, Bjorn Helgaas wrote:
> Maybe include the reason in the subject?  "Read back" is literally
> what the diff says.
> 
> On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
>> To ensure write completion, read the PARF_LTSSM register after setting
>> the LTSSM enable bit before polling for "link up".
> 
> The write will obviously complete *some* time; I assume the point is
> that it's important for it to complete before some other event, and it
> would be nice to know why that's important.

Right, that's very much meaningful on non-total-store-ordering
architectures, like arm64, where the CPU receives a store instruction,
but that does not necessarily impact the memory/MMIO state immediately.

Konrad
Bjorn Helgaas Feb. 14, 2024, 10:28 p.m. UTC | #3
On Wed, Feb 14, 2024 at 10:35:16PM +0100, Konrad Dybcio wrote:
> On 12.02.2024 22:17, Bjorn Helgaas wrote:
> > Maybe include the reason in the subject?  "Read back" is literally
> > what the diff says.
> > 
> > On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
> >> To ensure write completion, read the PARF_LTSSM register after setting
> >> the LTSSM enable bit before polling for "link up".
> > 
> > The write will obviously complete *some* time; I assume the point is
> > that it's important for it to complete before some other event, and it
> > would be nice to know why that's important.
> 
> Right, that's very much meaningful on non-total-store-ordering
> architectures, like arm64, where the CPU receives a store instruction,
> but that does not necessarily impact the memory/MMIO state immediately.

I was hinting that maybe we could say what the other event is, or what
problem this solves?  E.g., maybe it's as simple as "there's no point
in polling for link up until after the PARF_LTSSM store completes."

But while the read of PARF_LTSSM might reduce the number of "is the
link up" polls, it probably wouldn't speed anything up otherwise, so I
suspect there's an actual functional reason for this patch, and that's
what I'm getting at.

Bjorn
Konrad Dybcio Feb. 15, 2024, 10:21 a.m. UTC | #4
On 14.02.2024 23:28, Bjorn Helgaas wrote:
> On Wed, Feb 14, 2024 at 10:35:16PM +0100, Konrad Dybcio wrote:
>> On 12.02.2024 22:17, Bjorn Helgaas wrote:
>>> Maybe include the reason in the subject?  "Read back" is literally
>>> what the diff says.
>>>
>>> On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
>>>> To ensure write completion, read the PARF_LTSSM register after setting
>>>> the LTSSM enable bit before polling for "link up".
>>>
>>> The write will obviously complete *some* time; I assume the point is
>>> that it's important for it to complete before some other event, and it
>>> would be nice to know why that's important.
>>
>> Right, that's very much meaningful on non-total-store-ordering
>> architectures, like arm64, where the CPU receives a store instruction,
>> but that does not necessarily impact the memory/MMIO state immediately.
> 
> I was hinting that maybe we could say what the other event is, or what
> problem this solves?  E.g., maybe it's as simple as "there's no point
> in polling for link up until after the PARF_LTSSM store completes."
> 
> But while the read of PARF_LTSSM might reduce the number of "is the
> link up" polls, it probably wouldn't speed anything up otherwise, so I
> suspect there's an actual functional reason for this patch, and that's
> what I'm getting at.

So, the register containing the "enable switch" (PARF_LTSSM) can (due
to the armv8 memory model) be "written" but not "change the value of
memory/mmio from the perspective of other (non-CPU) memory-readers
(such as the MMIO-mapped PCI controller itself)".

In that case, the CPU will happily continue calling qcom_pcie_link_up()
in a loop, waiting for the PCIe controller to bring the link up, however
the PCIE controller may have never received the PARF_LTSSM "enable link"
write by the time we decide to time out on checking the link status.

It may also never happen for you, but that's exactly like a classic race
condition, where it may simply not manifest due to the code around the
problematic lines hiding it. It may also only manifest on certain CPU
cores that try to be smarter than you and keep reordering/delaying
instructions if they don't seem immediately necessary.

Konrad
Bjorn Helgaas Feb. 15, 2024, 4:11 p.m. UTC | #5
On Thu, Feb 15, 2024 at 11:21:45AM +0100, Konrad Dybcio wrote:
> On 14.02.2024 23:28, Bjorn Helgaas wrote:
> > On Wed, Feb 14, 2024 at 10:35:16PM +0100, Konrad Dybcio wrote:
> >> On 12.02.2024 22:17, Bjorn Helgaas wrote:
> >>> Maybe include the reason in the subject?  "Read back" is literally
> >>> what the diff says.
> >>>
> >>> On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
> >>>> To ensure write completion, read the PARF_LTSSM register after setting
> >>>> the LTSSM enable bit before polling for "link up".
> >>>
> >>> The write will obviously complete *some* time; I assume the point is
> >>> that it's important for it to complete before some other event, and it
> >>> would be nice to know why that's important.
> >>
> >> Right, that's very much meaningful on non-total-store-ordering
> >> architectures, like arm64, where the CPU receives a store instruction,
> >> but that does not necessarily impact the memory/MMIO state immediately.
> > 
> > I was hinting that maybe we could say what the other event is, or what
> > problem this solves?  E.g., maybe it's as simple as "there's no point
> > in polling for link up until after the PARF_LTSSM store completes."
> > 
> > But while the read of PARF_LTSSM might reduce the number of "is the
> > link up" polls, it probably wouldn't speed anything up otherwise, so I
> > suspect there's an actual functional reason for this patch, and that's
> > what I'm getting at.
> 
> So, the register containing the "enable switch" (PARF_LTSSM) can (due
> to the armv8 memory model) be "written" but not "change the value of
> memory/mmio from the perspective of other (non-CPU) memory-readers
> (such as the MMIO-mapped PCI controller itself)".
> 
> In that case, the CPU will happily continue calling qcom_pcie_link_up()
> in a loop, waiting for the PCIe controller to bring the link up, however
> the PCIE controller may have never received the PARF_LTSSM "enable link"
> write by the time we decide to time out on checking the link status.
> 
> It may also never happen for you, but that's exactly like a classic race
> condition, where it may simply not manifest due to the code around the
> problematic lines hiding it. It may also only manifest on certain CPU
> cores that try to be smarter than you and keep reordering/delaying
> instructions if they don't seem immediately necessary.

Does this mean the register is mapped incorrectly, e.g., I see arm64
has many different kinds of mappings for cacheability,
write-buffering, etc?

Or, if it is already mapped correctly, are we confident that none of
the *other* register writes need similar treatment?  Is there a rule
we can apply to know when the read-after-write is needed?

Bjorn
Konrad Dybcio Feb. 15, 2024, 6:44 p.m. UTC | #6
On 15.02.2024 17:11, Bjorn Helgaas wrote:
> On Thu, Feb 15, 2024 at 11:21:45AM +0100, Konrad Dybcio wrote:
>> On 14.02.2024 23:28, Bjorn Helgaas wrote:
>>> On Wed, Feb 14, 2024 at 10:35:16PM +0100, Konrad Dybcio wrote:
>>>> On 12.02.2024 22:17, Bjorn Helgaas wrote:
>>>>> Maybe include the reason in the subject?  "Read back" is literally
>>>>> what the diff says.
>>>>>
>>>>> On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
>>>>>> To ensure write completion, read the PARF_LTSSM register after setting
>>>>>> the LTSSM enable bit before polling for "link up".
>>>>>
>>>>> The write will obviously complete *some* time; I assume the point is
>>>>> that it's important for it to complete before some other event, and it
>>>>> would be nice to know why that's important.
>>>>
>>>> Right, that's very much meaningful on non-total-store-ordering
>>>> architectures, like arm64, where the CPU receives a store instruction,
>>>> but that does not necessarily impact the memory/MMIO state immediately.
>>>
>>> I was hinting that maybe we could say what the other event is, or what
>>> problem this solves?  E.g., maybe it's as simple as "there's no point
>>> in polling for link up until after the PARF_LTSSM store completes."
>>>
>>> But while the read of PARF_LTSSM might reduce the number of "is the
>>> link up" polls, it probably wouldn't speed anything up otherwise, so I
>>> suspect there's an actual functional reason for this patch, and that's
>>> what I'm getting at.
>>
>> So, the register containing the "enable switch" (PARF_LTSSM) can (due
>> to the armv8 memory model) be "written" but not "change the value of
>> memory/mmio from the perspective of other (non-CPU) memory-readers
>> (such as the MMIO-mapped PCI controller itself)".
>>
>> In that case, the CPU will happily continue calling qcom_pcie_link_up()
>> in a loop, waiting for the PCIe controller to bring the link up, however
>> the PCIE controller may have never received the PARF_LTSSM "enable link"
>> write by the time we decide to time out on checking the link status.
>>
>> It may also never happen for you, but that's exactly like a classic race
>> condition, where it may simply not manifest due to the code around the
>> problematic lines hiding it. It may also only manifest on certain CPU
>> cores that try to be smarter than you and keep reordering/delaying
>> instructions if they don't seem immediately necessary.
> 
> Does this mean the register is mapped incorrectly, e.g., I see arm64
> has many different kinds of mappings for cacheability,
> write-buffering, etc?

No, it's correctly mapped as "device" memory, but even that gives no
guarantees about when the writes are "flushed" out of the CPU/cluster

> 
> Or, if it is already mapped correctly, are we confident that none of
> the *other* register writes need similar treatment?  Is there a rule
> we can apply to know when the read-after-write is needed?

Generally, it's a good idea to add such readbacks after all timing-
critical writes, especially when they concern asserting reset,
enabling/disabling power, etc., to make sure we're not assuming the
hardware state of a peripheral has changed before we ask it to do so. 

Konrad
Johan Hovold Feb. 16, 2024, 6:52 a.m. UTC | #7
On Thu, Feb 15, 2024 at 07:44:27PM +0100, Konrad Dybcio wrote:
> On 15.02.2024 17:11, Bjorn Helgaas wrote:
> > On Thu, Feb 15, 2024 at 11:21:45AM +0100, Konrad Dybcio wrote:
> >> On 14.02.2024 23:28, Bjorn Helgaas wrote:
> >>> On Wed, Feb 14, 2024 at 10:35:16PM +0100, Konrad Dybcio wrote:
> >>>> On 12.02.2024 22:17, Bjorn Helgaas wrote:
> >>>>> Maybe include the reason in the subject?  "Read back" is literally
> >>>>> what the diff says.
> >>>>>
> >>>>> On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
> >>>>>> To ensure write completion, read the PARF_LTSSM register after setting
> >>>>>> the LTSSM enable bit before polling for "link up".
> >>>>>
> >>>>> The write will obviously complete *some* time; I assume the point is
> >>>>> that it's important for it to complete before some other event, and it
> >>>>> would be nice to know why that's important.
> >>>>
> >>>> Right, that's very much meaningful on non-total-store-ordering
> >>>> architectures, like arm64, where the CPU receives a store instruction,
> >>>> but that does not necessarily impact the memory/MMIO state immediately.
> >>>
> >>> I was hinting that maybe we could say what the other event is, or what
> >>> problem this solves?  E.g., maybe it's as simple as "there's no point
> >>> in polling for link up until after the PARF_LTSSM store completes."
> >>>
> >>> But while the read of PARF_LTSSM might reduce the number of "is the
> >>> link up" polls, it probably wouldn't speed anything up otherwise, so I
> >>> suspect there's an actual functional reason for this patch, and that's
> >>> what I'm getting at.
> >>
> >> So, the register containing the "enable switch" (PARF_LTSSM) can (due
> >> to the armv8 memory model) be "written" but not "change the value of
> >> memory/mmio from the perspective of other (non-CPU) memory-readers
> >> (such as the MMIO-mapped PCI controller itself)".
> >>
> >> In that case, the CPU will happily continue calling qcom_pcie_link_up()
> >> in a loop, waiting for the PCIe controller to bring the link up, however
> >> the PCIE controller may have never received the PARF_LTSSM "enable link"
> >> write by the time we decide to time out on checking the link status.

This makes no sense. As Bjorn already said, you're just polling for the
link to come up (for a second). And unless you have something else that
depends on the write to have reached the device, there is no need to
read it back. It's not going to be cached indefinitely if that's what
you fear.

> Generally, it's a good idea to add such readbacks after all timing-
> critical writes, especially when they concern asserting reset,
> enabling/disabling power, etc., to make sure we're not assuming the
> hardware state of a peripheral has changed before we ask it to do so. 

Again no, there is no general need to do that. It all depends on what
the code does and how the device works.

Johan
Konrad Dybcio March 15, 2024, 10:16 a.m. UTC | #8
On 2/16/24 07:52, Johan Hovold wrote:
> On Thu, Feb 15, 2024 at 07:44:27PM +0100, Konrad Dybcio wrote:
>> On 15.02.2024 17:11, Bjorn Helgaas wrote:
>>> On Thu, Feb 15, 2024 at 11:21:45AM +0100, Konrad Dybcio wrote:
>>>> On 14.02.2024 23:28, Bjorn Helgaas wrote:
>>>>> On Wed, Feb 14, 2024 at 10:35:16PM +0100, Konrad Dybcio wrote:
>>>>>> On 12.02.2024 22:17, Bjorn Helgaas wrote:
>>>>>>> Maybe include the reason in the subject?  "Read back" is literally
>>>>>>> what the diff says.
>>>>>>>
>>>>>>> On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
>>>>>>>> To ensure write completion, read the PARF_LTSSM register after setting
>>>>>>>> the LTSSM enable bit before polling for "link up".
>>>>>>>
>>>>>>> The write will obviously complete *some* time; I assume the point is
>>>>>>> that it's important for it to complete before some other event, and it
>>>>>>> would be nice to know why that's important.
>>>>>>
>>>>>> Right, that's very much meaningful on non-total-store-ordering
>>>>>> architectures, like arm64, where the CPU receives a store instruction,
>>>>>> but that does not necessarily impact the memory/MMIO state immediately.
>>>>>
>>>>> I was hinting that maybe we could say what the other event is, or what
>>>>> problem this solves?  E.g., maybe it's as simple as "there's no point
>>>>> in polling for link up until after the PARF_LTSSM store completes."
>>>>>
>>>>> But while the read of PARF_LTSSM might reduce the number of "is the
>>>>> link up" polls, it probably wouldn't speed anything up otherwise, so I
>>>>> suspect there's an actual functional reason for this patch, and that's
>>>>> what I'm getting at.
>>>>
>>>> So, the register containing the "enable switch" (PARF_LTSSM) can (due
>>>> to the armv8 memory model) be "written" but not "change the value of
>>>> memory/mmio from the perspective of other (non-CPU) memory-readers
>>>> (such as the MMIO-mapped PCI controller itself)".
>>>>
>>>> In that case, the CPU will happily continue calling qcom_pcie_link_up()
>>>> in a loop, waiting for the PCIe controller to bring the link up, however
>>>> the PCIE controller may have never received the PARF_LTSSM "enable link"
>>>> write by the time we decide to time out on checking the link status.
> 
> This makes no sense. As Bjorn already said, you're just polling for the
> link to come up (for a second). And unless you have something else that
> depends on the write to have reached the device, there is no need to
> read it back. It's not going to be cached indefinitely if that's what
> you fear.

The point is, if we know that the hardware is expected to return "done"
within the polling timeout value of receiving the request to do so, we
are actively taking away an unknown amount of time from that timeout.

So, if the polling condition becomes true after 980ms, but due to write
buffering the value reached the PCIe hardware after 21 ms, we're gonna
hit a timeout. Or under truly extreme circumstances, the polling may
time out before the write has even arrived at the PCIe hw.

> 
>> Generally, it's a good idea to add such readbacks after all timing-
>> critical writes, especially when they concern asserting reset,
>> enabling/disabling power, etc., to make sure we're not assuming the
>> hardware state of a peripheral has changed before we ask it to do so.
> 
> Again no, there is no general need to do that. It all depends on what
> the code does and how the device works.

Agreed it's not necessary *in general*, but as I pointed out, this is
an operation that we expect to complete within a set time frame, which
involves external hardware.

Konrad
Manivannan Sadhasivam March 15, 2024, 11:16 a.m. UTC | #9
On Fri, Mar 15, 2024 at 11:16:59AM +0100, Konrad Dybcio wrote:
> 
> 
> On 2/16/24 07:52, Johan Hovold wrote:
> > On Thu, Feb 15, 2024 at 07:44:27PM +0100, Konrad Dybcio wrote:
> > > On 15.02.2024 17:11, Bjorn Helgaas wrote:
> > > > On Thu, Feb 15, 2024 at 11:21:45AM +0100, Konrad Dybcio wrote:
> > > > > On 14.02.2024 23:28, Bjorn Helgaas wrote:
> > > > > > On Wed, Feb 14, 2024 at 10:35:16PM +0100, Konrad Dybcio wrote:
> > > > > > > On 12.02.2024 22:17, Bjorn Helgaas wrote:
> > > > > > > > Maybe include the reason in the subject?  "Read back" is literally
> > > > > > > > what the diff says.
> > > > > > > > 
> > > > > > > > On Sat, Feb 10, 2024 at 06:10:06PM +0100, Konrad Dybcio wrote:
> > > > > > > > > To ensure write completion, read the PARF_LTSSM register after setting
> > > > > > > > > the LTSSM enable bit before polling for "link up".
> > > > > > > > 
> > > > > > > > The write will obviously complete *some* time; I assume the point is
> > > > > > > > that it's important for it to complete before some other event, and it
> > > > > > > > would be nice to know why that's important.
> > > > > > > 
> > > > > > > Right, that's very much meaningful on non-total-store-ordering
> > > > > > > architectures, like arm64, where the CPU receives a store instruction,
> > > > > > > but that does not necessarily impact the memory/MMIO state immediately.
> > > > > > 
> > > > > > I was hinting that maybe we could say what the other event is, or what
> > > > > > problem this solves?  E.g., maybe it's as simple as "there's no point
> > > > > > in polling for link up until after the PARF_LTSSM store completes."
> > > > > > 
> > > > > > But while the read of PARF_LTSSM might reduce the number of "is the
> > > > > > link up" polls, it probably wouldn't speed anything up otherwise, so I
> > > > > > suspect there's an actual functional reason for this patch, and that's
> > > > > > what I'm getting at.
> > > > > 
> > > > > So, the register containing the "enable switch" (PARF_LTSSM) can (due
> > > > > to the armv8 memory model) be "written" but not "change the value of
> > > > > memory/mmio from the perspective of other (non-CPU) memory-readers
> > > > > (such as the MMIO-mapped PCI controller itself)".
> > > > > 
> > > > > In that case, the CPU will happily continue calling qcom_pcie_link_up()
> > > > > in a loop, waiting for the PCIe controller to bring the link up, however
> > > > > the PCIE controller may have never received the PARF_LTSSM "enable link"
> > > > > write by the time we decide to time out on checking the link status.
> > 
> > This makes no sense. As Bjorn already said, you're just polling for the
> > link to come up (for a second). And unless you have something else that
> > depends on the write to have reached the device, there is no need to
> > read it back. It's not going to be cached indefinitely if that's what
> > you fear.
> 
> The point is, if we know that the hardware is expected to return "done"
> within the polling timeout value of receiving the request to do so, we
> are actively taking away an unknown amount of time from that timeout.
> 
> So, if the polling condition becomes true after 980ms, but due to write
> buffering the value reached the PCIe hardware after 21 ms, we're gonna
> hit a timeout. Or under truly extreme circumstances, the polling may
> time out before the write has even arrived at the PCIe hw.
> 

You should've mentioned the actual reason for doing the readback in the commit
message. That would've clarified the intention.

> > 
> > > Generally, it's a good idea to add such readbacks after all timing-
> > > critical writes, especially when they concern asserting reset,
> > > enabling/disabling power, etc., to make sure we're not assuming the
> > > hardware state of a peripheral has changed before we ask it to do so.
> > 
> > Again no, there is no general need to do that. It all depends on what
> > the code does and how the device works.
> 
> Agreed it's not necessary *in general*, but as I pointed out, this is
> an operation that we expect to complete within a set time frame, which
> involves external hardware.
> 

As I pointed out in the review of v1 series, LTSSM is in PARF register region
and the link status is in DBI region. Techinically both belongs to the PCIe
domain, but I am not 100% sure that both belong to the same hw domain or
different. So I cannot rule out the possibility that the first write may not
reach the hardware by the time link status is queried.

That's the reason I gave my R-b tag. But I need to confirm with the hw team
on this to be sure since this may be applicable to other drivers also.

- Mani
Johan Hovold March 15, 2024, 4:47 p.m. UTC | #10
On Fri, Mar 15, 2024 at 11:16:59AM +0100, Konrad Dybcio wrote:
> On 2/16/24 07:52, Johan Hovold wrote:

> > This makes no sense. As Bjorn already said, you're just polling for the
> > link to come up (for a second). And unless you have something else that
> > depends on the write to have reached the device, there is no need to
> > read it back. It's not going to be cached indefinitely if that's what
> > you fear.
> 
> The point is, if we know that the hardware is expected to return "done"
> within the polling timeout value of receiving the request to do so, we
> are actively taking away an unknown amount of time from that timeout.

We're talking about microseconds, not milliseconds or seconds as you
seem to believe.

> So, if the polling condition becomes true after 980ms, but due to write
> buffering the value reached the PCIe hardware after 21 ms, we're gonna
> hit a timeout. Or under truly extreme circumstances, the polling may
> time out before the write has even arrived at the PCIe hw.

So the write latency is not an issue here.

Johan
Konrad Dybcio March 27, 2024, 7:37 p.m. UTC | #11
On 15.03.2024 5:47 PM, Johan Hovold wrote:
> On Fri, Mar 15, 2024 at 11:16:59AM +0100, Konrad Dybcio wrote:
>> On 2/16/24 07:52, Johan Hovold wrote:
> 
>>> This makes no sense. As Bjorn already said, you're just polling for the
>>> link to come up (for a second). And unless you have something else that
>>> depends on the write to have reached the device, there is no need to
>>> read it back. It's not going to be cached indefinitely if that's what
>>> you fear.
>>
>> The point is, if we know that the hardware is expected to return "done"
>> within the polling timeout value of receiving the request to do so, we
>> are actively taking away an unknown amount of time from that timeout.
> 
> We're talking about microseconds, not milliseconds or seconds as you
> seem to believe.
> 
>> So, if the polling condition becomes true after 980ms, but due to write
>> buffering the value reached the PCIe hardware after 21 ms, we're gonna
>> hit a timeout. Or under truly extreme circumstances, the polling may
>> time out before the write has even arrived at the PCIe hw.
> 
> So the write latency is not an issue here.

Right, I'm willing to believe the CPU will kick the can down the road
for this long. I'll drop this.

Konrad
Konrad Dybcio March 27, 2024, 7:38 p.m. UTC | #12
On 27.03.2024 8:37 PM, Konrad Dybcio wrote:
> On 15.03.2024 5:47 PM, Johan Hovold wrote:
>> On Fri, Mar 15, 2024 at 11:16:59AM +0100, Konrad Dybcio wrote:
>>> On 2/16/24 07:52, Johan Hovold wrote:
>>
>>>> This makes no sense. As Bjorn already said, you're just polling for the
>>>> link to come up (for a second). And unless you have something else that
>>>> depends on the write to have reached the device, there is no need to
>>>> read it back. It's not going to be cached indefinitely if that's what
>>>> you fear.
>>>
>>> The point is, if we know that the hardware is expected to return "done"
>>> within the polling timeout value of receiving the request to do so, we
>>> are actively taking away an unknown amount of time from that timeout.
>>
>> We're talking about microseconds, not milliseconds or seconds as you
>> seem to believe.
>>
>>> So, if the polling condition becomes true after 980ms, but due to write
>>> buffering the value reached the PCIe hardware after 21 ms, we're gonna
>>> hit a timeout. Or under truly extreme circumstances, the polling may
>>> time out before the write has even arrived at the PCIe hw.
>>
>> So the write latency is not an issue here.
> 
> Right, I'm willing to believe the CPU will kick the can down the road
> for this long. I'll drop this.

will not kick the can down the road for this long*

Konrad
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index cbde9effa352..6a469ed213ce 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -539,6 +539,7 @@  static void qcom_pcie_2_3_2_ltssm_enable(struct qcom_pcie *pcie)
 	val = readl(pcie->parf + PARF_LTSSM);
 	val |= LTSSM_EN;
 	writel(val, pcie->parf + PARF_LTSSM);
+	readl(pcie->parf + PARF_LTSSM);
 }
 
 static int qcom_pcie_get_resources_2_3_2(struct qcom_pcie *pcie)